Example Data Table
These examples assume equal base frequencies (25% each), no methylation blocking, complete digestion, and a 1 Mb genome.
| Enzyme | Recognition Site | Site Probability | Mean Interval | Expected Cuts in 1 Mb |
|---|---|---|---|---|
| EcoRI | GAATTC | 1 / 4096 | 4.096 kb | 244.14 |
| HindIII | AAGCTT | 1 / 4096 | 4.096 kb | 244.14 |
| BamHI | GGATCC | 1 / 4096 | 4.096 kb | 244.14 |
| HaeIII | GGCC | 1 / 256 | 256 bp | 3906.25 |
| HinfI | GANTC | 1 / 256 | 256 bp | 3906.25 |
| NotI | GCGGCCGC | 1 / 65536 | 65.536 kb | 15.26 |
Formula Used
For each position in the recognition site, the calculator sums the probabilities of the bases allowed there, then multiplies those sums across all positions.
- P(site) = ∏ [ Σ P(base allowed at position i) ]
- Mean interval = 1 / P(site)
- Adjusted P(site) = P(site) × digestion efficiency × (1 − methylation block)
- Expected cuts = genome size × Adjusted P(site)
- Expected fragments = cuts + 1 for linear DNA, or cuts for circular DNA
- Effective exact-site length = −log(P(site)) / log(4)

IUPAC handling: Ambiguous letters expand to allowed bases.
R means A or G. Y means C or T. N means any base.
At 50% GC, a fully specific six-base site appears once every 4,096 bases on average.
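The site-probability and mean-interval formulas can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function names `base_probs_from_gc` and `site_probability` are hypothetical, and the sketch assumes that a GC fraction is split evenly between G and C (and the remainder evenly between A and T).

```python
# Standard IUPAC nucleotide codes mapped to the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def base_probs_from_gc(gc):
    """Assumption: G and C each get gc/2, A and T each get (1 - gc)/2."""
    return {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}


def site_probability(site, base_probs):
    """Multiply, over all positions, the summed probability of allowed bases."""
    p = 1.0
    for letter in site.upper():
        p *= sum(base_probs[base] for base in IUPAC[letter])
    return p


if __name__ == "__main__":
    probs = base_probs_from_gc(0.5)
    for site in ("GAATTC", "GGCC", "GANTC", "GCGGCCGC"):
        p = site_probability(site, probs)
        print(f"{site}: P(site) = {p:.6g}, mean interval = {1 / p:.0f} bp")
```

With equal base frequencies, the four sites above should give the same probabilities as the example table (1/4096, 1/256, 1/256, and 1/65536).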
How to Use This Calculator
- Enter the enzyme recognition sequence.
- Choose the genome size and matching unit.
- Select GC-based or custom base probabilities.
- Adjust methylation blocking and digestion efficiency.
- Choose linear or circular genome topology.
- Submit the form to view frequency, cuts, fragments, and spacing.
- Use the graph for scaling trends.
- Download CSV or PDF reports for records.
FAQs
1. What does cutting frequency mean?
Cutting frequency describes how often a matching recognition site is expected to occur in random DNA, usually expressed as the mean spacing between sites. A probability of 1/4096 means one site appears about every 4096 bases before adjustment.
2. Why does GC content change the answer?
GC-rich genomes make G and C more likely at every position, while AT-rich genomes make A and T more likely. Recognition sites containing many G or C bases therefore become more common as GC content rises, and sites rich in A or T become rarer.
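The effect of GC content can be illustrated by comparing a GC-only site (NotI, GCGGCCGC) with a mostly AT site (EcoRI, GAATTC). The sketch below is hypothetical and handles only unambiguous A/C/G/T sites. It assumes G and C share the GC fraction equally, which may differ from how the calculator's custom-probability mode is used.

```python
from math import prod


def site_probability(site, gc):
    """Probability of an unambiguous A/C/G/T site at a given GC fraction."""
    p_gc = gc / 2        # assumed: G and C split the GC fraction equally
    p_at = (1 - gc) / 2  # assumed: A and T split the remainder equally
    return prod(p_gc if base in "GC" else p_at for base in site.upper())


if __name__ == "__main__":
    for gc in (0.3, 0.5, 0.7):
        for name, site in (("EcoRI", "GAATTC"), ("NotI", "GCGGCCGC")):
            interval = 1 / site_probability(site, gc)
            print(f"GC={gc:.0%}  {name:6s} mean interval ~ {interval:,.0f} bp")
```

Under these assumptions the NotI interval shrinks as GC content rises, while the EcoRI interval grows, because four of its six positions are A or T.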
3. How are ambiguous letters handled?
The calculator uses IUPAC rules. For each ambiguous position, it sums the probabilities of every allowed base. Then it multiplies those position probabilities across the whole site.
4. What does methylation blocking represent?
Methylation can prevent some enzymes from cutting otherwise matching sites. This field reduces the effective cutting probability after the site-match probability is calculated.
5. What does digestion efficiency represent?
Digestion efficiency approximates incomplete cutting caused by reaction limits, inhibitor carryover, or short incubation. Lower efficiency reduces expected cuts and increases the adjusted mean interval.
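The two adjustments combine multiplicatively, following the Adjusted P(site) formula. The sketch below is illustrative only: the function names are hypothetical, both settings are taken as fractions between 0 and 1, and the adjusted mean interval is assumed to be the reciprocal of the adjusted probability.

```python
def adjusted_probability(p_site, efficiency=1.0, methylation_block=0.0):
    """Adjusted P(site) = P(site) x efficiency x (1 - methylation block)."""
    if not (0.0 <= efficiency <= 1.0 and 0.0 <= methylation_block <= 1.0):
        raise ValueError("efficiency and methylation_block must be in [0, 1]")
    return p_site * efficiency * (1.0 - methylation_block)


def expected_cuts(genome_size_bp, p_adjusted):
    """Expected cuts = genome size x adjusted probability."""
    return genome_size_bp * p_adjusted


def mean_interval(p):
    """Mean spacing in bases; infinite when no site can be cut."""
    return float("inf") if p == 0 else 1.0 / p


if __name__ == "__main__":
    # Six-base site at equal base frequencies, 90% digestion efficiency,
    # and 20% of sites blocked by methylation (illustrative values).
    p = adjusted_probability(1 / 4096, efficiency=0.9, methylation_block=0.2)
    print(f"expected cuts in 1 Mb: {expected_cuts(1_000_000, p):.2f}")
    print(f"adjusted mean interval: {mean_interval(p):.0f} bp")
```

With these illustrative settings, the expected cut count falls below the unadjusted 244.14 and the adjusted interval rises above 4,096 bases.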
6. Why are fragments different for linear and circular DNA?
For linear DNA, expected fragments are roughly cuts plus one. For circular DNA, expected fragments are roughly equal to the number of cuts because there are no free ends.
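The topology rule is small enough to express directly. The helper below is a hypothetical sketch that follows the stated formula. It works on expectation values, so it does not special-case an uncut circle, which physically remains a single molecule.

```python
def expected_fragments(cuts, topology="linear"):
    """Expected fragments: cuts + 1 for linear DNA, cuts for circular DNA."""
    if topology == "linear":
        return cuts + 1
    if topology == "circular":
        return cuts
    raise ValueError("topology must be 'linear' or 'circular'")


if __name__ == "__main__":
    cuts = 244.14  # EcoRI in a 1 Mb genome, from the example table
    print(expected_fragments(cuts, "linear"))
    print(expected_fragments(cuts, "circular"))
```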
7. Are these values exact for real genomes?
No. These are expectation values under a probabilistic model. Real genomes contain repeats, local bias, methylation patterns, and nonrandom motifs that can shift actual cut counts.
8. What is effective exact-site length?
It converts the actual site probability into an equivalent number of perfectly specific bases. Ambiguous positions and biased base compositions can make a site behave shorter or longer.
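A short sketch shows how the effective length moves with ambiguity and base bias. The function name `effective_site_length` is hypothetical, and the biased example assumes a 70% GC genome with G and C at 0.35 each and A and T at 0.15 each.

```python
import math


def effective_site_length(p_site):
    """Effective exact-site length = -log(P(site)) / log(4)."""
    return -math.log(p_site) / math.log(4)


if __name__ == "__main__":
    # Equal base frequencies: a fully specific six-base site stays at 6.
    print(effective_site_length(1 / 4096))
    # GANTC has five positions, but N matches anything, so it acts like 4.
    print(effective_site_length(1 / 256))
    # Assumed 70% GC genome: an all-GC eight-base site acts shorter than 8,
    # and an all-AT six-base site acts longer than 6.
    print(effective_site_length(0.35 ** 8))
    print(effective_site_length(0.15 ** 6))
```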