Example Data Table
| Candidate | Motif | Best Window | Accessibility | Conservation | Overall Score | Class |
|---|---|---|---|---|---|---|
| Promoter_A | TGGGCGTG | TGGGCGTG | 78 | 0.84 | 80.42 | High-confidence candidate |
| Enhancer_B | ACCGTRAA | ACCGTGAA | 63 | 0.59 | 64.35 | Moderate-confidence candidate |
| Region_C | TTATTA | TCATTA | 41 | 0.31 | 43.18 | Low-confidence candidate |
Formula Used
This estimator combines sequence matching with contextual biology. It is a screening model and not a substitute for ChIP-seq, EMSA, reporter assays, or orthogonal validation.
- Motif match score = (best IUPAC-compatible matches ÷ motif length) × 100.
- GC suitability = clamp[100 − (|GC% − 55| ÷ 30) × 100].
- CpG suitability = clamp[100 − (|CpG ratio − 0.9| ÷ 0.9) × 100].
- TSS proximity score = clamp[100 − (|distance to TSS| ÷ 20)].
- Expression support score = ((correlation + 1) ÷ 2) × 100.
- Conservation score = conservation input × 100.
- Overall prediction score = Σ(subscore × preset weight) ÷ 100.
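The composition, expression, and conservation formulas above, together with the weighted combination, can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: it assumes that clamp limits values to the 0-100 range and that preset weights sum to 100, and the weights and input values shown are hypothetical because the page does not list the presets.

```python
def clamp(value, low=0.0, high=100.0):
    """Restrict value to [low, high]. The 0-100 range is an assumption."""
    return max(low, min(high, value))

def gc_suitability(gc_percent):
    # Peaks at GC% = 55 and falls linearly to 0 at 25% or 85%.
    return clamp(100 - abs(gc_percent - 55) / 30 * 100)

def cpg_suitability(cpg_ratio):
    # Peaks at a CpG ratio of 0.9 and falls linearly to 0 at 0 or 1.8.
    return clamp(100 - abs(cpg_ratio - 0.9) / 0.9 * 100)

def expression_support(correlation):
    # Maps a correlation in [-1, 1] onto the 0-100 scale.
    return (correlation + 1) / 2 * 100

def conservation_score(conservation):
    # Rescales a 0-1 conservation input to 0-100.
    return conservation * 100

def overall_score(subscores, weights):
    # Weighted sum divided by 100, which assumes the weights sum to 100.
    return sum(subscores[name] * weights[name] for name in weights) / 100

# Hypothetical weights and inputs for illustration only; not the real presets.
example_weights = {"motif": 40, "gc": 10, "cpg": 10,
                   "expression": 20, "conservation": 20}
example_subscores = {
    "motif": 100.0,
    "gc": gc_suitability(55),
    "cpg": cpg_suitability(0.9),
    "expression": expression_support(0.5),
    "conservation": conservation_score(0.84),
}
print(round(overall_score(example_subscores, example_weights), 2))
```

Because every subscore sits on the same 0-100 scale, changing a preset only changes the weights, and the overall score stays on that scale as long as the weights sum to 100.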
How to Use This Calculator
- Paste a candidate DNA region, ideally centered around the suspected binding site.
- Enter the transcription factor consensus motif using standard or degenerate IUPAC symbols.
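- Choose a preset. Strict is the most conservative and suits cases where false positives are costly, exploratory casts a wider net for hypothesis generation, and balanced is meant for routine screening.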
- Add promoter accessibility, chromatin openness, conservation, expression, and co-factor evidence.
- Enter enhancer evidence and distance to the transcription start site.
- Submit the form to see the binding score, best motif window, subscores, and Plotly chart.
- Export the result as CSV or PDF for reporting, internal review, or experiment planning.
Frequently Asked Questions
1. What does this predictor estimate?
It estimates how likely a sequence region is to support transcription factor binding by combining motif quality with promoter and chromatin context features.
2. Does a high score prove binding?
No. A high score means the site is a stronger candidate for validation. Real binding still depends on cellular state, occupancy, competition, and assay conditions.
3. Can I use degenerate motif symbols?
Yes. The calculator supports common IUPAC motif characters, including R, Y, S, W, K, M, B, D, H, V, and N.
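As an illustration of how degenerate symbols can be matched, the sketch below slides the motif along the sequence and counts IUPAC-compatible positions in each window. It is an assumption-laden example rather than the calculator's implementation: the function name `best_motif_window` is hypothetical, only the forward strand is scanned, and ties keep the first window found.

```python
# Standard IUPAC nucleotide codes mapped to the bases each one permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def best_motif_window(sequence, motif):
    """Return (best_window, match_score) for the best-matching window.

    Hypothetical helper: forward strand only, first window wins on ties.
    """
    sequence, motif = sequence.upper(), motif.upper()
    if len(sequence) < len(motif):
        raise ValueError("sequence must be at least as long as the motif")
    best_window, best_matches = "", -1
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        # A position matches when the base is allowed by the motif symbol.
        matches = sum(base in IUPAC[symbol]
                      for base, symbol in zip(window, motif))
        if matches > best_matches:
            best_window, best_matches = window, matches
    # Motif match score = (matches / motif length) * 100.
    return best_window, best_matches / len(motif) * 100

# Made-up flanking sequence around the Enhancer_B window from the example table.
print(best_motif_window("GGACCGTGAATT", "ACCGTRAA"))
```

In the example table, the Enhancer_B motif ACCGTRAA matches the window ACCGTGAA at every position because R permits A or G, whereas TTATTA against TCATTA matches five of six positions.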
4. Why are GC and CpG included?
They help capture sequence composition and the CpG-island character typical of many promoters. Some transcription factor targets are more plausible within suitable GC and CpG environments.
5. Which preset should I choose?
Use balanced for routine screening, strict when false positives are costly, and exploratory when you want broader hypothesis generation.
6. How is distance to TSS handled?
The model converts absolute distance into a proximity score. Sites nearer the transcription start site receive higher promoter-proximity support.
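A short sketch of this conversion, using the TSS proximity formula from the Formula Used section, shows how the score falls with distance. It assumes the distance is given in base pairs, that negative values mean upstream of the TSS, and that clamping limits the result to 0-100; none of these details is stated on the page.

```python
def tss_proximity(distance_bp):
    """Proximity score from a signed distance to the TSS (assumed in bp)."""
    # abs() treats upstream (negative) and downstream (positive) sites alike.
    score = 100 - abs(distance_bp) / 20
    # Clamp to 0-100 (assumed range) so distant sites score 0, not negative.
    return max(0.0, min(100.0, score))

# Illustrative distances only.
for distance in (0, -500, 1000, 2500):
    print(distance, tss_proximity(distance))
```

Under these assumptions the score drops by one point for every 20 bp and reaches zero at 2,000 bp from the TSS in either direction.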
7. Can this replace ChIP-seq or EMSA?
No. This page is a prioritization and educational tool. Experimental confirmation remains necessary before drawing biological conclusions.
8. What sequence length works best?
The sequence must be at least as long as the motif. In practice, modest promoter or enhancer windows provide more useful contextual interpretation.