Example Data Table
This example uses the same default sequence list preloaded in the calculator.
| Rank | Sequence Length (bp) | Cumulative Length (bp) | Notes |
|---|---|---|---|
| 1 | 900,000 | 900,000 | Largest sequence |
| 2 | 750,000 | 1,650,000 | Still below the 50% threshold (2,150,000 bp) |
| 3 | 600,000 | 2,250,000 | N50 reached here |
| 4 | 500,000 | 2,750,000 | NG50 reached here if the estimated genome size is 5,000,000 bp |
| 5 | 420,000 | 3,170,000 | Useful continuity support |
| 6 | 380,000 | 3,550,000 | Median sequence length (6th of 11) |
| 7 | 250,000 | 3,800,000 | Approaching the 90% threshold (3,870,000 bp) |
| 8 | 200,000 | 4,000,000 | N90 reached here |
| 9 | 150,000 | 4,150,000 | Lower tail |
| 10 | 100,000 | 4,250,000 | Lower tail |
| 11 | 50,000 | 4,300,000 | Smallest retained sequence |
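The cumulative column in the table can be reproduced with a running sum. The following is a minimal Python sketch and not the calculator's own code; the variable names are illustrative.

```python
from itertools import accumulate

# Default sequence lengths from the table above, in bp.
lengths = [900_000, 750_000, 600_000, 500_000, 420_000, 380_000,
           250_000, 200_000, 150_000, 100_000, 50_000]

# Sort longest first, then build the running total shown in the third column.
ranked = sorted(lengths, reverse=True)
cumulative = list(accumulate(ranked))

# Print rank, sequence length, and cumulative length for each row.
for rank, (length, total) in enumerate(zip(ranked, cumulative), start=1):
    print(f"{rank:>2}  {length:>9,}  {total:>10,}")
```

The last cumulative value, 4,300,000 bp, is the assembly span that the 50% and 90% thresholds are taken from.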
Formula Used
1. Assembly span = sum of all retained sequence lengths.
2. N50 = length of the sequence at which the cumulative length, sorted from longest to shortest, first reaches or exceeds 50% of the assembly span.
3. L50 = number of longest sequences needed to reach that 50% threshold.
4. N90 and L90 use the same method, but with a 90% threshold.
5. NG50 = same as N50, but the threshold is 50% of the estimated genome size instead of the assembly span.
6. auN = Σ(length²) ÷ Σ(length). It weights long sequences more heavily than a simple average does.
The calculator first sorts all retained lengths from largest to smallest. It then walks through cumulative totals until each requested threshold is reached.
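The sort-then-walk procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the calculator's actual implementation: the function name `nx_lx` and its `reference` parameter are hypothetical, and the threshold is compared using integer arithmetic to avoid floating-point rounding at the boundary.

```python
def nx_lx(lengths, percent, reference=None):
    """Return (Nx, Lx) for a percent threshold, or None if it is never reached.

    reference is the size the percentage applies to: the assembly span by
    default (N metrics), or an estimated genome size (NG metrics).
    """
    ranked = sorted(lengths, reverse=True)   # longest first
    if reference is None:
        reference = sum(ranked)              # assembly span
    running = 0
    for count, length in enumerate(ranked, start=1):
        running += length
        # Integer form of: running >= (percent / 100) * reference
        if running * 100 >= percent * reference:
            return length, count
    return None


# Default sequence lengths from the example table, in bp.
lengths = [900_000, 750_000, 600_000, 500_000, 420_000, 380_000,
           250_000, 200_000, 150_000, 100_000, 50_000]

print(nx_lx(lengths, 50))                        # (600000, 3)
print(nx_lx(lengths, 90))                        # (200000, 8)
print(nx_lx(lengths, 50, reference=5_000_000))   # (500000, 4)
print(nx_lx(lengths, 90, reference=5_000_000))   # None
```

The results match the notes in the example table: N50 at rank 3, N90 at rank 8, and NG50 at rank 4 for a 5,000,000 bp genome size. The final call returns `None` because the 4,300,000 bp span never reaches 90% of 5,000,000 bp.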
How to Use This Calculator
- Paste sequence lengths into the text area.
- Select the unit used by your values.
- Optionally set a minimum length filter.
- Enter a custom Nx threshold if needed.
- Add an estimated genome size for NG metrics.
- Press the calculate button to view the result section.
- Review the metrics, graph, and processed ranking table.
- Export the results as CSV or PDF when needed.
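The first three steps (paste, choose a unit, filter) amount to a small parsing routine. The Python sketch below shows one way this could work. Everything in it is an assumption rather than the calculator's documented behavior: the unit names, the accepted separators (whitespace, commas, semicolons, so values must not contain thousands separators), and a minimum length expressed in bp.

```python
import re

# Assumed unit names and multipliers; the calculator's own options may differ.
UNIT_FACTORS = {"bp": 1, "kb": 1_000, "Mb": 1_000_000}


def parse_lengths(text, unit="bp", min_length_bp=0):
    """Convert pasted text into a list of positive lengths in bp."""
    factor = UNIT_FACTORS[unit]
    # Split on whitespace, commas, or semicolons and drop empty tokens.
    tokens = [t for t in re.split(r"[\s,;]+", text.strip()) if t]
    lengths = [round(float(t) * factor) for t in tokens]
    # Keep only positive lengths that pass the minimum length filter.
    return [n for n in lengths if n > 0 and n >= min_length_bp]


pasted = "900 750\n600, 0.05"
print(parse_lengths(pasted, unit="kb", min_length_bp=100))
```

Here the 0.05 kb entry converts to 50 bp and is dropped by the 100 bp filter, leaving three sequences.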
FAQs
1. What does N50 measure in genome assembly?
N50 measures continuity. It identifies the sequence length where the cumulative total of sorted sequences reaches half of the retained assembly span.
2. Why is L50 reported with N50?
L50 shows how many longest sequences were required to hit the 50% threshold. Lower L50 values usually indicate a more continuous assembly.
3. What is the difference between N50 and NG50?
N50 uses the observed assembly span. NG50 uses an expected genome size. NG50 is helpful when the assembly's total span falls short of the genome size, because N50 alone would overstate continuity in that case.
4. Should I filter short sequences before calculating?
Filtering can be useful when very short fragments are noise or artifacts. However, always report your chosen threshold because filtering changes continuity metrics.
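To see how a filter shifts the metrics, the Python sketch below applies an illustrative 200,000 bp cutoff to the default example list. The helper `nx` and the cutoff value are hypothetical, chosen only for demonstration.

```python
def nx(lengths, percent):
    """Return Nx for a percent of the retained span (illustrative helper)."""
    ranked = sorted(lengths, reverse=True)
    span = sum(ranked)
    running = 0
    for length in ranked:
        running += length
        if running * 100 >= percent * span:
            return length
    return None


# Default sequence lengths from the example table, in bp.
lengths = [900_000, 750_000, 600_000, 500_000, 420_000, 380_000,
           250_000, 200_000, 150_000, 100_000, 50_000]

min_length = 200_000  # illustrative cutoff, not a recommendation
retained = [n for n in lengths if n >= min_length]

print(nx(lengths, 50), nx(retained, 50))   # 600000 600000
print(nx(lengths, 90), nx(retained, 90))   # 200000 250000
```

Dropping the three shortest sequences reduces the span from 4,300,000 bp to 4,000,000 bp. N50 stays at 600,000 bp, but N90 rises from 200,000 bp to 250,000 bp, which is why the cutoff should be reported alongside the metrics.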
5. Is a higher N50 always better?
Not always. A higher N50 suggests larger assembled pieces, but assembly accuracy, completeness, duplication, and contamination still matter.
6. What does auN add beyond N50?
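auN gives more weight to long sequences across the full distribution. Unlike N50, which depends on the single sequence at the 50% mark, auN changes with every sequence length, so it is often more informative when comparing assemblies with different length distributions.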
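As a worked illustration, auN for the default example list can be computed directly from the formula Σ(length²) ÷ Σ(length). The function name `aun` is hypothetical, and this is a sketch rather than the calculator's code.

```python
def aun(lengths):
    """Return auN = sum(length^2) / sum(length) for positive lengths."""
    total = sum(lengths)
    if total == 0:
        raise ValueError("auN is undefined for an empty or zero-length set")
    return sum(n * n for n in lengths) / total


# Default sequence lengths from the example table, in bp.
lengths = [900_000, 750_000, 600_000, 500_000, 420_000, 380_000,
           250_000, 200_000, 150_000, 100_000, 50_000]

mean_length = sum(lengths) / len(lengths)
print(f"mean = {mean_length:,.0f} bp, auN = {aun(lengths):,.0f} bp")
```

For this list the simple mean is about 390,909 bp, while auN is about 567,628 bp, which reflects the extra weight given to the longest sequences.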
7. Can I use scaffold lengths instead of contig lengths?
Yes. The calculator works with any positive sequence lengths. Just stay consistent when comparing assemblies or reporting results.
8. Why might NG90 show “Not reached”?
That means the retained assembly span is smaller than 90% of the estimated genome size, so the cumulative total can never reach that threshold.