Advanced Genome Size Calculator

Genome size calculator form

Choose one estimation method, enter your measurements, and submit. The result will appear above this form.

Estimation method

Pick the method matching your available experimental data.

Ploidy

Used to report total nuclear DNA content from the haploid estimate.

Optional assembly span (Mb)

Optional. Used to compare assembly span with the estimated haploid genome size.

Coverage-based inputs

Read count or pair count

Enter the total number of single reads or read pairs, depending on the Read layout setting below.

Average read length (bp)

Average usable bases per read after trimming.

Expected coverage depth (x)

Average depth expected across the haploid genome.

Mapped or usable fraction (%)

Percent of bases treated as effective genome-derived sequence.

Read layout

Choose how your count should be converted into total sequenced bases.

K-mer peak inputs

Total valid k-mers

Use the histogram total after filtering obvious error k-mers.

Main peak depth

Use the dominant genomic peak, not the low-depth error shoulder.

Usable fraction (%)

Adjust if only a portion of total k-mers is suitable for estimation.

Flow cytometry ratio inputs

Sample fluorescence

Use the corrected mean or peak fluorescence for the unknown sample.

Reference fluorescence

Use a matched reference standard measured under the same conditions.

Reference genome size

Enter the known genome size for the reference standard.

Reference size unit

The calculator converts the reference into base pairs internally.

Example data table

These examples show how different biological workflows can produce a genome size estimate.

Method	Example inputs	Estimated haploid size	Total DNA at ploidy 2	Interpretation
Coverage-based	50,000,000 paired reads, 150 bp, 30×, 90% usable	450.00 Mb	900.00 Mb	Useful when total bases and expected depth are known.
K-mer peak	120,000,000,000 valid k-mers, peak depth 40, 95% usable	2,850.00 Mb	5,700.00 Mb	Helpful for short-read survey data and genome profiling.
Flow cytometry ratio	Sample 240, reference 200, standard 3,200 Mb	3,840.00 Mb	7,680.00 Mb	Best when fluorescence ratios are measured against a standard.

Formulas used

1) Coverage-based estimate

Genome size (bp) = Effective sequenced bases / Coverage depth

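Effective sequenced bases = Read count × Read length × Layout factor × Usable fraction. The layout factor is 1 when the count is single reads and 2 when it is read pairs, as in the coverage-based example in the table.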
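The coverage-based formula can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: the function name and parameter names are hypothetical, and it assumes a layout factor of 2 for read pairs and 1 for single reads, which is what the paired-read example in the table implies.

```python
def coverage_genome_size_bp(read_count, read_length_bp, coverage_depth,
                            usable_fraction=1.0, paired=False):
    """Estimate haploid genome size (bp) from sequencing yield and depth."""
    if coverage_depth <= 0:
        raise ValueError("coverage_depth must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    # Assumption: each read pair contributes two reads of read_length_bp.
    layout_factor = 2 if paired else 1
    effective_bases = read_count * read_length_bp * layout_factor * usable_fraction
    return effective_bases / coverage_depth


# Table example: 50,000,000 paired reads, 150 bp, 30x, 90% usable.
size_bp = coverage_genome_size_bp(50_000_000, 150, 30,
                                  usable_fraction=0.90, paired=True)
print(f"{size_bp / 1e6:.2f} Mb")  # 450.00 Mb
```

The usable fraction is passed as a proportion (0.90), not a percent, so a form value of 90 would need dividing by 100 first.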

2) K-mer peak estimate

Genome size (bp) = Effective k-mers / Peak depth

Effective k-mers = Total valid k-mers × Usable fraction.
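A short Python sketch of the k-mer peak formula, under the same caveat: the function and parameter names are hypothetical, and the usable fraction is assumed to be supplied as a proportion rather than a percent.

```python
def kmer_genome_size_bp(total_valid_kmers, peak_depth, usable_fraction=1.0):
    """Estimate haploid genome size (bp) from a k-mer histogram total and main peak."""
    if peak_depth <= 0:
        raise ValueError("peak_depth must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    effective_kmers = total_valid_kmers * usable_fraction
    return effective_kmers / peak_depth


# Table example: 120,000,000,000 valid k-mers, peak depth 40, 95% usable.
size_bp = kmer_genome_size_bp(120_000_000_000, 40, usable_fraction=0.95)
print(f"{size_bp / 1e6:,.2f} Mb")  # 2,850.00 Mb
```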

3) Flow cytometry ratio estimate

Genome size (bp) = (Sample fluorescence / Reference fluorescence) × Reference genome size

This scales the known reference size by the observed fluorescence ratio.
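The ratio scaling can be written as a small Python function. This is an illustrative sketch with hypothetical names; it assumes the reference size has already been converted to base pairs.

```python
def flow_genome_size_bp(sample_fluorescence, reference_fluorescence,
                        reference_size_bp):
    """Scale a known reference genome size by the sample/reference fluorescence ratio."""
    if reference_fluorescence <= 0:
        raise ValueError("reference_fluorescence must be positive")
    ratio = sample_fluorescence / reference_fluorescence
    return ratio * reference_size_bp


# Table example: sample 240, reference 200, standard 3,200 Mb.
size_bp = flow_genome_size_bp(240, 200, 3_200_000_000)
print(f"{size_bp / 1e6:,.2f} Mb")  # 3,840.00 Mb
```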

Unit conversion

1 Mb = 1,000,000 bp, 1 Gb = 1,000,000,000 bp, and 1 pg ≈ 978 Mb.
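These conversion factors can be captured in a lookup table. The sketch below uses hypothetical helper names (`to_bp`, `from_bp`) and the approximate factor 1 pg ≈ 978 Mb stated above.

```python
# Base pairs per unit; the pg entry is the approximation 1 pg ≈ 978 Mb.
BP_PER_UNIT = {
    "bp": 1,
    "Mb": 1_000_000,
    "Gb": 1_000_000_000,
    "pg": 978_000_000,
}


def to_bp(value, unit):
    """Convert a genome size in the given unit to base pairs."""
    return value * BP_PER_UNIT[unit]


def from_bp(size_bp, unit):
    """Convert a genome size in base pairs to the given unit."""
    return size_bp / BP_PER_UNIT[unit]


# A 3,840 Mb estimate expressed in picograms.
print(f"{from_bp(to_bp(3840, 'Mb'), 'pg'):.2f} pg")  # 3.93 pg
```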

How to use this calculator

1) Select the estimation method matching your experiment.
2) Enter ploidy so total nuclear DNA content can be reported correctly.
3) Fill in the method-specific measurements: coverage, k-mer peak, or fluorescence ratio values.
4) Optionally enter the assembly span in Mb to compare it with the estimated haploid genome size and judge assembly completeness.
5) Click Calculate Genome Size to display the result above the form.
6) Review the summary, detailed method table, and sensitivity graph.
7) Use the CSV or PDF buttons to save the calculation report.

FAQs

1) What does genome size mean?

Genome size is the amount of DNA in one haploid set of chromosomes. It is usually reported in base pairs, megabases, gigabases, or picograms.

2) Which method should I choose?

Choose coverage-based when sequencing yield and expected depth are known. Choose k-mer when you have histogram totals and a clear peak. Choose flow cytometry when fluorescence is measured against a reference standard.

3) Why is ploidy entered separately?

Most genome size discussions use the haploid genome. Ploidy converts that estimate into total nuclear DNA content for diploid, triploid, or higher-ploidy cells.
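Judging by the example table, where each ploidy-2 total is twice the haploid estimate, the conversion appears to be a simple multiplication. A minimal sketch, with a hypothetical function name:

```python
def total_nuclear_dna_mb(haploid_mb, ploidy):
    """Total nuclear DNA content (Mb) as haploid size times ploidy."""
    if ploidy < 1:
        raise ValueError("ploidy must be at least 1")
    return haploid_mb * ploidy


print(total_nuclear_dna_mb(450.0, 2))   # 900.0
print(total_nuclear_dna_mb(2850.0, 2))  # 5700.0
```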

4) What can make a coverage estimate inaccurate?

Bias can come from contamination, organelle reads, uneven depth, duplicated reads, poor trimming, or a wrong assumption about the fraction of bases that truly represent the target genome.

5) What can make a k-mer estimate inaccurate?

Using the wrong peak is common. Error k-mers, repeats, heterozygosity, contamination, and incomplete histogram filtering can all shift the final estimate.

6) Why compare assembly span with the estimate?

Assembly span helps you judge completeness. A much smaller assembly may indicate missing repeats, collapsed regions, contamination filtering, or conservative assembly settings.
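One simple way to express the comparison is the assembly span as a percentage of the estimated haploid size. The page does not say exactly how the calculator reports it, so the function below and the 405 Mb assembly span are illustrative assumptions.

```python
def assembly_fraction_percent(assembly_span_mb, estimated_haploid_mb):
    """Assembly span as a percentage of the estimated haploid genome size."""
    if estimated_haploid_mb <= 0:
        raise ValueError("estimated_haploid_mb must be positive")
    return 100 * assembly_span_mb / estimated_haploid_mb


# Hypothetical 405 Mb assembly against the 450 Mb coverage-based estimate.
print(f"{assembly_fraction_percent(405, 450):.1f}%")  # 90.0%
```

A value well below 100% points to the causes listed above; a value above 100% may suggest uncollapsed haplotypes or an underestimated genome size.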

7) What does the graph show?

The graph shows how the estimated genome size changes if your key assumption shifts. That key assumption is coverage depth, k-mer peak depth, or fluorescence ratio, depending on the chosen method.
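A sensitivity sweep of this kind can be sketched for the depth-based methods as below. The ±20% range and the function name are assumptions, since the page does not state which range the graph uses. For coverage and k-mer estimates the size varies inversely with depth; for flow cytometry it would instead scale in direct proportion to the fluorescence ratio.

```python
def depth_sensitivity(effective_units, depth,
                      shifts=(-0.2, -0.1, 0.0, 0.1, 0.2)):
    """Return (shifted_depth, genome_size_bp) pairs for relative shifts in depth.

    effective_units is effective sequenced bases (coverage method)
    or effective k-mers (k-mer method).
    """
    rows = []
    for shift in shifts:
        shifted_depth = depth * (1 + shift)
        rows.append((shifted_depth, effective_units / shifted_depth))
    return rows


# Coverage example: 13.5 Gb of effective bases at a nominal 30x depth.
for shifted_depth, size_bp in depth_sensitivity(13.5e9, 30):
    print(f"{shifted_depth:5.1f}x -> {size_bp / 1e6:8.2f} Mb")
```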

8) Can I report results in picograms?

Yes. The calculator automatically converts the estimate into picograms using the common approximation that 1 pg of DNA equals about 978 megabases.