Codon Usage Bias Calculator

Sequence Input Form

Sequence Name

Reading Frame

CAI Weight Source

Coding Sequence or FASTA

Only A, C, G, T, or U bases are used. FASTA headers and non-nucleotide characters are removed automatically.

Custom CAI Weights

Enter one codon weight per line with values from 0.001 to 1.000. Leave blank to skip CAI when custom mode is selected.

Exclude stop codons from counting and bias summaries

Example Data Table

Sample	Sequence Length	GC Content	GC3s	ENC	CAI
Gene A	900 bp	54.20%	62.50%	34.88	0.8421
Gene B	1200 bp	48.75%	45.00%	47.36	0.6914
Gene C	660 bp	61.10%	73.18%	29.55	0.9038

Formula Used

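RSCU: Relative synonymous codon usage is the observed codon count divided by the expected count within the same amino acid family, where the expected count equals the family total divided by the number of synonymous codons. A value of 1 means the codon is used exactly as often as expected under even usage; values above 1 mark preferred codons.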
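The RSCU definition above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function name `rscu`, the `exclude_stops` flag, and the use of the standard genetic code are assumptions, and the input is assumed to be cleaned, in-frame, uppercase DNA.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (assumed; other tables differ).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rscu(sequence, exclude_stops=True):
    """Return {codon: RSCU} for every family that occurs in the sequence."""
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into amino acid families.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if exclude_stops and aa == "*":
            continue
        families[aa].append(codon)

    values = {}
    for members in families.values():
        total = sum(counts[c] for c in members)
        if total == 0:
            continue  # family absent from the sequence
        expected = total / len(members)  # even usage across synonyms
        for c in members:
            values[c] = counts[c] / expected
    return values


result = rscu("GCTGCTGCTGCC")  # alanine: GCT x3, GCC x1
print(result["GCT"], result["GCC"], result["GCA"])
```

With four alanine codons spread over a four-codon family, the expected count is 1, so GCT scores 3.0, GCC scores 1.0, and the unused GCA and GCG score 0.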

GC Content: GC% = ((G + C) / total nucleotides) × 100.
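As a quick illustration of the GC% formula, the sketch below counts G and C over the whole sequence. The function name `gc_content` is hypothetical, and the input is assumed to be already cleaned to A, C, G, and T.

```python
def gc_content(sequence):
    """GC% = ((G + C) / total nucleotides) * 100 for a cleaned sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0  # avoid division by zero on empty input
    gc = seq.count("G") + seq.count("C")
    return gc / len(seq) * 100


print(gc_content("ATGC"))  # 2 of 4 bases are G or C
```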

GC3s: GC3s% = (synonymous codons ending in G or C / total analyzed codons) × 100.
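The GC3s formula can be sketched as follows. One assumption is made loudly here: "total analyzed codons" is interpreted as the synonymous codons only, so ATG (Met), TGG (Trp), and stop codons are left out of both numerator and denominator. The calculator may count differently; the function name `gc3s` is hypothetical.

```python
# Codons with no synonymous alternative, plus stops (standard genetic code).
NON_SYNONYMOUS = {"ATG", "TGG"}
STOPS = {"TAA", "TAG", "TGA"}


def gc3s(sequence):
    """Percentage of synonymous codons whose third base is G or C."""
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]
    analyzed = [
        c for c in codons
        if c not in NON_SYNONYMOUS
        and c not in STOPS
        and set(c) <= set("ACGT")  # skip codons with unexpected characters
    ]
    if not analyzed:
        return 0.0
    gc_ending = sum(1 for c in analyzed if c[2] in "GC")
    return gc_ending / len(analyzed) * 100


# ATG and TAA are ignored; GCC and GCG end in G/C, GCT does not.
print(round(gc3s("ATGGCCGCTGCGTAA"), 2))
```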

CAI: Codon adaptation index uses the geometric mean of codon weights, CAI = exp[(Σ ln w_i) / L], where w_i is each codon weight and L is codon count.
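The geometric-mean form of CAI translates directly into code. In this sketch the function name `cai` and the plain `dict` of weights are assumptions, and codons that have no weight are skipped, so L is the number of codons that actually contributed.

```python
import math


def cai(sequence, weights):
    """CAI = exp(sum(ln w_i) / L) over codons that have a weight."""
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        return None  # no weighted codons, so CAI is undefined
    return math.exp(sum(logs) / len(logs))


# Illustrative weights only; real weights come from a reference gene set.
example_weights = {"GCT": 1.0, "GCC": 0.25}
print(cai("GCTGCC", example_weights))  # geometric mean of 1.0 and 0.25
```

Summing logarithms instead of multiplying the weights avoids numerical underflow on long sequences, which is why the formula is written with exp and ln.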

ENC: Effective number of codons is estimated from average family homozygosity values using ENC = 2 + 9/F₂ + 1/F₃ + 5/F₄ + 3/F₆.
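A minimal sketch of the ENC calculation is shown below. The formula above does not say how each family's homozygosity is obtained, so this example assumes the standard Wright (1990) estimator F = (n Σ p_i² − 1) / (n − 1), where n is the number of codons observed for the amino acid and p_i is the frequency of each synonymous codon. The fallback for a missing three-fold class and the cap at 61 are also assumptions, and the function name `enc` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def enc(sequence):
    """ENC = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, or None if a class is missing."""
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    by_class = defaultdict(list)  # degeneracy (2, 3, 4, 6) -> F values
    for members in families.values():
        k = len(members)
        n = sum(counts[c] for c in members)
        if k < 2 or n < 2:
            continue  # Met, Trp, or too few codons to estimate F
        sum_p2 = sum((counts[c] / n) ** 2 for c in members)
        f = (n * sum_p2 - 1) / (n - 1)
        if f > 0:
            by_class[k].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in by_class.items()}
    # Assumed fallback: isoleucine missing, so average the F2 and F4 means.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2
    if any(k not in mean_f for k in (2, 3, 4, 6)):
        return None
    value = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(value, 61.0)


# One codon per amino acid in every class gives F = 1 everywhere.
print(enc("TTT" * 5 + "ATT" * 5 + "GCT" * 5 + "CTG" * 5))
```

The coefficients 9, 1, 5, and 3 are the numbers of amino acids in the two-, three-, four-, and six-fold classes of the standard genetic code, and the leading 2 accounts for Met and Trp.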

How to Use This Calculator

1. Enter a coding sequence or paste FASTA content into the sequence field.
2. Choose the reading frame that matches the coding region.
3. Select whether to exclude stop codons from the analysis.
4. Provide custom CAI codon weights, or use the built-in example set.
5. Press Calculate Bias to show the result above the form.
6. Review RSCU, preferred codons, GC content, GC3s, ENC, and optional CAI.
7. Export the summary and codon table with the CSV button.
8. Use the PDF button to save a print-ready report.

Frequently Asked Questions

1. What does codon usage bias measure?

It measures how unevenly synonymous codons are used in a coding sequence. Bias can reflect mutational pressure, translational efficiency, gene expression patterns, or selection on codon preference.

2. Why is RSCU useful?

RSCU normalizes codon counts within each amino acid family. That makes it easier to compare preference strength across synonymous codons, even when amino acids occur at different frequencies.

3. What does a low ENC value suggest?

A lower ENC usually indicates stronger codon bias, meaning fewer synonymous codons dominate usage. ENC ranges from 20, where each amino acid uses a single codon, to 61, where all synonymous codons are used equally, so values closer to 61 suggest weaker bias.

4. When should I enter custom CAI weights?

Enter custom weights when you have a species-specific reference set or highly expressed gene panel. That makes CAI more relevant to your organism and experimental context.
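One common way to derive such weights from a reference gene panel is relative adaptiveness: each codon's count divided by the count of the most used codon in its amino acid family. The sketch below assumes that approach, the standard genetic code, and a floor of 0.001 to match the accepted input range. The function name `reference_weights` is hypothetical and the calculator's built-in set may have been built differently.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reference_weights(reference_sequences, floor=0.001):
    """Return {codon: weight} with weight = count / max count in the family."""
    counts = Counter()
    for seq in reference_sequences:
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if CODON_TABLE.get(codon, "*") != "*":  # skip stops and unknowns
                counts[codon] += 1

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    weights = {}
    for members in families.values():
        if len(members) < 2:
            continue  # Met and Trp carry no synonymous information
        top = max(counts[c] for c in members)
        if top == 0:
            continue  # family not seen in the reference set
        for c in members:
            # Floor unseen codons so that ln(w) stays defined in CAI.
            weights[c] = max(counts[c] / top, floor)
    return weights


weights = reference_weights(["GCTGCTGCTGCTGCC"])
for codon in ("GCT", "GCC", "GCA"):
    print(codon, weights[codon])
```

Each resulting pair can be entered as one codon weight per line in the custom weights field.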

5. Does the calculator accept RNA sequences?

Yes. Uracil is automatically converted to thymine during preprocessing. FASTA headers, spaces, line breaks, and non-nucleotide characters are removed before codon counting begins.
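The preprocessing described above could look roughly like this. It is a sketch under assumptions, not the page's actual code: the function name `clean_sequence` is hypothetical, and ambiguity codes such as N are treated as non-nucleotide characters and dropped.

```python
import re


def clean_sequence(text):
    """Drop FASTA headers, convert U to T, and keep only A, C, G, T."""
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith(">")]
    seq = "".join(lines).upper().replace("U", "T")
    return re.sub(r"[^ACGT]", "", seq)


raw = ">gene A\nauggcc nnn\nUAA-1"
print(clean_sequence(raw))  # header, spaces, N, '-' and digits are removed
```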

6. Why might trimmed bases appear in the result?

Trimmed bases appear when the usable sequence length is not divisible by three after the selected reading frame offset. Incomplete trailing bases are excluded from codon analysis.
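The trimming rule can be illustrated with a short sketch. It assumes reading frames are numbered 1 to 3 and reports only the incomplete trailing bases as trimmed; whether the calculator also counts the skipped leading bases is not stated. The function name `split_codons` is hypothetical.

```python
def split_codons(sequence, frame=1):
    """Return (codons, trimmed) where trimmed is the count of leftover bases."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    usable = sequence[frame - 1:]      # apply the reading frame offset
    trimmed = len(usable) % 3          # incomplete trailing bases
    end = len(usable) - trimmed
    codons = [usable[i:i + 3] for i in range(0, end, 3)]
    return codons, trimmed


print(split_codons("ATGGCCA", frame=1))  # 7 bases: two codons, one base left
print(split_codons("ATGGCCA", frame=2))  # 6 usable bases: nothing trimmed
```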

7. Can I compare multiple genes with this page?

Yes, but analyze one sequence at a time and export each output. The example table can help you organize and compare summary metrics across several genes.

8. Is the built-in CAI weight set universal?

No. It is a practical example set for testing. For biological interpretation, replace it with codon weights derived from a trusted reference dataset for your target organism.