Genotype Posterior Probability Example Calculator

Calculator Inputs

Reference allele

Alternate allele

Prior mode

Alternate allele frequency q

Used for HWE priors: (1-q)², 2q(1-q), q².

Manual prior AA

Manual prior AG

Manual prior GG

Likelihood mode

Reference read count

Alternate read count

Sequencing error rate

Manual likelihood AA

Manual likelihood AG

Manual likelihood GG

Example Data Table

Example	Reference Reads	Alternate Reads	Error Rate	Alt Allele Frequency (q)	Expected Result
Balanced site	42	18	0.01	0.20	Often supports heterozygous genotype.
Mostly reference	58	1	0.01	0.10	Usually supports reference homozygote.
Mostly alternate	2	55	0.01	0.35	Usually supports alternate homozygote.

Formula Used

Bayes formula: P(G | D) = [P(D | G) × P(G)] ÷ Σ[P(D | Gᵢ) × P(Gᵢ)]

Hardy-Weinberg priors: P(AA) = (1-q)², P(AG) = 2q(1-q), P(GG) = q²

Read likelihood: P(k alt reads | n reads, G) = C(n,k) × pᵏ × (1-p)ⁿ⁻ᵏ

Read probabilities: p is the chance that a single read shows the alternate allele. AA uses p = error rate, AG uses p = 0.5, and GG uses p = 1 - error rate.

Log values are used internally to reduce underflow when read depth is high or likelihoods are tiny.
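The formulas above can be combined into a short log-space sketch. This is an illustration, not the calculator's actual source: the function names `log_binomial` and `genotype_posteriors` are hypothetical, and the code assumes 0 < q < 1 and 0 < error rate < 1 so that every logarithm is defined.

```python
import math

def log_binomial(k, n, p):
    """Log of C(n,k) * p^k * (1-p)^(n-k), without forming tiny products."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)

def genotype_posteriors(ref_reads, alt_reads, error_rate, q):
    """Return [P(AA|D), P(AG|D), P(GG|D)] from HWE priors and read counts."""
    n = ref_reads + alt_reads
    priors = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]   # Hardy-Weinberg priors
    alt_read_prob = [error_rate, 0.5, 1 - error_rate]  # p for AA, AG, GG
    log_scores = [
        math.log(prior) + log_binomial(alt_reads, n, p)
        for prior, p in zip(priors, alt_read_prob)
    ]
    # Subtract the largest log score before exponentiating (log-sum-exp trick)
    peak = max(log_scores)
    weights = [math.exp(score - peak) for score in log_scores]
    total = sum(weights)
    return [w / total for w in weights]

# The three rows of the example data table: (ref, alt, error rate, q)
for row in [(42, 18, 0.01, 0.20), (58, 1, 0.01, 0.10), (2, 55, 0.01, 0.35)]:
    print(row, genotype_posteriors(*row))
```

Subtracting the largest log score before exponentiating keeps at least one weight equal to 1, so the normalization cannot divide by zero even when the raw likelihoods would underflow.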

How to Use This Calculator

Enter reference and alternate allele labels.
Select Hardy-Weinberg priors or manual priors.
Enter allele frequency when using Hardy-Weinberg mode.
Select read count likelihoods or manual likelihoods.
Enter read counts and error rate for sequencing examples.
Press calculate and review the posterior table.
Use the chart to compare genotype certainty.
Export the result as CSV or PDF for reports.

Understanding Genotype Posterior Probability

Genotype posterior probability is a Bayesian answer to a simple question: after seeing the data, which genotype is most credible? The calculator combines a prior belief with evidence from read counts or manual likelihoods. It multiplies each genotype's prior by its likelihood, then normalizes the three products so the posterior probabilities sum to one.

Why Priors Matter

A prior represents knowledge held before the current evidence is seen. In genetics, Hardy-Weinberg priors are common for a biallelic marker; they depend only on the alternate allele frequency q. The calculator also supports manual priors, which help when pedigree information, ancestry adjustment, or external study data is available. Manual priors must be nonnegative, and they are normalized to sum to one before use.
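As a rough sketch of the two prior modes, the snippet below builds Hardy-Weinberg priors from q and normalizes a set of manual weights. The function names are hypothetical, and rejecting an all-zero input is an assumption about how invalid priors might reasonably be handled.

```python
def hwe_priors(q):
    """Hardy-Weinberg priors [P(AA), P(AG), P(GG)] for alternate frequency q."""
    return [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]

def normalize_priors(weights):
    """Scale nonnegative manual priors so they sum to one."""
    if any(w < 0 for w in weights):
        raise ValueError("priors must be nonnegative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one prior must be positive")
    return [w / total for w in weights]

print(hwe_priors(0.20))             # priors driven by allele frequency
print(normalize_priors([2, 1, 1]))  # manual weights given on any scale
```

Because manual priors are rescaled, entering 2, 1, 1 has the same effect as entering 0.5, 0.25, 0.25.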

How Evidence Changes the Answer

Read evidence is modeled with a binomial likelihood. A homozygous reference genotype should produce very few alternate reads, a heterozygous genotype about half, and a homozygous alternate genotype almost all. A nonzero error rate keeps the homozygous likelihoods above zero, so a few unexpected reads cannot rule a genotype out entirely. Strong evidence can override a weak prior.
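To see why the error rate matters, the sketch below evaluates the binomial likelihood of the reference homozygote for 1 alternate read out of 59, first with no error allowance and then with a 1% error rate. The helper name `binomial_likelihood` is hypothetical, and plain (non-log) arithmetic is used only because the depth here is small.

```python
import math

def binomial_likelihood(k, n, p):
    """P(k alt reads | n reads) when each read is alternate with probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# One alternate read among 59 reads, scored under the AA genotype
no_error = binomial_likelihood(1, 59, 0.0)     # AA with p = 0
with_error = binomial_likelihood(1, 59, 0.01)  # AA with p = error rate

print("AA likelihood, error rate 0:   ", no_error)
print("AA likelihood, error rate 0.01:", with_error)
```

With p = 0 the single alternate read drives the AA likelihood to exactly zero, so no prior could recover that genotype. With p = 0.01 the same observation stays plausible under AA.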

Using Manual Likelihoods

Manual likelihood mode is useful for class notes, lab examples, or imported genotype likelihoods. Enter a likelihood for each genotype. The numbers do not need to sum to one; they only need to be on the same scale. The calculator multiplies each likelihood by its prior, then divides each product by the sum of all three products (the total evidence).
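The scale independence described here can be checked with a small sketch. The function name `posterior_from_manual` and the example numbers are made up for illustration; the priors are the Hardy-Weinberg values for q = 0.20.

```python
def posterior_from_manual(priors, likelihoods):
    """Multiply each likelihood by its prior, then divide by the total evidence."""
    products = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(products)
    if evidence <= 0:
        raise ValueError("total evidence must be positive")
    return [x / evidence for x in products]

priors = [0.64, 0.32, 0.04]  # HWE priors for q = 0.20
small_scale = posterior_from_manual(priors, [1, 20, 2])
large_scale = posterior_from_manual(priors, [10, 200, 20])  # same ratios, x10

print(small_scale)
print(large_scale)
```

Both calls give the same posterior up to rounding, because a common factor in the likelihoods cancels when dividing by the total evidence.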

Interpreting the Output

The MAP genotype is the genotype with the highest posterior probability. The call quality summarizes how confident that call is: higher quality means a lower chance that the call is wrong. Entropy is another uncertainty measure; it falls toward zero when one genotype dominates. The expected alternate dosage estimates the number of alternate allele copies, between 0 and 2. It is useful for association models and teaching examples.
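These summaries can be sketched from a posterior vector as follows. The calculator's exact quality formula and entropy units are not stated here, so this example assumes a Phred-style scale, -10 × log10(1 - max posterior), and entropy measured in bits. The function name `summarize` and the AA/AG/GG labels are illustrative.

```python
import math

def summarize(posteriors, labels=("AA", "AG", "GG")):
    """Summarize [P(AA), P(AG), P(GG)] as MAP call, quality, entropy, dosage."""
    best = max(range(len(posteriors)), key=lambda i: posteriors[i])
    error_prob = 1 - posteriors[best]  # chance that the MAP call is wrong
    # Assumed Phred-style quality; infinite when the call is certain
    quality = -10 * math.log10(error_prob) if error_prob > 0 else math.inf
    # Shannon entropy in bits (assumed base 2); zero-probability terms add 0
    entropy = -sum(p * math.log2(p) for p in posteriors if p > 0)
    # Expected alternate dosage: P(heterozygote) + 2 * P(alternate homozygote)
    dosage = posteriors[1] + 2 * posteriors[2]
    return {"map": labels[best], "quality": quality,
            "entropy_bits": entropy, "dosage": dosage}

print(summarize([0.05, 0.90, 0.05]))
```

Under these assumptions a posterior spread evenly across the three genotypes gives the maximum entropy of log2(3) bits, and a fully certain call gives zero entropy.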

Good Practice

Use realistic priors and check read depth; avoid trusting a result based on very few reads. Compare manual likelihoods with read-based likelihoods when possible. Use the chart to see uncertainty quickly, and export results for reports. The calculator is educational. Real clinical or forensic work needs validated pipelines and expert review.

For teaching, try changing one input at a time, since small changes show how sensitive the posterior is. Greater read depth usually sharpens the posterior. Balanced reads often support heterozygosity. Extreme imbalance often supports a homozygous call.

FAQs

What is genotype posterior probability?

It is the probability of each genotype after combining prior information with observed data. The three probabilities are normalized, so they add to one.

What does MAP genotype mean?

MAP means maximum a posteriori. It is the genotype with the highest posterior probability after Bayes normalization.

When should I use Hardy-Weinberg priors?

Use them when a biallelic allele frequency is known and the population assumptions are reasonable for an educational or exploratory example.

Do manual likelihoods need to sum to one?

No. Likelihoods only need to be comparable across genotypes. The calculator normalizes the final posterior probabilities automatically.

Why is the error rate important?

The error rate keeps likelihoods from being exactly zero, which would make a genotype impossible no matter what the prior says. It also accounts for small numbers of unexpected reads under homozygous genotypes.

What is expected alternate dosage?

It is P(heterozygote) plus two times P(alternate homozygote). It estimates the expected number of alternate allele copies.

What does entropy show?

Entropy measures uncertainty. Low entropy means one genotype dominates. Higher entropy means posterior probability is spread across genotypes.

Can this replace validated genetics software?

No. It is designed for learning, demonstrations, and simple statistical examples. Formal research or clinical work needs validated tools.