Example Data Table
This sample compares a fair die (Distribution P, each outcome 1/6 ≈ 0.1667) to a die biased toward higher outcomes (Distribution Q).
| Outcome | Distribution P | Distribution Q |
|---|---|---|
| Outcome 1 | 0.1667 | 0.10 |
| Outcome 2 | 0.1667 | 0.12 |
| Outcome 3 | 0.1667 | 0.14 |
| Outcome 4 | 0.1667 | 0.16 |
| Outcome 5 | 0.1667 | 0.20 |
| Outcome 6 | 0.1667 | 0.28 |
Formula Used
Total variation distance for two discrete distributions is:
TVD(P, Q) = 1/2 × Σ |pᵢ − qᵢ|
Here, pᵢ and qᵢ are probabilities for the same category.
If normalization is enabled, the calculator first converts raw inputs into probabilities:
pᵢ = aᵢ / Σaᵢ and qᵢ = bᵢ / Σbᵢ
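For valid probability distributions, the result lies between 0 and 1. Zero means the distributions are identical. One means complete separation: no category has positive probability under both distributions.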
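As a rough illustration, the formula can be sketched in Python and applied to the sample table. The function names `normalize` and `total_variation_distance` are illustrative and are not taken from the calculator itself.

```python
def normalize(weights):
    """Convert non-negative raw weights into probabilities that sum to one."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def total_variation_distance(p, q):
    """TVD(P, Q) = 1/2 * sum(|p_i - q_i|) over the shared categories."""
    if len(p) != len(q):
        raise ValueError("both distributions need the same categories")
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Values from the example table: fair die (P) and biased die (Q).
# The rounded 0.1667 entries sum to 1.0002, so normalizing restores 1/6 each.
fair = normalize([0.1667] * 6)
biased = normalize([0.10, 0.12, 0.14, 0.16, 0.20, 0.28])

tvd = total_variation_distance(fair, biased)
print(round(tvd, 4))  # 0.1467
```

For the sample dice, about 14.7% of the probability mass would have to move to turn one distribution into the other.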
How to Use This Calculator
- Enter one shared category per row.
- Fill both distributions with probabilities or raw weights.
- Keep normalization checked for counts or frequencies.
- Turn normalization off only when both columns already sum to one.
- Choose the decimal precision you want.
- Press Calculate Distance.
- Review the summary metrics, detailed table, and Plotly graph.
- Download the output as CSV or PDF if needed.
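The steps above can be approximated in a short Python sketch. The function `calculate_distance`, its parameters, and the `1e-6` tolerance are assumptions made for illustration; the calculator's own validation rules are not documented here.

```python
import math


def calculate_distance(col_a, col_b, normalize=True, precision=4):
    """Validate two columns, optionally normalize them, and return rounded TVD."""
    a, b = list(col_a), list(col_b)
    if len(a) != len(b):
        raise ValueError("each row needs a value in both columns")
    if any(v < 0 for v in a + b):
        raise ValueError("entries must be non-negative")

    def to_probabilities(col):
        total = sum(col)
        if total <= 0:
            raise ValueError("each column needs a positive total")
        return [v / total for v in col]

    if normalize:
        # Counts, frequencies, or weights: rescale each column to sum to one.
        p, q = to_probabilities(a), to_probabilities(b)
    else:
        # Assumed rule: with normalization off, columns must already sum to one.
        for col in (a, b):
            if not math.isclose(sum(col), 1.0, abs_tol=1e-6):
                raise ValueError("column does not sum to one; enable normalization")
        p, q = a, b

    tvd = 0.5 * sum(abs(x - y) for x, y in zip(p, q))
    return round(tvd, precision)


print(calculate_distance([10, 30, 60], [25, 25, 50]))  # 0.15
```

Under this strict tolerance, the sample table's P column (six entries of 0.1667, total 1.0002) would be rejected with normalization off, which is one reason to leave normalization checked for rounded inputs.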
Frequently Asked Questions
1. What does total variation distance measure?
It measures how different two probability distributions are across the same categories. Larger values mean stronger separation. A value of zero means the distributions match exactly.
2. What range can the result take?
For valid probability distributions, the total variation distance ranges from 0 to 1. Zero means identical distributions. One means the distributions place mass on disjoint categories.
3. Why is there a one-half factor in the formula?
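Because both distributions sum to one, any probability that P holds in excess on some categories is matched by an equal excess of Q on others. Summing the absolute differences therefore counts the displaced mass twice. Multiplying by one-half scales the L1 difference into the standard total variation distance definition.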
4. Should I enable normalization?
Enable normalization when your entries are counts, frequencies, or weights. Disable it only when both columns already represent valid probabilities that each sum exactly to one.
5. Can I compare distributions with different totals?
Yes, if normalization is enabled. The calculator rescales each column into probabilities before computing the distance, so columns with different raw totals can be compared directly.
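A small Python sketch of this point, using made-up counts (the survey numbers and the helper name `tvd_from_counts` are hypothetical): two columns with totals of 200 and 50 but identical proportions give a distance of zero, while a column with different proportions does not.

```python
def tvd_from_counts(counts_a, counts_b):
    """Normalize two columns of raw counts, then compute total variation distance."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    p = [c / total_a for c in counts_a]
    q = [c / total_b for c in counts_b]
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


survey_a = [120, 60, 20]  # 200 responses, proportions 0.6 / 0.3 / 0.1
survey_b = [30, 15, 5]    # 50 responses, same proportions
survey_c = [10, 15, 25]   # 50 responses, proportions 0.2 / 0.3 / 0.5

print(round(tvd_from_counts(survey_a, survey_b), 4))  # 0.0
print(round(tvd_from_counts(survey_a, survey_c), 4))  # 0.4
```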
6. What does the overlap coefficient show?
It shows the shared probability mass between the two normalized distributions. Higher overlap means the distributions place similar weight on the same categories.
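One common way to compute shared probability mass is to sum the smaller of the two probabilities in each category. The sketch below assumes that definition, which may differ from the calculator's internal one; under it, the overlap equals one minus the total variation distance. The function names are illustrative.

```python
def overlap(p, q):
    """Shared mass: sum of the smaller probability in each category (assumed definition)."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))


def total_variation_distance(p, q):
    """TVD(P, Q) = 1/2 * sum(|p_i - q_i|)."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Fair die versus the biased die from the example table.
fair = [1 / 6] * 6
biased = [0.10, 0.12, 0.14, 0.16, 0.20, 0.28]

shared = overlap(fair, biased)
distance = total_variation_distance(fair, biased)

print(round(shared, 4))        # 0.8533
print(round(1 - distance, 4))  # 0.8533
```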
7. What does the chart help me see?
The chart compares normalized probabilities for each category and shows the absolute gap series. It quickly highlights where the largest differences occur.
8. Where is this metric useful?
It is useful in statistics, machine learning, quality checks, simulations, risk analysis, survey comparison, and model validation whenever category-level probability shifts matter.