Calculator
Enter P and Q as comma-separated values. Use probabilities that sum to 1, or enter counts and have them normalized automatically.
Formula used
The Kullback–Leibler divergence from P to Q is:
DKL(P || Q) = Σ pi · log(pi / qi)
where the sum runs over all bins i. A minimal code sketch appears after the notes below.
- It measures the information lost when Q is used to approximate P.
- It is not symmetric: DKL(P||Q) ≠ DKL(Q||P).
- With base-2 logs, the result is in bits.
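As promised above, here is a direct Python translation of the sum. This is a sketch for reference, not the calculator's internal code:

```python
import math

def kl_divergence(p, q, base=math.e):
    """D_KL(P || Q) = sum over bins of p_i * log(p_i / q_i)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0:
            return math.inf   # P has mass where Q assigns none
        total += pi * math.log(pi / qi, base)
    return total
```

Passing base=2 yields bits and base=10 yields dits, matching the unit notes above.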
How to use this calculator
- Enter P and Q as comma-separated lists of the same length.
- Select Probabilities or Counts as your input mode.
- If sums are not exactly 1, enable auto-normalize.
- Set epsilon smoothing to handle zeros safely.
- Press Calculate to view results and term details.
Example data table
Use these sample distributions to test calculations; a verification sketch follows the table. With natural logs, the forward divergence is about 0.0205 nats.
| Category | P | Q |
|---|---|---|
| A | 0.40 | 0.50 |
| B | 0.35 | 0.30 |
| C | 0.25 | 0.20 |
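Plugging the table values into the sum directly reproduces the quoted figure:

```python
import math

p = [0.40, 0.35, 0.25]
q = [0.50, 0.30, 0.20]

d = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
print(f"{d:.4f} nats")  # prints 0.0205
```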
Kullback–Leibler divergence in experimental modeling
Kullback–Leibler (KL) divergence quantifies how one discrete probability model departs from another. In physics workflows, it is often used to compare measured histograms with theoretical predictions, track distribution drift in simulations, and assess whether a calibrated model remains consistent across operating conditions.
What this calculator measures
This tool evaluates DKL(P || Q) by summing per-bin contributions pi·log(pi/qi). It also reports the reverse divergence DKL(Q || P), a simple symmetric sum, the entropy of P, and the cross-entropy of P under Q. These companion metrics help you diagnose whether the mismatch is broad or concentrated in specific bins.
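A sketch of how the companion metrics relate, using the example table data. Function names here are illustrative, not the calculator's internals:

```python
import math

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p, q = [0.40, 0.35, 0.25], [0.50, 0.30, 0.20]
forward  = kl(p, q)            # D_KL(P || Q)
reverse  = kl(q, p)            # D_KL(Q || P)
sym      = forward + reverse   # simple symmetric sum
print(forward, reverse, sym, entropy(p), cross_entropy(p, q))
```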
Input data and normalization
You can enter probabilities that already sum to 1, or enter counts/frequencies collected from experiments or Monte Carlo runs. When counts are selected, the calculator normalizes automatically, turning totals into comparable distributions. In probabilities mode, auto-normalize is optional, which is useful when you want strict consistency checks.
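A sketch of the counts-to-probabilities step; the eps parameter anticipates the smoothing option discussed below, and the sample counts are hypothetical:

```python
def normalize(values, eps=0.0):
    """Convert raw counts or frequencies into a probability distribution."""
    shifted = [v + eps for v in values]
    total = sum(shifted)
    return [v / total for v in shifted]

counts = [120, 105, 75]   # hypothetical histogram counts
print(normalize(counts))  # [0.4, 0.35, 0.25]
```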
Log base and physical units
The log base controls the unit of information: natural logs produce nats, base-2 produces bits, and base-10 produces dits. In practice, switching bases rescales results by a constant factor, so trends and comparisons remain valid as long as you keep the base consistent across your dataset.
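The rescaling is just division by the natural log of the new base; using the example figure from the table above:

```python
import math

d_nats = 0.0205                   # forward divergence from the example table
d_bits = d_nats / math.log(2)     # base-2: divide by ln 2 -> about 0.0296
d_dits = d_nats / math.log(10)    # base-10: divide by ln 10 -> about 0.0089
print(d_bits, d_dits)
```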
Zeros, infinities, and smoothing
If a bin has pi>0 while qi=0, the divergence becomes infinite because the ratio explodes. Real measurements can contain empty bins, so the epsilon smoothing option adds a small ε to every entry and renormalizes. This produces a finite, stable estimate while still penalizing strong disagreement.
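A sketch of the smoothing step, with illustrative data containing an empty bin in Q:

```python
import math

def smooth(dist, eps):
    """Add eps to every bin, then renormalize so the entries sum to 1."""
    shifted = [v + eps for v in dist]
    total = sum(shifted)
    return [v / total for v in shifted]

p = [0.4, 0.4, 0.2]
q = [0.5, 0.5, 0.0]   # zero bin: D_KL(P || Q) would be infinite
q_s = smooth(q, 1e-9)
d = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q_s))
print(d)  # finite but large: the empty bin is still heavily penalized
```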
Reading the per-term breakdown
The per-term table shows pi, qi, the ratio, the logarithm, and the contribution to the sum. Large positive terms indicate bins where P assigns substantially more mass than Q. This is a practical way to locate spectral regions, energy intervals, or state populations driving the mismatch.
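The breakdown is straightforward to reproduce; for the example table, a sketch like this prints the same columns:

```python
import math

p = [0.40, 0.35, 0.25]
q = [0.50, 0.30, 0.20]

print(f"{'p_i':>6} {'q_i':>6} {'ratio':>8} {'log':>9} {'term':>9}")
for pi, qi in zip(p, q):
    ratio = pi / qi
    log_r = math.log(ratio)              # natural log here
    print(f"{pi:6.2f} {qi:6.2f} {ratio:8.4f} {log_r:9.4f} {pi*log_r:9.4f}")
```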
Interpreting magnitudes and comparisons
KL divergence is non-negative and equals zero only when P and Q match exactly. There is no universal “good” threshold because the scale depends on binning, noise level, and log base. Instead, compare results across repeated runs, parameter sweeps, or time windows to identify statistically meaningful changes.
Reporting and exporting results
For documentation and reproducibility, export CSV for spreadsheets and scripts, or export a concise PDF summary for lab notes. Because the calculator also provides entropy and cross-entropy, you can quickly verify the identity DKL(P||Q)=H(P,Q)−H(P) and spot input issues early.
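The identity is easy to check numerically; a sketch using the example distributions:

```python
import math

p = [0.40, 0.35, 0.25]
q = [0.50, 0.30, 0.20]

kl      = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
h_p     = -sum(pi * math.log(pi) for pi in p)
h_cross = -sum(pi * math.log(qi) for pi, qi in zip(p, q))

assert math.isclose(kl, h_cross - h_p)   # D_KL(P||Q) = H(P,Q) - H(P)
```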
FAQs
1) Is KL divergence a distance?
No. It is not symmetric and does not satisfy the triangle inequality. It is best interpreted as a directional measure of information loss when Q is used to approximate P.
2) What if my inputs are counts, not probabilities?
Choose the counts/frequencies mode. The calculator normalizes each list by its total, converting counts into comparable probability distributions before evaluating divergence and related metrics.
3) Why do I get an infinite result?
Infinity occurs when Q has a zero in a bin where P is positive. This makes log(pi/qi) diverge. Apply epsilon smoothing or revise Q to avoid impossible events.
4) How should I pick epsilon smoothing?
Use a small value relative to typical bin probabilities, such as 1e−12 to 1e−6, depending on sample size. Larger ε improves numerical stability but damps the influence of rare bins on the result.
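A small sweep makes the trade-off concrete (the distributions here are illustrative):

```python
import math

def smooth(dist, eps):
    shifted = [v + eps for v in dist]
    total = sum(shifted)
    return [v / total for v in shifted]

p = [0.4, 0.4, 0.2]
q = [0.5, 0.5, 0.0]                     # empty bin in Q
for eps in (1e-12, 1e-9, 1e-6):
    q_s = smooth(q, eps)
    d = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q_s))
    print(f"eps={eps:.0e}  D={d:.3f}")  # D shrinks as eps grows
```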
5) What does the log base change?
Only the scale. Natural logs produce nats, base-2 produces bits, and base-10 produces dits. Trends and rankings stay consistent if you use the same base throughout your analysis.
6) Why compute DKL(Q || P) too?
Because the direction matters. DKL(P||Q) emphasizes where P assigns mass not captured by Q, while DKL(Q||P) emphasizes the reverse. Reporting both provides a fuller picture.
7) When is symmetric KL useful?
It offers a quick, order-independent summary of mismatch by adding the two directional divergences. For a true metric-like alternative, consider Jensen–Shannon divergence, which is bounded and symmetric.
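For reference, a sketch of the Jensen–Shannon construction mentioned above, built from the same KL sum:

```python
import math

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """JSD(P, Q) = 0.5*D_KL(P||M) + 0.5*D_KL(Q||M), M the midpoint mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p, q = [0.40, 0.35, 0.25], [0.50, 0.30, 0.20]
print(jensen_shannon(p, q))  # symmetric; bounded by ln 2 in nats
```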