Mutual Information Neural Estimation Calculator

Calculator Inputs

Use critic outputs from your neural estimator. Joint values come from matched pairs. Marginal values come from shuffled or independent pairs.

Joint critic scores

Comma, space, or line separated numbers.

Marginal critic scores

Use shuffled or independent critic outputs.

Optional EMA baseline

Positive value only.

EMA smoothing factor

Usually between 0.01 and 0.20.

Confidence z-score

1.96 gives an approximate 95% interval.

Primary output unit

Both units are still shown in results.
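
The score fields accept comma, space, or line separated numbers. A minimal sketch of how such input could be parsed is shown here; the function name parse_scores is hypothetical and is not taken from the calculator's own code.

```python
import re

def parse_scores(text):
    """Split a pasted string on commas and/or whitespace and convert to floats."""
    # Commas, spaces, tabs, and newlines all count as separators.
    tokens = re.split(r"[,\s]+", text.strip())
    # Skip empty tokens left behind by leading or trailing separators.
    return [float(tok) for tok in tokens if tok]

joint_scores = parse_scores("0.95, 1.21\n1.10 1.36")
print(joint_scores)
```

A non-numeric token raises ValueError from float(), which is one reasonable way to surface bad input early.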

Example Data Table

This example matches the quick-load button above.

Sample	Joint score	Marginal score	exp(Marginal)	Pair gap (joint − marginal)
1	0.95	0.22	1.2461	0.73
2	1.21	0.35	1.4191	0.86
3	1.10	0.27	1.3100	0.83
4	1.36	0.41	1.5068	0.95
5	1.18	0.33	1.3910	0.85
6	1.44	0.38	1.4623	1.06

Formula Used

The calculator applies the Mutual Information Neural Estimation lower bound:

I(X;Z) ≥ E_joint[T_θ(x,z)] − log(E_marginal[exp(T_θ(x,z′))])

Here, T_θ is the critic network output. The joint batch uses matched pairs. The marginal batch uses shuffled or independent pairs.
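
As a worked check, the bound can be evaluated on the six example rows from the table above. The sketch below uses only the Python standard library; the function name mine_lower_bound is illustrative and not part of the calculator.

```python
import math

# Example scores copied from the Example Data Table.
joint = [0.95, 1.21, 1.10, 1.36, 1.18, 1.44]
marginal = [0.22, 0.35, 0.27, 0.41, 0.33, 0.38]

def mine_lower_bound(joint_scores, marginal_scores):
    """Return mean(joint) - log(mean(exp(marginal))), in nats."""
    joint_mean = sum(joint_scores) / len(joint_scores)
    exp_mean = sum(math.exp(t) for t in marginal_scores) / len(marginal_scores)
    return joint_mean - math.log(exp_mean)

mi_nats = mine_lower_bound(joint, marginal)
print(f"MINE lower bound: {mi_nats:.4f} nats")
```

For the example data this comes out at roughly 0.88 nats: the joint mean is about 1.207 and the log of the mean exponentiated marginal score is about 0.329.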

A stabilized version is also shown. It replaces the direct marginal exponential mean with an exponential moving average baseline. That can reduce noise when batches fluctuate strongly.
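
The exact update rule behind the stabilized estimate is not spelled out here, so the following is only a sketch under stated assumptions: the EMA baseline tracks the mean of exp(marginal score), it is updated as (1 − α)·baseline + α·batch mean, and it falls back to the batch mean when no baseline is supplied. The names ema_stabilized_estimate, alpha, and ema_baseline are hypothetical.

```python
import math

def ema_stabilized_estimate(joint_scores, marginal_scores, alpha=0.1, ema_baseline=None):
    """Return (estimate_in_nats, updated_baseline) under the assumed EMA rule."""
    joint_mean = sum(joint_scores) / len(joint_scores)
    batch_exp_mean = sum(math.exp(t) for t in marginal_scores) / len(marginal_scores)
    if ema_baseline is None:
        # Assumption: with no prior baseline, start from the current batch mean.
        ema = batch_exp_mean
    else:
        if ema_baseline <= 0:
            raise ValueError("EMA baseline must be positive")
        # Assumption: standard exponential moving average update.
        ema = (1.0 - alpha) * ema_baseline + alpha * batch_exp_mean
    # The smoothed baseline stands in for the direct marginal exponential mean.
    return joint_mean - math.log(ema), ema

joint = [0.95, 1.21, 1.10, 1.36, 1.18, 1.44]
marginal = [0.22, 0.35, 0.27, 0.41, 0.33, 0.38]
stabilized, new_baseline = ema_stabilized_estimate(joint, marginal, alpha=0.1, ema_baseline=1.5)
print(f"stabilized estimate: {stabilized:.4f} nats, baseline: {new_baseline:.4f}")
```

With a small smoothing factor the baseline moves slowly, so a single unusual marginal batch shifts the estimate less than it would in the raw formula.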

The confidence interval is an approximate delta-method interval. It is useful for quick screening, not formal inference.
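
One common way to build such a delta-method interval is sketched below. It assumes the joint and marginal batches are independent and uses sample variances; the calculator's own variance formula may differ in detail, and the function name delta_method_interval is hypothetical.

```python
import math
import statistics

def delta_method_interval(joint_scores, marginal_scores, z=1.96):
    """Return (estimate, lower, upper) in nats using a first-order delta method."""
    exp_marginal = [math.exp(t) for t in marginal_scores]
    joint_mean = statistics.mean(joint_scores)
    exp_mean = statistics.mean(exp_marginal)
    estimate = joint_mean - math.log(exp_mean)
    # Variance of the joint mean.
    var_joint_term = statistics.variance(joint_scores) / len(joint_scores)
    # Delta method: Var(log(m)) is approximately Var(m) / m^2 for a sample mean m.
    var_log_term = statistics.variance(exp_marginal) / (len(exp_marginal) * exp_mean ** 2)
    std_error = math.sqrt(var_joint_term + var_log_term)
    return estimate, estimate - z * std_error, estimate + z * std_error

joint = [0.95, 1.21, 1.10, 1.36, 1.18, 1.44]
marginal = [0.22, 0.35, 0.27, 0.41, 0.33, 0.38]
est, lower, upper = delta_method_interval(joint, marginal, z=1.96)
print(f"{est:.4f} nats, approx. interval [{lower:.4f}, {upper:.4f}]")
```

With only six scores per batch the normal approximation is rough, which is one reason the interval suits screening rather than formal inference.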

How to Use This Calculator

Collect critic outputs from your trained or partially trained estimator.
Paste matched-pair scores into the joint field.
Paste shuffled or independent scores into the marginal field.
Enter a smoothing factor for EMA stabilization.
Optionally provide your own EMA baseline.
Select nats or bits as the main reporting unit.
Submit the form to view results above it.
Download CSV or PDF for reporting and review.

Frequently Asked Questions

1) What does this calculator estimate?

It estimates a lower bound on mutual information using critic outputs from a MINE-style model. It does not train the network itself.

2) What are joint critic scores?

They come from matched samples, such as paired observations from the true joint distribution. Joint scores that are large relative to the marginal scores often indicate stronger learned dependence.

3) What are marginal critic scores?

They come from shuffled or independent pairings. They approximate the product of marginals used in the log-mean-exponential term of the MINE lower bound.

4) Why show both raw and stabilized estimates?

Raw estimates react faster. Stabilized estimates use an EMA baseline and can be less noisy across uneven mini-batches.

5) Should I report nats or bits?

Both are valid. Nats use natural logarithms. The value in bits is the value in nats divided by ln(2), which many practitioners find easier to interpret.
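
The conversion is a single division. A minimal sketch, with nats_to_bits as an illustrative helper name:

```python
import math

def nats_to_bits(value_nats):
    """Convert an information quantity from nats to bits."""
    return value_nats / math.log(2)

# ln(2) nats is exactly one bit.
print(nats_to_bits(math.log(2)))
```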

6) Does a negative estimate mean no dependence exists?

Not necessarily. It can reflect noisy critic outputs, weak training, poor batch construction, or limited sample size.

7) Are unequal batch lengths allowed?

Yes. Summary statistics use all entered values. The detailed table and chart use the minimum matched count for side-by-side display.
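
The sketch below illustrates that behavior as described: the means use every entered value, while the side-by-side rows stop at the shorter batch. The helper name paired_rows and the sample numbers are made up for illustration.

```python
def paired_rows(joint_scores, marginal_scores):
    """Build (sample, joint, marginal, gap) rows up to the shorter batch length."""
    n = min(len(joint_scores), len(marginal_scores))
    return [
        (i + 1, joint_scores[i], marginal_scores[i], joint_scores[i] - marginal_scores[i])
        for i in range(n)
    ]

joint = [0.95, 1.21, 1.10, 1.36]   # four joint scores
marginal = [0.22, 0.35, 0.27]      # three marginal scores

# Summary statistics use all entered values in each batch.
joint_mean = sum(joint) / len(joint)
marginal_mean = sum(marginal) / len(marginal)

# The detailed table only has as many rows as the shorter batch.
rows = paired_rows(joint, marginal)
print(len(rows), joint_mean, marginal_mean)
```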

8) Is the confidence interval exact?

No. It is an approximation based on sample variability and the delta method. Use bootstrap methods for deeper uncertainty analysis.
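
A percentile bootstrap is one such method. The sketch below resamples the joint and marginal batches independently with replacement; the function names, the resample count, and the fixed seed are assumptions chosen for illustration, not settings from the calculator.

```python
import math
import random

def mine_estimate(joint_scores, marginal_scores):
    """MINE lower bound in nats: mean(joint) - log(mean(exp(marginal)))."""
    joint_mean = sum(joint_scores) / len(joint_scores)
    exp_mean = sum(math.exp(t) for t in marginal_scores) / len(marginal_scores)
    return joint_mean - math.log(exp_mean)

def bootstrap_interval(joint_scores, marginal_scores, n_resamples=2000, level=0.95, seed=0):
    """Return a percentile bootstrap interval (lower, upper) for the estimate."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(n_resamples):
        # Resample each batch with replacement, keeping its original size.
        j = rng.choices(joint_scores, k=len(joint_scores))
        m = rng.choices(marginal_scores, k=len(marginal_scores))
        estimates.append(mine_estimate(j, m))
    estimates.sort()
    lower_idx = int((1 - level) / 2 * n_resamples)
    upper_idx = int((1 + level) / 2 * n_resamples) - 1
    return estimates[lower_idx], estimates[upper_idx]

joint = [0.95, 1.21, 1.10, 1.36, 1.18, 1.44]
marginal = [0.22, 0.35, 0.27, 0.41, 0.33, 0.38]
boot_lower, boot_upper = bootstrap_interval(joint, marginal)
print(f"bootstrap interval: [{boot_lower:.4f}, {boot_upper:.4f}] nats")
```

Very small batches limit what a bootstrap can reveal, so collecting more critic outputs usually helps more than increasing the resample count.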