Maximum Likelihood Missing Data Calculator

Model incomplete datasets with stable likelihood estimates. Compare observed cases, imputed means, and uncertainty metrics. Make defensible decisions using transparent statistical outputs and visuals.

Calculator Input


Dataset: separate values with commas, new lines, semicolons, or tabs.
Missing-value tokens: for example NA, ?, null, or missing.
Variable label: an optional label for your numeric variable.

Example Data Table

This example shows how missing entries are recorded before estimation.

Index  Score  Status
1      12.5   Observed
2      14.1   Observed
3      NA     Missing
4      15.4   Observed
5      13.8   Observed
6      ?      Missing
7      16.2   Observed
8      14.9   Observed
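
As a rough illustration of how such input might be read, the sketch below splits a raw string on the separators listed in the form help text and maps missing-value tokens to None. The function name parse_series and the token set are assumptions made for this example, not the calculator's actual code. Consecutive separators are collapsed, so an empty field is skipped rather than counted as missing.

```python
import re

# Assumed token set, taken from the examples in the form help text.
MISSING_TOKENS = {"na", "?", "null", "missing"}

def parse_series(text, missing_tokens=MISSING_TOKENS):
    """Split raw input into floats, using None for missing entries."""
    values = []
    # Separators: commas, semicolons, tabs, and new lines.
    for token in re.split(r"[,;\t\n]+", text):
        token = token.strip()
        if not token:
            continue  # skip blanks left by leading or trailing separators
        if token.lower() in missing_tokens:
            values.append(None)
        else:
            values.append(float(token))  # raises ValueError on bad input
    return values

# The example table above, written with a mix of separators.
raw = "12.5, 14.1, NA, 15.4; 13.8\n?\t16.2, 14.9"
series = parse_series(raw)
observed = [v for v in series if v is not None]
n_mis = len(series) - len(observed)
print(series, len(observed), n_mis)
```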

Formula Used

This calculator fits a univariate normal model with incomplete data. It maximizes the observed-data likelihood and uses EM updates for the missing values.

Observed-data log-likelihood

ℓ(μ, σ² | y_obs) = -(n_obs / 2) ln(2πσ²) - [1 / (2σ²)] Σ(y_i - μ)²
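
The log-likelihood above can be evaluated directly from the observed values. The following is a minimal sketch; the function name observed_log_likelihood is hypothetical, and the sum runs over observed cases only.

```python
import math

def observed_log_likelihood(y_obs, mu, sigma2):
    """Normal log-likelihood of the observed values at (mu, sigma2)."""
    n_obs = len(y_obs)
    squared_dev = sum((y - mu) ** 2 for y in y_obs)
    return (-(n_obs / 2) * math.log(2 * math.pi * sigma2)
            - squared_dev / (2 * sigma2))

# Observed scores from the example table.
y_obs = [12.5, 14.1, 15.4, 13.8, 16.2, 14.9]
mu_hat = sum(y_obs) / len(y_obs)
sigma2_hat = sum((y - mu_hat) ** 2 for y in y_obs) / len(y_obs)
print(observed_log_likelihood(y_obs, mu_hat, sigma2_hat))
```

Evaluating the function away from the observed-case mean and variance should give a lower value, which is a quick sanity check on any fitted estimates.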

E-step

E[y_mis | θ(t)] = μ(t),   E[y_mis² | θ(t)] = σ²(t) + μ(t)²

M-step

μ(t+1) = [Σy_obs + n_mis μ(t)] / n

σ²(t+1) = {Σy_obs² + n_mis[σ²(t) + μ(t)²]} / n - μ(t+1)²
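
The E-step and M-step above can be sketched as a short loop. This is an illustrative version under assumed defaults: the function name em_normal, the starting values, the tolerance, and the stopping rule (largest absolute change in either parameter) are choices made for this example and may differ from what the calculator does.

```python
def em_normal(y_obs, n_mis, mu0=0.0, sigma2_0=1.0, tol=1e-10, max_iter=1000):
    """EM for a univariate normal with n_mis missing entries."""
    n_obs = len(y_obs)
    n = n_obs + n_mis
    sum_y = sum(y_obs)                   # Σ y_obs
    sum_y2 = sum(y * y for y in y_obs)   # Σ y_obs²
    mu, sigma2 = mu0, sigma2_0
    for iteration in range(1, max_iter + 1):
        # E-step: expected first and second moments of a missing value.
        e_y = mu
        e_y2 = sigma2 + mu * mu
        # M-step: update the parameters from the completed sufficient statistics.
        mu_new = (sum_y + n_mis * e_y) / n
        sigma2_new = (sum_y2 + n_mis * e_y2) / n - mu_new ** 2
        change = max(abs(mu_new - mu), abs(sigma2_new - sigma2))
        mu, sigma2 = mu_new, sigma2_new
        if change < tol:
            break  # assumed convergence rule for this sketch
    return mu, sigma2, iteration

# Example table: six observed scores and two missing entries.
y_obs = [12.5, 14.1, 15.4, 13.8, 16.2, 14.9]
mu_hat, sigma2_hat, n_iter = em_normal(y_obs, n_mis=2)
print(mu_hat, sigma2_hat, n_iter)
```

Setting μ(t+1) = μ(t) and σ²(t+1) = σ²(t) in the updates shows that the fixed point is the observed-case mean and the observed-case variance with denominator n_obs, so the loop should agree with computing those two quantities directly.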

Approximate confidence interval for the mean

CI = μ ± z × (σ / √n_obs)
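
As a small sketch of the interval formula, the code below takes z from the standard normal quantile function in Python's standard library. The function name mean_confidence_interval and the two-sided reading of the confidence level are assumptions for this example.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mu, sigma2, n_obs, level=0.95):
    """Approximate normal-based interval: mu ± z * sigma / sqrt(n_obs)."""
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(sigma2 / n_obs)
    return mu - half_width, mu + half_width

# Illustrative inputs only: mean 0, variance 1, four observed cases.
lower, upper = mean_confidence_interval(mu=0.0, sigma2=1.0, n_obs=4)
print(lower, upper)
```

Note that the standard error divides by n_obs, not the full n, so imputed expectations do not narrow the interval.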

How to Use This Calculator

  1. Paste a numeric series into the dataset box.
  2. Mark absent entries with tokens such as NA, ?, or null.
  3. Choose the working missingness assumption: MCAR, MAR, or MNAR.
  4. Set tolerance, maximum iterations, confidence level, decimals, and optional starting values.
  5. Submit the form to estimate mean, variance, log-likelihood, and expected missing values.
  6. Review the iteration table, imputed dataset preview, and the Plotly visualization.
  7. Export the results as CSV or PDF for documentation.

Frequently Asked Questions

1. What does this calculator estimate?

It estimates the mean and variance of one numeric variable with missing entries, using an EM-based maximum likelihood routine under a normal model.

2. Does it replace missing values permanently?

No. It reports expected values implied by the fitted model. Those values are useful for review, but separate modeling decisions may still be needed.

3. When is MAR a reasonable choice?

MAR is reasonable when missingness can depend on observed information but not on the unseen value itself after conditioning on observed data.

4. Why does MNAR trigger a warning?

MNAR usually requires a direct model for the missingness process. A simple ignorable-likelihood EM routine is not enough for full MNAR inference.

5. What distribution does this page assume?

This page assumes a univariate normal distribution for the analysis variable. Strong skewness or categorical data should be handled with other models.

6. Why show AIC and BIC?

AIC and BIC summarize model fit with a complexity penalty. They are most useful when you compare competing likelihood-based models on the same data.
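
For reference, the standard definitions are AIC = 2k - 2ℓ and BIC = k ln(n) - 2ℓ. The sketch below assumes k = 2 free parameters (μ and σ²) and uses n_obs as the sample size; the page does not state which sample size the calculator uses, so treat that choice as an assumption. The function name information_criteria is hypothetical.

```python
import math

def information_criteria(log_lik, n_obs, k=2):
    """Return (AIC, BIC) for a model with k free parameters."""
    aic = 2 * k - 2 * log_lik
    bic = k * math.log(n_obs) - 2 * log_lik  # assumes n = n_obs
    return aic, bic

# Illustrative inputs only: a log-likelihood of -10 from six observed cases.
aic, bic = information_criteria(log_lik=-10.0, n_obs=6)
print(aic, bic)
```

Lower values indicate a better trade-off between fit and complexity, but only relative to other models fitted to the same observed data.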

7. Is the confidence interval exact?

No. The interval shown here is an approximate normal-based interval around the estimated mean, using the fitted standard deviation and observed sample size.

8. Can I analyze several variables together?

This implementation focuses on one variable at a time. Multivariate FIML, SEM, or specialized missing-data software is better for joint models.

Related Calculators

Multiple imputation pooling
Conditional mean imputation
Censored data imputation
Linear interpolation imputation
EM algorithm calculator

Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.