Multivariate Normal Test Calculator

Paste data, choose alpha, and run several multivariate normality checks at once. View means, covariance, correlation, and Mahalanobis distances, then export results to CSV or PDF for reporting.

Calculator

Paste values (rows = observations, columns = variables) or upload a CSV. If a file is uploaded, it overrides the pasted text.
Alpha: common choices are 0.10, 0.05, and 0.01.
Standardize: useful when variables have very different scales.
Ridge (λ): adds λ to the covariance diagonal to stabilize inversion.
Missing values: accepted markers are blank, NA, NaN, or a dot.

Example data table

Three variables with ten observations. Use it to validate parsing and outputs.
#    X1    X2    X3
1    1.2   0.7   1.5
2    1.9   1.1   2.1
3    1.4   0.9   1.8
4    2.2   1.3   2.4
5    1.7   1.0   2.0
6    2.0   1.4   2.5
7    1.3   0.8   1.6
8    2.1   1.2   2.3
9    1.8   1.1   2.2
10   1.5   0.9   1.7

Formula used

Mean and covariance
μ = (1/n) Σ xᵢ
S = (1/(n−1)) Σ (xᵢ−μ)(xᵢ−μ)ᵀ  (or 1/n for the maximum-likelihood estimate)
Squared Mahalanobis distance
dᵢ² = (xᵢ−μ)ᵀ S⁻¹ (xᵢ−μ)
Mardia’s multivariate normality tests
b₁,p = (1/n²) ΣᵢΣⱼ ( (xᵢ−μ)ᵀ S⁻¹ (xⱼ−μ) )³
Skewness statistic: n·b₁,p/6 ~ χ²(df), df = p(p+1)(p+2)/6
Kurtosis: b₂,p = (1/n) Σ dᵢ⁴,  Z = (b₂,p − p(p+2)) / √(8p(p+2)/n)
Approximations work best with moderate to large sample sizes. With small n or strong outliers, results can be unstable.
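
If you want to verify the calculator's outputs, here is a minimal sketch of the formulas above in Python using NumPy and SciPy. The function name mardia_tests and the ddof switch between the n−1 and n denominators are illustrative assumptions, not this tool's internals.

```python
import numpy as np
from scipy import stats

def mardia_tests(X, ddof=1):
    """Mardia's multivariate skewness and kurtosis with approximate p-values."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, ddof=ddof)   # n-1 denominator by default (or ddof=0 for 1/n)
    S_inv = np.linalg.inv(S)
    D = X - mu
    G = D @ S_inv @ D.T                      # (x_i - mu)' S^-1 (x_j - mu) for all pairs i, j
    b1 = (G ** 3).sum() / n**2               # multivariate skewness b1,p
    d2 = np.diag(G)                          # squared Mahalanobis distances d_i^2
    b2 = (d2 ** 2).sum() / n                 # multivariate kurtosis b2,p
    df = p * (p + 1) * (p + 2) / 6
    skew_stat = n * b1 / 6
    skew_p = stats.chi2.sf(skew_stat, df)    # upper-tail chi-square p-value
    z = (b2 - p * (p + 2)) / np.sqrt(8 * p * (p + 2) / n)
    kurt_p = 2 * stats.norm.sf(abs(z))       # two-sided p-value for the Z score
    return {"skew_stat": skew_stat, "skew_p": skew_p, "z": z, "kurt_p": kurt_p}
```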

How to use this calculator

  1. Paste your dataset with rows as observations and columns as variables. Optionally upload a CSV.
  2. Choose the delimiter and whether the first row contains headers.
  3. Set alpha, select missing-value handling, and optionally standardize variables.
  4. If covariance inversion is unstable, add a small ridge value like 0.001.
  5. Run the test, then export results to CSV or PDF.
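
As a rough illustration of steps 1–3, the snippet below parses pasted text with pandas, treats the accepted missing markers (blank, NA, NaN, or a dot) as missing, and drops incomplete rows. The column names and the drop-rows policy are assumptions for the example, not a description of the tool's parser.

```python
import io
import pandas as pd

MISSING = ["", "NA", "NaN", "."]   # accepted missing markers

raw = """X1,X2,X3
1.2,0.7,1.5
1.9,NA,2.1
1.4,0.9,.
"""

df = pd.read_csv(io.StringIO(raw), na_values=MISSING)
df = df.dropna()                   # complete-case handling before computing moments
print(df.to_numpy())               # rows = observations, columns = variables
```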

Data structure and minimum sample size

Enter observations as rows and variables as columns. The calculator requires n to exceed p + 1 so the covariance matrix can be inverted. For stable approximations, aim for n ≥ 20 and preferably n ≥ 5p when p is moderate. Use consistent delimiters and, if present, mark the first row as headers. Drop rows with missing values to avoid biased moments.
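
A quick shape check along these lines might look as follows; the thresholds mirror the n > p + 1 requirement and the n ≥ 20, n ≥ 5p guidance above, and the function name is hypothetical.

```python
import numpy as np

def check_shape(X):
    """Warn when the sample is too small for stable covariance inversion."""
    n, p = np.asarray(X, dtype=float).shape
    if n <= p + 1:
        raise ValueError(f"need n > p + 1 to invert the covariance (n={n}, p={p})")
    if n < 20 or n < 5 * p:
        print(f"warning: n={n} is small for p={p}; approximations may be unstable")
    return n, p
```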

Why Mardia’s skewness and kurtosis matter

Mardia’s skewness summarizes asymmetric joint behavior through cubic cross-products, while kurtosis reflects tail weight using squared Mahalanobis distances. Under multivariate normality, the skewness statistic is compared to a chi-square distribution with df = p(p+1)(p+2)/6, and kurtosis is standardized to a Z score. Large statistics or extreme Z values indicate departures from normality. Both measures are sensitive to outliers, so interpret alongside diagnostics.

Interpreting p-values with alpha and multiple checks

Set alpha to control the false-positive rate of each decision, commonly 0.05 or 0.01 in regulated settings. If either skewness or kurtosis p-value is below alpha, the overall screen reports evidence against multivariate normality. Use the variable-wise Jarque–Bera table to locate which columns contribute most, then consider transformations or trimming rules. When many variables are tested, treat p-values as guidance rather than strict pass or fail.
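
One way to build such a variable-wise table is with SciPy's jarque_bera, sketched below; the printed layout is an assumption for illustration.

```python
import numpy as np
from scipy import stats

def jb_table(X, names=None):
    """Per-variable Jarque-Bera statistics and p-values."""
    X = np.asarray(X, dtype=float)
    names = names or [f"X{j + 1}" for j in range(X.shape[1])]
    for j, name in enumerate(names):
        jb, p = stats.jarque_bera(X[:, j])   # univariate normality screen per column
        print(f"{name}: JB = {jb:.3f}, p = {p:.4f}")
```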

Distance diagnostics and outlier control

Each row receives a Mahalanobis distance p-value from a chi-square(p) reference. Rows with p-value below alpha are flagged because they are unusually far from the mean in the joint space. The QQ plot compares sorted distances to theoretical chi-square quantiles; consistent upward curvature often signals heavy tails, while isolated high points suggest a small set of influential outliers. Reviewing the top 1% of distances is a starting point.
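
The distance screen can be reproduced along these lines; the helper name distance_flags is hypothetical, and ddof selects the covariance denominator.

```python
import numpy as np
from scipy import stats

def distance_flags(X, alpha=0.05, ddof=1):
    """Squared Mahalanobis distances, chi-square(p) p-values, and outlier flags."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    D = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False, ddof=ddof))
    d2 = np.einsum("ij,jk,ik->i", D, S_inv, D)   # d_i^2 for every row
    pvals = stats.chi2.sf(d2, df=p)              # upper-tail chi-square(p) p-values
    return d2, pvals, pvals < alpha              # flag rows with p-value below alpha
```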

Practical steps for improving normality

Standardize variables when scales differ greatly, especially with mixed units, because covariance estimation can be dominated by large-variance columns. Choose the covariance denominator n−1 for sample inference, or n for maximum-likelihood estimates. Re-check after removing clear data errors, winsorizing extreme values, or applying log or Box–Cox transforms. When normality is not achievable, consider robust covariance estimators or multivariate methods that do not require normality. Export CSV or PDF to document assumptions and decisions.
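
The ridge stabilization suggested in the how-to can be sketched as below, assuming λ is simply added to the sample covariance diagonal before inversion.

```python
import numpy as np

def ridge_cov_inv(X, lam=1e-3, ddof=1):
    """Invert S + lambda * I; lam=0.001 matches the how-to suggestion."""
    S = np.cov(np.asarray(X, dtype=float), rowvar=False, ddof=ddof)
    return np.linalg.inv(S + lam * np.eye(S.shape[0]))
```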

FAQs

1) What sample size should I use?

Use at least n that exceeds p + 1. For smoother approximations, target n ≥ 20 and roughly n ≥ 5p when feasible, especially if p is moderate and the data contain outliers.

2) When should I standardize variables?

Standardize when variables use different units or scales. It prevents high-variance columns from dominating covariance estimates and Mahalanobis distances, making the normality screen more comparable across variables.

3) What does ridge regularization (λ) change?

A small λ adds to the covariance diagonal, improving matrix inversion when variables are collinear or n is close to p. It stabilizes distances and Mardia calculations, but too-large λ can mask structure.

4) How do I read mixed results across tests?

If skewness fails and kurtosis passes, the joint distribution may be asymmetric without heavy tails, or outliers may affect cross-products. Review the QQ plot and Jarque–Bera rows to identify which variables drive the signal.

5) Why are Mahalanobis outliers flagged?

Each row’s squared distance is compared to chi-square(p). A small p-value means the observation is far from the multivariate center given the covariance pattern. Investigate data errors, rare regimes, or legitimate extreme cases.

6) Can I rely on this as a final normality decision?

Treat it as a screening step. Combine statistical results with plots and subject-matter checks, and confirm with downstream model diagnostics. When assumptions are critical, consider robust or nonparametric multivariate methods.

Built for quick screening and reporting. Validate with domain context.

Related Calculators

Distance Matrix Tool
Mahalanobis Distance Tool
Box M Test
Loadings Matrix Tool
Normalization Tool
Covariance Estimator Tool
Dimensionality Reduction Tool

Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.