Advanced Chi-Square Test Calculator

Test independence, goodness-of-fit, and feature relevance with confidence. Inspect expected counts, residuals, effect sizes, and decisions, and quickly turn categorical data into explainable machine-learning evidence.

Result Summary

Run the calculator to show chi-square results, p-values, effect size, assumption checks, tables, CSV export, PDF export, and Plotly charts here.

Calculator Input


Use independence for contingency tables. Use goodness-of-fit for one categorical distribution.

Observed counts: use commas or new lines.
For contingency tables, enter each row on a new line, or provide all values in one sequence.
Expected values (goodness-of-fit): use commas or new lines, for example 18, 25, 31, 26.
The calculator normalizes probabilities, percentages, or weights automatically.
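As a sketch of how that normalization works, the snippet below takes the example expected values (18, 25, 31, 26) as raw weights, normalizes them to probabilities, and scales them to expected counts; the sample size n = 200 is an assumed value for illustration.

```python
import numpy as np

# Hypothetical expected weights as the user might enter them (not normalized).
weights = np.array([18, 25, 31, 26], dtype=float)

# Normalize to probabilities, then scale to expected counts for a sample of n.
probs = weights / weights.sum()
n = 200
expected = probs * n

print(probs.round(4))     # proportions summing to 1
print(expected.round(2))  # expected counts summing to n
```

Percentages (summing to 100) or arbitrary weights both reduce to the same probabilities after this division, which is why the calculator can accept any of the three input styles.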

Example Data Table

This sample contingency table shows whether a categorical feature distribution differs across predicted classes.

Feature Group     Class A   Class B   Class C   Row Total
Feature Low          25        30        20        75
Feature Medium       15        18        27        60
Feature High         10        22        33        65
Column Total         50        70        80       200
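Running this sample table through SciPy's independence test illustrates the output the calculator produces; `scipy.stats.chi2_contingency` returns the statistic, p-value, degrees of freedom, and the expected-count table in one call.

```python
import numpy as np
from scipy.stats import chi2_contingency

# The sample contingency table from above (rows = feature buckets, columns = classes).
observed = np.array([[25, 30, 20],
                     [15, 18, 27],
                     [10, 22, 33]])

chi2, p, df, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p:.4f}")  # chi2 ≈ 11.07, df = 4
print(expected.round(2))  # expected counts under independence
```

With p ≈ 0.026 at alpha = 0.05, the feature distribution differs significantly across the predicted classes.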

Formula Used

Core chi-square statistic
χ² = Σ ((O - E)² / E)
Expected count for independence tables
E_ij = (Row Total_i × Column Total_j) / Grand Total
Degrees of freedom
Independence / Homogeneity: df = (r - 1)(c - 1)
Goodness-of-Fit: df = k - 1 - m
Effect size
Cramer's V = √(χ² / (n × min(r - 1, c - 1)))
Phi for 2x2 tables = √(χ² / n)
Cohen's w for goodness-of-fit = √(χ² / n)

Where O is observed frequency, E is expected frequency, n is total sample size, r is row count, c is column count, k is category count, and m is the number of fitted parameters.
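The formulas above can be computed directly, without a packaged test function; this sketch applies them to the sample table, deriving the expected counts, the statistic, the degrees of freedom, and Cramer's V step by step.

```python
import numpy as np
from scipy.stats import chi2 as chi2_dist

observed = np.array([[25, 30, 20],
                     [15, 18, 27],
                     [10, 22, 33]], dtype=float)

n = observed.sum()
# E_ij = Row Total_i * Column Total_j / Grand Total
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n

stat = ((observed - expected) ** 2 / expected).sum()  # chi-square statistic
r, c = observed.shape
df = (r - 1) * (c - 1)                                # independence df
p = chi2_dist.sf(stat, df)                            # right-tail p-value
cramers_v = np.sqrt(stat / (n * min(r - 1, c - 1)))   # effect size

print(f"chi2 = {stat:.3f}, df = {df}, p = {p:.4f}, V = {cramers_v:.3f}")
```

For this table V ≈ 0.17, a modest association despite the significant p-value, which is exactly the distinction the effect-size measures are meant to surface.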

How to Use This Calculator

  1. Select the appropriate test type.
  2. Enter alpha, dimensions, and labels.
  3. Paste observed counts into the correct input field.
  4. For goodness-of-fit, enter expected probabilities, counts, or weights.
  5. Submit the form to see the result above the calculator.
  6. Review chi-square, p-value, critical value, residuals, and effect size.
  7. Inspect the charts to identify influential categories or cells.
  8. Use the CSV or PDF buttons to save your report.

FAQs

1. What does this calculator test?

It tests whether observed category counts differ from expected counts, or whether two categorical variables are statistically associated in a contingency table. In machine learning, that helps evaluate feature-label relationships, class imbalance, segmentation differences, and possible distribution drift.

2. When should I use goodness-of-fit?

Use goodness-of-fit when you have one categorical variable and want to compare its observed frequencies against a target distribution. Examples include expected class proportions, sampling fairness checks, or baseline category frequencies in monitoring pipelines.
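A minimal goodness-of-fit check can be run with `scipy.stats.chisquare`; the class counts and target proportions below are hypothetical values chosen for illustration.

```python
from scipy.stats import chisquare

# Hypothetical observed class counts and target class proportions.
observed = [52, 48, 60, 40]
target = [0.25, 0.25, 0.30, 0.20]
n = sum(observed)
expected = [p * n for p in target]  # scale proportions to counts

stat, p = chisquare(observed, f_exp=expected)  # df = k - 1 = 3
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

Here the large p-value indicates the observed counts are consistent with the target distribution, so no drift or sampling-bias alarm would be raised.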

3. When should I use independence or homogeneity?

Use it when you have two categorical variables organized in a table. Examples include feature bucket versus predicted class, region versus error type, or campaign source versus conversion class. It helps reveal whether the variables appear related.

4. Why are expected counts important?

Expected counts show what frequencies would look like under the null hypothesis. Comparing observed and expected values reveals which cells drive the statistic. Very small expected counts can weaken the approximation and reduce confidence in the reported p-value.

5. What do standardized residuals tell me?

Residuals help locate the cells or categories contributing most strongly to the chi-square statistic. Large positive or negative residuals suggest stronger local deviations between observed and expected frequencies, which is especially useful when debugging model segments or drift patterns.
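One common choice, shown here as a sketch, is the Pearson (standardized) residual (O − E) / √E, whose squares sum back to the chi-square statistic, so each cell's residual shows its share of the total.

```python
import numpy as np

observed = np.array([[25, 30, 20],
                     [15, 18, 27],
                     [10, 22, 33]], dtype=float)

n = observed.sum()
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n

# Pearson residuals: values beyond roughly +/-2 flag cells that
# deviate strongly from the independence model.
residuals = (observed - expected) / np.sqrt(expected)
print(residuals.round(2))
```

In the sample table the "Feature Low / Class C" cell has the largest residual (about −1.83), marking it as the main driver of the statistic.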

6. What is Cramer's V or Cohen's w?

These are effect size measures. They describe practical magnitude rather than only statistical significance. A small p-value can occur with large samples even when the relationship is weak, so effect size helps judge whether the pattern is meaningfully important.

7. Why would I estimate parameters in goodness-of-fit?

If expected probabilities are estimated from the data rather than fixed in advance, each fitted parameter reduces the effective degrees of freedom. Without this adjustment, the statistic would be compared against too wide a reference distribution, making the test overly conservative and its p-values miscalibrated.
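In `scipy.stats.chisquare` this adjustment is the `ddof` parameter, which shifts the degrees of freedom from k − 1 to k − 1 − m; the counts below are hypothetical, with the expected values imagined as coming from a fit with m = 1 estimated parameter.

```python
from scipy.stats import chisquare

# Hypothetical observed counts and expected counts from a one-parameter fit.
observed = [30, 50, 70, 50]
expected = [28.0, 52.0, 68.0, 52.0]

# ddof=1 evaluates the statistic against df = k - 1 - 1 = 2 instead of 3.
stat, p_adjusted = chisquare(observed, f_exp=expected, ddof=1)
_, p_unadjusted = chisquare(observed, f_exp=expected)
print(f"p (df = 2) = {p_adjusted:.4f}  vs  p (df = 3) = {p_unadjusted:.4f}")
```

The same statistic yields a smaller p-value under the reduced degrees of freedom, showing why the adjustment matters for calibration.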

8. Can I use this for machine learning feature screening?

Yes. It is useful for categorical feature relevance checks, label association analysis, fairness slices, monitoring category drift, and comparing grouped outcomes. It works best as an exploratory statistical signal, not as the only model selection criterion.
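A simple screening loop can rank categorical features by their association p-values; the feature names and the second contingency table below are hypothetical, with the first table reusing the sample data from this page.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical feature-vs-label contingency tables for screening.
features = {
    "bucket": [[25, 30, 20], [15, 18, 27], [10, 22, 33]],
    "region": [[40, 35, 41], [35, 35, 39]],
}

results = {}
for name, table in features.items():
    chi2, p, df, _ = chi2_contingency(np.array(table))
    results[name] = p
    print(f"{name}: chi2 = {chi2:.2f}, df = {df}, p = {p:.4f}")
```

Here "bucket" shows a significant association with the label while "region" does not, but as the answer above notes, this ranking should feed exploration rather than replace proper model selection.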

Related Calculators

tf idf calculator
equal width binning calculator
anova f score calculator
z score normalization calculator
principal component calculator
z score outlier calculator
min max normalization calculator
binary encoding calculator
one hot encoding calculator

Important Note: All calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.