Goodness of Fit Test Calculator

Enter Test Data

Observed counts

Use commas, spaces, or new lines.

Expected counts, probabilities, or percentages

Keep the same order as observed counts.

Category labels

Optional labels improve exported reports.

Expected value type

Alpha level

Estimated parameters

Use 0 when no model parameter was estimated.

Decimal places

Scale expected counts to observed total

Example Data Table

Category	Observed count	Expected count	Reason
A	18	20	Equal share expected
B	22	20	Equal share expected
C	20	20	Equal share expected
D	25	20	Equal share expected
E	15	20	Equal share expected

Formula Used

The calculator uses the chi square goodness of fit statistic.

X² = Σ ((O - E)² / E)

O is the observed count and E is the expected count for a category. The sum runs over all categories. The degrees of freedom equal the number of categories k, minus one, minus the number of estimated parameters m.

df = k - 1 - m

The p value is the right tail probability of the chi square distribution with df degrees of freedom, evaluated at X². Cohen's w is the square root of the ratio of X² to the total observed count N, that is, w = √(X² / N).
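The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function names `chi2_sf` and `goodness_of_fit` are hypothetical, and the right tail probability is computed with a closed-form recurrence that assumes integer degrees of freedom and moderate statistic values.

```python
import math

def chi2_sf(x, df):
    """Right tail probability P(X >= x) for a chi square variable, integer df >= 1."""
    if df < 1:
        raise ValueError("df must be at least 1")
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Closed-form starting point: Q(1/2, h) for odd df, Q(1, h) for even df.
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:
        a, q = 1.0, math.exp(-h)
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

def goodness_of_fit(observed, expected, estimated_params=0):
    """Return the statistic, df, p value, and Cohen's w for matched count lists."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - estimated_params  # df = k - 1 - m
    total = sum(observed)
    return {
        "statistic": statistic,
        "df": df,
        "p_value": chi2_sf(statistic, df),
        "cohen_w": math.sqrt(statistic / total),
    }

# Example data table: five categories, equal share of 20 expected in each.
result = goodness_of_fit([18, 22, 20, 25, 15], [20, 20, 20, 20, 20])
print(result)  # statistic about 2.9, df 4, p value about 0.575, Cohen's w about 0.170
```

For the example table, the p value is well above 0.05, so the equal-share model is not rejected at that alpha.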

How to Use This Calculator

Enter observed counts in category order.
Enter expected counts, probabilities, or percentages.
Add category labels if you want clearer exports.
Choose alpha and estimated parameter count.
Keep scaling enabled when the expected counts sum to a different total than the observed counts.
Press Calculate Test and review the result above the form.
Download the CSV or PDF report when needed.

Goodness of Fit Test Guide

Purpose

A goodness of fit test compares observed category counts with the values expected under a chosen model. It helps you decide whether the differences are consistent with random sampling variation or point to a real departure from the model. This tool supports equal proportions, custom expected counts, and probability based expectations. It also accepts a count of estimated parameters, so the degrees of freedom stay correct when the expected values come from a model fitted to the same data.

Statistic

The chi square statistic sums, over all categories, the squared gap between the observed and expected count divided by the expected count. Large values show poor agreement. Small values show close agreement. The calculator also reports the p value. The p value is the probability of a statistic at least this large when the null model is true.

Use Cases

You can use this calculator for survey choices, dice rolls, genetics, quality checks, marketing tests, or any grouped count problem. Enter one observed count for each category. Then enter expected counts, probabilities, or percentages in the same order. If expected counts use a different total, the scaling option can match them to the observed total.
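The conversion from probabilities, percentages, or differently totalled counts to expected counts might look like the sketch below. The function name `expected_counts` and the option strings are hypothetical stand-ins for the form's "Expected value type" and scaling settings, and the calculator's own handling may differ.

```python
def expected_counts(values, kind, observed_total, scale=True):
    """Turn expected inputs into expected counts.

    kind is "counts", "probabilities", or "percentages" (hypothetical names).
    """
    total = sum(values)
    if total <= 0 or any(v < 0 for v in values):
        raise ValueError("expected values must be non-negative with a positive sum")
    if kind in ("probabilities", "percentages"):
        # Dividing by the sum normalizes both forms: 0.25 and 25 give the same share.
        return [observed_total * v / total for v in values]
    if kind == "counts":
        if scale:
            # Rescale so the expected counts add up to the observed total.
            return [observed_total * v / total for v in values]
        return list(values)
    raise ValueError("unknown kind: " + kind)

print(expected_counts([50, 30, 20], "percentages", 200))  # [100.0, 60.0, 40.0]
print(expected_counts([1, 2, 1], "counts", 80))           # [20.0, 40.0, 20.0]
```

With scaling turned off, expected counts are used as entered, so their total should already match the observed total.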

Outputs

The result includes the test statistic, degrees of freedom, critical value, p value, Cohen's w, and a decision at your chosen alpha. Cohen's w describes effect size. It is helpful because, with a very large sample, even a trivial departure from the model can produce a tiny p value. Residuals show which categories add most to the total statistic.
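As an illustration of how residuals point to the categories that drive the statistic, the sketch below computes Pearson residuals, (O - E) / √E, and each category's contribution, (O - E)² / E. The page does not say which residual type the calculator reports, so the Pearson form and the function name `residual_table` are assumptions.

```python
import math

def residual_table(labels, observed, expected):
    """List per-category residuals, largest contribution to X² first."""
    rows = []
    for label, o, e in zip(labels, observed, expected):
        rows.append({
            "label": label,
            "residual": (o - e) / math.sqrt(e),  # Pearson residual (assumed type)
            "contribution": (o - e) ** 2 / e,    # this category's share of X²
        })
    # sorted() is stable, so ties keep their original category order.
    return sorted(rows, key=lambda r: r["contribution"], reverse=True)

rows = residual_table(["A", "B", "C", "D", "E"],
                      [18, 22, 20, 25, 15],
                      [20, 20, 20, 20, 20])
for row in rows:
    print(row["label"], round(row["residual"], 3), round(row["contribution"], 3))
```

For the example table, categories D and E each contribute 1.25 of the total 2.9, while C contributes nothing.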

Assumptions

Good inputs matter. Expected counts should usually be at least five. Categories should be mutually exclusive, and observations should be independent: each item should be counted once, not measured repeatedly. If assumptions are weak, combine sparse categories or choose another method. The warnings in the output help you review these issues.
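The expected count check and the advice to combine sparse categories can be sketched as follows. The helper names, the default threshold of five, and the "Other" label are illustrative assumptions, not the calculator's actual warning logic.

```python
def sparse_categories(labels, expected, minimum=5.0):
    """Return the labels whose expected count falls below the minimum."""
    return [label for label, e in zip(labels, expected) if e < minimum]

def merge_sparse(labels, observed, expected, minimum=5.0, merged_label="Other"):
    """Pool all sparse categories into one row; other rows are kept as they are."""
    rows = list(zip(labels, observed, expected))
    kept = [row for row in rows if row[2] >= minimum]
    sparse = [row for row in rows if row[2] < minimum]
    if sparse:
        # The pooled row can still be below the minimum, so check it again.
        kept.append((merged_label,
                     sum(row[1] for row in sparse),
                     sum(row[2] for row in sparse)))
    return kept

labels = ["A", "B", "C", "D"]
observed = [14, 7, 4, 1]
expected = [12.0, 8.5, 3.0, 2.5]
print(sparse_categories(labels, expected))       # ['C', 'D']
print(merge_sparse(labels, observed, expected))  # C and D pooled into 'Other'
```

Merging reduces the number of categories k, so the degrees of freedom drop with it. Decide on any merging before looking at the test result.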

Reporting

Use the export buttons after calculation. The CSV file is useful for spreadsheets. The PDF report is useful for sharing a compact result. Always record the null hypothesis before testing. Also keep the source data and the reasoning behind expected values. Clear documentation makes the test easier to defend and repeat.

Interpretation

For reports, describe the categories in plain words. State whether expected values came from theory, past data, or a business rule. Avoid changing categories after seeing the result, because that can bias the conclusion. When the p value is near alpha, explain practical importance, not only significance. Context should guide action, so apply judgement to every output.

Frequently Asked Questions

What is a goodness of fit test?

It checks whether observed category counts match expected counts from a model, theory, or planned distribution.

Which test does this tool use?

It uses the chi square goodness of fit test. The p value comes from the right tail of the chi square distribution.

Can I enter percentages?

Yes. Select probabilities or percentages as the expected value type. The calculator normalizes those values and converts them into expected counts based on the observed total.

Why are expected counts scaled?

Scaling adjusts expected counts to match the observed total. This is useful when the expected list gives only relative category sizes.

What does alpha mean?

Alpha is your significance cutoff. A common value is 0.05. Lower values demand stronger evidence before rejecting the null hypothesis.
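The decision at a chosen alpha can be illustrated with the sketch below, which finds the critical value by bisection on the right tail probability and compares the p value with alpha. The function names are hypothetical, integer degrees of freedom are assumed, and the calculator may compute the critical value differently.

```python
import math

def chi2_sf(x, df):
    """Right tail probability of the chi square distribution, integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:
        a, q = 1.0, math.exp(-h)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

def critical_value(alpha, df):
    """Find x with chi2_sf(x, df) == alpha by bisection."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # widen the bracket until the tail drops below alpha
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def decide(statistic, df, alpha=0.05):
    """Reject the null model when the p value is below alpha."""
    p_value = chi2_sf(statistic, df)
    return {
        "critical_value": critical_value(alpha, df),
        "p_value": p_value,
        "reject_null": p_value < alpha,
    }

decision = decide(2.9, 4, alpha=0.05)
print(decision)  # critical value about 9.488; the null model is not rejected
```

Comparing the statistic with the critical value gives the same decision as comparing the p value with alpha.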

What are estimated parameters?

They are model parameters, such as a mean, that were estimated from the same data used in the test. Each estimated parameter usually reduces the degrees of freedom by one.

What does Cohen's w show?

Cohen's w is an effect size. It helps judge practical difference, especially when a large sample makes small differences significant.

When should I avoid this test?

Avoid it when expected counts are too small, categories overlap, or observations are not independent. Consider combining categories or using another method.