Goodness of Fit Test Guide
Purpose
A goodness of fit test compares observed category counts with the counts expected under a chosen model. It helps you decide whether the differences are consistent with random variation or point to a real departure from the model. This tool supports equal proportions, custom expected counts, and probability-based expectations. It also accepts the number of parameters estimated from the data, which lowers the degrees of freedom and makes the calculation more realistic for advanced work.
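The effect of estimated parameters on the degrees of freedom can be sketched as follows. This is a minimal illustration using the standard rule (categories minus one, minus the number of estimated parameters); the function name and its arguments are hypothetical and are not taken from the tool itself.

```python
def degrees_of_freedom(num_categories, estimated_params=0):
    """Standard chi square goodness of fit rule: k - 1 - m.

    num_categories   -- number of categories (k)
    estimated_params -- parameters estimated from the same data (m)
    Both names are illustrative assumptions, not the tool's own fields.
    """
    df = num_categories - 1 - estimated_params
    if df < 1:
        raise ValueError("Not enough categories for this many estimated parameters")
    return df


# Six categories with a fully specified model (nothing estimated).
df_fixed_model = degrees_of_freedom(6)

# Six categories where one parameter (for example a mean) was estimated first.
df_one_estimate = degrees_of_freedom(6, estimated_params=1)

print(df_fixed_model, df_one_estimate)
```

Each estimated parameter removes one degree of freedom, so the same statistic gives a smaller p value than it would under a fully specified model.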
Statistic
The chi square statistic sums, over all categories, the squared gap between the observed and expected count divided by the expected count. Large values show poor agreement. Small values show close agreement. The calculator also reports the p value. The p value is the probability of getting a statistic at least this large when the null model is true.
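The statistic and its p value can be sketched in a few lines. This is an illustration only, not the calculator's own code: the function names are hypothetical, the die counts are made-up example data, and the p value uses a simple series for the incomplete gamma function that is adequate for moderate statistics but is not tuned for extreme values.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value(statistic, df):
    """Upper tail probability of a chi square variable with df degrees of freedom.

    Uses the series expansion of the regularized lower incomplete gamma
    function P(df/2, statistic/2) and returns 1 - P.
    """
    if statistic <= 0:
        return 1.0
    a = df / 2.0
    x = statistic / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(x) - x - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Hypothetical data: a die rolled 60 times, with equal proportions expected.
observed = [8, 12, 9, 11, 6, 14]
expected = [10.0] * 6

statistic = chi_square_statistic(observed, expected)
p_value = chi_square_p_value(statistic, df=len(observed) - 1)
print(f"statistic = {statistic:.3f}, p value = {p_value:.3f}")
```

With these example counts the statistic is 4.2 on 5 degrees of freedom, which is close to the middle of the null distribution, so the data give no reason to doubt a fair die.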
Use Cases
You can use this calculator for survey choices, dice rolls, genetics, quality checks, marketing tests, or any grouped count problem. Enter one observed count for each category. Then enter expected counts, probabilities, or percentages in the same order. If expected counts use a different total, the scaling option can match them to the observed total.
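The ways of supplying expected values described above can be sketched as follows. This assumes the scaling option multiplies every expected count by the same factor so the totals agree; the function names and the example numbers are hypothetical.

```python
def expected_from_weights(weights, observed_total):
    """Turn probabilities or percentages into expected counts.

    Dividing by the sum of the weights means the same function handles
    probabilities (sum 1) and percentages (sum 100).
    """
    weight_total = sum(weights)
    return [observed_total * w / weight_total for w in weights]


def scale_expected(expected, observed_total):
    """Rescale expected counts so they add up to the observed total."""
    factor = observed_total / sum(expected)
    return [e * factor for e in expected]


# Hypothetical observed counts for three categories.
observed = [18, 55, 27]
n = sum(observed)

from_probabilities = expected_from_weights([0.25, 0.5, 0.25], n)
from_percentages = expected_from_weights([25, 50, 25], n)
from_scaled_counts = scale_expected([50, 100, 50], n)

print(from_probabilities, from_percentages, from_scaled_counts)
```

All three inputs describe the same 1:2:1 model, so they lead to the same expected counts once matched to the observed total.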
Outputs
The result includes the test statistic, degrees of freedom, critical value, p value, Cohen's w, and a decision at your chosen alpha. Cohen's w describes effect size. It is helpful because a very large sample can produce a tiny p value even when the departure from the model is small. Residuals show which categories contribute most to the total statistic.
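Cohen's w and the per-category residuals can be sketched as follows. This assumes the usual definitions: w is the square root of the statistic divided by the total count, and the residuals are Pearson residuals. The tool's exact residual definition is not stated here, and the data are hypothetical.

```python
import math

# Hypothetical data: a die rolled 60 times, with equal proportions expected.
observed = [8, 12, 9, 11, 6, 14]
expected = [10.0] * 6

# Pearson residual: signed gap in units of the expected standard deviation.
residuals = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

# Squaring each residual gives that category's share of the statistic.
contributions = [r ** 2 for r in residuals]
statistic = sum(contributions)

# Cohen's w: effect size that does not grow with sample size alone.
cohens_w = math.sqrt(statistic / sum(observed))

# Index of the category that adds most to the statistic (first one on ties).
largest = max(range(len(contributions)), key=lambda i: contributions[i])

print(f"w = {cohens_w:.3f}")
for i, (r, c) in enumerate(zip(residuals, contributions), start=1):
    print(f"category {i}: residual {r:+.3f}, contribution {c:.3f}")
```

The sign of a residual shows whether a category was over- or under-represented, and its square shows how much of the statistic it accounts for.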
Assumptions
Good inputs matter. Expected counts should usually be at least five. Categories should be mutually exclusive, and observations should be independent. Counts should not include repeated measurements of the same item. If assumptions are weak, combine sparse categories or choose another method. The warnings in the output help you review these issues.
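The expected-count check and the combining of sparse categories can be sketched as follows. The threshold of five follows the guideline above; the function names and the example data are hypothetical, and merging all sparse categories into one group is just one possible choice.

```python
def sparse_categories(expected, minimum=5.0):
    """Return the indices of categories whose expected count is below minimum."""
    return [i for i, e in enumerate(expected) if e < minimum]


def combine(counts, indices):
    """Merge the categories at `indices` into one category placed at the end."""
    index_set = set(indices)
    if not index_set:
        return list(counts)
    kept = [c for i, c in enumerate(counts) if i not in index_set]
    merged = sum(c for i, c in enumerate(counts) if i in index_set)
    return kept + [merged]


# Hypothetical data with two sparse categories at the end.
observed = [30, 25, 3, 2]
expected = [28.0, 24.0, 4.5, 3.5]

sparse = sparse_categories(expected)
observed_combined = combine(observed, sparse)
expected_combined = combine(expected, sparse)

print("sparse indices:", sparse)
print("observed:", observed_combined)
print("expected:", expected_combined)
```

Apply the same merge to the observed and expected counts so the categories stay aligned, and decide which categories to combine before looking at the test result.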
Reporting
Use the export buttons after calculation. The CSV file is useful for spreadsheets. The PDF report is useful for sharing a compact result. Always record the null hypothesis before testing. Also keep the source data and the reasoning behind expected values. Clear documentation makes the test easier to defend and repeat.
Interpretation
For reports, describe the categories in plain words. State whether expected values came from theory, past data, or a business rule. Avoid changing categories after seeing the result. That can bias the conclusion. When the p value is near alpha, explain practical importance, not only significance. Context should guide action. Use judgement with every output.