Run flexible score tests with guided statistical inputs. Compare null values, tails, and confidence settings. Interpret z statistics, p values, and decisions with confidence.
Choose a score-test family, add your numbers, and review the decision, p value, interval, and assumptions.
| Scenario | Inputs | Illustrative output |
|---|---|---|
| One-sample proportion | x = 62, n = 100, p₀ = 0.50, two-sided, 95% | Z ≈ 2.400, p ≈ 0.016, reject H₀ |
| Two-sample proportions | x₁ = 70, n₁ = 120, x₂ = 48, n₂ = 110, two-sided, 95% | Z ≈ 2.228, p ≈ 0.026, reject H₀ |
| One-sample mean | x̄ = 53.4, μ₀ = 50, σ = 8, n = 36, right-tailed, 95% | Z ≈ 2.550, p ≈ 0.005, reject H₀ |
One-sample proportion: z = (p̂ − p₀) / √[p₀(1 − p₀) / n]
Two-sample proportions: z = (p̂₁ − p̂₂) / √[p̂(1 − p̂)(1/n₁ + 1/n₂)] where p̂ is the pooled proportion under the null.
One-sample mean with known sigma: z = (x̄ − μ₀) / (σ / √n)
The p value comes from the standard normal distribution using the selected tail. Confidence intervals are included for practical interpretation.
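The one-sample formulas and the tail rule can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names and the tail labels `"two-sided"`, `"right"`, and `"left"` are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def z_one_proportion(x, n, p0):
    """Score z for H0: p = p0. The standard error uses p0, not the sample estimate."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def z_one_mean(xbar, mu0, sigma, n):
    """z for H0: mu = mu0 when the population sigma is known."""
    return (xbar - mu0) / (sigma / sqrt(n))


def p_value(z, tail="two-sided"):
    """Standard normal p value for the selected tail (labels are illustrative)."""
    cdf = NormalDist().cdf
    if tail == "two-sided":
        return 2 * (1 - cdf(abs(z)))
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    raise ValueError(f"unknown tail: {tail!r}")


# One-sample proportion row: x = 62, n = 100, p0 = 0.50, two-sided
z_prop = z_one_proportion(62, 100, 0.50)
print(round(z_prop, 3), round(p_value(z_prop), 3))  # 2.4 0.016

# One-sample mean row: x-bar = 53.4, mu0 = 50, sigma = 8, n = 36, right-tailed
z_mean = z_one_mean(53.4, 50, 8, 36)
print(round(z_mean, 3), round(p_value(z_mean, "right"), 3))  # 2.55 0.005
```

With a 95% confidence setting, the matching significance level is 0.05, so both p values lead to rejecting H₀.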
A score test checks whether observed data are compatible with a null parameter value. It uses information evaluated under the null hypothesis and reports a z statistic and p value.
Use the one-sample proportion test when you want to compare an observed success rate against a benchmark proportion, such as a defect rate, click rate, approval rate, or response rate.
Under the null hypothesis of equal proportions, both groups are assumed to share one common underlying probability. The pooled estimate reflects that shared value when computing the score statistic.
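To make the pooling step concrete, the sketch below computes the pooled proportion and the two-sample score statistic. The function name and the counts in the usage lines are hypothetical, chosen only for illustration.

```python
from math import sqrt


def z_two_proportions(x1, n1, x2, n2):
    """Score z for H0: p1 = p2, using the pooled proportion in the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # shared success probability under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, pooled


# Hypothetical counts: 45 of 100 successes in group 1, 30 of 100 in group 2
z, pooled = z_two_proportions(45, 100, 30, 100)
print(round(z, 3), pooled)
```

Both groups contribute to a single variance estimate here, which is what distinguishes the pooled (score) form from an unpooled standard error built from each group's own proportion.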
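A score test is not the same as a Wald test. A score test evaluates the slope of the log-likelihood (the score) and its curvature (the information) at the null value, so its standard error is computed under the null. A Wald test instead centers on the sample estimate and computes the standard error there. Score tests often behave better near boundaries or with smaller samples.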
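For a single proportion, the only difference between the two statistics is which value goes into the standard error. The following sketch shows that difference side by side; the function names are assumptions for this example.

```python
from math import sqrt


def score_z(x, n, p0):
    """Standard error evaluated at the null value p0."""
    return (x / n - p0) / sqrt(p0 * (1 - p0) / n)


def wald_z(x, n, p0):
    """Standard error evaluated at the sample estimate p_hat.

    Raises ZeroDivisionError when x == 0 or x == n, because the
    estimated standard error collapses to zero at the boundary.
    """
    p_hat = x / n
    return (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)


# Same inputs, two statistics: x = 62, n = 100, p0 = 0.50
print(round(score_z(62, 100, 0.50), 3))  # 2.4
print(round(wald_z(62, 100, 0.50), 3))   # 2.472
```

The boundary case noted in the docstring is one reason the score form can be more stable: its standard error depends only on p₀ and n, so it stays defined even when every observation is a success or a failure.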
You need independent observations, a suitable design, and adequate sample size for normal approximation. For mean tests, the population sigma must be known and the sampling model should be appropriate.
The p value is the probability of getting a test statistic at least as extreme as the observed one, assuming the null hypothesis is true.
A decision tells you whether evidence crosses a threshold. The interval adds magnitude and uncertainty, helping you judge whether the estimated effect is practically important.
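The text does not say which interval formula the calculator reports. As one possibility, the sketch below uses the Wilson interval, which is the interval obtained by inverting the one-sample proportion score test. Treat the choice of method and the function name as assumptions.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(x, n, confidence=0.95):
    """Wilson (score) interval for a single proportion.

    Assumption: this is one common interval for proportions; the
    calculator itself may use a different method.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. about 1.96 for 95%
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# x = 62 successes out of n = 100 at 95% confidence
low, high = wilson_interval(62, 100)
print(round(low, 3), round(high, 3))
```

For these inputs the whole interval lies above 0.50, which agrees with rejecting H₀: p = 0.50 in a two-sided test, and its width shows how much uncertainty remains around the estimated rate of 0.62.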
Be careful with very small samples: they can weaken the normal approximation used by score tests. In such cases, exact methods or alternative modeling approaches may be more appropriate.
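As an illustration of an exact alternative, the sketch below compares the score-test p value with an exact binomial p value for a small sample. The two-sided rule used here (sum the probabilities of all outcomes no more likely than the observed one) is one common convention, not the only one, and the function names and counts are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_two_sided_p(x, n, p0):
    """Exact binomial p value: total probability of outcomes no more likely than x."""
    p_obs = binom_pmf(x, n, p0)
    # Small relative tolerance so floating-point ties are counted as ties
    total = sum(
        binom_pmf(k, n, p0)
        for k in range(n + 1)
        if binom_pmf(k, n, p0) <= p_obs * (1 + 1e-7)
    )
    return min(1.0, total)


def score_two_sided_p(x, n, p0):
    """Normal-approximation p value from the one-sample score statistic."""
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical small sample: 8 successes in 10 trials, H0: p = 0.50
print(round(score_two_sided_p(8, 10, 0.50), 3))  # 0.058
print(round(exact_two_sided_p(8, 10, 0.50), 4))  # 0.1094
```

With only ten trials the two p values differ by almost a factor of two, so the normal approximation makes the evidence look stronger than the exact calculation does.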