What problem does this solve?
The one‑sample proportion z‑test answers a simple question: does an observed proportion from a sample,
denoted p̂, differ significantly from a hypothesized population proportion p₀?
Examples include “Is our conversion rate different from 50%?” or “Did more than 20% of users click the new feature?”
When the normal approximation is reasonable, the test statistic follows (approximately) a standard normal distribution,
enabling fast calculation of p‑values and critical values.
Core notation and assumptions
- Sample size: n independent trials with binary outcomes (success/failure).
- Observed successes: x; the sample proportion is p̂ = x / n.
- Null hypothesis: H₀: p = p₀. Alternatives: two‑sided, left‑tailed, or right‑tailed.
- Independence: observations are approximately independent (simple random sample or well‑designed process).
- Normal approximation conditions: a common rule of thumb is n·p₀ ≥ 5 and n·(1−p₀) ≥ 5. When these are violated or p̂ is extreme (near 0 or 1), prefer the exact binomial test.
The z‑statistic and p‑value
The test statistic is z = (p̂ − p₀) / SE₀, where SE₀ = √[ p₀(1−p₀) / n ] is the standard error computed under the null.
Under the null, and when the approximation is adequate, z is approximately standard normal.
Optional finite population correction (FPC) if sampling without replacement from a population of size N: multiply SE₀ by √[(N − n) / (N − 1)].
The p‑value depends on the tail of the alternative: two‑sided uses 2·min{Φ(z), 1−Φ(z)},
right‑tailed uses 1−Φ(z), and left‑tailed uses Φ(z). For extremely large |z|,
compute tails via survival functions (e.g., complementary error function) to avoid underflow.
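The recipe above can be sketched with only the Python standard library; the function name and signature here are illustrative, and the tail probabilities are computed via `math.erfc` (the complementary error function) rather than `1 − Φ(z)` to avoid underflow for large |z|, as suggested above:

```python
from math import erfc, sqrt

def norm_sf(z: float) -> float:
    """Standard normal survival function 1 - Phi(z), via erfc to avoid underflow."""
    return 0.5 * erfc(z / sqrt(2.0))

def proportion_z_test(x: int, n: int, p0: float, alternative: str = "two-sided"):
    """One-sample proportion z-test (normal approximation).

    alternative: "two-sided", "greater", or "less". Returns (z, p_value).
    """
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)        # standard error under H0
    z = (p_hat - p0) / se0
    if alternative == "two-sided":
        p = 2 * min(norm_sf(z), norm_sf(-z))   # 2 * min{Phi(z), 1 - Phi(z)}
    elif alternative == "greater":
        p = norm_sf(z)                          # 1 - Phi(z)
    else:
        p = norm_sf(-z)                         # Phi(z)
    return z, p
```

For moderately large z (say z = 10), computing `1 - NormalDist().cdf(z)` cancels to exactly 0.0 in floating point, while the erfc-based survival function still returns a meaningful tiny probability.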
Choosing the tail
| Alternative | Research question | p‑value expression |
|---|---|---|
| Two‑sided: Hₐ: p ≠ p₀ | Any departure from p₀ | 2 · min{ Φ(z), 1−Φ(z) } |
| Right‑tailed: Hₐ: p > p₀ | Is the proportion higher? | 1 − Φ(z) |
| Left‑tailed: Hₐ: p < p₀ | Is the proportion lower? | Φ(z) |
Confidence intervals for the true proportion
Confidence intervals (CIs) communicate estimation uncertainty. Several methods exist:
- Wald: p̂ ± z_{α/2} · √[ p̂(1−p̂)/n ]. Simple but unreliable for small n or extreme p̂.
- Wilson (score): More accurate coverage across a wide range; often the recommended default.
- Agresti–Coull: A quick improvement over Wald via “add‑z²” adjustments to x and n.
- Clopper–Pearson (exact): Inverts the binomial test; conservative but valid for any n and x.
Tip: For reporting, Wilson or Clopper–Pearson are good defaults. Avoid relying solely on Wald unless sample sizes are comfortably large and p̂ is not near 0 or 1.
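The Wilson interval is short enough to implement directly; this is a minimal sketch using the standard Wilson score formula (function name is illustrative):

```python
from math import sqrt
from statistics import NormalDist

def wilson_ci(x: int, n: int, conf: float = 0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # e.g. ~1.96 for 95%
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half
```

Unlike Wald, the Wilson interval never escapes [0, 1] and behaves sensibly even when x = 0 or x = n.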
Continuity correction (Yates)
Because the binomial distribution is discrete and the normal distribution is continuous, a small
continuity correction can be applied to z when n is small. A common form shrinks the numerator toward zero by 0.5/n,
i.e., it uses |p̂ − p₀| − 0.5/n (floored at zero) in place of |p̂ − p₀|. It tends to make tests more conservative; many practitioners omit it for moderate or large samples.
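A sketch of the corrected statistic (the function name is illustrative; the correction shrinks the numerator toward zero and never past it):

```python
from math import sqrt

def z_with_continuity(x: int, n: int, p0: float) -> float:
    """z-statistic with Yates continuity correction: numerator moved toward zero by 0.5/n."""
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)
    diff = p_hat - p0
    corrected = max(abs(diff) - 0.5 / n, 0.0)   # shrink toward zero, floor at zero
    return (corrected if diff >= 0 else -corrected) / se0
```

For x = 56, n = 100, p₀ = 0.5, the corrected statistic is (0.06 − 0.005)/0.05 = 1.1 instead of 1.2, illustrating the conservative pull.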
Effect size: Cohen’s h
Reporting a statistically significant difference without an effect size can be misleading. Cohen’s h expresses
the magnitude of change on a variance‑stabilized (arcsine) scale, h = 2·arcsin(√p̂) − 2·arcsin(√p₀), facilitating comparison across studies.
Rules of thumb for |h|: 0.2 (small), 0.5 (medium), 0.8 (large).
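The arcsine transformation makes this a two‑line function (name is illustrative):

```python
from math import asin, sqrt

def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: absolute difference of arcsine-transformed proportions."""
    return abs(2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2)))
```

For example, `cohens_h(0.56, 0.50)` is about 0.12, a small effect by the rules of thumb above.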
When the normal approximation is dubious
Use the exact binomial test when n·p₀ or n·(1−p₀) is small (e.g., less than 5),
or when p̂ is exactly 0 or 1. The exact test computes the probability of outcomes as or more extreme
than x under the binomial(n, p₀) model. Two‑sided definitions vary (e.g., doubling the smaller tail,
“as‑or‑less‑likely,” or Blaker’s test), but conclusions are usually similar in practice.
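One of the two‑sided definitions mentioned above, doubling the smaller tail, can be sketched with only `math.comb`; the function names are illustrative, and other two‑sided conventions would give slightly different values:

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes under Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def exact_binom_p_two_sided(x: int, n: int, p0: float) -> float:
    """Two-sided exact binomial p-value via the 'double the smaller tail' rule."""
    lower = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))   # P(X <= x)
    upper = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))   # P(X >= x)
    return min(1.0, 2 * min(lower, upper))
```

For x = 56, n = 100, p₀ = 0.5 this gives roughly 0.27, a little larger than the normal‑approximation p‑value of 0.23, which is typical for the exact test at moderate n.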
Worked example
Suppose you tested a feature with n = 100 users, observed x = 56 conversions (p̂ = 0.56),
and wish to test H₀: p = 0.50 against a two‑sided alternative at α = 0.05.
- Compute the standard error under the null: SE₀ = √[ 0.5·0.5 / 100 ] = √(0.0025) = 0.05.
- Compute the test statistic: z = (0.56 − 0.50) / 0.05 = 1.20.
- Two‑sided p‑value: approximately 2·(1 − Φ(1.20)) ≈ 2·0.115 = 0.230.
- Decision: since 0.230 > 0.05, fail to reject H₀. The sample does not provide strong evidence that the conversion rate differs from 50%.
- 95% Wald CI for p (estimation, not testing): SÊ = √[ 0.56·0.44 / 100 ] ≈ 0.0496 and 0.56 ± 1.96·0.0496 ≈ [0.463, 0.657].
- Effect size: h ≈ |2·arcsin(√0.56) − 2·arcsin(√0.5)| ≈ 0.12 (a small effect).
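The arithmetic above can be checked with a short script (a sketch; variable names are illustrative):

```python
from math import asin, sqrt
from statistics import NormalDist

n, x, p0 = 100, 56, 0.50
p_hat = x / n                                    # 0.56
se0 = sqrt(p0 * (1 - p0) / n)                    # 0.05
z = (p_hat - p0) / se0                           # 1.20
p_value = 2 * (1 - NormalDist().cdf(z))          # about 0.230
se_hat = sqrt(p_hat * (1 - p_hat) / n)           # about 0.0496 (estimated SE, for the CI)
wald = (p_hat - 1.96 * se_hat, p_hat + 1.96 * se_hat)   # about (0.463, 0.657)
h = abs(2 * asin(sqrt(p_hat)) - 2 * asin(sqrt(p0)))     # about 0.12
```

Note that the test uses SE₀ (under the null) while the Wald interval uses SÊ (from the data), which is why the two standard errors differ slightly.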
If your sample came from a finite population without replacement (say, drawing 100 items from a lot of N = 2000),
you could apply the FPC by multiplying SE₀ by √[(N − n)/(N − 1)]. This slightly narrows the standard error and
can affect z at high sampling fractions.
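The correction factor itself is a one‑liner (function name is illustrative):

```python
from math import sqrt

def fpc(N: int, n: int) -> float:
    """Finite population correction factor for sampling without replacement."""
    return sqrt((N - n) / (N - 1))

# Example: drawing n = 100 items from a lot of N = 2000
factor = fpc(2000, 100)   # about 0.975, so SE0 shrinks by roughly 2.5%
```

The factor approaches 1 as N grows relative to n, which is why the FPC is routinely ignored for large populations.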
Power and planning
Before collecting data, you can assess power: the probability of detecting a true difference
when the real proportion is p₁. For a given n, α, and tail, power increases as the
difference between p₁ and p₀ grows. Conversely, you can solve for the required sample size to
achieve a target power (e.g., 80%) for a practically meaningful effect. These calculations use the normal
approximation and should be checked with exact methods for small samples.
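A normal‑approximation power calculation for the two‑sided test can be sketched as follows (the function name is illustrative; it accounts for the different standard errors under H₀ and under the alternative):

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(n: int, p0: float, p1: float, alpha: float = 0.05) -> float:
    """Approximate power of the two-sided one-sample proportion z-test
    when the true proportion is p1 (normal approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)     # e.g. ~1.96 at alpha = 0.05
    se0 = sqrt(p0 * (1 - p0) / n)            # SE under H0 (defines the critical region)
    se1 = sqrt(p1 * (1 - p1) / n)            # SE under the alternative
    delta = p1 - p0
    upper = 1 - norm.cdf((z_crit * se0 - delta) / se1)    # reject in the upper tail
    lower = norm.cdf((-z_crit * se0 - delta) / se1)       # reject in the lower tail
    return upper + lower
```

For the worked example (p₀ = 0.50, true p₁ = 0.56, n = 100) the approximate power is only about 0.22, which helps explain the non‑significant result; quadrupling n raises it substantially.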
Common pitfalls and good practice
- Sampling bias: Random, independent sampling matters. Non‑random samples undermine inference.
- P‑hacking and multiplicity: If you run multiple tests, adjust α (e.g., Bonferroni) or control the false discovery rate.
- Over‑reliance on Wald intervals: Prefer Wilson or exact methods when in doubt.
- Reporting only significance: Always include an effect size and a confidence interval.
- Choosing the tail after the fact: Pick your alternative based on the research question before seeing the data.
- Ignoring discreteness: For small n, discreteness matters—use exact tests and interpret carefully.
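The Bonferroni adjustment mentioned above is simple enough to state in code (a sketch; the function name is illustrative):

```python
def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance level under a Bonferroni correction for m tests."""
    return alpha / m

# Example: 5 tests at a family-wise alpha of 0.05
# are each judged at bonferroni_alpha(0.05, 5) = 0.01
```

Bonferroni controls the family‑wise error rate but is conservative; FDR‑controlling procedures trade some of that strictness for power when many tests are run.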
Summary
The one‑sample proportion z‑test provides a fast, interpretable way to compare an observed proportion to a benchmark.
Compute z using the null standard error (with optional FPC), choose the tail to match the research question,
and translate the result to a p‑value. Supplement testing with confidence intervals—ideally Wilson or exact—and an effect
size such as Cohen’s h. When normal approximations are shaky, fall back to the exact binomial test. For planning,
evaluate power and required sample sizes to ensure your study can detect practically meaningful effects.
Glossary: Φ is the standard normal CDF; SE is standard error; FPC is finite population correction.