Advanced A/B Test Significance Calculator

Analyze control and variant performance with clear statistics. Review lift, confidence, power, and sample balance. Make better experiment calls using fast downloadable visual reports.

Calculator Input

Control Visitors

Control Conversions

Variant Visitors

Variant Conversions

Alpha Level

Test Direction

Minimum Detectable Effect (%)

Target Power

Baseline Rate Override

Visualization

The chart compares observed conversion rates and the confidence interval of the rate difference.

Example Data Table

Variation	Visitors	Conversions	Conversion Rate	Revenue per Conversion	Estimated Revenue
Control A	10,000	820	8.20%	$45	$36,900
Variant B	9,800	910	9.29%	$45	$40,950

Use rows like these when validating the calculator with your own experiment logs.
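As a quick sanity check before entering rows like these, the derived columns can be recomputed from the raw counts. The sketch below is illustrative only: the dictionary keys and the `derive_columns` helper are hypothetical names, and it assumes Estimated Revenue is simply conversions multiplied by revenue per conversion, which matches the two rows above.

```python
# Raw inputs copied from the example table (field names are hypothetical).
rows = [
    {"variation": "Control A", "visitors": 10_000, "conversions": 820, "revenue_per_conversion": 45},
    {"variation": "Variant B", "visitors": 9_800, "conversions": 910, "revenue_per_conversion": 45},
]

def derive_columns(row):
    """Recompute conversion rate and estimated revenue from raw counts."""
    rate = row["conversions"] / row["visitors"]
    # Assumption: estimated revenue = conversions x revenue per conversion.
    revenue = row["conversions"] * row["revenue_per_conversion"]
    return f"{rate:.2%}", revenue

derived = {row["variation"]: derive_columns(row) for row in rows}

for name, (rate, revenue) in derived.items():
    print(f"{name}: rate={rate}, estimated revenue=${revenue:,}")
```

If a recomputed rate or revenue figure disagrees with your experiment log, fix the log before interpreting any significance result.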

Formula Used

Conversion rate: p = conversions / visitors.

Pooled rate: p̂ = (c₁ + c₂) / (n₁ + n₂).

Pooled standard error: SE = √[p̂(1 − p̂)(1/n₁ + 1/n₂)].

Z score: z = (p₂ − p₁) / SE.

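P value: derived from the standard normal CDF Φ. For a two-tailed test, p = 2 × [1 − Φ(|z|)]. For a one-tailed test, p is the single tail area in the hypothesized direction (for example, 1 − Φ(z) when testing variant > control).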

Confidence interval for the difference: (p₂ − p₁) ± z_critical × SE_unpooled, where SE_unpooled = √[p₁(1 − p₁)/n₁ + p₂(1 − p₂)/n₂].

Lift: (p₂ − p₁) / p₁.

Required sample per variation: estimated from baseline rate, target uplift, alpha, and desired power using a two-proportion normal approximation.
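The formulas above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration and not the calculator's actual source. The function name and the returned keys are hypothetical, it assumes a two-tailed test, and it assumes the unpooled standard error takes the usual form √[p₁(1 − p₁)/n₁ + p₂(1 − p₂)/n₂].

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(n1, c1, n2, c2, alpha=0.05):
    """Two-tailed pooled z test; group 1 is control, group 2 is variant."""
    normal = NormalDist()
    p1, p2 = c1 / n1, c2 / n2
    diff = p2 - p1

    # Pooled rate and pooled standard error drive the z score.
    pooled = (c1 + c2) / (n1 + n2)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    p_value = 2 * (1 - normal.cdf(abs(z)))

    # Unpooled standard error (assumed form) drives the confidence interval.
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_critical = normal.inv_cdf(1 - alpha / 2)
    margin = z_critical * se_unpooled

    return {
        "p1": p1,
        "p2": p2,
        "z": z,
        "p_value": p_value,
        "ci": (diff - margin, diff + margin),
        "lift": diff / p1,
    }

# Rows from the example data table: Control A and Variant B.
result = two_proportion_z_test(10_000, 820, 9_800, 910)
for key, value in result.items():
    print(key, value)
```

For the example table this gives a z score of roughly 2.7 and a lift of about 13%, with a confidence interval for the difference that stays above zero.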

How to Use This Calculator

Enter visitors and conversions for the control group.
Enter visitors and conversions for the variant group.
Set your alpha threshold and choose one-tailed or two-tailed testing.
Add an MDE percentage and target power for planning future tests.
Optionally override the baseline conversion rate with a decimal value (for example, 0.082 for 8.2%).
Press calculate to view results above the form.
Review the chart, confidence interval, and winner statement.
Download a CSV or PDF summary for reporting.

Frequently Asked Questions

1. What does the calculator test?

It tests whether two conversion rates differ beyond random sampling noise. The result combines a z score, p value, confidence interval, and lift summary for practical interpretation.

2. When should I use a two-tailed test?

Use a two-tailed test when you care about either improvement or decline. It is the safer default for most product, marketing, and experiment review workflows.

3. When is a one-tailed test acceptable?

A one-tailed test can fit when you defined a single directional hypothesis before running the experiment. Do not switch after seeing the data.
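To see why the direction must be fixed in advance, the sketch below converts one z score into p values under each rule. The function name and the direction labels ("two-tailed", "greater", "less") are hypothetical and may not match the calculator's Test Direction options.

```python
from statistics import NormalDist

def p_value_from_z(z, direction="two-tailed"):
    """Convert a z score into a p value under the chosen tail rule."""
    phi = NormalDist().cdf
    if direction == "two-tailed":
        return 2 * (1 - phi(abs(z)))
    if direction == "greater":  # hypothesis: variant rate > control rate
        return 1 - phi(z)
    if direction == "less":  # hypothesis: variant rate < control rate
        return phi(z)
    raise ValueError(f"unknown direction: {direction}")

z = 1.8
p_two = p_value_from_z(z, "two-tailed")
p_one = p_value_from_z(z, "greater")
print(f"two-tailed p = {p_two:.4f}, one-tailed p = {p_one:.4f}")
```

At z = 1.8 the two-tailed p value is above 0.05 while the one-tailed p value is below it, so switching direction after seeing the data can turn a non-significant result into a significant one.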

4. What is lift?

Lift shows the relative percentage change from the control rate. It helps translate significance into business language, especially when comparing improvements across experiments.

5. Why can a large lift still be insignificant?

Small samples can create large-looking differences that remain noisy. Significance depends on both effect size and uncertainty, not on lift alone.
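A small illustration, using made-up counts rather than real experiment data: the same 50% lift is tested at two sample sizes with a pooled two-proportion z test. The helper name `lift_and_p_value` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def lift_and_p_value(n1, c1, n2, c2):
    """Return (relative lift, two-tailed p value) from a pooled z test."""
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return (p2 - p1) / p1, p_value

# Hypothetical counts: 8% vs 12% conversion at two different scales.
small = lift_and_p_value(100, 8, 100, 12)
large = lift_and_p_value(10_000, 800, 10_000, 1_200)

print(f"small sample: lift={small[0]:.0%}, p={small[1]:.3f}")
print(f"large sample: lift={large[0]:.0%}, p={large[1]:.3g}")
```

Both runs show the same lift, but only the larger sample shrinks the standard error enough for the difference to clear a 0.05 threshold.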

6. What does observed power mean here?

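Observed power estimates how likely a test of this size would be to detect an effect as large as the one observed, at the chosen alpha. It is computed from the same data as the p value, so treat it as a rough sensitivity check and rely more heavily on planned sample size before launching tests.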
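The calculator's exact observed-power formula is not stated here. One common normal approximation treats the observed z score as if it were the true standardized effect, as in the sketch below. The function name is hypothetical and a two-tailed test is assumed.

```python
from statistics import NormalDist

def observed_power(z_observed, alpha=0.05):
    """Approximate two-tailed post hoc power, treating |z_observed| as the true effect."""
    normal = NormalDist()
    z_critical = normal.inv_cdf(1 - alpha / 2)
    shift = abs(z_observed)
    # Probability that a new z score lands beyond either critical value.
    return normal.cdf(shift - z_critical) + normal.cdf(-shift - z_critical)

for z in (0.0, 1.0, 1.96, 2.7):
    print(f"z = {z:.2f} -> observed power = {observed_power(z):.3f}")
```

Under this approximation a result sitting exactly at the significance boundary has observed power of about 50%, which shows how directly the figure follows from the p value.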

7. How is required sample size estimated?

The estimator uses a two-proportion normal approximation based on baseline rate, desired uplift, alpha, and target power. It returns the sample needed for each variation.
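One widely used form of the two-proportion normal approximation is sketched below. It may differ in detail from what the calculator implements. The function name and parameters are hypothetical, and the sketch assumes the uplift is relative to the baseline rate and is entered as a fraction (0.10 for 10%).

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_sample_per_variation(baseline, relative_mde, alpha=0.05,
                                  power=0.80, two_tailed=True):
    """Approximate visitors needed in each variation to detect the uplift."""
    normal = NormalDist()
    p1 = baseline
    p2 = baseline * (1 + relative_mde)  # assumption: uplift is relative
    p_bar = (p1 + p2) / 2

    tail = alpha / 2 if two_tailed else alpha
    z_alpha = normal.inv_cdf(1 - tail)
    z_beta = normal.inv_cdf(power)

    # Null-hypothesis term uses the average rate; alternative term uses each rate.
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

n_10 = required_sample_per_variation(0.082, 0.10)
n_20 = required_sample_per_variation(0.082, 0.20)
n_10_high_power = required_sample_per_variation(0.082, 0.10, power=0.90)
print(n_10, n_20, n_10_high_power)
```

Halving the detectable uplift roughly quadruples the required sample, and raising target power also increases it, so both settings are worth fixing before the test starts.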

8. Can I use this for revenue metrics?

It is built for binary conversions. For average order value, revenue per user, or retention, use tests designed for continuous or time-based outcomes.