Calculator inputs
Use the form below to compare two variants using conversion data. Results appear above this form after submission.
Example data table
| Variant | Visitors | Conversions | Conversion Rate | Revenue per Conversion |
|---|---|---|---|---|
| A | 10,000 | 480 | 4.80% | $35.00 |
| B | 10,050 | 560 | 5.57% | $35.00 |
This example mirrors a common landing page experiment. Use the example button to preload the same numbers into the calculator.
Formula used
Conversion rates: pA = conversionsA ÷ visitorsA, and pB = conversionsB ÷ visitorsB.
Absolute lift: pB − pA. This is the difference in percentage points between the two variants.
Relative uplift: (pB − pA) ÷ pA. This shows proportional change relative to Variant A.
Pooled rate: (conversionsA + conversionsB) ÷ (visitorsA + visitorsB).
Pooled standard error: √[p̄(1−p̄)(1/nA + 1/nB)] where p̄ is the pooled rate.
Z-score: (pB − pA) ÷ pooled standard error. Larger magnitude means stronger evidence.
P-value: Derived from the standard normal distribution using the chosen one-tailed or two-tailed setting.
Confidence interval for lift: (pB − pA) ± z* × unpooled standard error, where the unpooled standard error is √[pA(1−pA)/nA + pB(1−pB)/nB] and z* is the critical value for the chosen confidence level.
Projected impact: future visitors × lift = estimated extra conversions. Extra conversions × value per conversion = estimated value impact.
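The formulas above can be sketched end to end in plain Python using the example data from the table. The projected-traffic figure (50,000 future visitors) is an assumption chosen for illustration, not a value from the example.

```python
import math

def normal_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Example data from the table above.
visitors_a, conversions_a = 10_000, 480
visitors_b, conversions_b = 10_050, 560
value_per_conversion = 35.00
future_visitors = 50_000          # assumption for illustration

p_a = conversions_a / visitors_a  # 4.80%
p_b = conversions_b / visitors_b  # ~5.57%
lift = p_b - p_a                  # absolute lift in percentage points
relative_uplift = lift / p_a      # proportional change vs. Variant A

# Pooled rate and pooled standard error for the z-test.
pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
z = lift / se_pooled
p_two_tailed = 2 * (1 - normal_cdf(abs(z)))

# Unpooled standard error for the confidence interval on the lift.
se_unpooled = math.sqrt(p_a * (1 - p_a) / visitors_a
                        + p_b * (1 - p_b) / visitors_b)
z_star = 1.959964                 # two-sided critical value for 95% confidence
ci_low, ci_high = lift - z_star * se_unpooled, lift + z_star * se_unpooled

# Projected business impact.
extra_conversions = future_visitors * lift
value_impact = extra_conversions * value_per_conversion

print(f"lift={lift:.4f}  uplift={relative_uplift:.1%}  z={z:.2f}  p={p_two_tailed:.4f}")
print(f"95% CI for lift: ({ci_low:.4f}, {ci_high:.4f})")
print(f"projected: {extra_conversions:.0f} extra conversions, ${value_impact:,.0f}")
```

With the example numbers this yields a z-score of roughly 2.47 and a two-tailed p-value of about 0.014, with the 95% interval sitting entirely above zero.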
How to use this calculator
- Enter visitors and conversions for both experiment variants.
- Choose your confidence level based on your release risk tolerance.
- Select the hypothesis direction that matches your test design.
- Add projected visitors and value per conversion for business impact forecasting.
- Click Calculate Confidence to generate statistical and business outputs.
- Review the result panel, confidence interval, p-value, and observed power.
- Use the chart to visually compare conversion rates and interval ranges.
- Download the result summary as CSV or PDF for reporting.
Frequently asked questions
1. What does this calculator measure?
It compares two conversion rates using a z-test for proportions. It reports lift, p-value, confidence interval, confidence score, and projected business impact from the observed difference.
2. When should I use a two-tailed test?
Use a two-tailed test when you only care whether the variants differ, regardless of direction. It is the safer default when your experiment plan did not specify a directional hypothesis beforehand.
3. What does the p-value mean here?
The p-value estimates how surprising your observed difference would be if no true difference existed. Smaller values indicate stronger evidence against the null hypothesis.
4. Is confidence score the same as business certainty?
No. Statistical confidence reflects evidence in the sample, not guaranteed future performance. Business certainty also depends on seasonality, tracking quality, implementation risk, and experiment duration.
5. Why can a result be positive but still inconclusive?
A positive lift may still be too small relative to sampling noise. If variance is high or sample size is limited, the interval can still include zero, making the test inconclusive.
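As a hypothetical illustration (the numbers below are invented, not from the example table), a full percentage point of observed lift on only 1,000 visitors per variant still produces a 95% interval that straddles zero:

```python
import math

# Hypothetical small test: a positive lift that is still inconclusive.
visitors_a, conversions_a = 1_000, 40   # 4.0%
visitors_b, conversions_b = 1_000, 50   # 5.0%

p_a = conversions_a / visitors_a
p_b = conversions_b / visitors_b
lift = p_b - p_a                        # +0.010, a positive observed lift

# Unpooled standard error and 95% interval for the lift.
se = math.sqrt(p_a * (1 - p_a) / visitors_a + p_b * (1 - p_b) / visitors_b)
ci_low = lift - 1.959964 * se
ci_high = lift + 1.959964 * se

print(f"lift={lift:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
# The interval includes zero, so this positive lift is inconclusive at 95%.
```

Roughly quadrupling the sample at the same rates would shrink the interval enough to exclude zero, which is why sample size matters as much as the observed lift.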
6. What is observed power?
Observed power is a post-hoc estimate computed from the measured effect size and sample sizes. It is useful as context, but prospective sample-size planning remains the better design method.
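One common post-hoc approximation (an assumption about how such an estimate can be computed, not a statement of this calculator's exact method) treats the observed z-score as the true standardized effect and asks how often a repeat of the test would clear the two-sided critical value:

```python
import math

def normal_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def observed_power(z, z_crit=1.959964):
    # Post-hoc power at the 95% two-sided level by default: probability
    # that a repeat test with the same true effect exceeds the critical
    # value in either tail. The second term is usually negligible.
    return normal_cdf(abs(z) - z_crit) + normal_cdf(-abs(z) - z_crit)

# With the example table's z-score of about 2.47:
power = observed_power(2.465)
print(f"observed power = {power:.2f}")
```

Note that this quantity is a deterministic function of the p-value, which is one reason it should be read as context rather than as an independent check.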
7. Can I use this for revenue or signup tests?
Yes, as long as the main success metric is binary, such as signup or purchase. For continuous metrics like average order value, use a different statistical test.
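For a continuous metric such as average order value, a Welch-style two-sample t-test is one common alternative. The order values below are invented for illustration; this sketch computes only the t-statistic and the Welch–Satterthwaite degrees of freedom (the p-value would then come from the t-distribution):

```python
import math
from statistics import mean, variance

# Hypothetical per-order values for two variants (invented numbers).
aov_a = [30, 35, 40, 32, 38, 31, 36, 34]
aov_b = [33, 39, 41, 36, 42, 35, 40, 38]

n_a, n_b = len(aov_a), len(aov_b)
var_a, var_b = variance(aov_a), variance(aov_b)   # sample variances (n - 1)

# Welch's t-statistic: does not assume equal variances between groups.
t = (mean(aov_b) - mean(aov_a)) / math.sqrt(var_a / n_a + var_b / n_b)

# Welch-Satterthwaite degrees of freedom.
num = (var_a / n_a + var_b / n_b) ** 2
den = (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
df = num / den

print(f"t={t:.2f}, df={df:.1f}")
```

The unequal-variance form is a safer default than the classic pooled t-test, since order values across variants rarely share the same spread.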
8. Should I stop a test as soon as significance appears?
Not usually. Peeking too often can inflate false positives. Follow a preplanned sample target or stopping rule, then evaluate the final result using consistent criteria.