AB Split Test Calculator

Enter Test Data

Test Name

Control Visitors

Control Conversions

Variation Visitors

Variation Conversions

Confidence Level (%)

Power Level (%)

Planning Baseline Rate (%)

Minimum Detectable Effect (%)

Average Conversion Value

Variation Traffic Split (%)

Test Direction

Example Data Table

Variant	Visitors	Conversions	Conversion Rate	Notes
Control A	12,000	720	6.00%	Original headline and layout
Variation B	11,800	802	6.80%	New headline and call button
Planning Target	Calculated	Based on rate	Baseline plus MDE	Used for sample estimation

Formula Used

The calculator uses a two-proportion z test to compare conversion rates.

Control rate: pA = conversionsA / visitorsA

Variation rate: pB = conversionsB / visitorsB

Absolute lift: pB - pA

Relative uplift: ((pB - pA) / pA) × 100

Pooled rate: p = (conversionsA + conversionsB) / (visitorsA + visitorsB)

Standard error: SE = sqrt(p × (1 - p) × (1 / visitorsA + 1 / visitorsB))

Z score: z = (pB - pA) / SE

Confidence interval: (pB - pA) ± z critical × unpooled standard error, where unpooled SE = sqrt(pA × (1 - pA) / visitorsA + pB × (1 - pB) / visitorsB)

Sample size: calculated from alpha, power, baseline rate, and minimum detectable effect.
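The z test and interval formulas above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: the function name and the returned keys are hypothetical, the p value is assumed to be two sided, and the standard normal distribution comes from Python's standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(visitors_a, conversions_a, visitors_b, conversions_b,
                          confidence=0.95):
    """Two-sided two-proportion z test with an unpooled confidence interval."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    diff = p_b - p_a

    # Pooled rate and pooled standard error, used for the z score.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = diff / se_pooled

    # Two-sided p value from the standard normal distribution (assumption).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error, used for the confidence interval.
    se_unpooled = sqrt(p_a * (1 - p_a) / visitors_a
                       + p_b * (1 - p_b) / visitors_b)
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    interval = (diff - z_critical * se_unpooled,
                diff + z_critical * se_unpooled)

    return {
        "p_a": p_a,
        "p_b": p_b,
        "absolute_lift": diff,
        "relative_uplift_pct": diff / p_a * 100,
        "z": z,
        "p_value": p_value,
        "ci": interval,
    }


# Values from the example data table: Control A and Variation B.
result = two_proportion_z_test(12_000, 720, 11_800, 802)
for name, value in result.items():
    print(name, value)
```

With the example table's numbers, this sketch gives a z score of about 2.5 and a two-sided p value of about 0.012, so the difference would pass a 95% confidence threshold.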

How To Use This Calculator

Enter visitors and conversions for the control version.
Enter visitors and conversions for the variation version.
Choose the confidence level for your decision threshold.
Choose the test direction that matches your hypothesis.
Add baseline rate, power, and effect size for planning.
Add average conversion value for revenue per visitor estimates.
Press the calculate button to view the result.
Use the CSV or PDF buttons to save the report.

Smarter Experiment Decisions

An A/B split test compares two page versions with real visitor data. One version is the control. The other version is the variation. The goal is not only higher conversions. The goal is a reliable decision. This calculator helps you review that decision with clear metrics.

Why Split Testing Matters

Small rate changes can create large business gains. They can also appear by chance. A busy landing page may show a higher rate today and a lower rate tomorrow. Statistical testing reduces that confusion. It checks whether the observed lift is large enough for the sample size.

Key Metrics To Watch

Conversion rate is the main score. It equals conversions divided by visitors. Absolute lift shows the direct rate difference in percentage points. Relative uplift shows the percentage gain over the control. The z score measures how many standard errors separate the variation rate from the control rate. The p value estimates the chance of seeing a difference at least this large when no real effect exists.

Planning Better Tests

Good experiments start before traffic begins. Choose a baseline rate. Set a minimum detectable effect. Pick a confidence level. Pick a power level. The calculator estimates visitors needed per version. Larger samples are needed when the baseline rate is low. Larger samples are also needed when the target effect is small.
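The page does not state the exact sample size formula, so the sketch below uses one common normal-approximation formula for comparing two proportions. It assumes an equal traffic split, a minimum detectable effect expressed relative to the baseline, and a hypothetical function name; the calculator's own estimate may differ.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_version(baseline_rate, relative_mde, confidence=0.95,
                         power=0.80, two_sided=True):
    """Approximate visitors needed in each version, assuming an equal split."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # target rate: baseline plus MDE
    alpha = 1 - confidence

    # Critical values for the significance threshold and the power level.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    z_beta = NormalDist().inv_cdf(power)

    # Sum of the two binomial variances, divided by the squared effect.
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2)


# Illustrative inputs: 6% baseline and a 10% relative effect.
print(visitors_per_version(0.06, 0.10))
# A smaller target effect needs more visitors per version.
print(visitors_per_version(0.06, 0.05))
# A lower baseline rate also needs more visitors per version.
print(visitors_per_version(0.03, 0.10))
```

Under these assumptions, halving the effect size roughly quadruples the required sample, because the effect enters the formula squared.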

Reading The Result

A significant result does not promise permanent growth. It means the data passed the selected threshold. Review the confidence interval too. A wide interval means uncertainty remains. A narrow interval means the estimate is steadier. Also check revenue per visitor when value is entered.
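Revenue per visitor can be sketched as the conversion rate multiplied by the average conversion value. This assumes every conversion is worth the same average amount; the function name and the value of 40.0 are hypothetical and are not taken from the calculator.

```python
def revenue_per_visitor(visitors, conversions, average_conversion_value):
    """Estimated revenue per visitor: conversion rate times average value."""
    return conversions / visitors * average_conversion_value


# Example table data with a hypothetical average conversion value of 40.0.
control_rpv = revenue_per_visitor(12_000, 720, 40.0)
variation_rpv = revenue_per_visitor(11_800, 802, 40.0)
print(control_rpv, variation_rpv, variation_rpv - control_rpv)
```

Because the same average value is applied to both versions, this estimate moves in step with the conversion rate and does not capture differences in order size between versions.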

Practical Advice

Run tests for full business cycles. Avoid stopping only because one day looks strong. Keep tracking source quality, device mix, and seasonal changes. Use one primary goal. Extra goals can support the story, but they should not replace the main decision. A clean test makes the final call easier.

Common Mistakes

Many teams test several changes at once. That makes results harder to explain. Some teams ignore mobile traffic. Others compare paid traffic against organic traffic. Keep audiences balanced. Keep tracking rules stable. Record each hypothesis before launch. Then your final report will clearly show what changed, why it changed, and how strong the evidence is.

FAQs

What is an AB split test?

It is a controlled comparison between two versions. Each visitor sees either the control or the variation. The goal is to learn which version performs better for a chosen conversion action.

What does statistical significance mean?

It means the observed difference passed your selected confidence threshold. It does not prove future results. It only shows that the current sample would be unlikely if random noise were the sole cause of the difference.

What is a p value?

The p value estimates how likely a difference at least this large would be under a no-effect assumption. Smaller values show stronger evidence against that assumption.

Should I use a one sided test?

Use it only when you committed to a directional hypothesis before launch. For most marketing tests, a two sided test is safer and more balanced.
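To see how the test direction changes the p value, here is a small sketch that converts a z score into a p value for each direction. The direction labels and the function name are hypothetical, since the calculator's own option names are not listed here, and the z score of 2.51 is an illustrative value.

```python
from statistics import NormalDist


def p_value_from_z(z, direction="two-sided"):
    """Convert a z score to a p value for the chosen test direction."""
    std_normal = NormalDist()
    if direction == "two-sided":
        # Either version could be better.
        return 2 * (1 - std_normal.cdf(abs(z)))
    if direction == "greater":
        # Hypothesis fixed before launch: variation beats control.
        return 1 - std_normal.cdf(z)
    if direction == "less":
        # Hypothesis fixed before launch: variation is worse than control.
        return std_normal.cdf(z)
    raise ValueError(f"unknown direction: {direction}")


z = 2.51  # illustrative z score
for direction in ("two-sided", "greater", "less"):
    print(direction, p_value_from_z(z, direction))
```

For a positive z score, the one sided "greater" p value is half the two sided value, which is why a one sided test reaches significance sooner and should only be chosen in advance.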

What is minimum detectable effect?

It is the smallest relative improvement you want the test to detect. Smaller effects need larger samples and longer testing periods.

Why is my result not significant?

Your lift may be too small, your sample may be too small, or the variation may not truly differ from the control. A larger sample and a more balanced traffic split can help.

Can I stop a test early?

Stopping early can inflate false wins. Run the test through planned cycles. Check sample size, traffic quality, and tracking stability first.

What should I export?

Export the conversion rates, uplift, z score, p value, confidence interval, and sample estimate. These values help teams review the decision later.