A/B Testing Sample Size Calculator

Plan cleaner A/B tests with reliable sample size estimates. Compare conversion rates, uplift, and traffic needs. Export the results for team reports.

Formula Used

The calculator uses a normal approximation for two independent proportions. It estimates the required visitors for control and variant groups.

Equal or unequal allocation formula:

nA = [(Zα × √((1 + 1/r) × p̄ × (1 − p̄)) + Zβ × √(p1 × (1 − p1) + p2 × (1 − p2) / r))²] ÷ (p2 − p1)²

nB = r × nA

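Here, p1 is the baseline conversion rate and p2 is the expected variant conversion rate. r is the variant-to-control allocation ratio, so r = 1 means equal allocation. p̄ is the pooled conversion rate, usually taken as (p1 + r × p2) ÷ (1 + r). Zα is the normal critical value for the chosen confidence level and tail type. Zβ is the normal quantile for the chosen power. The dropout allowance then increases the final sample.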
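The formula above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code: the function name, the default values, and the allocation-weighted definition of the pooled rate are assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, confidence=0.95, power=0.80,
                          ratio=1.0, two_sided=True):
    """Return (n_a, n_b): visitors needed for control and variant.

    ratio is r = n_b / n_a. The pooled rate p_bar is an assumption:
    the allocation-weighted average of p1 and p2.
    """
    alpha = 1.0 - confidence
    # A two-sided test splits alpha across both tails.
    tail = alpha / 2.0 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1.0 - tail)
    z_beta = NormalDist().inv_cdf(power)

    p_bar = (p1 + ratio * p2) / (1.0 + ratio)
    null_term = z_alpha * sqrt((1.0 + 1.0 / ratio) * p_bar * (1.0 - p_bar))
    alt_term = z_beta * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2) / ratio)

    n_a = (null_term + alt_term) ** 2 / (p2 - p1) ** 2
    return ceil(n_a), ceil(ratio * n_a)


# Example: 10% baseline, 12% target, 95% confidence, 80% power, equal split.
n_a, n_b = sample_size_per_group(p1=0.10, p2=0.12)
print(n_a, n_b)
```

With these inputs the sketch lands at roughly 3,840 visitors per group. Results from the calculator itself may differ slightly if it rounds or pools differently.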

How to Use This Calculator

  1. Enter the current conversion rate for the control experience.
  2. Enter the minimum detectable effect you care about.
  3. Select relative percent or absolute percentage point effect.
  4. Choose confidence, power, test direction, and tail type.
  5. Add traffic allocation and expected visitor loss if needed.
  6. Press Calculate to view sample size above the form.
  7. Use CSV or PDF buttons to export the same inputs and results.

Example Data Table

Scenario | Baseline | Minimum effect | Confidence | Power | Use case
Landing page test | 4.5% | 12% relative | 95% | 80% | Measure signup uplift
Checkout button test | 8% | 1 percentage point | 95% | 90% | Detect small checkout gains
Email offer test | 2.2% | 20% relative | 90% | 80% | Plan campaign sample
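The example scenarios mix relative and absolute effects, so it can help to convert each one into the variant conversion rate p2 that the formula expects. The sketch below is illustrative only; the helper name target_rate is hypothetical.

```python
def target_rate(baseline, effect, relative):
    """Return the expected variant conversion rate p2 as a fraction.

    relative=True treats effect as a lift over baseline (0.12 = 12%).
    relative=False treats effect as percentage points (0.01 = 1 point).
    """
    if relative:
        return baseline * (1.0 + effect)
    return baseline + effect


# Scenarios taken from the example data table.
scenarios = [
    ("Landing page test", 0.045, 0.12, True),
    ("Checkout button test", 0.08, 0.01, False),
    ("Email offer test", 0.022, 0.20, True),
]

for name, baseline, effect, relative in scenarios:
    p2 = target_rate(baseline, effect, relative)
    print(f"{name}: p1 = {baseline:.2%}, p2 = {p2:.2%}")
```

The landing page scenario targets 5.04%, the checkout scenario 9%, and the email scenario 2.64%.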

Why sample size matters

An A/B test needs enough visitors before you can trust a result. A small test can miss a useful change. It can also promote a weak change by chance. Sample size planning reduces that risk. It sets a target before traffic starts. That target keeps the decision honest.

What this calculator estimates

This calculator estimates the visitors required in each test group. It uses the baseline conversion rate, the minimum detectable effect, the confidence level, and the desired power. It supports one-sided and two-sided tests, and equal or unequal allocation. You can also add a dropout allowance. The tool then estimates total visitors and test duration from daily traffic.
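The dropout allowance and duration estimate can be sketched as below. The page does not state how the tool applies the allowance, so dividing by (1 − dropout rate) is an assumption, as are the function and parameter names.

```python
from math import ceil


def plan_traffic(n_control, n_variant, dropout_rate, daily_visitors):
    """Return (total visitors to recruit, estimated days to run).

    Assumption: dropout_rate is the fraction of visitors expected to be
    lost (for example to bot filters), and the total is inflated by
    dividing by (1 - dropout_rate).
    """
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    if daily_visitors <= 0:
        raise ValueError("daily_visitors must be positive")

    total_needed = n_control + n_variant
    total_recruited = ceil(total_needed / (1.0 - dropout_rate))
    days = ceil(total_recruited / daily_visitors)
    return total_recruited, days


# Example: 3,841 visitors per group, 10% expected loss, 1,200 visitors a day.
total, days = plan_traffic(3841, 3841, dropout_rate=0.10, daily_visitors=1200)
print(total, days)  # 8536 8
```

Rounding the duration up to whole days keeps the plan from ending before the target sample is reached.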

How to choose inputs

Start with a realistic baseline conversion rate. Use recent data from the same funnel. Then choose the smallest uplift worth acting on. Do not use a dramatic uplift just to make the test shorter. Choose the confidence level that matches your risk tolerance. Many teams use 95 percent confidence. Power is often 80 or 90 percent. Higher power needs more visitors.

How to read results

The required sample per group is the main output. The total sample shows the full test audience. The estimated duration helps you judge whether the test is practical. If the duration is too long, reconsider the test design, for example by testing a bolder change or sending more traffic to the test. A larger expected effect lowers the needed sample. More traffic shortens the calendar time. Avoid stopping early, because early results often swing widely.

Practical testing guidance

Run the test through a full business cycle when possible. Keep targeting rules stable. Avoid changing tracking during the test. Check that both groups receive comparable traffic quality. Record assumptions before launch. Export the results for your experiment brief. A planned sample size helps teams decide with less bias. It also makes results easier to explain later.

Common planning mistakes

Many tests fail because the target effect is chosen after launch. Some teams also ignore traffic loss from consent banners, bot filters, or broken sessions. Include a cushion for these losses. Do not send a new campaign's traffic to only one group. Keep the audience balanced. Review the result only after the planned sample is reached.

FAQs

What is an A/B test sample size?

It is the number of visitors needed in each group before comparing results. A planned sample size reduces random error and supports stronger decisions.

What is baseline conversion rate?

Baseline conversion rate is your current conversion rate before the new variant is tested. Use recent and reliable funnel data for this input.

What is minimum detectable effect?

It is the smallest change you want the test to detect. Smaller effects need larger sample sizes because they are harder to separate from noise.

Should I use relative or absolute effect?

Use relative effect for percentage lift over baseline. Use absolute effect when you know the exact percentage point change you want to detect.

What confidence level should I choose?

Many teams choose 95 percent confidence. A higher confidence level lowers false positive risk, but it usually requires more visitors.

What does statistical power mean?

Power is the chance of detecting the target effect if it truly exists. Higher power reduces missed wins, but it increases sample size.
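To see how power relates to sample size, the sketch below inverts the planning formula for equal allocation and estimates the power reached with a given number of visitors per group. It is an approximation under the same normal assumptions; the function name is hypothetical, and the small contribution of the opposite tail in a two-sided test is ignored.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(p1, p2, n_per_group, confidence=0.95, two_sided=True):
    """Approximate power of a two-proportion z-test with equal groups."""
    alpha = 1.0 - confidence
    tail = alpha / 2.0 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1.0 - tail)

    p_bar = (p1 + p2) / 2.0
    # Standard error under the null (pooled) and under the alternative.
    se_null = sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_group)
    se_alt = sqrt((p1 * (1.0 - p1) + p2 * (1.0 - p2)) / n_per_group)

    z = (abs(p2 - p1) - z_alpha * se_null) / se_alt
    return NormalDist().cdf(z)


# Example: 10% baseline, 12% target, at three different group sizes.
for n in (1000, 3841, 6000):
    print(n, round(achieved_power(0.10, 0.12, n), 3))
```

In this example, about 3,841 visitors per group gives close to 80 percent power. Smaller groups fall well below that, which is why an underpowered test often misses a real effect.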

What is a two-sided test?

A two-sided test checks for either an increase or a decrease. It is more conservative than a one-sided test and needs more visitors at the same confidence and power.
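The difference between the two tail types shows up in the critical value Zα. The short sketch below computes it with Python's standard library; the function name critical_z is hypothetical.

```python
from statistics import NormalDist


def critical_z(confidence, two_sided):
    """Return the normal critical value for a confidence level."""
    alpha = 1.0 - confidence
    # A two-sided test places alpha / 2 in each tail.
    tail = alpha / 2.0 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)


print(round(critical_z(0.95, two_sided=True), 3))   # 1.96
print(round(critical_z(0.95, two_sided=False), 3))  # 1.645
```

Because the two-sided value is larger, the same inputs lead to a larger required sample.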

Can I stop a test early?

Stopping early can mislead results because early conversion rates often fluctuate. Use the planned sample size unless you use a valid sequential method.
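A small simulation can illustrate the risk. The sketch below runs A/A tests, where both groups share the same true conversion rate, and compares two decision rules: stopping at the first interim look that appears significant, and checking only once at the planned sample. All parameter values are arbitrary assumptions chosen to keep the run short.

```python
import random
from math import sqrt


def z_stat(conv_a, conv_b, n):
    """Pooled two-proportion z statistic for equal group sizes n."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n)
    return 0.0 if se == 0 else (conv_b - conv_a) / n / se


def false_positive_rates(rate=0.10, looks=10, batch=200, runs=300, seed=1):
    """Return (rate with peeking, rate with a single final check)."""
    rng = random.Random(seed)
    peeking_hits = 0
    final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        flagged_any = False
        z = 0.0
        for look in range(1, looks + 1):
            # Add one batch of visitors to each group, then test.
            conv_a += sum(rng.random() < rate for _ in range(batch))
            conv_b += sum(rng.random() < rate for _ in range(batch))
            z = z_stat(conv_a, conv_b, look * batch)
            if abs(z) > 1.96:
                flagged_any = True
        peeking_hits += flagged_any
        final_hits += abs(z) > 1.96
    return peeking_hits / runs, final_hits / runs


peeking, final = false_positive_rates()
print(f"peeking: {peeking:.1%}, final check only: {final:.1%}")
```

Since there is no real difference between the groups, every flagged result is a false positive. The peeking rate can never be lower than the single-check rate here, and in runs like this it is usually noticeably higher than the nominal 5 percent.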

Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.