Advanced A/B Test Size Calculator

Calculator

Baseline conversion rate (%)

Minimum detectable effect

Effect type

Expected direction

Significance level, alpha (%)

Statistical power (%)

Test tail

Variant to control ratio

Daily eligible visitors

Design effect multiplier

Sample cushion (%)

Round each group up to

Formula Used

The calculator uses a normal approximation for two independent proportions with an allocation ratio.

nA = [zAlpha × sqrt((1 + 1 / r) × pBar × (1 - pBar)) + zBeta × sqrt(p1 × (1 - p1) + p2 × (1 - p2) / r)]² / (p2 - p1)²

nB = r × nA

pBar = (p1 + r × p2) / (1 + r)

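Here, nA is the control sample size and nB is the variant sample size. p1 is the control conversion rate and p2 is the target variant rate, both written as proportions. r is the variant to control allocation ratio. zAlpha is the standard normal critical value for the chosen alpha and test tail. zBeta is the standard normal quantile for the chosen power.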
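The formula above can be sketched in Python as follows. This is a minimal sketch, not the calculator's own code. The function name sample_sizes and its parameters are illustrative. It assumes a two-sided test splits alpha across both tails and that each group is rounded up to a whole visitor.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_sizes(p1, p2, alpha=0.05, power=0.80, r=1.0, two_sided=True):
    """Return (nA, nB): control and variant sizes, rounded up."""
    z = NormalDist().inv_cdf
    # Assumption: a two-sided test uses alpha / 2 in each tail.
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    # Pooled rate weighted by the allocation ratio.
    p_bar = (p1 + r * p2) / (1 + r)
    pooled = z_alpha * sqrt((1 + 1 / r) * p_bar * (1 - p_bar))
    unpooled = z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2) / r)
    n_a = (pooled + unpooled) ** 2 / (p2 - p1) ** 2
    return ceil(n_a), ceil(r * n_a)


print(sample_sizes(0.05, 0.055))  # (31234, 31234)
```

The printed value corresponds to a 5% baseline with a 10% relative increase, 80% power, 5% alpha, and a 1:1 split.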

How to Use This Calculator

Enter the current conversion rate for the control group.
Add the minimum effect that is worth detecting.
Select relative lift or absolute percentage point change.
Choose alpha, power, tail type, and allocation ratio.
Add daily traffic to estimate test duration.
Use design effect or cushion for safer planning.
Press the submit button and review the result above the form.
Download the result as CSV or PDF when needed.

Example Data Table

Baseline	Effect	Power	Alpha	Split	Control Sample	Variant Sample
5%	10% relative increase	80%	5%	1:1	31,234	31,234
10%	15% relative increase	90%	5%	1:1	8,960	8,960
20%	5% relative increase	80%	5%	1:1	25,583	25,583
3%	25% relative increase	80%	5%	1:1	9,100	9,100

What This Calculator Does

An A/B test size calculator estimates how many visitors you need before starting an experiment. It compares a control rate with a target variant rate. It also uses alpha, power, traffic split, and lift. These inputs help you avoid weak tests. A small sample can miss a real gain. A very large sample can waste traffic and time.

Why Sample Size Matters

Sample size controls the strength of your decision. Power is the chance of detecting the planned lift when it truly exists. Alpha is the risk of calling a result real when it is not. A lower alpha needs more visitors. A higher power also needs more visitors. Smaller lifts are harder to prove, because the required sample grows with the inverse square of the difference between the two rates. This is why a tiny conversion change needs a large experiment.

Planning Before Launch

Use realistic baseline data. Pull it from recent analytics. Do not rely on a single unusually good day. Choose the smallest change worth acting on. This is the minimum detectable effect. Enter a relative lift when you think in percent growth over the baseline. Enter an absolute lift when you think in percentage points. Use equal traffic when both pages are stable. Use an uneven split that sends less traffic to the variant when the variant is risky.
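The difference between a relative and an absolute effect can be shown with a short sketch. The function target_rate and its arguments are hypothetical names for illustration. Rates are assumed to be proportions, so 5% is written as 0.05.

```python
def target_rate(baseline, effect, effect_type="relative", direction="increase"):
    """Convert a baseline rate and a minimum detectable effect into p2.

    For "relative", effect is a fraction of the baseline (0.10 = 10% lift).
    For "absolute", effect is a difference in proportion
    (0.005 = 0.5 percentage points).
    """
    sign = 1 if direction == "increase" else -1
    if effect_type == "relative":
        p2 = baseline * (1 + sign * effect)
    elif effect_type == "absolute":
        p2 = baseline + sign * effect
    else:
        raise ValueError("effect_type must be 'relative' or 'absolute'")
    if not 0 < p2 < 1:
        raise ValueError("target rate must fall strictly between 0 and 1")
    return p2


# Both calls describe the same target of about 5.5% from a 5% baseline.
relative_target = target_rate(0.05, 0.10, "relative")
absolute_target = target_rate(0.05, 0.005, "absolute")
```

A 10% relative lift and a 10 percentage point lift are very different targets, so the effect type should be checked before the sample size is trusted.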

Reading the Result

The calculator returns the sample size for the control and variant groups. It also shows total visitors and expected conversions. If daily traffic is entered, it estimates test duration. This estimate is only a planning guide. Real tests can run longer because traffic changes. Tracking breaks can also affect results. Use the adjusted sample when you expect data loss or when a design effect applies.
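The adjustments and the duration estimate can be sketched as below. How the calculator combines its inputs is not stated on this page, so this is an assumption: the design effect and the cushion are multiplied together, each group is rounded up to the chosen multiple, and every daily eligible visitor enters the test. The name plan_test is hypothetical.

```python
from math import ceil


def plan_test(n_control, n_variant, daily_visitors,
              design_effect=1.0, cushion_pct=0.0, round_to=1):
    """Inflate raw group sizes and estimate duration in whole days."""
    # Assumption: design effect and cushion combine multiplicatively.
    factor = design_effect * (1 + cushion_pct / 100)

    def adjust(n):
        # Inflate, then round up to the next multiple of round_to.
        return ceil(n * factor / round_to) * round_to

    control, variant = adjust(n_control), adjust(n_variant)
    total = control + variant
    # Assumption: all daily eligible visitors are enrolled in the test.
    days = ceil(total / daily_visitors)
    return {"control": control, "variant": variant, "total": total, "days": days}


plan = plan_test(31234, 31234, daily_visitors=5000, cushion_pct=10, round_to=100)
print(plan)  # {'control': 34400, 'variant': 34400, 'total': 68800, 'days': 14}
```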

Good Testing Practice

Keep one main metric. Define it before launch. Do not stop only because early results look good. Wait until the planned sample is reached. Check that both groups receive similar traffic quality. Avoid running major site changes during the test. Record dates, targeting rules, device mix, and exclusions. These notes make the result easier to trust. A careful plan makes decisions cleaner and more useful.

Common Mistakes

Do not change the goal after viewing results. That creates bias. Do not compare many metrics without a plan. More checks can raise false alarms. Match the sample plan to business value. A result should be both statistically clear and practically useful for growth.
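The effect of extra checks on false alarms can be illustrated with a small calculation. It assumes the checks are independent, which real metrics rarely are, so treat the number as a rough guide. The Bonferroni adjustment shown is one common correction and is not something this calculator is stated to apply.

```python
def familywise_error_rate(alpha, checks):
    """Chance of at least one false alarm across independent checks."""
    return 1 - (1 - alpha) ** checks


def bonferroni_alpha(alpha, checks):
    """Per-check alpha that keeps the overall rate at or below alpha."""
    return alpha / checks


# Ten metrics tested at alpha = 0.05 give roughly a 40% chance
# of at least one false alarm.
overall = familywise_error_rate(0.05, 10)
per_check = bonferroni_alpha(0.05, 10)
```

A lower per-check alpha needs more visitors, so the number of planned metrics belongs in the sample plan.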

FAQs

What is an A/B test sample size?

It is the number of visitors or users needed in each group. The value helps a test detect the planned conversion difference at the selected significance level and power.

What does baseline conversion mean?

Baseline conversion is the current control rate. Use recent and stable data. A poor baseline estimate can make the sample size too small or too large.

What is minimum detectable effect?

It is the smallest change you want the test to detect. Smaller effects need larger samples. Choose a value that matters for business decisions.

Should I use one-sided or two-sided testing?

Use two-sided testing when either an increase or a decrease matters. Use one-sided testing only when you committed to one direction before the experiment started.
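The tail choice changes the critical value that feeds the sample size formula. The sketch below uses the standard normal quantile from Python's statistics module. The function name z_alpha is illustrative, and it assumes a two-sided test splits alpha evenly across both tails.

```python
from statistics import NormalDist


def z_alpha(alpha, two_sided=True):
    """Standard normal critical value for the chosen alpha and tail."""
    # Assumption: a two-sided test places alpha / 2 in each tail.
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


two_sided_z = z_alpha(0.05)                   # about 1.96
one_sided_z = z_alpha(0.05, two_sided=False)  # about 1.645
```

The one-sided value is smaller, so a one-sided plan needs fewer visitors. That saving is only valid when the direction was fixed in advance.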

Why does higher power increase sample size?

Higher power reduces the chance of missing a real effect. That extra certainty needs more observations in both control and variant groups.

What does allocation ratio mean?

It sets how traffic is divided between variant and control. A ratio of 1 means equal traffic. A ratio of 2 gives the variant twice the control sample.
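The ratio can be turned into traffic shares with a short sketch. The function name traffic_shares is hypothetical. It assumes the test has exactly one control and one variant.

```python
def traffic_shares(r):
    """Return (control_share, variant_share) for a variant to control ratio r."""
    if r <= 0:
        raise ValueError("allocation ratio must be positive")
    control = 1 / (1 + r)
    return control, 1 - control


equal_split = traffic_shares(1)   # half of traffic to each group
double_split = traffic_shares(2)  # one third control, two thirds variant
```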

Can I stop the test early?

Stopping early can create misleading results. Decide the sample size first. Then review results after the planned sample and quality checks are complete.

What is sample cushion?

Sample cushion adds extra visitors for tracking loss, filtering, bots, or exclusions. It helps protect the final usable sample after cleanup.