SurveyMonkey Statistical Significance Calculator

Compare two survey groups using z scores and p values. Check confidence intervals and lift before acting on a result, and report meaningful differences with practical impact after every survey.

Calculator Inputs

Example Data Table

| Group | Sample Size | Positive Responses | Response Rate | Use Case |
| --- | --- | --- | --- | --- |
| Old landing page | 500 | 245 | 49.00% | Baseline survey result |
| New landing page | 520 | 290 | 55.77% | Variant survey result |
| Difference | 1,020 combined | 45 extra positives | 6.77 percentage points | Candidate significant lift |

Formula Used

The calculator uses a two-proportion z test for survey response rates.

p1 = x1 / n1 and p2 = x2 / n2

pooled p = (x1 + x2) / (n1 + n2)

SE pooled = sqrt(pooled p × (1 - pooled p) × (1/n1 + 1/n2))

z = (p1 - p2) / SE pooled

p value = 2 × (1 - normalCDF(|z|)) for a two-sided test.

SE unpooled = sqrt(p1 × (1 - p1) / n1 + p2 × (1 - p2) / n2)

CI = (p1 - p2) ± z critical × SE unpooled

Design effect, finite population correction, continuity correction, and Bonferroni comparison adjustment are applied when selected.
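
The core test is easy to reproduce. Below is a minimal Python sketch of the unadjusted base test built only from the formulas above; function and variable names are illustrative, not the calculator's internals, and the optional adjustments are omitted here.

```python
import math

def normal_cdf(v):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(v / math.sqrt(2)))

def two_proportion_z_test(x1, n1, x2, n2, confidence=0.95):
    """Two-proportion z test using the formulas above (no adjustments)."""
    p1, p2 = x1 / n1, x2 / n2

    # Pooled proportion and pooled standard error for the z statistic
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_pooled

    # Two-sided p value
    p_value = 2 * (1 - normal_cdf(abs(z)))

    # Unpooled standard error and confidence interval for the difference
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}[confidence]  # common critical values
    diff = p1 - p2
    return z, p_value, (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

# Example table data: old page 245/500 vs new page 290/520
z, p, ci = two_proportion_z_test(245, 500, 290, 520)
print(f"z = {z:.2f}, p = {p:.3f}, CI = ({ci[0]:.4f}, {ci[1]:.4f})")
# z = -2.16, p = 0.030, CI = (-0.1289, -0.0065)
```

The z value is negative here only because the formula subtracts the variant rate from the baseline rate; the magnitude and the p value are what matter for significance.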

How to Use This Calculator

  1. Enter names for both survey groups.
  2. Select whether your values are counts or percentages.
  3. Add each sample size and success value.
  4. Choose the confidence level for your decision.
  5. Select a two-sided or one-sided test direction.
  6. Add design effect when survey weighting increases variance.
  7. Add comparison count when testing many questions.
  8. Press calculate and review the result above the form.
  9. Download the result as CSV or PDF for reporting.

Understanding Survey Statistical Significance

Why significance matters

Survey results often compare two groups. One group may prefer an option more often. Another group may show weaker interest. The visible gap can look important. Yet random sampling noise can create a gap by chance. A statistical significance test helps separate real evidence from ordinary variation.

What the calculator measures

This calculator compares two response proportions. Each proportion is a success count divided by a sample size. The tool finds the difference between the two rates, then estimates its standard error. The standard error shows how much the difference would vary across repeated samples. A smaller standard error gives stronger evidence.
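
As a worked example using the data table above: pooled p = (245 + 290) / (500 + 520) = 535 / 1020 ≈ 0.5245, and SE pooled = sqrt(0.5245 × 0.4755 × (1/500 + 1/520)) ≈ 0.0313.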

Reading the p value

The p value answers a focused question. It estimates how surprising your observed difference would be if both groups had the same true rate. A small p value means the gap is unlikely under that equal-rate assumption. Many teams use 95% confidence. That usually means alpha equals 0.05.
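
Continuing the worked example: z = (0.4900 - 0.5577) / 0.0313 ≈ -2.16, so the two-sided p value is 2 × (1 - normalCDF(2.16)) ≈ 0.030, which clears the common 95% threshold (alpha = 0.05).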

Confidence intervals

The confidence interval gives a range for the difference. If the interval stays above zero, Group A likely has a higher rate. If it stays below zero, Group B likely has a higher rate. If it crosses zero, the survey does not show a clear difference at the selected confidence level.
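
For the example data at 95% confidence: SE unpooled = sqrt(0.49 × 0.51 / 500 + 0.5577 × 0.4423 / 520) ≈ 0.0312, so the interval is -0.0677 ± 1.96 × 0.0312, roughly -0.129 to -0.007. The interval stays below zero, so the new page (Group B) likely has the higher rate.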

Practical impact

Statistical significance is not always business importance. A tiny difference can be significant with a huge sample. A large difference can be unclear with a small sample. Use the practical effect threshold to mark the smallest useful change. This keeps reporting balanced and honest.

Advanced settings

Weighted surveys may need a design effect. Limited populations may need finite population correction. Many simultaneous tests may need comparison adjustment. Continuity correction can make small count tests more conservative. These options help the calculator fit more survey workflows.
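
This page does not spell out the adjustment formulas, so the sketch below uses common conventions: inflating the standard error by sqrt(design effect), a finite population correction factor, a Yates-style continuity correction, and Bonferroni's alpha divided by the comparison count. The calculator's exact implementation may differ.

```python
import math

def adjusted_significance(x1, n1, x2, n2, alpha=0.05, deff=1.0,
                          population=None, comparisons=1, continuity=False):
    """Common adjustment conventions; the calculator's exact rules may differ."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

    # Design effect: weighting or clustering inflates SE by sqrt(DEFF)
    se *= math.sqrt(deff)

    # Finite population correction shrinks SE when the samples are a large
    # share of the population (one overall factor shown for brevity;
    # some tools apply it per group)
    if population:
        se *= math.sqrt((population - (n1 + n2)) / (population - 1))

    # Continuity correction makes small-count tests more conservative
    diff = abs(p1 - p2)
    if continuity:
        diff = max(0.0, diff - 0.5 * (1 / n1 + 1 / n2))

    z = diff / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
    alpha_adj = alpha / comparisons  # Bonferroni adjustment
    return p_value, alpha_adj, p_value < alpha_adj
```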

Frequently Asked Questions

1. What does statistical significance mean?

It means the observed difference is unlikely to be caused by random sampling variation alone, based on your chosen confidence level.

2. Can I use percentages instead of counts?

Yes. Select percentage mode, enter each sample size, and enter each response rate as a percent value from 0 to 100.
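
For example, the variant row in the data table could be entered as a sample size of 520 with a rate of 55.77%, which corresponds to the same 290 positive responses.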

3. What confidence level should I choose?

Use 95% for common reporting. Use 90% for exploratory work. Use 99% when false positives are costly.

4. What is a two-sided test?

A two-sided test checks whether either group is different. It does not assume which group should be higher before testing.
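
A one-sided test assumes a direction in advance and, when the observed difference points that way, reports half the two-sided p value (about 0.015 instead of 0.030 in the worked example above).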

5. What is design effect?

Design effect adjusts standard error when weighting, clustering, or sampling design increases uncertainty beyond simple random sampling.
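
A common convention, if the calculator follows it, is SE adjusted = SE × sqrt(design effect); a design effect of 2 therefore widens the standard error by about 41%.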

6. Why use multiple comparison adjustment?

Testing many questions increases false positive risk. Bonferroni adjustment lowers alpha so your conclusion stays more conservative.
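
With Bonferroni, the adjusted alpha is the original alpha divided by the number of comparisons: testing 10 questions at alpha = 0.05 means each individual test is judged against 0.05 / 10 = 0.005.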

7. What is practical effect threshold?

It is the smallest difference you consider useful. It helps separate meaningful survey changes from trivial changes.

8. Is this an official SurveyMonkey tool?

No. It is an independent calculator for survey-style significance testing and general statistical reporting.

Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.