Simulation Multiple Test FWER Calculator

Calculator Input

Number of tests

True null hypotheses

Family alpha

Simulation runs

Common correlation

Alternative effect size

Sample size per group

Correction method

Test direction

Random seed

Inputs are clamped to safe ranges. The true null count cannot exceed the total test count.

Example Data Table

Scenario	Tests	True Nulls	Alpha	Correction	Expected Use
Small independent panel	10	10	0.05	Sidak	Screening clean null outcomes
Clinical endpoints	24	18	0.05	Holm	Controlling study claims
Exploratory markers	80	70	0.05	Bonferroni	Strict false positive planning

Formula Used

FWER = P(V ≥ 1), where V is the number of rejected true null hypotheses.

Simulation estimate = families with at least one false rejection ÷ total simulated families.

Independent single-step estimate = 1 - (1 - threshold)^m0, where m0 is the true null count.

Bonferroni threshold = alpha ÷ m, where m is the total number of tests.

Sidak threshold = 1 - (1 - alpha)^(1/m).

Monte Carlo SE = sqrt(FWER × (1 - FWER) ÷ simulations).
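As a worked check of the formulas above, the short Python sketch below evaluates them for two rows of the example table. The function names are illustrative only and are not taken from the calculator itself.

```python
import math


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test threshold: alpha divided by the family size m."""
    return alpha / m


def sidak_threshold(alpha: float, m: int) -> float:
    """Per-test threshold: 1 - (1 - alpha)^(1/m)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def independent_fwer(threshold: float, m0: int) -> float:
    """Single-step FWER for m0 independent true nulls."""
    return 1.0 - (1.0 - threshold) ** m0


def monte_carlo_se(fwer: float, simulations: int) -> float:
    """Standard error of a simulated FWER estimate."""
    return math.sqrt(fwer * (1.0 - fwer) / simulations)


alpha = 0.05

# "Small independent panel": 10 tests, 10 true nulls, Sidak.
sidak_fwer = independent_fwer(sidak_threshold(alpha, 10), 10)

# "Exploratory markers": 80 tests, 70 true nulls, Bonferroni.
bonf_fwer = independent_fwer(bonferroni_threshold(alpha, 80), 70)

# Uncertainty of an estimate near 0.05 from 10,000 simulated families.
se = monte_carlo_se(0.05, 10_000)

print(f"Sidak, m = m0 = 10:         {sidak_fwer:.4f}")
print(f"Bonferroni, m = 80, m0 = 70: {bonf_fwer:.4f}")
print(f"Monte Carlo SE:              {se:.5f}")
```

With every null true and independent tests, the Sidak threshold returns the family alpha itself. Bonferroni lands a little below alpha, because its threshold is smaller and only 70 of the 80 tests can produce a false positive.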

The simulator uses correlated normal test statistics. A common factor creates dependence across tests.
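One standard way to build such statistics is a one-factor model: each statistic mixes a shared normal draw with its own independent noise, which gives every pair the same correlation rho. The sketch below assumes that construction. The calculator's exact implementation is not shown on this page, and the function names are hypothetical.

```python
import math
import random


def common_factor_statistics(m: int, rho: float, rng: random.Random) -> list:
    """Draw m standard normal statistics with pairwise correlation rho.

    Assumes 0 <= rho <= 1. Each statistic is
    sqrt(rho) * shared + sqrt(1 - rho) * own_noise,
    so its variance stays 1 and any pair has covariance rho.
    """
    shared = rng.gauss(0.0, 1.0)
    a = math.sqrt(rho)
    b = math.sqrt(1.0 - rho)
    return [a * shared + b * rng.gauss(0.0, 1.0) for _ in range(m)]


def sample_correlation(xs: list, ys: list) -> float:
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(12345)  # fixed seed for repeatable draws
families = [common_factor_statistics(2, 0.5, rng) for _ in range(20_000)]
first = [family[0] for family in families]
second = [family[1] for family in families]
estimated_rho = sample_correlation(first, second)
print(f"target rho = 0.5, estimated rho = {estimated_rho:.3f}")
```

The estimated correlation should sit close to the target, which is a quick sanity check before using the generator inside a larger simulation.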

How to Use This Calculator

Enter the number of hypotheses in the test family. Add the number of true null hypotheses. Choose alpha and the correction method. Set the correlation to model related outcomes. Add effect size and sample size for alternative tests. Select a seed for repeatable results. Press Calculate FWER. Use CSV or PDF buttons to export the same scenario.

Understanding Family-Wise Error Rate

Family-wise error rate, or FWER, is the chance of making at least one false rejection inside a family of related tests. It matters when one project runs many hypotheses at once. A single alpha level may look small. Yet many tests create many chances for a false positive. This calculator models that risk with repeated simulations.

Why Simulation Helps

Exact formulas are easy only under simple independence. Real studies often have correlated outcomes, mixed null and alternative hypotheses, and planned correction rules. Simulation gives a practical view of that setting. It repeatedly creates test statistics, converts them to p values, applies the selected correction, and checks whether any true null was rejected.
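The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the calculator's own code: it uses two-sided z tests, a Bonferroni correction, a one-factor correlation model, and a hypothetical `shift` parameter for the mean of the alternative statistics. The usage line maps effect size d and per-group sample size n to a shift of d × sqrt(n ÷ 2), a common two-group approximation that may differ from what the tool does.

```python
import math
import random
from statistics import NormalDist


def simulate_family_errors(m, m0, alpha, rho, shift, runs, seed):
    """Return (estimated FWER, average power) over simulated test families.

    The first m0 tests are true nulls; the remaining m - m0 are alternatives
    whose statistics have mean `shift` (an assumed parameterization).
    """
    rng = random.Random(seed)
    normal = NormalDist()
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    threshold = alpha / m  # Bonferroni per-test threshold
    families_with_false_rejection = 0
    alternatives_rejected = 0

    for _ in range(runs):
        shared = rng.gauss(0.0, 1.0)  # common factor for this family
        false_rejection = False
        for i in range(m):
            is_null = i < m0
            mean = 0.0 if is_null else shift
            z = mean + a * shared + b * rng.gauss(0.0, 1.0)
            p = 2.0 * (1.0 - normal.cdf(abs(z)))  # two-sided p value
            if p <= threshold:
                if is_null:
                    false_rejection = True
                else:
                    alternatives_rejected += 1
        families_with_false_rejection += false_rejection

    fwer = families_with_false_rejection / runs
    m1 = m - m0
    power = alternatives_rejected / (runs * m1) if m1 > 0 else float("nan")
    return fwer, power


# "Clinical endpoints" style scenario: 24 tests, 18 true nulls.
assumed_shift = 0.5 * math.sqrt(64 / 2)  # d = 0.5, n = 64 per group (assumption)
fwer, power = simulate_family_errors(
    m=24, m0=18, alpha=0.05, rho=0.3, shift=assumed_shift, runs=5_000, seed=2024
)
print(f"simulated FWER = {fwer:.4f}, average power = {power:.3f}")
```

Swapping the rejection rule for Sidak or Holm only changes how the p values are compared with alpha. The bookkeeping for false rejections stays the same.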

Core Inputs

The number of tests defines the family size. The alpha value sets the target error level. The true null count tells the tool which hypotheses can create false positives. The correlation field adds shared movement across tests. The effect size and sample size shape alternative hypotheses. Larger effects usually raise true positive detection. They do not change the p values of true nulls, but under a step-down method such as Holm, each rejected alternative loosens the threshold applied to the remaining tests, so they can still affect false rejections.

Correction Choices

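No correction uses the raw alpha for every test. Bonferroni divides alpha by the number of tests. Sidak uses a slightly larger threshold that is exact when tests are independent. Holm works step by step on sorted p values: the smallest is compared with alpha ÷ m, the next with alpha ÷ (m - 1), and so on until one fails. It rejects everything Bonferroni rejects, and sometimes more, while still controlling FWER under any dependence structure.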
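To make the difference between the single-step and step-down rules concrete, the sketch below applies Bonferroni and Holm to the same four p values. The p values are made up for illustration and the function names are hypothetical.

```python
def bonferroni_reject(p_values, alpha):
    """Reject each hypothesis whose p value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha):
    """Holm step-down: walk the sorted p values, stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        # Thresholds loosen as rank grows: alpha/m, alpha/(m-1), ..., alpha/1.
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break  # this hypothesis and all larger p values are retained
    return reject


p_values = [0.020, 0.001, 0.040, 0.013]  # illustrative values only
bonf = bonferroni_reject(p_values, 0.05)
holm = holm_reject(p_values, 0.05)
print("Bonferroni:", bonf)
print("Holm:      ", holm)
```

Here Bonferroni rejects only the hypothesis with p = 0.001, because every other p value exceeds 0.05 ÷ 4 = 0.0125. Holm rejects all four, since each sorted p value clears its own progressively looser threshold.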

Reading Results

The simulated FWER is the estimated probability of at least one false rejection. The Monte Carlo interval shows simulation uncertainty. Expected false rejections show average false positives per simulated family. Power shows how often alternative hypotheses were detected. Review these results together. A low FWER with very low power may be too strict for discovery work.
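The sketch below shows one way these summaries could be computed from per-family counts. It assumes a normal-approximation 95% interval (estimate ± 1.96 × SE) and defines power as the average share of alternatives rejected; the calculator may use different conventions. The counts are invented for illustration.

```python
import math


def summarize(false_rejections, true_rejections, n_alternatives, z=1.96):
    """Summarize simulated families.

    false_rejections[i]: number of true nulls rejected in family i.
    true_rejections[i]:  number of alternatives rejected in family i.
    """
    runs = len(false_rejections)
    fwer = sum(1 for v in false_rejections if v >= 1) / runs
    se = math.sqrt(fwer * (1.0 - fwer) / runs)
    if n_alternatives > 0:
        power = sum(true_rejections) / (runs * n_alternatives)
    else:
        power = float("nan")  # power is undefined with no alternatives
    return {
        "fwer": fwer,
        "interval": (max(0.0, fwer - z * se), min(1.0, fwer + z * se)),
        "expected_false_rejections": sum(false_rejections) / runs,
        "power": power,
    }


# Ten made-up families, each with 4 alternative hypotheses.
false_counts = [0, 0, 1, 0, 2, 0, 0, 0, 0, 0]
true_counts = [3, 4, 2, 4, 3, 4, 4, 3, 2, 1]
summary = summarize(false_counts, true_counts, n_alternatives=4)
for name, value in summary.items():
    print(name, value)
```

Note that FWER counts a family once no matter how many false rejections it contains, while the expected false rejections average counts every one of them. That is why the two numbers differ in this toy data.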

Good Practice

Use enough simulations for stable estimates. Ten thousand runs are often useful for planning. Increase the run count when alpha is tiny or results are near a decision boundary. Try several correlation values. Also compare correction methods before committing to an analysis plan. The calculator supports transparent planning, teaching, and sensitivity checks for multiple testing. Save exported summaries with assumptions, because reviewers can then repeat the scenario and compare future changes with the same baseline.
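Rearranging the Monte Carlo SE formula gives a rough run count for a desired precision: simulations ≈ FWER × (1 - FWER) ÷ SE². The sketch below applies that rearrangement. The target SE values are arbitrary examples and the function name is hypothetical.

```python
import math


def runs_for_target_se(expected_fwer: float, target_se: float) -> int:
    """Smallest run count whose Monte Carlo SE is at or below target_se."""
    return math.ceil(expected_fwer * (1.0 - expected_fwer) / target_se ** 2)


def se_at(expected_fwer: float, runs: int) -> float:
    """Monte Carlo SE for a given run count."""
    return math.sqrt(expected_fwer * (1.0 - expected_fwer) / runs)


runs_coarse = runs_for_target_se(0.05, 0.002)
runs_fine = runs_for_target_se(0.05, 0.001)
se_default = se_at(0.05, 10_000)

print(f"runs for SE 0.002: {runs_coarse}")
print(f"runs for SE 0.001: {runs_fine}")
print(f"SE at 10,000 runs: {se_default:.5f}")
```

Halving the standard error takes about four times as many runs, so precision near a decision boundary gets expensive quickly.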

FAQs

What does FWER mean?

FWER is the chance of at least one false positive across a family of tests. Controlling it is stricter than holding each individual test to alpha on its own.

Why simulate FWER?

Simulation handles complex settings. It can include many tests, mixed nulls, alternatives, and correlated outcomes without needing a simple closed formula.

What is a true null hypothesis?

A true null hypothesis has no real effect. Rejecting it is a false positive. FWER focuses on whether any such rejection occurs.

When should I use Bonferroni?

Use Bonferroni when you need a simple and conservative correction. It is easy to explain and controls FWER under any dependence structure.

How is Sidak different?

Sidak uses a threshold based on independence. It is slightly less conservative than Bonferroni when tests are independent or close to independent.

What does Holm step-down do?

Holm sorts p values and tests them in order. It can reject more hypotheses than Bonferroni while keeping strong FWER control.

Does correlation change FWER?

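Yes. Under positive dependence, false positives tend to cluster in the same simulated families, so fewer families contain at least one. With a single-step correction such as Bonferroni, the FWER then usually falls below its value under independence. Try several correlation values.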
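A small sweep makes the effect visible. The sketch below assumes every hypothesis is a true null, two-sided z tests, a Bonferroni correction, and a one-factor correlation model; those choices and the function name are illustrative assumptions rather than the calculator's implementation.

```python
import random
from statistics import NormalDist


def all_null_fwer(m, alpha, rho, runs, seed):
    """Estimate FWER when all m hypotheses are true nulls (Bonferroni)."""
    rng = random.Random(seed)
    # Reject when |z| exceeds the two-sided cutoff for alpha / m.
    cutoff = NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))
    a, b = rho ** 0.5, (1.0 - rho) ** 0.5
    hits = 0
    for _ in range(runs):
        shared = rng.gauss(0.0, 1.0)  # common factor for this family
        if any(abs(a * shared + b * rng.gauss(0.0, 1.0)) > cutoff
               for _ in range(m)):
            hits += 1
    return hits / runs


sweep = {
    rho: all_null_fwer(m=10, alpha=0.05, rho=rho, runs=20_000, seed=7)
    for rho in (0.0, 0.3, 0.6, 0.9)
}
for rho, estimate in sweep.items():
    print(f"rho = {rho:.1f}: simulated FWER = {estimate:.4f}")
```

In this setup the estimate at rho = 0.9 should come out clearly below the estimate at rho = 0, which shows how conservative Bonferroni becomes when outcomes move together.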

How many simulations are enough?

Use at least ten thousand runs for planning. Increase runs when estimates are near the target alpha or when the expected error is very small.