Equivalence Test Calculator

Analyze one-sample, paired, and independent equivalence tests. Set margins, confidence levels, and assumptions before calculating. Review results, charts, exports, formulas, examples, and guidance instantly.

Calculator

Test Type

Alpha

Confidence Level

Lower Margin Magnitude

Upper Margin Magnitude

Variance Assumption

Two-Sample Inputs

Group 1 Mean

Group 1 SD

Group 1 n

Group 2 Mean

Group 2 SD

Group 2 n

One-Sample Inputs

Sample Mean

Sample SD

Sample Size

Reference Mean

Paired Inputs

Mean Difference

SD of Differences

Number of Pairs

Example Data Table

This example illustrates paired observations and their within-subject differences. You can adapt the same idea for one-sample or two-group studies.

Pair	Value A	Value B	Difference (A − B)
Pair 1	12.2	12.0	0.2
Pair 2	11.8	11.9	-0.1
Pair 3	12.4	12.3	0.1
Pair 4	12.1	12.2	-0.1
Pair 5	11.9	12.0	-0.1
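As a rough sketch, the paired inputs the calculator asks for (mean difference, SD of differences, number of pairs) can be derived from the table above with Python's standard library. The variable names are illustrative and are not part of the calculator itself.

```python
import math
import statistics

# Values copied from the example table above.
value_a = [12.2, 11.8, 12.4, 12.1, 11.9]
value_b = [12.0, 11.9, 12.3, 12.2, 12.0]

# Within-pair differences (A minus B); rounding removes floating-point noise.
differences = [round(a - b, 10) for a, b in zip(value_a, value_b)]

mean_diff = statistics.mean(differences)   # "Mean Difference" input
sd_diff = statistics.stdev(differences)    # "SD of Differences" input (sample SD, n - 1)
n_pairs = len(differences)                 # "Number of Pairs" input
se_diff = sd_diff / math.sqrt(n_pairs)     # paired standard error

print(f"mean difference = {mean_diff:.3f}")
print(f"SD of differences = {sd_diff:.3f}")
print(f"n = {n_pairs}, SE = {se_diff:.4f}")
```

For this table the mean difference is about 0, the SD of the differences is about 0.141, and the standard error is about 0.063.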

Formula Used

Equivalence testing usually applies the two one-sided tests procedure, called TOST. Instead of checking whether the effect equals zero, it checks whether the effect stays between a lower and upper practical bound.

Effect estimate: For one-sample testing, estimate = sample mean − reference mean. For two independent groups, estimate = mean₁ − mean₂. For paired data, estimate = mean paired difference.

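Lower one-sided test: t_L = (estimate − lower bound) / SE, where the lower bound is the negative of the lower margin magnitude. Its p-value is the upper-tail probability P(T ≥ t_L).

Upper one-sided test: t_U = (estimate − upper bound) / SE, where the upper bound is the positive upper margin magnitude. Its p-value is the lower-tail probability P(T ≤ t_U).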

Decision rule: Conclude equivalence when both one-sided p-values are below alpha. Equivalently, the 100(1 − 2·alpha)% confidence interval for the effect (a 90% interval when alpha = 0.05) lies entirely within the equivalence bounds.
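The two one-sided tests and the decision rule can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation. To stay within the standard library, it approximates the t distribution's CDF by numerical integration; in practice a statistics library such as SciPy (scipy.stats.t) would usually supply it. The function names, the margins of ±0.3, and the example inputs are assumptions chosen for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), approximated with Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def tost(estimate, se, df, lower_bound, upper_bound, alpha=0.05):
    """Two one-sided tests: equivalence needs both p-values below alpha."""
    t_lower = (estimate - lower_bound) / se
    t_upper = (estimate - upper_bound) / se
    p_lower = 1.0 - t_cdf(t_lower, df)  # H0: effect <= lower bound (upper tail)
    p_upper = t_cdf(t_upper, df)        # H0: effect >= upper bound (lower tail)
    return {
        "t_lower": t_lower,
        "t_upper": t_upper,
        "p_lower": p_lower,
        "p_upper": p_upper,
        "equivalent": p_lower < alpha and p_upper < alpha,
    }

# Paired example: estimate 0.0, SE about 0.0632, df = 5 pairs - 1,
# with hypothetical margins of -0.3 and +0.3.
result = tost(estimate=0.0, se=0.0632, df=4, lower_bound=-0.3, upper_bound=0.3)
print(result)
```

With these hypothetical margins both p-values fall below 0.05, so the sketch reports equivalence. Narrowing the margins to ±0.1 with the same inputs pushes both p-values above 0.05, and equivalence is no longer supported.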

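Standard error examples: One-sample SE = s / √n, with n − 1 degrees of freedom. Paired SE = s_d / √n, where n is the number of pairs, also with n − 1 degrees of freedom. Independent groups use either pooled or Welch standard errors, depending on the chosen variance assumption.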
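The standard errors and degrees of freedom for each design can be sketched as follows, using the textbook pooled-variance formula and the Welch-Satterthwaite approximation. The function names and the example numbers are illustrative assumptions; the calculator's own code may differ in detail.

```python
import math

def se_one_sample(sd, n):
    """One-sample or paired design: SE = s / sqrt(n), df = n - 1."""
    return sd / math.sqrt(n), n - 1

def se_pooled(sd1, n1, sd2, n2):
    """Independent groups assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    return math.sqrt(pooled_var * (1 / n1 + 1 / n2)), df

def se_welch(sd1, n1, sd2, n2):
    """Independent groups without assuming equal variances."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom (usually not an integer).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df

# Hypothetical summary statistics with unequal spreads and sample sizes.
print(se_pooled(1.0, 10, 3.0, 20))
print(se_welch(1.0, 10, 3.0, 20))
```

When both groups have the same SD and the same sample size, the two approaches give the same standard error and the same degrees of freedom. They diverge as the spreads or sample sizes become more unequal.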

How to Use This Calculator

1. Select one-sample, paired, or two-sample independent testing.
2. Enter alpha, usually 0.05, then define practical lower and upper equivalence margins.
3. Provide the relevant summary statistics for your study design.
4. Choose Welch when group variances may differ; choose pooled when equal variance is defensible.
5. Click Calculate Equivalence to view the decision, p-values, confidence interval, and chart.
6. Use the export buttons to download a CSV summary or a printable PDF view.

FAQs

1. What is an equivalence test?

An equivalence test checks whether an effect is small enough to be practically unimportant. It does not test whether the difference is exactly zero. Instead, it evaluates whether the data provide enough evidence that the true effect lies between predefined lower and upper equivalence bounds.

2. What does TOST mean?

TOST stands for two one-sided tests. One test checks whether the effect is above the lower bound. The second checks whether it is below the upper bound. Both must pass to conclude equivalence.

3. How are equivalence margins chosen?

Margins should come from subject-matter reasoning, clinical relevance, engineering tolerance, or established domain guidance. They should represent the largest acceptable difference that still counts as practically similar.

4. When should I use Welch instead of pooled variance?

Use Welch when group spreads or sample sizes differ noticeably. It is more robust when equal variances are uncertain. Use pooled variance only when the equal-variance assumption is justified by design or diagnostics.

5. Can I conclude equivalence if a traditional t-test is non-significant?

No. A non-significant difference test only says evidence for a difference was insufficient. Equivalence requires evidence that the true effect is small enough to stay inside your chosen practical bounds.

6. Why is the confidence interval important here?

The confidence interval gives a visual summary of uncertainty. If the entire 100(1 − 2·alpha)% interval (90% when alpha = 0.05) lies inside the equivalence range, it supports the same conclusion as passing both one-sided tests at level alpha.
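A minimal sketch of the interval check, assuming alpha = 0.05 and the paired example's summary values (estimate 0.0, SE about 0.0632, 4 degrees of freedom). The critical value 2.132 is the 95th percentile of the t distribution with 4 degrees of freedom, taken from a standard t table. The function name and the margins are hypothetical.

```python
def ci_within_bounds(estimate, se, t_crit, lower_bound, upper_bound):
    """Build the 100(1 - 2*alpha)% interval and test whether it sits inside the bounds."""
    half_width = t_crit * se
    ci_low = estimate - half_width
    ci_high = estimate + half_width
    inside = lower_bound < ci_low and ci_high < upper_bound
    return ci_low, ci_high, inside

# 95th percentile of t with 4 df (from a t table); gives a 90% two-sided interval.
T_CRIT_DF4 = 2.132

# Hypothetical margins of -0.3 and +0.3, then narrower margins of -0.1 and +0.1.
print(ci_within_bounds(0.0, 0.0632, T_CRIT_DF4, -0.3, 0.3))
print(ci_within_bounds(0.0, 0.0632, T_CRIT_DF4, -0.1, 0.1))
```

The 90% interval here is roughly −0.135 to 0.135. It fits inside margins of ±0.3 but not inside margins of ±0.1.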

7. Can this calculator work from summary statistics only?

Yes. This page is designed for summary inputs such as means, standard deviations, sample sizes, and paired-difference statistics. Raw data are not required for the core TOST calculations shown here.

8. What does the final decision mean?

If both one-sided tests are significant, the result supports practical equivalence within your selected margins. If not, the evidence is insufficient to conclude equivalence under the current assumptions and inputs.