Example Data Table
| Example field | Group 1 | Group 2 |
| Mean | 84.2 | 79.1 |
| Sample standard deviation | 9.6 | 10.4 |
| Sample size | 32 | 28 |
| Suggested method | Welch unequal variance test |
Formula Used
Welch Two Sample T Test
t = ((x̄₁ - x̄₂) - Δ₀) / √(s₁² / n₁ + s₂² / n₂)
df = (s₁² / n₁ + s₂² / n₂)² / [((s₁² / n₁)² / (n₁ - 1)) + ((s₂² / n₂)² / (n₂ - 1))]
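As a minimal sketch, the Welch formulas above can be applied to the values in the example data table. The function name `welch_t` and its parameter names are illustrative assumptions, not part of the calculator.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Return (t, df) for Welch's test from summary statistics."""
    v1 = sd1 ** 2 / n1  # s1^2 / n1
    v2 = sd2 ** 2 / n2  # s2^2 / n2
    se = math.sqrt(v1 + v2)  # standard error of the mean difference
    t = ((mean1 - mean2) - null_diff) / se
    # Welch-Satterthwaite approximation; usually not an integer
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Values from the example data table
t, df = welch_t(84.2, 9.6, 32, 79.1, 10.4, 28)
print(f"t = {t:.3f}, df = {df:.2f}")  # prints: t = 1.964, df = 55.43
```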
Pooled Two Sample T Test
sₚ² = [((n₁ - 1)s₁²) + ((n₂ - 1)s₂²)] / (n₁ + n₂ - 2)
t = ((x̄₁ - x̄₂) - Δ₀) / [sₚ√(1 / n₁ + 1 / n₂)]
df = n₁ + n₂ - 2
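The pooled formulas can be sketched the same way. The name `pooled_t` is a hypothetical helper, and the inputs are again the example table values, used here only to show the arithmetic (the table itself suggests Welch for these data).

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Return (t, df, pooled variance) for the equal-variance test."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    t = ((mean1 - mean2) - null_diff) / se
    return t, df, sp2

t, df, sp2 = pooled_t(84.2, 9.6, 32, 79.1, 10.4, 28)
print(f"t = {t:.3f}, df = {df}, pooled variance = {sp2:.3f}")
```

With these inputs the pooled t is close to the Welch t because the two standard deviations and sample sizes are similar; the two methods diverge more as the spreads or group sizes differ.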
Effect Size
Cohen's d = (x̄₁ - x̄₂) / sₚ
Hedges g = d × [1 - 3 / (4(n₁ + n₂) - 9)]
Glass delta = (x̄₁ - x̄₂) / s₂
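A short sketch of the three effect size formulas, again using the example table values. The function name `effect_sizes` is an assumption for illustration; Glass delta follows the formula above and divides by the Group 2 standard deviation.

```python
import math

def effect_sizes(mean1, sd1, n1, mean2, sd2, n2):
    """Return (Cohen's d, Hedges g, Glass delta) from summary statistics."""
    diff = mean1 - mean2
    # Pooled standard deviation, as in the pooled test
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = diff / sp
    g = d * (1 - 3 / (4 * (n1 + n2) - 9))  # small-sample correction
    delta = diff / sd2  # standardized by Group 2 only
    return d, g, delta

d, g, delta = effect_sizes(84.2, 9.6, 32, 79.1, 10.4, 28)
print(f"d = {d:.3f}, g = {g:.3f}, delta = {delta:.3f}")
```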
How to Use This Calculator
Choose summary statistics if you already know both means, standard deviations, and sample sizes.
Choose raw data if you want the calculator to compute summary values first.
Select Welch when group variances or sample sizes may differ.
Select pooled only when equal variance is a reasonable study assumption.
Enter the null mean difference. Most tests use zero.
Choose a two sided or one sided alternative before calculating.
Press Calculate. The result appears above the form.
Use the CSV or PDF buttons to save your result.
Two Sample Mean Testing Overview
A two sample t test of means compares two independent group averages. It asks whether the observed mean difference is large enough to reject a chosen null difference. The method is useful when population standard deviations are unknown. It works well for experiments, surveys, quality checks, medical studies, and classroom data.
The calculator supports summary values and raw observations. Summary mode uses each group mean, sample standard deviation, and sample size. Raw mode calculates those values first. This helps users verify data before reading the test result.
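What raw mode computes first can be sketched with Python's standard `statistics` module. This is an illustration under assumptions, not the calculator's own code: the helper name `summarize` and the sample observations are hypothetical.

```python
import statistics

def summarize(values):
    """Return (mean, sample standard deviation, sample size) for one group."""
    if len(values) < 2:
        raise ValueError("need at least two observations per group")
    # statistics.stdev uses the n - 1 denominator (sample standard deviation)
    return statistics.mean(values), statistics.stdev(values), len(values)

# Hypothetical observations, for illustration only
group = [2, 4, 4, 4, 5, 5, 7, 9]
mean, sd, n = summarize(group)
print(f"mean = {mean}, sd = {sd:.3f}, n = {n}")
```

The three returned values are exactly what summary mode asks for, so the output of one mode can be checked against the other.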
Choosing The Right Variance Method
Welch testing is usually the safer default. It allows different group variances and different sample sizes. The degrees of freedom are estimated with the Welch-Satterthwaite equation. This keeps the false positive rate close to the chosen level when spreads are unequal.
The pooled option assumes both groups share one common variance. Use it only when that assumption is reasonable. It combines both sample variances into one pooled estimate. That estimate can give more power when the assumption is true.
Reading The Result
The t value measures the standardized distance from the null difference. A larger absolute t value usually gives a smaller p value. The p value shows how unusual the sample result would be if the null claim were true.
The confidence interval gives a plausible range for the true mean difference. A two sided interval that excludes the null difference supports rejection at the matching significance level, for example a 95% interval and a 0.05 level. One tailed tests use one sided bounds. The direction should be chosen before seeing the data.
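The p value and confidence interval both come from the Student t distribution. The sketch below uses only the standard library, so it integrates the t density numerically and inverts it by bisection; a statistics library would normally do this more accurately. All function names, the `alternative` labels, and the bisection bracket are assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via composite Simpson's rule on [0, |x|], using symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3
    return 0.5 + half if x >= 0 else 0.5 - half

def t_quantile(p, df):
    """Invert the CDF by bisection; assumes the answer lies in [-200, 200]."""
    lo, hi = -200.0, 200.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def p_value(t, df, alternative="two-sided"):
    """'greater' and 'less' refer to mean1 - mean2 versus the null difference."""
    if alternative == "two-sided":
        return 2 * (1 - t_cdf(abs(t), df))
    if alternative == "greater":
        return 1 - t_cdf(t, df)
    return t_cdf(t, df)  # "less"

def confidence_interval(diff, se, df, level=0.95):
    """Two sided interval for the true mean difference."""
    crit = t_quantile(1 - (1 - level) / 2, df)
    return diff - crit * se, diff + crit * se

# Welch results for the example data table (rounded inputs)
p = p_value(1.964, 55.43)
low, high = confidence_interval(5.1, 2.5967, 55.43)
print(f"p = {p:.4f}, 95% CI = ({low:.2f}, {high:.2f})")
```

For the example table, the two sided p value lands slightly above 0.05 and the 95% interval just includes zero, which shows the two readings agreeing with each other.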
Effect Size And Practical Meaning
Statistical significance is not the same as importance. The calculator adds Cohen's d, Hedges g, and Glass delta. These values standardize the mean difference. They help compare results across scales.
A small p value can occur with a trivially small effect when the sample is very large. A large effect can still miss significance when samples are small. Always read the estimate, interval, p value, and sample context together.
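The link between effect size and sample size can be made concrete with a simplified case. Assuming two groups of equal size n that share one standard deviation, the pooled t statistic reduces to t = d × √(n / 2), so the same standardized effect yields a larger t as n grows. The helper name `t_from_d` and the chosen values are hypothetical.

```python
import math

def t_from_d(d, n_per_group):
    """Pooled t statistic for effect size d with equal group sizes."""
    return d * math.sqrt(n_per_group / 2)

# A tiny effect with a huge sample versus a large effect with a small sample
print(f"d = 0.1, n = 20000 per group: t = {t_from_d(0.1, 20000):.2f}")
print(f"d = 0.8, n = 8 per group:     t = {t_from_d(0.8, 8):.2f}")
```

Under these assumptions the tiny effect produces a t near 10, while the large effect produces a t of 1.6, which falls short of the usual two sided 0.05 cutoff for 14 degrees of freedom.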
Good Data Habits
Check independence before using this test. Each observation should belong to one group only. Look for extreme outliers. Review units before entering values. Use raw data when possible. Keep the exported report with the study notes. It makes review and reporting easier. Share methods with readers when needed.
FAQs
What is a two sample t test of means?
It compares the averages of two independent groups. It checks whether their observed difference is statistically significant under a chosen null hypothesis.
When should I use Welch's test?
Use Welch's test when sample sizes differ, standard deviations differ, or equal variance is uncertain. It is a safe default for many independent group comparisons.
When should I use the pooled test?
Use the pooled test only when both populations can reasonably share the same variance. This assumption should come from study design or prior evidence.
What does the p value mean?
The p value is the probability of seeing a sample difference at least as extreme as yours if the null difference is true. Smaller values give stronger evidence against the null claim.
What is the null mean difference?
It is the difference assumed by the null hypothesis. Most tests use zero, meaning both population means are assumed equal before testing.
What does the confidence interval show?
It shows a plausible range for the true mean difference. If a two sided interval excludes the null difference, the result supports rejection.
Why include effect sizes?
Effect sizes show practical difference in standardized units. They help judge importance beyond statistical significance and sample size effects.
Can I enter raw data?
Yes. Choose raw data mode and enter values separated by commas, spaces, or line breaks. The calculator computes means and standard deviations first.
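One way to accept values separated by commas, spaces, or line breaks is a single regular expression split. This is a sketch of the idea, not the calculator's actual parser; the function name `parse_values` is an assumption.

```python
import re

def parse_values(text):
    """Split on commas and any whitespace, then convert each token to float."""
    tokens = [tok for tok in re.split(r"[,\s]+", text.strip()) if tok]
    return [float(tok) for tok in tokens]

print(parse_values("84.5, 79 81.2\n90"))
```

A token that is not a number raises `ValueError`, which is a reasonable place to ask the user to recheck the input.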