Example Data Table
| x | y | Note |
| 10 | 6.1 | Replicate 1 |
| 10 | 6.4 | Replicate 2 |
| 10 | 6.2 | Replicate 3 |
| 20 | 8.0 | Replicate 1 |
| 20 | 7.7 | Replicate 2 |
| 20 | 8.3 | Replicate 3 |
| 30 | 10.3 | Replicate 1 |
| 30 | 9.9 | Replicate 2 |
Formula Used
For each repeated x group, calculate the group mean of y.
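SSPE = Σi Σj (yij - ȳi)²
dfPE = Σi (ni - 1)
MSPE = SSPE / dfPE
Here yij is the j-th y value in x group i, ȳi is the mean of that group, and ni is the number of observations in it.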
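The fitted regression model gives the residual sum of squares SSE with dfE residual degrees of freedom. Lack of fit is calculated as:
SSLOF = SSE - SSPE
dfLOF = dfE - dfPE
MSLOF = SSLOF / dfLOF
F = MSLOF / MSPE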
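As a rough illustration of the pure error formulas above, the following R sketch computes SSPE, dfPE, and MSPE by hand from the example data table. The variable names are illustrative and are not part of the calculator.

```r
# Example data from the table above
x <- c(10, 10, 10, 20, 20, 20, 30, 30)
y <- c(6.1, 6.4, 6.2, 8.0, 7.7, 8.3, 10.3, 9.9)

# Group mean of y, repeated for every observation in its x group
group_mean <- ave(y, x)

# Pure error sum of squares: squared deviations from the group means
sspe <- sum((y - group_mean)^2)

# Pure error degrees of freedom: sum of (n_i - 1) over the groups
dfpe <- sum(table(x) - 1)

# Pure error mean square
mspe <- sspe / dfpe

round(c(SSPE = sspe, dfPE = dfpe, MSPE = mspe), 4)
```

For this data SSPE is about 0.307 on 5 degrees of freedom, so MSPE is about 0.061.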
How to Use This Calculator
Paste data with x values in the first column and y values in the second column. Repeated x settings may be listed together or scattered through the data. The tool groups them automatically.
Select the delimiter, model degree, intercept option, grouping tolerance, alpha level, and decimal places. Press Calculate to view pure error and lack of fit results above the form.
Use CSV for spreadsheet work. Use PDF for a quick report copy. Compare the R code below with your own analysis.
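# Example data from the table above (three x settings, eight observations)
df <- data.frame(
  x = c(10, 10, 10, 20, 20, 20, 30, 30),
  y = c(6.1, 6.4, 6.2, 8.0, 7.7, 8.3, 10.3, 9.9)
)
# Planned regression model: polynomial of degree 1 (a straight line)
fit <- lm(y ~ poly(x, 1, raw = TRUE), data = df)
# Full replicate model: one mean per x group
full <- lm(y ~ factor(x), data = df)
# Pure error comes from the full model
sspe <- deviance(full)
dfpe <- df.residual(full)
# Lack of fit is what the planned model leaves beyond pure error
sslof <- deviance(fit) - sspe
dflof <- df.residual(fit) - dfpe
# F test: needs more unique x settings than model coefficients (dflof > 0)
fvalue <- (sslof / dflof) / (sspe / dfpe)
pvalue <- pf(fvalue, dflof, dfpe, lower.tail = FALSE)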
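One way to cross-check a manual calculation is to let R compare the two models directly. The sketch below assumes the eight rows of the example data table and uses anova() on the planned model and the full replicate model. The second row of the resulting table holds the lack of fit F test for that data.

```r
# Example data from the table above
df <- data.frame(
  x = c(10, 10, 10, 20, 20, 20, 30, 30),
  y = c(6.1, 6.4, 6.2, 8.0, 7.7, 8.3, 10.3, 9.9)
)
fit  <- lm(y ~ x, data = df)          # planned straight-line model
full <- lm(y ~ factor(x), data = df)  # one mean per x group

# Compare the reduced model against the full model
lof <- anova(fit, full)
print(lof)

# Row 2 holds the lack of fit F statistic and its p value
fvalue <- lof$F[2]
pvalue <- lof$`Pr(>F)`[2]
```

The Res.Df and RSS columns show the residual degrees of freedom and residual sum of squares of each model, so the second row also gives dfPE and SSPE.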
Understanding Pure Error
Pure error is the part of residual variation seen among repeated observations at the same input setting. It does not come from the chosen model shape. It comes from natural scatter, measurement noise, or process variation. When data contains replicated x values, pure error gives a clean estimate of experimental noise.
Why It Matters in Statistics
Pure error is useful for checking whether a regression curve is too simple. A fitted line may have large residuals for two reasons. The model may miss curvature. The observations may also be noisy. Pure error separates the second reason from the first. This makes lack of fit testing more meaningful.
How R Handles the Idea
In R, analysts usually fit the planned regression model first. Then they fit a full replicate model using factor levels for the repeated x values. The full model can match each group mean. Its residual sum of squares is the pure error sum of squares. The difference between the two residual sums is lack of fit.
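The claim that the full model matches each group mean can be checked directly. This is a minimal sketch using the example data table; the object names are illustrative.

```r
# Example data from the table above
df <- data.frame(
  x = c(10, 10, 10, 20, 20, 20, 30, 30),
  y = c(6.1, 6.4, 6.2, 8.0, 7.7, 8.3, 10.3, 9.9)
)
full <- lm(y ~ factor(x), data = df)

# Group mean of y for every observation
group_means <- ave(df$y, df$x)

# Fitted values of the full model are the group means
means_match <- isTRUE(all.equal(unname(fitted(full)), group_means))

# So its residual sum of squares equals SSPE computed by hand
sspe_match <- isTRUE(all.equal(deviance(full), sum((df$y - group_means)^2)))

c(means_match = means_match, sspe_match = sspe_match)
```

Both checks should come back TRUE, which is why deviance(full) can stand in for the pure error sum of squares.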
Interpreting the Output
A small pure error mean square means replicated points are close together. A large value means the data has wide scatter inside repeated groups. The lack of fit F value compares model shape error against pure error. A high F value suggests the selected model misses a pattern that group means can explain.
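The interpretation above can be turned into a simple decision rule. The sketch below is one possible way to do it; interpret_lof is a hypothetical helper, and the F values and the alpha level of 0.05 are illustrative inputs rather than calculator output.

```r
# Hypothetical helper: turn a lack of fit F statistic into a verdict
interpret_lof <- function(fvalue, dflof, dfpe, alpha = 0.05) {
  # Upper tail probability of the F distribution
  pvalue <- pf(fvalue, dflof, dfpe, lower.tail = FALSE)
  verdict <- if (pvalue < alpha) {
    "evidence of lack of fit: the model may be too simple"
  } else {
    "no evidence of lack of fit at this alpha level"
  }
  list(pvalue = pvalue, verdict = verdict)
}

# Illustrative values only
high <- interpret_lof(fvalue = 12.5, dflof = 1, dfpe = 5)
low  <- interpret_lof(fvalue = 0.8, dflof = 1, dfpe = 5)
high$verdict
low$verdict
```

A non-significant result does not prove the model is correct. It only means the replicates give no reason to reject it.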
Practical Data Tips
Pure error requires at least one replicated input setting. More repeats give a stronger estimate. Keep units consistent. Enter repeated x values in the same way, or use a small grouping tolerance. Review the group table before trusting the test.
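To see what a grouping tolerance might do, here is a minimal R sketch. It is an assumption about one reasonable approach, not the calculator's actual algorithm: group_by_tolerance is a hypothetical helper that sorts the x values and starts a new group whenever the gap to the previous value exceeds the tolerance.

```r
# Hypothetical helper: assign a group id to each x value, merging values
# whose sorted neighbours differ by no more than `tol`
group_by_tolerance <- function(x, tol = 1e-6) {
  ord <- order(x)
  # A new group starts wherever the gap to the previous sorted value exceeds tol
  new_group <- c(TRUE, diff(x[ord]) > tol)
  id <- integer(length(x))
  id[ord] <- cumsum(new_group)
  id
}

# Illustrative x values with small rounding differences
x <- c(10, 10.0001, 9.9999, 20, 20.0002, 30)
groups <- group_by_tolerance(x, tol = 0.001)
groups
```

Here the six values fall into groups 1, 1, 1, 2, 2, 3. Because neighbours are chained, a long run of closely spaced values can merge into one group, which is one reason to review the group table before trusting the test.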
Use in Reporting
Report SSPE, dfPE, MSPE, and the replicate groups. Also report the lack of fit results if a regression model is fitted. Include the polynomial degree used. Add the R code to make the result easy to verify. This calculator helps prepare those numbers before you run the final analysis in R.
Common Mistakes
Do not treat every residual as pure error. Pure error only uses repeats at identical settings. Do not group values that should be different. Do not use the test when every x value appears once. In that case, pure error degrees of freedom are zero and no estimate is available.
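A quick guard against the last mistake is to count the pure error degrees of freedom before running the test. The helper below is a hypothetical sketch that assumes exact matching of x values.

```r
# Hypothetical check: is there at least one replicated x setting?
has_pure_error <- function(x) {
  # dfPE = number of observations minus number of unique x groups
  length(x) - length(unique(x)) > 0
}

has_pure_error(c(10, 20, 30, 40))  # FALSE: every x appears once
has_pure_error(c(10, 10, 20, 30))  # TRUE: x = 10 is replicated
```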
FAQs
What is pure error?
Pure error is variation among repeated y values at the same x setting. It estimates random scatter that remains even when the group mean is known.
Why do I need replicated x values?
Pure error is based on within group variation. If every x value appears only once, there is no within group variation to measure.
What is SSPE?
SSPE means pure error sum of squares. It sums squared differences between each repeated y value and its own group mean.
What is dfPE?
dfPE is the pure error degrees of freedom. It equals the total number of observations minus the number of unique x groups.
What does lack of fit mean?
Lack of fit is model error left after removing pure error. It suggests the selected regression shape may not follow the group means well.
Can I use a polynomial model?
Yes. Choose a polynomial degree from the form. The tool fits the model and compares its residual error with pure error.
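The polynomial degree is limited by the number of unique x settings. As a sketch using the example data table, which has three unique x values, the code below computes the lack of fit degrees of freedom for degrees 1 and 2.

```r
# Example data from the table above
df <- data.frame(
  x = c(10, 10, 10, 20, 20, 20, 30, 30),
  y = c(6.1, 6.4, 6.2, 8.0, 7.7, 8.3, 10.3, 9.9)
)
full <- lm(y ~ factor(x), data = df)
dfpe <- df.residual(full)

# Lack of fit degrees of freedom for polynomial degrees 1 and 2
dflof <- sapply(1:2, function(degree) {
  fit <- lm(y ~ poly(x, degree, raw = TRUE), data = df)
  df.residual(fit) - dfpe
})
names(dflof) <- paste("degree", 1:2)
dflof
```

Degree 1 leaves one degree of freedom for lack of fit. Degree 2 leaves none, because a quadratic passes through all three group means, so the F test is not available for that choice.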
What does grouping tolerance do?
Grouping tolerance combines x values that are very close. It helps when repeated settings contain small rounding differences.
Is the p value exact?
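The p value is the upper tail probability of the F distribution with the lack of fit and pure error degrees of freedom. It is exact when the errors are independent and normally distributed with constant variance, and approximate otherwise. It is suitable for standard replicated regression checks.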