Calculator
Example Data Table
| y | x1 | x2 | x3 |
|---|---|---|---|
| 18 | 2 | 1 | 7 |
| 20 | 3 | 2 | 6 |
| 22 | 4 | 2 | 5 |
| 25 | 5 | 3 | 5 |
| 27 | 6 | 4 | 4 |
| 30 | 7 | 5 | 3 |
| 31 | 8 | 6 | 2 |
| 33 | 9 | 7 | 2 |
Formula Used
- Model: y = β₀ + β₁x₁ + … + βₖxₖ + ε
- OLS coefficients: β̂ = (XᵀX)⁻¹Xᵀy
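- Residual variance: MSE = SSE / (n − k − 1), where n is the number of rows and k is the number of predictors (the intercept uses one more degree of freedom)
- Std error: SE(β̂ⱼ) = √(MSE · (XᵀX)⁻¹ⱼⱼ)
- t-test: tⱼ = β̂ⱼ / SE(β̂ⱼ), df = n − k − 1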
- Two-sided p-value: pⱼ = 2 · (1 − CDFt(|tⱼ|, df))
- Backward elimination: remove the predictor with the largest p-value if it exceeds alpha, then refit.
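
The formulas above can be sketched in Python with NumPy. This is a minimal illustration, not the calculator's actual code: the function names (`t_two_sided_p`, `ols_fit`, `backward_eliminate`) are hypothetical, the t-distribution tail is approximated with Simpson's rule rather than a statistics library, the intercept is assumed never to be a removal candidate, and the residual degrees of freedom are taken as the number of rows minus the number of fitted coefficients (intercept included).

```python
import math
import numpy as np

# Example data from the table above: columns are y, x1, x2, x3.
DATA = np.array([
    [18, 2, 1, 7], [20, 3, 2, 6], [22, 4, 2, 5], [25, 5, 3, 5],
    [27, 6, 4, 4], [30, 7, 5, 3], [31, 8, 6, 2], [33, 9, 7, 2],
], dtype=float)


def t_two_sided_p(t, df, steps=20000):
    """Two-sided p-value 2 * (1 - CDF_t(|t|, df)), integrating the t density numerically."""
    a = abs(float(t))
    if a == 0.0:
        return 1.0
    x = np.linspace(0.0, a, steps + 1)  # steps must be even for Simpson's rule
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    pdf = np.exp(log_c) * (1.0 + x * x / df) ** (-(df + 1) / 2)
    h = a / steps
    # Simpson's rule: area approximates P(0 < T < |t|).
    area = h / 3 * (pdf[0] + pdf[-1] + 4 * pdf[1:-1:2].sum() + 2 * pdf[2:-1:2].sum())
    return min(1.0, max(0.0, 1.0 - 2.0 * area))


def ols_fit(design, y):
    """OLS on a design matrix that already contains the intercept column."""
    n, p = design.shape
    df = n - p  # rows minus fitted coefficients
    if df <= 0:
        raise ValueError("need more rows than coefficients")
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    mse = resid @ resid / df
    se = np.sqrt(mse * np.diag(xtx_inv))
    pvals = np.array([t_two_sided_p(b / s, df) for b, s in zip(beta, se)])
    return beta, pvals


def backward_eliminate(y, X, names, alpha=0.05):
    """Drop the predictor with the largest p-value while it exceeds alpha, then refit."""
    keep = list(range(X.shape[1]))
    removed = []
    while keep:
        design = np.column_stack([np.ones(len(y)), X[:, keep]])
        beta, pvals = ols_fit(design, y)
        worst = int(np.argmax(pvals[1:]))  # index 0 is the intercept; skip it
        if pvals[1 + worst] <= alpha:
            return [names[i] for i in keep], removed, beta, pvals
        removed.append(names[keep.pop(worst)])
    return [], removed, None, None


kept, removed, beta, pvals = backward_eliminate(
    DATA[:, 0], DATA[:, 1:], ["x1", "x2", "x3"], alpha=0.05
)
print("kept:", kept)
print("removed (in order):", removed)
```

The sketch does not handle a perfect fit (MSE of zero) or a singular XᵀX; a production version would guard against both.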
How to Use This Calculator
- Paste your data with a header row. Put y first.
- Choose alpha (common values: 0.10, 0.05, 0.01).
- Press Run to see each elimination step.
- Review final coefficients, p-values, and adjusted R².
- Download CSV or PDF for reporting or homework.
FAQs
1) What is backward elimination regression?
It is a stepwise approach that starts with all predictors, then removes the least significant variable repeatedly until all remaining predictors pass a chosen p-value threshold.
2) What does alpha mean here?
Alpha is the maximum acceptable p-value for keeping a predictor. If a predictor’s p-value is greater than alpha, it becomes a candidate for removal.
3) Why can a predictor be removed even if R² drops?
R² never increases when a predictor is removed, so a small drop is expected. Adjusted R² can still improve, because it penalizes each additional predictor, and the simpler model is easier to interpret.
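
As a small illustration, the standard adjusted R² formula is 1 − (1 − R²)(n − 1)/(n − k − 1) for n rows and k predictors. The helper name `adjusted_r2` and the R² values below are made up for the example; they are not output from this calculator.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (intercept not counted in k)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical values: dropping one of three predictors lowers R² slightly...
full = adjusted_r2(0.900, n=8, k=3)     # 1 - 0.100 * 7 / 4 = 0.825
reduced = adjusted_r2(0.895, n=8, k=2)  # 1 - 0.105 * 7 / 5 = 0.853
# ...but adjusted R² goes up, because fewer predictors are penalized.
print(round(full, 3), round(reduced, 3))
```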
4) What if the matrix inversion fails?
That usually means two or more predictors are perfectly or almost perfectly collinear, or there are fewer rows than coefficients to estimate. Remove redundant predictors, add more observations, and check that every row is formatted consistently.
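
One way to spot the problem before fitting is to check the column rank of the design matrix. The sketch below uses NumPy's `np.linalg.matrix_rank`; the helper name `design_is_full_rank` is hypothetical, and the calculator itself may detect the failure differently.

```python
import numpy as np


def design_is_full_rank(design):
    """True if the design matrix (intercept column included) has full column rank."""
    design = np.asarray(design, dtype=float)
    return bool(np.linalg.matrix_rank(design) == design.shape[1])


# First five rows of x1 and x2 from the example table.
x1 = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([1.0, 2.0, 2.0, 3.0, 4.0])
ones = np.ones_like(x1)

good = np.column_stack([ones, x1, x2])       # independent columns
bad = np.column_stack([ones, x1, 2.0 * x1])  # third column is an exact multiple of x1

print(design_is_full_rank(good))
print(design_is_full_rank(bad))
```

A rank check catches exact collinearity only. Near-collinear predictors can still invert but produce very large standard errors.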
5) Do I need an intercept term?
Most linear regression models include an intercept to avoid forcing the fit through the origin. This calculator always includes an intercept term.
6) Are p-values always reliable for selection?
Not always. Stepwise selection can overfit and inflate significance. Use domain knowledge, cross-validation, and diagnostics to confirm that the final model generalizes.
7) What data format should I paste?
Paste CSV or TSV with a header row. The first column is y. Every value should be numeric, and each row must have the same number of columns.
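
The format rules above can be checked with a short parser. This is only a sketch using Python's standard `csv` module: the function name `parse_pasted_table` is hypothetical, and the delimiter is guessed from the header row (tab if one is present, otherwise comma), which is an assumption rather than the calculator's documented behavior.

```python
import csv


def parse_pasted_table(text):
    """Parse CSV/TSV text with a header row; return (header, y, X) with y as the first column."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("need a header row and at least one data row")
    delimiter = "\t" if "\t" in lines[0] else ","
    rows = list(csv.reader(lines, delimiter=delimiter))
    header = [h.strip() for h in rows[0]]
    data = []
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} columns, got {len(row)}"
            )
        try:
            data.append([float(v) for v in row])
        except ValueError:
            raise ValueError(f"line {line_no}: non-numeric value") from None
    y = [r[0] for r in data]
    X = [r[1:] for r in data]
    return header, y, X


header, y, X = parse_pasted_table("y,x1,x2\n18,2,1\n20,3,2\n")
print(header, y, X)
```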