Backward Elimination Regression Calculator

Select predictors, remove weak ones, and refine quickly. See p-values, R², adjusted R², and errors. Built for learners who want reliable regression decisions now.

Calculator

Predictors with p > alpha are removed, one at a time.
First column must be the response y. Remaining columns are predictors.
Tip: include at least 3–5 times as many rows as predictors.

Example Data Table

You can paste this example into the dataset box above.
y,x1,x2,x3
18,2,1,7
20,3,2,6
22,4,2,5
25,5,3,5
27,6,4,4
30,7,5,3
31,8,6,2
33,9,7,2

Formula Used

  • Model: y = β₀ + β₁x₁ + … + βₖxₖ + ε
  • OLS coefficients: β̂ = (XᵀX)⁻¹Xᵀy
  • Residual variance: MSE = SSE / (n − k − 1), since k slopes and one intercept are estimated
  • Std error: SE(β̂ⱼ) = √(MSE · (XᵀX)⁻¹ⱼⱼ)
  • t-test: tⱼ = β̂ⱼ / SE(β̂ⱼ), df = n − k − 1
  • Two-sided p-value: pⱼ = 2 · (1 − CDFt(|tⱼ|, df))
  • Backward elimination: remove the predictor with the largest p-value if it exceeds alpha, then refit.
This tool computes the t-distribution CDF via the incomplete beta function.
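The formulas above can be sketched in plain Python as follows. This is a minimal illustration, not the calculator's actual source: the function names (`invert`, `reg_inc_beta`, `two_sided_p`, `ols`) are hypothetical, the matrix inverse uses Gauss-Jordan elimination, and the residual degrees of freedom are taken as n minus the number of fitted coefficients, intercept included.

```python
import math

def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; check for collinear predictors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def _beta_cf(a, b, x):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 301):
        for num in (m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
                    -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < 1e-14:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def two_sided_p(t, df):
    """Two-sided p-value for a t statistic: I_x(df/2, 1/2) with x = df / (df + t^2)."""
    return reg_inc_beta(df / 2.0, 0.5, df / (df + t * t))

def ols(y, columns):
    """Fit y on an intercept plus the given predictor columns.

    Returns (coefficients, standard errors, p-values), intercept first.
    """
    n, p = len(y), len(columns) + 1            # p counts the intercept
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    sse = sum((y[r] - sum(b * v for b, v in zip(beta, X[r]))) ** 2 for r in range(n))
    df = n - p
    mse = sse / df
    se = [math.sqrt(mse * inv[j][j]) for j in range(p)]
    pvals = [two_sided_p(b / s, df) for b, s in zip(beta, se)]
    return beta, se, pvals

# Small made-up dataset: one predictor, four rows.
beta, se, pvals = ols([1, 3, 5, 8], [[0, 1, 2, 3]])
# beta is approximately [0.8, 2.3] (intercept, slope)
```

The sketch assumes the residuals are not all exactly zero; a perfect fit gives zero standard errors and would need a separate guard.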

How to Use This Calculator

  1. Paste your data with a header row. Put y first.
  2. Choose alpha (common values: 0.10, 0.05, 0.01).
  3. Press Run to see each elimination step.
  4. Review final coefficients, p-values, and adjusted R².
  5. Download CSV or PDF for reporting or homework.

FAQs

1) What is backward elimination regression?
It is a stepwise approach that starts with all predictors, then removes the least significant variable repeatedly until all remaining predictors pass a chosen p-value threshold.
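The selection loop itself is short. The sketch below assumes a hypothetical callback `fit_pvalues(kept)` that refits the model on the predictors still in play and returns their p-values; the p-values in `ILLUSTRATIVE_FITS` are invented purely to show the control flow and do not come from any real fit.

```python
def backward_eliminate(predictors, fit_pvalues, alpha=0.05):
    """Drop the predictor with the largest p-value until all are <= alpha.

    fit_pvalues(kept) must refit the model on `kept` and return {name: p-value}.
    Returns (kept predictors, [(removed name, its p-value), ...] in removal order).
    """
    kept = list(predictors)
    steps = []
    while kept:
        pvals = fit_pvalues(kept)                    # refit after every removal
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] <= alpha:
            break                                    # every remaining predictor passes
        steps.append((worst, pvals[worst]))
        kept.remove(worst)
    return kept, steps

# Invented p-values, keyed by the set of predictors in the model (illustration only).
ILLUSTRATIVE_FITS = {
    frozenset({"x1", "x2", "x3"}): {"x1": 0.01, "x2": 0.40, "x3": 0.20},
    frozenset({"x1", "x3"}): {"x1": 0.005, "x3": 0.08},
    frozenset({"x1"}): {"x1": 0.001},
}

def lookup(kept):
    return ILLUSTRATIVE_FITS[frozenset(kept)]

kept, steps = backward_eliminate(["x1", "x2", "x3"], lookup, alpha=0.05)
# kept == ["x1"]; steps == [("x2", 0.40), ("x3", 0.08)]
```

Note that x3's p-value changes from 0.20 to 0.08 once x2 is gone in this made-up example, which is why the model is refitted after each removal instead of dropping every weak predictor at once.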

2) What does alpha mean here?
Alpha is the maximum acceptable p-value for keeping a predictor. If a predictor’s p-value is greater than alpha, it becomes a candidate for removal.

3) Why can a predictor be removed even if R² drops?
R² never increases when a predictor is removed, and it usually drops slightly. Adjusted R² penalizes each extra predictor, so it can rise when a weak one is dropped, and the simpler model is easier to interpret.
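A quick worked example makes the trade-off concrete. It uses the standard formula adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where k is the number of predictors; the two R² values are hypothetical and chosen only for illustration.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept not counted in k)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical fits on 8 rows: dropping one predictor lowers R^2 from 0.950 to 0.948.
full = adjusted_r2(0.950, n=8, k=3)      # 1 - 0.050 * 7 / 4 = 0.9125
reduced = adjusted_r2(0.948, n=8, k=2)   # 1 - 0.052 * 7 / 5 = 0.9272
print(f"full: {full:.4f}, reduced: {reduced:.4f}")
```

Here R² falls by 0.002, yet adjusted R² rises because the reduced model spends one fewer degree of freedom.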

4) What if the matrix inversion fails?
That usually means predictors are highly collinear or there are too few rows. Remove redundant predictors, add more observations, or check that every row of the dataset is formatted consistently.
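One rough way to spot redundant predictors before fitting is to look at pairwise correlations. The sketch below is an assumption about how one might screen the data, not something the calculator is known to do, and the helper names and the 0.95 threshold are arbitrary choices. It only catches pairwise redundancy; a predictor that is a combination of several others would slip through.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((v - mean_a) ** 2 for v in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    if saa == 0 or sbb == 0:
        raise ValueError("constant column: collinear with the intercept")
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    return sab / math.sqrt(saa * sbb)

def correlated_pairs(columns, threshold=0.95):
    """Return (name1, name2, r) for every predictor pair with |r| >= threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Predictor columns from the example data table.
predictors = {
    "x1": [2, 3, 4, 5, 6, 7, 8, 9],
    "x2": [1, 2, 2, 3, 4, 5, 6, 7],
    "x3": [7, 6, 5, 5, 4, 3, 2, 2],
}
for name_a, name_b, r in correlated_pairs(predictors):
    print(f"{name_a} and {name_b}: r = {r:.3f}")
```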

5) Do I need an intercept term?
Most linear regression models include an intercept to avoid forcing the fit through the origin. This calculator always includes an intercept term by default.

6) Are p-values always reliable for selection?
Not always. Stepwise selection can overfit and inflate significance. Use domain knowledge, cross-validation, and diagnostics to confirm that the final model generalizes.

7) What data format should I paste?
Paste CSV or TSV with a header row. The first column is y. Every value should be numeric, and each row must have the same number of columns.
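These format rules can be checked before any fitting is attempted. The following is a minimal sketch of such a check using Python's standard `csv` module; `parse_dataset` is a hypothetical helper, and the delimiter guess (tab if the header contains one, comma otherwise) is an assumption.

```python
import csv

def parse_dataset(text):
    """Parse pasted CSV or TSV text into (y_name, y, predictors).

    predictors maps each predictor name to its list of values.
    Raises ValueError on ragged rows or non-numeric cells.
    """
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("need a header row and at least one data row")
    delimiter = "\t" if "\t" in lines[0] else ","
    rows = list(csv.reader(lines, delimiter=delimiter))
    header = [h.strip() for h in rows[0]]
    if len(header) < 2:
        raise ValueError("need y plus at least one predictor column")
    data = []
    for number, row in enumerate(rows[1:], start=1):
        if len(row) != len(header):
            raise ValueError(
                f"data row {number}: expected {len(header)} values, got {len(row)}")
        try:
            data.append([float(v) for v in row])
        except ValueError:
            raise ValueError(f"data row {number}: non-numeric value") from None
    columns = list(zip(*data))
    y = list(columns[0])                       # first column is the response
    predictors = {name: list(col) for name, col in zip(header[1:], columns[1:])}
    return header[0], y, predictors

y_name, y, predictors = parse_dataset("y,x1,x2,x3\n18,2,1,7\n20,3,2,6\n22,4,2,5")
```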

Related Calculators

  • Linear Regression Calculator
  • Multiple Regression Calculator
  • Logistic Regression Calculator
  • Simple Regression Calculator
  • Power Regression Calculator
  • Logarithmic Regression Calculator
  • R Squared Calculator
  • Adjusted R Squared
  • Slope Intercept Calculator
  • Correlation Coefficient Calculator

Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.