Enter predictor and response data
Enter predictor rows in the first box and one response value per row in the second box. The sample data below is already aligned row for row.
Example data table
This example contains four predictors and one response. You can paste the same values into the calculator.
| Sample | X1 | X2 | X3 | X4 | Y |
|---|---|---|---|---|---|
| 1 | 8 | 18 | 4 | 210 | 32 |
| 2 | 10 | 22 | 6 | 250 | 38 |
| 3 | 12 | 25 | 6 | 275 | 41 |
| 4 | 14 | 28 | 7 | 290 | 47 |
| 5 | 16 | 32 | 8 | 330 | 54 |
| 6 | 18 | 35 | 8 | 360 | 58 |
| 7 | 20 | 39 | 9 | 400 | 64 |
| 8 | 22 | 43 | 10 | 430 | 71 |
| 9 | 24 | 47 | 11 | 465 | 77 |
| 10 | 26 | 50 | 12 | 500 | 82 |
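To see why this table suits PLS, you can load the example values into arrays and check the pairwise predictor correlations; all four predictors move together almost perfectly. This is an illustrative NumPy sketch, not part of the calculator itself.

```python
import numpy as np

# Example table as arrays (columns of X are X1..X4; y is the response Y)
X = np.array([[8, 18, 4, 210], [10, 22, 6, 250], [12, 25, 6, 275],
              [14, 28, 7, 290], [16, 32, 8, 330], [18, 35, 8, 360],
              [20, 39, 9, 400], [22, 43, 10, 430], [24, 47, 11, 465],
              [26, 50, 12, 500]], dtype=float)
y = np.array([32, 38, 41, 47, 54, 58, 64, 71, 77, 82], dtype=float)

# Pairwise correlation matrix of the predictors (rowvar=False treats
# each column as a variable)
corr = np.corrcoef(X, rowvar=False)
```

High off-diagonal correlations like these are exactly the situation where ordinary least squares becomes unstable and PLS remains usable.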
Formula used
Partial least squares regression builds latent components from predictors and response together. It is especially useful when predictors are highly correlated or numerous.
1) Standardize the data:
   X* = (X - mean(X)) / sd(X),  y* = (y - mean(y)) / sd(y)
2) Extract component h:
   w_h = normalize(X_h' y_h),  t_h = X_h w_h,  p_h = (X_h' t_h) / (t_h' t_h),  q_h = (y_h' t_h) / (t_h' t_h)
3) Deflate:
   X_(h+1) = X_h - t_h p_h',  y_(h+1) = y_h - q_h t_h
4) Standardized coefficient vector:
   b* = W (P'W)^(-1) q
5) Back-transform to the original scale:
   beta_j = (sd(y) / sd(X_j)) * b*_j,  intercept = mean(y) - Σ_j beta_j mean(X_j)
6) Predict:
   y_hat = intercept + X beta
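The steps above can be sketched as a minimal PLS1 (NIPALS-style) fit in NumPy. Function and variable names are illustrative and not tied to the calculator's internals; this is a sketch under the assumptions stated in the formula, not a definitive implementation.

```python
import numpy as np

def pls1(X, y, n_components):
    # 1) Standardize predictors and response (sample sd, ddof=1)
    x_mean, x_sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean, y_sd = y.mean(), y.std(ddof=1)
    Xh = (X - x_mean) / x_sd
    yh = (y - y_mean) / y_sd

    W, P, Q = [], [], []
    for _ in range(n_components):
        # 2) Component extraction
        w = Xh.T @ yh
        w /= np.linalg.norm(w)          # normalize(X_h' y_h)
        t = Xh @ w
        tt = t @ t
        p = (Xh.T @ t) / tt
        q = (yh @ t) / tt
        # 3) Deflation
        Xh = Xh - np.outer(t, p)
        yh = yh - q * t
        W.append(w); P.append(p); Q.append(q)

    W, P, q = np.array(W).T, np.array(P).T, np.array(Q)
    # 4) Standardized coefficients: b* = W (P'W)^(-1) q
    b_star = W @ np.linalg.solve(P.T @ W, q)
    # 5) Back-transform to the original scale
    beta = (y_sd / x_sd) * b_star
    intercept = y_mean - beta @ x_mean
    return intercept, beta

# 6) Predict on the example table with two components
X = np.array([[8, 18, 4, 210], [10, 22, 6, 250], [12, 25, 6, 275],
              [14, 28, 7, 290], [16, 32, 8, 330], [18, 35, 8, 360],
              [20, 39, 9, 400], [22, 43, 10, 430], [24, 47, 11, 465],
              [26, 50, 12, 500]], dtype=float)
y = np.array([32, 38, 41, 47, 54, 58, 64, 71, 77, 82], dtype=float)
intercept, beta = pls1(X, y, n_components=2)
y_hat = intercept + X @ beta
```

With this near-collinear example, even one or two components capture almost all of the response variation.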
VIP scores summarize predictor importance across extracted components. A VIP score above 1.0 often suggests strong influence, though interpretation should follow your modeling context.
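One common VIP formula weights each predictor's squared, normalized weight by the share of response variance each component explains. The sketch below assumes you already have the weight matrix W, score matrix T, and y-loadings q from a fitted PLS1 model; the small arrays in the example are illustrative only.

```python
import numpy as np

def vip_scores(W, T, q):
    """VIP for PLS1.

    W: (p, h) predictor weights, one column per component.
    T: (n, h) score matrix.
    q: (h,) y-loadings.
    """
    p, h = W.shape
    # Response sum of squares explained by each component
    ss = (q ** 2) * np.sum(T ** 2, axis=0)
    Wn = W / np.linalg.norm(W, axis=0)   # unit-norm weight columns
    return np.sqrt(p * (Wn ** 2 @ ss) / ss.sum())

# Illustrative inputs: p=3 predictors, h=2 components, n=4 samples
W = np.array([[0.8, 0.1], [0.5, 0.6], [0.3, 0.79]])
T = np.array([[1.0, 0.2], [-0.5, 0.4], [0.2, -0.6], [-0.7, 0.1]])
q = np.array([0.9, 0.3])
vip = vip_scores(W, T, q)
```

A useful sanity check on any VIP implementation: the squared scores average to 1 across predictors, which is why 1.0 is the customary importance threshold.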
How to use this calculator
- Paste predictor observations into the X box. Keep columns aligned.
- Paste one response value for each predictor row.
- Choose the number of latent components to extract.
- Keep standardization enabled for differently scaled predictors.
- Click Calculate Now to generate the fitted model.
- Review R², RMSE, coefficients, VIP scores, and fitted values.
- Use the CSV button for spreadsheet review.
- Use the PDF button for sharing or archiving results.
Frequently asked questions
1. What does this calculator estimate?
It fits a partial least squares regression model for one response variable. It estimates an intercept, predictor coefficients, fitted values, VIP scores, and standard performance metrics from the pasted dataset.
2. When should I use partial least squares regression?
Use it when predictors are strongly correlated, numerous, or noisy. PLS reduces predictors into latent components while preserving information useful for predicting the response.
3. Why are components needed?
Components compress the predictor matrix into smaller latent dimensions. Too few components may underfit. Too many may add noise. Start with one or two, then compare the reported fit.
4. What does the VIP score mean?
VIP stands for variable importance in projection. Larger values indicate a stronger overall contribution across components, and predictors with VIP near or above 1.0 are commonly treated as influential.
5. Should I standardize the data?
Usually yes. Standardization is helpful when predictor units differ greatly. It prevents large-scale variables from dominating the component extraction step.
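Standardization (the z-score transform from step 1 of the formula) can be sketched in a few lines of NumPy; the two-column array below mixes scales deliberately to mimic X1 and X4 from the example table.

```python
import numpy as np

# Two predictors on very different scales (values taken from the example table)
X = np.array([[8.0, 210.0], [16.0, 330.0], [26.0, 500.0]])

# Z-score each column: subtract the mean, divide by the sample sd
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
```

After standardization both columns have mean 0 and sd 1, so neither can dominate component extraction purely because of its units.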
6. What does R² tell me here?
R² shows how much response variation is explained by the fitted model on the provided dataset. Higher values indicate closer fit, but they do not guarantee future predictive performance.
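R² and RMSE can both be computed directly from observed and fitted values. The fitted values below are made up for illustration; only the observed values come from the example table.

```python
import numpy as np

y     = np.array([32.0, 38.0, 41.0, 47.0, 54.0])   # observed (from the table)
y_hat = np.array([31.5, 38.4, 41.8, 46.9, 53.4])   # fitted (illustrative)

ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
r2   = 1 - ss_res / ss_tot               # r2 ≈ 0.995
rmse = np.sqrt(np.mean((y - y_hat) ** 2))
```

Both metrics describe fit on the data you pasted; neither measures out-of-sample predictive performance.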
7. Can I use several response variables?
This page handles a single response variable. For multiple responses you would need a PLS2 implementation, which operates on a response matrix rather than a single response vector.
8. Why might the calculator show an input warning?
Warnings appear when requested components exceed the allowable maximum or when the dataset has structural issues. Keep row counts matched and avoid constant response values.