Paste data, set the penalty strength, and train quickly. View coefficients, predictions, and diagnostics immediately, then export results to share with teammates and clients.
This sample shows three features predicting a numeric target.
| x1 | x2 | x3 | y |
|---|---|---|---|
| 1 | 0 | 3 | 9 |
| 2 | 1 | 2 | 10 |
| 3 | 1 | 0 | 7 |
| 4 | 2 | 1 | 12 |
| 5 | 3 | 0 | 11 |
| 6 | 5 | 1 | 16 |
| 7 | 8 | 2 | 22 |
| 8 | 13 | 3 | 30 |
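The sample table above can be fit directly with a standard lasso implementation. This is a minimal sketch using scikit-learn (the calculator's own internals are not shown here, so `sklearn.linear_model.Lasso` stands in as an assumed equivalent):

```python
import numpy as np
from sklearn.linear_model import Lasso

# The sample data from the table: three features predicting y.
X = np.array([
    [1, 0, 3], [2, 1, 2], [3, 1, 0], [4, 2, 1],
    [5, 3, 0], [6, 5, 1], [7, 8, 2], [8, 13, 3],
], dtype=float)
y = np.array([9, 10, 7, 12, 11, 16, 22, 30], dtype=float)

# alpha plays the role of the penalty strength lambda in the text.
model = Lasso(alpha=0.1, fit_intercept=True, max_iter=10_000)
model.fit(X, y)
print("intercept:", model.intercept_)
print("coefficients:", model.coef_)
```

Raising `alpha` will push the weakest of the three coefficients toward exactly zero.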
Lasso regression minimizes the squared error with an L1 penalty that encourages sparsity: minimize (1/2n) Σᵢ (yᵢ − β₀ − xᵢᵀβ)² + λ Σⱼ |βⱼ|.
This calculator fits a sparse linear model where many coefficients can become exactly zero. As the penalty λ increases, weaker predictors are removed first, which is useful when you want a smaller, more stable model. The “Zero coefficients” count is a quick signal of complexity: fewer active features usually means less variance and cleaner explanations. For high-dimensional datasets, this can behave like automated feature screening.
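The screening effect is easy to see on synthetic data where only a few features carry signal. The sketch below (an illustrative setup, not part of the calculator) counts how many coefficients are driven to exactly zero as λ grows:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first three features carry signal; the other seven are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=100)

for lam in [0.01, 0.1, 0.5, 1.0]:
    coef = Lasso(alpha=lam).fit(X, y).coef_
    print(f"lambda={lam}: zero coefficients = {int(np.sum(coef == 0))}")
```

Larger penalties zero out the noise features first, then the weakest true predictor.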
Coefficients represent the expected change in the target for a one‑unit increase in a feature, holding others constant. Positive weights increase predictions; negative weights reduce them. The intercept is the baseline prediction when all features are zero. When “Fit intercept” is enabled, the model centers data internally and then maps results back to the original scale, keeping interpretations consistent.
Regularization is a trade‑off between fit and simplicity. Start with a small grid such as 0.01, 0.1, 0.5, and 1.0 and track test MSE and test R². If training metrics are excellent but test metrics worsen, your model is likely too flexible. If both sets perform poorly, λ may be too large or the features may not capture the signal. For consistent comparisons, keep the same split seed and note how sparsity changes alongside the test error.
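The grid search described above can be sketched as follows. The data here is synthetic and the split seed is fixed, mirroring the “same split seed” advice (scikit-learn utilities are assumed as a stand-in for the calculator's metrics):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
# Two of the five features are pure noise (true weight 0).
y = X @ np.array([2.0, 0.0, -1.5, 0.0, 0.5]) + rng.normal(scale=0.5, size=200)

# Fixed random_state keeps the split identical across lambda values.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

for lam in [0.01, 0.1, 0.5, 1.0]:
    m = Lasso(alpha=lam).fit(X_tr, y_tr)
    pred = m.predict(X_te)
    print(f"lambda={lam}: test MSE={mean_squared_error(y_te, pred):.3f}, "
          f"test R2={r2_score(y_te, pred):.3f}, "
          f"active={int(np.sum(m.coef_ != 0))}")
```

Track how sparsity and test error move together: the best λ is usually the one with the lowest test MSE among acceptably small models.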
L1 penalties are scale sensitive: a large‑magnitude feature can dominate the optimization and receive a smaller relative penalty. With “Standardize features” enabled, each column is centered and scaled before coordinate descent updates. This makes the penalty comparable across variables, improves numerical stability, and often produces a more reliable set of selected predictors. It is strongly recommended when mixing units such as currency, percentages, and counts.
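A standardize-then-fit workflow like the one described can be expressed as a pipeline. This is a sketch with assumed mixed-unit data (currency-like, percentage-like, and count columns), not the calculator's actual preprocessing code:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
# Three columns on deliberately different scales.
X = np.column_stack([
    rng.normal(5000, 1000, size=150),         # currency-like, thousands
    rng.uniform(0, 1, size=150),              # percentage-like, 0-1
    rng.poisson(3, size=150).astype(float),   # counts
])
y = 0.001 * X[:, 0] + 4.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.2, size=150)

# StandardScaler centers and scales each column before the L1 penalty applies,
# so no single large-magnitude column dominates the optimization.
pipe = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
pipe.fit(X, y)
print("coefficients on the standardized scale:", pipe.named_steps["lasso"].coef_)
```

Note that the reported coefficients are on the standardized scale; dividing each by its column's standard deviation maps them back to original units.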
MSE emphasizes large errors and is helpful when outliers are costly. R² summarizes explained variance and is easy to compare across datasets. Use the fixed seed option to reproduce the same split while testing different λ values. Exporting to CSV or PDF supports consistent reporting, especially when documenting experiments, assumptions, and chosen hyperparameters. If results fluctuate, increase iterations slightly and confirm coefficients stop changing beyond your tolerance setting.
It minimizes mean squared error plus an L1 penalty on coefficients. The penalty encourages sparse solutions, often setting weaker coefficients to exactly zero.
The L1 penalty applies soft‑thresholding during updates. If a feature’s contribution is smaller than the penalty, the optimal coefficient shrinks to zero and the feature is excluded.
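The soft-thresholding operator itself is only a few lines. This standalone sketch shows how a coordinate's raw update is shrunk by the penalty and clipped to exactly zero when it falls inside the threshold:

```python
import numpy as np

def soft_threshold(rho, lam):
    """Soft-thresholding: shrink rho toward zero by lam,
    and return exactly 0 when |rho| <= lam."""
    return np.sign(rho) * np.maximum(np.abs(rho) - lam, 0.0)

print(soft_threshold(2.5, 1.0))   # shrunk to 1.5
print(soft_threshold(0.4, 1.0))   # inside the threshold -> exactly 0.0
print(soft_threshold(-3.0, 1.0))  # shrunk toward zero -> -2.0
```

The middle case is what produces exact zeros: contributions smaller than λ are eliminated rather than merely reduced.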
Use it when features have different scales. It makes the penalty fair across columns and usually improves convergence. If all features are already comparable, it is optional.
Compare test MSE and test R² across a small grid of values. Prefer the smallest test error with a reasonable number of active features for interpretability.
Negative test R² means the model performs worse than predicting the test-set mean. This can happen with weak features, heavy regularization, or noisy targets.
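A quick illustration of negative R², using a deliberately bad predictor (hypothetical numbers, computed with scikit-learn's `r2_score`):

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])
# Predictions that trend in the wrong direction do worse than
# simply predicting the mean of y_true, so R2 goes negative.
y_bad = np.array([4.0, 3.0, 2.0, 1.0])
print(r2_score(y_true, y_bad))  # -3.0
```

Predicting the mean everywhere would give R² = 0; anything below that signals the model is adding error, not removing it.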
This implementation is for linear regression. Classification typically uses logistic loss with an L1 penalty and outputs probabilities rather than continuous predictions.
Important Note: All calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.