LOESS Calculator
Example data table
This sample shows a near-linear trend with mild noise.
| x | y |
|---|---|
| 1 | 1.2 |
| 2 | 1.9 |
| 3 | 3.1 |
| 4 | 3.9 |
| 5 | 5.2 |
| 6 | 5.8 |
| 7 | 7.1 |
| 8 | 8.0 |
| 9 | 9.2 |
| 10 | 9.9 |
Formula used
LOESS estimates a smooth value at each target x by fitting a local polynomial with distance-based weights.
- Neighborhood size: k = ceil(span × n) nearest points.
- Tricube weight: wᵢ = (1 − |uᵢ|³)³ for |uᵢ| < 1 and wᵢ = 0 otherwise, where uᵢ = (xᵢ − x₀) / dmax.
- Local model: y ≈ β₀ + β₁(x − x₀) [+ β₂(x − x₀)²].
- Weighted least squares: β = (XᵀWX)⁻¹XᵀWy; the prediction is ŷ(x₀) = β₀.
- Robust option: iteratively reweight by bisquare on residuals to reduce outlier influence.
Using centered (x − x₀) improves numerical stability for local fits.
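As a concrete illustration of the steps above, here is a minimal sketch of the single-point fit in Python with NumPy. The function names (`tricube`, `local_fit`) and the optional `rw` argument for robustness weights are illustrative choices, not the calculator's internal code.

```python
import numpy as np

def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 for |u| < 1, zero otherwise."""
    u = np.abs(u)
    return np.where(u < 1, (1 - u**3) ** 3, 0.0)

def local_fit(x, y, x0, span=0.5, degree=1, rw=None):
    """Weighted polynomial fit centered at x0; the intercept is the smoothed value."""
    n = len(x)
    k = int(np.ceil(span * n))              # neighborhood size
    d = np.abs(x - x0)
    dmax = np.sort(d)[k - 1]                # distance to the k-th nearest point
    w = tricube(d / dmax)                   # tricube weights, zero outside the neighborhood
    if rw is not None:                      # optional robustness weights (see robust option below)
        w = w * rw
    # Design matrix in centered coordinates: columns 1, (x - x0), [(x - x0)^2, ...]
    X = np.vander(x - x0, N=degree + 1, increasing=True)
    sw = np.sqrt(w)
    # Weighted least squares solved as ordinary least squares on sqrt-weighted rows
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0]                          # fitted value at x0 is the intercept
```

For the sample table, `local_fit(x, y, 5.5)` would return the smoothed estimate midway between the fifth and sixth observations.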
How to use this calculator
- Paste your x,y pairs, one per line.
- Choose a span to control smoothing strength.
- Select degree 1 for straight local trends.
- Select degree 2 for curved local behavior.
- Set robust iterations if outliers exist.
- Pick Grid output or enter prediction x values.
- Press Submit to see results above the form.
- Download CSV or PDF for documentation and sharing.
Why LOESS is useful
LOESS is a local smoothing method that reveals structure when scatter is noisy, irregular, or partially missing. Instead of fitting one global equation, it fits many small regressions around each x and stitches the predictions into a smooth curve. This makes it practical for exploratory modeling, sensor calibration checks, process drift monitoring, and quick validation of whether a relationship is linear, curved, or changing across the observed range.
Span controls bias and variance
The span is the fraction of points used in each neighborhood. A small span follows rapid changes but can chase noise; a large span reduces variance but can blur real features. In this calculator, k equals ceil(span × n), so adding data points increases the local sample size automatically. A good workflow is to start near 0.5, then adjust downward to capture curvature, or upward to emphasize the overall trend.
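For example, with the 10-point sample table above, span = 0.5 gives k = ceil(0.5 × 10) = 5 neighbors per fit, span = 0.3 gives k = ceil(3) = 3, and span = 0.8 gives k = ceil(8) = 8.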
Local regression mechanics
For each target x0, distances to all observations are computed and scaled by the kth-nearest distance dmax. Tricube weights drop smoothly to zero outside the neighborhood, emphasizing nearby points. A weighted least squares fit is then solved for a polynomial in the centered variable (x − x0). Because of the centering, the predicted value at x0 is simply the fitted intercept, which improves numerical stability and reduces rounding issues compared with fitting in raw x and evaluating the polynomial there.
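Repeating that per-point procedure at every evaluation location produces the smoothed curve. The sketch below reuses the illustrative `local_fit` from the earlier snippet; the `loess` wrapper and the grid of 50 points are likewise assumptions for illustration, not the calculator's exact settings.

```python
import numpy as np

def loess(x, y, x_eval, span=0.5, degree=1, rw=None):
    """Evaluate the centered local fit at each target x and collect the smoothed values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.array([local_fit(x, y, x0, span=span, degree=degree, rw=rw) for x0 in x_eval])

# Smooth the sample table on an even grid across the observed x range.
xs = np.arange(1.0, 11.0)
ys = np.array([1.2, 1.9, 3.1, 3.9, 5.2, 5.8, 7.1, 8.0, 9.2, 9.9])
grid = np.linspace(xs.min(), xs.max(), 50)
smooth = loess(xs, ys, grid, span=0.5, degree=1)
```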
Robust reweighting for outliers
Outliers can distort local fits, especially at small spans. Robust iterations address this by recalculating weights using residuals from an initial pass. Residuals are scaled by six times the median absolute deviation and converted to bisquare weights, which downweight large deviations without discarding data. One or two iterations often improve trend clarity in operational datasets where occasional spikes, logging errors, or rare events appear.
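One possible implementation of that reweighting loop, building on the `loess` and `local_fit` sketches above; the scale of six times the median absolute residual follows the description in this section, and the default of two iterations is an assumption.

```python
import numpy as np

def robust_loess(x, y, x_eval, span=0.5, degree=1, iterations=2):
    """LOESS with bisquare robustness weights recomputed from residuals on each pass."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rw = np.ones_like(y)                           # all points start fully weighted
    for _ in range(iterations):
        fitted = loess(x, y, x, span=span, degree=degree, rw=rw)
        resid = y - fitted
        s = 6.0 * np.median(np.abs(resid))         # six times the median absolute deviation
        if s <= 0:
            break                                  # essentially a perfect fit; stop reweighting
        u = resid / s
        rw = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)   # bisquare weights in [0, 1]
    return loess(x, y, x_eval, span=span, degree=degree, rw=rw)
```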
Interpreting outputs and diagnostics
The exported table lists x and smoothed y values for either a grid or custom prediction points. Grid output is ideal for plotting a smooth curve on dashboards or reports. RSS (residual sum of squares) and R² are computed on the input points using the fitted values at those x locations; they support quick comparisons across spans but are not formal inference statistics. Always review residual patterns and consider domain context before making decisions.
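These in-sample diagnostics can be reproduced from fitted values at the input x locations; a short sketch using `loess` and the `xs`, `ys` arrays defined in the earlier snippets.

```python
import numpy as np

# Fit at the input x locations to compute in-sample diagnostics.
fitted = loess(xs, ys, xs, span=0.5, degree=1)
rss = np.sum((ys - fitted) ** 2)          # residual sum of squares
tss = np.sum((ys - ys.mean()) ** 2)       # total sum of squares about the mean
r_squared = 1.0 - rss / tss
```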
FAQs
1) What span should I start with?
Start around 0.5 for balanced smoothing. Decrease span to capture sharper curvature, and increase it to suppress noise when the curve looks too wiggly.
2) When should I use degree 1 versus degree 2?
Use degree 1 for simpler local trends and more stability near edges. Use degree 2 when the relationship curves and you want smoother turning behavior.
3) What do robust iterations do?
They downweight points with large residuals using a bisquare rule. This reduces the impact of outliers and makes the smoothed curve reflect the typical pattern.
4) Why do I see different results on Grid vs Predict?
The method is the same, but x locations differ. Grid evaluates evenly across the data range, while Predict evaluates only at your custom x values.
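In terms of the earlier sketches, the only difference is which x values are passed to the smoother; the specific custom points below are just an example.

```python
grid_x = np.linspace(xs.min(), xs.max(), 50)   # Grid: evenly spaced across the data range
custom_x = np.array([2.5, 4.0, 7.25])          # Predict: user-supplied x values
grid_y = loess(xs, ys, grid_x)
custom_y = loess(xs, ys, custom_x)
```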
5) Can I use this for extrapolation beyond my data?
LOESS is designed for interpolation within the observed x range. Predictions outside the range are unreliable because neighborhoods become one-sided and sparse.
6) How should I interpret RSS and R² here?
They summarize fit quality on the input points for the chosen settings. Use them to compare spans and robustness, but don’t treat them as formal hypothesis tests.