Validate forecasts across folds, horizons, and time drift. Review errors by fold, overall, and by confidence level. Make stronger planning decisions from repeatable forecast evidence.
| Index | Actual | Predicted | Suggested Fold |
|---|---|---|---|
| 1 | 120 | 118 | 1 |
| 2 | 127 | 126 | 1 |
| 3 | 129 | 131 | 1 |
| 4 | 134 | 133 | 2 |
| 5 | 138 | 136 | 2 |
| 6 | 141 | 143 | 2 |
| 7 | 145 | 144 | 3 |
| 8 | 149 | 150 | 3 |
| 9 | 153 | 151 | 3 |
| 10 | 156 | 155 | 4 |
| 11 | 160 | 161 | 4 |
| 12 | 164 | 163 | 4 |
The calculator validates forecast accuracy across blocked folds, then projects a forward forecast using trend, optional seasonality, and bias correction from cross-validation residuals.
The following definitions drive the metrics and the forward forecast:

- Residual = Actual − Predicted
- MAE = mean(|Residual|)
- RMSE = sqrt(mean(Residual²))
- MAPE = mean(|Residual / Actual|) × 100 (non-zero actuals only)
- SMAPE = mean(2 × |A − P| / (|A| + |P|)) × 100
- Bias = mean(Residual), used for directional error adjustment
- Quality Score = 100 − Weighted Error, using normalized weights
- Forecast(step) = Base + Drift + Seasonal + Bias
- Interval = Forecast ± Z × ResidualStd × √step

The seasonal component uses deviations from the latest full season window when a season length is provided. Drift is estimated from the latest trend and adjusted by the drift percentage.
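As a concrete check, the metric definitions above can be applied to the sample table. This is a minimal sketch; the `error_metrics` helper name is illustrative, not the calculator's API:

```python
import math

def error_metrics(actual, predicted):
    # Residual = Actual - Predicted
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    # MAPE is averaged over non-zero actuals only, as noted above
    nz = [(r, a) for r, a in zip(residuals, actual) if a != 0]
    mape = sum(abs(r / a) for r, a in nz) / len(nz) * 100
    smape = sum(2 * abs(a - p) / (abs(a) + abs(p))
                for a, p in zip(actual, predicted)) / n * 100
    bias = sum(residuals) / n  # positive = underforecasting
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape,
            "SMAPE": smape, "Bias": bias}

# Sample data from the table above
actual    = [120, 127, 129, 134, 138, 141, 145, 149, 153, 156, 160, 164]
predicted = [118, 126, 131, 133, 136, 143, 144, 150, 151, 155, 161, 163]
m = error_metrics(actual, predicted)
# For this sample, RMSE works out to exactly 1.5 and bias is slightly positive.
```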
Cross-validation forecasting performs best when observations are ordered, complete, and representative of real operating conditions. Most teams start with three to five blocked folds and at least twelve points, so each fold captures changing levels. This calculator supports manual fold IDs or automatic fold assignment, which makes repeated experiments easier to compare. Standardizing the same fold design across model versions improves auditability, trend reviews, and approval decisions, and reduces the risk of leakage and confusion.
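Automatic blocked fold assignment can be sketched as contiguous, ordered blocks, so later folds never mix with earlier data. This is an assumed implementation of the behavior described above, not the calculator's exact code:

```python
def blocked_folds(n_points, n_folds):
    # Split n_points into n_folds contiguous blocks, preserving time order.
    # Earlier folds absorb the remainder when sizes are uneven.
    size, rem = divmod(n_points, n_folds)
    folds = []
    for f in range(1, n_folds + 1):
        block = size + (1 if f <= rem else 0)
        folds.extend([f] * block)
    return folds

blocked_folds(12, 4)  # → [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
```

This matches the "Suggested Fold" column in the sample table: twelve points in four blocks of three.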
The calculator reports MAE, RMSE, MAPE, SMAPE, and bias because forecast quality is multidimensional. MAE describes average miss size, RMSE penalizes larger misses, MAPE shows proportional error, and SMAPE remains useful across changing scales. Bias reveals directional overforecasting or underforecasting. Reviewing all five measures together is more reliable than relying on a single metric, especially when demand patterns include promotions, shocks, or uneven volatility; this matters most in budget planning.
A weighted quality score combines multiple error signals into one operational indicator. Analysts can increase RMSE weight when large misses are expensive, or increase MAPE weight when percentage precision drives planning. The calculator normalizes weights before scoring, keeping comparisons fair between different reviewers. Even with a strong score, teams should still inspect raw metrics, because equal scores can hide very different risk profiles and response needs. Document chosen weights in model governance logs.
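The weight normalization and scoring step can be sketched as follows. The calculator does not specify how each error is scaled before weighting, so using the raw metric values here is one plausible reading, not the confirmed formula:

```python
def quality_score(metrics, weights):
    # Normalize weights so they sum to 1, keeping reviewer choices comparable.
    total = sum(weights.values())
    norm = {k: w / total for k, w in weights.items()}
    # Quality Score = 100 - Weighted Error
    weighted_error = sum(norm[k] * metrics[k] for k in norm)
    return 100 - weighted_error

# Illustrative metric values; RMSE weighted double because large misses hurt most.
metrics = {"MAE": 1.42, "RMSE": 1.5, "MAPE": 1.0, "SMAPE": 1.0}
score = quality_score(metrics, {"MAE": 1, "RMSE": 2, "MAPE": 1, "SMAPE": 1})
```

Note how doubling the RMSE weight pulls the score toward the metric the business cares about, which is exactly why chosen weights belong in governance logs.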
After validation, the tool projects future values using a base level, drift adjustment, optional seasonality, and residual bias correction, giving a practical preview before production deployment. Confidence intervals are built from the residual standard deviation and a Z multiplier, so ranges widen as forecast steps increase. In operations, widening intervals usually signal growing uncertainty and may call for staffing buffers, procurement hedges, or model retraining.
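The projection step can be sketched from the formulas given earlier. Seasonality is omitted here for brevity, and treating drift as a constant per-step increment is an assumption; the calculator derives it from the latest trend and a drift percentage:

```python
import math
import statistics

def project(base, drift_per_step, bias, residuals, steps, z=1.96):
    # Forecast(step) = Base + Drift + Bias (seasonal term omitted);
    # Interval = Forecast ± Z × ResidualStd × √step, so ranges widen with step.
    std = statistics.pstdev(residuals)
    out = []
    for s in range(1, steps + 1):
        forecast = base + drift_per_step * s + bias
        half = z * std * math.sqrt(s)
        out.append((forecast, forecast - half, forecast + half))
    return out

# Illustrative inputs: last level 164, ~4 units/step trend, small positive bias.
preview = project(base=164, drift_per_step=4, bias=0.42,
                  residuals=[2, 1, -2, 1], steps=3)
```

Because the interval half-width grows with √step, step 3 is wider than step 1 by a factor of √3, which is the "uncertainty growth" planners watch for.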
For professional reporting, export CSV for analysis and PDF for management summaries. Record the dataset period, fold strategy, horizon, season length, and confidence settings with every run to preserve reproducibility. Compare recurring runs using the same setup to detect drift early. If bias increases while MAE stays flat, recalibration is often needed. If error metrics decline together, the forecasting process is usually becoming more dependable.
The calculator evaluates forecast performance by comparing actual and predicted values across folds, then summarizes MAE, RMSE, MAPE, SMAPE, bias, and a weighted quality score for model review.
Use blocked folds for time ordered data. Random folds can leak future information into training logic and make forecast accuracy look better than production performance.
MAPE is intuitive for percentage error, while SMAPE is more stable when scales change or values vary widely. Reviewing both gives a safer comparison across datasets.
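A small illustration of why both metrics are worth reviewing: one tiny actual value can dominate MAPE, while SMAPE stays bounded (at most 200 under the formula above). The numbers here are invented for the example:

```python
# One small actual (2) and one large actual (100), both missed by 2 units.
actual, predicted = [2, 100], [4, 98]

mape = sum(abs(a - p) / abs(a)
           for a, p in zip(actual, predicted)) / 2 * 100
smape = sum(2 * abs(a - p) / (abs(a) + abs(p))
            for a, p in zip(actual, predicted)) / 2 * 100
# MAPE = 51.0, dominated by the tiny actual; SMAPE ≈ 34.3, bounded by 200.
```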
Increase the weight of the metric that matches business risk. For example, prioritize RMSE for large misses or MAPE for percentage based planning targets.
Bias is the average residual. Positive bias means predictions are generally low, while negative bias means predictions are generally high relative to actual outcomes.
Retrain when bias trends worsen, error metrics rise across repeated runs, or confidence intervals become too wide for operational decisions and planning commitments.
Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.