Calculator Inputs
Use fixed windows for equal-length train blocks, or expanding windows to grow the training history while keeping chronological test slices intact.
Example Data Table
Try this sample configuration to test the calculator quickly.
| Use Case | Observations | Train | Gap | Test | Stride | Horizon | Window | Sample Scores |
|---|---|---|---|---|---|---|---|---|
| Retail demand forecast | 240 | 96 | 6 | 24 | 24 | 7 | Fixed | 0.842, 0.801, 0.828, 0.815, 0.833 |
| Sensor drift monitoring | 360 | 120 | 12 | 30 | 30 | 14 | Expanding | 0.921, 0.934, 0.917, 0.940, 0.936 |
Formula Used
Fold boundaries
fixed window → train_start(k) = 1 + start_offset + (k − 1) × stride
fixed window → train_end(k) = train_start(k) + train_size − 1
expanding window → train_start(k) = 1 + start_offset
expanding window → train_end(k) = train_start(k) + train_size − 1 + (k − 1) × stride
test_start(k) = train_end(k) + gap + 1
test_end(k) = test_start(k) + test_size − 1
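The fold-boundary formulas above can be sketched as a small Python helper. The function name and signature are illustrative, not the calculator's internal API; indices are 1-based and inclusive, matching the formulas.

```python
def fold_bounds(k, train_size, gap, test_size, stride, window="fixed", start_offset=0):
    """Return (train_start, train_end, test_start, test_end) for 1-indexed fold k."""
    if window == "fixed":
        # Fixed window: the whole train block slides forward by stride each fold.
        train_start = 1 + start_offset + (k - 1) * stride
        train_end = train_start + train_size - 1
    else:
        # Expanding window: the start is anchored and the end grows by stride each fold.
        train_start = 1 + start_offset
        train_end = train_start + train_size - 1 + (k - 1) * stride
    test_start = train_end + gap + 1
    test_end = test_start + test_size - 1
    return train_start, train_end, test_start, test_end

# Retail demand example from the table: 240 obs, train 96, gap 6, test 24, stride 24.
# Folds stop once a test block would run past the last observation.
for k in range(1, 10):
    tr_s, tr_e, te_s, te_e = fold_bounds(k, 96, 6, 24, 24)
    if te_e > 240:
        break
    print(f"fold {k}: train {tr_s}-{tr_e}, test {te_s}-{te_e}")
```

With the retail configuration this yields exactly five folds, which is why the sample row lists five scores.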
Coverage and risk
unique test coverage % = unique test observations ÷ total observations × 100
required gap = max(0, forecast_horizon − 1)
gap safety % = 100 when required gap is 0; otherwise min(1, gap ÷ required gap) × 100
leakage risk score combines gap shortfall and overlapping test windows as a planning heuristic.
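The coverage and gap-safety formulas can be checked with a short sketch. The function name is illustrative, and the fold list below is the retail example's test blocks derived from the boundary formulas above; the combined leakage risk score itself is a heuristic and is not reproduced here.

```python
def coverage_and_gap(test_blocks, total_obs, horizon, gap):
    """test_blocks: list of (test_start, test_end) tuples, 1-indexed inclusive."""
    tested = set()
    for start, end in test_blocks:
        # Clip to the timeline so partial final blocks are not over-counted.
        tested.update(range(start, min(end, total_obs) + 1))
    coverage_pct = len(tested) / total_obs * 100
    required_gap = max(0, horizon - 1)
    gap_safety_pct = 100.0 if required_gap == 0 else min(1, gap / required_gap) * 100
    return coverage_pct, required_gap, gap_safety_pct

# Retail example: five non-overlapping 24-point test blocks over 240 observations.
blocks = [(103, 126), (127, 150), (151, 174), (175, 198), (199, 222)]
print(coverage_and_gap(blocks, 240, 7, 6))  # (50.0, 6, 100.0)
```

Because the stride equals the test size here, no test point is reused, so unique coverage is simply 120 tested points out of 240.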
Score statistics
mean = sum of fold scores ÷ number of scores
sample standard deviation = √( Σ(score − mean)² ÷ (n − 1) )
standard error = standard deviation ÷ √n
confidence interval = mean ± z × standard error (z = 1.96 for a 95% interval)
stability index = 100 ÷ (1 + coefficient of variation), where coefficient of variation = standard deviation ÷ mean
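The score statistics above can be computed with the standard library alone. The function name is illustrative; the sample scores are the retail row from the table.

```python
import math

def score_stats(scores, z=1.96):
    """Mean, sample SD, standard error, z-based CI, and stability index for fold scores."""
    n = len(scores)
    mean = sum(scores) / n
    # Sample standard deviation uses the n - 1 denominator.
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    se = sd / math.sqrt(n)
    ci = (mean - z * se, mean + z * se)
    cv = sd / mean  # coefficient of variation; assumes mean > 0
    stability = 100 / (1 + cv)
    return mean, sd, se, ci, stability

scores = [0.842, 0.801, 0.828, 0.815, 0.833]  # retail sample scores from the table
mean, sd, se, ci, stability = score_stats(scores)
print(round(mean, 4), round(sd, 4))  # 0.8238 0.0161
```

Tighter scores raise the stability index toward 100, since the coefficient of variation shrinks toward zero.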
How to Use This Calculator
- Enter the total number of ordered observations in your dataset.
- Choose train, gap, test, and stride sizes that match your deployment timeline.
- Set the forecast horizon so the calculator can estimate the minimum safe gap.
- Select fixed windows for constant train lengths, or expanding windows for growing history.
- Paste fold scores if you already evaluated model runs externally.
- Click the calculate button to see fold counts, leakage checks, score stability, the graph, and export buttons.
FAQs
1) What is blocked cross validation?
Blocked cross validation splits ordered data into chronological train and test blocks. It avoids shuffling, so performance estimates better reflect real sequential prediction settings.
2) Why use a gap between train and test blocks?
A gap reduces information leakage from nearby records. This matters when lagged features, delayed labels, or horizon-based targets could let training rows reveal future behavior.
3) When should I choose a fixed window?
Use a fixed window when model recency matters more than older history. It keeps train size constant and is useful for drifting environments or memory-limited workflows.
4) When is an expanding window better?
An expanding window is helpful when older observations still add signal. Each fold preserves order while increasing training history, which often stabilizes parameter estimates.
5) What does unique test coverage mean?
Unique test coverage measures how much of the full timeline is tested at least once. Higher coverage usually means broader evaluation across changing time conditions.
6) How should I interpret the leakage risk score?
Lower values are better. The score rises when the chosen gap is too short for the forecast horizon or when the stride creates overlapping test windows.
7) Can I use this for classification and regression?
Yes. The planning logic works for any ordered prediction task. Pick a relevant metric like accuracy, F1, AUC, RMSE, MAE, MAPE, or log loss.
8) Does this calculator train a model directly?
No. It plans blocked folds and summarizes the quality of external fold scores. Use it to design validation structure before or after running model experiments.