Map fold distributions and external test splits. Review reuse, leakage risk, and coverage before training. Choose smarter validation settings for dependable machine learning experiments.
This table shows how different settings change the number of training and validation rows per fit.
| Scenario | Total Samples | Holdout % | CV Pool | Method | Folds / Repeats | Train per Fit | Validation per Fit |
|---|---|---|---|---|---|---|---|
| Balanced tabular model | 1,000 | 20% | 800 | K-Fold | 5 / 1 | 640 | 160 |
| Imbalanced binary classifier | 2,400 | 10% | 2,160 | Stratified K-Fold | 6 / 1 | 1,800 | 360 |
| Variance reduction study | 1,200 | 15% | 1,020 | Repeated K-Fold | 5 / 3 | 816 | 204 |
| Ordered forecasting series | 600 | 0% | 600 | Time-Series Split | 4 / 1 | 120 to 480 | 120 |
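The fixed-fold rows in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch (the function name `split_sizes` is illustrative, not part of the calculator) that applies the holdout and fold settings from each row:

```python
def split_sizes(total, holdout_pct, k):
    """Per-fit train/validation row counts for a K-Fold plan with an external holdout."""
    test = round(total * holdout_pct / 100)  # external holdout rows
    pool = total - test                      # cross-validation pool
    val = pool // k                          # validation rows per fold
    train = pool - val                       # training rows per fold
    return train, val

print(split_sizes(1000, 20, 5))  # balanced tabular row -> (640, 160)
print(split_sizes(2400, 10, 6))  # imbalanced binary row -> (1800, 360)
print(split_sizes(1200, 15, 5))  # variance reduction row -> (816, 204)
```

The time-series row does not fit this helper because its training window grows per split, as the formulas below show.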
1) External holdout:
External Test Samples = round(Total Samples × Holdout % ÷ 100)
2) Cross-validation pool:
CV Pool = Total Samples − External Test Samples
3) Standard K-Fold validation size:
Validation per Fold ≈ CV Pool ÷ K
4) Standard K-Fold training size:
Training per Fold = CV Pool − Validation per Fold
5) Repeated K-Fold fits:
Total Fits = K × Repeats
6) Leave-One-Out:
Validation per Split = 1, Training per Split = CV Pool − 1
7) Time-series default test window:
Test Window = floor(CV Pool ÷ (Splits + 1))
8) Time-series growing train window:
Base Train Window = (Split Index × Test Window) + Remainder, where Remainder = CV Pool mod (Splits + 1)
9) Time-series gap adjustment:
Effective Train Window = Base Train Window − Gap Size
10) Stratified approximation:
Minority Fold Count ≈ Minority Samples ÷ K
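Formulas 7 through 9 can be combined into one routine. This sketch (the name `time_series_windows` is illustrative) returns the (train, validation) row counts for each ordered split:

```python
def time_series_windows(pool, splits, gap=0):
    """Growing-window sizes per split: formulas 7 (test window),
    8 (growing train window), and 9 (gap adjustment)."""
    test_window = pool // (splits + 1)               # 7) default test window
    remainder = pool - test_window * (splits + 1)    # leftover rows join the first train window
    windows = []
    for i in range(1, splits + 1):
        base_train = i * test_window + remainder     # 8) growing train window
        train = base_train - gap                     # 9) gap adjustment
        windows.append((train, test_window))
    return windows

print(time_series_windows(600, 4))
# the "Ordered forecasting series" row: train grows 120 -> 480, validation stays 120
```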
This calculator focuses on split sizing, repetition load, reuse intensity, and leakage-aware planning. It does not train models or estimate actual performance metrics.
Cross validation divides available modeling data into repeated train and validation subsets. It estimates generalization more reliably than one random split because every sample gets evaluated across multiple iterations.
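The rotation described above can be sketched in plain Python. This is an illustrative index generator, not the calculator's internal code; any remainder rows are spread across the first folds:

```python
def k_fold_indices(n, k):
    """Yield (train, validation) index lists; every sample lands in
    exactly one validation fold across the k iterations."""
    fold_size, remainder = divmod(n, k)
    start = 0
    for i in range(k):
        size = fold_size + (1 if i < remainder else 0)  # spread remainder rows
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size
```

With n=10 and k=5, each fold validates 2 samples and trains on the other 8.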
Use stratified splitting when your target classes are imbalanced. It keeps each fold closer to the overall class ratio, reducing unstable validation scores caused by missing minority examples.
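One way to approximate stratification is to assign each class's samples to folds in round-robin order, so every fold receives a near-equal share of each class. A minimal sketch (the helper name is illustrative):

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds so class ratios stay roughly even."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)  # round-robin within each class
    return folds

# 8 majority and 4 minority samples, 4 folds: each fold keeps 1 minority row
print(stratified_folds([0] * 8 + [1] * 4, 4))
```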
An external test set stays untouched until final evaluation. Cross validation helps tune models, while the holdout test set provides a cleaner estimate of real-world performance.
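Carving off the holdout first, before any fold logic runs, is what keeps it untouched. A sketch of that order of operations (the function name and seed are illustrative):

```python
import random

def carve_holdout(n, holdout_pct, seed=0):
    """Shuffle once, set aside the external test rows first,
    and cross-validate only on the remaining pool."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * holdout_pct / 100)
    return indices[n_test:], indices[:n_test]  # (cv_pool, external_test)

pool, test = carve_holdout(1000, 20)
print(len(pool), len(test))  # 800 rows to cross-validate, 200 held out
```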
More folds usually lower bias but increase runtime. Five or ten folds are common because they balance stability, training size, and computational cost for many datasets.
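The runtime cost of extra folds is easy to quantify: k fits, each on roughly pool × (k−1)/k rows. A quick illustrative comparison on an 800-row pool:

```python
def total_training_rows(pool, k):
    """Total rows processed across all fits: k fits, each training on pool - pool//k rows."""
    return k * (pool - pool // k)

# Doubling the folds more than doubles the training work:
print(total_training_rows(800, 5))   # 3200 rows processed across 5 fits
print(total_training_rows(800, 10))  # 7200 rows processed across 10 fits
```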
Repeated k-fold reruns shuffled folds multiple times. It smooths random variation and can produce more stable average metrics, especially on modest datasets.
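Repetition simply reshuffles before each pass, yielding K × Repeats fits in total. A sketch assuming the pool divides evenly by k (names are illustrative):

```python
import random

def repeated_k_fold(n, k, repeats, seed=0):
    """Reshuffle before each repeat, then yield k (train, validation) fits
    per repeat: k * repeats fits in total (formula 5)."""
    rng = random.Random(seed)
    fold = n // k
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        for i in range(k):
            val = order[i * fold:(i + 1) * fold]
            train = order[:i * fold] + order[(i + 1) * fold:]
            yield train, val

# The variance-reduction row above: 1,020-row pool, 5 folds x 3 repeats = 15 fits
fits = list(repeated_k_fold(1020, 5, 3))
print(len(fits))
```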
Time series data should preserve chronology. Use time-series splitting when future rows must never influence earlier training windows; otherwise, leakage can make results look unrealistically strong.
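Chronology is preserved by never shuffling: every training index precedes every validation index, and an optional gap drops the rows closest to the boundary. An illustrative sketch:

```python
def time_series_split(n, n_splits, gap=0):
    """Ordered splits over indices 0..n-1: training rows always come
    before validation rows, with an optional gap between them."""
    test_size = n // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train_end = n - (n_splits - i + 1) * test_size - gap
        val_start = train_end + gap
        yield list(range(train_end)), list(range(val_start, val_start + test_size))

for train, val in time_series_split(600, 4):
    print(len(train), "train rows before", len(val), "validation rows")
```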
Group k-fold keeps related records together in the same fold. It is useful for user-level, patient-level, or device-level datasets where grouped leakage would inflate performance.
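One simple way to honor group boundaries is to assign whole groups greedily to the currently smallest fold. This is a sketch of the idea, not a production splitter (the greedy balancing heuristic is an assumption):

```python
from collections import defaultdict

def group_folds(groups, k):
    """Assign entire groups to folds so no group ever spans both
    a training and a validation subset."""
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)
    folds = [[] for _ in range(k)]
    # greedy balancing: largest groups first, each into the smallest fold so far
    for g in sorted(members, key=lambda g: -len(members[g])):
        smallest = min(range(k), key=lambda i: len(folds[i]))
        folds[smallest].extend(members[g])
    return folds

# Two records per user; each user's rows stay together in one fold
print(group_folds(["a", "a", "b", "b", "c", "c", "d", "d"], 2))
```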
Small datasets can still use cross validation, but settings matter. Avoid too many folds when validation subsets become tiny, and watch minority class counts carefully.
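Those checks can be automated before committing to a fold count. The thresholds below (30 validation rows, 5 minority rows per fold) are illustrative assumptions, not fixed rules:

```python
def check_settings(pool, k, minority=None, min_val=30, min_minority=5):
    """Flag fold counts that leave validation subsets or per-fold
    minority counts too small to score reliably."""
    warnings = []
    if pool // k < min_val:
        warnings.append(f"validation folds have only {pool // k} rows")
    if minority is not None and minority / k < min_minority:
        warnings.append(f"only ~{minority / k:.1f} minority rows per fold")
    return warnings

# 120-row pool with 20 minority samples and 10 folds trips both warnings
print(check_settings(120, 10, minority=20))
```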
Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of the results. Please consult other sources as well.