Calculator Inputs
Example Data Table
These example rows show how different models can be compared with the same weighting scheme and time budget.
| Rank | Candidate | Parameter Snapshot | Mean CV Score | CV Std Dev | Train Score | Validation Score | Fit Time | Final Score |
|---|---|---|---|---|---|---|---|---|
| 1 | SVM - RBF Kernel | C=10, gamma=0.01, class_weight=balanced | 0.903 | 0.012 | 0.915 | 0.900 | 0.88s | 91.73 |
| 2 | Elastic Net - Alpha 0.2 | alpha=0.2, l1_ratio=0.35, max_iter=5000 | 0.886 | 0.009 | 0.894 | 0.884 | 0.21s | 91.47 |
| 3 | Random Forest - Depth 12 | n_estimators=300, max_depth=12, min_samples_leaf=2 | 0.914 | 0.018 | 0.941 | 0.909 | 0.62s | 90.01 |
| 4 | XGBoost - LR 0.05 | max_depth=6, learning_rate=0.05, subsample=0.8 | 0.927 | 0.026 | 0.972 | 0.918 | 1.10s | 83.86 |
Formulas Used
The calculator first converts your score inputs to a common 0 to 100 scale, so that performance, spread, efficiency, and penalty values are directly comparable.
1. Performance Score
Performance Score = (0.70 × Mean CV Score) + (0.30 × Validation Score)
2. Generalization Gap
Generalization Gap = |Training Score − Validation Score|
3. Stability Score
Stability Score = 100 − (2 × CV Standard Deviation)
4. Runtime Estimate
Runtime Estimate = Total Candidates × CV Folds × (Fit Time + Score Time)
5. Efficiency Score
Efficiency Score = 100 × (Time Budget ÷ Runtime Estimate)
6. Weighted Base Score
Weighted Base = (Performance × Performance Weight + Stability × Stability Weight + Efficiency × Efficiency Weight) ÷ Total Weights
7. Overfit Penalty
Overfit Penalty = Generalization Gap × Overfit Penalty Factor
8. Final Grid Search Score
Final Score = Weighted Base − Overfit Penalty
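The eight steps above can be sketched as a single function. The weights and the overfit penalty factor below are illustrative defaults, not values taken from the example table, and capping efficiency at 100 is an assumption for when the runtime estimate comes in under budget.

```python
def grid_search_score(
    mean_cv, cv_std, train, validation,    # scores already on a 0-100 scale
    fit_time, score_time,                  # seconds per fit / per scoring pass
    n_candidates, n_folds, time_budget,    # search size and runtime target (s)
    w_perf=0.5, w_stab=0.3, w_eff=0.2,     # assumed example weights
    overfit_factor=1.5,                    # assumed example penalty factor
):
    # 1. Performance Score
    performance = 0.70 * mean_cv + 0.30 * validation
    # 2. Generalization Gap
    gap = abs(train - validation)
    # 3. Stability Score
    stability = 100 - 2 * cv_std
    # 4. Runtime Estimate for the full search
    runtime = n_candidates * n_folds * (fit_time + score_time)
    # 5. Efficiency Score (capped at 100 here as an assumption; the raw
    #    formula can exceed 100 when the estimate is under budget)
    efficiency = min(100.0, 100.0 * time_budget / runtime)
    # 6. Weighted Base Score
    total_w = w_perf + w_stab + w_eff
    base = (performance * w_perf + stability * w_stab + efficiency * w_eff) / total_w
    # 7. Overfit Penalty
    penalty = gap * overfit_factor
    # 8. Final Grid Search Score
    return base - penalty
```

For instance, a candidate with a mean CV score of 90.3, a CV standard deviation of 1.2 points, train/validation scores of 91.5/90.0, and per-fold times of 0.88 s + 0.10 s across 48 candidates and 5 folds would be scored with `grid_search_score(90.3, 1.2, 91.5, 90.0, 0.88, 0.10, 48, 5, 300)`.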
How to Use This Calculator
- Enter a clear candidate label for the parameter combination you want to review.
- Select whether your metric values are decimals or percentages.
- Provide mean CV score, CV standard deviation, training score, and validation score.
- Enter fit time, score time, candidate count, folds, and your time budget.
- Adjust the three weights to reflect your preference for quality, stability, and speed.
- Set an overfit penalty factor. Larger values punish wide train-validation gaps more strongly.
- Submit the form to view the score summary above the calculator.
- Use the CSV or PDF buttons to export the result for reporting.
FAQs
1. What does this calculator measure?
It combines validation quality, cross-validation stability, runtime efficiency, and overfitting risk into one comparable score. That helps rank grid search candidates more consistently than using mean CV score alone.
2. Why not rely only on mean CV score?
Mean CV score is useful, but it can hide instability, excessive runtime, or a large train-validation gap. This calculator adds those trade-offs into a single decision-friendly result.
3. What is the generalization gap?
The generalization gap is the absolute difference between training and validation performance. A larger gap often suggests overfitting, so the calculator converts that gap into a penalty.
4. How is stability scored?
Stability depends on the cross-validation standard deviation. Lower spread means the model behaves more consistently across folds, so the stability score rises as the standard deviation falls.
5. What does the time budget do?
The time budget acts as your acceptable runtime target. If the estimated full search runtime exceeds that target, the efficiency score falls and the final score becomes less favorable.
6. Can I use loss metrics like RMSE?
Yes, but transform the loss into a higher-is-better score before entering it. This calculator assumes larger performance values mean better model behavior.
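One hedged way to do that transform is a linear rescale against a problem-specific worst case. The `worst_rmse` ceiling below is an assumption you must choose for your own data; it is not something the calculator provides.

```python
def rmse_to_score(rmse, worst_rmse):
    """Map an RMSE onto a 0-100 higher-is-better scale.

    worst_rmse is an assumed, problem-specific ceiling: an RMSE at or
    above it scores 0, and an RMSE of 0 scores 100.
    """
    clipped = min(max(rmse, 0.0), worst_rmse)
    return 100.0 * (1.0 - clipped / worst_rmse)
```

For example, with an assumed ceiling of 10.0, `rmse_to_score(2.5, 10.0)` returns 75.0, which can then be entered as a performance value.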
7. What final score is considered good?
Scores above 80 usually indicate a strong candidate. Scores of 90 or above often suggest a very balanced configuration, assuming the inputs and weighting scheme are realistic.
8. When should I increase the overfit penalty factor?
Increase it when you care more about dependable real-world performance than pure training strength. It is especially useful when complex models score high but show wider train-validation gaps.