Calculator Inputs
Example Data Table
These example rows show how different models can be compared with the same weighting scheme and time budget.
| Rank | Candidate | Parameter Snapshot | Mean CV Score | CV Std Dev | Train Score | Validation Score | Fit Time | Final Score |
|---|---|---|---|---|---|---|---|---|
| 1 | SVM - RBF Kernel | C=10, gamma=0.01, class_weight=balanced | 0.903 | 0.012 | 0.915 | 0.900 | 0.88s | 91.73 |
| 2 | Elastic Net - Alpha 0.2 | alpha=0.2, l1_ratio=0.35, max_iter=5000 | 0.886 | 0.009 | 0.894 | 0.884 | 0.21s | 91.47 |
| 3 | Random Forest - Depth 12 | n_estimators=300, max_depth=12, min_samples_leaf=2 | 0.914 | 0.018 | 0.941 | 0.909 | 0.62s | 90.01 |
| 4 | XGBoost - LR 0.05 | max_depth=6, learning_rate=0.05, subsample=0.8 | 0.927 | 0.026 | 0.972 | 0.918 | 1.10s | 83.86 |
Formulas Used
The calculator first converts your score inputs to a common 0 to 100 scale, so that performance, spread, efficiency, and penalty values are directly comparable.
1. Performance Score
Performance Score = (0.70 × Mean CV Score) + (0.30 × Validation Score)
2. Generalization Gap
Generalization Gap = |Training Score − Validation Score|
3. Stability Score
Stability Score = 100 − (2 × CV Standard Deviation)
4. Runtime Estimate
Runtime Estimate = Total Candidates × CV Folds × (Fit Time + Score Time)
5. Efficiency Score
Efficiency Score = 100 × (Time Budget ÷ Runtime Estimate)
6. Weighted Base Score
Weighted Base = (Performance × Performance Weight + Stability × Stability Weight + Efficiency × Efficiency Weight) ÷ Total Weights
7. Overfit Penalty
Overfit Penalty = Generalization Gap × Overfit Penalty Factor
8. Final Grid Search Score
Final Score = Weighted Base − Overfit Penalty
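The eight steps above can be sketched as a single function. The weights and the overfit penalty factor below are illustrative defaults, not values taken from the example table, and capping efficiency at 100 is an assumption for when the runtime estimate comes in under budget.

```python
def grid_search_score(
    mean_cv, cv_std, train, validation,    # scores already on a 0-100 scale
    fit_time, score_time,                  # seconds per fit / per scoring pass
    n_candidates, n_folds, time_budget,    # search size and runtime target (s)
    w_perf=0.5, w_stab=0.3, w_eff=0.2,     # assumed example weights
    overfit_factor=1.5,                    # assumed example penalty factor
):
    # 1. Performance Score
    performance = 0.70 * mean_cv + 0.30 * validation
    # 2. Generalization Gap
    gap = abs(train - validation)
    # 3. Stability Score
    stability = 100 - 2 * cv_std
    # 4. Runtime Estimate for the full search
    runtime = n_candidates * n_folds * (fit_time + score_time)
    # 5. Efficiency Score (capped at 100 here as an assumption; the raw
    #    formula can exceed 100 when the estimate is under budget)
    efficiency = min(100.0, 100.0 * time_budget / runtime)
    # 6. Weighted Base Score
    total_w = w_perf + w_stab + w_eff
    base = (performance * w_perf + stability * w_stab + efficiency * w_eff) / total_w
    # 7. Overfit Penalty
    penalty = gap * overfit_factor
    # 8. Final Grid Search Score
    return base - penalty
```

For instance, a candidate with a mean CV score of 90.3, a CV standard deviation of 1.2 points, train/validation scores of 91.5/90.0, and per-fold times of 0.88 s + 0.10 s across 48 candidates and 5 folds would be scored with `grid_search_score(90.3, 1.2, 91.5, 90.0, 0.88, 0.10, 48, 5, 300)`.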
How to Use This Calculator
- Enter a clear candidate label for the parameter combination you want to review.
- Select whether your metric values are decimals or percentages.
- Provide mean CV score, CV standard deviation, training score, and validation score.
- Enter fit time, score time, candidate count, folds, and your time budget.
- Adjust the three weights to reflect your preference for quality, stability, and speed.
- Set an overfit penalty factor. Larger values punish wide train-validation gaps more strongly.
- Submit the form to view the score summary above the calculator.
- Use the CSV or PDF buttons to export the result for reporting.
FAQs
1. What does this calculator measure?
It combines validation quality, cross-validation stability, runtime efficiency, and overfitting risk into one comparable score. That helps rank grid search candidates more consistently than using mean CV score alone.
2. Why not rely only on mean CV score?
Mean CV score is useful, but it can hide instability, excessive runtime, or a large train-validation gap. This calculator adds those trade-offs into a single decision-friendly result.
3. What is the generalization gap?
The generalization gap is the absolute difference between training and validation performance. A larger gap often suggests overfitting, so the calculator converts that gap into a penalty.
4. How is stability scored?
Stability depends on the cross-validation standard deviation. Lower spread means the model behaves more consistently across folds, so the stability score rises as the standard deviation falls.
5. What does the time budget do?
The time budget acts as your acceptable runtime target. If the estimated full search runtime exceeds that target, the efficiency score falls and the final score becomes less favorable.
6. Can I use loss metrics like RMSE?
Yes, but transform the loss into a higher-is-better score before entering it. This calculator assumes larger performance values mean better model behavior.
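One hedged way to do that transform is a linear rescale against a problem-specific worst case. The `worst_rmse` ceiling below is an assumption you must choose for your own data; it is not something the calculator provides.

```python
def rmse_to_score(rmse, worst_rmse):
    """Map an RMSE onto a 0-100 higher-is-better scale.

    worst_rmse is an assumed, problem-specific ceiling: an RMSE at or
    above it scores 0, and an RMSE of 0 scores 100.
    """
    clipped = min(max(rmse, 0.0), worst_rmse)
    return 100.0 * (1.0 - clipped / worst_rmse)
```

For example, with an assumed ceiling of 10.0, `rmse_to_score(2.5, 10.0)` returns 75.0, which can then be entered as a performance value.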
7. What final score is considered good?
Scores above 80 usually indicate a strong candidate. Scores of 90 or above often suggest a very balanced configuration, assuming the inputs and weighting scheme are realistic.
8. When should I increase the overfit penalty factor?
Increase it when you care more about dependable real-world performance than pure training strength. It is especially useful when complex models score high but show wider train-validation gaps.