Calculator inputs
Example data table
| Fold | Validation score | Log score proxy | Comment |
|---|---|---|---|
| 1 | 0.84 | -0.1744 | Strong predictive fit |
| 2 | 0.81 | -0.2107 | Good holdout response |
| 3 | 0.86 | -0.1508 | Best fold performance |
| 4 | 0.80 | -0.2231 | Threshold-level fit |
| 5 | 0.83 | -0.1863 | Stable predictive score |
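The log score proxy column matches the natural logarithm of each validation score. A minimal sketch that reproduces the column from the scores in the table (the variable names are illustrative, not taken from the calculator):

```python
import math

# Validation scores from the example table, one per fold.
scores = [0.84, 0.81, 0.86, 0.80, 0.83]

# Natural log of each score, rounded to four decimals like the table.
log_scores = [round(math.log(s), 4) for s in scores]

for fold, (score, log_score) in enumerate(zip(scores, log_scores), start=1):
    print(fold, score, log_score)
```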
Formula used
Prior parameters: α₀ = prior mean × prior strength, and β₀ = (1 − prior mean) × prior strength.
Posterior update: α = α₀ + Σ fold scores, and β = β₀ + n − Σ fold scores.
Posterior mean: μ = α / (α + β).
Posterior variance: Var(μ) = αβ / [(α + β)² (α + β + 1)].
Credible interval: μ ± z × √Var(μ), a normal approximation clipped to the unit interval.
ELPD proxy: Σ log(scoreᵢ) / temperature, scaled by fold count.
WAIC: −2 × (ELPD − variance of fold log scores).
Expected risk: (1 − posterior mean) + complexity penalty / 100.
Stability index: 100 × [1 − min(1, sample deviation × √n × temperature)].
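The posterior and stability formulas above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the calculator's own code: the function name `beta_summary` is hypothetical, the prior mean of 0.75 and prior strength of 10 in the usage lines are made-up inputs, and "sample deviation" is read as the n − 1 standard deviation.

```python
import math
import statistics

def beta_summary(scores, prior_mean, prior_strength, z=1.96, temperature=1.0):
    """Posterior summary following the formulas listed above (hypothetical helper)."""
    n = len(scores)
    total = sum(scores)

    # Prior pseudo-counts.
    alpha0 = prior_mean * prior_strength
    beta0 = (1 - prior_mean) * prior_strength

    # Posterior update: each fold score counts as a fractional success.
    alpha = alpha0 + total
    beta = beta0 + n - total

    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

    # Interval of the form mean +/- z * sd, clipped to [0, 1].
    half_width = z * math.sqrt(var)
    lower = max(0.0, mean - half_width)
    upper = min(1.0, mean + half_width)

    # Stability index; assumes "sample deviation" means the n-1 standard deviation.
    spread = statistics.stdev(scores) * math.sqrt(n) * temperature
    stability = 100 * (1 - min(1.0, spread))

    return {
        "alpha": alpha,
        "beta": beta,
        "mean": mean,
        "variance": var,
        "interval": (lower, upper),
        "stability": stability,
    }

# Fold scores from the example table; the prior settings are assumed values.
summary = beta_summary([0.84, 0.81, 0.86, 0.80, 0.83], prior_mean=0.75, prior_strength=10)
print(summary)
```

With these assumed inputs, α = 11.64 and β = 3.36, so the posterior mean is 0.776. That is lower than the raw fold average of 0.828 because the prior pulls the estimate toward 0.75.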
How to use this calculator
- Enter a model name for easy result tracking.
- Paste fold validation scores between 0 and 1.
- Set a prior mean that reflects earlier belief.
- Use prior strength to control belief influence.
- Choose a decision threshold for acceptance testing.
- Add a complexity penalty if model size matters.
- Adjust likelihood temperature to soften harsh evidence.
- Press submit to show the result above the form.
- Use CSV or PDF buttons to export your summary.
Why this calculator helps
This calculator blends prior expectations with observed fold scores. It helps compare mean performance, uncertainty width, predictive fit, and penalty-adjusted risk in one place. The result is useful when you want a more cautious validation summary than a plain average of fold scores provides.
FAQs
1. What does Bayesian cross validation measure?
It estimates model performance by combining fold outcomes with prior belief. This approach reports posterior mean performance, uncertainty, and decision-oriented metrics instead of relying only on raw fold averages.
2. Why use a prior mean?
A prior mean lets you encode earlier evidence, expert judgment, or historical benchmark behavior. Stronger priors have more influence, while weaker priors allow the fold data to dominate.
3. What is prior strength?
Prior strength behaves like pseudo-observations. A larger value makes the posterior stay closer to the prior. A smaller value makes the posterior respond more directly to current folds.
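The pseudo-observation effect is easy to see numerically. The sketch below reuses the posterior update from the formula section with the five example fold scores and an assumed prior mean of 0.5; the function name and the strength values are illustrative only.

```python
def posterior_mean(scores, prior_mean, prior_strength):
    """Posterior mean under the Beta-style update (hypothetical helper)."""
    total = sum(scores)
    alpha = prior_mean * prior_strength + total
    beta = (1 - prior_mean) * prior_strength + len(scores) - total
    return alpha / (alpha + beta)

scores = [0.84, 0.81, 0.86, 0.80, 0.83]  # raw fold average is 0.828

# Larger prior strength pulls the posterior mean toward the prior mean of 0.5.
for strength in (1, 10, 100):
    print(strength, round(posterior_mean(scores, 0.5, strength), 4))
```

Because α + β equals prior strength plus the fold count, a strength of 10 against 5 folds gives the prior twice the weight of the data.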
4. Why must fold scores stay between 0 and 1?
The calculator uses a Beta-style posterior update. That framework assumes bounded scores in the unit interval, such as normalized accuracy, probability, or scaled validation quality.
5. What does WAIC mean here?
WAIC is a penalty-aware predictive score. Lower values usually indicate better out-of-sample behavior after accounting for variation in fold-level predictive fit.
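A rough sketch of the WAIC formula from the formula section follows. Two points are assumptions rather than documented behavior: the fold-count scaling of the ELPD proxy is not specified on this page, so it is left out, and the variance is taken as the sample (n − 1) variance. The function name is hypothetical.

```python
import math
import statistics

def waic_proxy(scores, temperature=1.0):
    """WAIC-style score: -2 * (ELPD proxy - variance of fold log scores)."""
    log_scores = [math.log(s) for s in scores]

    # ELPD proxy: tempered sum of fold log scores.
    # The fold-count scaling is unspecified in the source, so it is omitted here.
    elpd = sum(log_scores) / temperature

    # Penalty term; sample variance (n - 1) is an assumption.
    penalty = statistics.variance(log_scores)

    return -2 * (elpd - penalty)

print(waic_proxy([0.84, 0.81, 0.86, 0.80, 0.83]))
```

Higher fold scores raise the ELPD proxy and lower the result, while more spread across folds raises it, which matches the "lower is better" reading.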
6. What is the probability above threshold?
It is the posterior probability that model performance exceeds your chosen decision threshold. This is useful when approval depends on clearing a minimum predictive standard.
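The page does not say how this probability is computed. One plausible reading is the upper tail of the Beta(α, β) posterior, which can be approximated with the standard library alone. In the sketch below the function names, the step count, and the example values α = 11.64, β = 3.36, and threshold 0.80 are all assumptions for illustration.

```python
import math

def beta_pdf(x, a, b):
    """Beta density at x in (0, 1), computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def prob_above(threshold, a, b, steps=20000):
    """Approximate P(X > threshold) for X ~ Beta(a, b) with the midpoint rule.

    Midpoints never touch 0 or 1, so the logs stay finite.
    Intended for a >= 1 and b >= 1, where the density is bounded.
    """
    width = (1 - threshold) / steps
    area = sum(beta_pdf(threshold + (i + 0.5) * width, a, b) for i in range(steps))
    return area * width

# Assumed posterior parameters and decision threshold.
print(prob_above(0.80, 11.64, 3.36))
```

If SciPy is available, `scipy.stats.beta.sf(threshold, a, b)` computes the same tail probability directly.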
7. How should I set complexity penalty?
Use a higher penalty when larger models cost more, overfit more easily, or are harder to deploy. Keep it near zero when complexity has little practical downside.
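As a quick illustration of the expected risk formula, the sketch below shows how the penalty shifts the result. The posterior mean of 0.776 and the penalty values are assumed inputs, and the function name is hypothetical.

```python
def expected_risk(posterior_mean, complexity_penalty):
    """Expected risk: (1 - posterior mean) + complexity penalty / 100."""
    return (1 - posterior_mean) + complexity_penalty / 100

# Each penalty point adds 0.01 to the risk.
for penalty in (0, 5, 20):
    print(penalty, round(expected_risk(0.776, penalty), 3))
```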
8. Can I export the results?
Yes. After calculation, use the CSV button for spreadsheet review or the PDF button for a clean printable summary of the displayed metrics.