Analyze forecast probabilities and outcomes with scoring tools. Compare skill scores, calibration bins, and weighted averages across predictions, and use the error breakdown to build clearer forecast judgments.
Enter probabilities as decimals or percentages. Outcomes must be 0 or 1.
This sample dataset matches the default example loaded into the form.
| Case | Predicted Probability | Observed Outcome | Weight |
|---|---|---|---|
| 1 | 0.05 | 0 | 1 |
| 2 | 0.15 | 0 | 1 |
| 3 | 0.25 | 1 | 1 |
| 4 | 0.40 | 0 | 1 |
| 5 | 0.55 | 1 | 1 |
| 6 | 0.65 | 1 | 1 |
| 7 | 0.72 | 0 | 1 |
| 8 | 0.81 | 1 | 1 |
| 9 | 0.90 | 1 | 1 |
| 10 | 0.96 | 1 | 1 |
Primary Brier Score Formula
BS = (1 / N) × Σ (pᵢ - oᵢ)²
Here, pᵢ is the predicted probability and oᵢ is the observed binary outcome.
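As a minimal sketch of this formula in plain Python (the function name `brier_score` is our own, not part of any library), applied to the sample dataset above:

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Sample dataset from the table above
probs = [0.05, 0.15, 0.25, 0.40, 0.55, 0.65, 0.72, 0.81, 0.90, 0.96]
outcomes = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

print(round(brier_score(probs, outcomes), 4))  # → 0.1639
```

The largest single contribution comes from case 7 (0.72 predicted, event did not occur), which alone adds 0.5184 to the sum before averaging.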
Weighted Version
BS = Σ[wᵢ × (pᵢ - oᵢ)²] / Σwᵢ
Weights help emphasize more important observations or larger segments.
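The weighted version can be sketched the same way (again a hypothetical helper, not a library call); with all weights equal to 1 it reduces to the plain score:

```python
def weighted_brier_score(probs, outcomes, weights):
    """Weighted Brier score: sum(w * (p - o)^2) / sum(w)."""
    num = sum(w * (p - o) ** 2 for p, o, w in zip(probs, outcomes, weights))
    return num / sum(weights)

probs = [0.05, 0.15, 0.25, 0.40, 0.55, 0.65, 0.72, 0.81, 0.90, 0.96]
outcomes = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

# Unit weights reproduce the unweighted score; doubling the weight of
# case 7 (the worst miss) pulls the score upward.
uniform = [1] * 10
emphasized = [1, 1, 1, 1, 1, 1, 2, 1, 1, 1]
print(round(weighted_brier_score(probs, outcomes, uniform), 4))     # → 0.1639
print(round(weighted_brier_score(probs, outcomes, emphasized), 4))  # → 0.1961
```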
Brier Skill Score
BSS = 1 - (BS / BSref)
The reference score usually comes from climatology or another benchmark probability.
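A climatology reference simply forecasts the base rate (the observed event frequency) for every case. A sketch of the skill score under that default, using our own illustrative function name:

```python
def brier_skill_score(probs, outcomes, ref_prob=None):
    """BSS = 1 - BS / BSref; the default reference is climatology (the base rate)."""
    n = len(probs)
    bs = sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / n
    if ref_prob is None:
        ref_prob = sum(outcomes) / n  # climatological base rate
    bs_ref = sum((ref_prob - o) ** 2 for o in outcomes) / n
    return 1 - bs / bs_ref

probs = [0.05, 0.15, 0.25, 0.40, 0.55, 0.65, 0.72, 0.81, 0.90, 0.96]
outcomes = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

# Base rate is 0.6, so BSref = 0.24; the sample forecasts beat climatology.
print(round(brier_skill_score(probs, outcomes), 4))  # → 0.3173
```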
Murphy Decomposition
BS = Reliability - Resolution + Uncertainty
This splits total error into calibration quality (reliability), discrimination power (resolution), and inherent event uncertainty. The identity holds exactly when forecasts are grouped into discrete probability bins.
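One way to sketch the decomposition is to bin forecasts by their unique probability values, under which the identity BS = Reliability - Resolution + Uncertainty holds exactly (the function name is ours, for illustration):

```python
from collections import defaultdict

def murphy_decomposition(probs, outcomes):
    """Return (reliability, resolution, uncertainty), binning by unique forecast value."""
    n = len(probs)
    bins = defaultdict(list)
    for p, o in zip(probs, outcomes):
        bins[p].append(o)
    base_rate = sum(outcomes) / n
    # Reliability: squared gap between each bin's forecast and its observed frequency
    reliability = sum(
        len(os) * (p - sum(os) / len(os)) ** 2 for p, os in bins.items()
    ) / n
    # Resolution: how far each bin's observed frequency sits from the base rate
    resolution = sum(
        len(os) * (sum(os) / len(os) - base_rate) ** 2 for os in bins.values()
    ) / n
    # Uncertainty: variance of the binary outcome itself
    uncertainty = base_rate * (1 - base_rate)
    return reliability, resolution, uncertainty
```

On the sample data every forecast value is unique, so each bin holds one case; reliability minus resolution plus uncertainty still reproduces the Brier score exactly.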
The Brier score measures the average squared difference between predicted probabilities and actual binary outcomes. Lower values indicate better forecasting accuracy, because smaller gaps mean predictions better matched what really happened.
A good score depends on the problem and event frequency. Zero is perfect. Scores closer to zero are better, while higher values show larger prediction error. Comparing against a benchmark or historical model is usually more informative than using a single cutoff.
Brier skill score shows whether your forecast beats a reference model. Positive values mean improvement over the benchmark. A value near zero means similar performance, while negative values indicate your forecast performed worse than the reference.
Yes. The calculator accepts either decimals like 0.72 or percentages like 72. When values are greater than 1 and no more than 100, the tool converts them automatically into decimal probabilities.
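The calculator's exact input-handling code isn't published; a sketch of the conversion rule as described above might look like this:

```python
def normalize_probability(value):
    """Treat values in (1, 100] as percentages; values in [0, 1] pass through."""
    if 0 <= value <= 1:
        return value
    if 1 < value <= 100:
        return value / 100
    raise ValueError("probability must be in [0, 1] or (1, 100]")

print(normalize_probability(72))    # → 0.72
print(normalize_probability(0.72))  # → 0.72
```

Note one ambiguity in any rule like this: an input of exactly 1 is read as the decimal probability 1.0, not as 1%.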
The standard Brier score is designed for binary events. An outcome of 1 means the event occurred, and 0 means it did not. This keeps the interpretation clear and consistent for probabilistic event forecasting.
Reliability measures calibration quality, showing whether stated probabilities align with observed frequencies. Resolution measures how well forecasts separate cases with different observed outcome frequencies. Better forecasts usually have low reliability error and strong resolution.
Use weights when some forecasts represent more cases, bigger impacts, or stronger importance. Weighting lets the score reflect business value, sample size, or operational priority instead of treating every row as equally influential.
The graph compares forecast probabilities, actual outcomes, a benchmark reference line, and calibration points. This helps you see sharpness, consistency, and areas where your probabilities may be overconfident or underconfident.
Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.