Probabilistic Forecast Tool Calculator

Inputs

Distribution

Use lognormal for strictly positive targets.

Forecast mean

For normal, this is your point estimate. For lognormal, enter μ on the log scale.

Forecast SD

Larger SD means wider uncertainty bands. For lognormal, enter σ on the log scale.

Observed value (optional)

Adds scores like log-likelihood and Brier.

Confidence levels

Hold Ctrl/Command to select multiple.

Quantiles (p values)

Comma-separated; accepts 0.9 or 90.

Event threshold

Used for event probability and Brier score.

Event definition

Choose the risk side you care about.

Scenario note (optional)

Appears in the exported reports.

Example Data Table

Use this sample to verify your setup and expected outputs.

Distribution	Mean / μ(log)	SD / σ(log)	Threshold	Expected P(event)	Typical 90% Interval
Normal	100	15	120	≈ 9.12%	[75.33, 124.67]
Lognormal	4.60	0.25	120	≈ 22.66%	[65.94, 150.09]

For normal, the 90% interval uses quantiles 0.05 and 0.95. Expected P(event) is for the event Y > threshold.
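
The normal row of the sample table can be reproduced outside the calculator as a quick sanity check. The sketch below uses Python's standard-library statistics.NormalDist and assumes the event is defined as "above threshold"; the variable names are illustrative and not part of the tool.

```python
from statistics import NormalDist

# Normal row of the sample table: mean 100, SD 15, threshold 120.
forecast = NormalDist(mu=100, sigma=15)

# Event probability for the "above threshold" side: P(Y > t) = 1 - F(t).
p_event = 1 - forecast.cdf(120)

# The 90% interval uses quantiles 0.05 and 0.95.
lower = forecast.inv_cdf(0.05)
upper = forecast.inv_cdf(0.95)

print(f"P(Y > 120) = {p_event:.2%}")
print(f"90% interval = [{lower:.2f}, {upper:.2f}]")
```

If your calculator output differs noticeably from these values, check the event definition and whether the SD was entered in the right units.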

Formula Used

Quantile (Normal): Q(p) = μ + σ · Φ⁻¹(p)
Quantile (Lognormal): Q(p) = exp( μ + σ · Φ⁻¹(p) )
Prediction interval: [Q(α/2), Q(1−α/2)] where α = 1 − confidence
Event probability: P(Y > t) = 1 − F(t), or P(Y < t) = F(t)
Negative log-likelihood: −log f(y) from the chosen density
Pinball loss: (y−q)·(τ−I[y<q]) for quantile τ and forecast q
CRPS (Normal): σ·( z(2Φ(z)−1)+2φ(z)−1/√π ), z=(y−μ)/σ
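Brier score: (p − o)², where p is the forecast event probability and o is the 0/1 event outcome
Notation: Φ and φ are the standard normal CDF and PDF; F and f are the CDF and density of the chosen distribution; I[·] is 1 when the condition holds and 0 otherwise. For lognormal, μ and σ are on the log scale.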
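
The scoring formulas above can be sketched in a few lines of Python for a normal forecast. This is an illustration of the formulas as written, not the calculator's own code; the function names are assumptions, and the negative log-likelihood uses the closed form of −log f(y) for the normal density.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: cdf is Φ, pdf is φ

def nll_normal(y, mu, sigma):
    """Negative log-likelihood, -log f(y), for Normal(mu, sigma)."""
    return 0.5 * math.log(2 * math.pi * sigma**2) + (y - mu) ** 2 / (2 * sigma**2)

def pinball_loss(y, q, tau):
    """Pinball loss (y - q) * (tau - I[y < q]) for quantile level tau."""
    indicator = 1.0 if y < q else 0.0
    return (y - q) * (tau - indicator)

def crps_normal(y, mu, sigma):
    """CRPS for Normal(mu, sigma), with z = (y - mu) / sigma."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * STD.cdf(z) - 1) + 2 * STD.pdf(z) - 1 / math.sqrt(math.pi))

def brier_score(p, outcome):
    """Brier score (p - o)^2 for a 0/1 event outcome."""
    return (p - outcome) ** 2

# Example: forecast Normal(100, 15), observed 110, event "Y > 120" did not occur.
p_event = 1 - NormalDist(100, 15).cdf(120)
print(nll_normal(110, 100, 15))
print(pinball_loss(110, NormalDist(100, 15).inv_cdf(0.95), 0.95))
print(crps_normal(110, 100, 15))
print(brier_score(p_event, 0))
```

All four scores are "lower is better", so they can be tracked side by side when comparing candidate forecasts on the same data.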

How to Use This Calculator

Select a distribution that matches your target behavior.
Enter mean and uncertainty (SD), then choose confidence levels.
Set quantiles you want reported, including tails for risk.
Add a threshold to compute event probability for decisions.
Optionally enter an observed value to evaluate forecast quality.
Click Calculate, then export CSV or PDF for reporting.

Why probabilistic forecasting matters

Single-number forecasts hide risk. A distribution communicates expected value and uncertainty, so teams can price safety buffers, set alert thresholds, and quantify downside exposure. A calibrated 90% interval should contain outcomes about nine times out of ten, which is more actionable than a vague “high confidence” label.

Choosing distribution parameters

The mean is the center of your forecast distribution, typically the model's point forecast. The standard deviation controls spread and should be estimated from recent residuals by horizon, not intuition alone. If the target is strictly positive and skewed, a lognormal assumption can better match demand, latency, or cost behavior. Use consistent units and do not mix log-space parameters with real-scale expectations: for lognormal, μ and σ describe log(Y), so exp(μ) is the median and the real-scale mean is exp(μ + σ²/2).
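
If you only know a real-scale mean and SD but want to use the lognormal option, one common approach is moment matching: pick the log-scale μ and σ whose lognormal distribution has that mean and SD. The sketch below is an assumption about how you might prepare inputs, not something the calculator does for you; the function name is hypothetical.

```python
import math

def lognormal_params_from_real_scale(mean, sd):
    """Convert a real-scale mean and SD into log-scale (mu, sigma) by moment matching."""
    if mean <= 0 or sd <= 0:
        raise ValueError("mean and sd must be positive")
    sigma_sq = math.log(1 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2
    return mu, math.sqrt(sigma_sq)

mu_log, sigma_log = lognormal_params_from_real_scale(100, 15)

# Round trip: the real-scale mean implied by the log-scale parameters.
implied_mean = math.exp(mu_log + sigma_log**2 / 2)
print(mu_log, sigma_log, implied_mean)
```

Note that exp(mu_log) is the median, which sits below the real-scale mean for a right-skewed lognormal.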

Interpreting prediction intervals and quantiles

Each confidence level maps to two quantiles: Q(α/2) and Q(1−α/2), where α = 1 − confidence. Interval width is a quick proxy for uncertainty; narrower is preferable only if coverage remains reliable. Compare widths across candidate models using the same quantile set. Median and interquartile range summarize typical dispersion, while 5th and 95th percentiles reveal tail risk relevant to service levels.
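
To see how confidence level drives interval width, the mapping from confidence to quantile pair can be sketched as follows. The helper name and the lognormal flag are illustrative assumptions; the lognormal case simply exponentiates the normal quantiles of the log-scale parameters.

```python
import math
from statistics import NormalDist

def prediction_interval(mu, sigma, confidence, lognormal=False):
    """Return (Q(alpha/2), Q(1 - alpha/2)) with alpha = 1 - confidence."""
    alpha = 1 - confidence
    base = NormalDist(mu, sigma)
    low = base.inv_cdf(alpha / 2)
    high = base.inv_cdf(1 - alpha / 2)
    if lognormal:
        # mu and sigma are on the log scale, so map quantiles back with exp.
        low, high = math.exp(low), math.exp(high)
    return low, high

for confidence in (0.50, 0.80, 0.90, 0.95):
    low, high = prediction_interval(100, 15, confidence)
    print(f"{confidence:.0%}: [{low:.2f}, {high:.2f}] width {high - low:.2f}")

# Lognormal example with log-scale parameters.
print(prediction_interval(4.60, 0.25, 0.90, lognormal=True))
```

Comparing widths at the same confidence level across models is only meaningful when their coverage is similar.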

Evaluating forecast quality with proper scores

When you provide an observed value, the tool returns proper scoring rules that reward honest uncertainty. Negative log-likelihood strongly penalizes overconfident densities that miss outcomes. Pinball loss evaluates quantile forecasts and exposes asymmetric errors, such as consistent underprediction in upper tails. Brier score evaluates threshold events, enabling principled alert tuning and decision calibration. For normal forecasts, CRPS gives a single, scale-aware accuracy number.
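
A small numeric comparison shows how negative log-likelihood treats overconfidence. The sketch below scores two normal forecasts with the same mean but different SDs, once when the outcome lands far from the mean and once when it lands close; the numbers are made up for illustration.

```python
import math

def nll_normal(y, mu, sigma):
    """Negative log-likelihood of y under Normal(mu, sigma)."""
    return 0.5 * math.log(2 * math.pi * sigma**2) + (y - mu) ** 2 / (2 * sigma**2)

# Outcome far from the mean: the narrow (overconfident) forecast is penalized heavily.
miss_narrow = nll_normal(130, mu=100, sigma=5)
miss_wide = nll_normal(130, mu=100, sigma=15)

# Outcome close to the mean: the narrow forecast is rewarded for sharpness.
hit_narrow = nll_normal(101, mu=100, sigma=5)
hit_wide = nll_normal(101, mu=100, sigma=15)

print(f"miss: narrow {miss_narrow:.2f} vs wide {miss_wide:.2f}")
print(f"hit:  narrow {hit_narrow:.2f} vs wide {hit_wide:.2f}")
```

The penalty for a confident miss is much larger than the reward for a confident hit, which is why the score discourages understating SD.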

Operational decision workflows

Use event probability to estimate the chance demand exceeds a staffing limit or costs exceed a budget cap. Use the median for typical planning, then choose upper quantiles to size capacity and contingency. Export intervals and quantiles to reports, and monitor score trends weekly. Rising scores, widening intervals, or shifted probabilities can signal drift, prompting retraining, feature refresh, or a new uncertainty model. For backtesting, store observed values alongside forecast parameters and recompute scores on a rolling window. If 90% intervals capture only 75% of outcomes, inflate SD or recalibrate. If they capture 99%, you may be too conservative and can tighten uncertainty to improve sharpness without sacrificing coverage.
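
The coverage check described above can be sketched as a short backtest. The example assumes normal forecasts stored as (mean, SD, observed) tuples; the function name, the window argument, and the sample rows are hypothetical.

```python
from statistics import NormalDist

def interval_coverage(records, confidence=0.90, window=None):
    """Share of observed values inside the central interval.

    records: list of (mu, sigma, observed) tuples, oldest first.
    window: if given, only the most recent `window` records are used.
    """
    recent = records[-window:] if window else records
    alpha = 1 - confidence
    hits = 0
    for mu, sigma, observed in recent:
        dist = NormalDist(mu, sigma)
        low = dist.inv_cdf(alpha / 2)
        high = dist.inv_cdf(1 - alpha / 2)
        if low <= observed <= high:
            hits += 1
    return hits / len(recent)

# Hypothetical backtest rows: (forecast mean, forecast SD, observed value).
history = [(100, 15, 104), (98, 15, 131), (102, 15, 95), (101, 15, 70), (99, 15, 110)]
coverage = interval_coverage(history, confidence=0.90)
print(f"Empirical coverage of 90% intervals: {coverage:.0%}")
```

With so few rows the estimate is very noisy; in practice a longer window is needed before inflating or tightening SD.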

FAQs

What inputs should I use for standard deviation?

Use recent forecast errors for the same horizon. Compute residuals (actual minus mean forecast), then use their standard deviation. Update regularly, and consider separate SD values for weekdays, seasons, or segments if error behavior changes.
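
As a minimal sketch of that calculation, assuming you have matched lists of actuals and mean forecasts for one horizon (the values below are made up for illustration):

```python
from statistics import mean, stdev

# Hypothetical history for a single forecast horizon.
actuals = [104, 96, 110, 91, 99]
forecasts = [100, 100, 100, 100, 100]

# Residual = actual minus mean forecast.
residuals = [a - f for a, f in zip(actuals, forecasts)]

bias = mean(residuals)          # far from zero suggests a biased forecast
forecast_sd = stdev(residuals)  # sample standard deviation of the residuals

print(f"bias = {bias:.2f}, SD = {forecast_sd:.2f}")
```

The sample standard deviation measures spread around the average residual, so a large bias should be corrected in the mean forecast rather than absorbed into SD.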

When should I choose lognormal instead of normal?

Choose lognormal when the target cannot be negative and shows right skew, such as demand, durations, or costs. Enter μ and σ on the log scale; the tool then produces positive-only quantiles and intervals.

Why do my confidence intervals look too wide?

Intervals widen when SD is large or confidence is high. Check units, horizon alignment, and outliers in residuals. If your model is calibrated and still wide, the process may truly be volatile, and operational buffers should reflect that.

How do I interpret the negative log-likelihood score?

Lower is better. It penalizes forecasts that assign low density to what actually happened, especially when the distribution is narrow. Compare scores across models on the same dataset; avoid comparing across different targets or units.

What does pinball loss tell me about tail performance?

Pinball loss evaluates quantiles. If upper-tail losses are consistently high, the model underestimates extreme outcomes. Adjust features, recalibrate, or increase uncertainty. Use multiple τ values to diagnose where the distribution misses.
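
One way to run that diagnosis is to average pinball loss per τ over a backtest. The sketch below assumes normal forecasts stored as (mean, SD, observed) tuples; the function name and sample rows are hypothetical.

```python
from statistics import NormalDist, mean

def mean_pinball_by_tau(records, taus):
    """Average pinball loss per quantile level over (mu, sigma, observed) records."""
    result = {}
    for tau in taus:
        losses = []
        for mu, sigma, observed in records:
            q = NormalDist(mu, sigma).inv_cdf(tau)
            indicator = 1.0 if observed < q else 0.0
            losses.append((observed - q) * (tau - indicator))
        result[tau] = mean(losses)
    return result

# Hypothetical backtest rows: (forecast mean, forecast SD, observed value).
history = [(100, 15, 104), (98, 15, 131), (102, 15, 95), (101, 15, 70), (99, 15, 110)]
by_tau = mean_pinball_by_tau(history, taus=(0.1, 0.5, 0.9))
for tau, loss in by_tau.items():
    print(f"tau = {tau}: mean pinball loss = {loss:.3f}")
```

Losses at different τ are not directly comparable in size, so compare each τ against the same τ from another model or an earlier period.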

How should I use event probability in decisions?

Treat it as a risk estimate for crossing a threshold, like exceeding capacity. Combine it with impact to form expected loss. Track Brier score to ensure probabilities are calibrated; poor calibration can cause too many false alarms or missed events.
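
A rough sketch of the expected-loss calculation, assuming a normal forecast and a single fixed impact cost when the threshold is exceeded (both costs below are invented for illustration):

```python
from statistics import NormalDist

def expected_loss(mu, sigma, threshold, impact_cost):
    """Expected loss = P(Y > threshold) * cost incurred if the event happens."""
    p_exceed = 1 - NormalDist(mu, sigma).cdf(threshold)
    return p_exceed * impact_cost

# Hypothetical numbers: exceeding capacity costs 10,000; adding capacity costs 500.
loss = expected_loss(mu=100, sigma=15, threshold=120, impact_cost=10_000)
mitigation_cost = 500

print(f"Expected loss: {loss:.0f}")
print("Add capacity" if loss > mitigation_cost else "Accept the risk")
```

This simple rule only holds if the event probability is well calibrated, which is what the Brier score helps you monitor.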