| Scenario | μ₁ | σ₁ | μ₂ | σ₂ | P₁ | P₂ | C₁₂ | C₂₁ | Interpretation |
|---|---|---|---|---|---|---|---|---|---|
| Quality pass vs. fail | 0.0 | 1.0 | 2.0 | 1.5 | 0.70 | 0.30 | 1 | 4 | False pass is costly; shift boundary stricter. |
| Control vs. treatment | 10.0 | 2.0 | 12.0 | 2.0 | 0.50 | 0.50 | 1 | 1 | Equal variances; one clean threshold. |
| Signal vs. noise | -1.0 | 0.8 | 1.2 | 0.5 | 0.85 | 0.15 | 2 | 1 | Prefer catching signal; accept some false alarms. |
Let class-conditional densities be f₁(x)=𝒩(μ₁,σ₁²) and f₂(x)=𝒩(μ₂,σ₂²),
with priors P₁, P₂, and misclassification costs C₁₂, C₂₁.
The boundary occurs where the expected conditional risks are equal: C₁₂·P₁·f₁(x) = C₂₁·P₂·f₂(x).
Define k = (C₂₁·P₂·σ₁)/(C₁₂·P₁·σ₂). Taking logs, the boundary satisfies (x−μ₂)²/(2σ₂²) − (x−μ₁)²/(2σ₁²) = ln k, which yields:
- Equal variances (σ₁ = σ₂): a single linear threshold.
- Unequal variances: a quadratic equation that can yield 0, 1, or 2 real thresholds.
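The cases above can be sketched as a small solver. This is an illustrative implementation, not the calculator's actual code; the function name `decision_thresholds` is assumed. It expands the log equality into quadratic coefficients and falls back to the linear case when the variances match.

```python
import math

def decision_thresholds(mu1, s1, mu2, s2, p1, p2, c12, c21):
    """Solve C12*P1*f1(x) = C21*P2*f2(x) for two normal densities.

    Taking logs gives a*x^2 + b*x + c = 0 with the coefficients below;
    a == 0 (equal variances) reduces to a single linear threshold.
    """
    k = (c21 * p2 * s1) / (c12 * p1 * s2)
    a = 1.0 / (2 * s2**2) - 1.0 / (2 * s1**2)
    b = mu1 / s1**2 - mu2 / s2**2
    c = mu2**2 / (2 * s2**2) - mu1**2 / (2 * s1**2) - math.log(k)
    if abs(a) < 1e-12:                       # equal variances: linear case
        return [] if abs(b) < 1e-12 else [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:                             # scaled densities never tie
        return []
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])

# Equal variances, equal priors and costs: one threshold midway between means.
print(decision_thresholds(0.0, 1.0, 2.0, 1.0, 0.5, 0.5, 1, 1))  # [1.0]
```

With unequal variances (for example the signal-vs-noise row of the table), the same call returns two thresholds.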
- Enter μ and σ for both classes from your data.
- Set P₁ and P₂ to reflect base rates.
- Choose C₁₂ and C₂₁ to match real impact.
- Click Compute decision boundary to get thresholds.
- Review the decision rule and the density chart.
- Export CSV or PDF to document your assumptions.
In this calculator, a decision boundary is the x value where scaled evidence for two classes is equal. Evidence combines the normal density f(x) with business context through priors and costs. When you compute a threshold, you are selecting the cut point that minimizes expected loss under the specified assumptions.
The means μ1 and μ2 set separation, while σ1 and σ2 control overlap. As overlap increases, the boundary becomes more sensitive to priors and costs. If P2 grows relative to P1, the boundary moves toward the class 1 mean, expanding the class 2 region and reducing missed class 2 cases. If C21 increases, mislabeling class 2 items becomes more expensive, so the boundary becomes stricter against predicting class 1.
When σ1 equals σ2, the log likelihood ratio is linear in x, so the boundary is a single threshold. With μ1=0, μ2=2, σ=1, equal priors, and equal costs, the boundary sits at x=1.00, midway between the means. Changing priors to P1=0.70 and P2=0.30 shifts the cut higher, to about x=1.42, widening the class 1 region.
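These numbers follow from the equal-variance closed form x* = (μ1+μ2)/2 − σ²·ln k/(μ2−μ1), where k = (C21·P2)/(C12·P1) since the σ terms cancel. A short check (the function name is illustrative):

```python
import math

def equal_sigma_threshold(mu1, mu2, sigma, p1, p2, c12=1.0, c21=1.0):
    # Closed form when sigma1 == sigma2: the log likelihood ratio is linear in x.
    k = (c21 * p2) / (c12 * p1)   # sigma terms cancel when variances match
    return (mu1 + mu2) / 2 - sigma**2 * math.log(k) / (mu2 - mu1)

print(round(equal_sigma_threshold(0.0, 2.0, 1.0, 0.5, 0.5), 2))  # 1.0
print(round(equal_sigma_threshold(0.0, 2.0, 1.0, 0.7, 0.3), 2))  # 1.42
```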
When σ1 differs from σ2, the equality becomes quadratic and can produce zero, one, or two real solutions. Two thresholds mean the preferred class can switch twice as x increases. This often occurs when one class is narrow and peaked while the other is wider. Read the rule text to see which intervals map to each class.
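The two-threshold case can be seen numerically with the signal-vs-noise row of the table, where the narrow class 2 sits inside the wider class 1. This sketch simply scans a grid for sign changes in the difference of the scaled densities; the helper names are assumptions, not the calculator's API.

```python
import math

def scaled_density(x, mu, sigma, weight):
    # weight = cost * prior; Gaussian pdf scaled by business context
    return weight * math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def crossings(mu1, s1, mu2, s2, w1, w2, lo=-6.0, hi=8.0, steps=14000):
    # Find x values where the two scaled densities tie (sign change of the difference).
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    diff = [scaled_density(x, mu1, s1, w1) - scaled_density(x, mu2, s2, w2) for x in xs]
    return [round(xs[i], 2) for i in range(steps) if diff[i] * diff[i + 1] < 0]

# Signal vs. noise row: mu1=-1, s1=0.8, mu2=1.2, s2=0.5, weights C12*P1 and C21*P2.
print(crossings(-1.0, 0.8, 1.2, 0.5, 2 * 0.85, 1 * 0.15))
```

Two crossings appear, near 0.74 and 4.48: class 2 is optimal only in the middle interval, and the wider class 1 wins in both tails.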
Priors represent base rates from production, sampling, or historical logs. Costs translate consequences like refunds, safety incidents, or missed detections. The calculator uses k=(C21·P2·σ1)/(C12·P1·σ2); increasing k weights the evidence toward class 2, moving the boundary toward the class 1 mean and shrinking the class 1 region. Document priors and costs, then export a report to keep choices auditable.
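For reference, k for each row of the table above (the function name is illustrative):

```python
def k_factor(p1, p2, c12, c21, s1, s2):
    # k > 1 weights evidence toward class 2; k < 1 toward class 1.
    return (c21 * p2 * s1) / (c12 * p1 * s2)

print(k_factor(0.70, 0.30, 1, 4, 1.0, 1.5))  # quality pass vs. fail
print(k_factor(0.50, 0.50, 1, 1, 2.0, 2.0))  # control vs. treatment -> 1.0
print(k_factor(0.85, 0.15, 2, 1, 0.8, 0.5))  # signal vs. noise -> below 1
```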
Validate inputs by estimating μ and σ from labeled samples and checking fit with normality diagnostics. Then stress test: vary priors by ±20% and costs by 2× to see how stable the thresholds are. If they move drastically, improve measurement quality or gather more labels. Store the exported CSV or PDF with model notes for traceable governance.
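The stress test can be sketched as a small grid sweep. This example uses the control-vs-treatment row with the equal-variance closed form; the `threshold` helper is an assumption for illustration.

```python
import math

def threshold(mu1, mu2, sigma, p1, c12, c21):
    # Equal-variance closed form; p2 = 1 - p1.
    k = (c21 * (1 - p1)) / (c12 * p1)
    return (mu1 + mu2) / 2 - sigma**2 * math.log(k) / (mu2 - mu1)

results = []
for p1 in (0.4, 0.5, 0.6):          # priors varied by +/-20%
    for c21 in (0.5, 1.0, 2.0):     # cost ratio varied by 2x each way
        results.append(threshold(10.0, 12.0, 2.0, p1, 1.0, c21))
print(round(min(results), 2), round(max(results), 2))  # 8.8 13.2
```

Here the threshold ranges over more than four units around the nominal cut at 11.0, a sign that this scenario's decision is quite sensitive to the assumed priors and costs.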
1) Why can there be two thresholds?
With unequal variances, the boundary equation is quadratic. The preferred class can switch twice as x increases, creating a middle interval where the other class is optimal.
2) What do priors change in the result?
Priors scale each density by base rate. Increasing a class prior increases its influence, shifting the boundary to reduce expected errors against that class under the same costs.
3) How should I pick misclassification costs?
Translate consequences into relative penalties. Use higher cost for the error you want to avoid most, such as safety misses or expensive false approvals.
4) What if the calculator shows no real boundary?
The scaled densities may never intersect, meaning one class dominates for all x in the plotted range. Verify parameters and consider whether the normal assumption or ranges are appropriate.
5) Does the plot affect the threshold?
No. The threshold is computed analytically from μ, σ, priors, and costs. The plot only visualizes the two PDFs and the computed boundary lines.
6) Can I use this for real data that is not normal?
You can as an approximation, but accuracy depends on fit. If distributions are skewed or multimodal, consider transforming data or using a nonparametric density estimate and recomputing boundaries.