Calculator Inputs
Enter confusion matrix counts to calculate advanced classifier quality metrics. The result section appears above this form after submission.
Example Data Table
Use this sample table to compare model behavior across different confusion matrix outcomes.
| Model | TP | TN | FP | FN | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|---|---|
| Model A | 120 | 250 | 30 | 20 | 88.10% | 80.00% | 85.71% | 82.76% |
| Model B | 90 | 310 | 12 | 28 | 90.91% | 88.24% | 76.27% | 81.82% |
| Model C | 145 | 180 | 55 | 40 | 77.38% | 72.50% | 78.38% | 75.32% |
Formula Used
Core Equations
Total = TP + TN + FP + FN
Accuracy = (TP + TN) / Total
Error Rate = (FP + FN) / Total
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Specificity = TN / (TN + FP)
Negative Predictive Value = TN / (TN + FN)
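As a rough illustration, the core equations above can be sketched in Python. The function names `safe_div` and `core_metrics` are hypothetical and not part of the calculator. Returning `None` for a zero denominator is an assumption about how undefined values might be represented. The counts are Model A from the example table.

```python
from typing import Optional


def safe_div(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator != 0 else None


def core_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Apply the core equations to raw confusion matrix counts."""
    total = tp + tn + fp + fn
    return {
        "total": total,
        "accuracy": safe_div(tp + tn, total),
        "error_rate": safe_div(fp + fn, total),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "npv": safe_div(tn, tn + fn),
    }


# Model A from the example table: TP=120, TN=250, FP=30, FN=20
model_a = core_metrics(tp=120, tn=250, fp=30, fn=20)
for name, value in model_a.items():
    print(f"{name}: {value}")
```

The values are fractions between 0 and 1. Multiply by 100 to compare them with the percentages in the table.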
Error and Balance Metrics
False Positive Rate = FP / (FP + TN)
False Negative Rate = FN / (FN + TP)
Balanced Accuracy = (Recall + Specificity) / 2
F1 Score = 2 × Precision × Recall / (Precision + Recall)
F-Beta = ((1 + β²) × Precision × Recall) / ((β² × Precision) + Recall)
G-Mean = √(Recall × Specificity)
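The error and balance metrics build on precision, recall, and specificity. The sketch below is one possible way to compute them, assuming (as an illustration only) that any metric with a zero denominator is reported as `None`. The names `ratio` and `balance_metrics` are hypothetical. The counts are Model C from the example table.

```python
import math
from typing import Optional


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Divide, returning None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator != 0 else None


def balance_metrics(tp: int, tn: int, fp: int, fn: int, beta: float = 1.0) -> dict:
    """Compute FPR, FNR, balanced accuracy, F-beta, and G-mean from counts."""
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)

    balanced_accuracy = g_mean = f_beta = None
    if recall is not None and specificity is not None:
        balanced_accuracy = (recall + specificity) / 2
        g_mean = math.sqrt(recall * specificity)
    if precision is not None and recall is not None:
        b2 = beta ** 2
        f_beta = ratio((1 + b2) * precision * recall, b2 * precision + recall)

    return {
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
        "balanced_accuracy": balanced_accuracy,
        "f_beta": f_beta,
        "g_mean": g_mean,
    }


# Model C from the example table: TP=145, TN=180, FP=55, FN=40
model_c = balance_metrics(tp=145, tn=180, fp=55, fn=40, beta=1.0)
for name, value in model_c.items():
    print(f"{name}: {value}")
```

With `beta=1.0` the `f_beta` entry is the F1 score, so it should match the 75.32% shown for Model C in the table.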
Agreement and Advanced Metrics
Jaccard Index = TP / (TP + FP + FN)
Youden's J = Recall + Specificity - 1
MCC = (TP×TN - FP×FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))
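Observed Agreement = (TP + TN) / Total
Expected Agreement = ((TP + FP)(TP + FN) + (TN + FN)(TN + FP)) / Total²
Cohen's Kappa = (Observed Agreement - Expected Agreement) / (1 - Expected Agreement)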
LR+ = Recall / (1 - Specificity)
LR- = (1 - Recall) / Specificity
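The agreement metrics can be sketched the same way. This is a minimal illustration, not the calculator's own code: `ratio` and `agreement_metrics` are hypothetical names, undefined values are represented as `None` by assumption, and the expected-agreement term uses the usual chance-agreement definition for Cohen's kappa (the product of predicted and actual class totals, summed over both classes and divided by Total²). The counts are Model A from the example table.

```python
import math
from typing import Optional


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Divide, returning None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator != 0 else None


def agreement_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute Jaccard, Youden's J, MCC, Cohen's kappa, LR+ and LR-."""
    total = tp + tn + fp + fn
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)

    # MCC is undefined when any row or column total of the matrix is zero.
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )

    kappa = None
    if total > 0:
        observed = (tp + tn) / total
        # Assumed chance agreement: predicted total x actual total, per class.
        expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
        kappa = ratio(observed - expected, 1 - expected)

    youden_j = lr_plus = lr_minus = None
    if recall is not None and specificity is not None:
        youden_j = recall + specificity - 1
        lr_plus = ratio(recall, 1 - specificity)
        lr_minus = ratio(1 - recall, specificity)

    return {
        "jaccard": ratio(tp, tp + fp + fn),
        "youden_j": youden_j,
        "mcc": mcc,
        "kappa": kappa,
        "lr_plus": lr_plus,
        "lr_minus": lr_minus,
    }


# Model A from the example table: TP=120, TN=250, FP=30, FN=20
model_a_agreement = agreement_metrics(tp=120, tn=250, fp=30, fn=20)
for name, value in model_a_agreement.items():
    print(f"{name}: {value}")
```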
How to Use This Calculator
1. Enter confusion matrix counts
Input true positives, true negatives, false positives, and false negatives from your classification output or validation report.
2. Set the beta value
Use β = 1 for a balanced F-score, which is the same as the F1 score. Choose β above 1 when recall matters more, or below 1 when precision matters more.
3. Pick decimal precision
Choose how many decimal places you want in the final metrics table and summary cards.
4. Submit the form
After submission, the calculator shows the results above the form, including the confusion matrix, graph, insights, and export buttons.
5. Review multiple metrics together
Do not rely on accuracy alone. Compare precision, recall, balanced accuracy, MCC, and kappa for a more complete evaluation.
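To make the beta and decimal-precision steps above concrete, here is a small sketch that scores Model B from the example table at three beta values and formats each result to a chosen number of decimal places. The helper names `f_beta` and `format_percent` and the "undefined" label are illustrative assumptions, not the calculator's actual implementation.

```python
from typing import Optional


def f_beta(tp: int, fp: int, fn: int, beta: float) -> Optional[float]:
    """F-beta from counts, or None when precision or recall is undefined."""
    if tp + fp == 0 or tp + fn == 0:
        return None
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return None
    return (1 + beta ** 2) * precision * recall / denominator


def format_percent(value: Optional[float], decimals: int) -> str:
    """Render a fraction as a percentage with the requested decimal places."""
    if value is None:
        return "undefined"
    return f"{value * 100:.{decimals}f}%"


# Model B from the example table: TP=90, FP=12, FN=28
tp, fp, fn = 90, 12, 28
decimals = 2  # step 3: chosen decimal precision

report = {
    beta: format_percent(f_beta(tp, fp, fn, beta), decimals)
    for beta in (0.5, 1.0, 2.0)
}
for beta, text in report.items():
    print(f"beta={beta}: {text}")
```

Model B has higher precision than recall, so its score drops as β grows and recall is weighted more heavily.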
Frequently Asked Questions
1. Why is accuracy not enough for classifier evaluation?
Accuracy can look strong even when one class dominates the dataset. In imbalanced problems, MCC, recall, precision, and balanced accuracy often reveal weaknesses hidden by overall correctness.
2. When should I use F1 instead of accuracy?
Use F1 when false positives and false negatives both matter, especially with uneven classes. F1 is built from precision and recall only, so it ignores true negatives and focuses on positive-class quality rather than total correct predictions.
3. What does MCC tell me?
MCC measures the correlation between predicted and actual classes using all four confusion matrix cells. It stays informative under class imbalance and ranges from -1 (complete disagreement) through 0 (no better than chance) to +1 (perfect agreement).
4. Why can some metric values be undefined?
A metric becomes undefined when its denominator is zero. For example, precision is undefined when the model never predicts the positive class.
5. What does the beta value change?
Beta changes how much recall matters in the F-beta score. A larger beta rewards finding positives more strongly, while a smaller beta favors precision.
6. How do specificity and recall differ?
Recall measures how well the model catches actual positives. Specificity measures how well it rejects actual negatives. Strong systems usually need both.
7. When is balanced accuracy useful?
Balanced accuracy is useful when classes are imbalanced. It averages recall (also called sensitivity) and specificity so one dominant class does not distort the result.
8. What should I compare when selecting a final model?
Compare accuracy, F1, balanced accuracy, MCC, kappa, and the confusion matrix together. Then align the best metric mix with your business or risk objective.
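Several of the answers above can be seen in one small experiment. The counts below are hypothetical (they are not from the example table): 1,000 samples with only 10 actual positives, and a model that always predicts the negative class. The helper `ratio` is an illustrative name, and using `None` for undefined values is an assumption.

```python
import math
from typing import Optional


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Divide, returning None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator != 0 else None


# Hypothetical imbalanced case: the model never predicts the positive class.
tp, tn, fp, fn = 0, 990, 0, 10

accuracy = ratio(tp + tn, tp + tn + fp + fn)
precision = ratio(tp, tp + fp)      # undefined: no positive predictions
recall = ratio(tp, tp + fn)
specificity = ratio(tn, tn + fp)

balanced_accuracy = None
if recall is not None and specificity is not None:
    balanced_accuracy = (recall + specificity) / 2

# MCC is undefined here because the (TP + FP) factor is zero.
mcc = ratio(
    tp * tn - fp * fn,
    math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
)

print("accuracy:", accuracy)
print("precision:", precision)
print("recall:", recall)
print("balanced accuracy:", balanced_accuracy)
print("MCC:", mcc)
```

Accuracy is 99% even though the model finds no positives. Recall is 0, balanced accuracy falls to 0.5, and precision and MCC are undefined because their denominators are zero.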