Calculator Inputs
Enter confusion matrix counts, tune the beta weight, and optionally label the positive and negative classes.
Example Data Table
These example rows show how confusion matrix counts change the main evaluation metrics.
| Scenario | TP | FP | FN | TN | Precision | Recall | Accuracy | F1 |
|---|---|---|---|---|---|---|---|---|
| Balanced model | 120 | 18 | 12 | 250 | 86.96% | 90.91% | 92.50% | 88.89% |
| High precision | 95 | 8 | 32 | 265 | 92.23% | 74.80% | 90.00% | 82.61% |
| High recall | 140 | 42 | 8 | 210 | 76.92% | 94.59% | 87.50% | 84.85% |
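As a quick sanity check, the "Balanced model" row above can be reproduced with a few lines of Python (a minimal sketch; the variable names are illustrative, not the calculator's internals):

```python
# Recompute the "Balanced model" row from its raw counts.
tp, fp, fn, tn = 120, 18, 12, 250

precision = tp / (tp + fp)                  # 120 / 138
recall = tp / (tp + fn)                     # 120 / 132
accuracy = (tp + tn) / (tp + fp + fn + tn)  # 370 / 400
f1 = 2 * precision * recall / (precision + recall)

print(f"{precision:.2%} {recall:.2%} {accuracy:.2%} {f1:.2%}")
# Matches the table row: 86.96% 90.91% 92.50% 88.89%
```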
Formulas Used
These are the main equations behind the calculator and report output.
Core confusion matrix relationships
Total = TP + FP + FN + TN
Precision
Precision = TP / (TP + FP)
Recall
Recall = TP / (TP + FN)
Specificity
Specificity = TN / (TN + FP)
Accuracy
Accuracy = (TP + TN) / (TP + FP + FN + TN)
F1 score
F1 = 2 × Precision × Recall / (Precision + Recall)
F-beta score
Fβ = (1 + β²) × Precision × Recall / ((β² × Precision) + Recall)
Balanced accuracy
Balanced Accuracy = (Recall + Specificity) / 2
Matthews correlation coefficient
MCC = ((TP × TN) - (FP × FN)) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))
Jaccard index
Jaccard = TP / (TP + FP + FN)
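The equations above translate directly into code. Here is a minimal sketch (the function and key names are my own, not the calculator's); note that several denominators can be zero for degenerate inputs, which a real implementation should guard against:

```python
import math

def confusion_metrics(tp: int, fp: int, fn: int, tn: int, beta: float = 1.0) -> dict:
    """Compute the metrics listed above from raw confusion matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    b2 = beta * beta
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": (tp + tn) / total,
        "balanced_accuracy": (recall + specificity) / 2,
        "f1": 2 * precision * recall / (precision + recall),
        "fbeta": (1 + b2) * precision * recall / (b2 * precision + recall),
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "jaccard": tp / (tp + fp + fn),
    }

print(confusion_metrics(120, 18, 12, 250))
```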
How to Use This Calculator
Follow these steps to evaluate binary classification output correctly.
- Enter the model name and class labels for your report.
- Provide true positives, false positives, false negatives, and true negatives.
- Set beta to emphasize recall or precision in the F-beta score.
- Enter the threshold used when the confusion matrix was produced.
- Click Calculate Metrics to show results above the form.
- Review the metric table, confusion matrix heatmap, and comparison chart.
- Use the CSV button for spreadsheets and the PDF button for a report.
FAQs
1. What does precision measure?
Precision shows what fraction of predicted positives were actually positive. It becomes important when false positives are costly, such as fraud alerts or spam filtering.
2. What does recall measure?
Recall shows what fraction of actual positives the model found. It matters when missing a positive case is expensive, such as medical screening or defect detection.
3. Why can accuracy be misleading?
Accuracy can look strong on imbalanced data because the model may predict the majority class often. Precision, recall, specificity, and MCC usually explain performance better.
4. When should I change beta?
Use beta above 1 when recall deserves more weight. Use beta below 1 when precision matters more. Beta equal to 1 gives the standard F1 score.
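To see the effect of beta concretely, the F-beta formula can be evaluated at a few beta values for one precision/recall pair (the numbers below are illustrative, a precision-heavy model):

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """F-beta score: beta > 1 weights recall, beta < 1 weights precision."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.90, 0.60  # high precision, modest recall
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: {f_beta(p, r, beta):.4f}")
# The score falls as beta rises, because the weak recall counts for more.
```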
5. What is balanced accuracy?
Balanced accuracy averages recall and specificity. It helps when classes are uneven because it rewards correct handling of both positives and negatives.
6. What does MCC tell me?
MCC summarizes the full confusion matrix in one score. Values near 1 indicate strong agreement, values near 0 indicate performance no better than chance, and negative values indicate predictions that run opposite to the true labels.
7. Does the threshold change the metrics?
Yes. A different threshold changes which predictions count as positive, so TP, FP, FN, and TN shift. The calculator assumes your entered counts already reflect that threshold.
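A short illustration of how a decision threshold turns model scores into the four counts (the scores and labels below are made up for the example):

```python
def confusion_counts(scores, labels, threshold):
    """Count (TP, FP, FN, TN) for a given decision threshold."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

scores = [0.95, 0.80, 0.55, 0.40, 0.20]
labels = [1, 1, 0, 1, 0]
print(confusion_counts(scores, labels, 0.5))  # (2, 1, 1, 1)
print(confusion_counts(scores, labels, 0.3))  # (3, 1, 0, 1)
```

Lowering the threshold converts the false negative at 0.40 into a true positive, which is exactly why every metric in the calculator is tied to the threshold the counts came from.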
8. What does the CSV or PDF export include?
The export includes the input summary and calculated metrics. This makes it easier to archive model evaluations, share reports, and compare experiments later.