Binary Classification Tool

Enter Binary Classification Counts

Provide the confusion matrix values and optional labels. The tool calculates core and advanced diagnostic statistics instantly.

Model Name

Positive Label

Negative Label

True Positives (TP)

True Negatives (TN)

False Positives (FP)

False Negatives (FN)

Decision Threshold

Displayed Decimals

Example Data Table

This sample table shows how predicted probabilities convert into binary predictions using a 0.50 threshold.

ID	Actual Class	Predicted Probability	Predicted Class	Outcome
1	Positive	0.93	Positive	TP
2	Positive	0.81	Positive	TP
3	Positive	0.41	Negative	FN
4	Negative	0.62	Positive	FP
5	Negative	0.12	Negative	TN
6	Negative	0.28	Negative	TN

From record-level predictions, aggregate TP, TN, FP, and FN values, then enter them into the tool.
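The aggregation step can be sketched in Python as follows. This is a minimal illustration rather than the tool's own code: the function name `count_outcomes` is hypothetical, and the sketch assumes a probability exactly equal to the threshold counts as a positive prediction, which the table does not specify.

```python
# Records mirror the sample table: (actual class, predicted probability).
records = [
    ("Positive", 0.93),
    ("Positive", 0.81),
    ("Positive", 0.41),
    ("Negative", 0.62),
    ("Negative", 0.12),
    ("Negative", 0.28),
]


def count_outcomes(records, threshold=0.50, positive="Positive"):
    """Tally TP, TN, FP, and FN from (actual, probability) pairs."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, probability in records:
        # Assumption: ties at the threshold are treated as positive.
        predicted_positive = probability >= threshold
        actual_positive = actual == positive
        if predicted_positive and actual_positive:
            counts["TP"] += 1
        elif predicted_positive:
            counts["FP"] += 1
        elif actual_positive:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


counts = count_outcomes(records)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```

For the six sample rows this gives TP = 2, TN = 2, FP = 1, and FN = 1, which are the values to enter into the tool.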

Formulas Used

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Shows overall classification correctness across all observations.

Precision = TP / (TP + FP)

Measures how many predicted positives are truly positive.

Recall / Sensitivity = TP / (TP + FN)

Measures the share of actual positives correctly identified.

Specificity = TN / (TN + FP)

Measures the share of actual negatives correctly identified.

F1 Score = 2 × Precision × Recall / (Precision + Recall)

Combines precision and recall into a single score using their harmonic mean.

Balanced Accuracy = (Recall + Specificity) / 2

Useful when class sizes are uneven, because each class contributes equally to the score regardless of its size.
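The six formulas so far can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the calculator's implementation: the names `safe_div` and `basic_metrics` are hypothetical, and returning NaN when a denominator is zero is a choice made here, not documented behavior of the tool.

```python
def safe_div(numerator, denominator):
    """Divide, returning NaN when the denominator is zero (an assumption)."""
    return numerator / denominator if denominator else float("nan")


def basic_metrics(tp, tn, fp, fn):
    """Compute the core metrics from the four confusion matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Counts taken from the sample table: TP=2, TN=2, FP=1, FN=1.
metrics = basic_metrics(tp=2, tn=2, fp=1, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")  # every metric is 0.6667 for these counts
```

All six values coincide here only because the sample counts are symmetric. With real data they usually differ.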

Matthews Correlation = (TP×TN − FP×FN) / √[(TP+FP)(TP+FN)(TN+FP)(TN+FN)]

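A robust summary metric for imbalanced binary classification. It ranges from −1 to +1, where +1 is perfect prediction, 0 is no better than chance, and −1 is total disagreement.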
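A short Python sketch of the Matthews correlation formula follows. The function name `matthews_correlation` is hypothetical, and returning NaN when any marginal total is zero is an assumption made for this example. Some libraries return 0 in that case instead.

```python
import math


def matthews_correlation(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: the score is reported as undefined (NaN) when a row or
    # column of the confusion matrix is empty.
    if denominator == 0:
        return float("nan")
    return numerator / denominator


# Sample-table counts: (2*2 - 1*1) / sqrt(3*3*3*3) = 3 / 9
mcc = matthews_correlation(tp=2, tn=2, fp=1, fn=1)
print(round(mcc, 4))  # 0.3333
```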

Cohen's Kappa = (Observed Accuracy − Expected Accuracy) / (1 − Expected Accuracy)

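Adjusts agreement by removing agreement expected by chance. Observed Accuracy is the accuracy defined above, and Expected Accuracy = [(TP+FP)(TP+FN) + (TN+FN)(TN+FP)] / (TP+TN+FP+FN)².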
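As a worked sketch, Cohen's kappa can be computed from the four counts like this. Expected accuracy uses the standard chance-agreement formula based on the row and column totals of the confusion matrix. The function name `cohens_kappa` is hypothetical, and returning NaN when expected accuracy equals 1 is an assumption.

```python
def cohens_kappa(tp, tn, fp, fn):
    """Cohen's kappa for a 2x2 confusion matrix."""
    total = tp + tn + fp + fn
    observed = (tp + tn) / total
    # Chance agreement: probability that prediction and truth coincide
    # if they were independent with the same marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total**2
    if expected == 1:
        return float("nan")  # assumption: undefined when chance agreement is 1
    return (observed - expected) / (1 - expected)


# Sample-table counts: observed = 4/6, expected = (3*3 + 3*3) / 36 = 0.5
kappa = cohens_kappa(tp=2, tn=2, fp=1, fn=1)
print(round(kappa, 4))  # 0.3333
```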

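Likelihood Ratios: LR+ = Recall / FPR and LR− = FNR / Specificity, where FPR = FP / (FP + TN) = 1 − Specificity and FNR = FN / (FN + TP) = 1 − Recall

Helpful in diagnostic testing and evidence-based model interpretation. LR+ shows how much a positive prediction raises the odds of the positive class, and LR− shows how much a negative prediction lowers them.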
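The two likelihood ratios can be sketched as below. The function name `likelihood_ratios` is hypothetical. Treating a zero denominator as infinity (or NaN when the numerator is also zero) is an assumption of this example, and the sketch assumes both classes are present so that recall and specificity are defined.

```python
def likelihood_ratios(tp, tn, fp, fn):
    """Return (LR+, LR-) from confusion matrix counts."""
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    fpr = fp / (fp + tn)             # false positive rate = 1 - specificity
    fnr = fn / (fn + tp)             # false negative rate = 1 - recall

    def ratio(numerator, denominator):
        # Assumption: a zero denominator gives infinity, or NaN for 0 / 0.
        if denominator == 0:
            return float("inf") if numerator > 0 else float("nan")
        return numerator / denominator

    return ratio(recall, fpr), ratio(fnr, specificity)


# Sample-table counts: recall = 2/3, FPR = 1/3, FNR = 1/3, specificity = 2/3
lr_positive, lr_negative = likelihood_ratios(tp=2, tn=2, fp=1, fn=1)
print(round(lr_positive, 4), round(lr_negative, 4))  # 2.0 0.5
```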

How to Use This Calculator

Collect your confusion matrix counts: true positives, true negatives, false positives, and false negatives.
Enter optional labels for the positive and negative classes.
Add the decision threshold used by your model, if relevant.
Choose the display precision for decimal results.
Click the calculation button to generate performance metrics.
Review the result summary, detailed metrics table, confusion matrix, and Plotly graphs.
Use the CSV button for spreadsheet analysis and the PDF button for reporting.
Compare metrics together rather than relying on accuracy alone, especially with imbalanced classes.

FAQs

1. What does this binary classification tool measure?

It measures how well a model separates two classes using confusion matrix counts. It reports accuracy, precision, recall, specificity, F1 score, balanced accuracy, Matthews correlation, Cohen's kappa, likelihood ratios, and more.

2. Why is accuracy alone not enough?

Accuracy can look strong even when a model misses many rare positives. Precision, recall, specificity, and Matthews correlation give a more reliable view, especially when classes are imbalanced.
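A small numeric illustration makes the point concrete. The counts below are hypothetical, chosen only to represent a dataset with 10 positives among 1,000 observations. They do not come from the tool or from any real model.

```python
# Hypothetical imbalanced counts: 10 actual positives, 990 actual negatives.
tp, fn = 1, 9      # only 1 of the 10 positives is caught
tn, fp = 985, 5    # most negatives are classified correctly

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (recall + specificity) / 2

print(f"accuracy:          {accuracy:.3f}")           # 0.986
print(f"recall:            {recall:.3f}")             # 0.100
print(f"balanced accuracy: {balanced_accuracy:.3f}")
```

Accuracy is 98.6% even though the model finds only 10% of the positives. Recall and balanced accuracy expose the weakness that accuracy hides.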

3. When should I focus on recall?

Focus on recall when missing a positive case is costly. Medical screening, fraud detection, and safety alerts often prioritize catching as many true positives as possible.

4. When should I focus on precision?

Precision matters when false alarms are expensive. It is useful in tasks where each positive prediction triggers time, money, or manual review.

5. What is balanced accuracy?

Balanced accuracy averages recall and specificity. It helps when one class is much larger than the other, because a large majority class cannot mask weak detection of the minority class.

6. What does Matthews correlation tell me?

Matthews correlation summarizes all four confusion matrix cells in one score. It is often more informative than accuracy for imbalanced binary classification problems.

7. Why include Cohen's kappa?

Kappa adjusts observed agreement by accounting for chance agreement. It is useful when you want a stricter agreement measure than raw accuracy.

8. What do the export buttons do?

The CSV export saves the calculated metrics in a structured table. The PDF export captures the visible results section for reporting, sharing, or client documentation.