Enter Multiclass Matrix Values
Use one row per actual class. Separate values with commas or spaces. Keep the matrix square.
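For instance, the three-class sample matrix from the next section could be pasted as:

```
42, 4, 2
5, 36, 4
1, 6, 30
```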
Example Confusion Matrix Table
This sample matrix compares three classes. Rows are actual labels. Columns are model predictions.
| Actual \ Predicted | Class A | Class B | Class C | Row Total |
|---|---|---|---|---|
| Class A | 42 | 4 | 2 | 48 |
| Class B | 5 | 36 | 4 | 45 |
| Class C | 1 | 6 | 30 | 37 |
| Column Total | 48 | 46 | 36 | 130 |
Metrics and Formulas
Per-Class One-vs-Rest Counts
TP = diagonal value for the class.
FN = row total minus TP.
FP = column total minus TP.
TN = total samples minus TP, FP, and FN.
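As a minimal sketch of these definitions (not the calculator's internal code), the one-vs-rest counts can be computed from the sample matrix with NumPy:

```python
import numpy as np

# Sample matrix from the table above: rows = actual, columns = predicted.
cm = np.array([[42, 4, 2],
               [5, 36, 4],
               [1, 6, 30]])

tp = np.diag(cm)               # diagonal value for each class
fn = cm.sum(axis=1) - tp       # row total minus TP
fp = cm.sum(axis=0) - tp       # column total minus TP
tn = cm.sum() - tp - fn - fp   # everything outside the class's row and column

print(tp, fn, fp, tn)          # [42 36 30] [6 9 7] [6 10 6] [76 75 87]
```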
Core Classification Metrics
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Specificity = TN / (TN + FP)
NPV = TN / (TN + FN)
F1 Score = 2 × Precision × Recall / (Precision + Recall)
IoU = TP / (TP + FP + FN)
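Continuing the sketch, the core metrics follow directly from those counts; a real implementation would also guard against division by zero for classes with no samples or no predictions:

```python
import numpy as np

cm = np.array([[42, 4, 2], [5, 36, 4], [1, 6, 30]])
tp = np.diag(cm)
fn = cm.sum(axis=1) - tp
fp = cm.sum(axis=0) - tp
tn = cm.sum() - tp - fn - fp

precision = tp / (tp + fp)      # Class A: 42 / 48 = 0.875
recall = tp / (tp + fn)         # Class A: 42 / 48 = 0.875
specificity = tn / (tn + fp)    # true negative rate
npv = tn / (tn + fn)            # negative predictive value
f1 = 2 * precision * recall / (precision + recall)
iou = tp / (tp + fp + fn)       # also known as the Jaccard index
```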
Averaging Methods
Macro Average = arithmetic mean across classes.
Weighted Average = support-weighted mean across classes.
Micro Average = metrics computed from TP, FP, and FN pooled across all classes.
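A short sketch contrasting the three averaging methods on per-class precision from the sample matrix:

```python
import numpy as np

cm = np.array([[42, 4, 2], [5, 36, 4], [1, 6, 30]])
tp = np.diag(cm)
fp = cm.sum(axis=0) - tp
support = cm.sum(axis=1)        # row totals = per-class support

precision = tp / (tp + fp)

macro = precision.mean()                           # every class counts equally
weighted = np.average(precision, weights=support)  # larger classes weigh more
micro = tp.sum() / (tp.sum() + fp.sum())           # pooled counts

# On a full confusion matrix, micro precision equals accuracy: 108 / 130.
```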
Agreement Metrics
Accuracy = sum of diagonal values / total samples.
Cohen's Kappa = (Observed Agreement − Expected Agreement) / (1 − Expected Agreement)
Multiclass MCC = (c × s − Σ tₖ × pₖ) / √((s² − Σ pₖ²) × (s² − Σ tₖ²)), where s is the total sample count, c is the sum of diagonal values, and tₖ and pₖ are the row and column totals for class k.
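A sketch of all three agreement metrics on the sample matrix; the last line implements the generalized MCC formula above:

```python
import numpy as np

cm = np.array([[42, 4, 2], [5, 36, 4], [1, 6, 30]])
s = cm.sum()          # total samples: 130
c = np.trace(cm)      # sum of diagonal values: 108
t = cm.sum(axis=1)    # row totals (actual class counts)
p = cm.sum(axis=0)    # column totals (predicted class counts)

accuracy = c / s                                 # 108 / 130 ≈ 0.831

expected = (t * p).sum() / s**2                  # chance agreement ≈ 0.338
kappa = (accuracy - expected) / (1 - expected)   # Cohen's kappa ≈ 0.744

mcc = (c * s - (t * p).sum()) / np.sqrt(
    (s**2 - (p**2).sum()) * (s**2 - (t**2).sum())
)                                                # ≈ 0.745
```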
How to Use This Calculator
- Enter class names in the class label field.
- Paste the confusion matrix into the matrix box.
- Keep rows as actual classes and columns as predictions.
- Make sure the matrix is square and non-negative.
- Select the number of decimal places you need.
- Click the calculate button to generate results.
- Review the summary, class metrics, and heatmap.
- Download the report in CSV or PDF format.
FAQs
1. What is a multiclass confusion matrix?
It is a table that compares actual classes against predicted classes for models with three or more categories. It helps reveal where correct predictions happen and where misclassifications cluster.
2. Why are rows treated as actual classes?
Using actual labels by row is a common convention. It makes recall easier to interpret because each row total becomes the support for one actual class.
3. What is the difference between macro and weighted scores?
Macro scores treat every class equally. Weighted scores give larger classes more influence. Weighted metrics are useful when class frequencies are uneven.
4. Why can accuracy look good when a model is weak?
Accuracy can hide poor minority-class performance. A model may predict dominant classes well while failing rare classes. Macro recall and macro F1 usually expose that problem.
5. What does Cohen's kappa add beyond accuracy?
Kappa corrects observed agreement for the agreement expected by chance from the class distributions. It is helpful when class imbalance might inflate plain accuracy.
6. What does multiclass MCC measure?
Multiclass MCC measures the correlation between predictions and true labels, where 1 means perfect prediction and 0 means chance-level performance. Higher values indicate stronger overall classification quality across all classes.
7. Can I use decimal values in the matrix?
Yes. The calculator accepts numeric values, including decimals. That supports weighted counts, matrices averaged across cross-validation folds, or other soft-aggregated evaluation tables.
8. What does the heatmap show?
The heatmap displays row-normalized percentages. Each row sums to one hundred percent, so you can quickly see which predicted classes capture each actual class.
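As a rough sketch of that normalization (not the calculator's rendering code), each row is divided by its own row total:

```python
import numpy as np

cm = np.array([[42, 4, 2], [5, 36, 4], [1, 6, 30]])

# Row-normalized percentages: each row sums to 100.
pct = 100 * cm / cm.sum(axis=1, keepdims=True)
print(np.round(pct, 1))
# [[87.5  8.3  4.2]
#  [11.1 80.   8.9]
#  [ 2.7 16.2 81.1]]
```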