Macro Average F1 Calculator

Analyze every class before trusting summary metrics. Quickly spot weak classes, label skew, and missed positives. Use the exports and charts for cleaner model review workflows.

Enter Class Statistics

Use one card per class. Large screens show three cards, medium screens show two, and small screens show one.


Performance Chart

The chart compares per-class precision, recall, and F1. Submit the form to refresh values.

Example Data Table

Class   TP   FP   FN   Support   Precision   Recall   F1 Score
Cat     42    6    8        50      0.8750   0.8400     0.8571
Dog     37    9   11        48      0.8043   0.7708     0.7872
Bird    29    5    7        36      0.8529   0.8056     0.8286
Fish    24    4    6        30      0.8571   0.8000     0.8276

Macro Average F1: 0.8251

Formula Used

Precision for each class
Precision = TP / (TP + FP)
Recall for each class
Recall = TP / (TP + FN)
F1 score for each class
F1 = 2 × Precision × Recall / (Precision + Recall)
Macro average F1
Macro F1 = (F1₁ + F1₂ + ... + F1ₙ) / n

Macro averaging gives every class equal weight, even when support differs. This helps expose weak performance on rare classes that accuracy may hide.
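Putting the formulas together, a minimal Python sketch reproduces the example table above (class names and counts are taken from that table):

```python
# Per-class precision, recall, and F1, plus the macro average,
# computed from (TP, FP, FN) counts in the example table.
classes = {
    "Cat":  (42, 6, 8),
    "Dog":  (37, 9, 11),
    "Bird": (29, 5, 7),
    "Fish": (24, 4, 6),
}

f1_scores = []
for name, (tp, fp, fn) in classes.items():
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    f1_scores.append(f1)
    print(f"{name}: P={precision:.4f} R={recall:.4f} F1={f1:.4f}")

# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1_scores) / len(f1_scores)
print(f"Macro F1 = {macro_f1:.4f}")  # 0.8251
```

Because the macro average is an unweighted mean, each class contributes exactly 1/n to the result regardless of its support.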

How to Use This Calculator

  1. Add one card for each class in your model.
  2. Enter the class name and its TP, FP, and FN values.
  3. Choose how zero-division cases should behave.
  4. Select the number of displayed decimals.
  5. Press Calculate Macro Average F1.
  6. Read the summary above the form and review the table.
  7. Use the chart to compare class-level precision, recall, and F1.
  8. Export the results as CSV or PDF for documentation.
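The CSV export in step 8 is built into the calculator. As a rough sketch of producing an equivalent file yourself, using the rows from the example table (the tool's actual export format may differ):

```python
import csv
import io

# Rows copied from the example table above; illustrative only.
rows = [
    ("Cat", 42, 6, 8, 50, 0.8750, 0.8400, 0.8571),
    ("Dog", 37, 9, 11, 48, 0.8043, 0.7708, 0.7872),
    ("Bird", 29, 5, 7, 36, 0.8529, 0.8056, 0.8286),
    ("Fish", 24, 4, 6, 30, 0.8571, 0.8000, 0.8276),
]

# Write to an in-memory buffer; swap in open("results.csv", "w", newline="")
# to write a real file.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["Class", "TP", "FP", "FN", "Support",
                 "Precision", "Recall", "F1 Score"])
writer.writerows(rows)
print(buf.getvalue())
```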

Frequently Asked Questions

1. What does macro average F1 measure?

It averages the F1 score of every class equally. Large classes do not dominate the result, so weak minority-class performance remains visible.

2. When should I prefer macro F1 over accuracy?

Use macro F1 when class imbalance matters or when every class deserves equal attention. Accuracy can look strong even if rare classes perform badly.

3. How is macro F1 different from weighted F1?

Macro F1 treats every class equally. Weighted F1 multiplies each class F1 by support, so common classes influence the overall score more.
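A small hypothetical example makes the difference concrete: two classes with invented F1 scores and supports, one common and strong, one rare and weak.

```python
# Hypothetical per-class F1 scores and supports (not from the table above).
f1 = {"common": 0.95, "rare": 0.50}
support = {"common": 90, "rare": 10}

# Macro: unweighted mean over classes.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted: each class F1 scaled by its share of total support.
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())

print(macro_f1)     # 0.725
print(weighted_f1)  # 0.905
```

The weak rare class drags the macro score down to 0.725, while the weighted score of 0.905 mostly reflects the common class.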

4. Why can macro F1 be lower than micro F1?

Micro F1 aggregates totals across classes first. Strong performance on common classes can raise micro F1, while macro F1 still punishes weak classes equally.
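The aggregation order is the whole difference, as this sketch with invented counts shows: micro F1 pools TP, FP, and FN before computing anything, while macro F1 computes per-class F1 first and then averages.

```python
# Hypothetical (TP, FP, FN) counts: one strong common class, one weak rare class.
counts = {"common": (90, 5, 5), "rare": (2, 8, 8)}

def f1_from_counts(tp, fp, fn):
    p, r = tp / (tp + fp), tp / (tp + fn)
    return 2 * p * r / (p + r)

# Macro: F1 per class, then unweighted mean.
macro_f1 = sum(f1_from_counts(*c) for c in counts.values()) / len(counts)

# Micro: pool the counts across classes, then one F1.
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
micro_f1 = f1_from_counts(tp, fp, fn)

print(f"macro={macro_f1:.4f} micro={micro_f1:.4f}")  # macro=0.5737 micro=0.8762
```

The pooled counts are dominated by the common class, so micro F1 stays high while macro F1 exposes the weak rare class.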

5. What inputs do I need for each class?

You need true positives, false positives, and false negatives. The calculator derives support, precision, recall, and F1 automatically from those values.

6. What happens when a denominator becomes zero?

The calculator applies your chosen zero-division policy. You can return 0 or 1 when TP+FP, TP+FN, or precision+recall equals zero.
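A policy like this is straightforward to sketch in Python; the `safe_div` helper and its default fallback below are illustrative, not the calculator's actual code:

```python
# Return a configured fallback (commonly 0 or 1) whenever a
# denominator is zero, instead of raising ZeroDivisionError.
def safe_div(num, den, zero_value=0.0):
    return num / den if den else zero_value

def class_f1(tp, fp, fn, zero_value=0.0):
    precision = safe_div(tp, tp + fp, zero_value)  # TP + FP may be 0
    recall = safe_div(tp, tp + fn, zero_value)     # TP + FN may be 0
    # precision + recall may also be 0, so guard the F1 step too.
    return safe_div(2 * precision * recall, precision + recall, zero_value)

print(class_f1(0, 0, 0, zero_value=0.0))  # 0.0
print(class_f1(0, 0, 0, zero_value=1.0))  # 1.0
```

Returning 0 treats an absent class as a failure; returning 1 treats "nothing predicted, nothing to find" as a success. Which is appropriate depends on your evaluation goals.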

7. Can I use this for multiclass and multilabel reviews?

Yes, as long as your TP, FP, and FN values are valid for each class. The macro F1 formula is identical once those class-level statistics exist.

8. Why is support shown beside each class?

Support equals TP + FN. It helps you judge how much evidence backs each class metric and how imbalance may affect interpretation.

Related Calculators

precision recall table · fraud detection metrics · micro average F1 · precision recall metrics · ROC precision recall · model validation metrics · classifier performance metrics

Important Note: All calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.