F2 Score Calculator
This calculator supports two workflows. Use confusion matrix counts for a full metric summary, or enter precision and recall directly for quick scoring.
Formula Used
The general weighted F-measure is:
Fβ = ((1 + β²) × Precision × Recall) / ((β² × Precision) + Recall)
For F2, set β = 2. That gives:
F2 = (5 × Precision × Recall) / ((4 × Precision) + Recall)
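The weighted formula above can be sketched as a small function; `fbeta_score` here is an illustrative helper name, not the calculator's actual code:

```python
def fbeta_score(precision: float, recall: float, beta: float = 2.0) -> float:
    """Weighted F-measure: F_beta = (1 + b^2) * P * R / (b^2 * P + R)."""
    denom = beta ** 2 * precision + recall
    if denom == 0:
        raise ValueError("precision and recall are both zero; F-beta is undefined")
    return (1 + beta ** 2) * precision * recall / denom

# With the default beta = 2, recall dominates the score.
print(round(fbeta_score(0.8, 0.8889), 4))  # 0.8696
```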
When using confusion matrix inputs, the calculator first derives:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Accuracy = (TP + TN) / (TP + FP + FN + TN)
Specificity = TN / (TN + FP)
Because β = 2, the weighted harmonic mean behind Fβ gives recall β² = 4 times the weight given to precision, so the score is far more sensitive to missed positives than to false alarms.
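The confusion-matrix workflow can be sketched as follows (function and dictionary keys are illustrative, not the calculator's internals):

```python
def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive the summary metrics from raw confusion matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f2 = 5 * precision * recall / (4 * precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "specificity": specificity, "f2": f2}

m = metrics_from_counts(48, 12, 6, 134)
print(round(m["f2"], 4))  # 0.8696
```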
How to Use This Calculator
- Choose Confusion Matrix Inputs for a full classification report.
- Enter true positives, false positives, false negatives, and true negatives.
- Or choose Direct Precision and Recall for faster scoring.
- Click Calculate F2 Score to display the result above the form.
- Review F2, precision, recall, and other supporting metrics.
- Use the export buttons to save the result as CSV or PDF.
Example Data Table
The sample rows below show how different confusion matrices can produce different F2 scores when recall is emphasized.
| Model | TP | FP | FN | TN | Precision | Recall | F2 Score |
|---|---|---|---|---|---|---|---|
| Model A | 48 | 12 | 6 | 134 | 0.8000 | 0.8889 | 0.8696 |
| Model B | 35 | 10 | 20 | 135 | 0.7778 | 0.6364 | 0.6604 |
| Model C | 72 | 24 | 8 | 96 | 0.7500 | 0.9000 | 0.8654 |
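The table rows can be reproduced with a short check; note that F2 itself never uses TN, which is why only precision and recall feed the final score:

```python
rows = [("Model A", 48, 12, 6), ("Model B", 35, 10, 20), ("Model C", 72, 24, 8)]
for name, tp, fp, fn in rows:
    p = tp / (tp + fp)           # precision
    r = tp / (tp + fn)           # recall
    f2 = 5 * p * r / (4 * p + r)
    print(f"{name}: precision={p:.4f} recall={r:.4f} F2={f2:.4f}")
```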
FAQs
1. What does the F2 score measure?
F2 combines precision and recall into one metric, but it gives recall much more importance. It helps evaluate classifiers where missed positives matter more than extra false alarms.
2. When should I prefer F2 over F1?
Use F2 when false negatives are especially costly. Fraud detection, medical screening, and safety monitoring often value capturing positives more than avoiding every false positive.
3. Why can accuracy look good while F2 looks poor?
Accuracy can stay high in imbalanced datasets because negatives dominate. F2 focuses on precision and recall for the positive class, so it highlights weak positive detection better.
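A tiny worked example of this effect, using made-up counts: 990 negatives, 10 positives, and a classifier that catches only 2 of the positives.

```python
tp, fp, fn, tn = 2, 3, 8, 987  # heavily imbalanced: 10 positives, 990 negatives
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f2 = 5 * precision * recall / (4 * precision + recall)
print(f"accuracy={accuracy:.3f}  F2={f2:.3f}")  # accuracy looks strong, F2 does not
```

Accuracy comes out near 0.99 while F2 sits around 0.22, because almost all of the accuracy is earned on the dominant negative class.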
4. Can I calculate F2 without a confusion matrix?
Yes. If you already know precision and recall, enter them directly. The calculator will compute F2, F1, and F0.5 without requiring TP, FP, FN, and TN counts.
5. What happens if precision and recall are both zero?
The F2 score is undefined because the formula's denominator collapses to zero. In practical terms, the classifier is producing no useful positive identifications.
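A common convention, sketched below, is to report 0.0 in this case rather than raise an error; scikit-learn's `fbeta_score` offers the same behavior through its `zero_division` argument:

```python
def safe_f2(precision: float, recall: float) -> float:
    """Return F2, or 0.0 when precision and recall are both zero."""
    denom = 4 * precision + recall
    if denom == 0:
        return 0.0  # undefined case: no useful positive predictions
    return 5 * precision * recall / denom

print(safe_f2(0.0, 0.0))  # 0.0 instead of a ZeroDivisionError
```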
6. Does a higher F2 score always mean a better model?
Not always. A higher F2 is better for recall-focused goals, but another metric may matter more if false positives are expensive or calibration quality matters.
7. What is the difference between F2 and F0.5?
F2 favors recall, while F0.5 favors precision. Comparing both can reveal whether your model is stronger at catching positives or avoiding incorrect alerts.
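Comparing the two on the same precision/recall pair makes the difference concrete; the numbers below are illustrative, with the generic Fβ formula applied at both β values:

```python
def fbeta(p: float, r: float, beta: float) -> float:
    """Generic weighted F-measure."""
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

p, r = 0.9, 0.6  # a precise model that misses many positives
print(round(fbeta(p, r, 2.0), 4))   # F2 penalizes the weak recall
print(round(fbeta(p, r, 0.5), 4))   # F0.5 rewards the strong precision
```

The same model scores noticeably higher under F0.5 than under F2, which is exactly the signal to look for when deciding which error type your model trades away.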
8. Why does this calculator show extra metrics?
The added metrics help you interpret F2 in context. Accuracy, specificity, prevalence, and support explain whether strong recall is coming with acceptable tradeoffs elsewhere.