Precision Recall Table Calculator

Build a precision‑recall table from many scored thresholds, visualize the trade‑offs with an interactive curve, and download clean reports for teams and audits.

Enter Confusion Counts Per Threshold

Add as many rows as you need, then calculate.
Columns: Label / Threshold, True Positives (TP), False Positives (FP), False Negatives (FN)
Output Options
These options affect on-page display only.
Interpretation Helpers
Use labels like thresholds, folds, or segments.
  • TP: predicted positive and actually positive.
  • FP: predicted positive but actually negative.
  • FN: predicted negative but actually positive.

Example Data Table

This sample shows how thresholds shift precision and recall.
Threshold  TP  FP  FN  Precision  Recall
0.90       42   8  18  0.8400     0.7000
0.70       63  24  10  0.7241     0.8630
0.50       72  41   5  0.6372     0.9351
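The table values follow directly from the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal Python sketch that reproduces the sample rows above:

```python
# Sample rows from the table above: (threshold, TP, FP, FN)
rows = [(0.90, 42, 8, 18), (0.70, 63, 24, 10), (0.50, 72, 41, 5)]

def precision(tp, fp):
    # Undefined when no positives are predicted
    return tp / (tp + fp) if tp + fp else None

def recall(tp, fn):
    # Undefined when there are no actual positives
    return tp / (tp + fn) if tp + fn else None

for thr, tp, fp, fn in rows:
    print(f"{thr:.2f}  P={precision(tp, fp):.4f}  R={recall(tp, fn):.4f}")
```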

Formula Used

Precision = TP / (TP + FP)
Recall    = TP / (TP + FN)
F1        = 2 × Precision × Recall / (Precision + Recall)

How to Use This Calculator

  1. Decide what each row represents, such as a threshold.
  2. Enter TP, FP, and FN for every row.
  3. Click Calculate to generate the table and curve.
  4. Review micro and macro metrics for reporting needs.
  5. Export CSV for analysis or PDF for sharing.

Precision and recall support different operational goals

In screening tasks, false negatives can be expensive, so high recall reduces misses even if precision drops. In moderation or fraud detection, false positives hurt users and raise costs, so high precision limits unnecessary actions. This calculator lets you compare these priorities across multiple thresholds.

Threshold choice changes the confusion counts predictably

When you raise a decision threshold, fewer items are predicted positive. FP usually falls, so precision rises; at the same time, FN can rise, so recall declines. In the sample table, the 0.90 threshold gives TP=42 and FP=8 for precision 0.84, while lowering the threshold to 0.50 admits TP=72 with FP=41, dropping precision to 0.6372 but lifting recall from 0.70 to 0.9351.
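This movement can be seen directly by sweeping a threshold over scored examples. A small Python sketch with made-up scores and labels (the data here are hypothetical, not taken from the sample table):

```python
# Hypothetical (score, true_label) pairs; label 1 = positive
scored = [(0.95, 1), (0.90, 1), (0.85, 0), (0.70, 1), (0.60, 0),
          (0.55, 1), (0.40, 0), (0.35, 1), (0.20, 0), (0.10, 1)]

def confusion(threshold):
    # Items at or above the threshold are predicted positive
    tp = sum(1 for s, y in scored if s >= threshold and y == 1)
    fp = sum(1 for s, y in scored if s >= threshold and y == 0)
    fn = sum(1 for s, y in scored if s < threshold and y == 1)
    return tp, fp, fn

# Raising the threshold shrinks FP but grows FN
for thr in (0.3, 0.5, 0.8):
    tp, fp, fn = confusion(thr)
    print(f"t={thr}: TP={tp} FP={fp} FN={fn}")
```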

A precision–recall curve summarizes the trade‑off

Plotting recall on the x-axis and precision on the y-axis creates a curve of operating points. Points closer to the top-right indicate better overall performance. Use the interactive chart to spot thresholds that deliver acceptable recall without severe precision loss.

Micro and macro averages answer different reporting questions

Micro averaging pools TP, FP, and FN across rows, weighting larger rows more heavily. Macro averaging treats each row equally, highlighting consistency across thresholds or folds. If one segment dominates volume, micro metrics may look stronger than macro metrics.
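The two averaging schemes can be sketched as follows, assuming each row is a (TP, FP, FN) triple; in this hypothetical data, one large row dominates the micro numbers while the weak small row drags the macro numbers down:

```python
# Hypothetical rows: one large, one small and much weaker
rows = [(90, 10, 5), (2, 8, 6)]

def precision(tp, fp): return tp / (tp + fp)
def recall(tp, fn): return tp / (tp + fn)

# Micro: pool the counts first, then compute once
TP, FP, FN = (sum(c) for c in zip(*rows))
micro_p, micro_r = precision(TP, FP), recall(TP, FN)

# Macro: compute per row, then average the row-level metrics
macro_p = sum(precision(tp, fp) for tp, fp, _ in rows) / len(rows)
macro_r = sum(recall(tp, fn) for tp, _, fn in rows) / len(rows)

print(f"micro P={micro_p:.3f} R={micro_r:.3f}")
print(f"macro P={macro_p:.3f} R={macro_r:.3f}")
```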

Imbalanced datasets can mislead without context

With rare positives, accuracy can remain high even when recall is poor. Precision–recall metrics focus on positive detection quality. Track prevalence alongside TP, FP, and FN so stakeholders understand how many positives are expected.

Use the table to choose operating points and communicate risk

Identify thresholds that meet a target recall, then check precision and F1 for stability. Document the chosen label, counts, and averages. Exported CSV supports audits, while PDF helps cross‑team reviews and approvals.

FAQs

1) What if TP + FP equals zero?

Precision is undefined when no positives are predicted. The calculator shows a dash and excludes that row from macro precision and macro F1.
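The dash-and-exclude behavior described above can be sketched with a hypothetical helper (this is illustrative, not the calculator's actual code):

```python
def safe_precision(tp, fp):
    # Undefined when nothing is predicted positive; None renders as a dash
    return tp / (tp + fp) if tp + fp > 0 else None

# Hypothetical rows: the first predicts no positives at all
rows = [(0, 0, 7), (42, 8, 18)]
vals = [safe_precision(tp, fp) for tp, fp, _ in rows]

# Macro precision averages only the defined rows
defined = [v for v in vals if v is not None]
macro_precision = sum(defined) / len(defined)
print(vals, macro_precision)
```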

2) What if TP + FN equals zero?

Recall is undefined when there are no actual positives in that row. The calculator displays a dash and excludes that row from macro recall and macro F1.

3) Why is micro F1 different from macro F1?

Micro F1 is computed from pooled totals, so large rows dominate. Macro F1 averages per-row F1, so every row contributes equally to the final number.

4) Can I use this for k-fold validation?

Yes. Use each row as a fold. Macro metrics help gauge consistency across folds, while micro metrics summarize pooled performance across all examples.

5) Does a higher F1 always mean a better threshold?

Not always. F1 balances precision and recall equally, but your application may favor one. Choose thresholds based on costs, constraints, and required recall or precision targets.
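When the application favors one side, the general F-beta score weights recall and precision unequally (beta > 1 favors recall, beta < 1 favors precision; F1 is the beta = 1 case). A minimal sketch using the sample table's 0.90-threshold values:

```python
def f_beta(p, r, beta=1.0):
    # F-beta: weighted harmonic mean; beta > 1 weights recall more heavily
    if p == 0 and r == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)

p, r = 0.84, 0.70  # precision and recall at the 0.90 threshold above
print(f"F1={f_beta(p, r):.4f}  F2={f_beta(p, r, 2):.4f}  F0.5={f_beta(p, r, 0.5):.4f}")
```

Because recall (0.70) is lower than precision (0.84) here, F2 comes out below F0.5: weighting the weaker metric more heavily pulls the score down.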

6) How should I label rows?

Use meaningful labels such as probability thresholds, model versions, segments, or time windows. Labels appear in exports and help stakeholders interpret the curve and table quickly.

Related Calculators

fraud detection metrics, micro average f1, precision recall metrics, roc precision recall, model validation metrics, classifier performance metrics, macro average f1

Important Note: All calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.