Cumulative Gain Calculator

Calculator Input

Dataset Name

Display Step Size

Ranking Mode

Actual Labels

Enter 1 and 0, or yes and no, separated by commas or new lines.

Predicted Scores

Leave blank if labels are already ranked in the desired order.

Example Data Table

Rank	Predicted Score	Actual Label	Interpretation
1	0.98	1	Highest ranked case is positive.
2	0.93	1	Second ranked case is positive.
3	0.91	0	Third ranked case is negative.
4	0.87	1	Fourth ranked case is positive.
5	0.76	0	Fifth ranked case is negative.
6	0.65	0	Sixth ranked case is negative.

Formula Used

Cumulative Positives at rank k
CP(k) = sum of actual positive labels from rank 1 to k

Cumulative Gain at rank k
CG(k) = CP(k) / Total Positives

Cumulative Gain Percentage
CG%(k) = [CP(k) / Total Positives] × 100

Population Percentage at rank k
Pop%(k) = [k / Total Records] × 100

Lift at rank k
Lift(k) = CG%(k) / Pop%(k)

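Cumulative gain shows what share of all true positives your ranking captures as you move from the highest-ranked records toward the full dataset. A better model's curve climbs faster and stays closer to the ideal curve, which is the curve a perfect ranking would produce by placing every positive first.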
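As a rough illustration, the formulas above can be applied to the six-row example table with a few lines of Python. The function name `gain_table` and its dictionary keys are hypothetical, chosen only for this sketch. It assumes the labels are already in ranked order and contain at least one positive.

```python
def gain_table(labels):
    """Compute CP(k), CG%(k), Pop%(k) and Lift(k) for every rank k.

    `labels` holds 0/1 values already sorted from highest to lowest rank.
    Assumes at least one positive, so Total Positives is not zero.
    """
    total_positives = sum(labels)
    total_records = len(labels)
    rows = []
    cumulative_positives = 0
    for k, label in enumerate(labels, start=1):
        cumulative_positives += label                       # CP(k)
        cg_pct = cumulative_positives / total_positives * 100  # CG%(k)
        pop_pct = k / total_records * 100                   # Pop%(k)
        rows.append({
            "rank": k,
            "cp": cumulative_positives,
            "cg_pct": cg_pct,
            "pop_pct": pop_pct,
            "lift": cg_pct / pop_pct,                       # Lift(k)
        })
    return rows


# Actual labels from the example data table, in rank order.
example_labels = [1, 1, 0, 1, 0, 0]

for row in gain_table(example_labels):
    print(
        f"k={row['rank']}  CP={row['cp']}  "
        f"CG%={row['cg_pct']:.1f}  Pop%={row['pop_pct']:.1f}  "
        f"Lift={row['lift']:.2f}"
    )
```

For the example table, the top two ranks hold two of the three positives, so at rank 2 the gain is 66.7% for 33.3% of the population, a lift of 2.0. At rank 6 the gain reaches 100% and lift falls back to 1.0.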

How to Use This Calculator

1. Enter a dataset name for reporting clarity.
2. Paste actual labels using 1 for positive and 0 for negative.
3. Paste predicted scores if you want automatic descending ranking.
4. Choose a display step size for the results table.
5. Select score sorting or use the existing record order.
6. Click the calculate button to generate metrics and curves.
7. Review gain percentages at 10%, 25%, and 50% population cutoffs.
8. Download the results as CSV or PDF when needed.
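The input steps above can be sketched in Python as follows. This is not the calculator's actual code: the function names are hypothetical, and the case-insensitive handling of yes and no, the error messages, and the tie handling (equal scores keep their original order) are assumptions made for the sketch.

```python
import re


def parse_labels(text):
    """Split on commas or new lines and map 1/0 or yes/no to integers."""
    mapping = {"1": 1, "0": 0, "yes": 1, "no": 0}
    tokens = [t.strip().lower() for t in re.split(r"[,\n]+", text) if t.strip()]
    labels = []
    for token in tokens:
        if token not in mapping:
            raise ValueError(f"Unrecognized label: {token!r}")
        labels.append(mapping[token])
    return labels


def parse_scores(text):
    """Split on commas or new lines and convert each entry to a float."""
    return [float(t) for t in re.split(r"[,\n]+", text) if t.strip()]


def rank_labels(labels, scores=None):
    """Return labels in ranked order.

    With scores, sort by score descending (ties keep input order, because
    Python's sort is stable). Without scores, keep the existing order.
    """
    if not scores:
        return list(labels)
    if len(scores) != len(labels):
        raise ValueError("Labels and scores must have the same length.")
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


labels = parse_labels("0, 1, yes\nno")
scores = parse_scores("0.20, 0.90, 0.50\n0.10")
print(rank_labels(labels, scores))
```

Leaving `scores` empty mirrors the "use the existing record order" option: the labels pass through unchanged.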

Frequently Asked Questions

1. What does cumulative gain measure?

Cumulative gain measures how quickly a ranked model captures actual positives as you move through the population from highest score to lowest score.

2. When should I use cumulative gain instead of accuracy?

Use cumulative gain when ranking matters. It is especially useful for lead scoring, fraud review, uplift targeting, and other tasks where you act only on the top-ranked segment of the population.

3. What is a good cumulative gain curve?

A good curve rises steeply early, meaning the top-ranked cases contain many of the positives. It should stay well above the random baseline (the diagonal where gain equals population percentage) and close to the ideal line.
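To make the two reference curves concrete, the sketch below computes them next to the model's curve for the example table. It assumes the usual definitions: the ideal line ranks every positive first, and the random baseline captures positives in proportion to the population reviewed. The function name `compare_curves` is hypothetical.

```python
def compare_curves(labels):
    """Return (model, ideal, random) gain percentages for each rank k.

    `labels` holds 0/1 values in ranked order with at least one positive.
    """
    total_positives = sum(labels)
    total_records = len(labels)
    curves = []
    captured = 0
    for k, label in enumerate(labels, start=1):
        captured += label
        model = captured / total_positives * 100
        # Ideal: a perfect ranking places all positives in the first ranks.
        ideal = min(k, total_positives) / total_positives * 100
        # Random: expected share of positives equals share of population.
        random_baseline = k / total_records * 100
        curves.append((model, ideal, random_baseline))
    return curves


for k, (model, ideal, rand) in enumerate(compare_curves([1, 1, 0, 1, 0, 0]), start=1):
    print(f"k={k}  model={model:.1f}  ideal={ideal:.1f}  random={rand:.1f}")
```

At rank 3 of the example table, the model has captured 66.7% of positives, against 100% for the ideal line and 50% for the random baseline.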

4. Do I need predicted probabilities?

No. You only need ranked observations. Predicted scores let the calculator sort the records automatically, but labels that are already in ranked order work just as well.

5. What does top 10% gain mean?

Top 10% gain tells you the percentage of all positives captured within the first tenth of ranked records. Higher values indicate stronger prioritization.
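A minimal sketch of a cutoff calculation is shown below. How the calculator converts a percentage into a record count is not stated here, so the sketch assumes the cutoff is rounded up to a whole record and covers at least one record. The function name `gain_at_percent` is hypothetical.

```python
import math


def gain_at_percent(labels, pct):
    """Percentage of all positives found in the top `pct` percent of records.

    `labels` holds 0/1 values in ranked order with at least one positive.
    Assumption: the cutoff is rounded up to a whole record, minimum one.
    """
    total_records = len(labels)
    k = max(1, math.ceil(pct * total_records / 100))
    return sum(labels[:k]) / sum(labels) * 100


example_labels = [1, 1, 0, 1, 0, 0]
for pct in (10, 25, 50):
    print(f"Top {pct}% gain: {gain_at_percent(example_labels, pct):.1f}%")
```

With only six records, the 10% cutoff covers one record and the 25% cutoff covers two, so small datasets give coarse values. A different rounding rule would shift these numbers.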

6. Why is lift shown with cumulative gain?

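Lift is the ratio of the share of positives captured to the share of the population reviewed, so it compares your ranking against random selection. A lift of 2 at a given cutoff means the ranking has captured twice as many positives as picking the same number of records at random would be expected to.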

7. Can I use imbalanced datasets?

Yes. Cumulative gain is often most useful on imbalanced datasets because it focuses on ranking quality and positive capture, not just overall correctness.

8. Why does the calculator require both positive and negative labels?

A meaningful gain curve needs class contrast. If every record is negative, there are no positives to capture and the gain is undefined. If every record is positive, every ordering produces the same curve, so ranking quality cannot be evaluated.
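A check along these lines could run before any metrics are computed. It is a sketch only: the function name `validate_labels` and the error messages are hypothetical, not taken from the calculator.

```python
def validate_labels(labels):
    """Raise ValueError unless `labels` (0/1 values) contains both classes."""
    if not labels:
        raise ValueError("No labels provided.")
    positives = sum(labels)
    if positives == 0:
        # Total Positives is zero, so CG(k) would divide by zero.
        raise ValueError("At least one positive label is required.")
    if positives == len(labels):
        # Every ordering gives the same curve, so ranking cannot be judged.
        raise ValueError("At least one negative label is required.")
    return True


print(validate_labels([1, 1, 0, 1, 0, 0]))
```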