Example Data Table
| Group | Users | Conversions | Conversion Rate | Revenue per Conversion |
|---|---|---|---|---|
| Treatment | 5,000 | 820 | 16.40% | $45.00 |
| Control | 4,800 | 690 | 14.38% | $45.00 |
In this sample, treatment converts at 16.40% against 14.38% for control, an absolute uplift of about 2.0 percentage points. These treatment and control outcomes feed the uplift estimate, the financial impact, and the uncertainty measures.
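As a minimal sketch, the rates and uplift for the sample table can be reproduced in a few lines of Python. The variable names are illustrative and are not taken from the calculator itself.

```python
# Figures taken from the example data table.
treatment_users, treatment_conversions = 5_000, 820
control_users, control_conversions = 4_800, 690

# Observed conversion rate in each group.
treatment_rate = treatment_conversions / treatment_users
control_rate = control_conversions / control_users

# Absolute uplift is a difference of rates (percentage points when multiplied by 100).
absolute_uplift = treatment_rate - control_rate
# Relative uplift expresses that difference as a fraction of the control rate.
relative_uplift = absolute_uplift / control_rate

print(f"Treatment rate:  {treatment_rate:.4f}")
print(f"Control rate:    {control_rate:.4f}")
print(f"Absolute uplift: {absolute_uplift:.4f}")
print(f"Relative uplift: {relative_uplift:.4f}")
```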
Formula Used
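A standard two-proportion formulation that matches the inputs and outputs described on this page is sketched below. The symbols ($x_T$, $n_T$, $x_C$, $n_C$ for conversions and users, $N$ for the evaluation population, $v$ for value per conversion, $c$ for campaign cost) and the choice of a pooled z-test with an unpooled interval are assumptions; the calculator may use a different variant.

```latex
\begin{aligned}
\hat{p}_T &= \frac{x_T}{n_T}, \qquad \hat{p}_C = \frac{x_C}{n_C} \\[4pt]
\Delta &= \hat{p}_T - \hat{p}_C
  && \text{(absolute uplift)} \\[4pt]
\Delta_{\mathrm{rel}} &= \frac{\Delta}{\hat{p}_C}
  && \text{(relative uplift)} \\[4pt]
z &= \frac{\Delta}{\sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{n_T}+\frac{1}{n_C}\right)}},
  \qquad \hat{p} = \frac{x_T + x_C}{n_T + n_C}
  && \text{(pooled z-statistic)} \\[4pt]
\mathrm{CI} &= \Delta \pm z_{1-\alpha/2}
  \sqrt{\frac{\hat{p}_T(1-\hat{p}_T)}{n_T} + \frac{\hat{p}_C(1-\hat{p}_C)}{n_C}}
  && \text{(confidence interval)} \\[4pt]
\text{incremental conversions} &= \Delta \cdot N \\[4pt]
\text{net value} &= \Delta \cdot N \cdot v - c
\end{aligned}
```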
How to Use This Calculator
- Enter treatment and control group sizes from your experiment.
- Add the observed conversions for both groups.
- Provide the evaluation population, the number of users to which the measured uplift is scaled to estimate incremental outcomes.
- Enter average value per conversion and total campaign cost.
- Select the confidence level, then calculate to view uplift, significance, value, and the comparison graph.
Frequently Asked Questions
1. What is uplift probability?
It is the estimated difference in conversion probability between treatment and control. Positive uplift suggests the treatment likely increases the desired action.
2. Why compare treatment and control groups?
Comparing both groups helps isolate the effect of the intervention. Without control performance, apparent gains may simply reflect baseline behavior.
3. Can uplift be negative?
Yes. Negative uplift means the treatment performed worse than the control, suggesting the campaign, offer, or model may be harmful.
4. What does the confidence interval mean?
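The interval shows a plausible range for the true uplift at the selected confidence level. Narrow intervals imply more precision, while wide intervals indicate greater uncertainty. An interval that includes zero means the data are also consistent with no uplift.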
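One common way to compute such an interval is the Wald (normal-approximation) interval for a difference of two proportions. The sketch below assumes that method and uses a hypothetical function name, `uplift_confidence_interval`; the calculator may construct its interval differently.

```python
from math import sqrt
from statistics import NormalDist


def uplift_confidence_interval(x_t, n_t, x_c, n_c, confidence=0.95):
    """Wald interval for the uplift (treatment rate minus control rate)."""
    p_t, p_c = x_t / n_t, x_c / n_c
    uplift = p_t - p_c
    # Unpooled standard error: each group contributes its own variance.
    se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    # Two-sided critical value, roughly 1.96 at 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return uplift - z_crit * se, uplift + z_crit * se


# Sample table figures: 820 of 5,000 (treatment) and 690 of 4,800 (control).
low, high = uplift_confidence_interval(820, 5_000, 690, 4_800)
print(f"95% interval for uplift: {low:.4f} to {high:.4f}")
```

For the sample table the whole interval lies above zero, and raising the confidence level widens it.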
5. Why include evaluation population?
It scales the measured uplift into expected incremental conversions. This helps convert a percentage-point difference into practical business impact.
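The scaling step can be sketched as follows. The function name `scale_uplift`, the evaluation population of 100,000 users, and the campaign cost of $20,000 are hypothetical values chosen for illustration; only the conversion counts and the $45.00 value per conversion come from the sample table.

```python
def scale_uplift(absolute_uplift, evaluation_population,
                 value_per_conversion, campaign_cost):
    """Turn a rate difference into conversions and money."""
    # Expected extra conversions if the uplift holds across the population.
    incremental_conversions = absolute_uplift * evaluation_population
    # Gross value of those extra conversions.
    incremental_value = incremental_conversions * value_per_conversion
    # Value left after paying for the campaign.
    net_value = incremental_value - campaign_cost
    return incremental_conversions, incremental_value, net_value


# Uplift from the sample table; the population and cost are hypothetical.
sample_uplift = 820 / 5_000 - 690 / 4_800
conversions, value, net = scale_uplift(
    sample_uplift,
    evaluation_population=100_000,
    value_per_conversion=45.00,
    campaign_cost=20_000.00,
)
print(f"Incremental conversions: {conversions:.0f}")
print(f"Incremental value: ${value:,.2f}")
print(f"Net value: ${net:,.2f}")
```

This treats the measured uplift as a point estimate; applying the same scaling to both ends of the confidence interval gives a range for the financial impact.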
6. What does the p-value show?
The p-value is the probability of observing a difference at least as large as the one measured if no true uplift existed. Smaller values indicate stronger statistical evidence against the no-uplift assumption.
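A minimal sketch of one common approach, the two-sided pooled z-test for two proportions, is shown below. The test choice and the function name `two_proportion_p_value` are assumptions, not confirmed details of this calculator.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(x_t, n_t, x_c, n_c):
    """Two-sided pooled z-test for a difference in conversion rates."""
    p_t, p_c = x_t / n_t, x_c / n_c
    # Under the no-uplift hypothesis both groups share one conversion rate.
    pooled = (x_t + x_c) / (n_t + n_c)
    se = sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    # Probability of a |z| at least this large under the standard normal.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z_stat, p_val = two_proportion_p_value(820, 5_000, 690, 4_800)
print(f"z = {z_stat:.3f}, p-value = {p_val:.4f}")
```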
7. When can relative uplift mislead?
Relative uplift can look large when the control rate is tiny. Always review absolute uplift and sample size before making deployment decisions.
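The effect is easy to see with a small comparison. The low-baseline counts below (2 and 4 conversions per 1,000 users) are hypothetical and chosen only to illustrate the point; the second case reuses the sample table.

```python
def uplift_summary(x_t, n_t, x_c, n_c):
    """Return (absolute uplift, relative uplift) for two groups."""
    p_t, p_c = x_t / n_t, x_c / n_c
    absolute = p_t - p_c
    relative = absolute / p_c  # assumes the control rate is non-zero
    return absolute, relative


# Hypothetical low-baseline experiment: control converts 0.2%, treatment 0.4%.
tiny_abs, tiny_rel = uplift_summary(4, 1_000, 2, 1_000)
# Sample table: control converts 14.38%, treatment 16.40%.
table_abs, table_rel = uplift_summary(820, 5_000, 690, 4_800)

print(f"Low baseline: absolute {tiny_abs:.4f}, relative {tiny_rel:.2f}")
print(f"Sample table: absolute {table_abs:.4f}, relative {table_rel:.2f}")
```

The low-baseline case doubles the control rate in relative terms, yet it rests on only two extra conversions and a much smaller absolute uplift than the sample table shows.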
8. Is this enough for deployment decisions?
It is a strong screening tool, but production decisions should also consider segment stability, experiment design quality, operational cost, and downstream risk.