Lot Sample Size Calculator for ML Quality Audits

Calculator Inputs

Use this tool for auditing data lots, model batches, and labeling queues.

Lot Size

Expected Defect Rate (%)

Margin of Error (%)

Confidence Level

Detection Power (%)

AQL (%)

Inspection Cost per Item

Calculation Mode

Formula Used

1) Proportion-based sample size

Use this method when estimating the share of defective items inside a lot.

n₀ = z² × p × (1 - p) / E²

n = n₀ / (1 + (n₀ - 1) / N)
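The two formulas above can be sketched in Python as follows. This is a minimal illustration, not the page's actual implementation: the function name is hypothetical, rates are passed as fractions rather than percentages, the z score is taken as the two-sided normal quantile, and the result is rounded up (the page does not say how it rounds).

```python
import math
from statistics import NormalDist


def proportion_sample_size(lot_size, defect_rate, margin, confidence):
    """Sample size for estimating a defect proportion in a finite lot.

    All rates are fractions (0.025 means 2.5%). Hypothetical helper.
    """
    # Two-sided z score, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Infinite-population sample size: n0 = z^2 * p * (1 - p) / E^2
    n0 = z**2 * defect_rate * (1 - defect_rate) / margin**2
    # Finite population correction: n = n0 / (1 + (n0 - 1) / N)
    n = n0 / (1 + (n0 - 1) / lot_size)
    # Rounding up is an assumption; the page does not state its rounding rule.
    return math.ceil(n)


if __name__ == "__main__":
    print(proportion_sample_size(lot_size=5000, defect_rate=0.025,
                                 margin=0.02, confidence=0.95))
```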

2) Detection-based sample size

Use this method when you want a strong chance of finding at least one bad item.

n = ln(1 - power) / ln(1 - p)
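As a rough sketch, the detection formula can be written in Python as below. The function name is hypothetical, and both inputs are fractions. The formula treats each inspected item as an independent draw, which is a reasonable approximation when the lot is much larger than the sample. Rounding up is an assumption.

```python
import math


def detection_sample_size(defect_rate, power):
    """Items to inspect for a given chance of seeing at least one defect.

    defect_rate and power are fractions strictly between 0 and 1.
    Hypothetical helper, not the page's own code.
    """
    if not 0 < defect_rate < 1 or not 0 < power < 1:
        raise ValueError("defect_rate and power must be between 0 and 1")
    # Solve 1 - (1 - p)^n >= power for n.
    n = math.log(1 - power) / math.log(1 - defect_rate)
    return math.ceil(n)


if __name__ == "__main__":
    # A 1% defect rate with 95% detection power.
    print(detection_sample_size(defect_rate=0.01, power=0.95))
```

Note that this sample size falls as the defect rate rises: rare defects are the expensive ones to find.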

3) Acceptance number

This page estimates the maximum accepted defects as:

c = floor(n × AQL / 100)

where n is the chosen sample size and AQL is the percentage entered above.
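A minimal Python sketch of this rule is shown below. The function name and the 1.5% AQL in the example are illustrative assumptions. This simple floor rule is the page's own estimate and is not the same as looking up a formal acceptance sampling plan.

```python
import math


def acceptance_number(sample_size, aql_percent):
    """Maximum defects tolerated in the sample before flagging the lot.

    aql_percent is a percentage (1.5 means 1.5%). Hypothetical helper.
    """
    return math.floor(sample_size * aql_percent / 100)


if __name__ == "__main__":
    # Illustrative values only: a 231-item sample at an assumed 1.5% AQL.
    print(acceptance_number(sample_size=231, aql_percent=1.5))
```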

Variable meanings

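N = lot size (number of items in the batch)
n₀ = sample size before finite population correction
n = recommended sample size
p = expected defect rate, as a fraction (2.5% is 0.025)
E = desired margin of error, as a fraction
z = z score for the chosen confidence level (about 1.96 for 95%)
power = probability of finding at least one defect, as a fraction
c = maximum accepted defective items in the sample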

How to Use This Calculator

Enter the full lot size for your batch.
Set the expected defect rate from prior audits.
Choose a margin of error for precision.
Select a confidence level for reliability.
Set the detection power for defect discovery.
Enter the acceptance quality limit (AQL) percentage.
Optionally add inspection cost per item.
Pick a calculation mode and submit.
Review the recommended sample, cost, and graph.
Export the result table as CSV or PDF.

For ML teams, a lot can represent labeled rows, images, records, prompts, or output batches.
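The page does not say how the cost figure in the results is derived. A plausible reading, shown here purely as an assumption, is the recommended sample size multiplied by the inspection cost per item. The helper names below are hypothetical.

```python
def inspection_cost(sample_size, cost_per_item):
    """Total audit cost, assuming cost scales linearly with items inspected."""
    return sample_size * cost_per_item


def inspected_share(sample_size, lot_size):
    """Fraction of the lot covered by the sample."""
    return sample_size / lot_size


if __name__ == "__main__":
    # Illustrative values: 231 labeled images from a lot of 5,000,
    # at an assumed cost of 2 currency units per reviewed image.
    print(inspection_cost(231, 2))
    print(f"{inspected_share(231, 5000):.1%}")
```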

Example Data Table

Use Case	Lot Size	Expected Defect Rate	Confidence Level	Margin of Error	Sample Size	Acceptance Number
Image Labels	5,000	2.5%	95%	2%	231	3
Training Batch	12,000	4.0%	95%	3%	163	2
Sensor Records	850	6.0%	90%	4%	115	1
Inference Queue	25,000	1.5%	99%	1%	552	8

FAQs

1. What does lot sample size mean in ML workflows?

It means the number of records, labels, images, or predictions inspected from a larger batch. Teams use it to estimate quality, detect failures, and decide whether a lot meets acceptance rules.

2. Why does defect rate matter so much?

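It depends on the method. In the proportion method, the required sample grows as the expected defect rate moves toward 50%, because p × (1 - p) gets larger. In the detection method the opposite holds: the more common defects are, the fewer items you need to inspect to find at least one. A realistic defect rate from prior audits keeps the sample from being too small or wastefully large.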

3. When should I use hybrid mode?

Use hybrid mode when you want both good estimation precision and strong defect discovery. It selects the larger requirement between the proportion method and the detection method.
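Hybrid mode as described here can be sketched as taking the maximum of the two sample sizes. The function name is hypothetical, inputs are fractions, and both the rounding up and the cap at the lot size are assumptions that the page does not confirm.

```python
import math
from statistics import NormalDist


def hybrid_sample_size(lot_size, defect_rate, margin, confidence, power):
    """Larger of the proportion-based and detection-based sample sizes."""
    # Proportion method with finite population correction.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z**2 * defect_rate * (1 - defect_rate) / margin**2
    n_proportion = math.ceil(n0 / (1 + (n0 - 1) / lot_size))
    # Detection method: chance of at least one defect reaches `power`.
    n_detection = math.ceil(math.log(1 - power) / math.log(1 - defect_rate))
    # Assumption: never recommend inspecting more items than the lot holds.
    return min(lot_size, max(n_proportion, n_detection))


if __name__ == "__main__":
    # Tight margin: the proportion method drives the result.
    print(hybrid_sample_size(5000, 0.025, 0.02, 0.95, 0.95))
    # Loose margin, rare defects: the detection method drives the result.
    print(hybrid_sample_size(5000, 0.01, 0.05, 0.95, 0.95))
```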

4. What is AQL in this calculator?

AQL stands for acceptance quality limit. Here, it is used to estimate the maximum number of defective items allowed in the chosen sample before the lot should be flagged for review or rejection.

5. Does finite population correction help?

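Yes. When the sample would be a noticeable fraction of the lot, finite population correction reduces the required sample. Each inspected item covers a larger share of a small lot, so fewer items are needed for the same precision. For very large lots the correction has almost no effect.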
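To see the size of the effect, the sketch below applies the correction n = n₀ / (1 + (n₀ - 1) / N) to a fixed uncorrected sample size across several lot sizes. The value n₀ = 385 is an illustrative assumption (roughly 95% confidence, 5% margin, 50% defect rate), and the helper name is hypothetical.

```python
import math


def apply_fpc(n0, lot_size):
    """Finite population correction for an uncorrected sample size n0."""
    return n0 / (1 + (n0 - 1) / lot_size)


if __name__ == "__main__":
    n0 = 385  # assumed uncorrected sample size, for illustration
    for lot_size in (500, 5_000, 500_000):
        corrected = math.ceil(apply_fpc(n0, lot_size))
        print(f"lot size {lot_size:>7,}: sample {corrected}")
```

The reduction is large for the 500-item lot and negligible for the 500,000-item lot.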

6. Can I use this for labeling audits?

Yes. It works well for image labels, text annotations, transcription batches, moderation reviews, and inference outputs where you inspect a limited batch from a larger production lot.

7. What confidence level should I choose?

Most teams start with 95%. Use 99% for stricter audits. Use 90% when speed matters more than precision. Higher confidence usually means a larger sample.
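The link between confidence level and sample size runs through the z score, which enters the proportion formula squared. A short sketch using Python's standard library, assuming a two-sided interval (the helper name is hypothetical):

```python
from statistics import NormalDist


def z_score(confidence):
    """Two-sided z score for a confidence level given as a fraction."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        z = z_score(level)
        # The sample size scales with z squared, all else equal.
        print(f"{level:.0%}: z = {z:.3f}, z^2 = {z * z:.2f}")
```

Moving from 95% to 99% raises z from about 1.960 to about 2.576, so the uncorrected sample grows by roughly 70%.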

8. Why is the graph useful?

The graph shows how detection probability changes with sample size. It helps teams see where extra inspection gives diminishing returns and where sampling becomes operationally expensive.
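The curve behind such a graph can be approximated by inverting the detection formula: the chance of seeing at least one defect in n items is 1 - (1 - p)^n. The sketch below tabulates it for an assumed 2.5% defect rate. The function name is hypothetical, and the independence assumption is the same one the detection formula makes.

```python
def detection_probability(sample_size, defect_rate):
    """Chance of at least one defect among sample_size independent draws."""
    return 1 - (1 - defect_rate) ** sample_size


if __name__ == "__main__":
    defect_rate = 0.025  # assumed for illustration
    for n in (25, 50, 100, 200, 400):
        prob = detection_probability(n, defect_rate)
        print(f"n = {n:>3}: detection probability = {prob:.3f}")
```

Each doubling of the sample adds less probability than the previous one, which is the diminishing return the graph makes visible.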