Measure dataset health using weighted quality dimensions. Expose missing, stale, duplicate, invalid, and inconsistent patterns. Build dependable inputs for smarter models and safer outcomes.
Enter dataset defect counts and assign weights for each quality dimension.
| Dataset Batch | Total Records | Missing | Duplicates | Invalid | Outliers | Stale | Inconsistent | Verified Accuracy % |
|---|---|---|---|---|---|---|---|---|
| Customer Churn Set A | 10000 | 320 | 140 | 210 | 175 | 260 | 190 | 96.4 |
| Fraud Events Set B | 18500 | 510 | 220 | 340 | 420 | 380 | 275 | 94.8 |
| IoT Sensor Set C | 25000 | 890 | 160 | 430 | 690 | 740 | 310 | 92.6 |
Completeness Score = (1 - Missing Values / Total Records) × 100
Uniqueness Score = (1 - Duplicate Records / Total Records) × 100
Validity Score = (1 - Invalid Values / Total Records) × 100
Consistency Score = (1 - Inconsistent Records / Total Records) × 100
Timeliness Score = (1 - Stale Records / Total Records) × 100
Outlier Penalty Score = (1 - Outlier Records / Total Records) × 100
Accuracy Composite = (Verified Accuracy × 0.7) + (Outlier Penalty Score × 0.3)
Overall Score = Σ(Dimension Score × Weight) / Σ(Weights)
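The formulas above can be sketched in Python. This is a minimal illustration, using the IoT Sensor Set C figures from the table, treating the table's accuracy column as the verified accuracy input, and assuming equal weights as a neutral default:

```python
def dimension_score(defects: int, total: int) -> float:
    # Generic dimension score: (1 - defects / total) * 100
    return (1 - defects / total) * 100

# Figures for IoT Sensor Set C from the table above
total = 25000
scores = {
    "completeness": dimension_score(890, total),  # missing values
    "uniqueness":   dimension_score(160, total),  # duplicates
    "validity":     dimension_score(430, total),  # invalid values
    "consistency":  dimension_score(310, total),  # inconsistent records
    "timeliness":   dimension_score(740, total),  # stale records
}
outlier_penalty = dimension_score(690, total)     # outlier records

# Accuracy Composite = (Verified Accuracy * 0.7) + (Outlier Penalty Score * 0.3)
scores["accuracy"] = 92.6 * 0.7 + outlier_penalty * 0.3

# Equal weights here; adjust per use case (assumption, not a recommendation)
weights = {d: 1.0 for d in scores}
overall = sum(scores[d] * weights[d] for d in scores) / sum(weights.values())
print(round(overall, 2))
```

With these inputs the outlier penalty is 97.24, the accuracy composite works out to roughly 94.0, and the equally weighted overall score lands near 97.3.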
The calculator combines six core data quality dimensions into a weighted index. Higher weights emphasize the dimensions that matter most for your machine learning pipeline, monitoring policy, or deployment risk profile.
The overall score summarizes how trustworthy a dataset is across key dimensions like completeness, validity, uniqueness, consistency, timeliness, and accuracy. Higher scores usually mean lower risk for training, inference, and monitoring outcomes.
Weights let you prioritize the dimensions most important to your use case. For example, fraud detection may emphasize timeliness and accuracy, while reporting systems may care more about completeness and consistency.
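As an illustration of use-case-specific weighting, the profiles below are hypothetical, not recommendations; choose weights that match your own risk profile:

```python
# Hypothetical weight profiles reflecting the examples in the text:
# fraud detection leans on timeliness and accuracy,
# reporting leans on completeness and consistency.
WEIGHT_PROFILES = {
    "fraud_detection": {"completeness": 1.0, "uniqueness": 1.0, "validity": 1.0,
                        "consistency": 1.0, "timeliness": 2.0, "accuracy": 2.0},
    "reporting":       {"completeness": 2.0, "uniqueness": 1.0, "validity": 1.0,
                        "consistency": 2.0, "timeliness": 1.0, "accuracy": 1.0},
}

def weighted_index(scores: dict, profile: str) -> float:
    # Overall Score = sum(score * weight) / sum(weights)
    w = WEIGHT_PROFILES[profile]
    return sum(scores[d] * w[d] for d in w) / sum(w.values())

# Illustrative dimension scores (made-up values)
scores = {"completeness": 96.4, "uniqueness": 98.9, "validity": 98.3,
          "consistency": 98.5, "timeliness": 97.9, "accuracy": 94.0}
print(round(weighted_index(scores, "fraud_detection"), 1))
print(round(weighted_index(scores, "reporting"), 1))
```

Note how the same dimension scores produce different overall values under different profiles, which is exactly the lever the weights give you.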
Validity checks whether values fit accepted formats or rules. Accuracy measures whether records match the real world or trusted references. A value can be valid in format but still inaccurate.
Extreme outliers often indicate noisy capture, labeling mistakes, broken sensors, or integration issues. Including an outlier penalty helps the score reflect unusual values that can distort training quality and model stability.
A score above 85 is typically strong for many workflows. Scores between 70 and 85 usually need targeted cleanup. Anything lower may introduce training bias, instability, or unreliable evaluation metrics.
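The rules of thumb above can be expressed as a small banding helper. The thresholds come from the guidance in this section; the labels are illustrative:

```python
def risk_band(overall_score: float) -> str:
    # Thresholds from the guidance above:
    # > 85 is typically strong, 70-85 needs targeted cleanup, below 70 is risky
    if overall_score > 85:
        return "strong"
    if overall_score >= 70:
        return "targeted cleanup recommended"
    return "high risk: possible training bias or instability"

print(risk_band(96.4))
print(risk_band(78.0))
print(risk_band(62.5))
```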
Scoring a representative sample instead of the full dataset is often practical for audits. Just make sure the sample covers important classes, time periods, and sources so the score reflects actual production data conditions.
Repeating this calculation on fresh batches helps you detect drift, pipeline failures, schema changes, stale records, or rising duplicates before those problems damage model performance.
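One lightweight way to operationalize recurring checks is to compare each batch's overall score against the previous one and flag sharp drops. This is a sketch with made-up weekly scores and an arbitrary alert threshold:

```python
def detect_degradation(batch_scores: list, drop_threshold: float = 2.0) -> list:
    """Flag batches whose overall score fell by more than drop_threshold
    points versus the previous batch -- a cheap drift/failure signal."""
    alerts = []
    for prev, (batch, score) in zip(batch_scores, batch_scores[1:]):
        if prev[1] - score > drop_threshold:
            alerts.append(batch)
    return alerts

# Hypothetical weekly overall scores for one dataset
history = [("week_01", 96.4), ("week_02", 96.1),
           ("week_03", 92.8), ("week_04", 93.0)]
print(detect_degradation(history))  # → ['week_03']
```

A real pipeline would also alert on absolute floors (e.g. any batch below your acceptable band), not just week-over-week drops.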
This calculator is a compact decision aid, not a full governance framework. Pair it with profiling, lineage checks, class balance reviews, bias analysis, and feature-level validation for stronger assurance.
Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.