Bias Risk Assessment Calculator

Calculator Inputs

Enter measured gaps as percentages. Enter controls as adequacy percentages, where higher adequacy lowers risk.

Representation Gap (%)

Difference between observed and expected group representation.

Selection Rate Ratio (0 to 1)

Use the lower group selection rate divided by the higher rate.

False Positive Rate Gap (%)

Absolute subgroup gap in false positive rate.

False Negative Rate Gap (%)

Absolute subgroup gap in false negative rate.

Calibration Gap (%)

Subgroup difference in how closely predicted probabilities match observed outcome rates.

Label Bias Risk (%)

Estimated extent of bias in labels, targets, or annotation rules.

Proxy Feature Dependence (%)

Degree of reliance on features linked to protected attributes.

Human Oversight Adequacy (%)

Higher values mean stronger approval and review controls.

Documentation Completeness (%)

Covers model cards, audit logs, and known limitations.

Mitigation Maturity (%)

Measures readiness of testing, remediation, and revalidation.

Monitoring Coverage (%)

Coverage of bias checks, alerting, periodic review, and escalation.

Regulatory Sensitivity (1 to 5)

1 is light scrutiny; 5 is tightly regulated use.

Impact Severity (1 to 5)

Reflects harm magnitude if biased decisions occur.
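The false positive and false negative rate gap inputs above can be derived from per-group confusion counts. The sketch below assumes two groups and assumes the gap is meant in percentage points; the function names and the counts are hypothetical and not taken from the calculator itself.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN): share of actual negatives flagged as positive."""
    if fp + tn == 0:
        raise ValueError("group has no actual negatives")
    return fp / (fp + tn)


def false_negative_rate(fn: int, tp: int) -> float:
    """FNR = FN / (FN + TP): share of actual positives that were missed."""
    if fn + tp == 0:
        raise ValueError("group has no actual positives")
    return fn / (fn + tp)


def rate_gap_pct(rate_a: float, rate_b: float) -> float:
    """Absolute gap between two subgroup rates, in percentage points."""
    return abs(rate_a - rate_b) * 100.0


# Hypothetical confusion counts for two groups (illustration only).
group_a = {"tp": 80, "fn": 20, "fp": 30, "tn": 170}
group_b = {"tp": 65, "fn": 35, "fp": 54, "tn": 146}

fpr_gap = rate_gap_pct(
    false_positive_rate(group_a["fp"], group_a["tn"]),
    false_positive_rate(group_b["fp"], group_b["tn"]),
)
fnr_gap = rate_gap_pct(
    false_negative_rate(group_a["fn"], group_a["tp"]),
    false_negative_rate(group_b["fn"], group_b["tp"]),
)

print(f"False Positive Rate Gap: {fpr_gap:.2f}%")
print(f"False Negative Rate Gap: {fnr_gap:.2f}%")
```

With more than two groups, one common choice is the gap between the highest and lowest group rate, but the calculator does not say which convention it expects.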

Formula Used

Direct Gap Risks

Representation, false positive, false negative, calibration, label bias, and proxy dependence risks use their entered percentage directly.

Selection Parity Risk

Selection Parity Risk = (1 − Selection Rate Ratio) × 100
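As a minimal sketch of the formula above, the snippet below builds the selection rate ratio from two group selection rates (lower divided by higher) and converts it to a risk value. The function names and the sample rates are hypothetical.

```python
def selection_rate_ratio(rate_1: float, rate_2: float) -> float:
    """Lower group selection rate divided by the higher one (1.0 means parity)."""
    lower, higher = sorted((rate_1, rate_2))
    if lower < 0 or higher > 1:
        raise ValueError("selection rates must lie between 0 and 1")
    if higher == 0:
        raise ValueError("ratio is undefined when neither group is selected")
    return lower / higher


def selection_parity_risk(ratio: float) -> float:
    """Selection Parity Risk = (1 - Selection Rate Ratio) x 100."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie between 0 and 1")
    return (1.0 - ratio) * 100.0


# Hypothetical selection rates: 41% for one group, 50% for the other.
ratio = selection_rate_ratio(0.41, 0.50)
risk = selection_parity_risk(ratio)
print(f"ratio={ratio:.2f}, risk={risk:.2f}")
```

Sorting the two rates first means the order of the arguments does not matter, which keeps the ratio within 0 to 1 as the input field requires.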

Control Risks

Control Risk = 100 − Adequacy Score

Context Risks

Context Risk = ((Score − 1) / 4) × 100, applied separately to regulatory sensitivity and impact severity. A score of 1 maps to 0 and a score of 5 maps to 100.
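The control and context conversions can be sketched as two small functions. The function names and the range checks are assumptions added for illustration; only the two formulas come from this page.

```python
def control_risk(adequacy_pct: float) -> float:
    """Control Risk = 100 - Adequacy Score, for adequacy entered as 0 to 100."""
    if not 0.0 <= adequacy_pct <= 100.0:
        raise ValueError("adequacy must lie between 0 and 100")
    return 100.0 - adequacy_pct


def context_risk(score: int) -> float:
    """Context Risk = ((Score - 1) / 4) x 100, for scores from 1 to 5."""
    if not 1 <= score <= 5:
        raise ValueError("score must lie between 1 and 5")
    return (score - 1) / 4 * 100.0


print(control_risk(70))   # adequacy of 70% leaves a risk of 30.0
print(context_risk(4))    # a score of 4 maps to 75.0
print(context_risk(5))    # a score of 5 maps to 100.0
```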

Overall Weighted Bias Score

Overall Score = Σ (Component Risk × Weight)

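Weighting emphasizes measurable fairness gaps, data quality, feature proxy risk, governance maturity, and contextual exposure. The final score is reported on a 0 to 100 scale, which is consistent with weights that sum to 1, since every component risk also lies between 0 and 100.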
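The weighted sum can be sketched as follows. This page does not list the calculator's actual weights, so the example is reduced to three components with invented weights, and it assumes the weights sum to 1 so the result stays on the 0 to 100 scale.

```python
import math


def overall_bias_score(risks: dict[str, float], weights: dict[str, float]) -> float:
    """Overall Score = sum of (Component Risk x Weight) over all components."""
    if risks.keys() != weights.keys():
        raise ValueError("risks and weights must cover the same components")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1 to keep the score on 0 to 100")
    return sum(risks[name] * weights[name] for name in risks)


# Reduced illustration: three components only, with HYPOTHETICAL weights.
risks = {"selection_parity": 18.0, "human_oversight": 30.0, "impact_severity": 100.0}
weights = {"selection_parity": 0.5, "human_oversight": 0.3, "impact_severity": 0.2}

score = overall_bias_score(risks, weights)
print(f"{score:.2f}")  # 38.00 for these illustrative numbers
```

Rejecting weights that do not sum to 1 is a design choice in this sketch; dividing by the weight total instead would give the same scale without the check.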

How to Use This Calculator

Enter subgroup gap percentages from your fairness evaluation or audit workbook.
Use a selection rate ratio between 0 and 1, where 1 means parity.
Score governance controls as adequacy percentages, not risk percentages.
Choose regulatory sensitivity and impact severity from 1 to 5.
Submit the form to view the overall score, risk breakdown, and chart.
Export the assessment as CSV or PDF for review meetings and documentation.

Example Data Table

Metric	Example Input	Normalized Risk	Comment
Representation Gap	18%	18.00	Moderate representation difference across groups.
Selection Rate Ratio	0.82	18.00	Parity is weakened because the ratio is below 1.00.
False Positive Rate Gap	12%	12.00	Some groups receive more false alarms.
False Negative Rate Gap	15%	15.00	Missed outcomes differ across groups.
Label Bias Risk	20%	20.00	Annotation practices may amplify imbalance.
Proxy Feature Dependence	30%	30.00	Proxy variables need stronger review.
Human Oversight Adequacy	70%	30.00	Human review exists but is not fully robust.
Documentation Completeness	65%	35.00	Documentation is usable but incomplete.
Mitigation Maturity	60%	40.00	Bias controls exist with room to improve.
Monitoring Coverage	55%	45.00	Post-release surveillance is weaker than preferred.
Regulatory Sensitivity	4	75.00	Use case attracts stronger oversight pressure.
Impact Severity	5	100.00	Potential harm from unfair outcomes is very high.
Overall Score	Weighted result	26.26	Guarded risk with stronger context concern.

FAQs

1. What does this calculator measure?

It estimates overall bias exposure in an AI system by combining data imbalance, fairness gaps, governance controls, and contextual severity into one weighted score.

2. Is a low score proof that a model is fair?

No. A low score suggests lower visible risk, but fairness still depends on test design, subgroup coverage, changing data, and ongoing real-world monitoring.

3. Why is the selection rate ratio entered from 0 to 1?

That format expresses parity directly. A value of 1 means equal selection rates, while lower values indicate stronger disparity and therefore higher selection parity risk.

4. Why are oversight and documentation reversed into risk?

They are entered as strengths. Stronger governance reduces exposure, so the calculator converts adequacy into risk by subtracting the adequacy score from 100.

5. Can I use this before model deployment?

Yes. It is especially useful before launch, during model review, or before procurement approval when teams need a structured bias-risk snapshot.

6. Should I use percentages from one dataset only?

Use the best validated figures available. For stronger decisions, compare training, validation, and live monitoring results rather than relying on one dataset alone.

7. What score range should trigger mitigation work?

Many teams begin formal remediation once scores enter the Significant range or when a single driver is extremely high, even if the overall score appears moderate.
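A single-driver check of this kind can be sketched in a few lines. The normalized risks below are copied from the example data table; the threshold of 75 is a hypothetical cutoff, since this page does not define what counts as extremely high.

```python
# Normalized risks copied from the example data table on this page.
example_risks = {
    "Representation Gap": 18.0,
    "Selection Rate Ratio": 18.0,
    "False Positive Rate Gap": 12.0,
    "False Negative Rate Gap": 15.0,
    "Label Bias Risk": 20.0,
    "Proxy Feature Dependence": 30.0,
    "Oversight Adequacy": 30.0,
    "Documentation Completeness": 35.0,
    "Mitigation Maturity": 40.0,
    "Monitoring Coverage": 45.0,
    "Regulatory Sensitivity": 75.0,
    "Impact Severity": 100.0,
}

DRIVER_THRESHOLD = 75.0  # HYPOTHETICAL cutoff, not defined by the calculator


def high_drivers(risks: dict[str, float], threshold: float) -> list[str]:
    """Return components at or above the threshold, highest risk first."""
    flagged = [name for name, risk in risks.items() if risk >= threshold]
    return sorted(flagged, key=lambda name: risks[name], reverse=True)


flagged = high_drivers(example_risks, DRIVER_THRESHOLD)
print(flagged)  # ['Impact Severity', 'Regulatory Sensitivity']
```

In the table's example the overall score is 26.26, yet two context drivers still cross this illustrative threshold, which is the situation the answer above describes.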

8. Can the weights be customized?

Yes. You can edit the weights in the code to match your governance framework, sector rules, or internal materiality thresholds.
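When editing weights, one way to keep the overall score on the 0 to 100 scale is to renormalize them so they sum to 1. The sketch below assumes weights are held in a simple name-to-value mapping; the grouping names and raw values are hypothetical and do not reflect how the calculator's code actually stores its weights.

```python
def normalize_weights(raw: dict[str, float]) -> dict[str, float]:
    """Scale non-negative raw weights so that they sum to 1."""
    if any(value < 0 for value in raw.values()):
        raise ValueError("weights must be non-negative")
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: value / total for name, value in raw.items()}


# HYPOTHETICAL relative priorities, loosely following the groupings
# named in the weighting note (not the calculator's real weights).
raw_weights = {
    "fairness_gaps": 4,
    "data_quality": 2,
    "proxy_risk": 1,
    "governance": 2,
    "context": 1,
}

weights = normalize_weights(raw_weights)
for name, value in weights.items():
    print(f"{name}: {value:.2f}")
```

Entering relative priorities and normalizing afterwards avoids having to hand-balance decimals each time one weight changes.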