Sensitivity Analysis Tool for Model Inputs

Model Setup

Use a simple logistic model to test input sensitivity.

Intercept (bias)

Shifts the baseline score up or down.

Output meaning

The output is a probability between 0 and 1, computed as the sigmoid of the linear score.

Analysis mode

Shows both range and baseline sensitivity.

Example Data Table

A compact sample of inputs you can paste into your own workflow.

Feature	Baseline	Min	Max	Weight
Latency (ms)	120	80	250	-0.012
Error rate	0.03	0.00	0.10	-8.000
Data freshness (days)	4	0	14	-0.180
Feature quality score	0.78	0.50	0.95	4.200
Training set size (k)	80	20	200	0.010

Formula Used

This tool uses the logistic (sigmoid) mapping, a common way to turn a linear score into a probability in classification scoring.

Model score

z = b + Σ (wᵢ · xᵢ)

b is the intercept, wᵢ is the weight of feature i, and xᵢ is that feature's input value.

Probability

p = 1 / (1 + e^(-z))

p is the predicted probability of a positive outcome.
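
As a rough illustration of the two formulas above, the sketch below evaluates z and p at the baseline values from the Example Data Table. The intercept of 0 is an assumption (the page does not state a default), and the names `linear_score` and `sigmoid` are illustrative rather than taken from the tool.

```python
import math

# (baseline x, weight w) copied from the Example Data Table.
FEATURES = {
    "Latency (ms)": (120.0, -0.012),
    "Error rate": (0.03, -8.0),
    "Data freshness (days)": (4.0, -0.18),
    "Feature quality score": (0.78, 4.2),
    "Training set size (k)": (80.0, 0.010),
}
INTERCEPT = 0.0  # assumption: no default intercept is given on the page


def linear_score(values, weights, intercept):
    """z = b + sum(w_i * x_i)."""
    return intercept + sum(w * x for w, x in zip(weights, values))


def sigmoid(z):
    """p = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


baseline_x = [x for x, _ in FEATURES.values()]
weights = [w for _, w in FEATURES.values()]
z = linear_score(baseline_x, weights, INTERCEPT)
p = sigmoid(z)
print(f"z = {z:.3f}, p = {p:.3f}")
```

With these inputs the score works out to z = 1.676, which maps to a probability of about 0.84. A different intercept shifts z by the same amount and changes p accordingly.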

Local sensitivity

∂p/∂xᵢ = wᵢ · p · (1 - p)

Measures near-baseline influence for small changes.
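
One way to sanity-check the local sensitivity formula is to compare it with a finite difference. The sketch below does that for a single feature; the baseline score `z0` is a hypothetical value, and the function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_sensitivity(weight, p):
    """Analytic gradient dp/dx_i = w_i * p * (1 - p)."""
    return weight * p * (1.0 - p)


def finite_difference(weight, z, h=1e-6):
    """Central difference: nudging x_i by +/- h moves z by +/- w_i * h."""
    return (sigmoid(z + weight * h) - sigmoid(z - weight * h)) / (2.0 * h)


z0 = 1.676  # hypothetical baseline score
w = -0.18   # weight of "Data freshness (days)" in the example table

analytic = local_sensitivity(w, sigmoid(z0))
numeric = finite_difference(w, z0)
print(f"analytic = {analytic:.6f}, numeric = {numeric:.6f}")
```

The two numbers should agree to several decimal places. A mismatch usually means the weight or baseline fed into the formula does not match the model being evaluated.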

Range sensitivity

Δ = p(max) - p(min)

Sweeps xᵢ from min to max while others stay fixed.
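
The sweep described above can be sketched as follows. This is a minimal version under stated assumptions: evenly spaced steps that include both endpoints, an intercept of 0, and a two-feature toy input. The function names `sweep` and `range_delta` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sweep(baseline, weights, intercept, index, lo, hi, steps):
    """Return (x, p) pairs as feature `index` moves from lo to hi.

    All other features stay at their baseline values.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    points = []
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        values = list(baseline)
        values[index] = x
        z = intercept + sum(w * v for w, v in zip(weights, values))
        points.append((x, sigmoid(z)))
    return points


def range_delta(points):
    """Delta = p(max) - p(min), read from the last and first sweep points."""
    return points[-1][1] - points[0][1]


# Toy input (hypothetical): sweep feature 0 over [80, 250] in 25 steps.
curve = sweep([120.0, 0.03], [-0.012, -8.0], 0.0, 0, 80.0, 250.0, 25)
delta = range_delta(curve)
print(f"delta = {delta:+.3f}")
```

Because the swept feature has a negative weight, the delta comes out negative: the probability falls as the input rises.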

How to Use

Enter an intercept if your model has a bias term.
Add features with baseline, min, max, steps, and weight.
Run the analysis to get a ranked influence table.
Download CSV for spreadsheets or reporting pipelines.
Download PDF for sharing results in reviews.
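
If you reproduce the analysis in your own pipeline instead of using the download button, a ranked table can be written with Python's standard `csv` module. The column names below are assumptions for illustration and may not match the tool's actual export.

```python
import csv
import io


def write_ranked_csv(rows, stream):
    """Write sensitivity rows to a text stream, ranked by |range_delta|."""
    fieldnames = ["rank", "feature", "local_sensitivity", "range_delta"]
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    ordered = sorted(rows, key=lambda r: abs(r["range_delta"]), reverse=True)
    for rank, row in enumerate(ordered, start=1):
        writer.writerow({"rank": rank, **row})


# Illustrative numbers only, not output from the tool.
rows = [
    {"feature": "feature_a", "local_sensitivity": 0.02, "range_delta": 0.05},
    {"feature": "feature_b", "local_sensitivity": -0.10, "range_delta": -0.30},
]
buffer = io.StringIO()
write_ranked_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing to a stream rather than a fixed path keeps the function usable with files, in-memory buffers, or HTTP responses.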

Practical tip: Choose ranges from real operating conditions, not theoretical extremes.

Why Sensitivity Matters

In applied machine learning, small upstream shifts can create large downstream errors. Sensitivity analysis quantifies which inputs move a prediction the most, helping you focus monitoring and data quality budgets. Often a small fraction of the features accounts for most of the output volatility. Teams often discover that a single operational metric, like latency or error rate, dominates outcome drift during peak traffic.

Choosing Realistic Ranges

Ranges should reflect observed production variation, not theoretical extremes. For example, if a feature quality score usually stays between 0.60 and 0.90, sweeping 0.00 to 1.00 exaggerates risk. Use percentiles from logs, A/B experiments, or recent batch statistics to set min and max. A practical rule is to start with the 5th to 95th percentile (P5 to P95), then tighten to SLA bands for governance reporting.
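
A P5 to P95 range can be derived from logged values with the standard library. The sketch below assumes the observations are already loaded into a list; the sample values are hypothetical, and `percentile_range` is an illustrative name.

```python
import statistics


def percentile_range(samples):
    """Return (P5, P95) of the observed values for use as sweep min and max."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(samples, n=20, method="inclusive")
    return cuts[0], cuts[-1]


# Hypothetical logged values of a feature quality score.
observed = [0.61, 0.66, 0.70, 0.72, 0.75, 0.78, 0.80, 0.83, 0.86, 0.89, 0.97]
lo, hi = percentile_range(observed)
print(f"min = {lo:.3f}, max = {hi:.3f}")
```

The `inclusive` method keeps both bounds inside the observed data, which suits a range meant to reflect real operating conditions.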

Reading Local Gradients

Local sensitivity uses the derivative ∂p/∂xᵢ at the baseline. For the logistic mapping, ∂p/∂xᵢ = wᵢ·p·(1−p). The factor p·(1−p) peaks at p = 0.5 and shrinks toward 0 as p approaches 0 or 1, so the same weight can behave differently across segments. Compare local gradients to understand near-term impact from small perturbations, such as a one-percentage-point rise in error rate or a 10 ms latency bump.
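
To see how one weight behaves at different baselines, the sketch below evaluates the gradient at three probabilities. It uses the error-rate weight from the example table and assumes the perturbation is an absolute rise of 0.01 in error rate.

```python
def local_gradient(weight, p):
    """dp/dx_i = w_i * p * (1 - p) for the logistic mapping."""
    return weight * p * (1.0 - p)


w_error_rate = -8.0  # weight from the example table
bump = 0.01          # assumed perturbation: error rate rises by 0.01

for p in (0.5, 0.9, 0.99):
    gradient = local_gradient(w_error_rate, p)
    # First-order estimate of the probability change for the small bump.
    change = gradient * bump
    print(f"p={p:.2f}  gradient={gradient:+.4f}  approx change={change:+.5f}")
```

At p = 0.5 the gradient is -2.0 per unit of error rate, so the bump moves the probability by roughly -0.02. At p = 0.9 the gradient falls to about -0.72, so the same bump has roughly a third of the effect.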

Interpreting Range Impact

Range impact measures p(max)−p(min) while holding other features fixed. It is intuitive for stakeholders because it converts input uncertainty into probability movement. Ranking by absolute range delta highlights high-leverage variables for guardrails, feature clipping, and fallback logic. If a single feature can move p by 0.15 across its range, it deserves stricter validation than a feature that moves p by 0.01.
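
The ranking by absolute range delta can be sketched with the values from the Example Data Table. The intercept of 0 is an assumption, and the function names are illustrative rather than the tool's own.

```python
import math

# (name, baseline, min, max, weight) copied from the Example Data Table.
FEATURES = [
    ("Latency (ms)", 120.0, 80.0, 250.0, -0.012),
    ("Error rate", 0.03, 0.00, 0.10, -8.0),
    ("Data freshness (days)", 4.0, 0.0, 14.0, -0.18),
    ("Feature quality score", 0.78, 0.50, 0.95, 4.2),
    ("Training set size (k)", 80.0, 20.0, 200.0, 0.010),
]
INTERCEPT = 0.0  # assumption: no default intercept is given on the page


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def probability_with(index, value):
    """p when feature `index` takes `value` and the rest stay at baseline."""
    z = INTERCEPT
    for i, (_, base, _, _, weight) in enumerate(FEATURES):
        z += weight * (value if i == index else base)
    return sigmoid(z)


def ranked_range_deltas():
    """Return (name, p(max) - p(min)) pairs sorted by absolute delta."""
    deltas = []
    for i, (name, _, lo, hi, _) in enumerate(FEATURES):
        deltas.append((name, probability_with(i, hi) - probability_with(i, lo)))
    return sorted(deltas, key=lambda item: abs(item[1]), reverse=True)


ranking = ranked_range_deltas()
for name, delta in ranking:
    print(f"{name:<24} {delta:+.3f}")
```

Under these assumptions data freshness ranks first and error rate ranks last, even though error rate has the largest weight in magnitude. Its range of 0.00 to 0.10 is narrow, so the weight alone overstates its leverage.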

Using Results in Model Governance

Document the top drivers, the chosen ranges, and the expected probability swing. Pair the results with drift thresholds: if a high-impact input shifts by more than its historical band, trigger investigation. Combine sensitivity ranking with feature importance from training to separate causal-like operational levers from spurious correlations. When models affect approvals or pricing, include sensitivity outputs in change reviews and audits.
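
A drift threshold of this kind can be as simple as a band check on the top-ranked inputs. The sketch below is one possible shape, with hypothetical bands and current values; `drift_alerts` is an illustrative name, not part of the tool.

```python
def drift_alerts(current, bands, high_impact):
    """Return high-impact features whose current value is outside its band.

    current:     {feature: latest observed value}
    bands:       {feature: (low, high)} historical band, for example P5 to P95
    high_impact: feature names taken from the top of the sensitivity ranking
    """
    alerts = []
    for name in high_impact:
        low, high = bands[name]
        value = current[name]
        if value < low or value > high:
            alerts.append((name, value, low, high))
    return alerts


# Hypothetical values for illustration.
bands = {"Latency (ms)": (80.0, 250.0), "Error rate": (0.0, 0.10)}
current = {"Latency (ms)": 310.0, "Error rate": 0.04}
alerts = drift_alerts(current, bands, ["Latency (ms)", "Error rate"])
print(alerts)
```

Here only latency is flagged, because 310 ms sits above its band while the error rate stays inside its own.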

Operational Checklist

Update baselines monthly, refresh ranges after releases, and re-run sensitivity for segments. Track the top three drivers in dashboards, add alerts for missingness and outliers, and validate that mitigations reduce the ranked deltas without hurting accuracy. Store CSV exports as evidence and compare runs to quantify stability over time.

FAQs

1) What does “local dY/dX” tell me?

It estimates how much the probability changes for a tiny change in one input at the baseline, with all other inputs held constant. It’s most useful for near-term perturbations and stability checks.

2) Why do I need both local sensitivity and range impact?

Local sensitivity captures small, immediate changes around today’s baseline. Range impact captures worst-to-best movement within your chosen bounds. Together they balance operational realism and stress testing.

3) How should I choose the number of steps?

Use 25–50 steps for smooth curves and stable extrema detection. Increase steps when your range is wide or your stakeholders want finer granularity. Very high steps add time without much extra insight.

4) What if my model is not logistic?

You can still use the range sweep concept by replacing the probability function with your model’s scoring rule. The local derivative formula will differ, but the ranking by range delta remains informative.
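
For a non-logistic model, the sweep only needs a callable that scores a list of inputs. The sketch below assumes such a callable exists; `toy_model` is a hypothetical stand-in chosen to be non-monotonic, which shows why the endpoint delta and the largest swing inside the range can differ.

```python
def sweep_any_model(predict, baseline, index, lo, hi, steps=25):
    """Sweep one input of an arbitrary scoring function (steps must be >= 2).

    `predict` is any callable that maps a list of feature values to a score.
    Returns the endpoint delta and the largest swing seen inside the range.
    """
    outputs = []
    for k in range(steps):
        values = list(baseline)
        values[index] = lo + (hi - lo) * k / (steps - 1)
        outputs.append(predict(values))
    endpoint_delta = outputs[-1] - outputs[0]
    max_swing = max(outputs) - min(outputs)
    return endpoint_delta, max_swing


def toy_model(values):
    """Hypothetical scorer that peaks when the first input equals 0.5."""
    return 1.0 - (values[0] - 0.5) ** 2


endpoint_delta, max_swing = sweep_any_model(toy_model, [0.2], 0, 0.0, 1.0)
print(f"endpoint delta = {endpoint_delta:.3f}, max swing = {max_swing:.3f}")
```

For this toy scorer the endpoint delta is 0 while the swing inside the range is 0.25. With models like this, ranking by the maximum swing may be more informative than ranking by the endpoint difference, and the number of steps matters more.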

5) Can this replace feature importance from training?

No. Feature importance reflects how a model learned from data. Sensitivity reflects how predictions respond to controlled input changes. Use both to separate training signals from operational levers.

6) How do I use outputs for monitoring?

Track the top-ranked inputs, validate their distributions, and alert on shifts beyond historical bands. When drift occurs, rerun the tool with updated baselines to confirm whether the risk profile changed.