Change Point Detection Calculator

Detect shifts in time series signals, compare detection methods, tune sensitivity, and validate results before turning them into decisions.

Inputs

Paste values, choose a method, tune sensitivity, then calculate.
- Method: different methods suit different drift and noise patterns.
- Baseline window W: used to estimate baseline mean and variability.
- Robust mode: reduces outlier influence.
- CUSUM k: higher k ignores small shifts.
- CUSUM h: lower h detects sooner but may false-alarm more.
- Adaptive baseline: re-estimates the baseline from the next window after a change.
- Page–Hinkley δ: small drift allowance before accumulating evidence.
- Page–Hinkley forgetting factor: 0 keeps the full mean history; higher values adapt faster.
- Page–Hinkley threshold: lower values flag changes more aggressively.
- Rolling Z window: look-back window for the rolling mean and sigma.
- Z threshold: typical starting point is 3.0. Tip: Z-score is best for sharp outliers and sudden jumps.
- EWMA alpha: lower alpha smooths more and detects slower shifts.
- EWMA L: lower L increases sensitivity; higher reduces alarms. Best for gradual drifts where raw spikes are less informative.
- Series values: examples: 1 2 3 4 or 1,2,3,4. Scientific notation is accepted.
- Timestamps (optional): if omitted, the table uses numeric indices.

Example data table

This example includes an upward shift and a later return toward baseline.
| Time  | Value   | What happens |
|-------|---------|--------------|
| 1–10  | ≈ 6–8   | Baseline behavior with small noise. |
| 11–20 | ≈ 12–15 | Mean level shifts upward (candidate change point near 11). |
| 21–25 | ≈ 8–9   | Series moves back toward baseline (candidate change point near 21). |
Use “Load Example Data” to auto-fill the input box and test different methods.

Formula used

CUSUM (two-sided)
We track cumulative evidence of a mean shift: S⁺ₜ = max(0, S⁺ₜ₋₁ + (xₜ − μ − k)), S⁻ₜ = min(0, S⁻ₜ₋₁ + (xₜ − μ + k)). A change is flagged when S⁺ₜ > h or |S⁻ₜ| > h.
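The recurrence above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation; the function name and the restart-after-alarm behavior are assumptions:

```python
def cusum_alarms(values, mu, k, h):
    """Two-sided CUSUM: return indices where cumulative evidence exceeds h."""
    s_pos, s_neg = 0.0, 0.0
    alarms = []
    for t, x in enumerate(values):
        s_pos = max(0.0, s_pos + (x - mu - k))  # evidence of an upward shift
        s_neg = min(0.0, s_neg + (x - mu + k))  # evidence of a downward shift
        if s_pos > h or abs(s_neg) > h:
            alarms.append(t)
            s_pos, s_neg = 0.0, 0.0  # assumed: restart accumulation after an alarm
    return alarms
```

With a baseline mean of 10, k = 1, and h = 5, a step from 10 to 15 is flagged one or two points after the shift begins.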
Page–Hinkley
We accumulate deviations from a running mean with drift: PHₜ = PHₜ₋₁ + (xₜ − mₜ − δ). Let minPHₜ = min(minPHₜ₋₁, PHₜ). A change is flagged when PHₜ − minPHₜ > threshold.
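A direct translation of this recurrence, using a running mean for mₜ. Returning only the first alarm index is a simplifying assumption for the sketch:

```python
def page_hinkley(values, delta, threshold):
    """Page-Hinkley for upward mean shifts; return first alarm index or None."""
    mean, ph, min_ph = 0.0, 0.0, 0.0
    for t, x in enumerate(values, start=1):
        mean += (x - mean) / t       # running mean m_t of the first t values
        ph += x - mean - delta       # accumulate deviations with drift allowance
        min_ph = min(min_ph, ph)
        if ph - min_ph > threshold:
            return t - 1             # 0-based index of the alarm
    return None
```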
Rolling Z-score
Using a trailing window of length W, compute zₜ = (xₜ − μ)/σ, where μ and σ are the mean and standard deviation of that window. A change candidate is flagged when |zₜ| ≥ zThreshold.
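A minimal sketch of the rolling Z-score check, assuming the window strictly precedes the current point and a population standard deviation:

```python
from statistics import mean, pstdev

def rolling_z_flags(values, window, z_threshold):
    """Flag indices where |z| against a trailing window exceeds z_threshold."""
    flags = []
    for t in range(window, len(values)):
        baseline = values[t - window:t]          # window of points before t
        mu, sigma = mean(baseline), pstdev(baseline)
        if sigma == 0:
            continue                             # constant window: z undefined
        if abs((values[t] - mu) / sigma) >= z_threshold:
            flags.append(t)
    return flags
```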
EWMA limits
EWMA smooths values: zₜ = αxₜ + (1−α)zₜ₋₁. Control limits are μ ± L·σz, where σz is the estimated standard deviation of the EWMA statistic. A change is flagged when zₜ crosses the limits.
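A sketch of an EWMA control chart. The time-varying σz formula below is the standard textbook expression for the EWMA statistic's standard deviation; treating that as what this calculator estimates is an assumption:

```python
import math

def ewma_alarms(values, mu, sigma, alpha, L):
    """Flag indices where the EWMA statistic leaves the mu +/- L*sigma_z limits."""
    z = mu
    alarms = []
    for t, x in enumerate(values, start=1):
        z = alpha * x + (1 - alpha) * z
        # standard deviation of the EWMA statistic after t observations
        sigma_z = sigma * math.sqrt(alpha / (2 - alpha) * (1 - (1 - alpha) ** (2 * t)))
        if abs(z - mu) > L * sigma_z:
            alarms.append(t - 1)
    return alarms
```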

How to use this calculator

  1. Paste your time series values in the input box.
  2. Pick a method based on your signal behavior.
  3. Set baseline window and method parameters.
  4. Click “Detect Change Points” to compute results.
  5. Review flagged indices and the statistic trend chart.
  6. Export outputs using the CSV or PDF buttons.

Why change points matter in ML monitoring

Production models rarely fail all at once. Performance drifts when traffic mix, data capture, or user intent shifts. Change point detection converts those shifts into actionable indices so teams can investigate before metrics collapse. It is especially useful for latency, CTR, conversion, and feature distribution signals. It also standardizes incident timelines for teams.

Interpreting baseline windows and noise

The baseline window W estimates typical behavior: mean μ and variability σ. A short W reacts quickly but is noisy; a longer W is stable but slower. For weekly patterns, W should cover at least one cycle, and for daily seasonality it should include enough points to average peaks and troughs. Robust MAD-based σ helps when spikes or logging glitches inflate standard deviation, while still preserving sensitivity to sustained shifts.
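The robust MAD-based σ mentioned above can be computed as follows; the 1.4826 factor makes MAD consistent with the standard deviation under normal noise (the function name is illustrative):

```python
from statistics import median

def robust_sigma(values):
    """MAD-based sigma: median absolute deviation scaled for normal data."""
    med = median(values)
    mad = median(abs(x - med) for x in values)
    return 1.4826 * mad
```

On data like [1, 2, 3, 4, 100], the single spike barely moves this estimate, whereas the ordinary standard deviation would be dominated by it.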

Sensitivity controls and expected alert rates

Thresholds trade detection delay for false alarms. In CUSUM, k ignores small shifts and h sets the alarm boundary; lowering h increases sensitivity. Rolling Z uses |z| ≥ zThreshold, where 3σ corresponds to about 0.27% tail probability under normal noise, and 2σ corresponds to about 4.55%. EWMA with smaller α smooths more and highlights gradual drifts, while a smaller L tightens limits and raises alert frequency.
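The quoted tail probabilities can be checked directly: for standard normal noise, P(|Z| ≥ z) equals erfc(z/√2):

```python
import math

def two_sided_tail(z):
    """P(|Z| >= z) for standard normal noise, via the complementary error function."""
    return math.erfc(z / math.sqrt(2))
```

two_sided_tail(3.0) is about 0.0027 (0.27%) and two_sided_tail(2.0) is about 0.0455 (4.55%), matching the figures above.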

From flagged indices to root-cause analysis

A flagged index is a candidate, not a verdict. Compare the value series and the statistic trend, then segment the data before and after the point. Report effect size as Δμ/σ and include confidence ranges from bootstrap resampling when possible. Check whether the shift aligns with deployments, feature toggles, marketing campaigns, outages, or upstream schema changes. For multivariate systems, repeat detection per feature and look for co-occurring breaks across pipelines.

Practical validation and deployment tips

Start with offline backtests using labeled incidents or synthetic shifts. Track precision, recall, and mean time to detection, plus cost per alert for on-call teams. Use multiple methods: CUSUM for abrupt mean shifts, Page–Hinkley for drift with tolerance δ, and EWMA for smooth trends. In streaming setups, run detection on rolling aggregates (for example, 5-minute medians) to reduce noise. Export CSV for dashboards and keep thresholds versioned with your monitoring configuration.
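The rolling-aggregate idea can be as simple as non-overlapping bucket medians before detection (bucket width and function name are illustrative, e.g. five one-minute samples per bucket):

```python
from statistics import median

def bucket_medians(values, width):
    """Collapse a stream into non-overlapping bucket medians to damp spikes."""
    return [median(values[i:i + width]) for i in range(0, len(values), width)]
```

A one-point spike inside a bucket disappears from the aggregated series, so downstream detectors see sustained shifts rather than isolated glitches.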

FAQs

1) What data formats can I paste into the series box?

Paste numbers separated by commas, spaces, semicolons, or new lines. Scientific notation like 1e-3 works. Non-numeric tokens are ignored, so keep timestamps in the optional timestamps box.
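A tokenizer matching the behavior described above might look like this (a sketch; the calculator's actual parser is not shown):

```python
import re

def parse_series(text):
    """Extract numbers split by commas, spaces, semicolons, or newlines."""
    out = []
    for tok in re.split(r"[,;\s]+", text.strip()):
        try:
            out.append(float(tok))   # handles scientific notation like 1e-3
        except ValueError:
            continue                 # non-numeric tokens are ignored
    return out
```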

2) Which method should I pick first?

Start with CUSUM for abrupt mean shifts, EWMA for gradual drift, Rolling Z for isolated spikes, and Page–Hinkley for drift with a small tolerance. If unsure, test two methods and compare stability across thresholds.

3) How do I choose a baseline window W?

Choose W large enough to represent normal behavior and seasonality. For daily cycles, include at least one full day of points; for weekly cycles, include a week. Smaller W reacts faster but increases false alarms.

4) What does robust MAD-based sigma change?

MAD-based sigma reduces the influence of outliers on variability estimates. That helps when occasional spikes would otherwise inflate standard deviation and hide real shifts. It is often better for noisy telemetry and partially missing data.

5) Why are there multiple change points close together?

Some signals shift in steps or oscillate during recovery. Tight thresholds can also trigger repeated detections. Increase h or L, widen W, or disable adaptive baseline to reduce clustering, then confirm with context such as deploy times.
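Beyond widening thresholds, clustered detections can be post-processed by keeping only the first alarm in each burst. A minimal debounce sketch (the minimum-gap rule is an assumption, not a feature of this calculator):

```python
def dedupe_alarms(alarms, min_gap):
    """Keep the first alarm of each cluster; drop alarms closer than min_gap."""
    kept = []
    for a in alarms:
        if not kept or a - kept[-1] >= min_gap:
            kept.append(a)
    return kept
```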

6) Is this a statistical proof of change?

No. It flags candidates based on your assumptions and thresholds. Validate by comparing before/after distributions, checking confounders, and correlating with known events. Use domain knowledge and additional tests before taking corrective action.

Related Calculators

ARIMA Forecast Calculator · GRU Forecast Calculator · Moving Average Forecast · Seasonality Detection Tool · Time Series Decomposition · Auto ARIMA Selector · Forecast Accuracy Calculator · MAPE Error Calculator · RMSE Forecast Error · MAE Error Calculator

Important Note: All calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.