Behavior Anomaly Score Calculator

Spot unusual activity before it escalates. Tune weights, baselines, and context for your environment, then export results to share evidence with security teams.

Enter behavior and baseline data

Provide mean and deviation from your baseline, then the current observation.
Used for labeling, not for math.

Login frequency (per day)
Baseline distribution and current count.
Session duration (minutes)
Typical session length versus current session.
Data transfer per session (MB)
Useful for spotting unusual exports and bulk reads.
Failed sign-in attempts (window)
Elevated counts can indicate credential guessing or session abuse.
Privileged actions (window)
Admin actions, policy changes, role grants, or secrets access.

Context signals
Binary and categorical factors that commonly drive investigations.
Advanced scoring options
Weights are auto-normalized. Use these to match your detection priorities.
Z cap: limits extreme values.
Z scale: the z at which anomaly reaches 1.
Epsilon: prevents divide-by-zero when σ is near zero.
Midpoint: higher values shift scores downward.
Reset

Example data table

These sample scenarios show how scores shift with behavior changes and context signals.
Scenario | Score | Sensitivity | New device | New geo | Threat intel | Data current (MB) | Failed current
Baseline-like | 7.0 | Medium | No | No | None | 130 | 1
New device + new geo | 41.9 | Medium | Yes | Yes | None | 140 | 1
Data spike | 31.2 | High | No | No | Low | 420 | 1
Brute-force pattern | 93.6 | High | Yes | No | Medium | 110 | 9
Intel hit on critical account | 99.9 | Critical | Yes | Yes | High | 260 | 3
Tip: start with conservative weights, then tune using confirmed incidents and false positives.

Formula used

Each numeric metric is transformed into a capped z-score and normalized to an anomaly value between 0 and 1.

z = min( |x - μ| / max(σ, ε), z_cap )
a = min( z / z_scale, 1 )

numeric_sum = Σ ( w_i * a_i )   // weights are normalized to sum to 1
context_sum = new_device + new_geo + mfa_failed + time_of_day + intel_match

base = (numeric_sum + context_sum) * sensitivity_multiplier
score = 100 / ( 1 + e^( -k * (base - midpoint) ) )
        
Interpretation: higher deviation, stronger context, and higher sensitivity increase the score.
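
The formula above can be sketched directly in Python. The defaults for z_cap, z_scale, and eps below are illustrative assumptions; k = 3.2 and midpoint = 1.0 follow the values mentioned in the thresholding notes on this page.

```python
import math

def anomaly(x, mu, sigma, eps=1e-6, z_cap=6.0, z_scale=3.0):
    """Capped z-score normalized to [0, 1]. z_cap and z_scale defaults are
    illustrative assumptions, not the calculator's fixed settings."""
    z = min(abs(x - mu) / max(sigma, eps), z_cap)
    return min(z / z_scale, 1.0)

def score(metrics, weights, context_sum, sensitivity=1.0, k=3.2, midpoint=1.0):
    """metrics: list of (current, mu, sigma) tuples.
    weights are normalized to sum to 1, as described above."""
    total_w = sum(weights)
    norm_w = [w / total_w for w in weights]
    numeric_sum = sum(w * anomaly(x, mu, s)
                      for w, (x, mu, s) in zip(norm_w, metrics))
    base = (numeric_sum + context_sum) * sensitivity
    return 100.0 / (1.0 + math.exp(-k * (base - midpoint)))
```

A metric sitting exactly at its baseline with no context signals produces a low score, while a large deviation plus context and elevated sensitivity pushes the logistic curve toward 100.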

How to use this calculator

  1. Collect baseline mean and deviation for each metric.
  2. Enter the current observation from your alert or log set.
  3. Select context signals such as new device or intel match.
  4. Set account sensitivity to reflect business impact.
  5. Calculate, then review drivers and suggested actions.
  6. Export CSV or PDF to attach to a case record.

Professional notes below help tune scoring and interpret results in operational environments.

Baselining telemetry for reliable drift detection

Build baselines from stable periods and consistent log sources. A practical window is 14–30 days, refreshed weekly, with at least 200 events per metric to reduce noise. Use separate baselines for weekdays versus weekends when usage patterns differ. If a standard deviation is near zero, treat the signal as deterministic and raise epsilon slightly. Store μ and σ per actor type to avoid mixing human, service, and host behavior. Consider per-application baselines for VPN, email, and admin portals.
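
The weekday-versus-weekend split described above can be sketched as follows. The function name, event shape, and the 200-event floor as a hard skip are assumptions for illustration; population statistics are used since the baseline window is treated as the full reference set.

```python
from statistics import mean, pstdev
from datetime import datetime

def build_baselines(events, min_events=200):
    """events: list of (iso_timestamp, value) pairs for one actor and metric.
    Returns separate weekday/weekend baselines, skipping buckets with too
    few events to be reliable (hypothetical helper)."""
    buckets = {"weekday": [], "weekend": []}
    for ts, value in events:
        day = datetime.fromisoformat(ts).weekday()  # Mon=0 ... Sun=6
        buckets["weekend" if day >= 5 else "weekday"].append(value)
    baselines = {}
    for label, values in buckets.items():
        if len(values) < min_events:
            continue  # too noisy: wait until telemetry grows
        baselines[label] = {"mu": mean(values), "sigma": pstdev(values)}
    return baselines
```

The same pattern extends to per-actor-type or per-application baselines by adding more bucket keys.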

Weight tuning to reflect control objectives

Weights are normalized so they always sum to 1, making tuning predictable across teams. Start with balanced weights, then increase data-transfer and privileged-action weights when your primary risk is exfiltration or policy tampering. For identity-focused programs, raise failed sign-ins and login frequency weights. Keep any single weight below 0.40 to prevent one metric dominating. Recalibrate quarterly using confirmed incidents and false-positive reviews. Document each change in a tuning log for audits.
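
The auto-normalization and the 0.40 dominance guideline above can be checked with a short helper. The function name and dict-based interface are assumptions for illustration.

```python
def normalize_weights(weights, cap=0.40):
    """Normalize weights so they sum to 1 and flag any metric whose
    normalized weight exceeds the dominance cap suggested above."""
    total = sum(weights.values())
    norm = {name: w / total for name, w in weights.items()}
    dominant = [name for name, w in norm.items() if w > cap]
    return norm, dominant
```

Because the output is a share of total numeric evidence, reviewers can read each normalized weight as a percentage contribution.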

Context signals that materially raise investigation priority

Binary context flags add direct risk when behavior shifts align with attacker playbooks. New device and new geography each add 0.15, and a failed challenge adds 0.20, reflecting higher compromise likelihood. Time-of-day and intel matches contribute 0.10–0.40 based on confidence. Use “Critical” sensitivity when the actor can access regulated data or production controls, applying a multiplier up to 1.60 to amplify the same anomaly evidence. Validate flags with device and geo inventories.
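
The additive context contributions quoted above can be sketched like this. The function signature is an assumption; the 0.15/0.15/0.20 increments and the 0.10–0.40 confidence range come from the paragraph above, with the caller supplying the confidence-scaled values.

```python
def context_sum(new_device=False, new_geo=False, mfa_failed=False,
                off_hours=0.0, intel=0.0):
    """Sum of binary and confidence-scaled context contributions.
    off_hours and intel should be 0.0, or 0.10-0.40 based on confidence."""
    total = 0.0
    total += 0.15 if new_device else 0.0   # new device
    total += 0.15 if new_geo else 0.0      # new geography
    total += 0.20 if mfa_failed else 0.0   # failed challenge
    total += off_hours                     # time-of-day, confidence-scaled
    total += intel                         # intel match, confidence-scaled
    return total
```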

Thresholding and score bands for triage workflows

The score uses a logistic curve to compress raw evidence into 0–100. With k around 3.2 and midpoint near 1.0, small drift stays low while clustered signals rise quickly. Suggested bands are Low <30, Medium 30–59, High 60–79, and Critical ≥80. Pair bands with playbooks: Medium triggers validation, High triggers containment checks, and Critical starts incident response. Adjust thresholds to meet alert volume targets.
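
The logistic compression and the suggested triage bands can be expressed directly; k = 3.2 and midpoint = 1.0 are the values named above, and the band cutoffs match the suggested ranges.

```python
import math

def logistic_score(base, k=3.2, midpoint=1.0):
    """Compress raw evidence (base) into a 0-100 score."""
    return 100.0 / (1.0 + math.exp(-k * (base - midpoint)))

def band(score):
    """Map a score to the suggested triage bands:
    Low <30, Medium 30-59, High 60-79, Critical >=80."""
    if score >= 80:
        return "Critical"
    if score >= 60:
        return "High"
    if score >= 30:
        return "Medium"
    return "Low"
```

Note that a base exactly at the midpoint lands at 50, the middle of the Medium band, which is why small drift stays low while clustered signals rise quickly.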

Export-ready reporting for case management

Operational teams need repeatable evidence, not just a number. The export includes metric means, deviations, current values, z-scores, normalized anomaly values, and weighted impacts, so reviewers can trace why the score changed. Attach the CSV to tickets for trend analysis, and share the PDF for executive updates. Capture notes like alert IDs and log references to support chain-of-custody and faster peer review.
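
A CSV export carrying the evidence fields listed above might look like the sketch below. The column names are assumptions chosen to mirror the text, not the calculator's actual export schema.

```python
import csv
import io

def export_rows(rows):
    """rows: list of dicts with per-metric evidence fields.
    Returns CSV text suitable for attaching to a case record
    (hypothetical field names)."""
    fields = ["metric", "mu", "sigma", "current", "z", "anomaly", "weighted_impact"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Keeping one row per metric lets reviewers trace exactly which z-scores and weighted impacts moved the final score.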

FAQs

What does the anomaly score represent?

It summarizes how far current behavior deviates from baseline, plus context signals, then scales it to 0–100. Higher scores indicate stronger evidence and higher potential impact, not confirmed compromise.

How should I choose baseline mean and deviation?

Compute μ and σ from clean periods for the same actor type and workload. Use 14–30 days where possible, exclude incident windows, and split baselines by weekday or region if patterns differ.

What if I have limited historical data?

Start with short baselines and conservative thresholds, then tighten as telemetry grows. You can also borrow cohort baselines, such as team-level or role-level averages, but label them clearly to avoid overconfidence.

Why are the weights normalized automatically?

Normalization keeps weights comparable and prevents totals from inflating scores. You can tune priorities without recalculating sums, and reviewers can interpret each metric’s impact as a percentage of numeric evidence.

How do sensitivity and threat intel affect results?

Sensitivity multiplies combined evidence to reflect business risk, while intel adds a direct confidence-based boost. Together they raise urgency for high-impact accounts and known-bad indicators, even when numeric drift is moderate.

Can I use the exports in case workflows?

Yes. CSV supports analysis and trend reviews, and PDF supports sharing. Include notes like ticket IDs, log queries, and timestamps so responders can reproduce the evidence and document decisions.

Related Calculators

Insider Risk Score · Employee Threat Score · User Risk Rating · Credential Misuse Risk · Account Compromise Risk · Malicious Insider Risk · Negligent Insider Risk · Access Abuse Risk · Endpoint Insider Risk · File Access Risk

Important Note: All calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.