Inputs
Enter your baseline and current metrics from the same segment and window.
Example data
| Metric | Baseline | Current |
|---|---|---|
| Query Rate (qps) | 900 | 1550 |
| Std Dev (qps) | 120 | — |
| Unique Domains / min | 820 | 1400 |
| NXDOMAIN Rate (fraction of queries) | 0.03 | 0.14 |
| Avg Query Length (chars) | 18 | 29 |
| Domain Entropy (bits/char) | 3.1 | 4.6 |
Formulas used
- Volume z-score: z = (current_qps − baseline_qps) / baseline_std
- Unique shift: Δunique% = (current_unique − baseline_unique) / baseline_unique
- NXDOMAIN shift: Δnx = current_nx − baseline_nx
- Length shift: Δlen% = (current_len − baseline_len) / baseline_len
- Entropy shift: ΔH = current_entropy − baseline_entropy
Each feature is converted to a subscore between 0 and 1 using capped scaling. The final score is the weighted sum of the subscores, multiplied by 100.
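The capped-scaling and weighted-sum steps can be sketched as follows. The caps and weights below are illustrative assumptions, not the calculator's actual tuning values; substitute your own after reviewing incident outcomes.

```python
def clamp01(x):
    """Clip a value into the 0-1 range (the 'cap' in capped scaling)."""
    return max(0.0, min(1.0, x))

def anomaly_score(base, cur):
    """base/cur are dicts of window metrics; base additionally carries 'std'."""
    # Feature shifts, per the formulas listed above
    z      = (cur["qps"] - base["qps"]) / base["std"]
    d_uniq = (cur["unique"] - base["unique"]) / base["unique"]
    d_nx   = cur["nx"] - base["nx"]
    d_len  = (cur["length"] - base["length"]) / base["length"]
    d_h    = cur["entropy"] - base["entropy"]

    # Capped 0-1 subscores (all caps are assumptions)
    subs = {
        "volume":   clamp01(z / 4.0),      # cap at z = 4
        "unique":   clamp01(d_uniq),       # cap at +100%
        "nxdomain": clamp01(d_nx / 0.25),  # cap at +25 percentage points
        "length":   clamp01(d_len),        # cap at +100%
        "entropy":  clamp01(d_h / 2.0),    # cap at +2 bits
    }
    weights = {"volume": 0.20, "unique": 0.20, "nxdomain": 0.25,
               "length": 0.15, "entropy": 0.20}  # sum to 1.0
    return 100.0 * sum(weights[k] * subs[k] for k in subs)

# Values from the example table
baseline = {"qps": 900, "std": 120, "unique": 820,
            "nx": 0.03, "length": 18, "entropy": 3.1}
current = {"qps": 1550, "unique": 1400,
           "nx": 0.14, "length": 29, "entropy": 4.6}
score = anomaly_score(baseline, current)
```

With these assumed caps and weights, the example data lands in the upper-middle of the 0-100 range, driven mostly by the fully capped volume subscore.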
How to use this calculator
- Choose a stable baseline window for the same resolver segment.
- Compute baseline averages and standard deviation for query rate.
- Measure current metrics over an equivalent time window.
- Submit values to get a weighted anomaly score and risk band.
- Investigate top drivers using the cues and your DNS logs.
Professional notes
Baseline integrity and windowing
Accurate anomaly scoring starts with comparable measurement windows. Build baselines per resolver, network segment, and business schedule, because lunch-hour traffic differs from overnight maintenance. Use the same aggregation period for baseline and current inputs, and refresh baselines after topology, policy, or application changes. Track seasonality and planned events so expected surges do not inflate risk. When uncertainty exists, widen the baseline sample and store standard deviation for stability. Document data sources and ensure query sampling excludes internal test generators and scanners.
Volume deviation as early warning
Query-rate z-scores highlight sudden load changes that may signal misconfiguration, denial patterns, or malware beacons. A z-score uses baseline mean and standard deviation to express how unusual the current rate is. High positive values suggest bursts; large negative values can indicate outages, sinkholing, or blocked egress. Pair volume changes with top talkers and query types to separate legitimate peaks, like patch windows, from suspicious spikes.
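The z-score computation is a one-liner; the example below uses the table values above, plus an invented low rate to show the negative (outage-like) case:

```python
def volume_z(current_qps, baseline_mean, baseline_std):
    """Standard score of the current query rate against the baseline."""
    return (current_qps - baseline_mean) / baseline_std

# From the example table: burst case
z_spike = volume_z(1550, 900, 120)  # roughly +5.4, a strong burst signal

# Hypothetical drop: large negative z can mean outage or blocked egress
z_drop = volume_z(300, 900, 120)    # -5.0
```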
NXDOMAIN spikes and domain generation
NXDOMAIN rate shifts are strong indicators of failed lookups, often linked to typosquatting probes, blocked domains, or domain generation algorithms. Compare current NXDOMAIN rate to baseline and treat large percentage-point changes as high-value signals. Pivot on client IP, user identity, and first-seen domains, then inspect repeated random subdomains. Validate whether upstream filtering, new blocklists, or split-horizon records could explain the increase.
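Computing the NXDOMAIN rate from raw resolver logs can be sketched as below; the `(qname, rcode)` tuple shape and rcode strings are assumptions to adapt to your log schema:

```python
def nxdomain_rate(records):
    """Fraction of queries in a window that returned NXDOMAIN.
    records: iterable of (qname, rcode) tuples; field layout is illustrative."""
    records = list(records)
    nx = sum(1 for _, rcode in records if rcode == "NXDOMAIN")
    return nx / len(records)

window = [
    ("app.example.com", "NOERROR"),
    ("kq3zx9.example.net", "NXDOMAIN"),
    ("mail.example.com", "NOERROR"),
    ("v7w2pd.example.net", "NXDOMAIN"),
]
rate = nxdomain_rate(window)  # 0.5 for this toy window
```

Compute the same quantity for the baseline window and compare the two as a percentage-point shift, as in the formulas above.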
Entropy and length indicators
Random-looking domains tend to raise entropy metrics, while tunneling and exfiltration often increase average query length. Entropy captures character unpredictability; length captures payload capacity. Combine both with domain diversity to spot algorithmic patterns that evade reputation systems. Investigate unusually long names, TXT queries, and high-cardinality subdomains. Correlate with time-of-day, endpoint telemetry, and egress controls to confirm whether activity is benign automation or covert transport.
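One way to combine the two indicators is a simple screening heuristic that flags only names that are both long and high-entropy. Everything here, including both thresholds, is an illustrative assumption to tune per segment:

```python
import math
from collections import Counter

def char_entropy(s: str) -> float:
    """Shannon entropy, in bits per character, of the string's character distribution."""
    counts = Counter(s.lower())
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def tunnel_suspect(qname: str, len_cap: int = 40, entropy_cap: float = 4.0) -> bool:
    """Flag names that are long AND high-entropy for manual review.
    Both thresholds are illustrative assumptions, not recommended defaults."""
    return len(qname) > len_cap and char_entropy(qname) > entropy_cap
```

Requiring both conditions reduces noise from long but predictable names (e.g. CDN hostnames) and from short random labels that lack tunneling capacity on their own.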
Turning scores into response actions
A weighted score provides triage, not attribution. Use the risk band to prioritize reviews, then let the top drivers guide investigation steps. For elevated results, sample logs and confirm baseline freshness. For high or critical results, isolate top talkers, block suspicious domains, and check for lateral spread. Record outcomes to tune caps and weights for your environment, reducing false positives while keeping sensitivity to genuine threats.
FAQs
1) What inputs should I use for the baseline?
Use averages from a stable period for the same resolver group: query rate, standard deviation, unique domains per minute, NXDOMAIN rate, average query length, and domain entropy. Keep the window length consistent with the current measurement.
2) How do I estimate domain entropy quickly?
Export queried domains, normalize to lowercase, and calculate Shannon entropy over characters for each domain or label. Average across the window. Keep the method identical for baseline and current values so the comparison stays meaningful.
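The steps above can be sketched directly; this computes per-domain character entropy and averages it over a window:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character of one domain string."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def window_avg_entropy(domains):
    """Normalize to lowercase, score each domain, average across the window."""
    normalized = [d.lower() for d in domains]
    return sum(shannon_entropy(d) for d in normalized) / len(normalized)
```

Run the same function over the baseline export and the current export so the comparison stays method-consistent.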
3) Does a high score always mean an attack?
No. High scores mean the current window differs from your baseline. Planned maintenance, new applications, blocklist changes, or resolver outages can trigger anomalies. Validate with first-seen domains, client identity, and correlated security telemetry.
4) What score threshold should trigger escalation?
Start by alerting at Elevated and paging at High or above, then tune using incident reviews. Environments with bursty traffic may need higher caps. Track false positives and adjust weights per segment instead of globally.
5) Why include query length and entropy together?
Length suggests capacity for tunneling, while entropy suggests randomness often seen in algorithmic domains. Together they reduce blind spots: long but predictable domains can be benign, and random short domains may still be suspicious when diversity rises.
6) Can I use this for multiple locations or tenants?
Yes. Maintain separate baselines per location, tenant, or resolver pool. Mixing segments hides localized incidents and inflates variance. Store baselines with timestamps, and refresh after policy, routing, or workload changes.
Note: This tool supports triage, not attribution. Validate with client identity, query types, and threat intelligence.