| Scenario | GB/day | Hot days | Archive days | Sampling | Template | Monthly total |
|---|---|---|---|---|---|---|
| Lean troubleshooting | 15 | 7 | 30 | 20% | Standard (example) | $257.53 USD |
| Balanced observability | 50 | 14 | 60 | 10% | Standard (example) | $910.48 USD |
| High-volume platform | 250 | 30 | 180 | 5% | High-performance (example) | $6,993.52 USD |
- Raw_GB_Month = GB_per_Day × Days_per_Month
- Effective_GB_Month = Raw_GB_Month × (1−Sampling) × (1−Dedup) × Overhead
- Indexed_GB_Month = Effective_GB_Month × Indexed_Portion
- Parsed_GB_Month = Effective_GB_Month × Structured_Logs
- Daily_Effective_GB = Effective_GB_Month / Days_per_Month
- Storage_GB_Month = (Daily_Effective_GB / Compression) × Retention_Days (applied per tier with its own retention days to give Hot_GB_Month and Archive_GB_Month)
- Export_GB_Month = Effective_GB_Month × Exported_Portion
- Events_Month ≈ (Effective_GB_Month × 1,073,741,824) / Avg_Event_Bytes
- Ingestion = Effective_GB_Month × Ingestion_Rate
- Indexing = Indexed_GB_Month × Indexing_Rate
- Parsing = Parsed_GB_Month × Parsing_Rate
- Hot storage = Hot_GB_Month × Hot_Storage_Rate
- Archive storage = Archive_GB_Month × Archive_Storage_Rate
- Egress = Export_GB_Month × Egress_Rate
- Metrics = (Events_Month / 1,000,000) × Metrics_Rate
- Total = Sum(variable costs) + Fixed monthly
- Enter your daily log volume and the number of days in your billing month.
- Adjust sampling, deduplication, overhead, and compression to match your pipeline.
- Set retention for hot and archive tiers based on access patterns and compliance.
- Pick a pricing template, then replace the unit rates with your provider’s prices.
- Click Calculate. Review the breakdown and recommendations, then export CSV or PDF.
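The formulas listed above can be sketched in Python as a single function. This is a minimal illustration, not the calculator's actual implementation: the `Inputs` dataclass, the `monthly_cost` function, and the rate dictionary keys are hypothetical names, and every rate and several inputs in the usage lines are placeholder values. It also assumes that Hot_GB_Month and Archive_GB_Month come from applying the storage formula once per tier with that tier's retention days.

```python
from dataclasses import dataclass

BYTES_PER_GB = 1_073_741_824  # the Events_Month formula uses 2**30 bytes per GB


@dataclass
class Inputs:
    gb_per_day: float
    days_per_month: int
    sampling: float          # fraction of volume dropped, e.g. 0.10
    dedup: float             # fraction removed as duplicates
    overhead: float          # multiplier, e.g. 1.08
    indexed_portion: float
    structured_logs: float
    exported_portion: float
    compression: float       # ratio, e.g. 3.0 for 3x
    hot_days: int
    archive_days: int
    avg_event_bytes: float


def monthly_cost(x: Inputs, rates: dict, fixed_monthly: float = 0.0) -> dict:
    """Return each cost component plus the total, following the listed formulas."""
    raw = x.gb_per_day * x.days_per_month
    effective = raw * (1 - x.sampling) * (1 - x.dedup) * x.overhead
    daily_effective = effective / x.days_per_month
    # Assumption: the storage formula is evaluated once per tier.
    hot_gb_month = daily_effective / x.compression * x.hot_days
    archive_gb_month = daily_effective / x.compression * x.archive_days
    events = effective * BYTES_PER_GB / x.avg_event_bytes
    costs = {
        "ingestion": effective * rates["ingestion"],
        "indexing": effective * x.indexed_portion * rates["indexing"],
        "parsing": effective * x.structured_logs * rates["parsing"],
        "hot_storage": hot_gb_month * rates["hot_storage"],
        "archive_storage": archive_gb_month * rates["archive_storage"],
        "egress": effective * x.exported_portion * rates["egress"],
        "metrics": events / 1_000_000 * rates["metrics"],
    }
    costs["total"] = sum(costs.values()) + fixed_monthly
    return costs


# Placeholder inputs and rates for illustration only; replace with your own.
example = Inputs(gb_per_day=50, days_per_month=30, sampling=0.10, dedup=0.05,
                 overhead=1.08, indexed_portion=0.35, structured_logs=0.60,
                 exported_portion=0.05, compression=3.0, hot_days=14,
                 archive_days=60, avg_event_bytes=800)
placeholder_rates = {"ingestion": 0.30, "indexing": 0.20, "parsing": 0.05,
                     "hot_storage": 0.10, "archive_storage": 0.01,
                     "egress": 0.09, "metrics": 0.001}

for name, value in monthly_cost(example, placeholder_rates, fixed_monthly=50.0).items():
    print(f"{name:>16}: {value:,.2f}")
```

Because the rates are placeholders, the printed totals will not match the example table; substitute your provider's unit prices before comparing.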
What Drives Ingestion Spend
Ingestion fees scale with effective gigabytes, not raw telemetry. After edge filtering, sampling, and deduplication, many teams land at 60–90% of raw volume. Metadata overhead (labels, tenant tags, JSON envelopes) often adds 5–15%; use the calculator’s overhead factor to reflect that uplift. A 50 GB/day stream with 10% sampling and 5% dedup drops to 42.75 GB/day before overhead, or about 46 GB/day effective once an overhead factor near 1.08 is applied. For multitenant setups, apply per-team caps and routing rules to keep noisy services from quickly consuming shared monthly budgets.
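The 50 GB/day example can be checked with a few lines of Python. The function name `effective_gb_per_day` is hypothetical, and the 1.08 overhead factor is an assumption chosen from the 5–15% range because it reproduces the roughly 46 GB/day figure.

```python
def effective_gb_per_day(raw_gb_per_day: float, sampling: float,
                         dedup: float, overhead: float) -> float:
    """Apply sampling, deduplication, and overhead to a raw daily volume."""
    return raw_gb_per_day * (1 - sampling) * (1 - dedup) * overhead


# 10% sampling and 5% dedup, first without overhead, then with an assumed 1.08.
before_overhead = effective_gb_per_day(50, 0.10, 0.05, 1.00)
with_overhead = effective_gb_per_day(50, 0.10, 0.05, 1.08)
print(round(before_overhead, 2), round(with_overhead, 2))  # 42.75 46.17
```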
Indexing and Parsing Multipliers
Indexing and parsing usually apply to subsets of data. If 35% of logs are indexed, index cost follows that share, but expensive fields can behave like a multiplier when they expand stored structures. Structured parsing is common for JSON and can represent 50–80% of volume in modern apps. Keep schemas stable and avoid parsing rarely queried keys. Dropping indexed portion from 35% to 20% is a direct 43% indexing reduction.
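The 43% figure follows from the relative change in indexed portion, since indexing cost is proportional to it. A small sketch, with `relative_reduction` as a hypothetical helper name:

```python
def relative_reduction(old_portion: float, new_portion: float) -> float:
    """Fractional drop in indexing cost when the indexed portion shrinks."""
    if old_portion <= 0:
        raise ValueError("old_portion must be positive")
    return (old_portion - new_portion) / old_portion


print(f"{relative_reduction(0.35, 0.20):.0%}")  # 43%
```

This holds only while the indexing rate and effective volume stay constant; it does not capture the multiplier effect of expensive fields mentioned above.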
Retention, Compression, and Storage Tiers
Retention drives storage in GB-months. For steady ingestion, average stored data equals daily effective GB divided by compression, multiplied by retention days. Compression of 2–5× is typical; highly repetitive logs compress better than traces. Hot tiers are optimized for low-latency search, so keep them short, often 7–30 days. Archive tiers can cover 90–365 days for compliance at a lower GB-month rate.
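The steady-state relationship above can be sketched as follows. The function name `stored_gb_month` is hypothetical, and the inputs (46 GB/day effective, 3× compression, 14 hot days, 60 archive days) are illustrative values rather than recommendations.

```python
def stored_gb_month(daily_effective_gb: float, compression: float,
                    retention_days: int) -> float:
    """Average stored volume for steady ingestion, in GB-months."""
    if compression <= 0:
        raise ValueError("compression must be positive")
    return daily_effective_gb / compression * retention_days


hot = stored_gb_month(46.0, 3.0, 14)      # about 214.67 GB-months
archive = stored_gb_month(46.0, 3.0, 60)  # 920 GB-months
print(round(hot, 2), round(archive, 2))
```

Multiplying each result by its tier's GB-month rate gives the hot and archive storage components.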
Exports, Egress, and Downstream Pipelines
Exports create both extra volume and transfer charges. Common patterns include forwarding 1–10% to a SIEM, exporting full datasets to object storage, or streaming to a lakehouse. If exports are frequent, batch them and compress payloads to reduce egress. Keeping analytics in-region also helps. In the calculator, the exported portion scales Export_GB_Month, which is then multiplied by the egress rate, so doubling exports from 5% to 10% doubles that component.
Benchmarking and Scenario Planning
Scenario planning is where cost models pay off. Build at least three cases: normal load, incident load, and growth target. Many platforms see 2–4× spikes during outages, so adjust GB/day and retention to reflect investigations. Compare cost per raw GB for chargeback and cost per effective GB for vendor benchmarking. Use the breakdown to identify top drivers, then test one change at a time.
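One way to compare the three cases is to run the same simplified model at different daily volumes and report both unit costs. Everything here is a sketch under stated assumptions: `scenario_cost` is a hypothetical helper, the blended rate of 0.50 per effective GB stands in for all variable rates combined, the 3× incident multiplier is picked from the 2–4× range above, the 1.5× growth multiplier is arbitrary, and the incident spike is applied to the whole month as an upper bound.

```python
def scenario_cost(gb_per_day: float, days: int = 30, sampling: float = 0.10,
                  dedup: float = 0.05, overhead: float = 1.08,
                  blended_rate: float = 0.50, fixed_monthly: float = 0.0) -> dict:
    """Monthly total plus cost per raw GB and per effective GB."""
    raw = gb_per_day * days
    effective = raw * (1 - sampling) * (1 - dedup) * overhead
    total = effective * blended_rate + fixed_monthly
    return {
        "raw_gb": raw,
        "effective_gb": effective,
        "total": total,
        "per_raw_gb": total / raw,              # useful for chargeback
        "per_effective_gb": total / effective,  # useful for vendor benchmarking
    }


baseline = 50  # GB/day, placeholder
cases = {"normal": baseline, "incident": baseline * 3, "growth": baseline * 1.5}
for name, gb_per_day in cases.items():
    result = scenario_cost(gb_per_day)
    print(f"{name:>8}: total={result['total']:,.2f} "
          f"per_raw={result['per_raw_gb']:.4f} "
          f"per_effective={result['per_effective_gb']:.4f}")
```

With no fixed monthly charge, both unit costs stay constant across cases; adding a fixed charge makes them fall as volume grows, which is worth checking before using them for chargeback.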
FAQs
Does sampling reduce cost linearly?
Mostly yes. It lowers effective GB, which drives ingestion, indexing, parsing, storage, and exports. Keep full fidelity for security, billing, and rare incident sources when required.
Why do hot and archive retention affect cost differently?
Hot tiers are priced for fast search and frequent queries. Archive tiers trade latency for lower GB-month pricing. Split retention by access needs to reduce total storage spend.
How should I pick the ingestion overhead multiplier?
Start with 1.05–1.15 for added tags, metadata, and wrappers. Validate by comparing raw emitter size versus delivered payload size at the collector or agent.
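The validation step can be sketched as a ratio of measured byte counts. The function name `overhead_multiplier` and the sample byte counts are hypothetical; how you obtain the two measurements depends on your collector or agent.

```python
def overhead_multiplier(raw_emitter_bytes: int, delivered_bytes: int) -> float:
    """Ratio of delivered payload size to raw emitter size for the same logs."""
    if raw_emitter_bytes <= 0:
        raise ValueError("raw_emitter_bytes must be positive")
    return delivered_bytes / raw_emitter_bytes


# Example: 1,000,000 bytes emitted, 1,080,000 bytes delivered after tagging.
print(round(overhead_multiplier(1_000_000, 1_080_000), 2))  # 1.08
```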
What does “indexed portion” mean in practice?
It represents the share of logs where searchable indexes are built. Index only fields you query often. Lowering indexed portion cuts indexing cost directly and can improve query performance.
My provider charges per query or compute. How do I model that?
Add those charges into “Fixed monthly” or adjust rates upward to reflect average compute per GB. Collect a month of billing data, then tune rates until the model matches the invoice.
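One simple way to tune rates is to scale them all by the same factor so the modeled variable total matches the billed variable total. This is only one possible approach, not something the calculator is known to do; `calibrate_rates` and the sample numbers are hypothetical.

```python
def calibrate_rates(rates: dict, modeled_variable_total: float,
                    billed_total: float, fixed_monthly: float = 0.0) -> dict:
    """Uniformly scale unit rates so the model reproduces a billed total."""
    if modeled_variable_total <= 0:
        raise ValueError("modeled_variable_total must be positive")
    factor = (billed_total - fixed_monthly) / modeled_variable_total
    return {name: rate * factor for name, rate in rates.items()}


# Model predicted 800.00 of variable cost; the invoice was 1,010.00 with 50.00 fixed.
tuned = calibrate_rates({"ingestion": 0.50, "indexing": 0.20},
                        modeled_variable_total=800.0,
                        billed_total=1010.0, fixed_monthly=50.0)
print({name: round(rate, 4) for name, rate in tuned.items()})
```

Uniform scaling spreads compute charges across every component; if query cost tracks one component more closely, adjusting only that rate may fit better.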
Can I use this for multi-account or multi-region setups?
Yes. Run separate scenarios for each region or account, then sum monthly totals. Use different egress rates when logs cross regions or leave the provider network.
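Summing per-region scenarios with different egress rates might look like the sketch below. The dictionary keys, region names, and all figures are hypothetical; `base_total` stands for a region's monthly total excluding egress.

```python
def combined_total(scenarios: list) -> float:
    """Sum per-region totals, applying each region's own egress rate."""
    return sum(s["base_total"] + s["export_gb"] * s["egress_rate"]
               for s in scenarios)


scenarios = [
    # Logs stay in-region: lower placeholder egress rate.
    {"name": "region-a", "base_total": 900.0, "export_gb": 70.0, "egress_rate": 0.02},
    # Logs cross regions: higher placeholder egress rate.
    {"name": "region-b", "base_total": 400.0, "export_gb": 70.0, "egress_rate": 0.09},
]
print(round(combined_total(scenarios), 2))
```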