Cluster Sizing Calculator

Size clusters from traffic, latency, and SLAs. Compare node shapes, set utilization targets quickly, and export results to share sizing plans with your team.

Calculator inputs

Fill the fields, then press Calculate. Leave concurrency as 0 to auto-estimate from RPS and p95 latency.
Workload
Payload & network
Network is estimated from (request + response) payload at peak RPS, then adjusted by safety and growth buffers.
Storage
Overhead can include indexing, metadata, and compaction headroom.
Node shape
Utilization targets
Overhead accounts for OS, agents, sidecars, and runtime services.
Buffers & availability
Buffers model bursts and forecast growth. N+1 helps absorb node loss.

Example data

Use this as a starting point, then tune with real measurements.
| Scenario | Peak RPS | CPU ms/req | p95 latency (ms) | Node shape | Buffers | Result nodes |
|---|---|---|---|---|---|---|
| API + cache | 750 | 12 | 180 | 8 vCPU / 32 GB | Safety 20%, Growth 15%, N+1 | 6 |
| Batch + worker | 120 | 55 | 600 | 16 vCPU / 64 GB | Safety 25%, Growth 10% | 3 |
| Stateful store | 350 | 18 | 250 | 8 vCPU / 32 GB / 2 TB | Rep 3×, Retention 14d | 8 |
Example rows are illustrative; your results will vary by workload and headroom needs.

Formula used

1) Concurrency (optional auto-estimate)

If concurrency is set to 0, the calculator uses: Concurrency ≈ ceil(RPS × p95LatencySeconds).
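This auto-estimate is essentially Little's Law (in-flight requests ≈ arrival rate × time in system). A minimal sketch in Python; the function name and the 750 RPS / 180 ms figures are illustrative, borrowed from the example table:

```python
import math

def estimate_concurrency(rps: float, p95_latency_ms: float) -> int:
    """Little's Law approximation: in-flight requests ≈ arrival rate × latency."""
    return math.ceil(rps * (p95_latency_ms / 1000.0))

# 750 RPS at p95 = 180 ms → ceil(750 × 0.18) = 135 in-flight requests
print(estimate_concurrency(750, 180))  # → 135
```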

2) CPU cores required

CPUcores = (RPS × CPUms/1000) / (TargetCPU × (1−CPUOverhead))

Then it applies buffers: × (1+Safety) × (1+Growth).
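Putting the CPU formula and buffers together as a sketch; the default utilization, overhead, and buffer values below are illustrative assumptions, not the calculator's actual defaults:

```python
def cpu_cores_required(rps: float, cpu_ms_per_req: float,
                       target_cpu: float = 0.65, cpu_overhead: float = 0.10,
                       safety: float = 0.20, growth: float = 0.15) -> float:
    # Raw cores: total CPU-seconds per second, scaled by usable capacity
    # after the utilization target and per-node overhead are applied.
    raw = (rps * cpu_ms_per_req / 1000.0) / (target_cpu * (1 - cpu_overhead))
    # Then the safety and growth buffers.
    return raw * (1 + safety) * (1 + growth)

# 750 RPS × 12 ms/req, 65% target, 10% overhead, 20% + 15% buffers
print(round(cpu_cores_required(750, 12), 2))  # ≈ 21.23 cores
```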

3) Memory required

InflightGB = (Concurrency × MBperInflight) / 1024

MemGB = (InflightGB + WorkingSetGB) / (TargetMem × (1−MemOverhead))

Then it applies the same buffers.
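The memory steps above can be sketched the same way; the working-set size and per-inflight-request memory used in the example are hypothetical inputs:

```python
def memory_gb_required(concurrency: int, mb_per_inflight: float, working_set_gb: float,
                       target_mem: float = 0.70, mem_overhead: float = 0.15,
                       safety: float = 0.20, growth: float = 0.15) -> float:
    # Memory held by in-flight requests, converted from MB to GB.
    inflight_gb = concurrency * mb_per_inflight / 1024.0
    # Divide by usable capacity after the utilization target and overhead.
    raw = (inflight_gb + working_set_gb) / (target_mem * (1 - mem_overhead))
    return raw * (1 + safety) * (1 + growth)

# 135 in-flight × 4 MB each, 8 GB working set, 70% target, 15% overhead
print(round(memory_gb_required(135, 4, 8), 2))  # ≈ 19.78 GB
```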

4) Storage required

StorageGB = IngestPerDay × RetentionDays × (1+StorageOverhead) × Replication, then apply buffers.
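For example, using the stateful-store row's replication and retention (other inputs are hypothetical), the storage formula works out as:

```python
def storage_gb_required(ingest_gb_per_day: float, retention_days: int,
                        storage_overhead: float = 0.25, replication: int = 3,
                        safety: float = 0.20, growth: float = 0.15) -> float:
    # Retained data, inflated by indexing/compaction overhead and replication.
    raw = ingest_gb_per_day * retention_days * (1 + storage_overhead) * replication
    return raw * (1 + safety) * (1 + growth)

# 10 GB/day × 14 days × 1.25 overhead × 3 replicas, then 20% + 15% buffers
print(round(storage_gb_required(10, 14), 1))  # ≈ 724.5 GB
```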

5) Network required

Mbps = RPS × (ReqKB+RespKB) × 8 / 1024, then apply buffers.
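The ×8 converts kilobytes to kilobits, and /1024 converts to megabits. A sketch with hypothetical 2 KB request / 30 KB response payloads:

```python
def network_mbps_required(rps: float, req_kb: float, resp_kb: float,
                          safety: float = 0.20, growth: float = 0.15) -> float:
    # KB/s × 8 → Kbps, / 1024 → Mbps (binary prefix, matching the formula above).
    raw = rps * (req_kb + resp_kb) * 8 / 1024.0
    return raw * (1 + safety) * (1 + growth)

# 750 RPS × (2 + 30) KB → 187.5 Mbps raw, then 20% + 15% buffers
print(round(network_mbps_required(750, 2, 30), 2))  # ≈ 258.75 Mbps
```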

The recommended node count is the largest of the per-resource node counts (CPU, memory, storage, network), with availability rules such as N+1 applied afterward.
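Combining the dimensions into a final node count can be sketched as follows; the node shape and per-node network capacity here are illustrative assumptions:

```python
import math

def nodes_required(cpu_cores: float, mem_gb: float, storage_gb: float, net_mbps: float,
                   node_cpu: float = 8, node_mem_gb: float = 32,
                   node_storage_gb: float = 2048, node_net_mbps: float = 1000,
                   n_plus_one: bool = True) -> int:
    # Nodes needed to satisfy each resource dimension independently.
    per_dimension = [
        math.ceil(cpu_cores / node_cpu),
        math.ceil(mem_gb / node_mem_gb),
        math.ceil(storage_gb / node_storage_gb),
        math.ceil(net_mbps / node_net_mbps),
    ]
    # The binding constraint wins; then apply the availability rule.
    nodes = max(per_dimension)
    return nodes + 1 if n_plus_one else nodes

# CPU is the binding constraint here (ceil(21.23 / 8) = 3), plus one for N+1.
print(nodes_required(21.23, 19.78, 724.5, 258.75))  # → 4
```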

How to use this calculator

  1. Enter peak traffic (RPS) and a realistic p95 latency target.
  2. Use measured CPU time per request from profiling or tracing.
  3. Set concurrency to 0 to auto-estimate, or enter a known value.
  4. Choose a node shape that matches your provider or on-prem hardware.
  5. Set utilization targets to avoid sustained saturation.
  6. Add buffers for load spikes and forecast growth.
  7. Press Calculate to see nodes and resource totals.
  8. Export CSV or PDF to share sizing decisions.

FAQs

1) Is this sizing exact?

No. It’s a planning baseline that uses simplified resource models. Validate with load tests, real traces, and production dashboards, then adjust node shape, overhead, and buffers accordingly.

2) Why does concurrency matter?

Concurrency drives inflight memory and can expose contention. If latency rises under load, concurrency increases, which raises memory and CPU needs. Using p95 latency helps capture typical peak behavior.

3) What should I put for CPU time per request?

Use measured on-CPU time from profiling or distributed tracing at steady load. Avoid local dev numbers. If workloads vary, use a weighted average or size for the most expensive critical endpoints.

4) How do I estimate overhead?

Include OS, agents, sidecars, runtime services, and reserved capacity. In many clusters, 5–20% CPU and 10–25% memory is common, but measure your baseline node usage to be sure.

5) Why use utilization targets like 60–70%?

Headroom reduces tail latency and helps absorb bursts, rebalancing, and background work. Running near 100% risks queueing, retries, and cascading failures during peak or partial outages.

6) Storage seems huge. What can reduce it?

Lower retention, reduce replication, enable compression, tier cold data, or shrink payloads. Also review overhead assumptions for indexing and compaction; stateful systems often require extra free space for stability.

7) Does N+1 always mean “add one node”?

It’s a simple, conservative rule for small to mid clusters. For larger fleets, you might model a percentage reserve instead. Use your SLOs and failure history to pick an availability strategy.

8) What else should I consider beyond this calculator?

Consider autoscaling behavior, bin packing efficiency, pod limits, disk IOPS, shard counts, cache hit rates, and maintenance windows. For multi-region systems, include cross-region replication bandwidth and failover traffic.

Related Calculators

CPU core calculator
Kubernetes resource calculator
Resource utilization calculator

Important note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.