| Scenario | Replicas | Avg CPU (m) | Peak CPU × | Avg Mem (MiB) | Peak Mem × | Node size |
|---|---|---|---|---|---|---|
| API (steady traffic) | 4 | 120 | 1.4 | 320 | 1.3 | 4 vCPU / 16 GiB |
| Worker (bursty) | 6 | 250 | 2.0 | 512 | 1.8 | 8 vCPU / 32 GiB |
| Batch (memory heavy) | 3 | 180 | 1.3 | 1024 | 1.6 | 8 vCPU / 64 GiB |
This calculator uses a practical sizing model based on observed averages, peak multipliers, and safety factors:
```text
Effective CPU per pod (m)  = (avg_cpu_m + sidecar_cpu_m) × peak_cpu_mult
CPU request per pod (m)    = Effective CPU × cpu_req_safety
CPU limit per pod (m)      = CPU request × cpu_limit_factor

Effective Mem per pod (Mi) = (avg_mem_mib + sidecar_mem_mib + overhead_mib) × peak_mem_mult
Mem request per pod (Mi)   = Effective Mem × mem_req_safety
Mem limit per pod (Mi)     = Mem request × mem_limit_factor

Totals                     = per-pod requests/limits × replicas
Capacity needed            = Total requests × (100 / target_util%) × (1 + node_reserve% / 100)
```
Node counts are estimated separately from CPU, memory, and the max-pods-per-node cap, and the highest of the three requirements is selected.
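The model above can be sketched in code. This is a minimal illustration, not the calculator's actual implementation: the function names are hypothetical, and the default factors are placeholders you should replace with your own values.

```python
import math

def size_workload(avg_cpu_m, avg_mem_mib, replicas,
                  sidecar_cpu_m=0, sidecar_mem_mib=0, overhead_mib=0,
                  peak_cpu_mult=1.5, peak_mem_mult=1.5,
                  cpu_req_safety=1.1, mem_req_safety=1.1,
                  cpu_limit_factor=1.5, mem_limit_factor=1.25,
                  target_util_pct=70, node_reserve_pct=10):
    # Per-pod effective usage at peak, then request and limit
    cpu_req = (avg_cpu_m + sidecar_cpu_m) * peak_cpu_mult * cpu_req_safety
    mem_req = (avg_mem_mib + sidecar_mem_mib + overhead_mib) * peak_mem_mult * mem_req_safety
    # Totals across replicas, scaled by utilization target and node reserve
    headroom = (100 / target_util_pct) * (1 + node_reserve_pct / 100)
    return {
        "cpu_request_m": cpu_req,
        "cpu_limit_m": cpu_req * cpu_limit_factor,
        "mem_request_mib": mem_req,
        "mem_limit_mib": mem_req * mem_limit_factor,
        "cap_cpu_m": cpu_req * replicas * headroom,
        "cap_mem_mib": mem_req * replicas * headroom,
    }

def node_count(cap_cpu_m, cap_mem_mib, replicas, node_cpu_m, node_mem_mib, max_pods):
    # The highest of the three requirements (CPU, memory, pod cap) wins
    return max(math.ceil(cap_cpu_m / node_cpu_m),
               math.ceil(cap_mem_mib / node_mem_mib),
               math.ceil(replicas / max_pods))
```

For the API row in the table above (120m average at a 1.4× peak, 1.1 safety), the CPU request works out to 120 × 1.4 × 1.1 ≈ 185m per pod.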
- Collect real averages and peaks from monitoring over representative traffic.
- Enter replicas, CPU and memory values, and optional sidecar overhead.
- Set peak multipliers and safety factors to match your risk tolerance.
- Choose utilization and node reserve to keep scheduling headroom.
- Review per-pod values, totals, and recommended node count.
- Download CSV/PDF for sharing, then apply the generated YAML.
1) What is the difference between requests and limits?
Requests guide scheduling and reserve capacity for the pod; limits cap usage. A container that exceeds its CPU limit is throttled, while one that exceeds its memory limit can be OOM-killed. Start with realistic requests and conservative memory limits.
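For instance, a container `resources` stanza reflecting that split might look like this (values are illustrative, not recommendations):

```yaml
resources:
  requests:
    cpu: 185m        # guides scheduling; effectively reserved for the pod
    memory: 458Mi
  limits:
    cpu: 280m        # exceeding this throttles the container
    memory: 572Mi    # exceeding this risks an OOMKill
```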
2) How do I choose peak multipliers?
Use historical metrics: compare p95 or p99 to average during peak hours. If your p95 CPU is 1.6× average, set the multiplier near 1.6. Re-check after releases.
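That ratio is easy to compute from an exported window of samples. A rough sketch using a nearest-rank percentile (the function name is hypothetical):

```python
import math

def peak_multiplier(samples, pct=95):
    """Ratio of the pct-th percentile sample to the mean: a rough peak multiplier."""
    s = sorted(samples)
    rank = math.ceil(pct * len(s) / 100)   # nearest-rank percentile
    return s[max(0, rank - 1)] / (sum(samples) / len(samples))
```

If this comes out near 1.6 for CPU during peak hours, a CPU peak multiplier of about 1.6 is a reasonable starting point.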
3) Why include a safety factor if I already use peaks?
Peaks are estimates, and traffic patterns change over time. A safety factor covers variance, GC cycles, cache warm-ups, noisy neighbors, and other unknowns. Use smaller values for stable systems and larger values for new workloads.
4) What does target utilization mean here?
It represents how “full” you want nodes to be on requests. Lower targets increase slack for spikes and scheduling flexibility. Many teams use 60–80% depending on workload volatility.
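The headroom math works out like this (numbers illustrative):

```python
total_requests_m = 2800        # e.g. 4 pods × 700m CPU requests
target_util_pct = 70           # schedule nodes to ~70% full on requests
capacity_needed_m = total_requests_m * (100 / target_util_pct)
print(round(capacity_needed_m))  # 2800m of requests needs ~4000m of schedulable CPU
```

Lowering the target to 60% would raise the needed capacity to roughly 4667m, trading cost for more spike headroom.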
5) What is node reserve and when should I raise it?
Node reserve accounts for system components, daemonsets, and eviction thresholds. Raise it when your nodes run many background agents or when you see frequent pressure events. Keep it consistent across environments.
6) Can I use this for multiple containers in a pod?
Yes. Add sidecar CPU and memory to approximate extra containers. If containers have very different patterns, size each container separately and sum their requests and limits for pod-level planning.
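Summing separately sized containers for pod-level planning can be as simple as the following (container names and numbers are illustrative):

```python
# (cpu_request_m, mem_request_mib) per container, each sized on its own metrics
containers = {
    "app":     (200, 512),
    "proxy":   (50, 128),   # hypothetical sidecar
    "metrics": (20, 64),
}
pod_cpu_request_m = sum(cpu for cpu, _ in containers.values())
pod_mem_request_mib = sum(mem for _, mem in containers.values())
```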
7) How accurate is the node count estimate?
It’s an approximation based on requests, headroom, and a max-pods-per-node cap. Real placement also depends on topology constraints, affinities, storage, and burst behavior. Validate with a staging rollout or capacity tests.
8) What should I do after applying the generated YAML?
Observe throttling, OOMs, and HPA behavior for several peak cycles. Adjust requests if HPA scales too early or too late. Tighten memory limits carefully, and keep a change log for each tuning iteration.