Inputs
Choose a profile, enter workload numbers, and optionally split cores across services.
Example data table
Use these sample scenarios to sanity-check your inputs.
| Scenario | Physical cores | Reserved | SMT factor | RPS | CPU ms/req | Peak | Target util | Headroom | Overhead | Req logical | Req physical |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Balanced web | 32 | 2 | 1.30 | 1200 | 6.0 | 1.5 | 0.70 | 25% | 8% | 20.83 | 16.02 |
| Latency critical | 24 | 2 | 1.25 | 800 | 7.5 | 2.0 | 0.60 | 35% | 10% | 29.70 | 23.76 |
| Batch throughput | 64 | 4 | 1.35 | 2500 | 5.0 | 1.2 | 0.80 | 15% | 6% | 22.86 | 16.93 |
Formula used
- CPU_demand (core-seconds/sec) = RPS × CPU_ms ÷ 1000 × PeakMultiplier (or a direct input).
- Required_logical = (CPU_demand ÷ TargetUtilization) × (1 + Overhead) × (1 + Headroom).
- Available_logical = (PhysicalCores − ReservedCores) × SMTFactor.
- Required_physical = Required_logical ÷ SMTFactor.
- Per-service split: Service_logical = Required_logical × (Weight ÷ ΣWeights).
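The formulas above can be sketched in Python (the function and parameter names are illustrative, not part of the calculator itself):

```python
def required_cores(rps, cpu_ms, peak, target_util, headroom, overhead, smt_factor):
    """Compute required logical and physical cores from the formulas above."""
    # CPU_demand (core-seconds/sec) = RPS × CPU_ms ÷ 1000 × PeakMultiplier
    cpu_demand = rps * cpu_ms / 1000 * peak
    # Scale for target utilization, then runtime overhead and headroom
    required_logical = (cpu_demand / target_util) * (1 + overhead) * (1 + headroom)
    # Convert logical cores back to physical cores via the SMT factor
    required_physical = required_logical / smt_factor
    return required_logical, required_physical

# "Balanced web" scenario from the example table
logical, physical = required_cores(
    rps=1200, cpu_ms=6.0, peak=1.5,
    target_util=0.70, headroom=0.25, overhead=0.08, smt_factor=1.30,
)
print(round(logical, 2), round(physical, 2))  # 20.83 16.02
```

Running the other table rows through the same function is a quick way to sanity-check your own inputs.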
How to use this calculator
- Select a workload profile to prefill safe defaults.
- Enter physical cores and reserve capacity for the host.
- Pick a demand model: RPS with CPU time, or direct cores.
- Set utilization, headroom, and overhead for your risk tolerance.
- Optionally name services and set weights for core distribution.
- Press Calculate Allocation to view results above the form.
- Download CSV or PDF to share the scenario with your team.
Capacity Inputs
Start by defining physical cores and the amount reserved for the host. In clustered environments, use the minimum guaranteed cores per node rather than peak burst capacity. Reserve one to two cores for interrupts, telemetry, and system daemons, then convert remaining cores to logical capacity with an SMT efficiency factor. Practical SMT factors range from 1.20 to 1.45, depending on workload mix and cache pressure.
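As a minimal sketch of the capacity side (numbers taken from the Balanced web scenario):

```python
physical_cores = 32   # minimum guaranteed cores per node
reserved_cores = 2    # held back for interrupts, telemetry, system daemons
smt_factor = 1.30     # SMT efficiency: logical throughput per physical core

# Available_logical = (PhysicalCores − ReservedCores) × SMTFactor
available_logical = (physical_cores - reserved_cores) * smt_factor
print(available_logical)  # 39.0
```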
Demand Modeling
The calculator supports two demand views: throughput-based and direct core demand. With throughput, CPU demand equals requests per second multiplied by average CPU milliseconds per request, then scaled by a peak multiplier for bursts. This yields core-seconds per second, which maps naturally to logical cores. If you already track CPU usage in cores from monitoring, enter the direct demand to bypass request measurements and keep the estimate aligned with observed usage.
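Both views reduce to the same core-seconds-per-second figure; a minimal sketch using the Latency critical scenario (the direct value here is illustrative):

```python
# Throughput-based: RPS × CPU ms/req ÷ 1000, scaled by the peak multiplier
rps, cpu_ms_per_req, peak_multiplier = 800, 7.5, 2.0
cpu_demand = rps * cpu_ms_per_req / 1000 * peak_multiplier  # core-seconds/sec

# Direct: a monitored figure, in cores, bypasses the request math entirely
direct_demand = 12.0  # e.g. average cores consumed at peak, from monitoring

print(cpu_demand, direct_demand)  # 12.0 12.0
```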
Utilization Targets
Target utilization determines how much sustained load you allow before saturation. For latency-sensitive services, 0.55 to 0.70 is common to protect tail latency during spikes and garbage collection. For batch or streaming jobs, 0.75 to 0.85 can increase throughput, especially when queues absorb jitter. Headroom and overhead are applied as multipliers; headroom covers uncertainty, while overhead accounts for runtime, scheduler, and containerization costs.
Service Weighting
When multiple services share a host or cluster, weighting translates the overall core requirement into a proportional allocation. Use weights to reflect relative CPU importance, not absolute percentages; a service with weight 4 receives twice the logical cores of weight 2. The resulting logical cores can be mapped to orchestration settings, such as CPU requests and limits, or hypervisor reservations, providing a clear baseline for capacity planning.
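The proportional split can be sketched as follows (service names and weights are illustrative):

```python
required_logical = 20.83  # overall requirement, e.g. the Balanced web scenario
weights = {"api": 4, "worker": 2, "cron": 1}

# Service_logical = Required_logical × (Weight ÷ ΣWeights)
total = sum(weights.values())
allocation = {name: required_logical * w / total for name, w in weights.items()}
for name, cores in allocation.items():
    print(f"{name}: {cores:.2f} logical cores")
```

Here the weight-4 service receives twice the cores of the weight-2 service, as described above; the rounded results can be mapped directly to CPU requests and limits.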
Operational Checks
After computing required and available logical capacity, validate results against real telemetry. If the calculator reports a shortage, reduce peak assumptions, lower overhead by tuning runtimes, or add nodes before production scaling events. If it reports large spare capacity, consider consolidating or raising utilization targets to improve efficiency. For container platforms, compare the recommended logical cores to throttling metrics, and tune limits so that CFS throttling does not appear under load. Re-run scenarios for seasonal peaks, deployment rollouts, and failover states to ensure allocations remain resilient.
FAQs
1) What does SMT efficiency factor mean?
It estimates how much extra throughput you get from simultaneous multithreading. Use 1.00 for no gain, 1.20–1.45 for many mixed workloads, and validate with benchmarks or production CPU saturation data.
2) How should I choose target utilization?
Lower targets reduce latency risk. Start near 0.60–0.70 for interactive services and 0.75–0.85 for batch jobs. If throttling or long queues appear, decrease the target or increase headroom.
3) Why reserve physical cores?
Reserved cores protect the host and platform components. They cover kernel work, interrupts, storage, networking, monitoring agents, and control-plane tasks. Without a reserve, application cores can be stolen unexpectedly, causing jitter and timeouts.
4) What is the peak multiplier used for?
Peak multiplier scales baseline demand to account for bursts, deployments, or uneven traffic. Use 1.0 for steady systems, 1.5–2.5 for typical web spikes, and higher values only when you have measured worst‑case surges.
5) How do service weights affect allocation?
Weights split the required logical cores proportionally. With weights of 4, 2, and 1, the first service receives about 57%, the second 29%, and the third 14%. Use weights to express priority, then map results to requests and limits.
6) Why can’t I download before calculating?
Exports are generated from the most recent calculation stored in your session. Run the calculator once to populate results, then download CSV or PDF. This avoids exporting empty templates and keeps outputs consistent with the scenario shown.