Pick your workload and enter expected demand. Headroom, overhead, and growth margins are applied automatically. See recommended vCPUs and RAM, then export instantly below.
| Scenario | Peak RPS | CPU ms / req | Users | Headroom | Growth | Suggested starting point |
|---|---|---|---|---|---|---|
| Small web API | 40 | 2.5 | 600 | 25% | 20% | 2 vCPU, 4 GB RAM |
| Moderate SaaS | 150 | 3.0 | 2,000 | 30% | 25% | 4 vCPU, 8–12 GB RAM |
| DB-heavy service | 90 | 4.0 | 1,200 | 35% | 30% | 8 vCPU, 16 GB RAM + premium disk |
RPS = (Users × RequestsPerMinutePerUser) ÷ 60

Cores100 = RPS × (CpuMsPerRequest ÷ 1000)

CoresBase = (Cores100 × (1 + CpuOverhead%)) ÷ TargetCpuUtil%

RamBase = BaseAppRamGB + (Users × RamPerUserMB ÷ 1024) + OsOverheadGB

RamUtil = RamBase ÷ TargetRamUtil%

Multiplier = (1 + Headroom%) × (1 + Growth%)

PerInstance = (Base × Multiplier) ÷ Instances

| Profile | vCPU | RAM (GB) | Typical use |
|---|---|---|---|
| Nano | 1 | 1 | Low-traffic services, dev workloads |
| Micro | 2 | 2 | Small web apps, light APIs |
| Small | 2 | 4 | General web, small databases |
| Medium | 4 | 8 | Production apps, moderate DB |
| Large | 8 | 16 | Heavier services, analytics |
| XL | 16 | 32 | High load, large caches |
| 2XL | 32 | 64 | Big data nodes, high concurrency |
CPU demand is estimated from peak requests per second and average CPU milliseconds per request. For example, 150 RPS at 3 ms consumes 0.45 core at 100% utilization. After a ~5% CPU overhead allowance and a 65% utilization target, the baseline rises to roughly 0.73 vCPU before headroom and growth are applied.
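The CPU baseline above can be sketched as a small function. This is an illustrative implementation of the Cores100 and CoresBase formulas, assuming a 5% CPU overhead and a 65% utilization target as defaults; the function name and defaults are ours, not part of the calculator.

```python
def cpu_cores(peak_rps, cpu_ms_per_req, cpu_overhead=0.05, target_util=0.65):
    """Baseline vCPUs before headroom and growth (illustrative sketch)."""
    cores_100 = peak_rps * (cpu_ms_per_req / 1000)  # cores at 100% utilization
    return cores_100 * (1 + cpu_overhead) / target_util

# 150 RPS at 3 ms/request -> 0.45 core raw, ~0.73 vCPU baseline
print(round(cpu_cores(150, 3.0), 2))  # 0.73
```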
RAM is calculated using application baseline memory plus per-user allocations, then adjusted for an operating-system buffer. Keeping a 70–80% memory target reduces swapping risk, lowers garbage-collection pressure, and helps caches remain warm. If your app uses in-memory sessions, increase per-user MB to reflect real heap usage.
Headroom protects performance during bursts, deploy spikes, and noisy-neighbor variance. Growth margin accounts for business expansion, new features, and traffic seasonality. A practical starting point is 20–35% headroom and 15–30% growth for fast-moving products. Mature services can reduce growth but should keep headroom for reliability.
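The combined margin is multiplicative, not additive, so 30% headroom plus 25% growth is more than 55%. A one-line sketch of the Multiplier formula:

```python
def sizing_multiplier(headroom, growth):
    """Combined capacity multiplier: margins compound multiplicatively."""
    return (1 + headroom) * (1 + growth)

# 30% headroom and 25% growth -> 1.625x, not 1.55x
print(sizing_multiplier(0.30, 0.25))
```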
When you scale horizontally, total demand is split across instances, producing a smaller per-instance recommendation. Availability modes influence total planned capacity: N+1 reserves enough capacity to survive a single node failure, while active-active can be modeled conservatively as doubling capacity to absorb failover without degradation.
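The per-instance split and the two availability modes can be sketched together. The function name, mode strings, and return shape are illustrative assumptions; the logic follows the description above (N+1 adds one spare node of the same size, active-active conservatively doubles node count):

```python
def plan(total_vcpu, instances, mode="none"):
    """Per-node size and node count under an availability mode (sketch)."""
    per_node = total_vcpu / instances      # demand split across instances
    if mode == "n+1":
        nodes = instances + 1              # one spare; per-node size unchanged
    elif mode == "active-active":
        nodes = instances * 2              # conservative: double planned capacity
    else:
        nodes = instances
    return per_node, nodes

# 8 vCPU of demand across 4 instances with N+1 -> 2 vCPU each, 5 nodes
print(plan(8.0, 4, "n+1"))  # (2.0, 5)
```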
Storage size impacts retention and log growth, but performance is often limited by IOPS and latency. If targets exceed ~2,000 IOPS, premium tiers are typically appropriate; above ~5,000 IOPS, provisioned performance becomes more predictable. Network throughput requirements above 300 Mbps usually justify higher bandwidth options and careful egress budgeting.
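The IOPS thresholds above map naturally to a tier lookup. The tier names and cutoffs here just restate the paragraph's approximate guidance; real provider tiers and limits vary:

```python
def disk_tier(target_iops):
    """Rough storage-tier suggestion from target IOPS (thresholds from the text)."""
    if target_iops > 5000:
        return "provisioned"   # provisioned performance is more predictable
    if target_iops > 2000:
        return "premium"       # premium tiers typically appropriate
    return "standard"

print(disk_tier(3000))  # premium
```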
Treat the recommended VM size as a first-pass baseline, then validate with APM traces, load tests, and system metrics. Monitor CPU steal, memory paging, disk queue depth, and p95 latency during peak. Right-size after one to two release cycles, and keep a repeatable sizing record using the built-in CSV and PDF exports.
Use Direct RPS when you have monitoring or load-test data. Use Users when demand is early-stage and you can estimate per-user request rates during peak windows.
Start with APM averages from peak traffic. If you lack data, use 2–5 ms for lightweight APIs and 5–15 ms for heavier logic, then calibrate using load tests.
Running at 100% is unstable for most services. Targets like 60–70% CPU and 70–80% RAM preserve latency, handle spikes, and reduce risk from background tasks and contention.
N+1 increases total planned capacity to survive one instance failure. It does not increase the recommended per-instance size; it ensures remaining instances can carry the load.
Confirm disk latency and IOPS needs. Databases and log-heavy systems often require higher-performance tiers even with modest GB. Benchmark with real workloads to validate queue depth and p95 latency.
No. It is provider-agnostic and focuses on capacity math. Map the result to your platform’s instance families, then validate with metrics because vCPU performance differs by generation, region, and throttling policies.
Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.