Model the end‑to‑end duration for building an inverted index at scale. Tune document size, per‑KB processing time, batch overhead, parallel workers, utilization, I/O wait, and throttling. Instantly see throughput, total wall time, and completion ETA. Ideal for capacity planning, migration windows, and SLAs when every minute and gigabyte matters.
Enter your parameters and click Calculate to see throughput, total wall time, and ETA.
| Quantity | Formula |
|---|---|
| Per‑doc time, single worker (ms) | sizeKB × (parse + tokenize + write per KB) + (batchOverhead ÷ batchSize) |
| Effective per‑doc time (ms) | perDocSingle ÷ (util% ÷ 100) × (1 + ioWait% ÷ 100) |
| Throughput (docs/s) | (1000 ÷ effectivePerDocMs) × workers |
| Raw duration (s) | totalDocs ÷ throughput |
| Wall time (s) | (rawDuration × throttleFactor) + warmup |
| Throttle factor | 60 ÷ (60 − pauseMinutesPerHour) |
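The formulas above can be sketched directly in code. Here is a minimal Python version; all parameter values in the example call are hypothetical, chosen only to illustrate the arithmetic:

```python
def wall_time_seconds(
    total_docs: int,
    size_kb: float,
    per_kb_ms: float,          # parse + tokenize + write time per KB, ms
    batch_overhead_ms: float,  # fixed cost per batch, ms
    batch_size: int,
    workers: int,
    util_pct: float,           # CPU utilization available to indexing, %
    io_wait_pct: float,        # time stalled on disk/network, %
    pause_min_per_hour: float, # throttling pauses, minutes per hour
    warmup_s: float = 0.0,
) -> float:
    # Per-doc time for a single worker, ms
    per_doc_single = size_kb * per_kb_ms + batch_overhead_ms / batch_size
    # Inflate for imperfect utilization and I/O wait
    effective_ms = per_doc_single / (util_pct / 100) * (1 + io_wait_pct / 100)
    # Aggregate throughput across workers, docs/s
    throughput = (1000 / effective_ms) * workers
    # Raw duration, then stretch for throttling pauses and add warmup
    raw_s = total_docs / throughput
    throttle = 60 / (60 - pause_min_per_hour)
    return raw_s * throttle + warmup_s

# Hypothetical run: 1M docs of 8 KB at 0.5 ms/KB, 200 ms overhead per
# 1000-doc batch, 8 workers at 80% utilization, 25% I/O wait,
# 5 minutes of throttling pauses per hour.
secs = wall_time_seconds(1_000_000, 8, 0.5, 200, 1000, 8, 80, 25, 5)
print(f"{secs:.0f} s (~{secs/60:.1f} min)")
```

With these illustrative numbers the run finishes in roughly 15 minutes of wall time; in practice you would substitute timings measured on your own hardware.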
**What does the I/O wait percentage mean?**
The fraction of time workers spend stalled on disk or network rather than executing compute. Higher values inflate effective per‑document time.
**How do I pick a batch size?**
Choose a size that amortizes setup overhead without risking large retries on failure. Start with hundreds to a few thousand documents and adjust based on observed error rates and latency targets.
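The amortization effect is easy to see numerically. A short sketch, assuming a hypothetical 200 ms of fixed setup/commit cost per batch:

```python
# Per-document share of a fixed batch overhead shrinks as batches grow,
# with diminishing returns past a few thousand documents.
batch_overhead_ms = 200.0  # hypothetical fixed cost per batch

def per_doc_overhead_ms(batch_size: int) -> float:
    return batch_overhead_ms / batch_size

for size in (10, 100, 1_000, 10_000):
    print(f"batch={size:>6}: {per_doc_overhead_ms(size):.3f} ms/doc")
```

Going from 10 to 1,000 documents per batch cuts the per‑document overhead a hundredfold, while going from 1,000 to 10,000 saves far less in absolute terms but multiplies the work lost if a batch fails and must be retried.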
**Why would utilization be below 100%?**
Background services, context switches, GC, and coordination all reduce the effective CPU time available to indexing threads.
**Does compression affect the estimate?**
Yes. Compression trades CPU for I/O. If you compress postings or stored fields, increase the write time per KB to reflect the extra work.
**What if my documents vary widely in size?**
Approximate by computing a weighted average of size and timings across your corpus, or run multiple scenarios for clusters of similar documents.
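The weighted-average approach can be sketched as follows; the cluster counts, sizes, and per‑KB timings here are hypothetical placeholders for your own corpus statistics:

```python
# Hypothetical clusters: (doc_count, avg_size_kb, per_kb_ms) per cluster.
clusters = [
    (800_000, 4, 0.4),    # e.g. small HTML pages
    (150_000, 40, 0.6),   # e.g. PDFs
    (50_000, 200, 0.9),   # e.g. large reports
]

total_docs = sum(n for n, _, _ in clusters)
# Document-count-weighted average per-doc compute time, ms
weighted_per_doc_ms = sum(n * kb * ms for n, kb, ms in clusters) / total_docs
print(f"{weighted_per_doc_ms:.2f} ms/doc average")
```

Feed the resulting average into the calculator in place of a single uniform document size, or run each cluster as its own scenario and sum the durations when the clusters differ by more than about an order of magnitude.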
**What if the worker count changes mid‑run?**
Use the average expected number of workers across the run, or model the run in phases with different worker counts and sum the phase durations.
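The phased approach amounts to summing per‑phase durations. A minimal sketch, assuming a hypothetical fixed per‑worker rate and two phases with different worker counts:

```python
# Hypothetical per-worker rate, docs/s, held constant across phases.
per_worker_docs_per_s = 150.0

# Each phase: (docs to index, worker count).
phases = [
    (400_000, 8),    # e.g. daytime, sharing the cluster
    (600_000, 16),   # e.g. overnight, dedicated capacity
]

# Total wall time is the sum of per-phase durations.
total_seconds = sum(docs / (workers * per_worker_docs_per_s)
                    for docs, workers in phases)
print(f"{total_seconds/60:.1f} minutes")
```

This assumes linear scaling with worker count; if adding workers saturates shared disks or the coordinator, measure the per‑worker rate at each worker count rather than reusing one constant.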
**How accurate is the result?**
It is an estimate. For better accuracy, measure per‑KB timings on a representative sample, include realistic pauses, and monitor I/O contention in staging.
Important Note: All calculators on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.