Rank-Size Fit Calculator

Data values (one per line, or comma separated)

Values must be positive after applying the optional epsilon shift.

Log base

The fitted α is unchanged by log base.

Rank order

Use descending for typical Zipf-style ranking.

Epsilon shift (optional)

Adds ε to each value before logs.

Minimum rank for fit

Exclude early ranks to fit the tail.

Maximum rank for fit

Leave blank or 0 to use all ranks.

Example data table

This sample mimics a rank-size pattern common in cascade and fragmentation measurements.

Rank	Value	Notes
1	100	Largest observed event size
2	60	Second largest event size
3	45	Mid-range measurement
4	35	Mid-range measurement
5	28	Tail begins to stabilize
6	23	Tail region
7	19	Tail region
8	16	Tail region
9	14	Tail region
10	12	Smallest reported value

Formula used

A rank-size relation models how an ordered physical observable scales with rank:

value(r) = C · r^{-α}

Taking logs gives a linear form suitable for least-squares fitting:

log(value) = log(C) − α · log(r)

This tool fits the line in log space, then converts back to report α and C, plus R² and residuals.
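
The fit described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's actual code; the function name fit_rank_size and its base parameter are assumptions made for the example. It is applied here to the sample table above.

```python
import math

def fit_rank_size(values, base=10.0):
    """Fit log(value) = log(C) - alpha * log(rank) by ordinary least squares.

    Hypothetical helper for illustration; ranks are assigned 1..n after
    sorting the values in descending order.
    """
    ranked = sorted(values, reverse=True)
    xs = [math.log(r, base) for r in range(1, len(ranked) + 1)]
    ys = [math.log(v, base) for v in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Convert back from log space: alpha is minus the slope, C = base**intercept.
    return -slope, base ** intercept

sample = [100, 60, 45, 35, 28, 23, 19, 16, 14, 12]
alpha, c = fit_rank_size(sample)
print(f"alpha = {alpha:.3f}, C = {c:.1f}")
```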

How to use this calculator

Paste your measured values into the data box.
Choose the ranking direction for your experiment.
Select a log base that matches your reporting style.
Adjust minimum and maximum rank to fit a region.
Click Calculate; the fitted parameters appear above.
Use the CSV and PDF buttons to export results.

Rank-size fitting in physical measurements

1) Where rank-size patterns appear

Rank-size curves occur whenever many outcomes compete under shared constraints. In physics, common examples include crack fragment masses, avalanche energies, acoustic emission bursts, vortex cluster areas, and intensity peaks in turbulent signals. Ordering observations by magnitude converts raw measurements into a compact spectrum that highlights scaling behavior.

2) The power-law model behind Zipf behavior

This calculator fits the relation value(r) = C · r^{-α}. The exponent α controls how quickly values decay with rank. For α near 1, the second-ranked value is roughly half the first when the system follows ideal Zipf-like scaling. Larger α indicates a steeper drop and a more dominant top-ranked event.
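
As a short worked check of the "roughly half" statement, the ratio of the second-ranked to the first-ranked value depends only on α under the ideal model. The three α values below are chosen purely for illustration:

```latex
\frac{\mathrm{value}(2)}{\mathrm{value}(1)}
  = \frac{C \cdot 2^{-\alpha}}{C \cdot 1^{-\alpha}}
  = 2^{-\alpha},
\qquad
2^{-0.5} \approx 0.71, \quad 2^{-1} = 0.50, \quad 2^{-1.5} \approx 0.35
```

So α = 1.5 leaves the second-ranked value at about 35% of the first, while α = 0.5 leaves it at about 71%.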

3) Log-linear regression and what it estimates

Taking logs gives a straight line: log(value) = log(C) − α·log(r). The tool performs least-squares regression in log space, returning the slope (equal to −α) and the intercept (equal to log(C)) along with α and C themselves. Because the fit is linear in logs, you can compare experiments consistently even when values span several decades.

4) Choosing the fit window with rank limits

Real datasets often bend at small ranks due to saturation, detector limits, or finite-size effects. Use Minimum rank and Maximum rank to focus on the region that behaves most like a power law. For example, in a 200-event sequence, fitting ranks 10–200 can reduce bias from unusually large initial bursts.
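
The effect of a rank window can be sketched as follows. The helper fit_window and the synthetic, clipped dataset are assumptions made for this example; the calculator's own handling of the limits may differ in detail.

```python
import math

def fit_window(values, min_rank=1, max_rank=0):
    """Fit alpha and C using only ranks min_rank..max_rank (inclusive).

    Hypothetical helper: max_rank of 0 or None means "use all ranks".
    Ranks are assigned on the full sorted list, so the window keeps the
    original rank numbers rather than restarting at 1.
    """
    ranked = sorted(values, reverse=True)
    last = max_rank if max_rank else len(ranked)
    pairs = [(math.log(r), math.log(v))
             for r, v in enumerate(ranked, start=1)
             if min_rank <= r <= last]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return -slope, math.exp(mean_y - slope * mean_x)

# Synthetic data: an ideal alpha = 1.2 power law whose largest values are
# clipped at 20, imitating detector saturation at small ranks.
data = [min(100 * r ** -1.2, 20.0) for r in range(1, 61)]

alpha_all, _ = fit_window(data)               # all 60 ranks
alpha_tail, _ = fit_window(data, min_rank=4)  # skip the three clipped ranks
print(f"all ranks: {alpha_all:.3f}, ranks 4-60: {alpha_tail:.3f}")
```

In this constructed case the windowed fit recovers the exponent used to generate the tail, while the all-ranks fit is pulled toward a flatter slope by the clipped head.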

5) Interpreting R², residuals, and SSE

R² summarizes how well the log-linear model explains the variance in log(value). Values above 0.95 often indicate strong scaling over the selected window, but always inspect residuals. Systematic curvature suggests a cutoff or multiple regimes. SSE, the sum of squared residuals in log space, provides an absolute error measure; because it grows with the number of fitted points, compare it across windows of similar size.
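
These diagnostics can be computed directly from the fitted line. The sketch below uses a hypothetical helper, fit_diagnostics, and assumes residuals are taken in natural-log space; how the tool itself defines them is not stated here.

```python
import math

def fit_diagnostics(values):
    """Return alpha, R^2, SSE and residuals for a log-log rank-size fit.

    Illustrative only: residuals and SSE are computed in natural-log
    space, which is an assumption about how the tool defines them.
    """
    ranked = sorted(values, reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(v) for v in ranked]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)         # unexplained variation
    sst = sum((y - mean_y) ** 2 for y in ys)     # total variation in log(value)
    return {"alpha": -slope, "r2": 1 - sse / sst, "sse": sse,
            "residuals": residuals}

result = fit_diagnostics([100, 60, 45, 35, 28, 23, 19, 16, 14, 12])
print(f"R^2 = {result['r2']:.3f}, SSE = {result['sse']:.4f}")
# A run of same-signed residuals at either end hints at curvature.
print([f"{e:+.3f}" for e in result["residuals"]])
```

Note that SSE scales with the log base used, so it is only comparable between fits done in the same base; R² does not depend on the base.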

6) Uncertainty: confidence interval for α

The calculator reports an approximate 95% confidence interval for α from the regression standard error. With more points, uncertainty typically shrinks: doubling the number of fitted ranks can noticeably tighten the interval when noise is stable. Treat the interval as a practical guide rather than a strict guarantee in heavy-tailed data.
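
One common way to build such an interval is sketched below. It is an assumption that the calculator works this way: the helper alpha_confidence_interval is hypothetical, and it uses a normal quantile where a Student-t quantile would give a wider interval for small samples.

```python
import math
from statistics import NormalDist

def alpha_confidence_interval(values, level=0.95):
    """Approximate CI for alpha from the regression standard error.

    Assumption: uses a normal quantile (about 1.96 for 95%); a Student-t
    quantile with n - 2 degrees of freedom would be wider for small n.
    """
    ranked = sorted(values, reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(v) for v in ranked]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(sse / (n - 2) / sxx)   # standard error of the slope
    z = NormalDist().inv_cdf(0.5 + level / 2)
    alpha = -slope
    return alpha, alpha - z * se_slope, alpha + z * se_slope

alpha, low, high = alpha_confidence_interval(
    [100, 60, 45, 35, 28, 23, 19, 16, 14, 12])
print(f"alpha = {alpha:.2f} (95% CI {low:.2f} to {high:.2f})")
```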

7) Practical data tips for stable results

Use consistent units, remove non-physical negatives, and avoid mixing different experimental regimes. If values include zeros from thresholding, apply a small epsilon shift so logs remain defined, but keep epsilon much smaller than the smallest meaningful measurement. Exporting the table helps document rank ordering and predicted values.
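
The preparation steps above can be sketched as a small parsing routine. The helper name prepare_ranked is hypothetical, and silently dropping negative entries (rather than reporting an error) is an assumption made for this example.

```python
import re

def prepare_ranked(text, epsilon=0.0, descending=True):
    """Parse raw input into (rank, value) pairs ready for a log-log fit.

    Hypothetical helper. Dropping negative entries (rather than raising an
    error) is an assumption made for this sketch.
    """
    # Accept one value per line, comma-separated values, or a mix of both.
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    values = [float(t) for t in tokens]
    values = [v for v in values if v >= 0]     # remove non-physical negatives
    shifted = [v + epsilon for v in values]    # optional epsilon shift
    if any(v <= 0 for v in shifted):
        raise ValueError("all values must be positive after the epsilon shift")
    shifted.sort(reverse=descending)
    return list(enumerate(shifted, start=1))

# The zero entry survives only because of the epsilon shift, which is kept
# far below the smallest meaningful measurement (0.4 here).
raw = "12.5, 0, 3.2\n7.1, -1, 0.4"
for rank, value in prepare_ranked(raw, epsilon=0.001):
    print(rank, value)
```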

8) How to report rank-size parameters

When publishing, report α, the fitted rank window, and a goodness-of-fit metric. A concise statement might read: “Rank-size fit over ranks 5–100 gave α = 1.12 (95% CI 1.05–1.19), R² = 0.97.” Including the residual trend can justify why a specific window was chosen.

FAQs

1) What does the exponent α mean physically?

α describes how quickly ranked event sizes decay. Larger α means the largest events dominate more strongly. Smaller α indicates a flatter spectrum with relatively more mid-sized events.

2) Does changing the log base change α?

No. Switching between natural log and base-10 log rescales the intercept log(C), not the slope in log-log space. The fitted α remains the same, and so does C once the intercept is converted back using the matching base; only the reported intercept differs.
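
A quick numerical check, sketched with a hypothetical helper fit_in_base and the sample table from this page: the slope and the recovered C come out the same in each base, while the intercept is rescaled.

```python
import math

def fit_in_base(values, base):
    """Return (alpha, intercept, C) for a fit done with logs in `base`."""
    ranked = sorted(values, reverse=True)
    xs = [math.log(r, base) for r in range(1, len(ranked) + 1)]
    ys = [math.log(v, base) for v in ranked]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # C is recovered by undoing the log in the same base used for the fit.
    return -slope, intercept, base ** intercept

sample = [100, 60, 45, 35, 28, 23, 19, 16, 14, 12]
for name, base in [("ln", math.e), ("log10", 10.0), ("log2", 2.0)]:
    alpha, intercept, c = fit_in_base(sample, base)
    print(f"{name:>5}: alpha = {alpha:.4f}, intercept = {intercept:.4f}, C = {c:.2f}")
```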

3) Why should I adjust the minimum rank?

Small ranks often deviate due to finite-size limits, saturation, or a different regime. Increasing the minimum rank fits the tail where power-law behavior is more stable and interpretable.

4) What if my data includes zeros or negatives?

Logs require positive values. Remove invalid entries or add a small epsilon shift to all values, but keep epsilon much smaller than the smallest meaningful measurement to avoid distorting the fit.

5) Is R² enough to confirm a power law?

R² is helpful but not definitive for heavy-tailed data. Also inspect residuals for curvature and test different rank windows. Consider complementary diagnostics when results are high-stakes.

6) How many points do I need for a reliable fit?

More is better. As a practical baseline, aim for at least 20–30 ranks in the fitted window. Very small samples can produce unstable α estimates and overly optimistic R².

7) What should I export for documentation?

Export the ranked table with measured values, predictions, and residuals. This preserves the exact ordering, the fit window, and the model output, making it easier to reproduce and compare experiments.
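
For readers who want to reproduce such a table outside the tool, a minimal sketch follows. The column names, the number formatting, and the choice of a natural-log residual are all assumptions for this example and not the calculator's actual export format.

```python
import csv
import io
import math

def export_fit_table(values):
    """Build CSV text with rank, value, predicted value and log residual.

    Column names and the natural-log residual are assumptions for this
    sketch, not the calculator's actual export format.
    """
    ranked = sorted(values, reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(v) for v in ranked]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["rank", "value", "predicted", "log_residual"])
    for rank, (x, y) in enumerate(zip(xs, ys), start=1):
        fitted = intercept + slope * x
        writer.writerow([rank, ranked[rank - 1],
                         f"{math.exp(fitted):.4g}", f"{y - fitted:.4f}"])
    return buffer.getvalue()

print(export_fit_table([100, 60, 45, 35, 28, 23, 19, 16, 14, 12]))
```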