Learning Rate Range Test Calculator

Calculator Inputs

The page uses a single vertical flow. Inside the calculator, fields adapt to three columns on large screens, two on tablets, and one on mobile.

Starting Learning Rate

Example: 0.00001

Ending Learning Rate

Example: 1

Test Steps

Usually 20 to 100 steps.

Smoothing Beta

A value from 0 up to (but not including) 1. Higher values smooth more heavily and damp noisy loss spikes.

Divergence Threshold %

How far, in percent, the smoothed loss may rise above its running minimum before divergence is flagged.

Warmup Steps to Ignore

Skips early noisy points when searching for signals.

Lower Safety Factor

The suggested minimum is the steepest-descent rate divided by this factor.

Upper Safety Factor

The suggested maximum is capped by the minimum-loss rate and the divergence rate, each divided by a safety factor.

Observed Loss Count

This field updates after submission.

Observed Loss Values

Paste comma-separated or line-separated losses captured during the range test, in the order they were recorded. The calculator uses the first N values, where N is your step count.
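The parsing described above can be sketched in a few lines of Python. This is an illustration, not the calculator's actual code: the function name `parse_losses` is hypothetical, and raising an error when fewer than N values are supplied is an assumption about how short input might be handled.

```python
import re


def parse_losses(text: str, steps: int) -> list[float]:
    """Parse comma- or line-separated losses and keep the first `steps` values."""
    # Split on commas and any whitespace (spaces, tabs, newlines).
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    values = [float(t) for t in tokens]
    if len(values) < steps:
        # Assumption: too few values is treated as an input error.
        raise ValueError(f"expected at least {steps} losses, got {len(values)}")
    return values[:steps]


print(parse_losses("2.20, 2.05\n1.93,1.81\n1.70", 4))
```

Extra values beyond the step count are simply dropped, matching the "first N values" rule.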

Formula Used

1) Exponential learning-rate sweep

The test increases the learning rate multiplicatively across steps, with i running from 0 to steps - 1:

multiplier = (end_lr / start_lr)^(1 / (steps - 1))

lr_i = start_lr × multiplier^i
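As a minimal sketch of the two formulas above, the schedule can be generated like this. The function name `lr_schedule` is illustrative, and the input checks are assumptions added for safety.

```python
def lr_schedule(start_lr: float, end_lr: float, steps: int) -> list[float]:
    """Return `steps` learning rates spaced multiplicatively from start_lr to end_lr."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if start_lr <= 0 or end_lr <= 0:
        raise ValueError("learning rates must be positive")
    # multiplier = (end_lr / start_lr)^(1 / (steps - 1))
    multiplier = (end_lr / start_lr) ** (1.0 / (steps - 1))
    # lr_i = start_lr * multiplier^i for i = 0 .. steps - 1
    return [start_lr * multiplier ** i for i in range(steps)]


rates = lr_schedule(1e-5, 1.0, 20)
print([f"{r:.6f}" for r in rates[:4]])
```

Because the ratio between consecutive rates is constant, the points are evenly spaced on a logarithmic axis.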

2) Smoothed loss

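Losses are smoothed with an exponential moving average that starts from zero (ema_(-1) = 0, with i counted from 0), then bias-corrected so early values are not pulled toward zero: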

ema_i = beta × ema_(i-1) + (1 - beta) × loss_i

smoothed_i = ema_i / (1 - beta^(i+1))
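The smoothing step can be sketched as follows, assuming zero-based step indices and an average initialized to zero, as the bias-correction term implies. The function name `smooth_losses` is illustrative.

```python
def smooth_losses(losses: list[float], beta: float) -> list[float]:
    """Exponentially smooth losses and apply bias correction."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    ema = 0.0  # the average starts at zero, hence the bias correction below
    smoothed = []
    for i, loss in enumerate(losses):
        # ema_i = beta * ema_(i-1) + (1 - beta) * loss_i
        ema = beta * ema + (1.0 - beta) * loss
        # smoothed_i = ema_i / (1 - beta^(i+1))
        smoothed.append(ema / (1.0 - beta ** (i + 1)))
    return smoothed


print(smooth_losses([2.2, 1.0], beta=0.5))
```

With bias correction, the first smoothed value always equals the first raw loss, and a constant loss sequence stays constant after smoothing.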

3) Divergence detection

The calculator marks divergence when smoothed loss rises above the running minimum by the chosen threshold:

smoothed_i > running_min_i × (1 + threshold / 100)
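One way to apply this rule is sketched below. It rests on several assumptions not stated in the text: the running minimum is tracked only over points after the ignored warmup steps, the first point that breaks the threshold is reported, and losses are positive. The name `find_divergence` is hypothetical.

```python
from typing import Optional


def find_divergence(smoothed: list[float], threshold_pct: float,
                    warmup: int = 0) -> Optional[int]:
    """Return the index of the first point exceeding the running minimum
    by more than threshold_pct percent, or None if no point does."""
    running_min = float("inf")
    for i in range(warmup, len(smoothed)):
        value = smoothed[i]
        # smoothed_i > running_min_i * (1 + threshold / 100)
        if value > running_min * (1.0 + threshold_pct / 100.0):
            return i
        running_min = min(running_min, value)
    return None


print(find_divergence([2.0, 1.5, 1.2, 1.25, 1.6], threshold_pct=20))
```

In the sample call the running minimum is 1.2, so the limit is 1.44. The value 1.25 stays below it, while 1.6 exceeds it and is reported at index 4.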

4) Suggested range

The lower bound comes from the steepest loss descent. The upper bound is constrained by the minimum-loss point and divergence point:

recommended_min = steepest_lr / lower_factor

recommended_max = min(min_loss_lr / upper_factor, divergence_lr / adjusted_factor)
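The two bounds can be sketched as below. The text does not define `adjusted_factor`, so this sketch takes it as an explicit argument rather than guessing how it is derived. Falling back to the minimum-loss cap alone when no divergence point exists is also an assumption, and the name `recommend_range` is illustrative.

```python
from typing import Optional


def recommend_range(steepest_lr: float, min_loss_lr: float,
                    divergence_lr: Optional[float], lower_factor: float,
                    upper_factor: float, adjusted_factor: float) -> tuple[float, float]:
    """Return (recommended_min, recommended_max) from the formulas above."""
    recommended_min = steepest_lr / lower_factor
    caps = [min_loss_lr / upper_factor]
    if divergence_lr is not None:
        # adjusted_factor is supplied by the caller; its derivation is not
        # specified in the text.
        caps.append(divergence_lr / adjusted_factor)
    return recommended_min, min(caps)


low, high = recommend_range(0.01, 0.1, 0.4, lower_factor=10,
                            upper_factor=4, adjusted_factor=2)
print(f"{low:.4f} to {high:.4f}")
```

If the resulting minimum exceeds the maximum, the inputs are worth revisiting before relying on the band.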

How to Use This Calculator

1) Run a short training sweep that increases the learning rate from a very small start value to an intentionally aggressive end value.
2) Copy the recorded loss values in order and paste them into the Observed Loss Values field.
3) Set your starting rate, ending rate, step count, smoothing beta, and divergence threshold.
4) Set Warmup Steps to Ignore if the first few steps are unusually noisy or unstable.
5) Press Run Range Test. Results appear above the form, directly below the page header.
6) Review the suggested minimum and maximum band, then test those values in normal training with validation metrics.
7) After calculation, export the summary and the detailed point table with the CSV or PDF buttons.
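To show what a recorded sweep looks like, here is a toy sketch that stands in for a real training loop. It runs gradient descent on the one-parameter loss 0.5 × w², raises the learning rate multiplicatively each step, and prints the losses in a form that could be pasted into the losses field. The function name `run_sweep` and the toy loss are assumptions for illustration only; a real sweep would record the batch loss from your own model.

```python
def run_sweep(start_lr: float, end_lr: float, steps: int, w0: float = 5.0) -> list[float]:
    """Record one loss per step while the learning rate grows multiplicatively."""
    multiplier = (end_lr / start_lr) ** (1.0 / (steps - 1))
    w, lr = w0, start_lr
    losses = []
    for _ in range(steps):
        losses.append(0.5 * w * w)  # loss measured before the update
        w -= lr * w                 # gradient of 0.5 * w^2 is w
        lr *= multiplier            # move to the next, larger rate
    return losses


losses = run_sweep(1e-3, 10.0, 30)
print(", ".join(f"{x:.6f}" for x in losses))
```

The loss falls while the rate is moderate and climbs again once the rate exceeds 2, the point at which plain gradient descent on this toy loss becomes unstable. That fall-then-rise shape is what the calculator looks for.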

Example Data Table

Step	Learning Rate	Raw Loss	Smoothed Loss	Comment
1	0.000010	2.20	2.20	Very small rate. Slow progress.
4	0.000062	1.81	1.99	Loss begins dropping steadily.
8	0.001438	1.39	1.69	Stable descent continues.
12	0.033598	1.16	1.43	Strong candidate region.
14	0.112884	1.24	1.39	Loss flattens slightly.
16	0.379269	1.55	1.46	Possible divergence onset.

FAQs

1) What does this calculator estimate?

It estimates a practical learning-rate band from an exponential range test. The result helps you avoid rates that are too small to make useful progress and rates that already destabilize the loss.

2) Why use smoothed loss instead of raw loss?

Raw batch losses are often noisy. Smoothing exposes the underlying trend, making it easier to detect steep improvement zones and the point where the loss starts breaking upward.

3) What is the steepest descent rate?

It is the learning rate where smoothed loss drops fastest relative to the logarithmic rate increase. Many practitioners use this point to anchor a safe starting band.
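A minimal sketch of this search uses a finite difference between consecutive points. Several choices here are assumptions rather than the calculator's documented behavior: the slope is assigned to the later of the two points, warmup points are skipped, and None is returned when the loss never falls. The name `steepest_descent_index` is hypothetical.

```python
import math
from typing import Optional


def steepest_descent_index(lrs: list[float], smoothed: list[float],
                           warmup: int = 0) -> Optional[int]:
    """Return the index where smoothed loss falls fastest per unit of log(lr)."""
    best_index, best_slope = None, 0.0
    for i in range(max(warmup, 0) + 1, len(smoothed)):
        # Change in loss divided by change in log learning rate.
        slope = (smoothed[i] - smoothed[i - 1]) / (math.log(lrs[i]) - math.log(lrs[i - 1]))
        if slope < best_slope:  # more negative means a steeper descent
            best_index, best_slope = i, slope
    return best_index


lrs = [1e-4, 1e-3, 1e-2, 1e-1]
smoothed = [2.0, 1.9, 1.4, 1.5]
idx = steepest_descent_index(lrs, smoothed)
print(idx, lrs[idx])
```

On an exponential sweep the log spacing between steps is constant, so this is equivalent to finding the largest step-to-step drop in smoothed loss.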

4) Why divide by safety factors?

A range test is intentionally aggressive. Dividing by safety factors pulls recommendations away from the edge, giving you room to train normally with better stability.

5) What if no divergence is detected?

That usually means your end learning rate was still conservative. Increase the ending rate, add more steps, or lower smoothing slightly to expose the instability point.

6) Can I use validation loss instead of training loss?

Yes, but keep the measurement sequence consistent. Most range tests use training loss because it updates frequently, while validation metrics can be too sparse during short sweeps.

7) How many steps should I test?

Twenty to one hundred steps often work well. Use enough points to reveal the trend, especially if your losses are noisy or the learning-rate span is very wide.

8) Is the suggested band always the final answer?

No. Treat it as a disciplined shortlist. Final selection should still depend on full training behavior, validation performance, schedule choice, regularization, and batch size.