Calculator
Example Data Table
| Position | Value | Median split | Run note |
|---|---|---|---|
| 1 | 12 | Below | Run 1 starts |
| 2 | 15 | Above | Run 2 starts |
| 3 | 9 | Below | Run 3 starts |
| 4 | 18 | Above | Run 4 starts |
| 5 | 21 | Above | Same run continues |
Formula Used
The null hypothesis says the ordered sequence is random.
Let n1 be the count of group A values.
Let n2 be the count of group B values.
Let n = n1 + n2.
Expected runs: μR = 1 + (2n1n2 / n).
Variance: σ²R = [2n1n2(2n1n2 − n)] / [n²(n − 1)].
Z score: z = (R − μR) / σR.
With the continuity correction, the observed run count R is moved 0.5 toward μR before dividing by σR.
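The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function name `runs_z_score` and its `continuity` flag are hypothetical, and clamping the corrected difference at zero (so the correction never flips the sign) is an assumption about how the 0.5 adjustment is applied.

```python
import math

def runs_z_score(runs, n1, n2, continuity=False):
    """Return (expected_runs, variance, z) under the normal approximation."""
    if n1 < 1 or n2 < 1:
        raise ValueError("each group needs at least one value")
    n = n1 + n2
    expected = 1 + (2 * n1 * n2) / n
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n)) / (n ** 2 * (n - 1))
    if variance == 0:
        raise ValueError("variance is zero; the z score is undefined")
    diff = runs - expected
    if continuity:
        # Assumption: shrink |R - mu| by 0.5 toward the mean, never past it.
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    return expected, variance, diff / math.sqrt(variance)

# Counts taken from the example table: 2 "Below", 3 "Above", 4 runs.
expected, variance, z = runs_z_score(4, 2, 3)
print(expected, variance, z)
```

For the example table this gives an expected run count of about 3.4, a variance of about 0.84, and a z score of roughly 0.65, so four runs in five values is close to what randomness predicts.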
How to Use This Calculator
- Enter values in their original order.
- Choose numeric mode or two-category mode.
- Select median, mean, or custom cutoff for numbers.
- Choose tie handling near the cutoff.
- Select a test direction and alpha level.
- Press the calculate button.
- Review runs, z score, p value, and decision.
- Download CSV or PDF for records.
Runs Test for Randomness Guide
What the Test Checks
A runs test checks whether a sequence looks random. It studies the order of two groups. The groups may be heads and tails. They may also be values above and below a median. A run is a block of matching labels. For example, A A B B A has three runs. The test compares observed runs with the expected number under randomness. Too few runs may show clustering. Too many runs may show switching or alternation. This calculator helps you inspect both cases.
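Counting runs is the step everything else depends on. One way to sketch it in Python, using `itertools.groupby` to collapse each block of matching labels into a single group (the helper name `count_runs` is illustrative only):

```python
from itertools import groupby

def count_runs(labels):
    """Count blocks of consecutive equal labels in an ordered sequence."""
    # groupby yields one group per block of adjacent equal items.
    return sum(1 for _ in groupby(labels))

print(count_runs("AABBA"))  # 3 runs: AA, BB, A
```

The function accepts any ordered sequence of labels, so a list such as `["H", "T", "T"]` works the same way as a string.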
When to Use It
Use the tool when order matters. It is helpful for coin tosses, pass/fail records, signs in residuals, quality checks, or market direction lists. Numeric data can be split by mean, median, or a custom cutoff. Categorical data can be tested directly when it has two labels. Values equal to the cutoff (ties) can be removed or assigned to one group. Removing ties is usually cleaner near the cutoff.
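The splitting step might look like the sketch below. The function `split_by_cutoff` and its option names (`"median"`, `"mean"`, `"drop"`, `"above"`, `"below"`) are assumptions made for illustration; the calculator's actual option names are not given in this guide.

```python
from statistics import mean, median

def split_by_cutoff(values, cutoff="median", ties="drop"):
    """Label each value "Above" or "Below" a cutoff, keeping the original order."""
    if ties not in ("drop", "above", "below"):
        raise ValueError("ties must be 'drop', 'above', or 'below'")
    if cutoff == "median":
        c = median(values)
    elif cutoff == "mean":
        c = mean(values)
    else:
        c = float(cutoff)  # custom numeric cutoff
    labels = []
    for v in values:
        if v > c:
            labels.append("Above")
        elif v < c:
            labels.append("Below")
        elif ties == "above":
            labels.append("Above")
        elif ties == "below":
            labels.append("Below")
        # ties == "drop": the value is left out of the sequence
    return labels

data = [12, 15, 9, 18, 21]  # values from the example table
print(split_by_cutoff(data, ties="above"))
print(split_by_cutoff(data, ties="drop"))
```

If these five values are the whole sample, their median is 15, so the value 15 is a tie. Assigning ties above reproduces the labels in the example table, while dropping the tie leaves Below, Below, Above, Above, which has only two runs. The tie rule can therefore change the run count noticeably in short sequences.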
Reading the Result
The result includes group counts, observed runs, expected runs, variance, z score, p value, and decision. The z score shows how many standard deviations the observed run count lies from the random expectation. The p value is the probability, under randomness, of a run count at least as extreme as the one observed. A small p value means the order is less consistent with randomness. It does not prove a cause. It only flags a pattern worth checking.
Choosing the Test Direction
A two-sided test detects either clustering or excessive alternation. A left-tailed test focuses on too few runs. A right-tailed test focuses on too many runs. Choose alpha before reading the result. Common alpha values are 0.05 and 0.01. Use a stricter alpha when false alarms are costly.
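The three directions map onto the standard normal distribution as sketched below. The names `runs_p_value` and `decide` are hypothetical, and rejecting when the p value is strictly below alpha is an assumption; the guide does not say how the calculator treats a p value exactly equal to alpha.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def runs_p_value(z, direction="two-sided"):
    """Convert a runs-test z score into a p value for the chosen direction."""
    if direction == "left":        # too few runs (clustering)
        return normal_cdf(z)
    if direction == "right":       # too many runs (alternation)
        return normal_cdf(-z)
    if direction == "two-sided":   # either pattern
        return 2.0 * normal_cdf(-abs(z))
    raise ValueError("direction must be 'left', 'right', or 'two-sided'")

def decide(p_value, alpha=0.05):
    # Assumption: reject only when p is strictly below alpha.
    return "reject randomness" if p_value < alpha else "fail to reject randomness"

p = runs_p_value(-2.1, direction="left")
print(p, decide(p, alpha=0.05))
```

A negative z score means fewer runs than expected, so it produces a small p value in the left-tailed test and a large one in the right-tailed test.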
Good Data Practice
Good input quality matters. Keep the original order. Do not sort the data. Remove irrelevant records before testing. Make sure the two groups are meaningful. For small samples, the normal approximation can be rough. Treat borderline results with care. Use the CSV and PDF exports to save the full calculation. Then compare it with charts, domain knowledge, and any planned follow-up test.
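When the sample is small, one possible cross-check is the exact distribution of the run count, which follows from counting arrangements of n1 and n2 labels. The sketch below is an alternative to the normal approximation and is not described in this guide as something the calculator does; the function names are illustrative.

```python
from math import comb

def exact_runs_pmf(n1, n2):
    """Exact probability of each run count for n1 and n2 labels in random order."""
    if n1 < 1 or n2 < 1:
        raise ValueError("each group needs at least one value")
    total = comb(n1 + n2, n1)  # number of equally likely arrangements
    pmf = {}
    for r in range(2, n1 + n2 + 1):
        k, odd = divmod(r, 2)
        if odd:   # r = 2k + 1: one group has k + 1 runs, the other has k
            ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                    + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
        else:     # r = 2k: both groups have k runs, either group may start
            ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        if ways:
            pmf[r] = ways / total
    return pmf

def exact_left_tail(runs, n1, n2):
    """Exact probability of observing `runs` or fewer runs."""
    return sum(p for r, p in exact_runs_pmf(n1, n2).items() if r <= runs)

print(exact_runs_pmf(2, 3))
print(exact_left_tail(4, 2, 3))
```

For the example table (2 below, 3 above) there are only ten possible arrangements, so the run count can take just four values. With so few outcomes, a z score gives only a coarse picture.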
Important Limits
The method is simple, but interpretation needs context. Seasonal effects can create runs. Machine drift can create runs. Human scheduling can create runs. Random sources can also produce strange streaks. Review sample size, labels, and collection rules before changing a process. Repeated testing may need adjusted significance levels.
FAQs
What is a run?
A run is a continuous block of the same group. In A A B A A, there are three runs because the label changes twice.
What does the null hypothesis mean?
The null hypothesis says the sequence order is random. The test checks whether the observed number of runs is unusual under that assumption.
When should I use median splitting?
Use median splitting when numeric values need two balanced groups. It is common because the median is less affected by extreme values than the mean.
What do too few runs suggest?
Too few runs may suggest clustering. Similar values appear together more often than expected under random ordering.
What do too many runs suggest?
Too many runs may suggest alternation. The sequence switches between groups more often than expected by chance.
Should I remove ties?
Removing ties is often preferred when values equal the cutoff. It avoids forcing neutral values into either group.
What alpha level should I choose?
Common choices are 0.05 and 0.01. Choose alpha before testing. Use lower alpha when false alerts are costly.
Can this prove data is random?
No. It can only test one type of ordering pattern. Use it with plots, context, and other checks.