Cross Correlation Calculator

Calculator Inputs

Series A (x[n])

Paste numbers separated by spaces, commas, or new lines.

Series B (y[n])

Use the same sample interval for meaningful lag.

Maximum Lag (samples)

The calculator evaluates lags from -max to +max.

Scaling Option

Scaling changes amplitude interpretation across different overlaps.

Normalization

Coefficient mode reports a similarity score per lag.

Remove mean (detrend by constant)

Recommended when DC offsets dominate the product sum.

Example Data Table

Use this quick example to confirm your workflow.

Index n	Series A x[n]	Series B y[n]
0	1	0
1	2	1
2	3	1
3	4	2
4	5	3
5	4	5
6	3	8
7	2	13
8	1	21

Formula Used

Core Cross Correlation

For a lag k, the calculator correlates x[n] with y[n+k] only where both samples exist.

Raw sum form:
R_xy[k] = Σ x[n] · y[n+k]

If mean removal is enabled, each series is shifted by its average before the sum: x'[n]=x[n]-μ_x, y'[n]=y[n]-μ_y.
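The raw sum above can be sketched in a few lines of Python. This is a minimal illustration rather than the calculator's actual code: the function name `cross_correlation` and its parameters are assumptions, and the mean is taken over each full series, as the text describes. The data come from the example table.

```python
def cross_correlation(x, y, max_lag, remove_mean=False):
    """Return {lag: (raw_sum, overlap_count)} for lags -max_lag..+max_lag."""
    if remove_mean:
        # Assumption: subtract the mean of each full series, not of the overlap.
        mean_x = sum(x) / len(x)
        mean_y = sum(y) / len(y)
        x = [v - mean_x for v in x]
        y = [v - mean_y for v in y]
    result = {}
    for k in range(-max_lag, max_lag + 1):
        # Keep only indices n where both x[n] and y[n + k] exist.
        start = max(0, -k)
        stop = min(len(x), len(y) - k)
        total = sum(x[n] * y[n + k] for n in range(start, stop))
        result[k] = (total, max(0, stop - start))
    return result


# Series A and Series B from the example data table.
x = [1, 2, 3, 4, 5, 4, 3, 2, 1]
y = [0, 1, 1, 2, 3, 5, 8, 13, 21]

table = cross_correlation(x, y, max_lag=2)
for lag, (value, overlap) in table.items():
    print(lag, value, overlap)
```

For the example data, lag 0 gives a raw sum of 119 over 9 overlapping samples, and lag 1 gives 159 over 8 samples.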

Scaling Options

None: reports the raw overlap sum.
Biased: divides by the smaller series length at every lag, so values taper toward zero as overlap shrinks.
Unbiased: divides by the overlap count at each lag.
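The three options differ only in the divisor applied to the raw sum. The sketch below assumes a hypothetical helper named `scale`. Returning NaN for an empty overlap is an assumption, not documented calculator behavior.

```python
def scale(raw_sum, overlap, len_x, len_y, mode="none"):
    """Apply one of the three scaling options to a raw overlap sum."""
    if mode == "none":
        return raw_sum
    if mode == "biased":
        # Fixed divisor: the smaller of the two series lengths.
        return raw_sum / min(len_x, len_y)
    if mode == "unbiased":
        # Per-lag divisor: the number of overlapping sample pairs.
        # Assumption: an empty overlap yields NaN rather than an error.
        return raw_sum / overlap if overlap > 0 else float("nan")
    raise ValueError(f"unknown scaling mode: {mode}")


# Raw sum 189 with 7 overlapping samples (lag 2 of the example table, length 9).
print(scale(189, 7, 9, 9, "none"))      # 189
print(scale(189, 7, 9, 9, "biased"))    # 21.0
print(scale(189, 7, 9, 9, "unbiased"))  # 27.0
```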

Coefficient Normalization

Coefficient mode returns a similarity score per lag:

r_xy[k] = Σ x'[n] y'[n+k] / √(Σ x'[n]² · Σ y'[n+k]²)

Values near +1 indicate strong alignment, near -1 indicate inverted alignment, and near 0 indicate weak similarity at that lag.
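A minimal Python sketch of the coefficient formula follows. The function name `coefficient_at_lag` is hypothetical, and returning 0.0 when either overlap energy is zero is an assumption made only to avoid division by zero.

```python
import math


def coefficient_at_lag(x, y, k, remove_mean=True):
    """Similarity score in [-1, 1] for x[n] against y[n + k] over the overlap."""
    if remove_mean:
        mean_x = sum(x) / len(x)
        mean_y = sum(y) / len(y)
        x = [v - mean_x for v in x]
        y = [v - mean_y for v in y]
    start = max(0, -k)
    stop = min(len(x), len(y) - k)
    numerator = sum(x[n] * y[n + k] for n in range(start, stop))
    energy_x = sum(x[n] ** 2 for n in range(start, stop))
    energy_y = sum(y[n + k] ** 2 for n in range(start, stop))
    denominator = math.sqrt(energy_x * energy_y)
    # Assumption: report 0.0 when there is no overlap or no signal energy.
    return numerator / denominator if denominator > 0 else 0.0


same = coefficient_at_lag([1, 2, 3, 4], [2, 4, 6, 8], k=0)
flipped = coefficient_at_lag([1, 2, 3, 4], [8, 6, 4, 2], k=0)
print(same, flipped)  # 1.0 -1.0
```

Because the energies are summed over the same overlap as the numerator, the score stays within -1 to 1 at every lag.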

How to Use This Calculator

Paste your two sequences into Series A and Series B.
Set the maximum lag in samples to scan left and right.
Enable mean removal when offsets skew the product sum.
Select scaling if you want overlap-compensated magnitudes.
Choose coefficient normalization for a -1 to 1 score.
Click Calculate to view peak lag and the full lag table.
Use CSV or PDF buttons to export the computed results.

Cross Correlation in Practice

1) What cross correlation measures

Cross correlation estimates how similar two signals are after shifting one in time. With the y[n+k] convention used here, a peak at a positive lag k means features in Series B appear k samples later than in Series A, and a peak at a negative lag means they appear earlier. This is widely used in vibration analysis, optics, seismology, and system identification.

2) Understanding lag and sampling

Lag is counted in samples, so its physical meaning depends on your sampling interval. For example, with a 1 ms sampling step, k = 25 corresponds to a 25 ms shift. Keep both series recorded with the same interval and time base to avoid false peaks.
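The conversion is a single multiplication. The helper name `lag_to_delay_ms` below is hypothetical, and it assumes the sampling interval is given in milliseconds.

```python
def lag_to_delay_ms(lag_samples, sample_interval_ms):
    """Convert a lag in samples to a time shift in milliseconds."""
    return lag_samples * sample_interval_ms


print(lag_to_delay_ms(25, 1.0))  # 25.0 (25 samples at a 1 ms step)
print(lag_to_delay_ms(40, 0.5))  # 20.0 (40 samples at a 0.5 ms step)
```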

3) Mean removal and DC offsets

Many sensors add a DC offset, a constant baseline shift. If you correlate raw values, the product sum can be dominated by the offset rather than the changing part of the signal. Enabling mean removal subtracts μ from each series before correlation, improving peak localization for oscillatory or transient data. A baseline that drifts over time is not constant, so mean removal only partly corrects it.
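The effect is easy to reproduce with made-up numbers. In the sketch below, both series are hypothetical: a small pulse sits on a constant offset of 10, and the pulse in `b` arrives 2 samples later than in `a`. The helper `lag_sums` is an illustrative name, not part of the calculator.

```python
def lag_sums(x, y, max_lag, remove_mean=False):
    """Raw overlap sums per lag, optionally after subtracting each series mean."""
    if remove_mean:
        mean_x = sum(x) / len(x)
        mean_y = sum(y) / len(y)
        x = [v - mean_x for v in x]
        y = [v - mean_y for v in y]
    sums = {}
    for k in range(-max_lag, max_lag + 1):
        start = max(0, -k)
        stop = min(len(x), len(y) - k)
        sums[k] = sum(x[n] * y[n + k] for n in range(start, stop))
    return sums


# Hypothetical data: offset of 10, pulse at index 2 in a and index 4 in b.
a = [10, 10, 13, 10, 10, 10, 10]
b = [10, 10, 10, 10, 13, 10, 10]

raw = lag_sums(a, b, max_lag=3)
centered = lag_sums(a, b, max_lag=3, remove_mean=True)

raw_peak = max(raw, key=raw.get)
centered_peak = max(centered, key=centered.get)
print(raw_peak, centered_peak)  # 0 2
```

Without mean removal the peak sits at lag 0, simply because that lag has the most overlapping offset terms. With mean removal the peak moves to lag 2, the true pulse delay.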

4) Scaling choices with overlap size

Overlap decreases at larger lags, so raw sums naturally shrink as fewer terms contribute. The unbiased option divides by the overlap count, making values comparable across lags when the signal energy is stationary. The biased option divides by the smaller series length at every lag, so magnitudes taper at large lags instead of being inflated by a small divisor.

5) Coefficient normalization for comparability

Coefficient mode outputs a bounded score from -1 to 1 by dividing the overlap sum by the square root of the product of the two overlap energies. This is useful when comparing different datasets, channels, or units (e.g., volts vs. meters). A value near +1 indicates strong alignment; near -1 suggests inversion; near 0 suggests weak similarity at that lag.

6) Interpreting peaks and confidence

The calculator reports both the maximum peak and the maximum absolute peak, since some applications care about inverted matches. When overlap is small, peaks can be noisy. Prefer peaks that remain strong across nearby lags and validate by inspecting the lag table rather than relying on a single number.
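The two peaks can differ when the strongest match is inverted. The sketch below uses a hypothetical lag table with illustrative coefficient values. `find_peaks` is an assumed helper name, not the calculator's API.

```python
def find_peaks(lag_table):
    """Return ((lag, value) of the maximum, (lag, value) of the maximum absolute value)."""
    max_lag = max(lag_table, key=lag_table.get)
    max_abs_lag = max(lag_table, key=lambda k: abs(lag_table[k]))
    return (max_lag, lag_table[max_lag]), (max_abs_lag, lag_table[max_abs_lag])


# Hypothetical coefficient values per lag, for illustration only.
lag_table = {-2: 0.10, -1: -0.85, 0: 0.30, 1: 0.62, 2: 0.05}

max_peak, max_abs_peak = find_peaks(lag_table)
print(max_peak)      # (1, 0.62)
print(max_abs_peak)  # (-1, -0.85)
```

Here the largest positive value is at lag 1, while the strongest match overall is the inverted one at lag -1.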

7) Typical engineering use cases

In acoustics, cross correlation helps estimate time delay between microphones for localization. In rotating machinery, it can reveal phase delays between sensors on different bearings. In communications, it supports synchronization by detecting known patterns in a received waveform across candidate time shifts.

8) Practical data tips for better results

Use sufficiently long records: more samples increase overlap and reduce variance. If your signals have trends, consider detrending (mean removal is a first step). Remove obvious outliers, and keep units consistent. Export CSV/PDF outputs to document peaks, overlap counts, and analysis settings in lab notes or reports.

FAQs

1) What is the difference between cross correlation and convolution?

Correlation compares similarity using a shift, while convolution combines signals using a flip-and-shift operation. Correlation is commonly used for delay detection; convolution is often used for filtering and system response modeling.
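The relationship can be checked numerically: correlating x with y gives the same numbers as convolving the time-reversed x with y. The pure-Python sketch below uses illustrative function names and made-up integer data.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_value in enumerate(a):
        for j, b_value in enumerate(b):
            out[i + j] += a_value * b_value
    return out


def correlate(x, y):
    """Raw sums of x[n] * y[n + k] for every lag k with at least one overlap."""
    out = []
    for k in range(-(len(x) - 1), len(y)):
        start = max(0, -k)
        stop = min(len(x), len(y) - k)
        out.append(sum(x[n] * y[n + k] for n in range(start, stop)))
    return out


x = [1, 2, 3]
y = [4, 5, 6]

print(correlate(x, y))       # [12, 23, 32, 17, 6]
print(convolve(x[::-1], y))  # [12, 23, 32, 17, 6]
```

The only difference is the flip: convolution reverses one input before sliding it, and correlation does not.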

2) Why do I get different peak lags after enabling mean removal?

Mean removal reduces the influence of DC offsets. If offsets dominate the raw product sum, the reported peak can move. With mean removal, the peak is more likely tied to the varying signal features rather than baseline level.

3) Which normalization should I use for comparing two experiments?

Use coefficient normalization when you want a consistent -1 to 1 similarity score across datasets with different amplitudes or units. It makes comparisons clearer when signal energy varies between trials.

4) What does “overlap samples” mean in the lag table?

It is the number of paired points used to compute correlation at a specific lag. Lags farther from zero have fewer overlapping samples, which can increase uncertainty and make peaks less reliable.

5) How do I convert lag to a time delay?

Multiply lag by the sampling interval. For instance, a lag of 40 samples at a 0.5 ms interval corresponds to a 20 ms time shift (40 × 0.5 ms).

6) Can I use sequences with different lengths?

Yes. The calculator automatically uses only the valid overlap region for each lag. However, very different lengths can reduce overlap at larger lags, so consider limiting max lag to maintain meaningful sample counts.
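To see how overlap shrinks with unequal lengths, the count can be computed directly from the two lengths and the lag. The helper `overlap_count` is a hypothetical name, and it assumes the y[n+k] pairing used in the formula section.

```python
def overlap_count(len_x, len_y, k):
    """Number of indices n where both x[n] and y[n + k] exist."""
    start = max(0, -k)
    stop = min(len_x, len_y - k)
    return max(0, stop - start)


# Hypothetical lengths: Series A has 9 samples, Series B has 5.
counts = {k: overlap_count(9, 5, k) for k in range(-6, 7)}
for lag, count in counts.items():
    print(lag, count)
```

With these lengths the overlap stays at 5 samples for lags -4 through 0, then drops to 1 at lag 4 and to 0 at lag 5, which is a reason to cap the maximum lag.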

7) Why are my coefficient values close to zero everywhere?

This usually indicates weak similarity between the series, heavy noise, or mismatched sampling. Check that both signals share the same time base, consider mean removal, and ensure the max lag range covers the expected delay.