# Calculator

## Example data table
| Index | Sequence A | Sequence B |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 2 |
| 4 | 2 | 3 |
| 5 | 1 | 2 |
| 6 | — | 1 |
This toy dataset illustrates different sequence lengths and a mild timing shift.
## Formula used
Dynamic Time Warping aligns two sequences by allowing repeats and skips in time. It builds a cumulative cost matrix D using dynamic programming.
D(0,0) = 0; D(i,0) = ∞ for i > 0; D(0,j) = ∞ for j > 0
D(i,j) = min{ D(i−1,j) + wᵥ·c(i,j), D(i,j−1) + wₕ·c(i,j), D(i−1,j−1) + w_d·c(i,j) }
Here c(i,j) is the local distance between the i-th element of A and the j-th element of B, and wᵥ, wₕ, w_d weight vertical, horizontal, and diagonal steps.
The window constraint limits calculations to cells where |i − j| ≤ window. Step weights adjust how strongly insertions, deletions, and matches are penalized.
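The recurrence above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming unit step weights and absolute difference as the local cost c(i,j):

```python
import math

def dtw(a, b):
    """Plain DTW distance: absolute-difference local cost, unit step weights."""
    n, m = len(a), len(b)
    # D[i][j] = cumulative cost of the best warping path aligning
    # a[:i] with b[:j]; borders are infinite except the D(0,0) = 0 corner.
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = abs(a[i - 1] - b[j - 1])          # local cost c(i, j)
            D[i][j] = c + min(D[i - 1][j],        # vertical step
                              D[i][j - 1],        # horizontal step
                              D[i - 1][j - 1])    # diagonal match
    return D[n][m]

# The toy sequences from the example table warp onto each other perfectly:
print(dtw([1, 2, 3, 2, 1], [1, 1, 2, 3, 2, 1]))  # → 0.0
```

Note that the two sequences have different lengths (5 and 6), which DTW handles without padding.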
## How to use this calculator
- Enter numeric values for both sequences, separated by commas or spaces.
- Select a local distance metric based on your error sensitivity.
- Optionally set a window to prevent unrealistic time shifts.
- Adjust step weights if you want to discourage repeats or skips.
- Click Compute DTW to show results above the form.
- Export your matrix or summary with the CSV or PDF buttons.
## Why DTW matters for time series
Dynamic Time Warping (DTW) compares sequences when events occur at different speeds. Instead of forcing point‑to‑point matching, DTW allows repeats and skips to align shapes. In practice, this improves similarity scoring for signals with tempo drift, sensor lag, or irregular sampling, while still producing one interpretable distance value.
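To make the contrast with point-to-point matching concrete, consider a spike shifted by one sample (a small illustrative sketch, assuming absolute difference as the local cost):

```python
import math

def dtw(a, b):
    # Minimal DTW with absolute-difference cost (see the formula section).
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = abs(a[i - 1] - b[j - 1])
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def euclidean(a, b):
    # Point-to-point comparison; requires equal lengths and alignment.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

x = [0, 0, 1, 0, 0]   # spike at index 2
y = [0, 1, 0, 0, 0]   # same spike, one sample earlier

print(euclidean(x, y))  # → ~1.414: the timing shift is counted as error
print(dtw(x, y))        # → 0.0: warping absorbs the one-sample shift
```

The shapes are identical; only the timing differs, and only DTW recognizes that.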
## Interpreting DTW and normalized distance
The raw DTW distance accumulates local deviations along the best warping path, so it grows with both mismatch and path length. Normalizing by path length makes values more comparable across different sequence sizes. For model features, you can use both: raw distance reflects total error, while normalized distance reflects average per‑alignment error.
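A sketch of how both values can be computed together, by backtracking the warping path after filling the matrix (assuming absolute-difference local cost):

```python
import math

def dtw_with_path(a, b):
    """Return (raw distance, path-length-normalized distance, warping path)."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = abs(a[i - 1] - b[j - 1])
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    # Backtrack from (n, m) to (1, 1) through minimal predecessors.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i, j))
        steps = {(i - 1, j - 1): D[i - 1][j - 1],   # diagonal preferred on ties
                 (i - 1, j): D[i - 1][j],
                 (i, j - 1): D[i][j - 1]}
        i, j = min(steps, key=steps.get)
    path.reverse()
    raw = D[n][m]
    return raw, raw / len(path), path

raw, norm, path = dtw_with_path([1, 2, 3, 2, 1], [1, 1, 2, 3, 2, 1])
print(raw, norm, len(path))
```

The normalized value divides by the number of alignment steps, so a long, mildly noisy pair is no longer penalized more than a short, badly mismatched one.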
## Windowing and complexity controls
A window constraint limits how far the alignment can drift from the diagonal, reducing unrealistic matches and speeding computation. A band of width w cuts the work from O(n·m) to roughly O(n·w), often a dramatic saving while preserving accuracy on real data. Windowing is especially useful for long signals in production scoring, where you want predictable latency and stable results.
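The banded version only fills cells with |i − j| ≤ w (a Sakoe–Chiba band). A minimal sketch, assuming absolute-difference cost; the band is widened to at least the length difference so the final corner stays reachable:

```python
import math

def dtw_window(a, b, w):
    """DTW restricted to a Sakoe–Chiba band of half-width w."""
    n, m = len(a), len(b)
    w = max(w, abs(n - m))   # otherwise D(n, m) can be unreachable
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells within the band are ever computed.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            c = abs(a[i - 1] - b[j - 1])
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

x = [0, 0, 1, 0, 0]
y = [0, 1, 0, 0, 0]
print(dtw_window(x, y, 0))  # → 2.0: diagonal only, the shift is fully penalized
print(dtw_window(x, y, 1))  # → 0.0: a one-cell band absorbs the shift
```

Choosing w just large enough for the plausible time shifts keeps both the cost and the risk of spurious matches low.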
When comparing many series, you can prefilter candidates using simple bounds, then run DTW only on the top matches. For noisy streams, combine a small window with smoothing to avoid chasing spikes. If your task needs partial alignment, subsequence DTW can match a short pattern inside a long signal. These choices turn DTW from a research tool into a reliable component for scoring, retrieval, and monitoring at scale, with clear controls.
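One common cheap bound is LB_Keogh: compare the query against the candidate's running min/max envelope, and skip the full DTW whenever the bound already exceeds your threshold. A sketch of the absolute-difference variant, assuming equal-length sequences:

```python
import math

def lb_keogh(q, c, r):
    """LB_Keogh-style lower bound on banded DTW (half-width r, abs cost)."""
    total = 0.0
    for i, qi in enumerate(q):
        lo, hi = max(0, i - r), min(len(c), i + r + 1)
        upper, lower = max(c[lo:hi]), min(c[lo:hi])  # envelope of the candidate
        if qi > upper:
            total += qi - upper       # only out-of-envelope excess contributes
        elif qi < lower:
            total += lower - qi
    return total

def dtw(a, b):
    # Full DTW, run only on candidates that survive the bound.
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

q = [1, 2, 3, 2, 1]
c = [1, 1, 2, 3, 2]
# The bound never exceeds the true distance, so it is safe for pruning:
print(lb_keogh(q, c, 1), dtw(q, c))  # → 1.0 1.0
```

The bound is O(n·r) versus O(n·m) for DTW, which is why it pays off when scanning a large candidate set.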
## Choosing local distance and step weights
Absolute difference is robust when outliers are expected, while squared difference emphasizes large errors and can sharpen separation for clean signals. Step weights let you penalize horizontal or vertical moves to discourage excessive stretching. This is a practical knob: higher skip penalties yield tighter alignments and can reduce false similarity between unrelated patterns.
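Both knobs fit naturally into the recurrence. A sketch with a selectable metric and step weights applied to the local cost, as in the formula section; the parameter names (`wh`, `wv`, `wd`) are illustrative, not a library API:

```python
import math

def dtw_weighted(a, b, metric="abs", wh=1.0, wv=1.0, wd=1.0):
    """DTW with a choice of local metric and per-step weights."""
    n, m = len(a), len(b)
    dist = (lambda x, y: abs(x - y)) if metric == "abs" \
        else (lambda x, y: (x - y) ** 2)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(a[i - 1], b[j - 1])
            D[i][j] = min(D[i - 1][j] + wv * c,      # vertical: stretch a
                          D[i][j - 1] + wh * c,      # horizontal: stretch b
                          D[i - 1][j - 1] + wd * c)  # diagonal: match
    return D[n][m]

# Raising the horizontal weight makes stretching b three times costlier:
print(dtw_weighted([0], [0, 0, 1]))        # → 1.0
print(dtw_weighted([0], [0, 0, 1], wh=3))  # → 3.0
```

With all weights at 1 this reduces to plain DTW, so the weighted form is a strict generalization.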
## Practical machine learning use cases
DTW is widely used for nearest‑neighbor classification of gestures, clustering of trajectories, template matching in audio, and aligning physiological signals. Many pipelines apply scaling or z‑normalization first, then compute DTW as a feature or distance metric. The returned path also supports interpretability by showing where time shifts occurred, which can guide feature engineering.
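A typical pipeline of this kind, z-normalize then nearest-template under DTW, can be sketched as follows; the template set here is invented purely for illustration:

```python
import math
import statistics

def znorm(x):
    """Zero mean, unit variance, so DTW compares shape rather than level."""
    mu = statistics.fmean(x)
    sd = statistics.pstdev(x) or 1.0   # guard against constant sequences
    return [(v - mu) / sd for v in x]

def dtw(a, b):
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = abs(a[i - 1] - b[j - 1])
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def classify_1nn(query, templates):
    """Label the query with its nearest template under DTW distance."""
    q = znorm(query)
    return min(templates, key=lambda label: dtw(q, znorm(templates[label])))

templates = {"rising": [1, 2, 3, 4, 5], "falling": [5, 4, 3, 2, 1]}
print(classify_1nn([10, 20, 30, 40, 55], templates))  # → rising
```

Because of the z-normalization, the query is matched by shape even though its scale and offset differ from every template.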
## FAQs
1) When should I use DTW instead of Euclidean distance?
Use DTW when similar patterns are time-shifted or stretched, such as speech, motion, or sensor signals. Euclidean distance assumes perfect alignment and can overstate differences when timing varies.
2) What does the window constraint do?
The window limits how far alignment can drift from the diagonal. It reduces computation and prevents unrealistic matches, especially for long sequences where large time warps would not be meaningful.
3) Why normalize the DTW distance?
Raw DTW distance grows with path length. Dividing by the warping path length gives an average per-step mismatch, making comparisons fairer across different sequence lengths.
4) Which local metric should I choose?
Absolute difference is more robust to occasional spikes. Squared difference penalizes large deviations more strongly and can improve separation when signals are clean and well-scaled.
5) What do step weights change?
Weights control the cost of diagonal matches versus horizontal or vertical moves. Increasing skip penalties discourages excessive stretching and typically produces tighter, more conservative alignments.
6) Why is matrix export sometimes disabled?
Full matrices grow quickly with sequence length. For large inputs, this page switches to a memory-efficient distance-only method to keep performance stable, so matrix and path export may be unavailable.