Explore similarity structure with clear clustering outputs. Review merge stages, assignments, and visual patterns confidently. Export polished reports for faster analysis, sharing, and decisions.
| Label | Feature 1 | Feature 2 | Feature 3 |
|---|---|---|---|
| A | 1.0 | 1.4 | 0.8 |
| B | 1.2 | 1.6 | 1.1 |
| D | 5.2 | 5.5 | 5.0 |
| G | 8.8 | 2.0 | 7.5 |
This sample shows three separated patterns, making linkage behavior easier to compare.
Euclidean distance: d(a,b) = square root of the sum of squared coordinate differences.
Manhattan distance: d(a,b) = sum of absolute coordinate differences.
Chebyshev distance: d(a,b) = largest absolute coordinate difference.
Z-score standardization: z = (value minus mean) divided by standard deviation.
Single linkage: smallest point-to-point distance between two clusters.
Complete linkage: largest point-to-point distance between two clusters.
Average linkage: mean of all inter-cluster pair distances.
Centroid linkage: distance between cluster centroids.
Ward linkage: merge that produces the smallest increase in within-cluster variance.
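The distance and linkage definitions above can be sketched in plain Python. This is a minimal illustration, not the calculator's own code: the function names are hypothetical, and the Ward helper assumes the common sum-of-squares form, where the increase equals n1·n2/(n1+n2) times the squared Euclidean distance between centroids.

```python
import math
from itertools import product

def euclidean(a, b):
    # Square root of the sum of squared coordinate differences
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute coordinate differences
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    # Largest absolute coordinate difference
    return max(abs(x - y) for x, y in zip(a, b))

def centroid(cluster):
    # Coordinate-wise mean of the points in a cluster
    n = len(cluster)
    return tuple(sum(col) / n for col in zip(*cluster))

def linkage_distance(c1, c2, method="single", dist=euclidean):
    # All point-to-point distances between the two clusters
    pairs = [dist(p, q) for p, q in product(c1, c2)]
    if method == "single":
        return min(pairs)
    if method == "complete":
        return max(pairs)
    if method == "average":
        return sum(pairs) / len(pairs)
    if method == "centroid":
        return dist(centroid(c1), centroid(c2))
    raise ValueError(f"unknown linkage method: {method}")

def ward_increase(c1, c2):
    # Assumed form: increase in within-cluster sum of squares after merging
    n1, n2 = len(c1), len(c2)
    return n1 * n2 / (n1 + n2) * euclidean(centroid(c1), centroid(c2)) ** 2

# Rows A, B, and D from the sample table
A, B, D = (1.0, 1.4, 0.8), (1.2, 1.6, 1.1), (5.2, 5.5, 5.0)
for method in ("single", "complete", "average", "centroid"):
    print(method, linkage_distance([A, B], [D], method))
print("ward", ward_increase([A, B], [D]))
```

For the clusters {A, B} and {D}, single linkage reports the B-to-D distance and complete linkage reports the A-to-D distance, with average linkage falling between the two.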
Hierarchical clustering groups observations by repeatedly merging the two most similar clusters. The full merge path lets you inspect the structure at several cluster counts, not only one fixed answer.
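The repeated merging described above can be sketched as a naive loop. This is an illustrative version under stated assumptions, not the calculator's implementation: the `agglomerate` function is hypothetical, it uses Euclidean distance, and passing `min` or `max` as the `linkage` argument selects single or complete linkage.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points, linkage=min):
    """points maps label -> feature tuple.
    Returns the merge path as (members_a, members_b, distance) tuples."""
    clusters = [[label] for label in points]  # every observation starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage distance
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(
                    euclidean(points[p], points[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((tuple(clusters[i]), tuple(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Sample table rows
sample = {
    "A": (1.0, 1.4, 0.8),
    "B": (1.2, 1.6, 1.1),
    "D": (5.2, 5.5, 5.0),
    "G": (8.8, 2.0, 7.5),
}
merges = agglomerate(sample)  # single linkage
for left, right, distance in merges:
    print(left, "+", right, "at", round(distance, 3))
```

With n observations the loop records n - 1 merges, so stopping after n - k merges leaves k clusters. The pairwise search is quadratic per step, which is fine for small inputs like the sample.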
Use scaling when features have different units or ranges. Without scaling, a large-range feature can dominate the distance calculation and distort group formation.
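As a rough sketch of the z-score formula listed earlier, each feature column can be standardized before distances are computed. The function name `zscore_columns` is hypothetical, and using the population standard deviation is an assumption; the calculator may use the sample form instead.

```python
import statistics

def zscore_columns(rows):
    """Standardize each feature column: (value - mean) / standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Assumption: population standard deviation
    stds = [statistics.pstdev(col) for col in columns]
    return [
        # A constant column (std of 0) is mapped to 0.0 to avoid division by zero
        tuple((v - m) / s if s else 0.0 for v, m, s in zip(row, means, stds))
        for row in rows
    ]

# Sample table rows A, B, D, G
rows = [(1.0, 1.4, 0.8), (1.2, 1.6, 1.1), (5.2, 5.5, 5.0), (8.8, 2.0, 7.5)]
standardized = zscore_columns(rows)
for row in standardized:
    print(tuple(round(v, 3) for v in row))
```

After standardization every column has mean 0 and standard deviation 1, so no single feature dominates the distance because of its range.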
Single linkage favors chaining, complete linkage creates tighter groups, average linkage balances both, centroid tracks cluster centers, and Ward often forms compact clusters.
The silhouette score estimates how well each observation fits its assigned cluster compared with the nearest other cluster. Values range from -1 to 1, and higher values usually indicate cleaner separation and more stable grouping.
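Assuming the measure described above is the standard silhouette coefficient, a minimal sketch looks like this. For each observation, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the score is (b - a) / max(a, b). The function name is hypothetical, and scoring singleton clusters as 0 is a common convention rather than something stated on this page.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_values(points, labels, dist=euclidean):
    # Group observation indexes by cluster label
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own or len(clusters) < 2:
            scores.append(0.0)  # convention for singletons or a single cluster
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores

# Sample rows A, B, D, G with A and B grouped together
points = [(1.0, 1.4, 0.8), (1.2, 1.6, 1.1), (5.2, 5.5, 5.0), (8.8, 2.0, 7.5)]
labels = [0, 0, 1, 2]
scores = silhouette_values(points, labels)
print([round(s, 3) for s in scores])
```

Averaging the per-observation values gives one overall score, which is handy for comparing different cluster counts on the same data.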
Ward linkage is based on variance geometry, which aligns with Euclidean space. This tool automatically switches to Euclidean distance for that specific linkage to preserve consistency.
The calculator also works with a single numeric feature. In that case the scatter chart plots observation order on the horizontal axis and the feature values on the vertical axis.
Cut height is the merge distance at the step that produced your chosen cluster count. It helps compare how much dissimilarity was accepted before groups were combined.
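The cut height lookup described above can be sketched as follows. With n observations, reaching k clusters takes n - k merges, so the cut height is the distance recorded at merge number n - k. The function name and the merge distances below are hypothetical illustration values, and returning 0.0 when no merge has happened (k equals n) is an assumption.

```python
def cut_height(merge_distances, n_observations, k):
    """Return the merge distance at the step that produced k clusters.

    merge_distances lists the distance of each merge in the order performed.
    """
    if not 1 <= k <= n_observations:
        raise ValueError("k must be between 1 and the number of observations")
    steps = n_observations - k  # merges needed to go from n clusters down to k
    if steps == 0:
        return 0.0  # assumption: nothing merged yet, so no dissimilarity accepted
    return merge_distances[steps - 1]

# Hypothetical merge distances for four observations (three merges)
distances = [0.4, 5.6, 6.8]
for k in (4, 3, 2, 1):
    print(k, "clusters -> cut height", cut_height(distances, 4, k))
```

A large jump between consecutive cut heights suggests that the later merge joined groups that were well separated.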
Include one observation per line, a label column, and numeric feature columns. Comma, tab, semicolon, and pipe delimiters are accepted in this version.
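As an illustration of the input layout described above, the sketch below reads one observation per line and picks whichever accepted delimiter appears most often on that line. The function name `parse_observations` is hypothetical, the sketch assumes there is no header row, and the calculator's real parser may behave differently.

```python
DELIMITERS = [",", "\t", ";", "|"]  # comma, tab, semicolon, pipe

def parse_observations(text):
    """Parse 'label<delim>value<delim>value...' lines into {label: features}."""
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Use the delimiter that occurs most often on this line
        delimiter = max(DELIMITERS, key=line.count)
        fields = [field.strip() for field in line.split(delimiter)]
        label = fields[0]
        # float() raises ValueError if a feature column is not numeric
        rows[label] = tuple(float(field) for field in fields[1:])
    return rows

text = "A,1.0,1.4,0.8\nB;1.2;1.6;1.1\nD|5.2|5.5|5.0\nG\t8.8\t2.0\t7.5\n"
observations = parse_observations(text)
for label, features in observations.items():
    print(label, features)
```

Mixing delimiters across lines, as in this example, only demonstrates detection; a real input file would normally use one delimiter throughout.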
Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.