Generate clear dendrograms for clustering and taxonomy studies. Paste data, tune linkage, then review merges. Download CSV summaries and PDF visuals for sharing today.
This sample contains six observations with three numeric features. Use it to validate parsing and compare linkage behaviors.
| Label | X1 | X2 | X3 |
|---|---|---|---|
| A | 1.0 | 2.0 | 1.5 |
| B | 1.2 | 1.8 | 1.7 |
| C | 5.1 | 4.9 | 5.0 |
| D | 5.2 | 5.1 | 4.8 |
| E | 3.0 | 3.2 | 3.1 |
| F | 8.0 | 7.9 | 8.1 |
This calculator converts a numeric dataset into a dendrogram using agglomerative hierarchical clustering. Each merge step records the two clusters joined, the merge distance, the new cluster id, and the new cluster size. The CSV export provides a complete audit trail of the clustering process for reproducible reporting. In the interface, the first 12 merges are previewed, while the export includes every merge from n clusters to 1.
Euclidean distance emphasizes straight‑line separation and is common for continuous variables, while Manhattan distance is often preferred when differences accumulate across many features. For mixed‑unit data, z‑score standardization transforms each feature to mean 0 and standard deviation 1, preventing a high‑variance column from dominating distances. Standardization is especially helpful when one feature is measured in thousands and another in decimals.
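The two metrics and the z‑score step can be sketched in a few lines of plain Python. This is a minimal illustration, not the calculator's own code: the function names are hypothetical, and the use of the population standard deviation (dividing by n rather than n − 1) is an assumption.

```python
import math

def zscore_columns(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Assumption: population standard deviation (divide by n, not n - 1).
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
        for c, m in zip(cols, means)
    ]
    # A constant column has std 0; map it to 0.0 instead of dividing by zero.
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute per-feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# One feature in thousands, one in decimals (made-up values).
raw = [[1000.0, 0.1], [2000.0, 0.9], [3000.0, 0.5]]
scaled = zscore_columns(raw)

print(euclidean(raw[0], raw[1]))        # dominated by the first column
print(euclidean(scaled[0], scaled[1]))  # both columns now contribute
print(manhattan(scaled[0], scaled[1]))
```

On the raw rows the distance is driven almost entirely by the column measured in thousands. After standardization both columns contribute on a comparable scale.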
Single linkage can create “chains” by repeatedly merging near neighbors. Complete linkage tends to form compact groups by considering the farthest pair across clusters. Average linkage balances both behaviors and is a strong default for exploratory work. Ward linkage merges the pair of clusters that minimizes the increase in within‑cluster variance and often produces well‑separated trees. When Ward is selected, the merge criterion scales with (nₐnᵦ/(nₐ+nᵦ))·||cₐ−cᵦ||², so cluster sizes affect the merge decision.
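The four linkage rules differ only in how they score a pair of clusters. The sketch below assumes Euclidean distance and represents each cluster as a list of points. The function names are hypothetical, and the Ward function returns the quantity (nₐnᵦ/(nₐ+nᵦ))·||cₐ−cᵦ||² from the paragraph above, without any further scaling the calculator may apply.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pair_distances(ca, cb):
    """All point-to-point distances between two clusters."""
    return [euclidean(p, q) for p in ca for q in cb]

def single_linkage(ca, cb):
    return min(pair_distances(ca, cb))   # nearest pair

def complete_linkage(ca, cb):
    return max(pair_distances(ca, cb))   # farthest pair

def average_linkage(ca, cb):
    d = pair_distances(ca, cb)
    return sum(d) / len(d)               # mean over all pairs

def centroid(cluster):
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def ward_criterion(ca, cb):
    """(na * nb / (na + nb)) * squared distance between centroids."""
    na, nb = len(ca), len(cb)
    sq = sum((x - y) ** 2 for x, y in zip(centroid(ca), centroid(cb)))
    return na * nb / (na + nb) * sq

# Rows A, B and C, D from the sample table, grouped by hand for illustration.
ab = [[1.0, 2.0, 1.5], [1.2, 1.8, 1.7]]
cd = [[5.1, 4.9, 5.0], [5.2, 5.1, 4.8]]
for rule in (single_linkage, complete_linkage, average_linkage, ward_criterion):
    print(rule.__name__, round(rule(ab, cd), 4))
```

For any pair of clusters, the single value is never larger than the average, and the average is never larger than the complete value. The Ward score is on a squared scale, so it is not directly comparable with the other three.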
The implementation uses a simple O(n³) search over active clusters at each step, which is suitable for small matrices. To keep results responsive in a browser workflow, inputs are limited to 80 rows and 20 columns. For larger studies, reduce dimensionality, sample observations, or compute the dendrogram in dedicated statistical tooling. As a benchmark, 40–60 rows typically render quickly, while 70–80 rows may feel slower depending on the server.
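A naive merge loop of this kind can be sketched as follows. It is an illustration under stated assumptions rather than the calculator's actual code: it uses Euclidean distance with average linkage, numbers the observations 0 to n − 1, gives each new cluster the next unused integer id, and breaks ties by keeping the first pair found. The record field names are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(points, members_a, members_b):
    total = sum(euclidean(points[i], points[j])
                for i in members_a for j in members_b)
    return total / (len(members_a) * len(members_b))

def agglomerate(points):
    """Merge the closest pair of active clusters until one remains."""
    clusters = {i: [i] for i in range(len(points))}  # id -> member indices
    next_id = len(points)                            # assumed id scheme
    merges = []
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        # Exhaustive search over every pair of active clusters.
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                d = average_linkage(points, clusters[ids[x]], clusters[ids[y]])
                if best is None or d < best[0]:
                    best = (d, ids[x], ids[y])
        d, a, b = best
        members = clusters.pop(a) + clusters.pop(b)
        clusters[next_id] = members
        merges.append({"a": a, "b": b, "distance": d,
                       "new_id": next_id, "size": len(members)})
        next_id += 1
    return merges

# Sample table rows A to F.
sample = [
    [1.0, 2.0, 1.5], [1.2, 1.8, 1.7], [5.1, 4.9, 5.0],
    [5.2, 5.1, 4.8], [3.0, 3.2, 3.1], [8.0, 7.9, 8.1],
]
for step in agglomerate(sample):
    print(step)
```

With n observations the loop always records n − 1 merges. On the sample table, the first merge under these assumptions joins C and D at a distance of about 0.30, slightly below the A to B distance of about 0.35.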
The vertical axis represents merge distance, so larger jumps indicate distinct groups. The “Cut clusters (k)” control stores assignments when the number of active clusters equals k (allowed range 2–12). Use the assignment table to label observations, compare linkage choices, and validate whether clusters align with domain expectations. The PDF report places the dendrogram on an A4 page with a merge summary for sharing.
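One way to picture the k‑cut is to replay the merge list and stop once k clusters remain. The sketch below assumes each merge is a tuple `(a, b, new_id)`; that format, the function name, and the example merge order are illustrative assumptions, not the calculator's export schema.

```python
def cut_assignments(n, merges, k):
    """Replay merges until k clusters are active; map observation -> cluster id."""
    if not 2 <= k <= 12:
        raise ValueError("k must be between 2 and 12")
    if k > n:
        raise ValueError("k cannot exceed the number of observations")
    clusters = {i: [i] for i in range(n)}  # every observation starts alone
    for a, b, new_id in merges:
        if len(clusters) == k:
            break                           # this is the cut stage
        clusters[new_id] = clusters.pop(a) + clusters.pop(b)
    if len(clusters) != k:
        raise ValueError("merge list too short to reach k clusters")
    return {obs: cid for cid, members in clusters.items() for obs in members}

# Illustrative merge order for six observations A..F (indices 0..5).
merges = [(2, 3, 6), (0, 1, 7), (4, 7, 8), (6, 8, 9), (5, 9, 10)]
labels = "ABCDEF"
for obs, cid in sorted(cut_assignments(6, merges, 3).items()):
    print(labels[obs], "-> cluster", cid)
```

With k = 3 and this merge order, A, B, and E share cluster 8, C and D share cluster 6, and F keeps its original id 5 because it has not been merged yet.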
Use numeric feature columns, optionally with a header row and a label column. Non‑numeric cells will be rejected to keep distances valid.
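As a rough sketch of that validation, the parser below assumes comma‑separated text and takes explicit `has_header` and `has_labels` flags, since how the calculator detects headers and labels is not stated. The function name is hypothetical. The 80‑row and 20‑column limits are the ones quoted earlier on this page.

```python
import math

def parse_table(text, has_header=True, has_labels=True, max_rows=80, max_cols=20):
    """Return (labels, rows); raise ValueError on invalid input."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if has_header:
        lines = lines[1:]
    if not lines:
        raise ValueError("no data rows")
    if len(lines) > max_rows:
        raise ValueError(f"too many rows (limit {max_rows})")
    labels, rows = [], []
    for number, line in enumerate(lines, start=1):
        cells = [c.strip() for c in line.split(",")]  # assumed delimiter
        if has_labels:
            label, cells = cells[0], cells[1:]
        else:
            label = f"row{number}"
        try:
            values = [float(c) for c in cells]
        except ValueError:
            raise ValueError(f"data row {number}: non-numeric cell") from None
        if not all(math.isfinite(v) for v in values):
            raise ValueError(f"data row {number}: non-finite value")
        if not values or len(values) > max_cols:
            raise ValueError(f"data row {number}: need 1 to {max_cols} features")
        if rows and len(values) != len(rows[0]):
            raise ValueError(f"data row {number}: inconsistent column count")
        labels.append(label)
        rows.append(values)
    return labels, rows

sample_text = """Label,X1,X2,X3
A,1.0,2.0,1.5
B,1.2,1.8,1.7
C,5.1,4.9,5.0
D,5.2,5.1,4.8
E,3.0,3.2,3.1
F,8.0,7.9,8.1
"""
labels, rows = parse_table(sample_text)
print(labels)
print(rows[0])
```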
Enable standardization when features use different units or scales. It rescales each column to comparable variability, which stabilizes distance calculations and often improves clustering interpretability.
Average linkage is a solid default for exploration. Use complete linkage for compact clusters, single linkage for nearest‑neighbor chaining patterns, and Ward linkage when you want variance‑based merges.
Height is the merge distance (or Ward merge criterion). Larger jumps between merges suggest stronger separation between groups and can guide where to cut the tree.
Assignments are stored when the active cluster count equals k (2–12). Each observation inherits the id of the cluster it belongs to at that cut stage.
CSV contains all merge steps and the k‑cut assignment table. PDF includes a printable dendrogram view plus a small merge summary for quick sharing.
Notes: This implementation is intended for small datasets. For large matrices, use specialized statistical tools.
Important Note: All calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.