Online Correlation Calculator

Example data table

#	X	Y
1	10	12
2	12	15
3	13	18
4	15	17
5	16	21
6	18	24
7	20	26
8	22	29
9	24	30
10	26	33

This sample dataset shows a strong positive, nearly linear relationship (Pearson r ≈ 0.99), which makes it suitable for demonstration.

Formula used

Pearson correlation (linear)

r = Σ[(xᵢ − x̄)(yᵢ − ȳ)] / √( Σ(xᵢ − x̄)² · Σ(yᵢ − ȳ)² )
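As an illustration, the Pearson formula above can be sketched in plain Python and applied to the example table. The function name pearson_r is hypothetical and is not the calculator's own code.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from the sums in the formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# The ten pairs from the example data table.
X = [10, 12, 13, 15, 16, 18, 20, 22, 24, 26]
Y = [12, 15, 18, 17, 21, 24, 26, 29, 30, 33]
r = pearson_r(X, Y)  # about 0.989 for this table
```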

Spearman correlation (rank)

Compute ranks for X and Y (average ranks for ties), then apply the Pearson formula to ranked values.
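A minimal sketch of that two-step procedure, assuming 1-based ranks and average ranks for ties. The helper names average_ranks and spearman_rho are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman correlation: Pearson applied to the ranked values."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

X = [10, 12, 13, 15, 16, 18, 20, 22, 24, 26]
Y = [12, 15, 18, 17, 21, 24, 26, 29, 30, 33]
rho = spearman_rho(X, Y)
```

In the example table only rows 3 and 4 swap order in Y, so the rank correlation stays close to 1.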

Kendall tau-b (ordinal)

τ = (C − D) / √((n₀ − n₁)(n₀ − n₂))

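C: concordant pairs, D: discordant pairs, n₀ = n(n − 1)/2: total number of pairs, n₁ and n₂: number of pairs tied in X and in Y, respectively. For each group of t tied values, that group contributes t(t − 1)/2 tied pairs.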
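The tau-b definition can be sketched by counting pairs directly. This is a simple O(n²) illustration, not necessarily how the calculator computes it, and the function name kendall_tau_b is hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def kendall_tau_b(xs, ys):
    """Kendall tau-b by brute-force pair counting."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(xs)
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1   # both variables move in the same direction
        elif s < 0:
            discordant += 1   # the variables move in opposite directions
        # s == 0 means a tie in X or Y; it counts toward neither C nor D
    n0 = n * (n - 1) // 2
    # Tied pairs per variable: each group of t equal values gives t(t-1)/2 pairs.
    n1 = sum(t * (t - 1) // 2 for t in Counter(xs).values())
    n2 = sum(t * (t - 1) // 2 for t in Counter(ys).values())
    denom = math.sqrt((n0 - n1) * (n0 - n2))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom

X = [10, 12, 13, 15, 16, 18, 20, 22, 24, 26]
Y = [12, 15, 18, 17, 21, 24, 26, 29, 30, 33]
tau = kendall_tau_b(X, Y)  # 44 concordant, 1 discordant, no ties
```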

How to use this calculator

Choose a method: Pearson for linear patterns, Spearman for monotonic ranks, or Kendall for ordinal strength.
Paste your pairs into the text box, or upload a CSV with two columns.
Set delimiter and decimal separator so numbers parse correctly.
Click Calculate correlation to view results above the form.
Use the download buttons to export CSV or a PDF report.

Method selection for analytics

Pearson measures linear association on raw values and is sensitive to outliers. Use it for continuous features where scatter plots look roughly elliptical. Spearman ranks each variable, so it works well for monotonic but curved relationships. Kendall tau-b compares concordant and discordant pairs, handling ties explicitly; it stays stable for ordinal ratings and small samples.

Input structure and validation

This calculator expects paired observations, one X with one Y. Rows with missing values can be dropped or filled with zero, depending on your policy. Filling with zero treats the gap as a real measurement of 0 and can distort the coefficient, so use it only when zero is a meaningful value. When using comma decimals, select a non-comma delimiter to avoid misreads. A clean dataset should have consistent units, no duplicated header rows, and at least two valid pairs.
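The parsing rules above could look roughly like the sketch below. The function parse_pairs and its parameters (delimiter, decimal, missing) are hypothetical names chosen for illustration; the calculator's actual parser may behave differently.

```python
def parse_pairs(text, delimiter=",", decimal=".", missing="drop"):
    """Parse two-column text into (x, y) float pairs.

    missing="drop" skips rows with a missing or non-numeric cell;
    missing="zero" replaces such cells with 0.0.
    """
    if delimiter == decimal:
        raise ValueError("delimiter and decimal separator must differ")
    if missing not in ("drop", "zero"):
        raise ValueError("missing must be 'drop' or 'zero'")
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        cells = [c.strip() for c in line.split(delimiter)]
        cells += [""] * (2 - len(cells))  # pad rows that have a single cell
        values = []
        for cell in cells[:2]:
            try:
                values.append(float(cell.replace(decimal, ".")))
            except ValueError:
                values.append(None)  # empty or non-numeric cell
        if None in values:
            if missing == "drop":
                continue
            # Note: under "zero", a stray header row would become (0.0, 0.0).
            values = [0.0 if v is None else v for v in values]
        pairs.append((values[0], values[1]))
    if len(pairs) < 2:
        raise ValueError("at least two valid pairs are required")
    return pairs

# Comma decimals with a semicolon delimiter; the second row has no Y value.
raw = "1,5;2,0\n3,25;\n4;8"
dropped = parse_pairs(raw, delimiter=";", decimal=",")
filled = parse_pairs(raw, delimiter=";", decimal=",", missing="zero")
```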

Interpreting coefficient magnitude

Correlation ranges from −1 to +1. Values near zero indicate weak association, while larger absolute values indicate stronger association. As a practical guide, |r| < 0.10 is negligible, 0.10–0.29 is weak, 0.30–0.49 is moderate, 0.50–0.69 is strong, and ≥0.70 is very strong. Always interpret sign and magnitude together.
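The practical guide above maps onto a small lookup, sketched here with a hypothetical function strength_label. The bands are treated as half-open intervals (for example, 0.295 counts as weak), which is an assumption, since the guide leaves the values between bands unspecified.

```python
def strength_label(r):
    """Return (direction, magnitude) for a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    size = abs(r)
    if size < 0.10:
        magnitude = "negligible"
    elif size < 0.30:
        magnitude = "weak"
    elif size < 0.50:
        magnitude = "moderate"
    elif size < 0.70:
        magnitude = "strong"
    else:
        magnitude = "very strong"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, magnitude

label = strength_label(-0.45)
```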

Significance and uncertainty

For Pearson and Spearman, a two‑tailed p‑value is computed from the statistic t = r·√((n − 2) / (1 − r²)), using a t distribution with n−2 degrees of freedom. Smaller p‑values suggest the observed association is unlikely under zero correlation, but they do not measure effect size. The confidence interval uses Fisher’s z transform and narrows as n grows; because its standard error is 1/√(n − 3), intervals are not reported for n ≤ 3.
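A minimal sketch of these two quantities using only the Python standard library. The 95% level is an assumption, since the confidence level is not stated here, and the function names are hypothetical. The p‑value itself needs a t distribution CDF, which the standard library does not provide, so only the test statistic is computed.

```python
import math
from statistics import NormalDist

def t_statistic(r, n):
    """Test statistic for H0: zero correlation, with n - 2 degrees of freedom."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("requires n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))

def fisher_ci(r, n, level=0.95):
    """Confidence interval via Fisher's z transform; None when n <= 3."""
    if n <= 3:
        return None  # the standard error 1/sqrt(n - 3) is undefined
    if not -1.0 < r < 1.0:
        raise ValueError("requires |r| < 1")
    z = math.atanh(r)                      # Fisher z transform
    se = 1.0 / math.sqrt(n - 3)            # standard error on the z scale
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Transform the interval endpoints back to the correlation scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

low, high = fisher_ci(0.9886, 10)
narrow = fisher_ci(0.9886, 100)
```

With the same coefficient, the interval for n = 100 is narrower than for n = 10, which matches the statement that the interval narrows as n grows.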

Visual diagnostics with plots

A scatter plot reveals structure that a single number can hide. Look for clusters, leverage points, and nonlinearity. If a regression line is shown, treat it as a descriptive fit, not a causal model. When ranks are used, monotonic trends become clearer, but tied values can reduce resolution; consider rounding settings for consistent tie handling.

Reporting and reproducible exports

Exporting CSV supports audit trails by saving method, coefficient, p‑value, and interval alongside your original pairs. The PDF report provides a portable summary for stakeholders and documentation. For reproducibility, keep preprocessing steps consistent: filtering, scaling choices, and handling of missing rows. Then compare methods to check that your conclusions hold under different assumptions. If you have repeated measurements, aggregate them first, because pseudo‑replication inflates significance. For time series, consider lagged pairs: pairing X on day t with Y on day t + k, for k from 1 to 7, can reveal delayed effects that a same‑day correlation misses, which helps when choosing lagged features for forecasting.
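Lagged pairs can be built before pasting data into the calculator. The sketch below assumes two equally spaced series of the same length; lagged_pairs is a hypothetical helper, and the short series are made-up illustration values in which Y simply repeats X two steps later.

```python
import math

def lagged_pairs(xs, ys, lag):
    """Pair xs[t] with ys[t + lag]; the last `lag` x-values have no partner."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(xs):
        raise ValueError("lag must be between 0 and len(xs) - 1")
    return list(zip(xs[:len(xs) - lag], ys[lag:]))

def pearson_r(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

# Illustration only: ys echoes xs with a delay of two steps.
xs = [1, 3, 2, 5, 4, 6, 8, 7]
ys = [0, 0] + xs[:-2]

# Correlation at each lag; the delayed effect shows up at lag 2.
by_lag = {lag: pearson_r(lagged_pairs(xs, ys, lag)) for lag in range(0, 4)}
```

Each extra lag removes one pair, so long lags on short series leave few observations and widen the uncertainty.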

FAQs

When should I use Spearman instead of Pearson?

Use Spearman when the relationship is monotonic but not linear, when variables are ordinal, or when outliers distort a linear fit. It correlates ranks rather than raw values.

Does a small p-value mean the relationship is strong?

No. A small p-value mainly reflects evidence against zero correlation and depends on sample size. Strength is reflected by the coefficient magnitude and its confidence interval.

Why did the calculator drop some rows?

Rows are dropped when one value is missing or non-numeric and the missing-value policy is set to “drop.” Choose “fill with zero” only if that assumption matches your data rules.

What does Kendall tau-b add for ranked data?

Kendall tau-b is robust for ordinal data and explicitly accounts for ties. It measures pairwise agreement rather than distance, which can be helpful for ratings, ranks, and small datasets.

Why is the chart limited to 1,000 points?

Rendering many points in a browser can slow down interaction. Sampling keeps the plot responsive while the coefficient still uses all parsed pairs for the calculation.
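One way such sampling could work is a uniform random subset, sketched below. The sampling method the chart actually uses is not stated here, so this is an assumption, and sample_for_plot is a hypothetical name.

```python
import random

def sample_for_plot(pairs, max_points=1000, seed=0):
    """Return at most max_points pairs for plotting, keeping original order."""
    if len(pairs) <= max_points:
        return list(pairs)
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    keep = sorted(rng.sample(range(len(pairs)), max_points))
    return [pairs[i] for i in keep]

all_pairs = [(i, 2 * i) for i in range(5000)]
plotted = sample_for_plot(all_pairs)
```

Only the plotted subset is reduced; the coefficient would still be computed from all_pairs.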

Can correlation prove causation?

No. Correlation measures association only. Causal claims require study design, controls, temporal reasoning, and domain knowledge. Use correlation as a screening tool, not a final explanation.