Turn raw measurements into a clean covariance view, extract principal components for stable summaries, and export shareable tables for your team.
Paste a dataset, set options, then compute covariance and PCA outputs.
This sample has four variables (A–D) across six observations.
| Obs | A | B | C | D |
|---|---|---|---|---|
| 1 | 2.5 | 1.3 | 0.8 | 3.1 |
| 2 | 2.7 | 1.5 | 1.1 | 3.0 |
| 3 | 2.9 | 1.7 | 1.0 | 3.4 |
| 4 | 3.2 | 1.9 | 1.3 | 3.8 |
| 5 | 3.0 | 1.8 | 1.2 | 3.5 |
| 6 | 3.3 | 2.1 | 1.4 | 3.9 |
Centering (optional): x' = x − μ, where μ is the column mean.
Scaling (optional): x'' = x' / σ, where σ is the column standard deviation.
Covariance matrix: for transformed data Z with n rows, C = (ZᵀZ) / (n−1) (sample) or C = (ZᵀZ) / n (population).
PCA: solve C v = λ v. Eigenvalues λ rank components; eigenvectors v are loadings.
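The pipeline above — center, form the sample covariance, then eigendecompose — can be sketched in NumPy using the sample table; this is an illustrative sketch, not the tool's internal code:

```python
import numpy as np

# Sample data from the table above: 6 observations of variables A-D.
X = np.array([
    [2.5, 1.3, 0.8, 3.1],
    [2.7, 1.5, 1.1, 3.0],
    [2.9, 1.7, 1.0, 3.4],
    [3.2, 1.9, 1.3, 3.8],
    [3.0, 1.8, 1.2, 3.5],
    [3.3, 2.1, 1.4, 3.9],
])

# Center each column (scaling omitted in this sketch).
Z = X - X.mean(axis=0)

# Sample covariance: C = Z^T Z / (n - 1).
n = Z.shape[0]
C = Z.T @ Z / (n - 1)

# PCA: eigendecomposition of the symmetric covariance matrix.
eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]      # reorder so PC1 comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Sanity check against NumPy's built-in sample covariance.
print(np.allclose(C, np.cov(X, rowvar=False)))
```

Columns of `eigvecs` are the loadings; `eigvals` are the component variances.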
Covariance summarizes how variables move together across observations. Positive values indicate that two measures tend to rise and fall in tandem, while negative values suggest opposing movement. Larger magnitudes imply stronger joint variability, but units matter, so interpret values relative to each variable’s scale. Analysts often scan the matrix for clusters that hint at shared drivers, seasonal effects, or measurement overlap. This view helps prioritize variables for feature selection and flags pairs that may cause multicollinearity in regression models.
Centering subtracts each column mean, making covariance reflect variation around typical levels. Scaling divides by the column standard deviation, reducing dominance from high‑variance variables. When inputs use different units, scaling produces components that are easier to compare. When units are consistent, unscaled covariance can preserve meaningful variance differences. Always document these options because they change the numerical meaning of every entry.
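The two preprocessing options can be captured in one small helper; the function name `transform` is illustrative, not part of the tool:

```python
import numpy as np

def transform(X, center=True, scale=False):
    """Apply the optional centering/scaling steps described above.

    center: subtract each column mean (x' = x - mu).
    scale:  divide by each column's sample std dev (x'' = x' / sigma).
    """
    Z = np.asarray(X, dtype=float)
    if center:
        Z = Z - Z.mean(axis=0)
    if scale:
        Z = Z / Z.std(axis=0, ddof=1)
    return Z
```

With both options enabled, (ZᵀZ)/(n−1) is the correlation matrix, which is why scaled PCA is often called correlation-based PCA.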
PCA decomposes the covariance matrix into eigenvalues and eigenvectors. Each eigenvalue equals the variance captured by its principal component, and the explained percentage shows its share of total variance. A sharp drop after early components suggests strong compression potential. Many projects retain components until cumulative explained variance crosses a practical target, such as 80% for dashboards or 95% for model inputs. When eigenvalues are nearly equal, components can rotate with small data changes, so interpret them cautiously over time.
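The retention rule described above (keep components until cumulative explained variance crosses a target) can be sketched as follows; the helper name and example eigenvalues are hypothetical:

```python
import numpy as np

def explained_variance(eigvals, target=0.80):
    """Return each component's variance share, the cumulative share,
    and the number of components needed to reach `target`."""
    vals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    ratio = vals / vals.sum()          # share of total variance per PC
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, target) + 1)
    return ratio, cumulative, k

# Hypothetical eigenvalues: shares are 50%, 25%, 12.5%, 12.5%.
ratio, cumulative, k = explained_variance([4.0, 2.0, 1.0, 1.0], target=0.80)
```

Here the first two components cover 75%, so three are needed to cross the 80% target.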
Loadings describe how strongly each original variable contributes to each component direction. Variables with larger absolute loadings influence the component more, and the sign reflects directionality. Look for interpretable patterns, such as several related variables loading together. If one variable dominates every component, reconsider scaling, investigate outliers, or verify that the column is not a duplicated or mislabeled measure.
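A quick way to scan loadings for the dominant variable per component is shown below; the loadings matrix here is hypothetical, not computed from the sample table:

```python
import numpy as np

# Hypothetical loadings: rows = variables A-D, columns = components.
loadings = np.array([
    [ 0.55, -0.12,  0.70,  0.44],
    [ 0.52,  0.20, -0.65,  0.51],
    [ 0.40,  0.75,  0.25, -0.47],
    [ 0.52, -0.62, -0.15, -0.57],
])
variables = ["A", "B", "C", "D"]

# For each component, report the variable with the largest absolute loading.
for j in range(loadings.shape[1]):
    i = int(np.argmax(np.abs(loadings[:, j])))
    print(f"PC{j + 1}: {variables[i]} ({loadings[i, j]:+.2f})")
```

If the same variable tops every column, that is the cue to revisit scaling or check for a duplicated measure.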
The covariance table, explained variance, and loadings are designed for immediate reporting. Exporting to CSV supports audits, reproducible analysis, and spreadsheet review, while PDF is useful for static documentation. Pair results with short notes on centering, scaling, denominator choice, and missing‑value handling to keep interpretations comparable across teams. For stakeholders, summarize the top components and cite their variance percentages.
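A minimal sketch of writing a labeled covariance table to CSV with the standard library, assuming a small hypothetical matrix; the real tool's export format may differ:

```python
import csv
import io

import numpy as np

# Hypothetical 2x2 covariance matrix and variable labels.
C = np.array([[0.0907, 0.0813],
              [0.0813, 0.0787]])
labels = ["A", "B"]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow([""] + labels)                      # header row
for name, row in zip(labels, C):
    writer.writerow([name] + [f"{v:.4f}" for v in row])
csv_text = buf.getvalue()
```

Including row and column labels keeps the export self-describing for spreadsheet review.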
Sample covariance divides by n−1 and is common for inference from a sample. Population covariance divides by n and assumes the data represents the full population. Choose based on how the dataset was collected.
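The two denominators differ only by a constant factor, which NumPy exposes via `ddof`:

```python
import numpy as np

X = np.array([[2.5, 1.3], [2.7, 1.5], [2.9, 1.7]])

C_sample = np.cov(X, rowvar=False, ddof=1)      # divides by n - 1
C_population = np.cov(X, rowvar=False, ddof=0)  # divides by n

# The two estimates differ only by the factor (n - 1) / n.
n = X.shape[0]
print(np.allclose(C_population, C_sample * (n - 1) / n))
```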
Enable scaling when variables have different units or very different variances, such as dollars, percentages, and counts. Scaling prevents one high‑variance variable from dominating the first component and improves comparability across features.
Centering shifts each variable to a zero mean so components capture variation, not average level. Without centering, the first component can reflect mean offsets and the covariance entries mix level and variability information.
You can drop any row with missing entries or use mean imputation per column. Dropping preserves original values but may reduce sample size. Mean imputation keeps rows but can shrink variance and weaken correlations.
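Both strategies can be sketched in NumPy; the small matrix here is hypothetical:

```python
import numpy as np

X = np.array([
    [2.5, 1.3],
    [2.7, np.nan],   # one missing entry
    [2.9, 1.7],
    [3.2, 1.9],
])

# Option 1: drop any row containing a missing entry (shrinks the sample).
dropped = X[~np.isnan(X).any(axis=1)]

# Option 2: mean-impute each column from its observed values
# (keeps all rows, but shrinks variance toward the mean).
col_means = np.nanmean(X, axis=0)
imputed = np.where(np.isnan(X), col_means, X)
```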
A negative loading indicates the variable moves in the opposite direction to the component’s positive axis. The sign is relative; flipping a component’s axis changes all signs together. Focus on magnitude and patterns across variables.
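The sign ambiguity follows directly from the eigenvalue equation: if v satisfies C v = λ v, so does −v, as a quick check on a hypothetical 2×2 matrix confirms:

```python
import numpy as np

C = np.array([[2.0, 0.8],
              [0.8, 1.0]])
eigvals, eigvecs = np.linalg.eigh(C)
lam = eigvals[-1]        # leading eigenvalue
v = eigvecs[:, -1]       # its eigenvector

# Both v and -v satisfy C v = lambda v: the sign convention is arbitrary.
print(np.allclose(C @ v, lam * v), np.allclose(C @ (-v), lam * (-v)))
```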
Eigen decomposition becomes slower as matrices grow. The limits keep the tool responsive in a browser-based workflow. For larger problems, consider using specialized numerical libraries, then paste summaries back for reporting.
Important Note: All calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.