This sample has three variables (V1–V3) and five observations.
| Obs | V1 | V2 | V3 |
|---|---|---|---|
| 1 | 1 | 2 | 3 |
| 2 | 2 | 3 | 4 |
| 3 | 3 | 4 | 6 |
| 4 | 4 | 6 | 8 |
| 5 | 5 | 7 | 9 |
- Mean-centering: \(X_c = X - \mu\), subtract each column mean.
- Covariance matrix: \(S = \frac{1}{d}\,X_c^T X_c\), where \(d=n-1\) (sample) or \(d=n\) (population).
- Correlation mode: scale each column by its standard deviation before computing \(S\).
- Eigenvalues: solve \(S v = \lambda v\). Values \(\lambda\) measure variance along direction \(v\).
- Explained variance: \(\text{ratio}_i = \lambda_i / \sum_j \lambda_j\).
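The steps above can be sketched in a few lines of NumPy, using the 5×3 sample from the table (variable names here are illustrative, not part of the tool):

```python
import numpy as np

# The sample dataset from the table: 5 observations, 3 variables.
X = np.array([
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 6],
    [4, 6, 8],
    [5, 7, 9],
], dtype=float)

# Mean-centering: subtract each column mean.
Xc = X - X.mean(axis=0)

# Sample covariance matrix: S = Xc^T Xc / (n - 1).
n = X.shape[0]
S = Xc.T @ Xc / (n - 1)

# Eigen-decomposition of the symmetric matrix S.
eigvals, eigvecs = np.linalg.eigh(S)

# Explained-variance ratios.
ratios = eigvals / eigvals.sum()
```

The same recipe with the divisor `n` instead of `n - 1` gives the population covariance.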
This tool uses the Jacobi rotation method, which is well suited to real symmetric matrices such as covariance and correlation matrices.
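A minimal sketch of the Jacobi rotation idea (not the tool's exact implementation): each rotation zeroes one off-diagonal entry, and sweeps repeat until the off-diagonal mass falls below a tolerance.

```python
import numpy as np

def jacobi_eig(A, tol=1e-10, max_sweeps=100):
    """Cyclic Jacobi rotations for a real symmetric matrix.
    Returns (eigenvalues, eigenvectors as columns)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    V = np.eye(n)
    for _ in range(max_sweeps):
        # Frobenius norm of the off-diagonal part; stop when small.
        off = np.sqrt(np.sum(A**2) - np.sum(np.diag(A)**2))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p, q]) < tol / (n * n):
                    continue
                # Rotation angle that zeroes A[p, q]:
                # tan(2*theta) = 2*A[p,q] / (A[q,q] - A[p,p]).
                theta = 0.5 * np.arctan2(2 * A[p, q], A[q, q] - A[p, p])
                c, s = np.cos(theta), np.sin(theta)
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q], J[q, p] = s, -s
                A = J.T @ A @ J   # similarity transform keeps eigenvalues
                V = V @ J         # accumulate eigenvectors
    return np.diag(A), V
```

Tightening `tol` or raising `max_sweeps` trades runtime for accuracy, which is what the tolerance and iteration settings below control.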
- Paste your dataset, or upload a CSV file.
- Select the delimiter, the header option, and where the variables are located (rows or columns).
- Choose covariance or correlation, then pick sample or population.
- Set tolerance and iterations if you need stricter accuracy.
- Click Calculate to view results above this form.
- Use the download buttons to export CSV or PDF.
1) What do covariance eigenvalues represent?
They measure the variance captured along the principal directions of your variables. A larger eigenvalue indicates stronger spread along its direction, which is why these values drive principal component analysis for dimensionality reduction.
2) When should I use correlation instead of covariance?
Use correlation when variables have different units or scales. It standardizes each variable so no single high-variance column dominates the eigenvalues and components.
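Correlation mode is equivalent to standardizing each column first and then taking the covariance, as this sketch (using the sample data from the table) shows:

```python
import numpy as np

X = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 6], [4, 6, 8], [5, 7, 9]], dtype=float)

# Center each column, then scale by its sample standard deviation.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Covariance of the standardized data is the correlation matrix;
# its diagonal is all ones, so no column dominates by scale alone.
R = Xs.T @ Xs / (X.shape[0] - 1)
```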
3) Why can eigenvector signs change between runs?
If \(v\) is an eigenvector, \(-v\) is also valid. Flipped signs do not change the component direction, variance, or explained variance percentages.
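One common way to get reproducible signs (an illustrative convention, not something the tool guarantees) is to flip each eigenvector so its largest-magnitude entry is positive:

```python
import numpy as np

S = np.cov(np.array([[1, 2, 3], [2, 3, 4], [3, 4, 6], [4, 6, 8], [5, 7, 9]], dtype=float), rowvar=False)
vals, vecs = np.linalg.eigh(S)

# Flip each column so its largest-magnitude entry is positive.
# S v = lambda v still holds for the flipped v.
for j in range(vecs.shape[1]):
    k = np.argmax(np.abs(vecs[:, j]))
    if vecs[k, j] < 0:
        vecs[:, j] *= -1
```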
4) What is sample versus population covariance?
Sample uses \(n-1\) in the denominator to reduce bias when estimating from data. Population uses \(n\) when your dataset is the complete population you care about.
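The two conventions differ only by a constant factor, as a quick NumPy check shows (`bias=True` selects the population divisor `n`):

```python
import numpy as np

X = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 6], [4, 6, 8], [5, 7, 9]], dtype=float)
n = X.shape[0]

S_sample = np.cov(X, rowvar=False)              # divides by n - 1
S_pop = np.cov(X, rowvar=False, bias=True)      # divides by n
```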
5) Why did my results show negative eigenvalues?
Small negative values can appear from rounding and numerical limits. For well-formed covariance matrices, eigenvalues should be nonnegative; tighter tolerance and more iterations may reduce small negatives.
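If tiny negatives persist, a common post-processing step (an illustrative workaround, with hypothetical values) is to clamp them to zero before computing explained-variance ratios:

```python
import numpy as np

# Hypothetical eigenvalue output containing a tiny numerical negative.
vals = np.array([2.5, 1e-3, -1e-12])

# Clamp negatives to zero; safe when they are at round-off scale.
clamped = np.clip(vals, 0.0, None)
```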
6) How should I handle missing values?
Dropping rows is safest when few values are missing. Mean replacement keeps row count but can shrink variance. Zero replacement is quick but may distort relationships if zero is not meaningful.
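The first two strategies can be sketched as follows, with a small hypothetical dataset containing one missing value:

```python
import numpy as np

# Hypothetical data with one missing value (NaN).
X = np.array([[1, 2, 3], [2, np.nan, 4], [3, 4, 6]], dtype=float)

# Option 1: drop any row containing a missing value.
X_drop = X[~np.isnan(X).any(axis=1)]

# Option 2: replace each missing value with its column mean
# (computed while ignoring the missing entries).
col_means = np.nanmean(X, axis=0)
X_mean = np.where(np.isnan(X), col_means, X)
```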
7) How many components should I keep?
Common rules include keeping enough components to reach 90–95% cumulative explained variance, or using an elbow point where additional components add little variance.
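The cumulative-variance rule can be sketched like this, using the sample data from the table and an assumed 95% threshold:

```python
import numpy as np

X = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 6], [4, 6, 8], [5, 7, 9]], dtype=float)

# Eigenvalues of the sample covariance matrix, sorted descending.
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]

# Cumulative explained-variance ratios.
cum = np.cumsum(eigvals) / eigvals.sum()

# Smallest k whose cumulative ratio reaches the 95% threshold.
k = int(np.searchsorted(cum, 0.95)) + 1
```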
8) Is this method reliable for large matrices?
Jacobi is stable for symmetric matrices but can be slower as variables increase. For many variables, consider reducing columns, relaxing tolerance, or using specialized numeric libraries.