Turn raw variables into clear components and insights. Compare variance, loadings, and sample projections quickly. Download reports, share visuals, and explain results with confidence.
Paste a CSV with a header row. Only numeric columns are used; empty or non-numeric cells are treated as missing.
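The input rule above can be sketched in a few lines of Python. This is an illustration only, not the tool's own parser: the function name `parse_numeric_csv` and the choice to represent missing cells as `None` are assumptions made for the example.

```python
import csv
import io
import math

def parse_numeric_csv(text):
    """Return (headers, rows); empty or non-numeric cells become None."""
    reader = csv.reader(io.StringIO(text.strip()))
    headers = [name.strip() for name in next(reader)]
    rows = []
    for record in reader:
        if not record:          # skip blank lines
            continue
        row = []
        for i in range(len(headers)):
            cell = record[i].strip() if i < len(record) else ""
            try:
                value = float(cell)
            except ValueError:
                value = None    # non-numeric or empty: treat as missing
            if value is not None and not math.isfinite(value):
                value = None    # assumption: "nan"/"inf" also count as missing
            row.append(value)
        rows.append(row)
    return headers, rows

sample = "VarA, VarB\n12, 18\n15, n/a\n, 20\n"
headers, rows = parse_numeric_csv(sample)
print(headers)
print(rows)
```

Extra spaces around values are stripped before conversion, so `" 18"` and `"18"` parse the same way.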
Example Data Table
This sample contains four correlated variables across ten rows.
| VarA | VarB | VarC | VarD |
|---|---|---|---|
| 12 | 18 | 5 | 30 |
| 15 | 22 | 7 | 28 |
| 14 | 20 | 6 | 26 |
| 18 | 25 | 9 | 32 |
| 20 | 28 | 10 | 35 |
| 22 | 30 | 12 | 38 |
| 25 | 34 | 14 | 40 |
| 28 | 36 | 15 | 44 |
| 30 | 40 | 18 | 46 |
| 32 | 42 | 20 | 50 |
Formula Used
1) Centering and scaling (optional)
For each variable x, compute mean μ and sample standard deviation σ, then z = (x − μ) / σ. If scaling is disabled, scores are computed using centered values only.
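As a rough sketch of this step, the snippet below centers one column and optionally divides by the sample standard deviation (n − 1 in the denominator). The helper name `standardize` and the `None`-for-missing convention are assumptions for illustration; the data is the VarA column from the example table.

```python
from statistics import mean, stdev

def standardize(column, scale=True):
    """Center a column and optionally scale by the sample standard deviation."""
    present = [x for x in column if x is not None]
    mu = mean(present)
    sigma = stdev(present)  # sample standard deviation (n - 1)
    if scale and sigma == 0:
        raise ValueError("constant column cannot be scaled")
    result = []
    for x in column:
        if x is None:
            result.append(None)          # missing values stay missing
        elif scale:
            result.append((x - mu) / sigma)
        else:
            result.append(x - mu)        # centered only
    return result

var_a = [12, 15, 14, 18, 20, 22, 25, 28, 30, 32]
z = standardize(var_a)
centered = standardize(var_a, scale=False)
print(round(mean(z), 6), round(stdev(z), 6))
```

After scaling, the printed mean and standard deviation should be 0 and 1 up to rounding, which is a quick way to confirm the step behaved as intended.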
2) Covariance / correlation matrix
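Covariance: Σ = (1/(n−1)) · X_cᵀ X_c, where X_c is the centered data matrix and n is the number of rows. Correlation: R_ij = Σ_ij / (s_i s_j), where s_i is the sample standard deviation of variable i. This tool uses pairwise available rows when missing values exist.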
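A minimal sketch of pairwise-available covariance and the derived correlation follows. Two details are assumptions rather than documented behavior of the tool: each pair's means are taken over the rows where both values are present, and the standard deviations used for correlation come from the diagonal of the covariance matrix. Function names are hypothetical.

```python
from math import sqrt

def pairwise_cov(x, y):
    """Sample covariance over rows where both x and y are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    return sum((a - mx) * (b - my) for a, b in pairs) / (n - 1)

def cov_matrix(columns):
    p = len(columns)
    return [[pairwise_cov(columns[i], columns[j]) for j in range(p)]
            for i in range(p)]

def corr_matrix(cov):
    s = [sqrt(cov[i][i]) for i in range(len(cov))]  # s_i from the diagonal
    return [[cov[i][j] / (s[i] * s[j]) for j in range(len(cov))]
            for i in range(len(cov))]

var_a = [12, 15, 14, 18, 20, 22, 25, 28, 30, 32]
var_b = [18, 22, 20, 25, 28, 30, 34, 36, 40, 42]
cov = cov_matrix([var_a, var_b])
corr = corr_matrix(cov)
print(corr[0][1])
```

One caveat worth keeping in mind: a matrix assembled from pairwise-available rows is not guaranteed to be positive semidefinite, so small negative eigenvalues can appear when many values are missing.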
3) Eigen decomposition and variance
For a symmetric matrix A (Σ or R), find eigenpairs A v = λ v. The explained variance ratio for component k is λ_k / Σ_j λ_j, where the sum in the denominator runs over all eigenvalues. Scores are S = Z · V, where the columns of V are the eigenvectors and Z is the centered (and optionally scaled) data.
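To make the decomposition concrete, here is a small pure-Python sketch that uses cyclic Jacobi rotations. The choice of Jacobi is an assumption made for the example; the page does not say which eigen solver the tool uses. The 2×2 matrix and the names `jacobi_eigen` and `project` are illustrative.

```python
from math import atan2, cos, sin

def jacobi_eigen(matrix, max_sweeps=50, tol=1e-20):
    """Eigenpairs of a symmetric matrix, sorted by decreasing eigenvalue."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                y, x = 2.0 * a[p][q], a[q][q] - a[p][p]
                if x < 0:
                    x, y = -x, -y          # keep the rotation angle small
                theta = 0.5 * atan2(y, x)
                c, s = cos(theta), sin(theta)
                for k in range(n):         # A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):         # A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):         # V <- V J (accumulate eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    values = [a[i][i] for i in order]
    vectors = [[v[k][i] for k in range(n)] for i in order]
    return values, vectors

def project(z_rows, vectors):
    """Scores S = Z · V: dot each row with each eigenvector."""
    return [[sum(z * w for z, w in zip(row, vec)) for vec in vectors]
            for row in z_rows]

cov = [[2.0, 1.0], [1.0, 2.0]]             # illustrative matrix
values, vectors = jacobi_eigen(cov)
ratios = [lam / sum(values) for lam in values]
scores = project([[1.0, 1.0]], vectors)
print(values, ratios)
```

Eigenvector signs are arbitrary, so scores and loadings from different solvers can differ by a factor of −1 per component without changing the analysis.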
How to Use
Principal component analysis compresses correlated variables into orthogonal components. This generator reports eigenvalues, explained variance, and cumulative variance to justify how many components to retain. Use the scree plot to identify the elbow, then confirm with cumulative variance targets such as 80% or 90%. Retaining only a few components keeps the result easier to interpret.
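The cumulative variance check can be expressed as a short helper. This is a sketch under stated assumptions: the eigenvalues below are made up for illustration, and `components_for_target` is a hypothetical name.

```python
from itertools import accumulate

def components_for_target(eigenvalues, target=0.80):
    """Smallest number of components whose cumulative variance reaches target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    for k, running in enumerate(accumulate(ordered), start=1):
        if running / total >= target:
            return k
    return len(ordered)

eigenvalues = [2.8, 0.8, 0.3, 0.1]   # hypothetical values, not tool output
k80 = components_for_target(eigenvalues, 0.80)
k95 = components_for_target(eigenvalues, 0.95)
print(k80, k95)
```

A threshold rule like this complements the scree elbow rather than replacing it; when the two disagree, the elbow and domain knowledge usually deserve a second look.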
The report lists each variable’s mean and standard deviation so you can validate scaling. With standardization enabled, columns are centered and divided by sample standard deviation, limiting unit dominance. After scaling, means should be near 0 and standard deviations near 1, aside from rounding and missing‑value handling. With missing values, statistics use available rows; impute consistently before comparing runs.
Use a covariance matrix when variables share units and magnitude matters. Use a correlation matrix when units differ or you want equalized influence after scaling. The tool shows how this choice changes eigenvalues and loadings, and therefore which variables appear most influential in early components. With a correlation matrix, each standardized variable contributes a variance of 1, so an eigenvalue of 1 is a useful reference: components above it explain more than a single original variable.
Loadings indicate how strongly each variable aligns with a component direction. The contribution table uses squared loadings to summarize share of component structure. Large positive and negative loadings reflect opposing patterns, while near‑zero values imply limited effect. Label components using the largest absolute loadings and verify the story with domain logic.
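One plausible reading of the contribution table is that each variable's share is its squared loading divided by the sum of squared loadings for that component. The sketch below assumes that normalization; the loadings and helper names are hypothetical.

```python
def contributions(loadings):
    """Share of a component attributed to each variable via squared loadings."""
    squared = [value * value for value in loadings]
    total = sum(squared)
    return [sq / total for sq in squared]

def top_variables(names, loadings, top=2):
    """Variables with the largest absolute loadings, signs preserved."""
    ranked = sorted(zip(names, loadings), key=lambda item: abs(item[1]),
                    reverse=True)
    return ranked[:top]

names = ["VarA", "VarB", "VarC"]
pc1 = [0.8, -0.6, 0.0]               # hypothetical loadings for one component
shares = contributions(pc1)
leaders = top_variables(names, pc1)
print(shares)
print(leaders)
```

Squaring removes the sign, so the contribution shows how much a variable matters while the loading itself shows in which direction.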
Scores project observations into component space. The PC1–PC2 scatter helps detect clusters, gradients, and isolated points that may signal outliers or data issues. Exported scores can be joined back to IDs for modeling, monitoring, or visualization. Flag observations that sit far from the center along key components. The scatter tooltip shows row index and coordinates, making it easy to trace unusual points back to the source record.
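Flagging distant observations can be sketched by scaling each score by the square root of its component's eigenvalue and measuring the distance from the origin. The threshold of 2.5, the sample scores, and the function name are assumptions chosen for the example, not settings of the tool.

```python
from math import sqrt

def flag_far_points(scores, eigenvalues, threshold=2.5):
    """Return (row_index, distance) for rows far from the center.

    Distance is measured in units of each component's standard deviation,
    so a large value on a low-variance component counts as much as one on PC1.
    """
    flagged = []
    for index, row in enumerate(scores):
        distance = sqrt(sum((s * s) / lam for s, lam in zip(row, eigenvalues)))
        if distance > threshold:
            flagged.append((index, distance))
    return flagged

scores = [[0.1, 0.2], [4.0, 0.0], [-0.5, 0.3]]   # hypothetical PC1/PC2 scores
eigenvalues = [1.0, 0.25]                          # hypothetical variances
flagged = flag_far_points(scores, eigenvalues)
print(flagged)
```

The returned row indices can be matched against the source records in the same way the scatter tooltip does.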
A good PCA report documents the options used: matrix type, scaling, and the number of components shown. This generator outputs consistent tables for variance, loadings, and scores, enabling audit trails. Include sample size, variables included, and retained variance percentage when sharing results to support reproducible analysis. CSV and PDF exports preserve the tables for review and sharing across teams, while plots stay interactive on screen.
FAQs
Paste comma‑separated values with a header row. Each column should be numeric. Extra spaces are fine; non‑numeric cells are treated as missing and skipped where possible.
Use covariance when variables share units and scale matters. Use correlation when units differ or you want standardized influence. Correlation is often safer for mixed‑scale datasets.
Start with the scree elbow, then confirm cumulative variance (often 80–90%). For correlation‑based PCA, components with eigenvalues near or above 1 can be a helpful secondary check.
Loadings describe the direction of each component. Larger absolute values mean stronger influence. Opposite signs indicate variables moving in opposite directions within that component.
Yes. Download the scores CSV to join PC coordinates back to your original IDs. You can also export the report CSV/PDF for documentation and review.
Treat it as an analytical aid, not a final decision engine. Validate assumptions, check data quality, and consult a qualified statistician when outcomes are high‑stakes.
Important Note: All the calculators listed on this site are for educational purposes only, and we do not guarantee the accuracy of results. Please consult other sources as well.