Example data table
| Component | Eigenvalue | Variance Ratio (%) | Cumulative Ratio (%) |
|---|---|---|---|
| PC1 | 4.20 | 52.50 | 52.50 |
| PC2 | 1.80 | 22.50 | 75.00 |
| PC3 | 1.10 | 13.75 | 88.75 |
| PC4 | 0.60 | 7.50 | 96.25 |
| PC5 | 0.30 | 3.75 | 100.00 |
Formula used
1) Variance ratio per component
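Variance Ratio for component i = λ_i / Σ_j λ_j, where λ_i is the eigenvalue of component i and the sum runs over all components. Multiply by 100 to express the ratio as a percentage, as in the example table.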
2) Cumulative explained variance
Cumulative Ratio at component k = Σ(Variance Ratio from 1 to k)
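As a rough illustration of formulas 1 and 2, the sketch below recomputes the example table from its eigenvalues. The function names `variance_ratios` and `cumulative_ratios` are made up for this example and are not part of the calculator.

```python
from itertools import accumulate

# Eigenvalues taken from the example data table (PC1 to PC5).
eigenvalues = [4.20, 1.80, 1.10, 0.60, 0.30]


def variance_ratios(values):
    """Return each value's share of the total, as a percentage."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return [100.0 * v / total for v in values]


def cumulative_ratios(ratios):
    """Return the running sum of the ratios."""
    return list(accumulate(ratios))


ratios = variance_ratios(eigenvalues)
cumulative = cumulative_ratios(ratios)

for i, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: {r:.2f}% (cumulative {c:.2f}%)")
```

The printed values should match the Variance Ratio and Cumulative Ratio columns of the example table.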
3) Components needed for a target
Choose the smallest k where cumulative explained variance is at least the selected target percentage.
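One way to sketch this selection rule, using the cumulative percentages from the example table. The helper `components_for_target` and its small tolerance `eps` are assumptions made for illustration; the calculator may handle ties and rounding differently.

```python
def components_for_target(cumulative_percentages, target_percentage, eps=1e-9):
    """Return the smallest 1-based k whose cumulative share reaches the target.

    Returns None when the listed components never reach the target.
    """
    for k, c in enumerate(cumulative_percentages, start=1):
        # eps guards against floating-point noise right at the threshold.
        if c >= target_percentage - eps:
            return k
    return None


# Cumulative Ratio (%) column from the example table.
cumulative = [52.50, 75.00, 88.75, 96.25, 100.00]

print(components_for_target(cumulative, 80))  # 3, because 88.75 >= 80
print(components_for_target(cumulative, 95))  # 4, because 96.25 >= 95
```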
4) Effective dimensionality
Effective Dimensionality = 1 / Σ(p_i^2), where p_i is each component's variance ratio expressed as a fraction of 1 (not a percentage), so that the p_i sum to 1.
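A minimal sketch of this measure, assuming the ratios are rescaled to fractions that sum to 1 before squaring. The function name `effective_dimensionality` is hypothetical.

```python
def effective_dimensionality(ratios):
    """Return 1 / sum(p_i^2) after rescaling the ratios to sum to 1.

    Accepts fractions or percentages, since only relative sizes matter.
    """
    total = sum(ratios)
    if total <= 0:
        raise ValueError("ratios must sum to a positive number")
    shares = [r / total for r in ratios]
    return 1.0 / sum(s * s for s in shares)


# Variance Ratio (%) column from the example table.
ratios = [52.50, 22.50, 13.75, 7.50, 3.75]
print(round(effective_dimensionality(ratios), 2))  # about 2.84
```

For the example table the result is about 2.84, well below the 5 listed components, which reflects how much of the variance sits in PC1.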
5) Broken-stick benchmark
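Expected Share for rank k = (1 / p) × Σ(1 / j) from j = k to p, where p is the total number of components. This is the share each ranked component would be expected to receive if the total variance were split at random, so it serves as a baseline for the observed ratios.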
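The broken-stick formula can be sketched as follows, compared here against the example table's ratios written as fractions. The name `broken_stick` is an assumption for this example.

```python
def broken_stick(p):
    """Return expected shares (fractions) for ranks 1..p under the broken-stick model."""
    return [sum(1.0 / j for j in range(k, p + 1)) / p for k in range(1, p + 1)]


# Observed variance ratios from the example table, as fractions.
observed = [0.525, 0.225, 0.1375, 0.075, 0.0375]
expected = broken_stick(len(observed))

# True where the observed share exceeds the random-split expectation.
beats_benchmark = [o > e for o, e in zip(observed, expected)]

for i, (o, e, b) in enumerate(zip(observed, expected, beats_benchmark), start=1):
    print(f"PC{i}: observed {o:.4f}, expected {e:.4f}, beats benchmark: {b}")
```

With five components the expected shares are roughly 0.457, 0.257, 0.157, 0.090, and 0.040, so in the example data only PC1 exceeds its benchmark.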
When you provide a total variance override, the listed eigenvalues may cover only the leading components. The calculator then reports how much of the overall variance those listed components explain.
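The sketch below shows one plausible reading of the override, assuming it simply replaces the sum of the listed eigenvalues as the denominator. The calculator's internals are not shown on this page, so `ratios_with_override` and its validation are illustrative only.

```python
def ratios_with_override(listed_eigenvalues, total_variance):
    """Return (per-component percentages, percentage covered by the listed components).

    Assumes total_variance is the variance of the full PCA, which must be
    at least as large as the sum of the listed eigenvalues.
    """
    listed_sum = sum(listed_eigenvalues)
    if total_variance <= 0 or total_variance < listed_sum - 1e-9:
        raise ValueError("total variance must be positive and cover the listed eigenvalues")
    ratios = [100.0 * v / total_variance for v in listed_eigenvalues]
    return ratios, 100.0 * listed_sum / total_variance


# Only the three leading eigenvalues are listed; the full PCA has total variance 8.0.
ratios, covered = ratios_with_override([4.20, 1.80, 1.10], 8.0)
print([round(r, 2) for r in ratios])  # [52.5, 22.5, 13.75]
print(round(covered, 2))              # 88.75
```

Without the override, the same three eigenvalues would be divided by their own sum (7.1) and would appear to explain 100% of the variance.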
How to use this calculator
- Choose whether you want to enter eigenvalues or already-computed variance percentages.
- Paste one component value per line, or separate values with commas or spaces.
- Add optional labels, thresholds, retained component count, and decimal precision.
- Use total variance override when your listed eigenvalues are only a partial subset.
- Click the calculate button to see summary cards, threshold checks, tables, and the Plotly scree graph.
- Export the finished report to CSV or PDF for documentation, review, or model selection notes.
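The input step above accepts values separated by new lines, commas, or spaces. A minimal sketch of how such input could be parsed is shown here; `parse_values` is a hypothetical helper, and the calculator's actual parser may treat edge cases differently.

```python
import re


def parse_values(text):
    """Split text on commas and any whitespace, then convert each token to float."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return [float(t) for t in tokens]


# Mixed separators: comma, newline, and space.
pasted = "4.20, 1.80\n1.10 0.60,0.30"
print(parse_values(pasted))  # [4.2, 1.8, 1.1, 0.6, 0.3]
```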
FAQs
1) What does the PCA variance ratio show?
It shows the fraction of total variance captured by each principal component. Larger ratios indicate components that preserve more information from the original feature space.
2) Why is cumulative variance important?
Cumulative variance helps you decide how many components to retain. It answers the practical question of how much total information remains after dimensionality reduction.
3) Should variance percentages sum to 100?
Yes. The percentages from a full PCA sum to 100, apart from rounding. If you enter only some of the percentages, this calculator normalizes them so their relative importance can still be compared consistently.
4) What is the Kaiser rule?
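The Kaiser rule keeps components whose eigenvalues exceed a cutoff, usually 1. That cutoff suits standardized variables, where each original variable contributes one unit of variance, so a retained component explains more than any single variable does. Compare its result with scree and cumulative variance checks before settling on a count.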
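Applied to the example table's eigenvalues, the rule might look like the sketch below. The helper `kaiser_retained` and its default cutoff of 1.0 are assumptions for illustration.

```python
def kaiser_retained(eigenvalues, cutoff=1.0):
    """Return the 1-based indices of components whose eigenvalue exceeds the cutoff."""
    return [i for i, v in enumerate(eigenvalues, start=1) if v > cutoff]


# Eigenvalues from the example data table.
eigenvalues = [4.20, 1.80, 1.10, 0.60, 0.30]
print(kaiser_retained(eigenvalues))  # [1, 2, 3]
```

Here PC1 to PC3 are kept, which together explain 88.75% of the variance in the example table.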
5) What does the broken-stick benchmark mean?
It compares observed component importance against a random variance split. Components beating the benchmark often indicate stronger retained structure than random allocation would suggest.
6) When should I use total variance override?
Use it when you only know the leading eigenvalues but the full PCA contains more components. This helps estimate overall explained variance without pretending the listed values are complete.
7) What is effective dimensionality?
It summarizes how concentrated or spread out the explained variance is. A value near one means a single component dominates, while a value close to the number of components means the variance is shared almost evenly among them.
8) Can I use this for model reduction decisions?
Yes. It helps compare target coverage, retained component counts, and diagnostic benchmarks, which makes PCA retention choices easier to justify in reporting and analysis.