Paste data, standardize variables, and compute component scores. Choose an outlier threshold for screening. Download shareable CSV and PDF reports anytime.
Use this sample to confirm the tool and downloads work end-to-end.
| ID | X1 | X2 | X3 |
|---|---|---|---|
| A | 10 | 12 | 9 |
| B | 11 | 18 | 10 |
| C | 9 | 14 | 8 |
| D | 15 | 20 | 14 |
| E | 13 | 16 | 12 |
| F | 8 | 11 | 7 |
1) Standardization
For each variable Xj, the tool computes the mean μj and the sample standard deviation σj, then transforms each value xij (row i, variable j):

$$z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}$$
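As a minimal sketch of this transformation, assuming NumPy is available and using rows A to F from the sample table, the standardized values can be reproduced as follows. The names `X`, `mu`, `sigma`, and `Z` are illustrative and are not taken from the tool's own code; `ddof=1` assumes the sample (n - 1) standard deviation.

```python
import numpy as np

# Sample rows A-F from the table above; columns are X1, X2, X3.
X = np.array([
    [10, 12, 9],
    [11, 18, 10],
    [9, 14, 8],
    [15, 20, 14],
    [13, 16, 12],
    [8, 11, 7],
], dtype=float)

mu = X.mean(axis=0)            # per-column mean
sigma = X.std(axis=0, ddof=1)  # per-column sample standard deviation (assumed n - 1 divisor)
Z = (X - mu) / sigma           # standardized data: each column has mean 0 and SD 1

print(np.round(Z, 3))
```

After this step every column of `Z` has mean 0 and sample standard deviation 1, which is what lets the later covariance step treat all variables on the same scale.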
2) Covariance and PCA
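Using the standardized data matrix Z (n rows, one column per variable), the covariance matrix is:

$$C = \frac{1}{n-1} Z^{\top} Z$$

The n − 1 divisor matches the sample standard deviation the tool uses for standardization.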
PCA solves C v = λ v. Eigenvectors v are loadings; eigenvalues λ drive explained variance.
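The covariance and eigen-decomposition step can be sketched as below, again on the sample rows. This is an illustration under the assumption of an n - 1 divisor, not the tool's actual implementation; `np.linalg.eigh` is used because the covariance matrix is symmetric.

```python
import numpy as np

X = np.array([
    [10, 12, 9],
    [11, 18, 10],
    [9, 14, 8],
    [15, 20, 14],
    [13, 16, 12],
    [8, 11, 7],
], dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

n = Z.shape[0]
C = Z.T @ Z / (n - 1)  # covariance of the standardized data

# eigh returns eigenvalues in ascending order, so reorder to put
# the component with the largest variance first.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]  # columns are the loadings v

explained = eigvals / eigvals.sum()  # share of total variance per component
print(np.round(eigvals, 4))
print(np.round(explained, 4))
```

In this particular sample X3 equals X1 minus 1 in every row, so the two variables are perfectly correlated. As a result one eigenvalue is zero up to rounding, and PC1 carries roughly 94% of the variance.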
3) Component scores and Z-scores
Scores for the first k components are computed as:

$$S = Z V_k$$

where V_k holds, as columns, the k eigenvectors with the largest eigenvalues.
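For your selected component score s, the tool produces a Z-score for each row i:

$$\mathrm{Zscore}_i = \frac{s_i - \bar{s}}{\sigma_s}$$

where s̄ and σ_s are the mean and standard deviation of that component's scores across all rows. Rows whose |Zscore| exceeds the chosen threshold are flagged for review.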
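The score and Z-score computation in this step can be sketched as follows on the sample rows. Two things here are assumptions rather than documented tool behavior: the standard deviation of the scores uses the n - 1 divisor, and a row is flagged when |Z| is strictly greater than the threshold. The threshold value of 2.0 is only an example.

```python
import numpy as np

ids = ["A", "B", "C", "D", "E", "F"]
X = np.array([
    [10, 12, 9],
    [11, 18, 10],
    [9, 14, 8],
    [15, 20, 14],
    [13, 16, 12],
    [8, 11, 7],
], dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
V = eigvecs[:, np.argsort(eigvals)[::-1]]  # loadings, largest eigenvalue first

k = 2
scores = Z @ V[:, :k]  # one row per observation, one column per component

s = scores[:, 0]                        # selected component: PC1
z_s = (s - s.mean()) / s.std(ddof=1)    # assumed sample SD of the scores

threshold = 2.0                         # example threshold
flags = np.abs(z_s) > threshold         # assumed strict comparison

for row_id, z, flag in zip(ids, z_s, flags):
    print(f"{row_id}: Z = {z:+.3f}  flagged = {flag}")
```

The sign of an eigenvector is arbitrary, so the sign of the scores and Z-scores may flip between implementations; the absolute values used for flagging are unaffected. On this sample, row D has the largest |Z| and no row crosses a threshold of 2.0.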
Principal components compress correlated variables into fewer, orthogonal signals. Converting a chosen component score into a Z-score makes those signals comparable across rows, because values are expressed in standard deviations from the component’s mean. In monitoring and screening work, analysts often treat |Z| above 2.0 as unusual and above 3.0 as rare under near-normal behavior.
This tool standardizes each variable using the sample mean and sample standard deviation. Standardization prevents high-variance variables from dominating the covariance structure. For operational datasets, consider trimming impossible values, aligning units, and keeping a consistent measurement window so that the covariance matrix represents the same process state.
Explained variance quantifies how much total standardized variance each component captures. If PC1 explains 60% or more, a single latent factor may drive the system. If variance is distributed across many components, the dataset may contain multiple independent drivers, requiring a higher component count for stable scoring.
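One way to turn explained variance into a component count is to take the smallest k whose cumulative share reaches a target. The sketch below uses a hypothetical helper, `components_needed`, and made-up eigenvalues chosen only for illustration; neither comes from the tool.

```python
import numpy as np

def components_needed(eigenvalues, target):
    """Return the smallest k whose cumulative explained variance reaches `target`."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(eigenvalues))  # guard against rounding at target = 1.0

# Hypothetical eigenvalues for a four-variable standardized dataset.
example_eigenvalues = [2.4, 0.9, 0.5, 0.2]

explained = np.array(example_eigenvalues) / sum(example_eigenvalues)
print(np.round(explained, 3))
print(components_needed(example_eigenvalues, 0.80))
print(components_needed(example_eigenvalues, 0.90))
```

With these illustrative values PC1 explains 60% of the variance, two components are needed to pass 80%, and three to pass 90%.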
Loadings show each variable’s contribution to a component. Large positive loadings move the score upward when the variable increases; large negative loadings move it downward. When variables are standardized, loadings are directly comparable across columns. Reviewing the largest absolute loadings helps label components with business or scientific meaning.
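Ranking the loadings by absolute value can be sketched as below for PC1 of the sample rows. The variable names and the choice of PC1 are illustrative assumptions.

```python
import numpy as np

names = ["X1", "X2", "X3"]
X = np.array([
    [10, 12, 9],
    [11, 18, 10],
    [9, 14, 8],
    [15, 20, 14],
    [13, 16, 12],
    [8, 11, 7],
], dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
pc1 = eigvecs[:, np.argmax(eigvals)]  # loadings of the first component

# Sort variables by the size of their contribution, ignoring sign.
ranked = sorted(zip(names, pc1), key=lambda pair: abs(pair[1]), reverse=True)
for name, loading in ranked:
    print(f"{name}: {loading:+.3f}")
```

Because the overall sign of an eigenvector is arbitrary, it is the absolute size and the relative signs of the loadings that carry meaning. In this sample all three loadings share the same sign and have similar magnitude, so PC1 behaves like an overall "size" factor.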
The outlier flag is a screening signal, not a verdict. Use a threshold that matches your risk tolerance and sample size. For small samples, 2.5 can reduce false alarms; for high-volume monitoring, 3.0 is common. Always review the raw row and variable contributions before action.
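Sample size also limits how large a Z-score can get. When the mean and sample standard deviation are computed from the same n values, no single value can exceed |Z| = (n - 1) / sqrt(n). The sketch below assumes the Z-score uses the n - 1 divisor; `max_abs_z` is a hypothetical helper name.

```python
import math
import statistics

def max_abs_z(n):
    """Upper bound on |Z| for any one value in a sample of size n (sample SD, n - 1 divisor)."""
    return (n - 1) / math.sqrt(n)

for n in (6, 10, 11, 30, 100):
    print(n, round(max_abs_z(n), 3))

# The bound is reached when one value differs and all others are identical.
extreme = [0, 0, 0, 0, 0, 1]
z_extreme = (extreme[-1] - statistics.mean(extreme)) / statistics.stdev(extreme)
print(round(z_extreme, 3))
```

For a six-row dataset such as the sample table, the bound is about 2.04, so a threshold of 2.5 or 3.0 could never flag a row. Under this assumption a threshold of 3.0 only becomes reachable from 11 rows upward.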
To improve repeatability, keep the same variable set and ordering when comparing runs. If you plan to deploy the scores, store the means, standard deviations, and loadings, then score new rows using those fixed parameters. Recomputing PCA on shifting samples changes the component space and can move Z-scores even when the underlying process is stable. When missing values exist, skipping rows preserves statistics but reduces coverage; zero-imputation increases coverage but may bias components. For time-series, recompute models on scheduled intervals and track drift in explained variance and loadings.
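Scoring new rows against stored parameters can be sketched as follows. The functions `fit_pc1` and `score_rows`, the dictionary keys, and the new rows are all hypothetical; the sketch fixes the means, standard deviations, PC1 loadings, and the score distribution from a reference sample, then reuses them without refitting.

```python
import numpy as np

def fit_pc1(X):
    """Fit standardization and PC1 on a reference sample; return the fixed parameters."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    Z = (X - mu) / sigma
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    loading = eigvecs[:, np.argmax(eigvals)]
    s = Z @ loading
    return {
        "mu": mu,
        "sigma": sigma,
        "loading": loading,
        "score_mean": s.mean(),
        "score_sd": s.std(ddof=1),
    }

def score_rows(params, rows):
    """Return PC1 Z-scores for rows using stored parameters, without refitting."""
    Z = (np.asarray(rows, dtype=float) - params["mu"]) / params["sigma"]
    s = Z @ params["loading"]
    return (s - params["score_mean"]) / params["score_sd"]

# Reference sample: rows A-F from the table above.
X_ref = np.array([
    [10, 12, 9],
    [11, 18, 10],
    [9, 14, 8],
    [15, 20, 14],
    [13, 16, 12],
    [8, 11, 7],
], dtype=float)
params = fit_pc1(X_ref)

new_rows = [[12, 15, 11], [20, 25, 19]]  # hypothetical incoming rows
print(np.round(score_rows(params, new_rows), 3))
```

Because the parameters are frozen, a new row's Z-score depends only on that row, so an extreme row such as the second one stands out clearly instead of pulling the means and loadings toward itself.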
A PCA Z-score expresses a selected component score in standard deviations from that component’s mean, enabling easy cross-row comparison and threshold-based screening.
Because this tool standardizes variables first, PCA on the covariance of the standardized data is equivalent to correlation-based PCA. Standardization is preferred when variables have different units or scales.
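The link between the two approaches can be checked numerically. In this short sketch on the sample rows, the covariance matrix of the standardized data is compared with the correlation matrix of the raw data; the variable names are illustrative.

```python
import numpy as np

X = np.array([
    [10, 12, 9],
    [11, 18, 10],
    [9, 14, 8],
    [15, 20, 14],
    [13, 16, 12],
    [8, 11, 7],
], dtype=float)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

cov_of_Z = np.cov(Z, rowvar=False)        # covariance of standardized data
corr_of_X = np.corrcoef(X, rowvar=False)  # correlation of raw data

print(np.allclose(cov_of_Z, corr_of_X))
```

The two matrices agree to floating-point precision, so their eigenvectors and eigenvalues are the same as well.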
To decide how many components to keep, start with enough components to explain a meaningful share of the variance, often 70–90% cumulatively. Also check whether loadings remain stable and interpretable for your use case.
Scores can change when rows are added because PCA is sample-dependent. New rows can change means, standard deviations, covariance, loadings, and component distributions, shifting scores and their Z-score scaling.
When values are missing, skipping the affected rows preserves statistical structure but reduces coverage. Zero-imputation increases coverage but may distort covariance and loadings. Use a policy consistent with your data generation process.
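The trade-off between the two policies can be illustrated with a small sketch. It assumes zeros are substituted into the raw values before standardization, which may differ from how the tool applies imputation, and it uses a hypothetical copy of the sample rows with one X2 value removed.

```python
import numpy as np

# Hypothetical example: sample rows A-F with the X2 value of row B missing.
X = np.array([
    [10, 12, 9],
    [11, np.nan, 10],
    [9, 14, 8],
    [15, 20, 14],
    [13, 16, 12],
    [8, 11, 7],
])

missing = np.isnan(X)
X_skip = X[~missing.any(axis=1)]    # policy 1: drop any row with a missing value
X_zero = np.where(missing, 0.0, X)  # policy 2: replace missing values with zero

print("rows kept:", len(X_skip), "vs", len(X_zero))
print("X2 mean:", X_skip[:, 1].mean(), "vs", round(X_zero[:, 1].mean(), 3))
print("X2 SD:  ", round(X_skip[:, 1].std(ddof=1), 3), "vs", round(X_zero[:, 1].std(ddof=1), 3))
```

Skipping keeps five of six rows and leaves the X2 statistics close to the complete data. Zero-imputation keeps every row but pulls the X2 mean down and inflates its standard deviation, which then feeds into the standardized values, covariance, and loadings.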
An outlier flag does not prove that a row is anomalous; it is a screening indicator. Confirm by reviewing original variables, context, and whether the component aligns with a plausible causal driver.