Variance dominance in mixed units
When variables use different units, the largest-variance feature can steer the first components. If Weight ranges 50–100 while Score ranges 0–10, the variance ratio may exceed 25×, so distances and covariances are driven by Weight. Standardization transforms each column onto a comparable scale, improving interpretability. Mean-centering alone removes offsets but still leaves unequal variances.
Method selection with practical thresholds
Z-score scaling is the default when you want each variable to contribute equally: mean becomes 0 and standard deviation becomes 1. The sample option uses (n−1) and suits observational datasets; the population option uses n for complete tables. Robust scaling is safer when outliers exceed about 3 standard deviations or when distributions are strongly skewed; it relies on the median and MAD with the 1.4826 consistency factor. Min–max scaling keeps values in [0,1], useful for bounded scores, but it can compress tails.
Reading the statistics table
Use n to confirm how many usable observations remain after missing-value handling. Large gaps between mean and median indicate skew; a high MAD relative to standard deviation suggests heavy tails. Compare min and max to spot coding errors, such as a misplaced decimal (720 instead of 72). Zero standard deviation means a constant column; the tool sets standardized values to 0 to prevent division errors, and such variables usually add no information to PCA. After Z-score scaling, column means should be near 0.
Matrix checks before running PCA
After standardization, the covariance matrix diagonal should be near 1 for Z-score methods, because each column’s variance is about 1. Off-diagonal covariance signs indicate whether variables move together or in opposite directions. For correlation-based PCA, use the correlation matrix; values near ±0.7 suggest strong shared structure. Unexpected near-zero correlations may indicate data entry issues or an incorrect delimiter.
Export-ready workflow for analysis
Use the on-screen preview to sanity-check signs, magnitudes, and rounding, then export CSV. Keep the same decimal setting for reproducible reports, and store the PDF when you need an audit trail of preprocessing choices. The export includes all standardized rows, not just the preview. If you iterate, change one option at a time and compare matrices to see what shifted.