Should there be a residual when applying principal component analysis?
I am using the MATLAB function pca (principal component analysis) to reduce the dimensionality of a data set with approximately 20,000 observations x 100 dimensions.
After obtaining the principal component coefficients of the data, I recreated the input signal in the original coordinate system using the transformation matrix returned by pca. Comparing this reconstruction with the input signal gave a very large residual. I have experimented with different data sets and sizes, and a large residual appears to be commonplace. I am not sure yet whether it is due to round-off errors or to a low SNR in the input data. The dimensionality reduction could of course still be useful, but is this something one should be cautious about when performing principal component analysis? Or is there another metric that is better suited to assessing the performance of the principal component analysis?
Comments: 4
Adam
20 Sep 2016
How many components does it leave you with? The obvious interpretation would be that there are too few to represent the data accurately and that you would need to use more components. I haven't used the pca function for a few years, so I can't remember what it returns by default.
HaMo
20 Sep 2016
Adam
20 Sep 2016
I would have expected the residual to always be 0 if you simply re-arrange the dimensions without throwing any of them away, but maybe I am misremembering how PCA works. If it is simply re-orienting the data to the axes of maximal variance, it should not lose any data, provided you are reconstructing correctly.
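To make the point about reconstructing correctly concrete, here is a minimal sketch on synthetic data (an assumption, since the original data and reconstruction code are not shown in this thread). pca centers the data by default and returns the subtracted column means as its sixth output, so one possible source of a large residual is forgetting to add that mean back. The sketch also compares a truncated reconstruction with the `explained` output, which is one candidate for the "other metric" asked about. The variable names and the choice k = 3 are illustrative only.

```matlab
% Synthetic stand-in for the real data (assumption: the actual data set is not shown).
rng(0);                                   % reproducible random numbers
X = randn(2000, 10) * randn(10, 10) + 5;  % 2000 observations x 10 variables, nonzero mean
n = size(X, 1);

% pca centers X by default; mu holds the column means that were subtracted.
[coeff, score, ~, ~, explained, mu] = pca(X);

% Full reconstruction: all components kept and the mean added back.
Xfull   = score * coeff' + repmat(mu, n, 1);
errFull = max(abs(X(:) - Xfull(:)));      % should be at round-off level

% Same reconstruction without the mean: the residual is about the size of mu.
Xnomean   = score * coeff';
errNoMean = max(abs(X(:) - Xnomean(:)));

% Truncated reconstruction using only the first k components.
k  = 3;
Xk = score(:, 1:k) * coeff(:, 1:k)' + repmat(mu, n, 1);

% Relative residual, measured against the centered data.
relErrK = norm(X - Xk, 'fro') / norm(X - repmat(mu, n, 1), 'fro');

% Fraction of variance kept by the first k components ('explained' is in percent).
retained = sum(explained(1:k)) / 100;

fprintf('full: %g, no mean: %g, rank-%d relative error: %g, retained: %g\n', ...
        errFull, errNoMean, k, relErrK, retained);
```

With all components kept and the mean restored, the residual should be at round-off level. With only k components, the squared relative residual equals one minus the retained variance fraction, so `explained` indicates how large a residual to expect before reconstructing anything.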
HaMo
20 Sep 2016