Should there be a residual when applying principal component analysis?

Question

0 votes

I am using the MATLAB function pca (principal component analysis) to reduce the dimensionality of a data set with approximately 20 000 observations x 100 dimensions.

After obtaining the principal component coefficients of the data, I reconstructed the input signal in the original coordinate system using the transformation matrix returned by the pca function. Comparing the reconstruction with the input signal gave a very large residual. I have experimented with different data sets and sizes, and a large residual appears to be commonplace. I am not yet sure whether it is due to round-off errors or to a low SNR in the input data. The dimensionality reduction could of course still be useful, but is this something one should be cautious about when performing principal component analysis? Or is there a better metric for assessing the performance of the principal component analysis?

Comments: 4
Adam on 20 Sep 2016

I would have expected the residual to always be 0 if you simply rearrange the dimensions without throwing any of them away, but maybe I am misremembering how PCA works. If it simply re-orients the data along the axes of maximal variance, it should not lose any data, provided you are reconstructing correctly.

HaMo on 20 Sep 2016

Yeah, you are right, Adam. See the answer below. Thanks!

Accepted Answer

John D'Errico on 20 Sep 2016

Edited: John D'Errico on 20 Sep 2016

1 vote

If you used all of the components and you STILL have a large residual, then you are doing the reconstruction incorrectly. The residuals at that point should be on the order of eps. Have you verified that you did not in fact get tiny numbers, and merely overlooked the tiny power of 10 attached to them? With no data and no indication of how you did the analysis or the reconstruction, we cannot know what you did incorrectly.

If I had to guess, you did not properly deal with the mean, or something silly like that.

My recommendation is that you attach the data to a comment as a .mat file. Then I can do the PCA and the reconstruction, showing you what to do.
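To illustrate the point about the mean, here is a minimal sketch in plain Python (not MATLAB, and not how pca is implemented internally). It assumes 2-D toy data made up for the illustration, and the helper names pca_2d, reconstruct and max_abs_residual are hypothetical. PCA works on centered data, so a reconstruction that uses all components has to add the column means back; leaving them out produces a residual equal to the means rather than something on the order of eps.

```python
import math

def pca_2d(X):
    """PCA for 2-D data using the closed-form principal axis of the 2x2 covariance."""
    n = len(X)
    mu = [sum(row[j] for row in X) / n for j in range(2)]   # column means
    Xc = [[row[0] - mu[0], row[1] - mu[1]] for row in X]    # centered data
    a = sum(r[0] * r[0] for r in Xc) / (n - 1)              # var(x)
    b = sum(r[0] * r[1] for r in Xc) / (n - 1)              # cov(x, y)
    c = sum(r[1] * r[1] for r in Xc) / (n - 1)              # var(y)
    theta = 0.5 * math.atan2(2 * b, a - c)                  # angle of first principal axis
    # Columns of coeff are the principal directions (an orthogonal rotation).
    coeff = [[math.cos(theta), -math.sin(theta)],
             [math.sin(theta),  math.cos(theta)]]
    # score = Xc * coeff
    score = [[r[0] * coeff[0][k] + r[1] * coeff[1][k] for k in range(2)] for r in Xc]
    return coeff, score, mu

def reconstruct(coeff, score, mu):
    """X_hat = score * coeff' + mu, using all components."""
    return [[s[0] * coeff[j][0] + s[1] * coeff[j][1] + mu[j] for j in range(2)]
            for s in score]

def max_abs_residual(X, X_hat):
    return max(abs(x - y) for rx, ry in zip(X, X_hat) for x, y in zip(rx, ry))

# Toy data with a clearly nonzero mean (made up for this illustration).
X = [[10.0, 5.2], [11.5, 6.1], [9.2, 4.4], [12.1, 6.9], [10.8, 5.5]]

coeff, score, mu = pca_2d(X)
good = max_abs_residual(X, reconstruct(coeff, score, mu))          # mean added back
bad = max_abs_residual(X, reconstruct(coeff, score, [0.0, 0.0]))   # mean forgotten

print("residual with mean added back:", good)
print("residual without the mean:   ", bad)
```

In MATLAB the same idea should apply: pca returns the estimated means as its sixth output, [coeff,score,~,~,~,mu] = pca(X), and with all components the reconstruction is score*coeff' plus mu added to every row.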

Comments: 2
HaMo on 20 Sep 2016

Problem solved. As you guessed, my input data was not centered. Thanks!

John D'Errico on 20 Sep 2016

WOW! For once, the crystal ball told me the truth. Usually it does nothing more than tell me my horoscope. :)
