How to find the weight of PC_1 in my measurements, after doing PCA?

Mark Golberg on 19 Sep 2022
Answered: Githin George on 4 Oct 2023
Hi,
I'm trying to use the following code to understand PCA, SVD, and the relationship between them:
% PCA_vs_SVD (my sand_box)
%% generate fake data points
my_fake_dataPoints = [-4 0 ; -2 1 ; -1 -1 ; 1 1 ; 3 2 ; 4 2];
% remove the mean of each variable (column-wise mean, dimension 1)
my_fake_dataPoints_noMean = my_fake_dataPoints - mean(my_fake_dataPoints , 1);
% do PCA
[coeff , score , latent , tsquared , explained , mu] = pca(my_fake_dataPoints);
[coeff_noMean , score_noMean , latent_noMean , tsquared_noMean , explained_noMean , mu_noMean] = pca(my_fake_dataPoints_noMean);
% do SVD
[U , S , V] = svd(my_fake_dataPoints);
[U_noMean , S_noMean , V_noMean] = svd(my_fake_dataPoints_noMean);
%% plots
figure(1)
biplot(coeff , 'scores' , score , 'MarkerSize' , 30 , 'varlabels' ,{'var_1' , 'var_2'});
figure(2)
scatter(score(:,1) , score(:,2))
axis equal
xlabel('1st Principal Component')
ylabel('2nd Principal Component')
grid on
I have some questions:
1) How can I know the weight of PC_1 in my measurements? Is it simply the first column of "score", or something else?
2) What exactly is the connection between the output of PCA and SVD? Which case should I compare: the standard one, or the one with mean subtraction?
3) Am I missing something in the following:
PC1 = alpha_1 * v1 + alpha_2 * v2, right? My alphas are the first column of the "coeff" variable, right?
So, the 1st data point projected on PC1 should be: 0.95 * (-4) + 0.28 * (0) = -3.8, right? But it doesn't match score(1,1), which is -4.23... What am I missing here?

Answers (1)

Githin George on 4 Oct 2023
Hello Mark,
I understand you have a few doubts related to PCA. To answer your queries:
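  1. The "explained" (or "explained_noMean") variable contains the percentage of the total variance captured by each principal component (PC_1 and PC_2 in this case), computed as 100 * latent / sum(latent). The first column of "score" is not a weight: it holds the coordinate of each data point along PC_1.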
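  2. By default, pca centers the data (it subtracts the mean of each column), so its output corresponds to an SVD of the mean-centered data, not of the raw data. For [U , S , V] = svd(Xc) with Xc the column-centered data, the columns of "coeff" equal the columns of V up to sign, "score" equals U*S up to the same signs, and "latent" equals the squared singular values divided by n-1. I suggest you refer to the following answer to learn more: https://www.mathworks.com/matlabcentral/answers/774902-pca-vs-svd-or-eig-functions?s_tid=srchtitle_site_search_1_pca%20vs%20svd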
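  3. The equation "PC1 = alpha_1 * v1 + alpha_2 * v2", with the alphas taken from the first column of "coeff", does give the projection of a point (v1,v2) onto the first principal axis. However, pca subtracts the column means before projecting, so "score" holds the projections of the mean-centered points: score = (X - mu) * coeff. Here mu = [1/6 5/6], so for the first data point the result is about 0.958 * (-4 - 0.167) + 0.287 * (0 - 0.833) = -4.23, which matches score(1,1).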
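The three points above can be checked numerically. The following is only a minimal sketch, assuming the Statistics and Machine Learning Toolbox (for pca) and a release with implicit expansion (R2016b or later); the names X, Xc, explained_manual, score_manual, latent_svd and US are illustrative and do not appear in the original code.

```matlab
% Sketch: relate the pca outputs to explained variance, projections, and svd.
X = [-4 0 ; -2 1 ; -1 -1 ; 1 1 ; 3 2 ; 4 2];   % rows = observations, columns = variables
n = size(X , 1);

[coeff , score , latent , ~ , explained , mu] = pca(X);

% Point 1: share of the total variance carried by each PC, in percent.
explained_manual = 100 * latent / sum(latent);
disp(max(abs(explained - explained_manual)))    % expected to be close to zero

% Point 3: scores are projections of the mean-centered data onto the PC axes.
Xc = X - mu;                                    % mu is the 1-by-2 row of column means
score_manual = Xc * coeff;
disp(max(abs(score(:) - score_manual(:))))      % expected to be close to zero

% Point 2: compare pca with an SVD of the centered data.
[U , S , V] = svd(Xc , 'econ');
US = U * S;                                     % candidate scores from the SVD
latent_svd = diag(S).^2 / (n - 1);              % variances from singular values

% Singular vectors are only defined up to sign, so compare absolute values.
disp(max(abs(abs(coeff(:)) - abs(V(:)))))       % expected to be close to zero
disp(max(abs(abs(score(:)) - abs(US(:)))))      % expected to be close to zero
disp(max(abs(latent - latent_svd)))             % expected to be close to zero
```

For this data set, "explained" should come out at roughly 94.6 and 5.4 percent, so PC_1 carries almost all of the variance. Running svd on the raw, uncentered X instead of Xc would generally not reproduce the pca outputs.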
I hope this addresses your queries.
