Dimensional error using PCA

Jaime de la Mota on 9 Jul 2019
Commented: Jaime de la Mota on 10 Jul 2019
Hello everyone. I have written code that uses a Gaussian correlation kernel to generate 1000 realizations of a stochastic process and then performs PCA on the resulting process. The result is a 501*1000 matrix.
However, when I perform PCA on this matrix, the results contradict the help at https://la.mathworks.com/help/stats/pca.html
The documentation says that if one passes an n*p matrix, coeff will be a p*p matrix and score an n*p matrix. Here I get different results: coeff is a p*n matrix and score a p*p. The weird thing is that the process is reconstructed properly. Can anyone tell me what is happening?
Thanks.
Additionally, according to the theory I have read, the coeffs should be standard normal random variables; if I plot the histograms, the resulting variables are normal but not standard. If someone could tell me why they are not standard, I would be very thankful.
The code in question:
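close all
clear
clc

% Grid on [0, 1] and the kernel exp(-|x - y|) evaluated on it
[X,Y] = meshgrid(0:0.002:1,0:0.002:1);
Z=exp((-1)*abs(X-Y));
tam=size(X, 1);
number_realizations=1000;
realizacion_mat=zeros(tam, number_realizations);

% Eigendecomposition of the sample covariance of the columns of Z
cov_mat=cov(Z);
[evec_mal, evalM_mal]=eig(cov_mat);
eval_mal=eig(evalM_mal);   % eigenvalues as a vector
num_eval=size(eval_mal,1);

% Reverse the order of the eigenvalues and eigenvectors
for i=1:num_eval
    eval(i)=eval_mal(num_eval-i+1);
    evec(:,i)=evec_mal(:,num_eval-i+1);
end

% Build each realization as a sum of eigenvectors scaled by
% sqrt(eigenvalue) times an independent N(0,1) draw
figure
hold on
for j=1:number_realizations
    realizacion=zeros(tam, 1);
    for i=1:tam
        v_a = normrnd(0,1);
        realizacion=realizacion+sqrt(eval(i))*evec(:,i)*v_a;
    end
    realizacion_mat(:,j)=realizacion;   % one realization per column
    plot(realizacion)
    clear('realizacion')
end

% PCA without centering, then reconstruction and its error
[coeff,score,latent,tsquared,explained,mu] = pca(realizacion_mat,'Centered',false);
reconstruction_process=score*coeff';
diference=reconstruction_process-realizacion_mat;
figure
plot(diference)

% Histograms of the first five columns of coeff
for i=1:5
    figure
    histogram(coeff(:,i), 20)
end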

Accepted Answer

Jon on 9 Jul 2019
Edited: Jon on 9 Jul 2019
The first argument to pca should be n-by-p, where n is the number of observations and p is the number of variables. You are supplying a p-by-n matrix, so the returned arguments are not dimensioned as you expect. I do not see anything in the MATLAB documentation that discusses the distribution (standard normal) of the coefficients; maybe this is something specific to your application. In any case, if you supply pca with an array in which each row is an observation, you will be off to a good start.
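To illustrate the orientation point, here is a minimal sketch on made-up data. The sizes, the seed, and every variable name are assumptions chosen for illustration and are not taken from the question's code.

```matlab
% Hypothetical data: 200 observations (rows) of 5 variables (columns).
rng(0);                          % fixed seed so the run is repeatable
n = 200;
p = 5;
data = randn(n, p);

% Rows are observations, which is the layout pca expects.
[coeff, score] = pca(data, 'Centered', false);
disp(size(coeff))                % p-by-p
disp(size(score))                % n-by-p

% Without centering, score*coeff' rebuilds the input up to round-off.
recon_err = max(max(abs(score*coeff' - data)));
disp(recon_err)

% Passing the transpose swaps the roles of observations and variables,
% so the outputs come back with different dimensions.
[coeff_t, score_t] = pca(data', 'Centered', false);
disp(size(coeff_t))
disp(size(score_t))
```

For the matrix in the question, which stores one realization per column, this would presumably mean passing realizacion_mat' so that each of the 1000 realizations becomes a row.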
I also suggest that you do not use the variable name eval for the eigenvalues. eval is a MATLAB function that evaluates an expression. You did not get an error message because MATLAB assumes you want to use eval as a variable name rather than as a function. At the very least, this makes the code confusing to read for anyone who knows what the eval function does, and if at some later point you actually wanted to call eval as a function, you would have problems.
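As a sketch of one way to avoid the name clash, the eigenvalues can be sorted in descending order with sort and stored under names that do not shadow a built-in. The matrix C and the names eigvals, eigvecs, and order are placeholders for illustration only.

```matlab
% Small symmetric matrix standing in for cov_mat (made-up values).
C = [4 1 0; 1 3 1; 0 1 2];

[V, D] = eig(C);

% Sort the eigenvalues from largest to smallest and reorder the
% eigenvectors to match. The names eigvals and eigvecs do not shadow
% the eval function.
[eigvals, order] = sort(diag(D), 'descend');
eigvecs = V(:, order);

disp(eigvals')
```

This also replaces the explicit reordering loop with a single sort call, which keeps each eigenvector paired with its eigenvalue.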
Comments: 5
Jon on 9 Jul 2019
Hi, I'm not familiar with the theoretical background for your problem and have not used principal component analysis in this particular context, so I do not have an immediate answer as to why they are not standard normal variables. I'm sorry, I do not have time to dig deeper, but I would guess that there is a scaling factor somewhere that is not consistent between the two implementations (MATLAB pca and the reference that you are working from).
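One possible source of such a scaling factor, offered as a guess rather than a diagnosis: the columns of score returned by pca have variances equal to latent, not 1. The question's code multiplies each standard normal draw by sqrt(eval(i)), so unit-variance variables would only be recovered after dividing by the square root of the corresponding variance. The sketch below uses made-up data and hypothetical variable names, and assumes MATLAB R2016b or later for implicit expansion.

```matlab
% Hypothetical data: independent normal columns with different spreads.
rng(1);                          % fixed seed so the run is repeatable
n = 2000;
data = randn(n, 3) .* [5 2 0.5];

% Default (centered) PCA with rows as observations.
[coeff, score, latent] = pca(data);

% The variance of each score column matches latent up to round-off.
score_var = var(score);
disp([score_var; latent'])

% Dividing each score column by sqrt(latent) gives unit variance.
z = score ./ sqrt(latent');
z_var = var(z);
disp(z_var)
```

Whether the rescaled variables are the ones the theory calls standard normal depends on the reference being followed, so this is only a starting point for checking the normalization.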
Jaime de la Mota on 10 Jul 2019
Don't worry.
You have helped me enough.
Thanks again.
