Different correlation results using matrix with NaN values
Hello, I am getting different results when I calculate the correlation coefficient in two different ways.
The first way is to remove every pair that contains a NaN value before correlating:
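ign_nan = isfinite(M1) & isfinite(M2);   % true where both values are finite (excludes NaN and Inf)
[RHO1,P] = corrcoef(M1(ign_nan),M2(ign_nan));   % apply the mask so only finite pairs are correlated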
The second way is to leave the matrices as they are and add 'rows','pairwise' so that corrcoef ignores the NaN values:
[RHO2,P] = corrcoef(M1',M2','rows','pairwise');
Can someone tell me why RHO1 is different from RHO2?
Thank you! Magui
Kirby Fears on 18 Apr 2016
The documentation for corrcoef indicates that 'complete' is the 'rows' value corresponding to your first calculation.
Try using the following for RHO2 and compare it to RHO1.
[RHO2,P] = corrcoef(M1',M2','rows','complete');
Documentation for the 'rows' setting is below.
'rows' — Use of NaN option 'all' (default) | 'complete' | 'pairwise'
Use of NaN option, specified as one of these values:
'all' — Include all NaN values in the input before computing the correlation coefficients.
'complete' — Omit any rows of the input containing NaN values before computing the correlation coefficients. This option always returns a positive definite matrix.
'pairwise' — Omit any rows containing NaN only on a pairwise basis for each two-column correlation coefficient calculation. This option can return a matrix that is not positive definite.
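To make the difference between the two options concrete, here is a minimal sketch on a small made-up matrix. The data in X and the variable names (keep, ok12, R_manual, R12) are assumptions for illustration only, not taken from the question. It uses the single-matrix form of corrcoef with three columns, where the two options can disagree because a row may be missing a value in one column but not in the others.

```matlab
% Hypothetical data: three variables (columns), five observations (rows).
X = [1   2   9;
     2   NaN 7;
     3   5   NaN;
     4   7   4;
     5   11  1];

% 'complete': drop every row that has a NaN in ANY column, then correlate.
R_complete = corrcoef(X, 'rows', 'complete');

% Manual equivalent of 'complete': keep only rows with no NaN at all.
keep = all(~isnan(X), 2);
R_manual = corrcoef(X(keep, :));

% 'pairwise': for each pair of columns, drop only the rows that have a
% NaN in one of those two columns.
R_pairwise = corrcoef(X, 'rows', 'pairwise');

% Manual equivalent for the (1,2) entry only: row 3 is kept here because
% its NaN is in column 3, which this pair does not use.
ok12 = ~isnan(X(:,1)) & ~isnan(X(:,2));
R12 = corrcoef(X(ok12,1), X(ok12,2));

% Compare the results.
disp(max(abs(R_complete(:) - R_manual(:))))   % difference between 'complete' and manual row removal
disp(abs(R_pairwise(1,2) - R12(1,2)))         % difference between 'pairwise' and manual pair removal
disp(R_pairwise(1,2) - R_complete(1,2))       % the two options use different rows for this pair

% Optional check of the positive-definiteness remark in the documentation:
% a negative smallest eigenvalue would mean the matrix is not positive definite.
disp(min(eig(R_pairwise)))
```

In this sketch 'complete' uses rows 1, 4 and 5 for every entry, while 'pairwise' uses rows 1, 3, 4 and 5 for the (1,2) entry, so the two coefficients are computed from different samples.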