I am using the corr function to calculate correlation coefficients between variables of interest, specifically with Spearman's rho. While doing this, I noticed something odd in the calculated p-values.
Here is an example that illustrates the problem:
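x = [2 7 9 5 4 1 3 8 10 6 11]';
y = [1 2 3 4 5 6 7 8 9 10 11;
     1 1 3 4 5 6 7 8 9 10 11]';
[r1, p1] = corr(x, y, 'Type', 'Spearman')        % first call: y passed as the full matrix
[r2, p2] = corr(x, y(:,1), 'Type', 'Spearman')   % second call: only the first column of y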
The output is the following:
The calculated p-value for the correlation between x and the first column of y differs depending on whether y is passed as the full matrix or as just its first column.
I found out that this is specific to Spearman's correlation, because that method works on ranks. The p-values are calculated differently depending on whether the data contain rank ties. In the first function call, the method for tied data is applied to both columns of y, even though only the second column contains ties. In the second function call, the first column of y has no ties, so a different method is used. That yields a different p-value for the correlation between x and the first column of y, and from my understanding it is the correct one.
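If that explanation is right, one possible workaround is to call corr once per column of y, so that the choice of p-value method depends only on the ties in that column. The following is a minimal sketch, assuming corr is called with the 'Type','Spearman' option; the variable names rho, p and k are only illustrative.

```matlab
x = [2 7 9 5 4 1 3 8 10 6 11]';
y = [1 2 3 4 5 6 7 8 9 10 11;
     1 1 3 4 5 6 7 8 9 10 11]';

nCols = size(y, 2);
rho = zeros(1, nCols);   % Spearman's rho for each column of y
p   = zeros(1, nCols);   % corresponding p-values

% Correlate x with one column at a time, so ties in one column
% cannot influence how the p-value of another column is computed.
for k = 1:nCols
    [rho(k), p(k)] = corr(x, y(:, k), 'Type', 'Spearman');
end

disp(rho)
disp(p)
```

With this loop, p(1) should match the result of the second function call above, while p(2) is still computed with the method for tied data.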
Would it be possible for anyone to resolve this issue?