NaN in ranksum test
Hi,
I know there is a nanmedian function, but there does not seem to be a nanranksum function. I ran ranksum on my data both with the empty entries filled in as NaN and with them removed, and I get different p-values. Am I doing something wrong? Is there a Wilcoxon rank-sum / Mann-Whitney U test that ignores NaN values?
Thanks for your help!
Accepted Answer
Daniel Golden
May 31, 2013
I have the same problem in MATLAB R2012a. My version of the ranksum() documentation doesn't mention NaNs at all, but the latest (R2013a) documentation at http://www.mathworks.com/help/stats/ranksum.html states, "ranksum treats NaNs in x and y as missing values and ignores them." That is not how R2012a behaves, so there is probably a bug in its treatment of NaN values. For example, try the following:
>> ranksum([1 2 3 nan], [4 5 6])
ans =
0.0571
>> ranksum([1 2 3], [4 5 6])
ans =
0.1000
If the NaN were being ignored, both calls would return the same p-value, so clearly it is not.
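One way to read the two numbers above, assuming ranksum uses the exact test for samples this small: when every value in x lies below every value in y, the two-sided exact p-value is 2 divided by the number of ways to pick the ranks of one sample out of all ranks. The sketch below only does that arithmetic and does not call ranksum; the variable names are illustrative.

```matlab
% Exact two-sided p-value when the two samples are completely separated:
% only one rank arrangement is this extreme in each tail.
p_without_nan = 2 / nchoosek(3 + 3, 3);   % sample sizes 3 and 3
p_with_nan    = 2 / nchoosek(4 + 3, 3);   % sample sizes 4 and 3 (NaN counted as a value)
fprintf('%.4f  %.4f\n', p_without_nan, p_with_nan);   % prints 0.1000  0.0571
```

The second number matches the 0.0571 shown above, which is consistent with the NaN being counted as a fourth observation in x rather than dropped.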
Other times, NaN inputs will result in NaN outputs:
K>> ranksum([-14.44 NaN 5.97 -117.55 -77.56 -45.00], [-78.59 -101.04 -26.15 -79.51 -48.10 -23.45 -42.18 -76.75 -55.42 -135.18 70.02 -57.44 -31.69 -146.01])
ans =
NaN
But this behavior isn't consistent: removing any single value from either vector in the example above, even a non-NaN one, produces a non-NaN output.
Here's a simple workaround if your inputs might have NaNs. If your input vectors are x and y, and you're running ranksum like:
p = ranksum(x, y)
Then just run ranksum like this:
p = ranksum(x(~isnan(x)), y(~isnan(y)))
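If you call ranksum in many places, the filtering above can be wrapped in a small helper. This is only a sketch: nanranksum is a hypothetical name (it is not a built-in function), it would live in its own file nanranksum.m, and it assumes x and y are numeric vectors. Any extra arguments are passed to ranksum unchanged.

```matlab
function [p, h, stats] = nanranksum(x, y, varargin)
% NANRANKSUM  Hypothetical wrapper (not a built-in): drop NaNs, then call ranksum.
%   Extra arguments are passed through to ranksum unchanged.
x = x(~isnan(x));   % keep only the non-missing observations in x
y = y(~isnan(y));   % same for y
if isempty(x) || isempty(y)
    % Fail loudly instead of testing an empty sample.
    error('nanranksum:noData', ...
          'x and y must each contain at least one non-NaN value.');
end
[p, h, stats] = ranksum(x, y, varargin{:});
end
```

With this helper, nanranksum([1 2 3 NaN], [4 5 6]) runs the same test as ranksum([1 2 3], [4 5 6]) from the example above.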
More Answers (1)
Açmae
June 3, 2013
@Daniel and Eric:
In R2012b and later, the RANKSUM function itself removes missing data before running the test, which is equivalent to calling:
p = ranksum(x(~isnan(x)), y(~isnan(y)))
so NaNs are handled for you. If you are using a version older than R2012b, apply this filtering yourself as the workaround.