kstest - normal?

7 views (last 30 days)
Ian on 31 Mar 2011
Latest comment: the cyclist on 30 Jan 2020
Hi, I am confused from reading the description from the 'kstest' function. Usually '1' means true and '0' means false, and the purpose of this function is to test whether or not a set of data is normally distributed. However, what I gather from reading the description, '0' is returned when the data is normally distributed, and '1' is returned when the data is not normally distributed.
Is this the correct interpretation? The example is also a little confusing:
x = -2:1:4
x =
    -2    -1     0     1     2     3     4
[h,p,k,c] = kstest(x,[],0.05,0)
h =
0
p =
0.13632
k =
0.41277
c =
0.48342
These data are linear, not normally distributed. Yet kstest returns '0', which means it classifies these data as normal. Is this a limitation of kstest with small data samples?
From what I read, the resolution is to use the 'smaller' or 'larger' tag to correct for this problem, but is there any clear cut-off for what counts as 'smaller' and what counts as 'larger'?
Lastly, if I were to use this test in a publication and say that our data were 'normal' (this function returned 0) or failed to be classified as 'normal' (this function returned 1), and I used the 'smaller' or 'larger' tags, how does that change the name of the test? It can't be the same test if it returns different values. How would I explain this?

Accepted Answer

Andrew Newell on 31 Mar 2011
Your example (taken from the documentation) "illustrates the difficulty of testing normality in small samples." If you plot
normplot(x)
you'll see that the deviations from a standard normal distribution occur in the two outer points. It doesn't take a lot more data to get a reasonable result, though:
x = -2:0.5:4;
[h,p,k,c] = kstest(x,[],0.05,0)
h =
1
p =
0.0245
k =
0.3947
c =
0.3614
Keep in mind, too, their comment about the Lilliefors test - it is more likely to be the one you want.
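For reference, here is a minimal sketch of how the Lilliefors test could be applied to the same data (this assumes the Statistics Toolbox; lillietest estimates the mean and variance from the sample rather than assuming a standard normal):
x = -2:0.5:4;
[h,p] = lillietest(x)   % h = 0 means normality is not rejected at the default alpha = 0.05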
2 Comments
the cyclist on 31 Mar 2011
Andrew, I think you meant "normplot(x)" rather than "normpdf(x)" here.
Andrew Newell on 31 Mar 2011
Oops!


More Answers (2)

the cyclist on 31 Mar 2011
Ian,
There are lots of things that need to be addressed here. I'll try to cover as much as I can.
First, in your little example, you only have seven data points. Therefore, the statistical test you are applying has very little power to distinguish between normal and non-normal distributions. Note that if you added even one more point, x=-2:1:5, the K-S test would have rejected the null hypothesis, though. I hope that the real study you are planning to submit has more data than this!
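As a quick sketch of that comparison (using the same older kstest(x,cdf,alpha,tail) syntax as the example in the question; newer releases use name-value pairs such as kstest(x,'Alpha',0.05) instead):
x7 = -2:1:4;              % seven points: the null hypothesis is not rejected
x8 = -2:1:5;              % eight points: the null hypothesis is rejected
h7 = kstest(x7,[],0.05,0)
h8 = kstest(x8,[],0.05,0)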
The test certainly does not "classify these data as normal"! It fails to reject the hypothesis that the data are normally distributed. That's an important distinction. Given this dataset, you should not say your data are normal.
The data [-2 -1 0 1 2 3 4] are not, in and of themselves, "linear". They are seven data points that you just happen to know you generated linearly.
The resolution of this issue is not to use the additional arguments "larger" or "smaller". Those arguments are more related to one's expectation that the distribution being sampled is skewed toward one side or the other of normal. I don't think those are relevant here. (But, the way it would be described, if it were relevant, would be to say you used a one-sided KS test rather than two-sided.)
There are other tests of normality that may also be useful to you: jbtest and lillietest.
I would say that if it is important to distinguish normality, then, sadly, you do not have enough data to do so confidently.
6 Comments
N on 29 Jan 2020
On a side note related to the definition of the tails:
  • when using 'Tail' set to 'smaller', we are testing whether the distribution is left skewed
  • when using 'Tail' set to 'larger', we are testing whether the distribution is right skewed
Is this correct?
the cyclist on 30 Jan 2020
% Set random number seed to default
rng default
% Generate data that is clearly shifted larger than standard normal
% (I'm not sure I would refer to this as "right skewed", but I think this is what you mean.)
N = 1000;
x = randn(N,1) + 5;
% Null hypothesis that the distribution is larger than standard normal is NOT rejected
h_larger = kstest(x,'Tail',"larger")
% Null hypothesis that the distribution is unequal to standard normal IS rejected
h_unequal = kstest(x,'Tail',"unequal")
% Null hypothesis that the distribution is smaller than standard normal IS rejected
h_smaller = kstest(x,'Tail',"smaller")



Matt Tearle on 31 Mar 2011
The output is the more likely hypothesis, not a true/false. Hence, h = 0 means the null hypothesis (H0), which is that the data come from the assumed distribution.
The smaller/larger options are for performing one-sided tests, e.g. if your data came from a normal distribution with positive mean.
Other than that, see Andrew's answer. In particular, look at lillietest and jbtest.
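As an illustration, here is a minimal sketch of jbtest on the example data (assuming the Statistics Toolbox; like lillietest, jbtest does not require you to specify the mean and variance in advance):
x = -2:0.5:4;
[h_jb,p_jb] = jbtest(x)   % h_jb = 0 means normality is not rejected at the default alpha = 0.05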
2 Comments
the cyclist on 31 Mar 2011
h=0 does not mean that the null hypothesis is the more likely hypothesis. It means only that the null hypothesis cannot be rejected at the specified level of confidence.
Matt Tearle on 31 Mar 2011
Yes, but given that it returns a single value 0 or 1, I was trying to find a way to phrase that this return is the "decision" (H0 or H1), rather than a true/false.

