
In the chi-square test, how do I calculate the correct number of parameters, and consequently the correct number of degrees of freedom, without using the chi2gof function?

Question
I noticed that the number of degrees of freedom is computed slightly differently in one MATLAB answer and in the chi2gof function:
Option 1: "df = nbins - 1"
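% Reference (Population) and observed (Sample) counts per bin
Population = [996, 749, 370, 53, 9, 3, 1, 0];
Sample = [647, 486, 100, 22, 0, 0, 0, 0];
% Pool bins 4 to 8 into a single bin to avoid very small counts
Population2 = [996, 749, 370, sum(Population(4:8))];
Sample2 = [647, 486, 100, sum(Sample(4:8))];
chi2stat = sum((Sample2-Population2).^2./Population2);
df = length(Population2)-1; % no parameters estimated from the data
pcrit = .05; % significance level
chi2crit = chi2inv(1-pcrit,df); % upper-tail critical value
h2 = chi2stat > chi2crit; % 1 = reject the null hypothesis
p2 = 1 - chi2cdf(chi2stat,df);
fprintf('h=%d, p=%.3f df=%d\n',h2,p2,df);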
h=1, p=0.000 df=3
Option 2: "df = nbins - 1 - nparams"
"chi2gof compares the value of the test statistic to a chi-square distribution with degrees of freedom equal to nbins - 1 - nparams, where nbins is the number of bins used for the data pooling and nparams is the number of estimated parameters used to determine the expected counts."
bins = 0:5;
obsCounts = [6 16 10 12 4 2];
n = sum(obsCounts);
pd = fitdist(bins','Poisson','Frequency',obsCounts');
expCounts = n * pdf(pd,bins);
[h,p,st] = chi2gof(bins,'Ctrs',bins,...
'Frequency',obsCounts, ...
'Expected',expCounts,...
'NParams',1)
h = 0
p = 0.4654
st = struct with fields:
chi2stat: 2.5550
df: 3
edges: [-0.5000 0.5000 1.5000 2.5000 3.5000 5.5000]
O: [6 16 10 12 6]
E: [7.0429 13.8041 13.5280 8.8383 6.0284]

Answers (1)

dpb on 21 Jun 2023
Although you specified 'Ctrs',bins, chi2gof used only 5 bins: the expected counts in the last two bins were each below the default minimum of 5 (the 'EMin' option), so chi2gof pooled them into a single bin. The DOF for the chi-square test statistic is therefore 5-1-1 = 3 rather than the 6-1-1 = 4 that you may have been expecting.
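To make the bin pooling and the resulting DOF concrete, here is a minimal sketch that redoes the calculation by hand, without chi2gof. It assumes that pooling only the upper tail until the expected count reaches 5 is enough to mimic chi2gof's default behaviour for this data; the variable minExpected and the pooling loop are illustrative, not chi2gof's actual implementation.

```matlab
% Data from the question
bins      = 0:5;
obsCounts = [6 16 10 12 4 2];
n         = sum(obsCounts);

% One parameter estimated from the data: the Poisson mean
% (its maximum-likelihood estimate is the weighted sample mean)
lambda    = sum(bins .* obsCounts) / n;
expCounts = n * poisspdf(bins, lambda);

% Pool the upper tail until its expected count reaches the minimum.
% ASSUMPTION: minExpected = 5 mirrors chi2gof's default 'EMin'.
minExpected = 5;
O = obsCounts;
E = expCounts;
while numel(E) > 1 && E(end) < minExpected
    O(end-1) = O(end-1) + O(end);  O(end) = [];
    E(end-1) = E(end-1) + E(end);  E(end) = [];
end
% The lower tail would be pooled the same way; here E(1) is already above 5.

nparams  = 1;                          % lambda was estimated from the data
df       = numel(O) - 1 - nparams;     % nbins - 1 - nparams
chi2stat = sum((O - E).^2 ./ E);
p        = chi2cdf(chi2stat, df, 'upper');
fprintf('nbins=%d, df=%d, chi2stat=%.4f, p=%.4f\n', numel(O), df, chi2stat, p);
```

With the data above, this should reproduce the values reported by chi2gof in the question (5 bins, df = 3, chi2stat = 2.5550, p = 0.4654).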
  Comments: 5
dpb on 22 Jun 2023
Edited: dpb on 23 Jun 2023
I'm not sure what the remaining puzzle is, so I don't know what to add beyond what I've already said.
The correction to the DOF, beyond the number of (pooled) bins minus one, is simply the number of distribution parameters that were estimated from the data itself in order to calculate the expected counts per bin. That correction applies only IF ("the big if") the parameter values of the theoretical distribution are based on the input data.
If you test against counts from a theoretical distribution that was obtained from other considerations, then you have not estimated any parameters from the count data itself, and nParams=0.
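As a rough illustration of how much nParams matters, the sketch below takes the pooled counts printed in the question's chi2gof output and evaluates the same test statistic against both choices of DOF. The nparams = 0 row is purely arithmetic: for this data the Poisson mean was estimated from the counts, so nparams = 1 is the applicable case.

```matlab
% Pooled observed and expected counts, copied from the chi2gof output above
O = [6 16 10 12 6];
E = [7.0429 13.8041 13.5280 8.8383 6.0284];

chi2stat    = sum((O - E).^2 ./ E);
nbins       = numel(O);
nparamsList = [0 1];          % 0: parameters known a priori, 1: one estimated

dfs = zeros(size(nparamsList));
ps  = zeros(size(nparamsList));
for k = 1:numel(nparamsList)
    dfs(k) = nbins - 1 - nparamsList(k);
    ps(k)  = chi2cdf(chi2stat, dfs(k), 'upper');
    fprintf('nparams=%d: df=%d, p=%.4f\n', nparamsList(k), dfs(k), ps(k));
end
```

The statistic is identical in both rows; only the reference distribution changes, so each estimated parameter removes one degree of freedom and lowers the p-value for the same chi2stat.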
