How to quantify the goodness of a fit?

Question

Sofia Flora 2023년 1월 20일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1897665-how-to-quantify-the-goodness-of-a-fit

댓글: Sofia Flora 2023년 1월 20일

Hello, I need to quantify how much a fit of a PDF (Probability Density Function) is good. I have my data set with its PDF and its fit. I decided to use the function chi2gof. Since it's the first time I'm using it, I decided to run a test script first.

I decided to generate gaussian (one of the easiest PDFs) random variables with mu=3 and calculate the observed PDF. I know for sure which is the correct PDF. The problem is that when I apply the chi2gof function, I obtain the rejection of the null hypothesis (that says that the probability distribution is the same)! I don't understand what I'm doing wrong. I attach my test code:

ntot=1000;

x=randn([ntot,1])+3;

[bins,hist]=my_hist(x);

hist=hist';

my_O=ntot*hist;

gauss=exp(-0.5*(bins-3).^2)/sqrt(2*pi);

my_E=ntot*gauss;

figure

plot(bins,hist); hold on

plot(bins,gauss)

[h,p,stats] = chi2gof(bins,'Ctrs',bins,'Expected',my_E,'Frequency',my_O)

function [bins,hist]=my_hist(input)

input=input(isfinite(input));

h=histogram(input,'Normalization','pdf');

hist=h.Values;

b=h.BinEdges;

bins=NaN.*ones(length(b)-1,1);

norm=0;

for k=1:length(bins)

bins(k)=(b(k)+b(k+1))/2;

norm=norm+(b(k+1)-b(k))*hist(k);

end

%disp(norm)

close

end

Any help is greatly appreciated!

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

the cyclist 2023년 1월 20일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1897665-how-to-quantify-the-goodness-of-a-fit#answer_1152885

편집: the cyclist 2023년 1월 20일

MATLAB Online에서 열기

EDIT: My first posting on this was incomplete, so I radically edited it. Sorry for any confusion if you saw the first version.

I noticed that your expected bin totals my_E do not sum to the value of ntot. The reason for this is that you have mistakenly used gauss as the bin probability, not as the probability density. You need to multiply by the bin width.

You made the same mistake in my_O.

I think you may also be making a mistake in using bin edges where bin centers are expected, but I did not follow up on this.

rng default

ntot=1000;

x=randn([ntot,1])+3;

[bins,hist]=my_hist(x);

hist=hist';

bin_width = bins(2) - bins(1);

my_O=ntot*hist*bin_width;

gauss=exp(-0.5*(bins-3).^2)/sqrt(2*pi);

my_E=ntot*gauss*bin_width;

figure

plot(bins,hist); hold on

plot(bins,gauss)

[h,p,stats] = chi2gof(bins,'Ctrs',bins,'Expected',my_E,'Frequency',my_O)

h = 0

p = 0.2150

stats = struct with fields:

chi2stat: 21.2561 df: 17 edges: [-0.3000 0.6000 0.9000 1.2000 1.5000 1.8000 2.1000 2.4000 2.7000 3.0000 3.3000 3.6000 3.9000 4.2000 4.5000 4.8000 5.1000 5.4000 6.6000] O: [6 15.0000 17.0000 27 53.0000 79 85.0000 101.0000 127.0000 110.0000 124.0000 95.0000 67.0000 32.0000 27.0000 16.0000 6.0000 13.0000] E: [7.5349 9.5219 17.8784 30.6795 48.1150 68.9646 90.3412 108.1581 118.3438 118.3438 108.1581 90.3412 68.9646 48.1150 30.6795 17.8784 9.5219 7.8464]

function [bins,hist]=my_hist(input)

input=input(isfinite(input));

h=histogram(input,'Normalization','pdf');

hist=h.Values;

b=h.BinEdges;

bins=NaN.*ones(length(b)-1,1);

norm=0;

for k=1:length(bins)

bins(k)=(b(k)+b(k+1))/2;

norm=norm+(b(k+1)-b(k))*hist(k);

end

%disp(norm)

close

end

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

Sofia Flora 2023년 1월 20일

Thank you very much!

댓글을 달려면 로그인하십시오.

How to quantify the goodness of a fit?

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

추가 답변 (0개)

참고 항목

카테고리

태그

Community Treasure Hunt

How to quantify the goodness of a fit?

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 1 이전 댓글 -1개 표시이전 댓글 -1개 숨기기

추가 답변 (0개)

참고 항목

카테고리

태그

Community Treasure Hunt

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기