Why does my log-normal distribution not fit my data?

David McVea on 4 Dec 2018
Commented: David McVea on 5 Dec 2018
Hello,
I am fitting some relatively simple data with a log-normal distribution.
I am then generating a probability distribution from that fit.
Shouldn't this roughly match the initial data?
When I plot it on top of a normalized histogram of the data, the shape is appropriate but the scale is about ten-fold lower.
Example below.
parmat = lognfit(data)
pdf = lognpdf(0:1:1000,parmat(1),parmat(2))
figure;hold on
histogram(data,[0:1:1000],'normalization','probability')
plot(0:1:1000,pdf)
In this case, shouldn't the probability density function approximate the histogram, rather than being one tenth or less of the histogram's probability values?
Thanks.
  1 Comment
dpb on 4 Dec 2018
Seems reasonable; we would probably need to see the data to be able to decipher what actually happened.


Accepted Answer

John D'Errico on 5 Dec 2018
Edited: John D'Errico on 5 Dec 2018
Consider this example:
X = lognrnd(0,1,[1,1000]);
histogram(X,100,'normalization','probability')
hold on
ezplot(@(x) lognpdf(x,0,1))
So the two plots seem scaled wrong.
But the histogram normalization chosen was one such that the sum of the bar heights is 1. Consider the other choices available, though.
ezplot(@(x) lognpdf(x,0,1))
hold on
histogram(X,100,'normalization','pdf')
You made what may have seemed a reasonable choice of normalization, but in hindsight it was the wrong one. Remember that a PDF does not return actual probabilities; it returns densities, which is a frequent source of consternation when someone sees a PDF that returns numbers greater than 1.
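To make the difference between the two normalizations concrete, here is a minimal sketch using histcounts on the same kind of sample as above. The seed, the bin edges, and the variable names are my own choices for illustration, not part of the original example.

```matlab
% Sketch: compare 'probability' and 'pdf' normalization on identical bins.
rng(0);                              % arbitrary fixed seed, for repeatability only
X = lognrnd(0, 1, [1, 1000]);        % same kind of sample as in the example above
edges = linspace(0, max(X), 101);    % 100 equal-width bins covering all of X
w = diff(edges);                     % bin widths

p = histcounts(X, edges, 'Normalization', 'probability');  % height = count / N
d = histcounts(X, edges, 'Normalization', 'pdf');          % height = count / (N * width)

sumOfBars   = sum(p);                % 'probability': bar heights sum to 1
areaOfBars  = sum(d .* w);           % 'pdf': area under the bars is 1
maxMismatch = max(abs(p - d .* w));  % the two differ only by the bin width
```

Since each 'probability' bar is the 'pdf' bar multiplied by its bin width, the two scales agree only when the bins have width 1.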
  1 Comment
David McVea on 5 Dec 2018
That makes sense, thank you.
When I change the normalization to pdf, it matches.
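For reference, the change might look roughly like the sketch below. The data here are synthetic (a hypothetical stand-in drawn with lognrnd), and the bin width of 10 and the name fitted are arbitrary choices; fitted is used instead of pdf so the variable does not shadow the toolbox function of that name.

```matlab
% Sketch: overlay a fitted log-normal PDF on a histogram of the same data.
rng(1);                                  % arbitrary seed, for repeatability only
data = lognrnd(4, 0.5, [1, 5000]);       % hypothetical stand-in for the real data

parmat = lognfit(data);                  % parmat(1) = mu, parmat(2) = sigma of log(data)
edges  = 0:10:1000;                      % bin edges; the width of 10 is arbitrary
xs     = 0:1:1000;                       % grid on which the fitted PDF is evaluated
fitted = lognpdf(xs, parmat(1), parmat(2));

figure; hold on
histogram(data, edges, 'Normalization', 'pdf');  % bars are on a density scale
plot(xs, fitted, 'LineWidth', 1.5);              % fitted curve on the same scale
xlabel('data'); ylabel('probability density');
legend('data (pdf-normalized)', 'fitted lognpdf');
```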


More Answers (1)

Jeff Miller on 4 Dec 2018
The probability density function should only approximate the histogram in shape, not in height. Remember, the PDF is defined such that its total area (integral) is 1 over the whole range of the random variable. The total area under the histogram is much more than that.
  3 Comments
Jeff Miller on 4 Dec 2018
Yes, I think so. According to the docs, that normalization produces a sum of the bar heights equal to 1. But the integral depends on the range along the horizontal axis as well as on the bar heights.
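A small sketch of that point, assuming equal-width bins and a synthetic sample (the seed, the bin counts, and the variable names are illustrative choices only): the bar heights always sum to 1, while the area under the bars equals the bin width.

```matlab
% Sketch: with 'probability' normalization the heights sum to 1, but the
% area under the bars equals the bin width (for equal-width bins).
rng(2);                               % arbitrary seed, for repeatability only
X = lognrnd(0, 1, [1, 1000]);         % synthetic sample for illustration

for nbins = [10 100]
    edges = linspace(0, max(X), nbins + 1);   % equal-width bins covering all of X
    w = edges(2) - edges(1);                  % common bin width
    p = histcounts(X, edges, 'Normalization', 'probability');
    area = sum(p) * w;                        % total area under the bars
    fprintf('%3d bins: sum of heights = %.3f, area = %.3f, bin width = %.3f\n', ...
        nbins, sum(p), area, w);
end
```

So if you want to keep the 'probability' bars, one option is to multiply the PDF values by the bin width before plotting them on top.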
David McVea on 5 Dec 2018
OK, thanks.
Do you have advice on how a fitted distribution and the underlying data should be overlaid to allow comparison?
David


Release

R2015a
