Statistics : Data binning vs increased size

Youssef Khmou, 5 Oct 2013
Commented: Youssef Khmou, 6 Oct 2013
Dear Users,
Suppose you have data that is recorded daily (say, temperature at different locations over 4 years). Whenever you compute a histogram, whether one- or two-dimensional, the number of bins dramatically affects how the data appear. The default here is 256. There are techniques for optimizing that number, but is 256 fine?
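For a sense of scale, here is a minimal sketch of two commonly cited bin-count rules, Sturges and Freedman-Diaconis, written in plain Python rather than MATLAB. The function names and the sample size of 1461 (four years of daily readings with one leap day) are assumptions made for illustration, not part of the original question.

```python
import math
import statistics

def sturges_bins(n):
    """Sturges' rule: ceil(log2(n)) + 1 bins for n samples."""
    return math.ceil(math.log2(n)) + 1

def freedman_diaconis_bins(data):
    """Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)."""
    n = len(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles of the sample
    width = 2.0 * (q3 - q1) / n ** (1.0 / 3.0)
    if width == 0:
        return 1  # degenerate case: no spread between the quartiles
    return max(1, math.ceil((max(data) - min(data)) / width))

n_days = 4 * 365 + 1  # assumed: four years of daily readings, one leap day
print(sturges_bins(n_days))  # -> 12
```

Both rules are only heuristics, but for a sample of this size they suggest far fewer bins than 256.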
Comments: 4
Cedric, 6 Oct 2013 (edited 6 Oct 2013)
As far as I am concerned, tests and statistics are analytical, and histograms (as distinct from optimal binning) are for data visualization only. So for me the ideal bin size is the one that shows the major behaviors and smooths out small-scale fluctuations; in other words, it's a question of scale. If I had to build a "cheap" automatic bin-size adjustment algorithm (e.g. if MATLAB had to output a series of figures for automatic report generation), I guess I would just implement a loop that starts at 3 bins and increases the number of bins until the number of sign changes in the derivative of the histogram exceeds a certain threshold (which could depend on the number of bins).
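The loop described above could be sketched roughly as follows, in plain Python rather than MATLAB. The names `histogram_counts`, `sign_changes` and `auto_bins`, the fixed threshold of 3 sign changes, and the cap of 256 bins are all assumptions for illustration; the comment leaves the threshold open and notes it could depend on the number of bins.

```python
def histogram_counts(data, n_bins):
    """Counts of data in n_bins equal-width bins spanning [min, max]."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data
    counts = [0] * n_bins
    for x in data:
        i = min(int((x - lo) / width), n_bins - 1)  # x == max goes in last bin
        counts[i] += 1
    return counts

def sign_changes(counts):
    """Number of sign changes in the first difference of the counts."""
    # Flat steps (zero difference) are skipped so they do not count as changes.
    diffs = [b - a for a, b in zip(counts, counts[1:]) if b != a]
    return sum(1 for d0, d1 in zip(diffs, diffs[1:]) if (d0 > 0) != (d1 > 0))

def auto_bins(data, max_changes=3, max_bins=256):
    """Largest bin count, starting from 3, reached before the histogram's
    slope changes sign more than max_changes times."""
    best = 3
    for n_bins in range(3, max_bins + 1):
        if sign_changes(histogram_counts(data, n_bins)) > max_changes:
            break
        best = n_bins
    return best
```

Calling `auto_bins(samples)` returns the last bin count tried before the histogram became too jagged. Replacing the constant `max_changes` with a function of `n_bins` would cover the variant where the threshold depends on the number of bins.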
Youssef Khmou, 6 Oct 2013 (edited 6 Oct 2013)
All of what you said is correct. Indeed, I ran a test: each time the nth-order statistics are computed as the number of bins N increases, the results diverge, whereas they are supposed to be stable (reaching an asymptote from below).
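The comment does not say how the nth-order statistics were computed. As one possible reading, the sketch below assumes they are central moments estimated from bin centers and counts, and compares them with the same moment computed from the unbinned samples. The function names and the synthetic Gaussian "temperatures" are assumptions for illustration, not the data from the test.

```python
import random

def binned_moment(data, n_bins, order):
    """Central moment of the given order, estimated from bin centers and counts."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins  # assumes the data are not all identical
    counts = [0] * n_bins
    for x in data:
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    n = len(data)
    mean = sum(c * k for c, k in zip(centers, counts)) / n
    return sum(k * (c - mean) ** order for c, k in zip(centers, counts)) / n

def raw_moment(data, order):
    """The same central moment computed directly from the samples."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)

rng = random.Random(0)  # synthetic stand-in for the daily temperatures
temps = [rng.gauss(15.0, 8.0) for _ in range(1461)]
for n_bins in (8, 32, 128, 256, 1024):
    print(n_bins, binned_moment(temps, n_bins, 2), raw_moment(temps, 2))
```

Under this reading each sample is moved by at most half a bin width, so the binned estimate should settle toward the unbinned value as N grows. If the values in a test drift instead, comparing against the unbinned value in this way may help separate a binning effect from a coding error or from a different definition of the statistic.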


Answers (0)
