How is the number of bins chosen with the auto binning algorithm in histcounts?

When I call the function histcounts as follows:
[N,edges] = histcounts(data)
It returns counts for some number of bins together with the corresponding bin edges. How is the number of bins determined? The documentation at https://se.mathworks.com/help/matlab/ref/histcounts.html only says "...uses an automatic binning algorithm...".
What kind of algorithm is this, and where can I find documentation about it?
  Comments: 2
Richard on 10 Jul 2018
I have the exact same question. This really is the sort of thing that should be in the documentation. At least a reference to the algorithm should be provided.
Weiqiang Zhao on 9 Oct 2020
I agree, there isn't any reference to the "automatic binning algorithm", which is confusing to users.


Accepted Answer

Anton Semechko on 10 Jul 2018
'histcounts' first estimates the width of the histogram bins using 'scottsrule' (Scott's rule):
rawBinWidth = 3.5*std(data)/(numel(data)^(1/3));
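To get a feel for this estimate, here is a minimal sketch that evaluates the formula on evenly spaced samples. The sample values and the variable names dataDense and rawBinWidthDense are illustrative assumptions, not part of the question or of 'histcounts'.

```matlab
% Illustrative, evenly spaced samples (not data from the question).
data = linspace(0, 100, 1001);

% Scott's rule exactly as quoted above.
rawBinWidth = 3.5*std(data)/(numel(data)^(1/3));

% Same range, eight times as many samples: the cube root of the
% sample size doubles, so the estimated width roughly halves.
dataDense = linspace(0, 100, 8008);
rawBinWidthDense = 3.5*std(dataDense)/(numel(dataDense)^(1/3));

fprintf('n = %4d: rawBinWidth = %.3f\n', numel(data), rawBinWidth);
fprintf('n = %4d: rawBinWidth = %.3f\n', numel(dataDense), rawBinWidthDense);
```

The raw width depends only on the spread of the data and the sample size, so more samples with the same spread give narrower bins.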
It then passes this info along with the minimum and maximum values of input data (xmin and xmax, resp.) to the 'binpicker' function which first adjusts rawBinWidth depending on its order of magnitude:
powOfTen = 10.^floor(log10(rawBinWidth)); % next lower power of 10
relSize = rawBinWidth / powOfTen; % guaranteed in [1, 10)
if relSize < 1.5
    binWidth = 1*powOfTen;
elseif relSize < 2.5
    binWidth = 2*powOfTen;
elseif relSize < 4
    binWidth = 3*powOfTen;
elseif relSize < 7.5
    binWidth = 5*powOfTen;
else
    binWidth = 10*powOfTen;
end
and then computes the total number of bins and positions of the left- and right-most histogram edges:
leftEdge = min(binWidth*floor(xmin ./ binWidth), xmin);
nbinsActual = max(1, ceil((xmax-leftEdge) ./ binWidth));
rightEdge = max(leftEdge + nbinsActual.*binWidth, xmax);
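As a worked example of the rounding and edge steps, the sketch below runs them on made-up inputs. The values of rawBinWidth, xmin and xmax are hypothetical, and the upperBounds/multipliers lookup is only a compact restatement of the if/elseif chain quoted above, not how the source is written.

```matlab
% Hypothetical inputs chosen for illustration only.
rawBinWidth = 3.2;
xmin = 0.37;
xmax = 98.2;

powOfTen = 10.^floor(log10(rawBinWidth));   % next lower power of 10
relSize  = rawBinWidth / powOfTen;          % in [1, 10)

% Compact restatement of the if/elseif chain quoted above.
upperBounds = [1.5 2.5 4 7.5 Inf];
multipliers = [1 2 3 5 10];
binWidth = multipliers(find(relSize < upperBounds, 1)) * powOfTen;

leftEdge    = min(binWidth*floor(xmin ./ binWidth), xmin);
nbinsActual = max(1, ceil((xmax-leftEdge) ./ binWidth));
rightEdge   = max(leftEdge + nbinsActual.*binWidth, xmax);

% Uniformly spaced edges between leftEdge and rightEdge.
edges = leftEdge + (0:nbinsActual).*binWidth;
edges(end) = rightEdge;
```

With these inputs the raw width of 3.2 is rounded to a bin width of 3, giving 33 bins that run from 0 to 99.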
The remaining bin edges are distributed uniformly between leftEdge and rightEdge at binWidth intervals. You can inspect the source code for the 'binpicker' function by typing:
edit histcounts
into your command prompt.
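If you want to check the description against your own installation, a rough sketch like the following reproduces the quoted steps by hand and prints them next to what 'histcounts' returns. The data are illustrative, and agreement is not guaranteed in general: the source may take other branches (for example for different kinds of input) that this answer does not cover.

```matlab
% Illustrative non-integer data (not from the question).
data = linspace(0.37, 98.2, 500);

% Manual reproduction of the steps quoted in this answer.
rawBinWidth = 3.5*std(data)/(numel(data)^(1/3));
powOfTen = 10.^floor(log10(rawBinWidth));
relSize  = rawBinWidth / powOfTen;
upperBounds = [1.5 2.5 4 7.5 Inf];      % same thresholds as the if/elseif chain
multipliers = [1 2 3 5 10];
binWidth = multipliers(find(relSize < upperBounds, 1)) * powOfTen;

xmin = min(data);
xmax = max(data);
leftEdge    = min(binWidth*floor(xmin ./ binWidth), xmin);
nbinsActual = max(1, ceil((xmax-leftEdge) ./ binWidth));
rightEdge   = max(leftEdge + nbinsActual.*binWidth, xmax);

% What histcounts actually returns for the same data.
[N, edges] = histcounts(data);

fprintf('manual    : %d bins, [%g, %g]\n', nbinsActual, leftEdge, rightEdge);
fprintf('histcounts: %d bins, [%g, %g]\n', numel(N), edges(1), edges(end));
```

If the two printed lines differ for your data, that is a hint that 'histcounts' followed a path other than the one described here, and the source is the place to look.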
I imagine the reason MathWorks does not describe how 'histcounts' determines the bin edges is that the process involves several steps and isn't based on a single simple formula.
