function hist3 number of bins with 'Edges' option doesn't count the bins right

I am using hist3 to histogram 2D scattered data. I intended to use the 'Edges' option the same way as in histogram, but it does not behave the same. I defined an edges cell to bin my data into 40x40 bins using something like {0:1/40:1 0:1/40:1}. This gives 41 edges on each axis and thus should yield 40x40 bins, but I get 41x41. If I use the same cell as centers with the 'Ctrs' option I also get 41x41, which is right. Something is fishy here. The 'Edges' option doesn't seem to work right. Is that a bug or am I doing something wrong?
The code looks something like this:
%define bin edges
%limits = [xmin xmax ymin ymax]
%bins = [binNumX binNumY]
binSizeX = (limits(2)-limits(1))/bins(1);
binSizeY = (limits(4)-limits(3))/bins(2);
edgsX = limits(1):binSizeX:limits(2);
edgsY = limits(3):binSizeY:limits(4);
edgs = {edgsX edgsY}; %what I wanted to use in hist3, but doesn't work as expected, 1 bin too much
%create histograms
%2D histogram
%define bin centers (edges option does some crap and they are needed anyway)
ctrsX = edgsX-binSizeX/2; ctrsX(1) = [];
ctrsY = edgsY-binSizeY/2; ctrsY(1) = [];
%add two bins to be removed afterwards (unwanted open bins)
ctrs = {[ctrsX(1)-binSizeX ctrsX ctrsX(length(ctrsX))+binSizeX] [ctrsY(1)-binSizeY ctrsY ctrsY(length(ctrsY))+binSizeY]};
%scattered data points = [Ax1 Ax2]
%Hist2Dbin = hist3([Ax1 Ax2],'Edges',edgs)'; %gives one bin too much
Hist2Dbin = hist3([Ax1 Ax2],'Ctrs',ctrs)';
%remove first and last bins (open bins, because of 'Ctrs' option?)
Hist2Dbin = Hist2Dbin(2:size(Hist2Dbin,1)-1,2:size(Hist2Dbin,2)-1);
Ax1bin = [ctrsX; sum(Hist2Dbin,1)]';
Ax2bin = [ctrsY' sum(Hist2Dbin,2)];
%1D histograms
Ax1binTotal = [ctrsX; histcounts(Ax1,edgsX)]';
Ax2binTotal = [ctrsY; histcounts(Ax2,edgsY)]';
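The size difference described in the question can be reproduced with a minimal sketch like the one below. It assumes the Statistics and Machine Learning Toolbox is installed (for hist3); the random data `X` and the names `N3` and `N2` are illustrative and not part of the original post.

```matlab
% Illustrative data in [0,1]; X is hypothetical, not the poster's data
rng(0);
X = rand(1000, 2);
edgs = {0:1/40:1, 0:1/40:1};          % 41 edges per axis

% hist3 with 'Edges' returns one count per edge (41-by-41 here, as reported)
N3 = hist3(X, 'Edges', edgs);

% histcounts2 returns one count per interval between edges (40-by-40 here)
N2 = histcounts2(X(:,1), X(:,2), edgs{1}, edgs{2});

disp(size(N3))
disp(size(N2))
```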
  Comments: 8
Adam Danz on 16 Aug 2018
I see. When I asked for a section of your code, I was assuming it included the lines that were problematic to you. I'll provide an answer below.
Johann Thurn on 16 Aug 2018
I included them. They are just commented out. I have received an explanation for this behaviour in the meantime. I will just copy the mail here:
"I understand that you are using 'hist3' with the 'Edges' option and are observing some discrepancies compared to the 'edges' of 'histogram'.
This behavior occurs because, when data points fall on the upper bound of the edges supplied, the "hist3" function creates an additional bin for these points. So the binning is as follows for points strictly inside the range:
edges{1}(i) <= X(k,1) < edges{1}(i+1)
edges{2}(j) <= X(k,2) < edges{2}(j+1)
and a new bin is created for points where
X(k,1) = edges{1}(end) or X(k,2) = edges{2}(end).
The "histcounts" function, on the other hand, behaves differently. For this function, the binning is as follows for points strictly inside the range:
edges(i) <= X(k) < edges(i+1)
and as follows for points that fall in the last bin, which includes the upper bound:
edges(end-1) <= X(k) <= edges(end)
The two functions are part of different toolboxes: "histcounts" is part of the MATLAB toolbox, while "hist3" is part of the Statistics toolbox, which might explain the different choice of implementation. Our development team is aware of this discrepancy and can consider changing it in a future release of MATLAB.
As a workaround and to ensure that you get consistent behavior between the functions, you can set the upper limit of your edges to be the maximum value in your data set plus eps. The "eps" function will add a small value to the upper edge so that points that fall on the upper edge are correctly binned."
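As an alternative to shifting the upper edge, the extra hist3 row and column can be folded into the preceding bins, which should reproduce the histcounts2 convention for the last bin. The sketch below is only an illustration under that assumption: the data `X` is made up, and `Nref` is computed purely for comparison.

```matlab
% Hypothetical data that includes points lying exactly on the upper edges
rng(0);
X = [rand(500, 2); 1 0.5; 0.5 1; 1 1];
edgs = {0:1/40:1, 0:1/40:1};

% 41-by-41: the last row/column count points equal to the upper edge
N = hist3(X, 'Edges', edgs);

% Fold the extra row and column into the preceding bins, then drop them
N(end-1, :) = N(end-1, :) + N(end, :);
N(:, end-1) = N(:, end-1) + N(:, end);
N = N(1:end-1, 1:end-1);               % now 40-by-40

% Reference result using the histcounts2 binning convention
Nref = histcounts2(X(:,1), X(:,2), edgs{1}, edgs{2});
sameCounts = isequal(N, Nref);
```

Two caveats on the eps workaround quoted above: as far as I can tell, it still leaves a 41-by-41 result whose last row and column are simply empty, so they would need to be dropped anyway. Also, adding plain `eps` changes nothing when the upper edge has magnitude 2 or more; `eps(x)`, which returns the floating-point spacing at `x`, is the safer increment in that case.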

Accepted Answer

Adam Danz on 16 Aug 2018
Edited: Adam Danz on 6 Feb 2020
The second output of hist3() has one extra value because, with the 'Edges' option, hist3 adds a final bin for data that falls exactly on the last edge. This is explained in the documentation.
Look at the values of your 'edges' and the values of the output bins.
edges = 0 : 0.025 : 1;
length(edges)
ans = 41
[N, c] = hist3(data, 'Edges', {edges edges}); % c is a cell array of bin centers
length(c{1})
ans = 41
c{1}
ans = [0.0125 : 0.025 : 1.0125];
Notice that the last bin center is greater than 1, which was your last edge.
Read more about this in the link I provided under 'edges'.
  Comments: 1
Johann Thurn on 16 Aug 2018
Ah, this is consistent with what I got from MATLAB staff in the meantime. I copied the mail above. Now I get it. I just assumed the behaviour was the same as in histogram or histcounts. My mistake.
Thanks a lot :)

More Answers (1)

Steven Lord on 16 Aug 2018
Instead of using hist3 I recommend you use histogram2.
  Comments: 3
Steven Lord on 16 Aug 2018
Ah, if you just want the binned data and not the figure then use histcounts2 instead.
Johann Thurn on 16 Aug 2018
Oh, I must have overlooked that one. Thanks! Nevertheless, I still believe hist3 with edges is faulty.
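Steven Lord's suggestion could look roughly like the sketch below, which uses histcounts2 to get the 40x40 counts directly from the edge vectors. The data and the variable names (`x`, `y`, `xedges`, `yedges`, `margX`, `margY`) are illustrative assumptions, not taken from the thread.

```matlab
% Illustrative data; x, y and the edge vectors are hypothetical
rng(0);
x = rand(1000, 1);
y = rand(1000, 1);
xedges = 0:1/40:1;                     % 41 edges -> 40 bins
yedges = 0:1/40:1;

% Counts only, no figure; rows follow x bins, columns follow y bins
N = histcounts2(x, y, xedges, yedges);

% Transpose to match the orientation used in the question (rows = y)
Hist2Dbin = N';

% Marginal counts; these agree with the 1D histcounts results when all
% points lie inside the edge range
margX = sum(N, 2)';
margY = sum(N, 1);
```

With this approach the open-bin padding and trimming from the question is no longer needed, since the last bin already includes its upper edge.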

Release: R2017b
