splitapply doesn't split well into bins

Amit Ifrach on 11 Oct 2021
Commented: Matt J on 13 Oct 2021
לק"י
Hi guys,
I wanted splitapply to split the data into 90 different bins, but for some reason it returns only 50.
Here is the process I followed:
First, 'cell1areas' (size 18800x1), a variable containing a vector of areas, was loaded.
Then the 'bins' or 'groups', from 0 to 90000 in steps of 1000, were created in the 'edges' variable.
After that, the discretize function was applied to the area vector. The maximum value of the resulting variable dis is 62 (max(dis)).
The isfinite function was applied to check whether each value is a finite number or NaN, and the result was stored in 'valid'.
Finally, splitapply was called with @sum to sum all the values in each group.
The problem is that the spltsum variable has only 50 'bins' (vector elements) instead of the desired 90 (the number of bins in edges), or even 62, given that discretize returned bin numbers only up to 62 and not 90.
Thanks in advance, this community is great and really helpful!
The code:
edges=[0 0:1000:90000 90000];
dis=discretize(cell1areas, edges);
valid=isfinite(cell1areas);
spltsum=splitapply(@sum , cell1areas(valid) , findgroups(dis(valid)) );
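
The symptom described above can be reproduced on a small scale. The following is a minimal sketch with made-up values (the names areas, bin, g and binId are illustrative and not from the original post). It shows that findgroups renumbers the occupied bins consecutively, so splitapply returns one result per occupied bin rather than one per bin defined by the edges:

```matlab
% Hypothetical toy data, not the poster's cell1areas
areas = [150; 250; 2500; 2700; 4100];
edges = 0:1000:5000;                 % 5 bins, each 1000 wide

bin = discretize(areas, edges);      % bin index per value: [1; 1; 3; 3; 5]

% findgroups maps the distinct bin indices to consecutive group numbers,
% so the empty bins 2 and 4 disappear from the numbering.
[g, binId] = findgroups(bin);        % g = [1; 1; 2; 2; 3], binId = [1; 3; 5]

sums = splitapply(@sum, areas, g);   % one sum per occupied bin: 3 elements
```

Here sums has 3 elements even though the edges define 5 bins; binId records which original bin each element belongs to.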

Accepted Answer

Matt J on 11 Oct 2021
Edited: Matt J on 13 Oct 2021
You can use accumarray instead.
spltsum=accumarray(dis(valid), cell1areas(valid) , [90,1]);
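
As an illustration of this suggestion, here is a small sketch on made-up data (areas, nBins and the other names are assumptions for the example, not the poster's variables). It uses the bin indices from discretize directly as subscripts, so each output row corresponds to one bin of the edges vector and empty bins show up as zeros:

```matlab
% Hypothetical toy data, including a NaN entry
areas = [150; 250; NaN; 2500; 2700; 4100];
edges = 0:1000:5000;
nBins = numel(edges) - 1;            % 5 bins

bin   = discretize(areas, edges);    % NaN for NaN input or values outside the edges
valid = ~isnan(bin);                 % keep only values that landed in a bin

% One row per bin; bins without data stay at 0.
sums = accumarray(bin(valid), areas(valid), [nBins, 1]);
```

Filtering on ~isnan(bin) rather than on the data itself is a design choice in this sketch: it also drops values that fall outside the edge range, which would otherwise produce NaN subscripts.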
5 Comments
Amit Ifrach on 13 Oct 2021
לק"י
Thanks!
And another (last) question: I want the data to be split into bins defined by:
edges=[0 0:1000:90000 90000];
But as far as I understand, accumarray arbitrarily divides the data into 90 bins without regard to the required bin width (because of the last argument, [90,1]). Is that true?
spltsum=accumarray(dis(valid), cell1areas(valid) , [90,1]);
If so, I need a way to split the data by the edges vector alone.
Or, to put it another way:
I assume accumarray only sums the values in cell1areas that share the same bin (the bin index, an integer).
The binning of cell1areas is done beforehand by the discretize function (the dis variable in this example).
accumarray then only sums all the values in cell1areas that have the same bin index (as stored in dis).
If so, why do I need to pass the [90,1] vector to accumarray? It should know that I want 90 bins, each 1000 wide, up to the value 90000, not arbitrary values that MATLAB thinks are suitable for dividing the data I give it.
Thanks!
Matt J on 13 Oct 2021
Not all 90 bins contain counts. If you don't tell accumarray how many bins you have, it will assume you only have max(dis(valid)) bins.
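
A short sketch of this behaviour, using made-up subscripts and values (subs, vals, a and b are illustrative names): the size argument only fixes the length of the output, while the assignment of values to bins comes entirely from the subscripts.

```matlab
% Hypothetical bin indices (as discretize would return) and values
subs = [1; 1; 3];
vals = [10; 20; 5];

a = accumarray(subs, vals);           % length is max(subs) = 3: [30; 0; 5]
b = accumarray(subs, vals, [6, 1]);   % padded to 6 rows: [30; 0; 5; 0; 0; 0]

% The sums in the shared rows are identical; the size argument changes
% nothing about which value goes into which bin.
sameSums = isequal(a, b(1:numel(a)));
```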
