group by indexing discontinuous timetable

Eric Escoto on 18 Nov 2021
Commented: Kelly Kearney on 19 Nov 2021
I am trying to identify groupings based on two temporal criteria from a discontinuous timetable array. The timetable is aggregated to 1-minute intervals.
The criteria are:
1. Minimum of five continuous rowtime minutes,
2. At least one rowtime less than 15 minutes from #1 (above) or from a row already included under #2.
Put another way: a group needs at least five consecutive minutes of data, and after that, any minute that falls within 15 minutes of the last included minute is added to the group.
In the end, I want to be able to perform algebraic and other operations on the variable contained in each new grouping (for example taking the sum of each group).
Example data is provided. It looks like there should be a total of 4 groupings. One of the rows (113) is a time that doesn't meet the criteria above and thus should either be discarded or flagged differently from the groups so I can remove it later (maybe by a 'nan' flag?).

Accepted Answer

Kelly Kearney on 19 Nov 2021
Here's one possible solution; it first checks for criterion 2 and then goes back to verify criterion 1. You can probably do it all in one fell swoop, but finding consecutive runs can get messy, so I prefer to keep that on its own. I also opted not to mark the extra groups with a NaN, because splitapply and other grouping functions insist on consecutive-integer groupings; I find it cleaner to filter those out after the fact.
% First group by the within-15-minutes criterion
dt = minutes(diff(test1.TIMESTAMP));
grp = cumsum([true; dt>15]);
% Now check that each subgroup has at least a 5-minute run of consecutives.
% find(...)+1 is the first row of each new run, so diff gives the run lengths.
maxrun = @(x) max(diff([1; find(minutes(diff(x)) > 1)+1; length(x)+1]));
isgood = splitapply(@(x) maxrun(x)>=5, test1.TIMESTAMP, grp);
% Sum of each group; groups that fail the 5-minute check are set to NaN
grpsum = splitapply(@sum, test1.mean_V, grp);
grpsum(~isgood) = NaN;
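To drop rows such as 113 from the timetable itself, rather than only from the per-group results, the group-level flag can be expanded to one flag per row by indexing it with grp. The following is a minimal sketch on synthetic data, not the attached file: the timestamps and values are invented for illustration, and rowgood, cleaned, and newgrp are hypothetical names. Only test1, TIMESTAMP, and mean_V come from the thread.

```matlab
% Synthetic 1-minute data (invented values, not the attached file):
% a 6-minute run, one isolated minute, then a 5-minute run plus a
% straggler that follows within 15 minutes.
tstart    = datetime(2021,11,18,0,0,0);
offsets   = [0:5, 40, 80:84, 90]';             % minutes since tstart
TIMESTAMP = tstart + minutes(offsets);
mean_V    = ones(numel(offsets), 1);           % placeholder variable
test1     = timetable(TIMESTAMP, mean_V);
test1.Properties.DimensionNames{1} = 'TIMESTAMP';  % so test1.TIMESTAMP is valid

% Criterion 2: start a new group whenever the gap exceeds 15 minutes
dt  = minutes(diff(test1.TIMESTAMP));
grp = cumsum([true; dt > 15]);

% Criterion 1: longest run of consecutive minutes within each group.
% find(...)+1 is the first row of each new run; diff gives run lengths.
maxrun = @(x) max(diff([1; find(minutes(diff(x)) > 1) + 1; numel(x) + 1]));
isgood = splitapply(@(x) maxrun(x) >= 5, test1.TIMESTAMP, grp);

% Expand the per-group flag to a per-row flag and filter the timetable
rowgood = isgood(grp);                  % one logical per row
cleaned = test1(rowgood, :);            % rows in failing groups are dropped
newgrp  = findgroups(grp(rowgood));     % renumber surviving groups 1,2,...
grpsum  = splitapply(@sum, cleaned.mean_V, newgrp);
```

Renumbering with findgroups keeps the group indices consecutive, which is what splitapply expects once the failing groups have been removed.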
  3 Comments
Eric Escoto on 19 Nov 2021
Looks like all I need to do to get other variables is modify the 'Var' name in any generic 'TT.Var' format.
Kelly Kearney on 19 Nov 2021
You can use the same splitapply setup to do whatever calculations you need:
nt = splitapply(@length, test1.TIMESTAMP, grp); % number of elements in each
t0 = splitapply(@min, test1.TIMESTAMP, grp); % earliest time in each
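If several per-group quantities are needed at once, one option is to collect the splitapply outputs into a single table with one row per group. This is only a sketch under stated assumptions: the data below is synthetic, and the names tfirst, tlast, vsum, and summary are hypothetical. Only test1, TIMESTAMP, mean_V, and grp come from the thread.

```matlab
% Hypothetical stand-in for test1 (values invented for illustration)
TIMESTAMP = datetime(2021,11,18,0,0,0) + minutes([0:5, 10, 60:64]');
mean_V    = (1:numel(TIMESTAMP))';
test1     = timetable(TIMESTAMP, mean_V);
test1.Properties.DimensionNames{1} = 'TIMESTAMP';  % so test1.TIMESTAMP is valid

% Same grouping rule as in the answer: new group when the gap exceeds 15 min
grp = cumsum([true; minutes(diff(test1.TIMESTAMP)) > 15]);

% One splitapply call per quantity, all sharing the same grouping vector
nt     = splitapply(@numel, test1.TIMESTAMP, grp);  % rows in each group
tfirst = splitapply(@min,   test1.TIMESTAMP, grp);  % earliest time in each
tlast  = splitapply(@max,   test1.TIMESTAMP, grp);  % latest time in each
vsum   = splitapply(@sum,   test1.mean_V,    grp);  % sum of the variable

% Collect everything into one table, one row per group
summary = table(tfirst, tlast, nt, vsum);
```

Swapping test1.mean_V for another variable of the timetable gives the same summary for that variable, since the grouping vector grp does not depend on which variable is aggregated.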
