
Use rowfun to sum multiple columns by group

Ledger Yu on 15 Oct 2019
Commented: Guillaume on 15 Oct 2019
I have a table that looks like this:
Data =
date id flag1 flag2
2019-01-01 x1 1 1
2019-01-01 x1 1 0
2019-01-02 x2 0 0
2019-01-02 x2 0 1
...
The line below will sum "flag1" by "date":
rowfun(@sum, Data, 'groupingVariables','date','inputVariables','flag1')
But how do I apply this to both "flag1" and "flag2"? I tried:
rowfun(@sum, Data, 'groupingVariables','date','inputVariables',{'flag1','flag2'})
...and it throws the following error:
Dimension argument must be a positive integer scalar within indexing range

Accepted Answer

Guillaume on 15 Oct 2019
rowfun is not the correct function for this. rowfun applies the function to rows (or, with grouping variables, to each group of rows) and passes the input variables as separate input arguments to the function, so your call evaluates sum(flag1, flag2) for each group. The 2nd argument to sum is the dimension to operate along, so you get an error because your 2nd variable does not contain valid dimensions.
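To make the failure concrete, here is a minimal sketch of what happens inside one group. The vectors flag1 and flag2 are hypothetical stand-ins for the rows of a single date, and the exact error text may differ between releases:

```matlab
% Hypothetical stand-ins for one group's flag1 and flag2 values.
flag1 = [1; 1];
flag2 = [1; 0];

% With two input variables, the function is called as sum(flag1, flag2),
% so flag2 is interpreted as the dimension argument and the call fails.
try
    sum(flag1, flag2);
    failed = false;
catch err
    failed = true;
    disp(err.message)
end

% Summing each variable on its own is what was actually intended.
total1 = sum(flag1);
total2 = sum(flag2);
```

The try/catch is only there so the snippet runs to completion; in the original call the same error propagates out of rowfun.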
You want to apply the same function to the different variables of your table. The function for that is varfun:
varfun(@sum, Data, 'GroupingVariables', 'date', 'InputVariables', {'flag1','flag2'})
However, note that since R2018a we have groupsummary which is even easier to use and allows you to get several statistics at the same time:
groupsummary(Data, 'date', 'sum', {'flag1', 'flag2'})
is equivalent to the above, but you could also do:
groupsummary(Data, 'date', 'week', {'sum', 'mean', 'std'}, {'flag1', 'flag2'})
to get the weekly sum, mean and standard deviation at once.
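As a rough illustration on the four sample rows from the question (no binning, and only two statistics to keep it short), a groupsummary call with several methods could look like this. The table construction mirrors the sample data; the variable name stats is just a placeholder:

```matlab
% Sample data from the question.
Data = table( ...
    datetime({'2019-01-01';'2019-01-01';'2019-01-02';'2019-01-02'}, ...
             'InputFormat', 'yyyy-MM-dd'), ...
    {'x1';'x1';'x2';'x2'}, ...
    [1;1;0;0], ...
    [1;0;0;1], ...
    'VariableNames', {'date','id','flag1','flag2'});

% One call, several statistics per data variable.
% Result variables are named <method>_<variable>, e.g. sum_flag1.
stats = groupsummary(Data, 'date', {'sum','mean'}, {'flag1','flag2'});
disp(stats)
```

Each row of stats corresponds to one date, with a GroupCount column alongside the requested statistics.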
2 Comments
Rik on 15 Oct 2019
You can still do something with rowfun, although I'm not sure this is what OP wants to do:
Data=table(...
datetime({'2019-01-01';'2019-01-01';'2019-01-02';'2019-01-02'}),...
{'x1';'x1';'x2';'x2'},...
[1;1;0;0],...
[1;0;0;1],...
'VariableNames',{'date','id','flag1','flag2'});
fun=@(flag1,flag2) sum([flag1;flag2]);
output=rowfun(fun,Data,...
'groupingVariables','date',...
'InputVariables',{'flag1','flag2'})
Guillaume on 15 Oct 2019
Oh yes, you can still use rowfun by writing your own function that takes multiple inputs, but the OP's question is clearly meant to operate on variables, not rows, so varfun is far more appropriate.
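For completeness, one possible way to get a separate sum per variable out of rowfun is to have the custom function return one output per input. This is only a sketch: the handle name sumBoth and the output variable names are made up for illustration, and varfun remains the simpler choice:

```matlab
% Sample data from the question.
Data = table( ...
    datetime({'2019-01-01';'2019-01-01';'2019-01-02';'2019-01-02'}, ...
             'InputFormat', 'yyyy-MM-dd'), ...
    {'x1';'x1';'x2';'x2'}, ...
    [1;1;0;0], ...
    [1;0;0;1], ...
    'VariableNames', {'date','id','flag1','flag2'});

% deal lets an anonymous function return two outputs, one sum per variable.
sumBoth = @(f1, f2) deal(sum(f1), sum(f2));

% NumOutputs tells rowfun to request both outputs from the function.
out = rowfun(sumBoth, Data, ...
    'GroupingVariables', 'date', ...
    'InputVariables', {'flag1','flag2'}, ...
    'NumOutputs', 2, ...
    'OutputVariableNames', {'sum_flag1','sum_flag2'});
disp(out)
```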

Release

R2018b
