Table: group by datetime

조회 수: 22 (최근 30일)
Arjan
Arjan 2015년 6월 11일
댓글: Peter Perkins 2015년 7월 16일
Hi all,
I have defined a table and I now want to group by a column of type 'datetime'. I tried varfun, but this function doesn't work for this kind. Can someone help? Thanks a lot! So table looks like:
[TYPE Datetime VALUE]
And the result should be the value per datetime, summed by TYPE.
  댓글 수: 4
Arjan
Arjan 2015년 6월 12일
Hi Peter, Of course. I have defined a table in the following way and I now want to aggregate the rows, grouped by DT (=date time), i.e. I want to get the total value per hour of the day. Hopefully I made it clear.
if true
T_Flex = readtable('TEXTFILE.csv','Delimiter',';','Format',formatSpec,'ReadVariableNames',false);
T_Flex.Properties.VariableNames{1}='TYPE';%type of flex
T_Flex.Properties.VariableNames{2}='Date';%date
T_Flex.Properties.VariableNames{3}='HOUR'; %period of the day T_Flex.Properties.VariableNames{4}='VALUE';%flex
DT = table(datetime(year(T_Flex.Date),month(T_Flex.Date),day(T_Flex.Date),0,(T_Flex.PTU-1)*15,0));%date+period of day
T_Flex=[T_Flex DT];
end
Peter Perkins
Peter Perkins 2015년 7월 16일
Arjan, I don't exactly understand your table, but I think your question boils down to, "I want to use varfun, with a grouping variable that's a datetime."
In R2014b and R2015a, varfun does not allow that. One work-around would be to temporarily convert the datetime to a numeric representation (for example, the number of seconds since some reference time). Then do the grouped calculation, and convert the grouping variable's unique values back to datetime:
>> d = datetime(2015,1,1) + caldays(randi(2,5,1));
>> x = randn(size(d));
>> t = table(d,x)
t =
d x
___________ ________
02-Jan-2015 0.30352
03-Jan-2015 -0.60033
02-Jan-2015 0.48997
03-Jan-2015 0.73936
02-Jan-2015 1.7119
>> d0 = datetime(2015,1,1);
>> t.d = seconds(t.d - d0)
t =
d x
_________ ________
86400 0.30352
1.728e+05 -0.60033
86400 0.48997
1.728e+05 0.73936
86400 1.7119
>> gt = varfun(@mean,t,'groupingVariable','d')
gt =
d GroupCount mean_x
_________ __________ ________
86400 86400 3 0.83512
172800 1.728e+05 2 0.069518
>> gt.d = d0 + seconds(gt.d)
gt =
d GroupCount mean_x
____________________ __________ ________
86400 02-Jan-2015 00:00:00 3 0.83512
172800 03-Jan-2015 00:00:00 2 0.069518

댓글을 달려면 로그인하십시오.

채택된 답변

Michael Rice
Michael Rice 2015년 6월 12일
It isn't possible. To illustrate, I'll sum by matching date (instead of hour) using datetime versus datenum.
VALUE=[2;3;4];
dt = datetime(2015,[1;2;1],1);
tbl_dt=table(dt,VALUE,'VariableNames',{'dt','VALUE'});
tbl_dt=varfun(@nansum,tbl_dt,'GroupingVariables',{'dt'});
MATLAB returns
Error using table/varfun (line 154)
A grouping variable must be a categorical, numeric, or logical vector, a cell vector of strings, or a 2D character array.
Now use datenum instead:
dn = datenum(2015,[1;2;1],1);
tbl_dn=table(dn,VALUE,'VariableNames',{'dn','VALUE'});
tbl_dn=varfun(@nansum,tbl_dn,'GroupingVariables',{'dn'});
Because datenum is an integer, MATLAB has no problem:
tbl_dn =
dn GroupCount nansum_VALUE
__________ __________ ____________
735965 7.3597e+05 2 6
735996 7.36e+05 1 3
If you have table variables with dates or times and need to do grouping or summary functions (i.e. varfun or grpstats), this makes datetime pretty much useless. You could work around it by casting the value from datetime to datenum around each operation. Perhaps there is a forthcoming modification to varfun that would solve this limitation.
  댓글 수: 1
Arjan
Arjan 2015년 6월 16일
Thank you Michael!

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Tables에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by