How to output specific rows from tables depending on values within the table?

Question

Sean Byrne 2018년 6월 5일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/404108-how-to-output-specific-rows-from-tables-depending-on-values-within-the-table

답변: Peter Perkins 2018년 7월 3일

I have a table of variable I have pulled form an excel spread sheet (the actual file is 45 columns X 2000 rows). But this gives the idea of what I am trying to achieve.

I would like to find each separate participants (identified by their 'ID') maximum jump height for that 'season' of testing and remove the other rows.

The Table I'm working with is like this (but extended):

[Season] [TrialNo] [ID] [AgeGroup] [bodyMass] [JumpHeight] [Force] [FlighTime] [LandingForce]
 'Pre'      1      0001    U14         40         35         685       0.3          1100  
 'Pre'      2      0001    U14         40         32         630       0.25         1200  
 'Pre'      1      0002    U14         42         40         750       0.42         1000  
 'Pre'      2      0002    U14         42         36         700       0.4          1300  
 'Pre'      1      0003    U14         45         32         610       0.3          1111  
 'Pre'      2      0003    U14         45         28         600       0.3          1600  
 'Post'     1      0001    U14         40         35         685       0.3          1100  
 'Post'     2      0001    U14         40         32         630       0.25         1200  
 'Post'     1      0002    U14         42         40         750       0.42         1000  
 'Post'     2      0002    U14         42         36         700       0.4          1300  
 'Post'     1      0003    U14         45         32         610       0.3          1111  
 'Post'     2      0003    U14         45         28         600       0.3          1600

What I aim to end up with is something more like

    [Season] [TrialNo] [ID] [AgeGroup] [bodyMass] [JumpHeight] [Force] [FlighTime] [LandingForce]
     'Pre'      1      0001    U14         40         35         685       0.3          1100    
     'Pre'      1      0002    U14         42         40         750       0.42         1000  
     'Pre'      1      0003    U14         45         32         610       0.3          1111  
     'Post'     1      0001    U14         40         35         685       0.3          1100  
     'Post'     1      0002    U14         42         40         750       0.42         1000  
     'Post'     1      0003    U14         45         32         610       0.3          1111

댓글 수: 2
없음 표시없음 숨기기

Paolo 2018년 6월 5일

편집: Paolo 2018년 6월 5일

JumpHeight seems to be already sorted for the ID for every pair or rows, so if that's always the case you could just delete every other row.

Alternatively, if its not always sorted, you could sort every two rows by JumpHeight.

Could you attach a sample spreadsheet?

Sean Byrne 2018년 6월 5일

Raw_Data_Sample.xls

Attached is a sample of the data I'm working with. As you can see jump height is random within the three trials.

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Razvan Carbunescu 2018년 6월 5일

1
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/404108-how-to-output-specific-rows-from-tables-depending-on-values-within-the-table#answer_323310

편집: Razvan Carbunescu 2018년 6월 5일

MATLAB Online에서 열기

If you're using R2018a and only interested in maximum JumpHeight can use groupsummary on table T:

>> GT = groupsummary (T,{'Season','ID'},'max','JumpHeight')
GT =
6×4 table
    Season    ID    GroupCount    max_JumpHeight
    ______    __    __________    ______________
    'Post'    1         2               35      
    'Post'    2         2               40      
    'Post'    3         2               32      
    'Pre'     1         2               35      
    'Pre'     2         2               40      
    'Pre'     3         2               32

If you want to get all the row information or on an earlier release can use findgroups / splitapply workflow

idx = findgroups(T.Season,T.ID);
GT = splitapply(@maxidx,T,idx);
GT.Properties.VariableNames = T.Properties.VariableNames
function T = maxidx(varargin)
   [~,i] = max(varargin{6});
   tmpvarargout = cellfun(@(x) x(i,:),varargin,'UniformOutput',false);
   T = table(tmpvarargout{:});
end

Sample Output

GT =
6×9 table
    Season    TrialNo    ID    AgeGroup    bodyMass    JumpHeight    Force    FlighTime    LandingForce
    ______    _______    __    ________    ________    __________    _____    _________    ____________
    'Post'       1       1        14          40           35         685        0.3           1100    
    'Post'       1       2        14          42           40         750       0.42           1000    
    'Post'       1       3        14          45           32         610        0.3           1111    
    'Pre'        1       1        14          40           35         685        0.3           1100    
    'Pre'        1       2        14          42           40         750       0.42           1000    
    'Pre'        1       3        14          45           32         610        0.3           1111

Edit: Script assumes JumpHeight is 6th column in table, might have to modify for correct position

댓글 수: 6
이전 댓글 4개 표시이전 댓글 4개 숨기기

Razvan Carbunescu 2018년 6월 5일

MATLAB Online에서 열기

I am able to get the data with changing the index in the script to 9:

T = readtable('Raw_Data_Sample.xls');
idx = findgroups(T.Season,T.ID);
GT = splitapply(@maxidx,T,idx);
GT.Properties.VariableNames = T.Properties.VariableNames
function T = maxidx(varargin)
   [~,i] = max(varargin{9});
   tmpvarargout = cellfun(@(x) x(i,:),varargin,'UniformOutput',false);
   T = table(tmpvarargout{:});
end

Output I get:

>> head(GT)

ans =

8×10 table
       Year           Season       TrialNumber    AgeGroup     ID            TestDate          BodyMass    PushForce    JumpHeight    Take_off_Velocity
    ___________    ____________    ___________    ________    _____    ____________________    ________    _________    __________    _________________
    '2016-2017'    'Pre-season'         3          'U14'      17637    9/18/2016 4:58:00 PM      37.8       1136.6        0.394             2.69       
    '2016-2017'    'Pre-season'         3          'U14'      17864    9/18/2016 5:10:00 PM     49.56       1256.9        0.302             2.35       
    '2016-2017'    'Pre-season'         2          'U14'      17917    9/18/2016 5:07:00 PM     40.01       853.48        0.282             2.27       
    '2016-2017'    'Pre-season'         2          'U14'      18069    9/18/2016 4:47:00 PM     38.51       835.35        0.336             2.48       
    '2016-2017'    'Pre-season'         3          'U14'      18133    9/18/2016 5:05:00 PM     53.66       1277.5        0.364             2.57       
    '2016-2017'    'Pre-season'         3          'U13'      18891    9/18/2016 4:32:00 PM     40.76       870.99        0.423             2.81       
    '2016-2017'    'Pre-season'         3          'U13'      18935    9/18/2016 4:21:00 PM     42.36       1145.8        0.284             2.27       
    '2016-2017'    'Pre-season'         3          'U13'      19054    9/18/2016 4:25:00 PM      41.3       800.82        0.289             2.26

>>

Sean Byrne 2018년 6월 7일

Also as a fix I tried changing the (T.Season,T.ID) to {T.Season,T.ID} and that changed the error message to Undefined variable "findgroups" or class "findgroups"

Razvan Carbunescu 2018년 6월 7일

MATLAB Online에서 열기

I had missed the fact that you're on R2014a. findgroups/splitapply were introduced in R2016b.

I think the way to try to get it in R2014a is to use sortrows and unique with the rows flag to find the indexing to the first sorted highest value.

ST = sortrows(T,{'Season' 'ID' 'JumpHeight'},{'ascend' 'ascend' 'descend'});
% taking advantage here of the fact that ST is sorted by JumpHeight and unique returns first element
[~,idx] = unique([double(categorical(ST.Season)) ST.ID],'rows');
GT = ST(idx,:)

댓글을 달려면 로그인하십시오.

Answer 2

Are Mjaavatten 2018년 6월 5일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/404108-how-to-output-specific-rows-from-tables-depending-on-values-within-the-table#answer_323303

편집: Are Mjaavatten 2018년 6월 11일

MATLAB Online에서 열기

I am a little uncertain about the type of data structure you use. For completeness I therefore entered your data in an Excel workbook that I read using readtable.

If there are always exactly two trials per ID and season:

T0 = readtable('Byrne.xlsx');
rows = [];
for i = 1:2:size(T0,1)-1
  [~,j] = max(T0.JumpHeight(i:i+1));
  rows = [rows;i+j-1];
end
T2 = T0(rows,:);

If the number of trials may vary:

T0 = sortrows(T0,'ID');
T0 = sortrows(T0,'Season','descend');
J = [find(diff([0;T0.ID])~=0);size(T0,1)];  % Indices for each ID change
rows = [];
for i = 1:length(J)-1
  [~,j] = max(T0.JumpHeight(J(i):J(i+1)-1));
  rows = [rows;J(i)+j-1];
end
T2 = T0(rows,:);

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

Answer 3

Peter Perkins 2018년 7월 3일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/404108-how-to-output-specific-rows-from-tables-depending-on-values-within-the-table#answer_327356

In more recent versions of MATLAB there are several ways to do this. In R2014a, do a grouped varfun, using @max as the function to apply, ID and Season as the grouping Variables, and JumpHight as the InputVariable.

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

How to output specific rows from tables depending on values within the table?

댓글 수: 2
없음 표시없음 숨기기

답변 (3개)

댓글 수: 6
이전 댓글 4개 표시이전 댓글 4개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

Community Treasure Hunt

How to output specific rows from tables depending on values within the table?

댓글 수: 2 없음 표시없음 숨기기

답변 (3개)

댓글 수: 6 이전 댓글 4개 표시이전 댓글 4개 숨기기

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

Community Treasure Hunt

댓글 수: 2
없음 표시없음 숨기기

댓글 수: 6
이전 댓글 4개 표시이전 댓글 4개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기