How to output specific rows from tables depending on values within the table?
조회 수: 15 (최근 30일)
이전 댓글 표시
I have a table of variable I have pulled form an excel spread sheet (the actual file is 45 columns X 2000 rows). But this gives the idea of what I am trying to achieve.
I would like to find each separate participants (identified by their 'ID') maximum jump height for that 'season' of testing and remove the other rows.
The Table I'm working with is like this (but extended):
[Season] [TrialNo] [ID] [AgeGroup] [bodyMass] [JumpHeight] [Force] [FlighTime] [LandingForce]
'Pre' 1 0001 U14 40 35 685 0.3 1100
'Pre' 2 0001 U14 40 32 630 0.25 1200
'Pre' 1 0002 U14 42 40 750 0.42 1000
'Pre' 2 0002 U14 42 36 700 0.4 1300
'Pre' 1 0003 U14 45 32 610 0.3 1111
'Pre' 2 0003 U14 45 28 600 0.3 1600
'Post' 1 0001 U14 40 35 685 0.3 1100
'Post' 2 0001 U14 40 32 630 0.25 1200
'Post' 1 0002 U14 42 40 750 0.42 1000
'Post' 2 0002 U14 42 36 700 0.4 1300
'Post' 1 0003 U14 45 32 610 0.3 1111
'Post' 2 0003 U14 45 28 600 0.3 1600
What I aim to end up with is something more like
[Season] [TrialNo] [ID] [AgeGroup] [bodyMass] [JumpHeight] [Force] [FlighTime] [LandingForce]
'Pre' 1 0001 U14 40 35 685 0.3 1100
'Pre' 1 0002 U14 42 40 750 0.42 1000
'Pre' 1 0003 U14 45 32 610 0.3 1111
'Post' 1 0001 U14 40 35 685 0.3 1100
'Post' 1 0002 U14 42 40 750 0.42 1000
'Post' 1 0003 U14 45 32 610 0.3 1111
댓글 수: 2
Paolo
2018년 6월 5일
편집: Paolo
2018년 6월 5일
JumpHeight seems to be already sorted for the ID for every pair or rows, so if that's always the case you could just delete every other row.
Alternatively, if its not always sorted, you could sort every two rows by JumpHeight.
Could you attach a sample spreadsheet?
답변 (3개)
Razvan Carbunescu
2018년 6월 5일
편집: Razvan Carbunescu
2018년 6월 5일
>> GT = groupsummary (T,{'Season','ID'},'max','JumpHeight')
GT =
6×4 table
Season ID GroupCount max_JumpHeight
______ __ __________ ______________
'Post' 1 2 35
'Post' 2 2 40
'Post' 3 2 32
'Pre' 1 2 35
'Pre' 2 2 40
'Pre' 3 2 32
If you want to get all the row information or on an earlier release can use findgroups / splitapply workflow
idx = findgroups(T.Season,T.ID);
GT = splitapply(@maxidx,T,idx);
GT.Properties.VariableNames = T.Properties.VariableNames
function T = maxidx(varargin)
[~,i] = max(varargin{6});
tmpvarargout = cellfun(@(x) x(i,:),varargin,'UniformOutput',false);
T = table(tmpvarargout{:});
end
Sample Output
GT =
6×9 table
Season TrialNo ID AgeGroup bodyMass JumpHeight Force FlighTime LandingForce
______ _______ __ ________ ________ __________ _____ _________ ____________
'Post' 1 1 14 40 35 685 0.3 1100
'Post' 1 2 14 42 40 750 0.42 1000
'Post' 1 3 14 45 32 610 0.3 1111
'Pre' 1 1 14 40 35 685 0.3 1100
'Pre' 1 2 14 42 40 750 0.42 1000
'Pre' 1 3 14 45 32 610 0.3 1111
Edit: Script assumes JumpHeight is 6th column in table, might have to modify for correct position
댓글 수: 6
Razvan Carbunescu
2018년 6월 7일
I had missed the fact that you're on R2014a. findgroups/splitapply were introduced in R2016b.
I think the way to try to get it in R2014a is to use sortrows and unique with the rows flag to find the indexing to the first sorted highest value.
ST = sortrows(T,{'Season' 'ID' 'JumpHeight'},{'ascend' 'ascend' 'descend'});
% taking advantage here of the fact that ST is sorted by JumpHeight and unique returns first element
[~,idx] = unique([double(categorical(ST.Season)) ST.ID],'rows');
GT = ST(idx,:)
Are Mjaavatten
2018년 6월 5일
편집: Are Mjaavatten
2018년 6월 11일
I am a little uncertain about the type of data structure you use. For completeness I therefore entered your data in an Excel workbook that I read using readtable.
If there are always exactly two trials per ID and season:
T0 = readtable('Byrne.xlsx');
rows = [];
for i = 1:2:size(T0,1)-1
[~,j] = max(T0.JumpHeight(i:i+1));
rows = [rows;i+j-1];
end
T2 = T0(rows,:);
If the number of trials may vary:
T0 = sortrows(T0,'ID');
T0 = sortrows(T0,'Season','descend');
J = [find(diff([0;T0.ID])~=0);size(T0,1)]; % Indices for each ID change
rows = [];
for i = 1:length(J)-1
[~,j] = max(T0.JumpHeight(J(i):J(i+1)-1));
rows = [rows;J(i)+j-1];
end
T2 = T0(rows,:);
댓글 수: 0
Peter Perkins
2018년 7월 3일
In more recent versions of MATLAB there are several ways to do this. In R2014a, do a grouped varfun, using @max as the function to apply, ID and Season as the grouping Variables, and JumpHight as the InputVariable.
댓글 수: 0
참고 항목
카테고리
Help Center 및 File Exchange에서 Data Distribution Plots에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!