Querying rows in a table, specific scenario

조회 수: 26 (최근 30일)
Bryan Wilson
Bryan Wilson 2017년 4월 24일
댓글: Andrei Bobrov 2017년 4월 25일
I have road noise data in a table and I need to query it down to a balanced data set for analysis. For example, I only need Route_Surface Type combinations that have both levels of Speed. An example of my data and the desired output are below. Any suggestions on how to build this query?
%Data Inputs
ID = [1;2;3;4;5;6;7;8;9;10;11;12;13;14];
Route = {'IH10';'IH10';'IH10';'IH10';'IH10';'IH10';'US6';'US6';'IH35';'IH35';'US20';'US20';'US20';'US20'};
Surface = {'conc';'conc';'conc';'conc';'asph';'asph';'conc';'conc';'conc';'conc';'asph';'asph';'asph';'asph'};
%asph=asphalt, conc=concrete
Speed = [50;50;70;70;50;50;50;50;70;70;50;50;70;70];
%mph
SoundLevel = [51;54;75;76;52;60;68;63;79;80;49;51;66;68];
%Build the table
RoadNoise = table(ID,Route,Surface,Speed,SoundLevel);
RoadNoise =
ID Route Surface Speed SoundLevel
1 IH10 conc 50 51
2 IH10 conc 50 54
3 IH10 conc 70 75
4 IH10 conc 70 76
5 IH10 asph 50 52
6 IH10 asph 50 60
7 US6 conc 50 68
8 US6 conc 50 63
9 IH35 conc 70 79
10 IH35 conc 70 80
11 US20 asph 50 49
12 US20 asph 50 51
13 US20 asph 70 66
14 US20 asph 70 68
%Queried data (desired output)
Queried RoadNoise =
ID Route Surface Speed SoundLevel
1 IH10 conc 50 51
2 IH10 conc 50 54
3 IH10 conc 70 75
4 IH10 conc 70 76
11 US20 asph 50 49
12 US20 asph 50 51
13 US20 asph 70 66
14 US20 asph 70 68

채택된 답변

Peter Perkins
Peter Perkins 2017년 4월 25일
You sample code doesn't actually work. Also, storing numeric values in a cell array is not a good idea.
Try this:
>> ID = [1;2;3;4;5;6;7;8;9;10;11;12;13;14];
>> Route = categorical({'IH10';'IH10';'IH10';'IH10';'IH10';'IH10';'US6';'US6';'IH35';'IH35';'US20';'US20';'US20';'US20'});
>> Surface = categorical({'concrete';'concrete';'concrete';'concrete';'asphalt';'asphalt';'concrete';'concrete';'concrete';'concrete';'asphalt';'asphalt';'asphalt';'asphalt'});
>> Speed = categorical({'50mph';'50mph';'70mph';'70mph';'50mph';'50mph';'50mph';'50mph';'70mph';'70mph';'50mph';'50mph';'70mph';'70mph'});
>> SoundLevel = [51;54;75;76;52;60;68;63;79;80;49;51;66;68];
>> RoadNoise = table(ID,Route,Surface,Speed,SoundLevel)
RoadNoise =
14×5 table
ID Route Surface Speed SoundLevel
__ _____ ________ _____ __________
1 IH10 concrete 50mph 51
2 IH10 concrete 50mph 54
3 IH10 concrete 70mph 75
4 IH10 concrete 70mph 76
5 IH10 asphalt 50mph 52
6 IH10 asphalt 50mph 60
7 US6 concrete 50mph 68
8 US6 concrete 50mph 63
9 IH35 concrete 70mph 79
10 IH35 concrete 70mph 80
11 US20 asphalt 50mph 49
12 US20 asphalt 50mph 51
13 US20 asphalt 70mph 66
14 US20 asphalt 70mph 68
There are a number of ways to get from there to what you want, the following is perhaps a clever one. There are certainly more obvious ways too, involving logical operations for row subscripts.
Get the unique combinations of Route and Surface, and a flag indicating which ones contain both levels of Speed.
>> t = varfun(@(speed) length(unique(speed))==2,RoadNoise, ...
'GroupingVariables',{'Route' 'Surface'},'InputVariables','Speed');
>> t.Properties.VariableNames{end} = 'BothSpeeds'
t =
5×4 table
Route Surface GroupCount BothSpeeds
_____ ________ __________ __________
IH10 asphalt 2 false
IH10 concrete 4 true
IH35 concrete 2 false
US20 asphalt 4 true
US6 concrete 2 false
Pick out the rows of the second table where both levels existed in the first, and use that to select rows of the original table.
>> innerjoin(RoadNoise,t(t.BothSpeeds,{'Route' 'Surface'}))
ans =
8×5 table
ID Route Surface Speed SoundLevel
__ _____ ________ _____ __________
1 IH10 concrete 50mph 51
2 IH10 concrete 50mph 54
3 IH10 concrete 70mph 75
4 IH10 concrete 70mph 76
11 US20 asphalt 50mph 49
12 US20 asphalt 50mph 51
13 US20 asphalt 70mph 66
14 US20 asphalt 70mph 68
  댓글 수: 2
Bryan Wilson
Bryan Wilson 2017년 4월 25일
Thanks Peter. This is exactly what I was looking for. Your solution does in 3 lines of code what mine did in 14.
(FYI, I've fixed the errors in my previous code.)
Andrei Bobrov
Andrei Bobrov 2017년 4월 25일
+1

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Bryan Wilson
Bryan Wilson 2017년 4월 25일
편집: Bryan Wilson 2017년 4월 25일
I got this to work. If you can find a more elegant solution, I'll accept your answer.
%Create new column for Route+Surface
RoadNoise.RouteSurface = strcat(RoadNoise.Route,RoadNoise.Surface);
%Among the 50mph and 70mph data, get all unique Route+Surface combinations
Temp50 = RoadNoise(RoadNoise.Speed==50,:);
Temp70 = RoadNoise(RoadNoise.Speed==70,:);
u50 = unique(Temp50.RouteSurface);
u70 = unique(Temp70.RouteSurface);
%Identify the Route+Surface rows that have both 50 and 70mph readings.
rows50 = zeros(height(Temp50),1);
rows70 = zeros(height(Temp70),1);
for i=1:size(u70)
rows50 = rows50+strcmp(Temp50.RouteSurface,u70(i));
end
for i=1:size(u50)
rows70 = rows70+strcmp(Temp70.RouteSurface,u50(i));
end
%Combine the identified rows into a new table.
RoadNoiseQuery = [Temp50(logical(rows50),:);Temp70(logical(rows70),:)];

카테고리

Help CenterFile Exchange에서 Workspace Variables and MAT Files에 대해 자세히 알아보기

태그

제품

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by