Help reorganizing data in a table
조회 수: 5 (최근 30일)
이전 댓글 표시
Hi,
I have an excel sheet with lots of data based on recordings and measurements. These recordings and measurements are displayed in the excel sheet in a certain way. I need to reorganise this data and display it in another way. I have attached an example excel sheet with two days worth of data. The attached excel sheet also shows how I need the data to be displayed.
I have been trying to use MATLAB for nearly a week to try to get this working but I don't think I am any closer to figuring it out. I am new to MATLAB and don't have much experience using it for something like this.
I would really appreciate it if someone could explain to me in simple terms how to do this. I have lots of data to work with so doing this manually is not an option.
Andrew
댓글 수: 2
the cyclist
2021년 8월 5일
@Konrad beat me to posting the elegant unstack solution. I'll just comment here that in general these types of operations are known as "pivoting" the table. You'll see various terminology out there (when considering other languages, such as R and python) such as making the table "long" or "wide", etc.
One other comment. Are you really sure that you want the new format? The reason I ask is that the original seems to me to be closer to the "tidy" format (see, e.g. this article) that is typically best for analysis. But I have to admit I did not think deeply about this yet. Maybe the new format is actually tidier.
Regardless, you may find the article informative. I like to evangelize about tidy data, as I find it to be a powerful concept.
Peter Perkins
2021년 8월 5일
Andrew, this is essentially the same question you asked earlier: unstacking multiple variables. I recommend you look at the answer there.
채택된 답변
Cris LaPierre
2021년 8월 5일
I like Konrad's approach. You can easily reorder the results to be what you want, and it's easy to understand. I went down a rabbit hole trying to get the table to be closer to what you showed. I share it only because I already have the working code. It splits the data into separate tables, then joins them together.
% Load the data
opts = detectImportOptions("example_sheet.xlsx");
opts = setvartype(opts,["Name","Day","Category"],"categorical");
opts = setvaropts(opts,"Name","Ordinal",true);
data = readtable('example_sheet.xlsx',opts);
data = rmmissing(data,'DataVariables',"Name");
data.Name = reordercats(data.Name,"Person "+(1:14));
% Split data into subtables by activity/measurement
a1M1 = data(ismember(data.Category,"Activity 1") & ~isnan(data.Measurement1),[1:3 5:7]);
a1M1.Properties.VariableNames(4:6) = "Activity 1: "+a1M1.Properties.VariableNames(4:6);
a1M2 = data(ismember(data.Category,"Activity 1") & ~isnan(data.Measurement2),[1:3 5:6 8]);
a1M2.Properties.VariableNames(4:6) = "Activity 1: "+a1M2.Properties.VariableNames(4:6);
a2 = data(ismember(data.Category,"Activity 2"),[1:3 5:6]);
a2.Properties.VariableNames(4:5) = "Activity 2: "+a2.Properties.VariableNames(4:5);
% Join tables back into a single table
joinAct1 = outerjoin(a1M1,a1M2,"Keys",["Name","Day","Date"],...
"MergeKeys",true);
joinAct2 = outerjoin(joinAct1,a2,"Keys",["Name","Day","Date"],...
"MergeKeys",true);
joinedData = sortrows(joinAct2,["Date","Name"])
추가 답변 (1개)
Konrad
2021년 8월 5일
편집: Konrad
2021년 8월 6일
Hi Andrew,
I think unstack() is what you're looking for:
T=readtable('example_sheet.xlsx');
uT = unstack(T,{'Duration' 'Rating' 'Measurement1' 'Measurement2'},'Category');
Best, Konrad
댓글 수: 5
Cris LaPierre
2021년 8월 5일
I think it's the same approach, but you could just use indexing.
T=readtable('example_sheet.xlsx');
newT = unstack(T,{'Duration' 'Rating' 'Measurement1' 'Measurement2'},'Category');
newT = newT(:,[1:4 6 8 4 6 10 5 7]);
% name variables
newT.Properties.VariableNames(4:6) = "Activity 1: "+T.Properties.VariableNames(5:7);
newT.Properties.VariableNames(7:9) = "Activity 1b: "+T.Properties.VariableNames(5:7);
newT.Properties.VariableNames(10:11) = "Activity 2: "+T.Properties.VariableNames(5:6)
Konrad
2021년 8월 6일
Just to promote regular expressions (and because I love regexp, I admit):
The desired column names can be achieved in one line:
uT.Properties.VariableNames = regexprep(uT.Properties.VariableNames, '(^[^_]+)_([^_]+)$', '$2: $1')
참고 항목
카테고리
Help Center 및 File Exchange에서 Data Import from MATLAB에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!