Fast subsetting or indexing of data
I am working with large datasets, which I subset into various categories and save as smaller files. What I am doing right now works, but it is time consuming and error prone because it involves a lot of copy and paste.
For example, I have split many files into those with boats and those without boats, and I then split each of those by season. Is there a faster way to do this, where I apply the same command to a prescribed set of variables?
%% Comparisons... Season using water temp
boatsAbsent_t=boatsAbsent.Var1; %time variables
[BA_spring, BA_summer, BA_autumn, BA_winter]=indexSeasons(boatsAbsent_t); %index times into seasons
boatsPresent_t=boatsPresent.Var1;
[BP_spring, BP_summer, BP_autumn, BP_winter]=indexSeasons(boatsPresent_t);
%Subset PSD outputs and write to file
S=withtol(BA_spring,seconds(1));
BA_spring=boatsAbsent(S,:);
writetable(timetable2table(BA_spring),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Spring.csv')));
S=withtol(BA_summer,seconds(1));
BA_summer=boatsAbsent(S,:);
writetable(timetable2table(BA_summer),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Summer.csv')));
S=withtol(BA_autumn,seconds(1));
BA_autumn=boatsAbsent(S,:);
writetable(timetable2table(BA_autumn),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Autumn.csv')));
S=withtol(BA_winter,seconds(1));
BA_winter=boatsAbsent(S,:);
writetable(timetable2table(BA_winter),...
fullfile(folder,strcat(site,'_PSD_boatsAbsent_Winter.csv')));
S=withtol(BP_spring,seconds(1));
BP_spring=boatsPresent(S,:); %subset the rows before writing, as for boatsAbsent
writetable(timetable2table(BP_spring),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Spring.csv')));
S=withtol(BP_summer,seconds(1));
BP_summer=boatsPresent(S,:);
writetable(timetable2table(BP_summer),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Summer.csv')));
S=withtol(BP_autumn,seconds(1));
BP_autumn=boatsPresent(S,:);
writetable(timetable2table(BP_autumn),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Autumn.csv')));
S=withtol(BP_winter,seconds(1));
BP_winter=boatsPresent(S,:);
writetable(timetable2table(BP_winter),...
fullfile(folder,strcat(site,'_PSD_boatsPresent_Winter.csv')));
Comments: 3
Stephen23
29 Sep 2020
Meta-data is data, and data does not belong in variable names! Sticking meta-data into variable names, e.g. the season names:
BA_spring, BA_summer, BA_autumn, BA_winter
means that you force yourself into writing slow, inefficient code or doing lots of copy-and-paste. Rik correctly recommends that you put all of your data in arrays, rather than splitting it into separate variables.
Accepted Answer
Rik
29 Sep 2020
Whenever you find yourself copy-pasting code in MATLAB, you should consider using an array.
seasons={'Spring','Summer','Autumn','Winter'};
boatsPresent_t=boatsPresent.Var1; %time variables
boatsAbsent_t=boatsAbsent.Var1; %time variables
BP=cell(1,4);BA=cell(1,4);
[BP{:}]=indexSeasons(boatsPresent_t); %index times into seasons
[BA{:}]=indexSeasons(boatsAbsent_t); %index times into seasons
for n=1:numel(seasons)
    S=withtol(BP{n},seconds(1));
    BP_part=boatsPresent(S,:);
    writetable(timetable2table(BP_part),...
        fullfile(folder,strcat(site,'_PSD_boatsPresent_',seasons{n},'.csv')));
    S=withtol(BA{n},seconds(1));
    BA_part=boatsAbsent(S,:);
    writetable(timetable2table(BA_part),...
        fullfile(folder,strcat(site,'_PSD_boatsAbsent_',seasons{n},'.csv')));
end
If you have more states than just present and absent, you should consider putting those states in an array as well, so you can use it to generate logical indices.
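As a rough sketch of that idea (not the code from the question): suppose every row of a single timetable carries a state label, and the season follows from the calendar month. The variable names `data`, `psd` and `state`, the third state 'passing', the month-based season rule, and the values of `folder` and `site` are all assumptions made up for illustration; `indexSeasons` from the question could replace the season rule.

```matlab
% Hypothetical example data: one state label per row of a timetable.
t     = datetime(2020,1:12,15)';                     % one timestamp per month
psd   = (1:12)';                                     % placeholder measurement
state = categorical(repmat({'absent';'present';'passing'},4,1));
data  = timetable(psd,state,'RowTimes',t);

% Hypothetical season rule (by calendar month); swap in indexSeasons if needed.
seasons       = {'Winter','Spring','Summer','Autumn'};
seasonOfMonth = [1 1 2 2 2 3 3 3 4 4 4 1];           % Jan..Dec -> index into seasons
seasonIdx     = seasonOfMonth(month(data.Properties.RowTimes))';

folder = tempdir;      % assumption: output folder
site   = 'siteA';      % assumption: site label

states = categories(data.state);                     % all states, as an array
counts = zeros(numel(states),numel(seasons));        % rows per state/season
for s = 1:numel(states)
    for n = 1:numel(seasons)
        L = data.state==states{s} & seasonIdx==n;    % logical row index
        counts(s,n) = nnz(L);
        if ~any(L), continue; end                    % nothing to write
        fname = sprintf('%s_PSD_boats_%s_%s.csv',site,states{s},seasons{n});
        writetable(timetable2table(data(L,:)),fullfile(folder,fname));
    end
end
```

Adding another state or season then only means adding a label to the data; the loop itself does not change.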
Comments: 5
Rik
30 Sep 2020
If you want to have a dynamic field name you need to use this syntax:
name='foo';
S.(name)='bar';
But what is wrong with the code you posted? You shouldn't be storing data (i.e. the season) in a variable name. If you do, that will cause the same issue every time you want to use the variables.
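If you do want to address the results by name, a minimal sketch of the dynamic field name syntax applied to the seasons could look like this. The struct `result` and the placeholder `values` are made up for illustration; in practice the values would be the subsetted timetables.

```matlab
seasons = {'Spring','Summer','Autumn','Winter'};
values  = {1:3, 4:6, 7:9, 10:12};      % placeholder data, one entry per season

result = struct();
for n = 1:numel(seasons)
    % the season name stays data (a char vector), it is not typed into the code
    result.(seasons{n}) = values{n};
end

name = 'Autumn';
autumnData = result.(name);             % look up a season by its name
```

The season names still live in one cell array, so the loop does not need to change when a name is added or renamed.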