separate excel file based one single column in matlab

조회 수: 1 (최근 30일)
Daphne Mariaravi
Daphne Mariaravi 2017년 6월 25일
답변: Guillaume 2017년 6월 28일
I have a multiple .csv file which has time series data. I have attached a sample file. Is it possible to find every 5 minute interval in the HH:MM:SS column, cut the entire data until that time and save it as a separate file with the header row.? Likewise it has to do until the entire time series up to 5 hrs or so. Any suggestions on how to do this?
  댓글 수: 10
Rik
Rik 2017년 6월 27일
doc fopen will give you an idea of what 'first.txt' means and where you need to put them in your code.
Joshua
Joshua 2017년 6월 28일
Daphne,
I apologize as formatted my response wrong at first. I fixed the post so the code is all in the correct order. Also, first.txt was just the name of a random file, but in retrospect that name does not make any sense. I changed it to be filename.txt where you can put anything for 'filename'. Also, 'w' indicates that you give MATLAB write access to the file as opposed to read access only.

댓글을 달려면 로그인하십시오.

답변 (1개)

Guillaume
Guillaume 2017년 6월 28일
I don't know what is all this conversation about fopen which is probably the worst way of parsing your data. Using modern tools such as readtable makes a lot more sense.
alldata = readtable('test.csv'); %See note 2
timestamp = datetime(alldata.HH_MM_SS, 'InputFormat', 'HH:mm.s', 'Format', 'HH:mm:ss'); %see note 1
group = discretize(timestamp, minutes(5));
splitdata = splitapply(@(rows) {alldata(rows, :)}, (1:height(alldata))', group);
for fileidx = 1:numel(splitdata)
writetable(splitdata{fileidx}, sprintf('split%02d.csv', fileidx)); %see note 3
end
Note 1: Your header implies that the column format is HH:MM:SS, yet the data in the column is of the form XX:YY.z, so it's really not clear if the format is actually hours:minutes.seconds or minutes:seconds.fractionofseconds. I assumed the first in the above. Adjust the 'InputFormat' if necessary.
Note 2: You can specify column format in the readtable call to directly read the HH:MM:SS column as datetime. I've not bothered here.
Note 3: readtable will convert your header into valid variable names, slightly altering your headers. These slightly altered headers is what will be saved in the split files. If the original headers are absolutely required, it can be done with a slightly more complex for loop, but relying on the undocumented fact that the table VariableDescription property holds the original name of the columns:
columnnames = regexp(alldata.Properties.VariableDescriptions, '(?<='')[^'']+(?='')', 'match', 'once');
notmodified = cellfun(@isempty, columnnames);
columnnames(notmodified) = alldata.Properties.VariableNames(notmodified);
for fileidx = 1:numel(splitdata)
xlswrite(sprintf('split%02d.csv', fileidx), [columnnames; table2cell(splitdata{fileidx})]);
end
As said, the fact that the original column names are saved in a property is not documented so this may only work in some versions (tested with R2017a)

카테고리

Help CenterFile Exchange에서 Spreadsheets에 대해 자세히 알아보기

태그

아직 태그를 입력하지 않았습니다.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by