Boost performance when importing a txt file

MZ123, 28 Apr 2020
Good morning everyone,
I'm importing raw data from an instrument that writes to a file called "file00.txt" and, reading after reading, keeps appending new lines to it. I process the file offline, after all the measurements have been taken. The file size is normally around 7.5 MB; a reduced version is attached. Every single measurement takes 17 lines organized in a fixed layout. To read the file and organize the data I wrote the following code, which does its job, but I'd like to push it for better performance. For the attached file MATLAB reports "Elapsed time is 1.127454 seconds". Can I reduce the elapsed time by an order of magnitude? Thanks for your help.
fid = fopen('file00.txt');
TotRow = 0;             % Line counter
tline = fgetl(fid);     % Read the first line of the file
while ischar(tline)     % Count the total number of lines in the file
    tline = fgetl(fid);
    TotRow = TotRow + 1;
end
nosm = TotRow/17;       % Number of single measurements (17 lines per measurement)
D = cell(nosm,219);     % Preallocate the output cell array
tic
frewind(fid);           % Reset the file pointer to the beginning
for i = 1 : nosm
    % The 17 lines of a single measurement occupy 1370 bytes in the file
    fseek(fid,1370*(i-1),'bof');    % Move to the start of measurement i, 1370*(i-1) bytes from the beginning of the file
    CurrentLine = fgetl(fid);       % Get 1st line
    D{i,1} = CurrentLine(1:6);
    D{i,2} = CurrentLine(8);
    D{i,3} = CurrentLine(10:23);
    D{i,4} = CurrentLine(28:29);
    D{i,5} = CurrentLine(33:35);
    D{i,6} = CurrentLine(41);
    D{i,7} = CurrentLine(43:48);
    CurrentLine = fgetl(fid);       % Get 2nd line
    D{i,8} = CurrentLine(2:7);
    D{i,9} = CurrentLine(8:end);
    CurrentLine = fgetl(fid);       % Get 3rd line: a letter "b" followed by 10 hexadecimal values
    for j = 1 : 10
        D{i,9+j} = CurrentLine(2+6*(j-1):7+6*(j-1));
    end
    for j = 20 : 15 : 200           % Get lines 4 to 16: a letter followed by 15 hexadecimal values
        CurrentLine = fgetl(fid);
        for k = 0 : 14
            D{i,j+k} = CurrentLine(2+6*k:7+6*k);
        end
    end
    j = 215;                        % Column index of the first of the last 5 values
    CurrentLine = fgetl(fid);       % Get last line: a letter followed by 5 hexadecimal values
    for k = 0 : 4
        D{i,j+k} = CurrentLine(2+6*k:7+6*k);
    end
end
toc
fclose(fid);            % Release the file handle
Comments: 3
darova, 28 Apr 2020
But fgetl takes most of the time.
Run the profiler and look.
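A minimal sketch of what the profiler suggestion could look like in practice. It writes its own throwaway text file (the file name and contents are placeholders, not the instrument data), runs the same line-by-line fgetl pattern as in the question, and prints the functions that took the most time. Only the documented profile on, profile off and profile('info') calls are used.

```matlab
% Write a small throwaway text file so the sketch is self-contained.
% (Placeholder content; substitute file00.txt to profile the real import.)
tmpFile = [tempname '.txt'];
fid = fopen(tmpFile, 'w');
for n = 1:5000
    fprintf(fid, 'line %05d of a synthetic test file\n', n);
end
fclose(fid);

profile on                          % start collecting timing data
fid = fopen(tmpFile);
nLines = 0;
tline = fgetl(fid);
while ischar(tline)                 % same line-by-line pattern as in the question
    nLines = nLines + 1;
    tline = fgetl(fid);
end
fclose(fid);
profile off                         % stop collecting

stats = profile('info');            % profiling results as a struct
tbl = stats.FunctionTable;
[~, order] = sort([tbl.TotalTime], 'descend');
for n = order(1:min(5, numel(order)))   % print up to five most expensive functions
    fprintf('%-40s %8.4f s  (%d calls)\n', ...
        tbl(n).FunctionName, tbl(n).TotalTime, tbl(n).NumCalls);
end
delete(tmpFile);
```

Running profile viewer instead of reading profile('info') opens the same results in the interactive report, which is usually the quicker way to spot a hot line.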
MZ123, 28 Apr 2020
You're right! Thanks a lot.
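Since fgetl dominates the run time, one possible way to avoid it is to read the whole file with a single fread call and slice the fields by column. This is only a sketch under an assumption: every measurement must occupy exactly the same number of bytes, which the fseek(fid,1370*(i-1),'bof') call in the question implies. The two-line record layout, the field positions and the record length of 23 bytes below are hypothetical, chosen to keep the example self-contained; for the real file the record length would be 1370 and the column ranges would have to be worked out from the actual line lengths and line endings.

```matlab
% Build a tiny synthetic file: 3 records of exactly recLen bytes each.
% Hypothetical layout per record:
%   line 1: 6-char ID, a space, a 1-char flag          (8 chars + newline)
%   line 2: letter 'b' followed by two 6-char values   (13 chars + newline)
recs = {'ID0001 A', 'b0x00FF0x0A10'; ...
        'ID0002 B', 'b0x12340xABCD'; ...
        'ID0003 C', 'b0xFFFF0x0001'};
tmpFile = [tempname '.txt'];
fid = fopen(tmpFile, 'w');
for n = 1:size(recs, 1)
    fprintf(fid, '%s\n%s\n', recs{n, 1}, recs{n, 2});
end
fclose(fid);

recLen = 23;                         % bytes per record in this toy file (9 + 14)
info = dir(tmpFile);
nRec = info.bytes / recLen;          % record count from the file size, no line counting

% Read everything in one call; after the transpose each row is one record.
fid = fopen(tmpFile);
raw = char(fread(fid, [recLen, nRec], '*uint8')');
fclose(fid);

% Slice fixed column ranges for all records at once.
% Note: cellstr removes trailing whitespace from each extracted field.
D = cell(nRec, 4);
D(:, 1) = cellstr(raw(:, 1:6));      % ID on line 1
D(:, 2) = cellstr(raw(:, 8));        % flag on line 1
D(:, 3) = cellstr(raw(:, 11:16));    % first value on line 2 (byte 10 is the letter)
D(:, 4) = cellstr(raw(:, 17:22));    % second value on line 2
delete(tmpFile);
```

The design trade-off is that the column indices are absolute positions within the record rather than positions within a line, so each line's fields are offset by the combined length of the preceding lines plus their line terminators.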

Release: R2019a