
Read every nth line of a text file with different delimiters across a single line

I have large text files (at least one million lines each), formatted like the sample lines shown after the code, with forward slashes in the date, colons in the time, and tabs between the data values.
I am currently using the following code to read in the files, where I use fgets to skip over N lines. However, that does not seem to speed up the process very much. Is there a faster way to read in this type of file, which has a few different delimiters across each line?
numRows = 0;                 % Counter for each row read
Time = zeros(1e6, 1);        % Preallocate
PTCM = zeros(1e6, 4);        % Preallocate
SkipLines = input('Enter # of lines to skip: ');
fid = fopen('laserBPTC.txt', 'rt');   % Open to read as text
while 1                      % Loop through each line of file
    dt = fscanf(fid, '%d/%d/%d %d:%d:%d ', 6);   % Read in date/time
    if isempty(dt), break, end                   % EOF, so break out of read loop
    newTime = datenum(dt(1), dt(2), dt(3), dt(4), dt(5), dt(6));   % Convert to MATLAB date/time
    data = fscanf(fid, '%lf', 4);                % Read in 4 data items
    numRows = numRows + 1;                       % New row
    Time(numRows, 1) = newTime;                  % Store time
    PTCM(numRows, 1:4) = data(1:4);              % Store data
    for skipLine = 0:SkipLines                   % Skip lines
        [~] = fgets(fid);
    end
end
Time = Time(1:numRows);      % Truncate unused preallocated rows
PTCM = PTCM(1:numRows, :);   % Truncate unused preallocated rows
fclose(fid);                 % Close file
Sample lines from the file:
2019/01/09 16:31:44 783.9 21.84 2.088 4.835
2019/01/09 16:31:48 782.5 21.87 2.084 4.835
2019/01/09 16:31:52 782.5 21.71 2.084 4.835
2019/01/09 16:31:56 778.5 21.73 2.092 4.835
2019/01/09 16:32:00 779.8 21.82 2.088 4.835
2019/01/09 16:32:04 777.1 21.66 2.088 4.835

Accepted Answer

Bob Thompson on 3 Apr 2019
You might have an easier time using textscan instead of fgets or dlmread. You have to specify the format of a row, but that should let you use just the white space as a delimiter. Below is an example format for the sample you gave:
fid = fopen('mytextfile.txt', 'rt');   % textscan needs a file ID, not a file name
data = textscan(fid, '%{yyyy/MM/dd HH:mm:ss}D %f %f %f %f'); fclose(fid);
3 Comments
Rory Uibel on 3 Apr 2019
Bob,
Thanks for the suggestion. I used curly braces, '%{yyyy/MM/dd HH:mm:ss}D ...', for the date input, and this sped up the read drastically, by well over 20x. Note that this actually reads in all of the lines, while the question was about every Nth line. However, the bonus is that textscan is so much faster that one can simply read in all of the lines.
Best -Rory
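For anyone who still wants only every Nth row, one option is to read everything with textscan and then thin the result by indexing. The following is a minimal sketch, not a tested solution for the full file: it parses a few sample rows from a character vector instead of a file, uses the plain '%d/%d/%d %d:%d:%d' format from this thread so it does not depend on the '%{ }D' option, and the names sample, N, keep, TimeN and PTCMN are made up for illustration. In the question's terms, N corresponds to SkipLines + 1.

```matlab
% Six sample rows in the layout from the question (tabs between values).
sample = [ ...
    sprintf('2019/01/09 16:31:44\t783.9\t21.84\t2.088\t4.835\n'), ...
    sprintf('2019/01/09 16:31:48\t782.5\t21.87\t2.084\t4.835\n'), ...
    sprintf('2019/01/09 16:31:52\t782.5\t21.71\t2.084\t4.835\n'), ...
    sprintf('2019/01/09 16:31:56\t778.5\t21.73\t2.092\t4.835\n'), ...
    sprintf('2019/01/09 16:32:00\t779.8\t21.82\t2.088\t4.835\n'), ...
    sprintf('2019/01/09 16:32:04\t777.1\t21.66\t2.088\t4.835\n')];

N = 3;   % Keep every 3rd row (hypothetical choice)

% Parse all rows at once; textscan also accepts a file ID from fopen here.
C = textscan(sample, '%d/%d/%d %d:%d:%d %f %f %f %f');

Time = datenum(double([C{1:6}]));   % One serial date number per row
PTCM = [C{7:10}];                   % rows-by-4 matrix of the data columns

% Thin the parsed data by indexing instead of skipping lines while reading.
keep  = 1:N:numel(Time);
TimeN = Time(keep);
PTCMN = PTCM(keep, :);
```

To use this on the real file, pass the file ID from fopen to textscan in place of sample and call fclose afterwards. The whole file is still parsed, so this only pays off when the textscan read is fast enough, as it was in my case.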
Rory Uibel on 4 Apr 2019
It looks like the '%{ }D' option for textscan became available in R2013b. I was setting this up with R2013a, so I used the following variation to read in the file.
fid = fopen(file, 'rt'); % open to read as text
allDATA=textscan(fid,'%d/%d/%d %d:%d:%d %f %f %f %f');
fclose(fid); %Close file
Time = datenum(double([allDATA{1:6}]));   % Columns 1-6: year, month, day, hour, minute, second
Data = [allDATA{7:10}];                   % Columns 7-10: the four data values (already double)
