fprintf to txt is too slow; any way to accelerate it?

Yu Li on 5 Mar 2017
Edited: dpb on 5 Mar 2017
Hi,
I have data sets whose columns have different lengths, such as
a=ones(100,2); b=ones(50,2);
I want to append them side by side and export them to a .txt file, so what I did is:
  1. Construct a new matrix of size (max number of rows of a and b, sum of the numbers of columns of a and b); here that is (100,4).
  2. For each row, find the elements equal to '1' and build a new row in which each of those elements is replaced by ' ,', then write the row to the .txt file using fprintf. However, I found that this is too slow.
Below is my test code; I have also attached the test file A.
load A   % loads the test matrix A from the attached file
tic
fid = fopen('test.txt','w');
for i = 1:size(A,1)
    str = {''};
    for j = 1:size(A,2)
        if A(i,j) == 1e5
            str = strcat(str,{' ,'});   % identifier value: write a blank field
        else
            str = strcat(str,sprintf('%0.5e',A(i,j)),',');
        end
    end
    fprintf(fid,'%s\n',str{1});
end
fclose(fid);
toc
tic
dlmwrite('test_dlm.txt',A)
toc
Here 1e5 is the identifier value that needs to be written as ' ,'. The fprintf loop takes about 129 seconds, while dlmwrite takes only 12 seconds.
Thanks!
Li
1 Comment
dpb on 5 Mar 2017
If you profile the code, you'll find that all the time is spent in the strcat operations, not in fprintf.
It's unclear to me what you really want to do; if the point is to append the two datasets, why not just
a=ones(100,2); b=ones(50,2);
csvwrite('aplusb.csv',[a;b])
and be done with it, instead of creating a file that's 100x4 elements rather than 150x2? If the point is to put only values in the accumulated file that are somehow unique to the two arrays, then do that processing first on the combined array and then write.
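The cost of the strcat calls can be checked on a single row without the full test file. The following is only a rough sketch: the row length of 500 and the variable names are made up for illustration, and the timings will vary from machine to machine. It builds the same text once by growing a cell string with strcat, as in the question, and once with a single sprintf call that recycles the format over every element.

```matlab
row = rand(1, 500);   % hypothetical test row, not the poster's data

% (a) grow the string one element at a time with strcat, as in the question
tic
s1 = {''};
for j = 1:numel(row)
    s1 = strcat(s1, sprintf('%0.5e', row(j)), ',');
end
tStrcat = toc;

% (b) let sprintf recycle the format over the whole row in one call
tic
s2 = sprintf('%0.5e,', row);
tSprintf = toc;

fprintf('strcat loop: %.4f s, single sprintf: %.4f s\n', tStrcat, tSprintf);
isequal(s1{1}, s2)   % both approaches produce the same text
```

If the single sprintf call turns out to be much faster here, that supports the profiler's finding that the string building, not the file write, dominates the run time.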


Accepted Answer

dpb on 5 Mar 2017
Edited: dpb on 5 Mar 2017
A=[a;b];               % append to one array for convenience
A(A==badValue)=nan;    % replace the bad values with NaN (badValue is the identifier value, e.g. 1e5 in the question)
csvwrite('FixedUpCombined.csv',A)   % write to csv file with NaN as the missing-value indicator
This is much simpler than creating a specific format to write an empty field; let NaN serve as the placeholder instead. It is also unequivocal about which data are bad, whereas an empty field is not: per the documentation, "csvread fills empty delimited fields with zero."
Note, however, that if you're adamant about using empty delimited fields, the way to go about it is to find the locations of the bad values in each row and use those locations to build a format string with the appropriate number of fields of each type, then write the record using that format string. repmat is exceedingly useful in these machinations, since unfortunately the C-like formatting strings cannot accept a repeat count.
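One possible reading of that format-string approach is sketched below. It is not the answerer's own code: the small matrix A, the file name test_fmt.txt and the variable names are assumptions made for illustration, and 1e5 is taken from the question as the value marking a bad entry. For each row, repmat creates one numeric field specifier per column, the specifiers at the bad positions are swapped for a blank field, and the row is written with a single fprintf call.

```matlab
badValue = 1e5;                      % identifier value, as in the question
A = [1 2 badValue 4;                 % hypothetical sample data
     badValue badValue 5 6;
     7 8 9 10];

fid = fopen('test_fmt.txt', 'w');    % hypothetical output file name
for i = 1:size(A, 1)
    row   = A(i, :);
    isBad = (row == badValue);       % locations of the bad values in this row
    % one format piece per column: a numeric field by default ...
    pieces = repmat({'%0.5e,'}, 1, numel(row));
    % ... and a blank field wherever the value is bad
    pieces(isBad) = {' ,'};
    fmt = [pieces{:} '\n'];          % format string for this record
    fprintf(fid, fmt, row(~isBad));  % only the good values are passed as data
end
fclose(fid);
```

This keeps the layout of the original loop (every field followed by a comma, bad values written as ' ,') while replacing the per-element strcat calls with one format string and one fprintf per row. Rows that consist entirely of bad values are not covered by the sample data and would be worth checking separately.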
