Loop through large number of files and access data outside the loop
I have to run this code over 50,000 files, which total around 250 GB, so I am also looking to improve performance.
Answers (2)
dipak sanap
23 Nov 2015
Edited: Walter Roberson
23 Nov 2015
Walter Roberson
23 Nov 2015
numfiles = 50000;
A = cell(numfiles, 1);   % preallocate one cell per file
for i = 1:numfiles
    f = fopen(sprintf('F%d', i), 'r');   % file names are F1, F2, and so on
    % Read columns 1-3; %*d parses column 4 but does not return it
    A{i} = fscanf(f, '%d %d %f %*d', [3, inf]).';
    fclose(f);
end
You do not use column 4, so tell fscanf that it exists but that no value is to be returned for it; that is what the %*d in the format does. With this done you do not need the temporary variable X.
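To see what the skipped field does without touching any files, here is a minimal sketch that applies the same format to a string with sscanf. The sample text is made up for illustration and only assumes four whitespace-separated columns per row, as in the question.

```matlab
% Hypothetical file contents: four columns per row, the last one unwanted.
txt = sprintf('1 2 3.5 9\n4 5 6.5 8\n');

% %*d parses the fourth integer but does not return it, so only three
% values per row come back. [3, inf] fills a 3-by-N matrix column by column,
% and the transpose turns it into one row per line of input.
M = sscanf(txt, '%d %d %f %*d', [3, inf]).';

disp(M)
```

M should come out as a 2-by-3 matrix holding the first three columns only, which is the same shape each A{i} gets in the loop.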
Comments: 3
dipak sanap
23 Nov 2015
Walter Roberson
23 Nov 2015
In your existing code, you were creating X temporarily and using A outside the loop. Now you say that you got rid of A and want to use X outside the loop. Your existing code does not use X outside the loop, only A.
Walter Roberson
23 Nov 2015
U = unique( cell2mat(cellfun(@(M) M(:,1:2), A(:), 'Uniform', 0)), 'rows');
However, I would suggest
U = unique( cell2mat(cellfun(@(M) unique(M(:,1:2), 'rows'), A(:), 'Uniform', 0)), 'rows');
If you know that those rows are already unique within each file, you might as well use the first command. If they are not unique within each file, you can save a lot of memory by taking the unique rows per file before merging them all together.
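A small sketch of the difference, using a toy two-element A in place of the 50,000 files. The values are invented for illustration; only the layout (columns 1 and 2 form the pair of interest) follows the thread.

```matlab
% Toy stand-in for the per-file cell array A built in the loop.
A = cell(2, 1);
A{1} = [1 2 0.5; 1 2 0.7; 3 4 0.1];   % pair (1,2) appears twice in this "file"
A{2} = [3 4 0.9; 5 6 0.2];

% Approach 1: stack every row from every file, then take unique rows once.
allPairs = cell2mat(cellfun(@(M) M(:,1:2), A(:), 'UniformOutput', false));
U_direct = unique(allPairs, 'rows');

% Approach 2: reduce each file to its unique rows first, then merge.
perFile = cellfun(@(M) unique(M(:,1:2), 'rows'), A(:), 'UniformOutput', false);
stacked = cell2mat(perFile);
U_perFile = unique(stacked, 'rows');

fprintf('%d rows stacked directly, %d after per-file unique\n', ...
    size(allPairs, 1), size(stacked, 1));
```

Both approaches return the same set of rows; the second just builds a smaller intermediate matrix whenever files contain repeated pairs, which is where the memory saving comes from.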