Fast code to open and write separate 366 .bin files for each (row,col)
I am trying to speed up my code. I have an 18000*36000*366 matrix. I split the matrix into 18 bands of 1000*36000 each, written as .bin files located in 366 folders. I plan to run the same code 18 times, once per band. The problem is that it takes 3.5 hours to process a single 1*36000 row. It is very slow, and any help is appreciated. Below is my code:
for lat = 1:length(lat_band) % length of lat_band = 1000
    for lon = 1:36000
        if (mask(lat,lon)~=9) % if ocean then skip
            data_1km = data; % matrix of size [366,1];
        else
            data_1km = NaN(366,1);
        end
        tmp = 'path';
        tic
        for day = 1:366
            fwrite(fopen(sprintf('%s%d', tmp, day, '/band_1.bin'), 'a'), data_1km(day),'double');
        end
        fclose('all');
        clear data_1km day
        toc
    end
end
Accepted Answer
Walter Roberson
24 Jan 2021
That code fopen()'s 366 files inside the loop, which can consume all of the file handles. Better would be
for day = 1:366
    % open in append mode, write one value, then release the handle
    fid = fopen(sprintf('%s%d%s', tmp, day, '/band_1.bin'), 'a');
    fwrite(fid, data_1km(day), 'double');
    fclose(fid);
end
If your system can handle it and your process is permitted to have 366 files open at once, then fopen() all of them before the for lat loop and index into the list of handles inside the loop. You are doing a lot of fopen() calls on the same files, and that is expensive.
11 Comments
I read one of your posts, https://www.mathworks.com/matlabcentral/answers/82106-create-a-empty-file-very-fast, which suggests that the slowdown comes from fclose rather than fopen. Can I do this instead? The speed doesn't improve. How else can I approach this problem?
for day = 1:366
    fid = fopen(sprintf('%s%d%s', tmp, day, '/band_1.bin'), 'a');
    fwrite(fid, data_1km(day), 'double');
    clear fid % clears the variable only; the file stays open
end
fclose('all')
I forgot about that.
However, faster is to not do all those fopen/fclose:
ND = 366;
tmp = 'path';
bnd = '/band_1.bin';
fids = zeros(1,ND);
% open all day files once, before any looping
for day = 1 : ND
    fids(day) = fopen(sprintf('%s%d%s', tmp, day, bnd), 'a');
end
for lat = 1:length(lat_band) % length of lat_band = 1000
    for lon = 1:36000
        if (mask(lat,lon)~=9) % if ocean then skip
            data_1km = data; % matrix of size [366,1];
        else
            data_1km = NaN(366,1);
        end
        tic
        for day = 1:ND
            fwrite(fids(day), data_1km(day), 'double');
        end
        toc
    end
end
fclose(fids)
... remembering that your system might not support more than 240 simultaneous files.
I will run this program on Linux, so I assume it can handle more than 240?
Walter Roberson
24 Jan 2021
Edited: Walter Roberson
24 Jan 2021
On Linux you are probably OK. I see hints that the default maximum is often 1024.
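Rather than relying on the typical default, the limit can be checked from inside MATLAB. This is a minimal sketch, assuming a Unix-like system where the shell started by system() provides the ulimit builtin; the variable names are illustrative. The reported number is the per-process soft limit, and MATLAB itself already holds some file handles, so the usable count is somewhat lower.

```matlab
% Sketch: ask the shell for the per-process open-file limit (Unix-like systems only).
limit_str = '';
if isunix
    [status, out] = system('ulimit -n');
    if status == 0
        limit_str = strtrim(out);   % a number as text, or 'unlimited'
        fprintf('Open-file soft limit: %s\n', limit_str);
    end
end
```

If the value is well above 366, keeping all day files open at once should fit within the limit.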
In that case, shouldn't fclose(fids) come after toc like this? Otherwise the number of open files exceeds 1024...
for lat = 1:length(lat_band) % length of lat_band = 1000
    for lon = 1:36000
        if (mask(lat,lon)~=9) % if ocean then skip
            data_1km = data; % matrix of size [366,1];
        else
            data_1km = NaN(366,1);
        end
        tic
        for day = 1:ND
            fwrite(fids(day), data_1km(day), 'double');
        end
        toc
        fclose(fids)
    end
end
When I do that, it gives me an error:
"Error using fclose
Invalid file identifier. Use fopen to generate a valid file identifier."
No, you are not doing fopen() inside the for lat loop. The code I posted in https://www.mathworks.com/matlabcentral/answers/725277-fast-code-to-open-and-write-separate-366-bin-files-for-each-row-col#comment_1281827 does all the fopen() once at the top, and then you should not fclose() until after all your looping is done.
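The "Invalid file identifier" error can be reproduced in isolation. The sketch below uses a hypothetical scratch file from tempname: once a handle has been closed, any later fclose() on it fails, which is what happens on the second (lat, lon) iteration when fclose(fids) sits inside the loop but the fopen() calls do not.

```matlab
% Sketch: closing an already-closed file identifier raises an error.
fname = tempname;               % hypothetical scratch file
fid = fopen(fname, 'w');
fclose(fid);                    % first close succeeds

second_close_failed = false;
try
    fclose(fid);                % fid no longer refers to an open file
catch err
    second_close_failed = true;
    disp(err.message);
end
delete(fname);                  % clean up the scratch file
```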
I understand what you did; that was a nice trick.
But I need to do this for every lat/lon pair. For every pair there are 366 bin files to write into. So I modified the code as below, still opening the 366 files first. What is going wrong here?
for lat = 1:length(lat_band)
    for lon = 1:36000
        ND = 366;
        tmp = 'path';
        bnd = '/band_1.bin';
        fids = zeros(1,ND);
        for day = 1 : ND
            fids(day) = fopen(sprintf('%s%d%s', tmp, day, bnd), 'a');
        end
        if (mask(lat,lon)~=9) % if ocean then skip
            data_1km = data; % matrix of size [366,1];
        else
            data_1km = NaN(366,1);
        end
        tic
        for day = 1:ND
            fwrite(fids(day), data_1km(day), 'double');
        end
        toc
        fclose(fids)
    end
end
Your code chooses the output file from the day number only, independent of lat and lon: tmp and '/band_1.bin' are constants that do not vary with lat and lon, so every (lat, lon) writes to the same 366 files.
For a given band, say lat_band = 1, lat runs over 1:1000, lon over 1:36000, and there are 366 days. The same file for each day stores the data for all lat/lon pairs in the band. I will run this code for 18 lat_bands. I had to split the data into bands because the MATLAB license on Linux expires after 24 hours. Is there a way to rerun the code when the 24-hour license wall time expires on Linux?
Thank you for your tip! I appreciate your feedback and time :)
I will run this code for 18 lat_bands.
What is the file name to be used for band #4 day #7 ? And could you confirm that the variable lat is the one that stores the current lat band number?
Can you tell me why running code as simple as the following takes 0.012 seconds? That is a long time, isn't it?
data = NaN(366,1);
tic
for day = 1:ND
    fwrite(fids(day), data(day), 'double');
end
toc
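To put 0.012 seconds in context: if every (lat, lon) costs about that much, one band of 1000 x 36000 pixels needs roughly 0.012 x 36,000,000 = 432,000 seconds, or about 120 hours. One possible way to cut the number of fwrite() calls, not something proposed in the thread, is to collect a full row of longitudes and write it with a single fwrite() per day. Because each day file receives values in lat-then-lon order, the resulting bytes are the same as with per-pixel writes. The sketch below is scaled down so it runs quickly; the sizes, the scratch folder, the flat file names and the stand-in data are all assumptions made for illustration.

```matlab
% Sketch only: names and sizes are illustrative, scaled down from the
% thread's 366 days x 36000 longitudes x 1000 latitudes.
ND   = 4;                       % number of day files (366 in the thread)
NLON = 6;                       % longitudes per row (36000 in the thread)
NLAT = 3;                       % latitude rows per band (1000 in the thread)
outdir = tempname;              % hypothetical scratch folder
mkdir(outdir);

% Open every day file once, before any looping.
fids = zeros(1, ND);
for day = 1:ND
    fids(day) = fopen(fullfile(outdir, sprintf('day%d_band_1.bin', day)), 'a');
end

for lat = 1:NLAT
    % Collect the whole row first: one column per longitude.
    row_buf = zeros(ND, NLON);
    for lon = 1:NLON
        row_buf(:, lon) = 1000*lat + 10*lon + (1:ND)';  % stand-in for data_1km
    end
    % One fwrite per day per row instead of one per (lat, lon).
    for day = 1:ND
        fwrite(fids(day), row_buf(day, :), 'double');
    end
end

% Close each handle only after all looping is done.
for day = 1:ND
    fclose(fids(day));
end
```

At full size the row buffer would hold 366 x 36000 doubles, about 105 MB, and the number of fwrite() calls per band would drop from 366 x 36,000,000 to 366 x 1000. Whether that gives a proportional speed-up would need to be measured on the target system.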
More Answers (0)
