parfor loop suddenly fails with the error "unable to read file"

Xingwang Yong, 12 Dec 2020
Commented: Xingwang Yong, 14 Dec 2020
parfor k = 1:100000
    % something else
    tmpStruct = load(filename);
    % something else
end
I have three scripts like the one above, and I am running them on three different nodes of a cluster.
After some iterations, one job fails with the error "unable to read file, no such file or directory". This is confusing, since the file does exist and the other two jobs can read it.
I suspected this was due to the limited number of file handles on the Linux system, but I don't understand how the load() function is related to file handles.
And if they are related, how can I avoid this "limited file handle" problem? I tried increasing the file handle limit on Linux, but the limit still seems to be exceeded whenever I run several jobs together.
By the way, I am sure that I did not use fopen() in my script.
  Comments: 9
Mario Malic, 13 Dec 2020
Edited: Mario Malic, 13 Dec 2020
Search this forum for your issue; I have seen some comments that network drives can cause read access problems. You can try creating a copy of the file for each node, although that won't completely eliminate the problem, since each node has several workers that try to access the file. That's why I recommend avoiding load, at least within the parfor loop.
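One way to follow this advice, as a minimal sketch: read the file once before the loop so that no iteration touches the file system. The file name, the variable `data`, and the per-iteration computation below are hypothetical stand-ins, since the original post does not show them, and the sketch assumes every iteration needs the same file contents.

```matlab
% Hypothetical stand-in file; the original post does not show the real one.
filename = fullfile(tempdir, 'example_data.mat');
data = rand(1, 10);
save(filename, 'data');            % create the file so the sketch is self-contained

% Read the file once on the client instead of once per iteration.
tmpStruct = load(filename);

n = 100;                           % the original loop ran to 100000
result = zeros(1, n);
parfor k = 1:n
    % tmpStruct is a broadcast variable: it is sent to the workers,
    % so the loop body performs no file access at all.
    result(k) = sum(tmpStruct.data) + k;   % placeholder for the real work
end
```

Here tmpStruct is a broadcast variable, which parfor transfers to the workers rather than having each worker read the file itself. If the same data feeds several parfor loops, wrapping it in parallel.pool.Constant may avoid resending it for every loop.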
Xingwang Yong, 14 Dec 2020
Yes, using load() is time-consuming and error-prone.

Answers (0)

Release

R2018a