Why does the number of workers decrease while running parfor?

한범 on 20 Jun 2022
Commented: Edric Ellis on 21 Jun 2022
I am using parallel computing to speed up a calculation.
At the start, I created the pool with mypool = parpool('local',3); and ran the code. I expected the job to take a little more than 50 hours.
However, it has now been running for 3 days. When I checked why, I found "Number of workers: 1". (I saw this by hovering the cursor over the four green vertical bars at the bottom-left corner of the MATLAB window.) I am sure it was 3 at the beginning.
Why does this happen? If the number of workers decreases automatically, parallel computing is of no use.
Could it be related to memory? The job deals with huge files, and when I tried to use more than 3 workers, the memory allocation error below occurred.
Unexpected Standard exception from MEX file.
What() is:bad allocation

Answers (1)

Edric Ellis on 20 Jun 2022
The number of workers in a local parallel pool decreases like this only when one of the worker processes terminates (i.e. crashes with a segmentation fault or similar).
parfor will try to continue even after workers crash, by running the loop iterations on remaining workers. You should get some indication that this is happening in the command window.
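As a rough way to confirm that workers have been lost, the pool size can be compared with the size that was requested. This is only a sketch: it assumes the pool was opened with 3 workers as in the question, the variable name expectedWorkers is made up for illustration, and it can only run when the client is idle (for example between parfor loops or after the run).

```matlab
% Sketch: compare the current pool size with the size originally requested.
expectedWorkers = 3;              % assumption: pool was opened with parpool('local',3)

pool = gcp('nocreate');           % returns [] if no pool is running; never starts one
if isempty(pool)
    warning('No parallel pool is currently running.');
elseif pool.NumWorkers < expectedWorkers
    warning('Pool has %d of %d workers; one or more workers have probably terminated.', ...
        pool.NumWorkers, expectedWorkers);
else
    fprintf('Pool still has all %d workers.\n', pool.NumWorkers);
end
```

Logging this between stages of a long job gives an approximate time at which a worker disappeared, which may help match it against memory usage.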
You could check for crash dump files in the directory returned by this command:
c = parcluster("local");
c.JobStorageLocation
There will be a bunch of "Job##" directories there. Look for the most recent one; that will probably be the one corresponding to your currently running parallel pool.
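Something along these lines should pick out the most recently modified "Job##" directory and list its files with their sizes. It is a sketch that assumes the "local" profile used above and treats the modification time as a proxy for "most recent"; the variable names are illustrative.

```matlab
% Sketch: find the newest Job## directory in the job storage location
% and list the files inside it with their sizes.
c = parcluster("local");
jobDirs = dir(fullfile(c.JobStorageLocation, 'Job*'));
jobDirs = jobDirs([jobDirs.isdir]);            % keep directories only

if isempty(jobDirs)
    disp('No Job## directories found.');
else
    [~, newest] = max([jobDirs.datenum]);      % most recently modified directory
    newestDir = fullfile(jobDirs(newest).folder, jobDirs(newest).name);
    fprintf('Most recent job directory: %s\n', newestDir);

    files = dir(newestDir);
    files = files(~[files.isdir]);             % skip '.', '..' and subdirectories
    for k = 1:numel(files)
        fprintf('%12d bytes  %s\n', files(k).bytes, files(k).name);
    end
end
```

Files that are 0 bytes will show up immediately in the listing, so it is easy to see whether any log actually contains output.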
  Comments: 2
한범 on 21 Jun 2022
I found the "Job##" directories and looked into the most recent one, but there seems to be no useful information.
There are some '*.log' files in the directory, but they are all empty (0 bytes).
I also found 'Task#.common/in/out/state.mat' files, but they contain no crash information either.
Edric Ellis on 21 Jun 2022
You can cause workers to emit diagnostic logging information by running
setenv('MDCE_DEBUG', 'true')
before creating the pool. I'm not sure if it will help though...
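Since the variable has to be set before the workers start, the order of operations matters. A minimal sketch, assuming the same 3-worker "local" pool as in the question and that any existing pool can safely be closed first:

```matlab
% Sketch: restart the pool with worker diagnostic logging enabled.
delete(gcp('nocreate'));           % close the existing pool, if any (no-op when none is open)
setenv('MDCE_DEBUG', 'true');      % must be set before the pool is created
mypool = parpool('local', 3);      % workers start with the variable already set
```

After the next run, the '*.log' files in the newest "Job##" directory are the place to look for any additional output.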

Release

R2021a
