Parpool Initialization Stuck in Timeout While Loop

Jeremy on 16 Sep 2022
Commented: Edric Ellis on 20 Sep 2022
I was able to run MATLAB parfor commands without issue one day ago. Today, MATLAB gets stuck on "Starting parallel pool (parpool) using the 'local' profile ..." I traced the MATLAB functions that are called when a parpool starts and found that execution never leaves the while loop in the nested function below, which lives in JavaBackedSession. Any troubleshooting suggestions or fixes for this issue would be appreciated. I have already spent several hours reading about parfor issues online, and the four troubleshooting steps listed below did not work.
Troubleshooting that did not work:
  1. Entering distcomp.feature( 'LocalUseMpiexec', true )
  2. Entering distcomp.feature( 'LocalUseMpiexec', false)
  3. Deleting the local_cluster_jobs folder and restarting MATLAB
  4. Entering poolobj = gcp('nocreate'); delete(poolobj);
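Steps 3 and 4 above can also be done from inside MATLAB in one script. This is a minimal sketch, assuming the default 'local' profile named in the startup message. It uses only documented Parallel Computing Toolbox calls (gcp, parcluster, the Jobs and JobStorageLocation properties, delete). Deleting the stored jobs only roughly corresponds to deleting the local_cluster_jobs folder by hand, and it permanently removes every stored job for that profile.

```matlab
% Step 4: close any existing pool without creating a new one.
poolobj = gcp('nocreate');
delete(poolobj);

% Step 3, done from inside MATLAB: find where the profile stores its
% jobs and delete any jobs left over from earlier sessions.
c = parcluster('local');
fprintf('Job storage location: %s\n', c.JobStorageLocation);
fprintf('Stored jobs before cleanup: %d\n', numel(c.Jobs));
delete(c.Jobs);   % WARNING: removes all stored jobs for this profile
fprintf('Stored jobs after cleanup: %d\n', numel(c.Jobs));
```

If the pool still hangs after this cleanup, stale job data is probably not the cause.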
C:\Software\Mathworks\Matlab_All_Products_R2021b\toolbox\parallel\cluster\+parallel\+internal\+pool\JavaBackedSession.m
function session = waitForSessionCreation(~, sessionFuture, connectionCounter, ...
                                          checkFcn)
    % Block until the session has been created - which completes only when all the
    % connections are available.
    gotSession = false;
    session = [];
    previouslyConnectedTo = 0;
    while ~gotSession
        % This throws an appropriate error in the case where things go wrong.
        [gotSession, session] = parallel.internal.getJavaFutureResult(...
            sessionFuture, 1, java.util.concurrent.TimeUnit.SECONDS);
        if gotSession
            return
        end
        % If we get here, we have no session. Let's check to see how things are getting
        % on using the injected checkFcn - this might throw an error if things are bad.
        checkFcn();
        currentlyConnectedTo = double(connectionCounter.get());
        if currentlyConnectedTo > previouslyConnectedTo
            dctSchedulerMessage(2, 'Currently connected to: %d', currentlyConnectedTo);
            previouslyConnectedTo = currentlyConnectedTo;
        end
    end
end
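The loop above exits only when the future returns a session or when getJavaFutureResult or checkFcn throws, so if neither happens it keeps polling once per second indefinitely. For comparison, a bounded version of the same polling pattern is sketched below. This is an illustration only: pollWithDeadline, pollFcn and timeoutSeconds are hypothetical names, not part of MATLAB, and the sketch is not a patch for JavaBackedSession.m.

```matlab
% Usage: a poll function that never succeeds, to exercise the timeout path.
neverReady = @() deal(false, []);
try
    pollWithDeadline(neverReady, 0.5);
catch err
    disp(err.identifier);
end

function result = pollWithDeadline(pollFcn, timeoutSeconds)
    % Hypothetical helper: call pollFcn repeatedly until it reports
    % success or until timeoutSeconds have elapsed.
    % pollFcn must return two outputs: [done, value].
    startTime = tic;
    while true
        [done, result] = pollFcn();
        if done
            return
        end
        if toc(startTime) > timeoutSeconds
            error('pollWithDeadline:timeout', ...
                'No result after %g seconds.', timeoutSeconds);
        end
        pause(0.1);   % short sleep so the loop does not spin the CPU
    end
end
```

A deadline like this turns a silent hang into an error that can be reported, at the cost of having to pick a timeout long enough for slow but healthy startups.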
  2 Comments
Jeremy on 16 Sep 2022
5. I also tried the "Validate" option in the Cluster Profile Manager, but validation got stuck at "Job test (createJob)".
Edric Ellis on 20 Sep 2022
If validation got stuck at the "createJob" stage, then that might well mean that for some reason worker processes aren't launching correctly. I suggest contacting MathWorks support directly.
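Before contacting support, it may help to reproduce the stalled stage outside the validation tool and record what state the job reaches. The following is a minimal sketch, assuming the 'local' profile; the 60-second limit is an arbitrary choice, and the calls used (parcluster, createJob, createTask, submit, wait with a timeout, fetchOutputs, cancel, delete) are standard Parallel Computing Toolbox functions.

```matlab
c = parcluster('local');              % same profile named in the question
job = createJob(c);                   % the stage where validation stalled
createTask(job, @plus, 1, {1, 2});    % trivial task: compute 1 + 2
submit(job);

% Wait at most 60 seconds (arbitrary) instead of blocking forever.
finished = wait(job, 'finished', 60);
task = job.Tasks(1);
if finished && isempty(task.ErrorMessage)
    out = fetchOutputs(job);
    fprintf('A worker ran the task, result: %d\n', out{1});
elseif finished
    fprintf('Task finished with an error: %s\n', task.ErrorMessage);
else
    fprintf('Job still in state "%s" after 60 s.\n', job.State);
    fprintf('Task state: %s\n', task.State);
    cancel(job);                      % stop the job before removing it
end
delete(job);                          % clean up the test job
```

A job that never leaves the queued or running state would be consistent with workers failing to launch, and the printed states are useful details to include in a support request.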


Answers (0)

Release: R2021b
