SPMD Loop Iteration Job Not Submitting on MATLAB Distributed Computing Server

I have implemented SPMD on a loop that runs about 3000 iterations. I want to run the job on a MATLAB Distributed Computing Server (DCS) cluster to see whether I can get better execution speed. However, when I try to submit the job, I get the error: Unable to determine job requirements.
I suspect the reason is that I didn't specify particular values for the input arguments when creating the task. However, the values of the input variables change at every iteration, so I cannot supply fixed values.
How can I get my code to submit on the DCS cluster? My code is outlined below (OMP is a user-defined function written within the same MATLAB script):
c = parcluster('LegionProfile');
myJob = createCommunicatingJob(c, 'Type', 'SPMD');
num_workers = 24;
for iter = 1:MAX_ITER
    for col = 1:n
        task = createTask(myJob, @OMP, 1, {D, Y(:,col), k(col)});
        submit(myJob);
        wait(myJob);
        X(:,col) = fetchOutputs(myJob);
    end
end

Comments: 5

I'm a little confused here. A job of type 'SPMD' can have only a single task - so in your loop you should be doing something like:
for ...
    myJob = createCommunicatingJob(c, 'Type', 'SPMD');
    task = createTask(myJob, ...);
    submit(myJob);
    ...
end
I'm not sure quite where that error comes from. When you encounter it, could you please execute:
disp(MException.last)
and post the output?
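The one-task-per-job pattern described above can be sketched roughly as follows. This is only a sketch under several assumptions: D, Y, k, n and OMP are taken from the question and must already exist, OMP must be resolvable on the workers, the worker limit of 24 is borrowed from num_workers in the question, and it assumes every worker returns the same result so the first row of the outputs is used. It does not address the scheduler error itself.

```matlab
% Sketch only: D, Y, k, n and the function OMP come from the question.
c = parcluster('LegionProfile');
X = [];                                   % grown column by column below
for col = 1:n
    % A submitted job cannot be submitted again, so create a fresh one each time
    myJob = createCommunicatingJob(c, 'Type', 'SPMD');
    myJob.NumWorkersRange = [1 24];       % assumption: 24 from num_workers in the question
    % Exactly one task per SPMD job; every worker in the job runs this task
    createTask(myJob, @OMP, 1, {D, Y(:,col), k(col)});
    submit(myJob);
    wait(myJob);
    out = fetchOutputs(myJob);            % cell array of task outputs
    X(:,col) = out{1};                    % assumption: all workers return the same value
    delete(myJob);                        % remove the finished job from the cluster
end
```

Each submission goes through the scheduler queue, so one job per column per iteration may well cost more in queueing time than it saves; moving the loops inside the task function would likely be worth considering.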
Thanks, Edric, for your suggestion. The output from disp(MException.last) is:
disp(MException.last)
  MException with properties:
    identifier: 'parallel:cluster:GenericSubmissionFailed'
       message: [1x104 char]
         cause: {[1x1 MException]}
         stack: [3x1 struct]
Do you know what this means and how I can fix it? Thanks again for your help.
What version of MATLAB/PCT are you using? Could you also post the output of
getReport(MException.last)
In more recent MATLAB releases, the GenericSubmissionFailed error identifier should correspond to an error message something like
Job submission failed because the user supplied SubmitFcn (...) errored
Hi Edric,
Thanks again for your contribution. I am using MATLAB/PCT R2013a. The output of getReport(MException.last) is shown below:
getReport(MException.last)
ans =
Error using parallel.Job/submit (line 304)
Job submission failed because the user supplied CommunicatingSubmitFcn (communicatingSubmitFcn) errored.

Error in MOD_OMP1 (line 146)
    submit(myJob);

Error in Test1 (line 66)
    [D_out, X_out] = MOD_OMP1(Y, r, s*ones(n,1), 'MAX_ITER', MAX_ITER, ...

Caused by:
    Error using communicatingSubmitFcn (line 101)
    Submit failed with the following message:
    Unable to run job: Rejected by ucl_jsv4h Reason:Unable to determine job requirements.
    Exiting.
Aha! The "Unable to determine job requirements" is coming from your underlying scheduling system. You'll probably need to work with your cluster admin to work out why that error is showing up.
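As this thread shows, the useful message was buried in the cause of the top-level exception. A minimal sketch of pulling out that innermost cause in code is given below. It does not need a cluster: the exception is built by hand as a stand-in for a failed submit(myJob), using the identifier and messages quoted above, and the inner identifier 'demo:scheduler:Rejected' is made up for illustration.

```matlab
% Stand-in for a failed submit(): an exception shaped like the report above.
outer = MException('parallel:cluster:GenericSubmissionFailed', ...
    'Job submission failed because the user supplied CommunicatingSubmitFcn (communicatingSubmitFcn) errored.');
inner = MException('demo:scheduler:Rejected', ...   % hypothetical identifier
    'Unable to run job: Rejected by ucl_jsv4h Reason:Unable to determine job requirements.');
outer = addCause(outer, inner);

try
    throw(outer);              % in real code this line would be: submit(myJob);
catch err
    % Follow the chain of causes down to the innermost exception
    root = err;
    while ~isempty(root.cause)
        root = root.cause{1};
    end
    rootMessage = root.message;
    fprintf('Top-level identifier: %s\n', err.identifier);
    fprintf('Root cause: %s\n', rootMessage);
end
```

If the root cause mentions the scheduler rather than MATLAB code, as it does here, that suggests the problem lies in the cluster or profile setup rather than in the script.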


Answers (1)

Andrew Schenk, 18 June 2015
Take a look at this UCL FAQ as it mentions the same final error message you are encountering:
Does validation of your cluster profile pass? If not, the issue is not with your particular script but with the setup of the cluster profile. Cluster profile validation is described here:

Asked: 17 June 2015

Commented: 19 June 2015
