
Why do I receive an error that no supported GPU device was found when submitting a job to a MATLAB Parallel Server cluster using Slurm?

When I submit a job to a MATLAB Parallel Server cluster using Slurm, the job fails with the following error:
Unable to find a supported GPU device.

Accepted Answer

MathWorks Support Team on 10 April 2024
Edited: MathWorks Support Team on 10 April 2024
This error may occur if any of the following is true:
  • MATLAB Parallel Server cannot detect the node's GPU
  • GPUsPerNode has not been added to the integration scripts
  • The GPU is not being requested in the cluster profile correctly
  • Slurm's configuration has not made any GPUs available
To tell if MATLAB Parallel Server can detect a GPU, run this command on the worker node in question:
Linux
matlab -dmlworker -r "gpuDevice"
Windows
matlab -dmlworker -batch "gpuDevice"
Please use the latest integration scripts with your cluster profile. When using the integration scripts, you will need to add this to the file getCommonSubmitArgs.m:
% GPU: request GPUs from Slurm when GPUsPerNode is set
ngpus = validatedPropValue(ap, 'GPUsPerNode', 'double', 0);
if ngpus > 0
    % Optional GPU type; an empty value leaves "gpu::N", which is collapsed below
    gcard = validatedPropValue(ap, 'GPUCard', 'char', '');
    commonSubmitArgs = sprintf('%s --gres=gpu:%s:%d', commonSubmitArgs, gcard, ngpus);
    commonSubmitArgs = strrep(commonSubmitArgs, '::', ':');
end
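To see what this snippet appends to the submit arguments, the same sprintf/strrep logic can be run on its own. This is only a sketch: buildGres, the starting argument string '--ntasks=4', and the card name 'a100' are hypothetical values chosen for illustration and are not part of the integration scripts.

```matlab
% Standalone sketch of the --gres construction used in getCommonSubmitArgs.m.
% buildGres, '--ntasks=4' and 'a100' are illustrative assumptions.
buildGres = @(args, gcard, ngpus) ...
    strrep(sprintf('%s --gres=gpu:%s:%d', args, gcard, ngpus), '::', ':');

withType = buildGres('--ntasks=4', 'a100', 2);  % GPUCard set
noType   = buildGres('--ntasks=4', '', 2);      % GPUCard left empty

disp(withType)  % --ntasks=4 --gres=gpu:a100:2
disp(noType)    % --ntasks=4 --gres=gpu:2
```

When no card type is given, the format string produces "gpu::2", and the strrep call collapses the double colon so that Slurm receives "gpu:2".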
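You can then use the AdditionalProperty "GPUsPerNode" in your cluster profile to specify the number of GPUs per node (and, optionally, "GPUCard" to specify the GPU type). Otherwise, you'll need to add the "--gres=gpu:%s:%d" option to your AdditionalSubmitArgs yourself, with the placeholders filled in, for example "--gres=gpu:1". Use one of these two methods to request GPUs per node.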
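As a rough sketch, requesting GPUs from the MATLAB client might look like the following. It assumes that the default cluster profile is the Slurm profile and that the integration scripts already contain the GPU snippet above; the card name 'a100' is a hypothetical example and may need to match a type defined in your Slurm configuration.

```matlab
% Sketch only: assumes the default profile points at the Slurm cluster.
c = parcluster;

% Option 1: let the integration scripts build the --gres option.
c.AdditionalProperties.GPUsPerNode = 1;
c.AdditionalProperties.GPUCard = 'a100';  % hypothetical type; optional

% Option 2 (instead of option 1): pass the option to Slurm directly.
% c.AdditionalProperties.AdditionalSubmitArgs = '--gres=gpu:1';

saveProfile(c);  % keep the setting for later sessions

% Submit a small job that reports how many GPUs the worker can see.
j = batch(c, @gpuDeviceCount, 1, {});
wait(j);
out = fetchOutputs(j);
disp(out{1})
```

If the job reports zero GPUs even though the request was passed to Slurm, the problem is more likely on the node or in the Slurm configuration than in the cluster profile.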
If none of these steps resolves the error, make sure that the GPUs have been defined in the Slurm configuration files (slurm.conf and gres.conf).
