Hi all,
I am trying to make predictions with a machine learning model, and I want to use parallel processing on an HPC cluster node with 128 cores. I run the script 'step_4.m' with slurm.sh. Unfortunately, the input files are not processed in parallel; they are processed one by one. Could anyone offer some help?
Thank you!
slurm.sh
#!/bin/bash
#SBATCH --job-name=ml_test
#SBATCH --output=matlab_job_output.log
#SBATCH --error=matlab_job_error.log
# CHANGE TO YOUR EMAIL ADDRESS
#SBATCH --mail-type=END
#SBATCH --mail-user=xyz@oregonstate.edu
#SBATCH --partition=preempt.q
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=128
#Load MATLAB module
module load matlab/R2023b
#Run MATLAB script
matlab -nodisplay -nodesktop < step_4.m
P.S. The step_4.m script is attached.

Comments: 2

parpool('local', numWorkers);
It is not clear to me that the HPC would be using local as the name of the parallel pool.
Edric Ellis on 4 Jul 2024
parpool('local',numWorkers) ought to work - the general idea here is
  1. Request from SLURM a node with 128 cores available
  2. Launch a "client" MATLAB on that node
  3. Use parpool('local',numWorkers) to launch a pool using the cores on that node
Are there any error messages when you do this? Does parpool complete successfully? I would expect that either parpool succeeds and your parfor loop runs in parallel, or else parpool fails with an error.
By the way, I would suggest running your script with matlab -batch step_4 rather than piping in the commands. I don't think that will change much though.
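As a rough sketch of how the three steps above could look with -batch, here is a trimmed-down slurm.sh. It is not a confirmed fix. SLURM_CPUS_ON_NODE is a variable SLURM sets inside a job; NUM_WORKERS and the pool_size helper are hypothetical names introduced here, and since step_4.m is not shown, the assumption is that it would be edited to read the value itself.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=128

# Load MATLAB if the cluster uses environment modules
# (module name taken from the original slurm.sh).
if command -v module >/dev/null 2>&1; then
    module load matlab/R2023b
fi

# Hypothetical helper: report the CPU count SLURM allocated on this node,
# falling back to 1 when run outside a SLURM job.
pool_size() {
    echo "${SLURM_CPUS_ON_NODE:-1}"
}

# NUM_WORKERS is an assumed variable name. step_4.m would need to read it,
# for example: numWorkers = str2double(getenv('NUM_WORKERS'));
export NUM_WORKERS="$(pool_size)"
echo "Requesting a pool of ${NUM_WORKERS} workers"

# Run the script with -batch instead of piping it in on stdin.
if command -v matlab >/dev/null 2>&1; then
    matlab -batch "step_4"
else
    echo "matlab not found on PATH; would run: matlab -batch step_4"
fi
```

The echoed worker count ends up in matlab_job_output.log, so it is easy to compare against the pool size that parpool reports when it starts.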


Answers (0)

Release: R2023b

Asked: 3 Jul 2024

Commented: 4 Jul 2024
