Parallel reinforcement learning on HPC with warning "Received duplicate id = x from worker"

When I train a reinforcement learning agent on an HPC cluster using Parallel Computing Toolbox, I get the warning "Received duplicate id = 22 from worker" (or another id) after, for example, 180 training episodes. The training then appears to stop, with no further error or warning. I start the .m script with these commands:
module load matlab/R2021a
matlab -nodisplay < rl_training.m
When I set
trainOpts.UseParallel = false;
I often get the warning "Error reading character from command line". Does anyone know why these messages occur, and is there a way to continue the training?
Comments: 5
Image Analyst on 2 Dec 2021
If you have a maintenance contract in place, I'd call them on the phone. Of course, you can use email as @Raymond Norris said. I never use email or a support page: when I encounter a problem I need an immediate solution, so I call them.
Walter Roberson on 5 Dec 2021
I never call them, myself -- I open support cases, where I can describe the problem and include code and results to show clearly what is expected and what is received instead. 85% of the time the response is going to be "You are right, that's not good, the developers have been notified and it might get fixed some day".


Answers (0)
