Simulink interruption during RL training

Views: 6 (last 30 days)
gerway on 29 Mar 2025
Commented: gerway on 3 Apr 2025
Hey everyone,
Anyone who has used reinforcement learning (RL) to train on physical models in Simulink knows that during the initial training phase, random exploration often triggers assertions or other instabilities that can cause Simulink to crash or diverge. This makes it very difficult to use the official train function provided by MathWorks, because once Simulink crashes, all the RL experience (replay buffer) is lost—essentially forcing you to start training from scratch each time.
So far, the only workaround I’ve found is to wrap the training process in an external try-catch block. When a failure occurs, I save the current agent parameters and load them again at the start of the next training run. But as many of you know, this slows down training by 100x or more.
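For reference, a minimal sketch of that try-catch wrapper might look like the following. It assumes an off-policy agent (e.g. DDPG/TD3/SAC) named agent, an environment env created with rlSimulinkEnv, and a placeholder checkpoint file name; whether the replay buffer survives the save depends on your agent options (for example SaveExperienceBufferWithAgent, where your release supports it), so this preserves at least the learned parameters.

checkpointFile = "agentCheckpoint.mat";      % placeholder file name

% Resume from the last saved agent if a previous run crashed.
if isfile(checkpointFile)
    s = load(checkpointFile, "agent");
    agent = s.agent;
end

trainOpts = rlTrainingOptions( ...
    MaxEpisodes = 2000, ...
    MaxStepsPerEpisode = 500);

try
    train(agent, env, trainOpts);            % agent is a handle object and is updated in place
catch trainErr
    % A Simulink assertion or solver failure aborts train(); keep what was learned so far.
    warning("Training aborted: %s", trainErr.message);
end
save(checkpointFile, "agent");               % checkpoint whatever state was reached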
Alternatively, one could pre-train on a simpler case and then fine-tune on the full model, but that’s not always feasible.
Has anyone discovered a better way to handle this?

Accepted Answer

Jaimin on 1 Apr 2025
To address instabilities that may lead to crashes or divergence in Simulink, I can recommend a few strategies.
  1. In addition to the try-catch block, implement custom error handling directly within the Simulink model. Incorporate blocks that detect when the model is approaching an unstable state so you can reset or adjust parameters dynamically and prevent crashes.
  2. Utilize Simulink Test to create test cases that can help identify and rectify scenarios that lead to instability before running RL training.
  3. Start training with a simplified version of your model and gradually increase its complexity. This can help the agent learn stable behaviours before tackling the full model.
  4. Regularly save checkpoints of the agent's parameters and replay buffer during training, so you can resume from the last stable state rather than starting over (see the sketch after this list).
  5. Use surrogate models or simplified representations of your Simulink model to perform initial training. Once the agent has learned a stable policy, transfer it to the full model.
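For point 4, the built-in training options already cover part of this. Here is a minimal sketch, assuming the same agent and env as in the question, with placeholder threshold and folder values; check the rlTrainingOptions documentation for the SaveAgentCriteria values available in your release.

trainOpts = rlTrainingOptions( ...
    MaxEpisodes        = 5000, ...
    MaxStepsPerEpisode = 600, ...
    SaveAgentCriteria  = "EpisodeReward", ... % save a copy whenever an episode reward exceeds the value below
    SaveAgentValue     = 50, ...
    SaveAgentDirectory = "savedAgents");      % agents are written here as MAT-files during training

trainingStats = train(agent, env, trainOpts);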
I recommend implementing a custom training loop, as it gives you complete control over the training process: managing episodes, deciding how often to save checkpoints, and handling errors. It also offers the flexibility to add custom stability checks, dynamically adjust exploration parameters, and integrate domain-specific knowledge.
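Short of rewriting episode handling entirely, one lightweight variant is to call train() in short bursts and persist the agent between bursts, so a Simulink crash costs at most one burst. The sketch below is only an outline: burst sizes and file names are placeholders, and whether the replay buffer carries over between train() calls depends on your release and agent options (e.g. ResetExperienceBufferBeforeTraining / SaveExperienceBufferWithAgent, where present).

checkpointFile   = "agentBurst.mat";   % placeholder
episodesPerBurst = 25;                 % placeholder
numBursts        = 200;                % placeholder

burstOpts = rlTrainingOptions( ...
    MaxEpisodes = episodesPerBurst, ...
    MaxStepsPerEpisode = 600, ...
    Verbose = false, ...
    Plots = "none");

for k = 1:numBursts
    try
        train(agent, env, burstOpts);      % short burst of training
        save(checkpointFile, "agent");     % checkpoint after every clean burst
    catch burstErr
        % An assertion or solver failure aborts only this burst; reload the
        % last good checkpoint and continue with the next burst.
        warning("Burst %d failed: %s", k, burstErr.message);
        if isfile(checkpointFile)
            s = load(checkpointFile, "agent");
            agent = s.agent;
        end
    end
end

The burst length trades off overhead against how much progress a single crash can destroy; smaller bursts lose less work but add more save/restart overhead.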
Kindly refer to the following links for additional information.
I hope you find this helpful.
1 Comment
gerway on 3 Apr 2025
The main issue is that the .ssc in the Rankine Cycle example can't be modified, so I can only passively try to adapt to it. Analyzing all the assertions is far beyond my capabilities, so ending the episode before an assertion occurs or creating a simplified version of the model is nearly impossible for me. Still, thank you.


More Answers (0)
