Issue with Q0 Convergence during Training using PPO Agent
Hi guys,
I have developed my model and trained it using a PPO agent. Overall, the training process has been successful. However, I have encountered an issue with the Q0 values. The maximum achievable reward is 6000, and I set training to stop at 98.5% of that maximum (5910).
During the training, I have noticed that the Q0 values did not converge as expected. In fact, they seem to be capped at 100, as indicated by the figures. I am currently seeking an explanation for this behavior and trying to understand why the Q0 values are not reaching the desired convergence.

My agent options are as follows:

If anyone has any insights or explanations regarding the behavior of Q0 during training with the PPO agent, I would greatly appreciate your input. Your expertise and guidance would be invaluable in helping me understand and address this issue.
Thank you.
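One possible reading of the gap, sketched under assumptions the post does not confirm: in Reinforcement Learning Toolbox, Episode Q0 is the critic's estimate of the discounted long-term reward from the initial observation, whereas the episode reward is the undiscounted sum. If the discount factor were 0.99 (the documented default for PPO agent options) and the reward were roughly 1 per step over 6000 steps, the discounted return could never exceed 1 / (1 - 0.99) = 100, even though the episode reward reaches 6000. The Python sketch below only illustrates that arithmetic. `GAMMA`, `REWARD_PER_STEP`, and `NUM_STEPS` are hypothetical values chosen to match the numbers in the question, not values taken from the poster's setup.

```python
def discounted_return(reward_per_step: float, gamma: float, num_steps: int) -> float:
    """Sum of reward_per_step * gamma**t for t = 0 .. num_steps - 1."""
    total = 0.0
    discount = 1.0
    for _ in range(num_steps):
        total += discount * reward_per_step
        discount *= gamma
    return total


# Assumed values (not stated in the post), chosen to match its numbers.
GAMMA = 0.99            # assumed discount factor
REWARD_PER_STEP = 1.0   # assumed constant reward at every step
NUM_STEPS = 6000        # assumed episode length

# What an episode-reward curve would show: the plain sum of rewards.
undiscounted = REWARD_PER_STEP * NUM_STEPS

# What a well-trained critic would estimate at the first step.
discounted = discounted_return(REWARD_PER_STEP, GAMMA, NUM_STEPS)

# Geometric-series limit: the discounted return can never exceed this.
upper_bound = REWARD_PER_STEP / (1.0 - GAMMA)

print(f"undiscounted episode reward: {undiscounted:.1f}")
print(f"discounted return from step 0: {discounted:.2f}")
print(f"limit r / (1 - gamma): {upper_bound:.2f}")
```

Under these assumptions the discounted value sits at about 100 while the episode reward is 6000, so a Q0 plateau near 100 would be expected rather than a sign of failed convergence. Whether this applies here depends on the actual discount factor and reward signal, which the post does not show.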
Comments: 2
Emmanouil Tzorakoleftherakis
July 10, 2023
Can you share the code with the training options?
Muhammad Fairuz Abdul Jalal
July 11, 2023


