My DDPG agent starts applying one single action


Mokhtar on 12 Sep 2022

Direct link to this question:

https://kr.mathworks.com/matlabcentral/answers/1803325-my-ddpg-agent-starts-applying-one-single-action

Commented: nick on 16 Nov 2023

Hello, I am new to deep reinforcement learning.

Using RL Toolbox, I am trying to train a DDPG agent to move to a position and stay there (start position = 0, target position = 5). If it goes above 5 or below 0, it gets a big penalty. The agent starts learning and tries different actions for the first 20 to 30 episodes, and then applies the extreme action (+1) (action space [-1, 1]) for the next 100 episodes, as if it had found the optimal action to take at each step. That is strange, because if it keeps applying +1 it reaches the penalty quickly, which doesn't make any sense. Even if I let it run for more than 1000 episodes, it comes back to the action +1 every time. My reward function for now is:

-0.1*(reference position - actual position)^2 - 100*(X < 0 or X > 5)
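For readers who want to check the numbers, the reward above can be sketched as a small function. This is only an illustration in Python, not RL Toolbox code: the function and parameter names are made up, X is assumed to be the actual position, the decimal comma is read as 0.1, and the reference position is assumed to be the target (5).

```python
def reward(position, reference=5.0, lower=0.0, upper=5.0):
    """Per-step reward as written in the post (names are illustrative)."""
    # Quadratic tracking term: zero at the reference, growing with distance.
    tracking_error = reference - position
    # Fixed penalty of 100 when the position leaves [lower, upper].
    out_of_bounds = position < lower or position > upper
    penalty = 100.0 if out_of_bounds else 0.0
    return -0.1 * tracking_error**2 - penalty


if __name__ == "__main__":
    # Print the reward at the start, halfway, at the target, and past the bound.
    for x in (0.0, 2.5, 5.0, 5.5):
        print(f"position={x:4.1f}  reward={reward(x):8.3f}")
```

Note that with strict inequalities the target position 5 itself is not penalized, but any overshoot beyond it is, so the optimum sits right on the edge of the penalty region.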


nick on 16 Nov 2023

Hi Mokhtar,

Could you specify the agent's environment? Also, what is meant by reference position? Do the start and stop positions refer to X coordinates? It would help if you could share the code.
