Reinforcement Learning does not show that training occurs?

Question

shadi abpeikar, 12 March 2021


https://kr.mathworks.com/matlabcentral/answers/770393-reinforcement-learning-does-not-show-that-training-occurs

Edited: Emmanouil Tzorakoleftherakis, 19 March 2021

Hi, I have a reinforcement learning problem with a continuous state/action space. I trained the agent for 2000 episodes. Each episode has a maximum of 10 steps and stops when the agent receives a positive reward greater than 10 or when the maximum number of steps is reached. Here is the training progress of this off-policy agent. When tested on some samples, the agent visibly shows that training has happened, but I cannot understand why the plot does not show the usual RL training trend (starting from low rewards and rising to high rewards). I tried some of the suggestions from MathWorks answers, such as changing the OU noise, the deep neural network settings of the actor and critic, and the reward function, but the reward just fluctuates as follows. I would appreciate it if someone could help me with this.

Comments: 3

Emmanouil Tzorakoleftherakis, 18 March 2021

It's also not clear what the question is. How did you get the plot above? The x-axis does not show all training episodes.

shadi abpeikar, 18 March 2021

Let me give you some more information. My RL agent is meant to learn swarm behaviors. In each episode it receives a positive reward and the episode stops when the behavior is flocking, and it gets a penalty when the behavior is random. In the second case, the episode continues for a maximum of 100 steps, until it reaches either flocking or the maximum number of steps. I checked the new states generated by the RL agent for some random behaviors, and they changed to flocking, as I expected. But I can't see an increasing trend of rewards in the RL plot (the same happens with 2000 episodes).
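For illustration only, the episode structure described in this comment can be sketched in Python as follows. This is not the poster's code: the reward values `FLOCK_REWARD` and `STEP_PENALTY` and the function `episode_return` are hypothetical, chosen only to show how the per-episode return depends on how many steps pass before flocking is reached.

```python
# Hypothetical values: the thread only says the terminal reward is positive
# (greater than 10) and that non-flocking behavior is penalized.
FLOCK_REWARD = 11.0
STEP_PENALTY = -1.0
MAX_STEPS = 100  # the comment mentions 100 steps; the question mentions 10


def episode_return(steps_until_flocking, max_steps=MAX_STEPS):
    """Total reward of one episode.

    steps_until_flocking: 1-based step at which flocking is first observed,
    or None if the swarm never flocks within max_steps.
    """
    total = 0.0
    for step in range(1, max_steps + 1):
        if steps_until_flocking is not None and step == steps_until_flocking:
            total += FLOCK_REWARD  # positive reward, episode stops here
            break
        total += STEP_PENALTY      # penalty for a non-flocking step
    return total


print(episode_return(1))     # flocking on the first step
print(episode_return(5))     # four penalized steps, then flocking
print(episode_return(None))  # never flocks: penalty on every step
```

Under these assumed values, an episode's return can never exceed the terminal reward, so any improvement would show up as fewer penalized steps per episode rather than as an unbounded upward curve.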


Answer 1

Emmanouil Tzorakoleftherakis, 18 March 2021


https://kr.mathworks.com/matlabcentral/answers/770393-reinforcement-learning-does-not-show-that-training-occurs#answer_651307

Thanks for the info. I think this is a scaling issue with the plot. The Episode Manager has an option to uncheck "Q0" (the orange line); while Q0 is displayed, the scale of the plot prevents you from seeing the training trend closely.

Comments: 2

shadi abpeikar, 18 March 2021

Edited: shadi abpeikar, 18 March 2021

Thanks for your response, Emmanouil.

But I unchecked Q0 as well, and there are still a lot of fluctuations rather than a smooth increase in the reward values.

Emmanouil Tzorakoleftherakis, 19 March 2021

Edited: Emmanouil Tzorakoleftherakis, 19 March 2021

Well, that means that your agent is not learning anything, in which case you have to go back and see what you can change to improve training. I would recommend starting with the reward signal.
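One way to check whether any trend is hidden behind episode-to-episode fluctuations is to smooth the per-episode rewards with a trailing moving average. The following is a minimal Python sketch, not part of the original thread: the function `moving_average` and the reward values are made up for illustration, and the rewards would have to be exported from the actual training run.

```python
def moving_average(rewards, window):
    """Trailing mean over the last `window` episodes (shorter at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_sum = 0.0
    for i, r in enumerate(rewards):
        running_sum += r
        if i >= window:
            # Drop the reward that just left the window.
            running_sum -= rewards[i - window]
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


# Made-up per-episode rewards, for illustration only.
episode_rewards = [-10.0, 5.0, -8.0, 7.0, -6.0, 9.0, -4.0, 11.0]
print(moving_average(episode_rewards, window=4))
```

If the smoothed curve stays flat over the full run, that is consistent with the agent not learning; if it rises slowly, the raw fluctuations may simply be masking the trend.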
