How to handle invalid/illegal actions/moves in Reinforcement Learning toolbox?

Question

Lymperis Perakis, 8 Aug 2019

https://kr.mathworks.com/matlabcentral/answers/475466-how-to-handle-invalid-illegal-actions-moves-in-reinforcement-learning-toolbox

Answered: Rajani Mishra, 21 Aug 2019

I have implemented a reinforcement learning algorithm in which some actions are not legal/valid, depending on the state. What is the best way to deal with this problem?

I have tried giving a negative reward and stopping the episode whenever an illegal action is selected, but it does not seem to work. I also tried simply ignoring these actions, but then the agent keeps making the same move until it reaches the maximum number of episodes during training. My best approach so far was to take the next possible action instead of the selected one. Although this gives better results than the other options, it does not seem to be a good way to deal with the problem.

I have read in the AlphaGo paper that "Illegal moves are masked out by setting their probabilities to zero, and re-normalising the probabilities for remaining moves." Is there a way to implement this in MATLAB, and if not, what would be the best way to avoid the invalid moves?
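For reference, the masking rule quoted from the AlphaGo paper amounts to a few lines of arithmetic. The following is a minimal sketch in plain Python, not Reinforcement Learning Toolbox code. The function name `mask_and_renormalize`, the uniform fallback for the all-zero case, and the example numbers are assumptions made for illustration only.

```python
def mask_and_renormalize(probs, legal):
    """Zero out illegal actions and rescale the rest so they sum to 1.

    probs: action probabilities produced by the policy (hypothetical values)
    legal: booleans, True where the action is allowed in the current state
    """
    if len(probs) != len(legal):
        raise ValueError("probs and legal must have the same length")

    # Step 1: set the probability of every illegal action to zero.
    masked = [p if ok else 0.0 for p, ok in zip(probs, legal)]
    total = sum(masked)

    if total <= 0.0:
        # Assumption: if the policy put no mass on any legal action,
        # fall back to a uniform distribution over the legal ones.
        n_legal = sum(1 for ok in legal if ok)
        if n_legal == 0:
            raise ValueError("no legal action available in this state")
        return [1.0 / n_legal if ok else 0.0 for ok in legal]

    # Step 2: re-normalise the remaining probabilities.
    return [m / total for m in masked]


policy_output = [0.5, 0.25, 0.125, 0.125]   # example policy output
legal_mask = [True, False, True, False]     # actions 1 and 3 are illegal
masked_policy = mask_and_renormalize(policy_output, legal_mask)
print(masked_policy)
```

With these example numbers the two legal actions end up with probabilities 0.8 and 0.2, and an action sampled from `masked_policy` can never be an illegal one. The same element-wise multiply and divide carries over directly to MATLAB arrays.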

Answer 1

Rajani Mishra, 21 Aug 2019

https://kr.mathworks.com/matlabcentral/answers/475466-how-to-handle-invalid-illegal-actions-moves-in-reinforcement-learning-toolbox#answer_388477

To implement your own custom reinforcement learning algorithm, you can create a custom agent.

Specify the properties of the agent that are needed to create and train it. Refer to this page on creating custom agents: https://www.mathworks.com/help/reinforcement-learning/ug/custom-agents.html

I also found an example of a reinforcement learning agent: https://www.mathworks.com/help/reinforcement-learning/ug/train-q-learning-agent-to-solve-basic-grid-world.html
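For a value-based agent such as the Q-learning agent in the linked example, one way a custom agent could avoid illegal moves is to restrict action selection to the legal actions. The following is a minimal sketch in plain Python, not toolbox API. It assumes the environment can supply a boolean legality mask for the current state; the names `select_action`, `q_row`, and `legal_mask` are hypothetical.

```python
import random


def select_action(q_values, legal, epsilon=0.1, rng=random):
    """Epsilon-greedy action selection restricted to legal actions.

    q_values: Q-value of each action in the current state
    legal:    booleans, True where the action is allowed
    """
    legal_actions = [a for a, ok in enumerate(legal) if ok]
    if not legal_actions:
        raise ValueError("no legal action available in this state")

    # Explore: pick uniformly among legal actions only.
    if rng.random() < epsilon:
        return rng.choice(legal_actions)

    # Exploit: best Q-value among legal actions only.
    return max(legal_actions, key=lambda a: q_values[a])


q_row = [1.0, 5.0, 2.0, -1.0]            # example Q-values for one state
legal_mask = [True, False, True, True]   # action 1 is illegal here

# Action 1 has the highest Q-value but is masked out, so this prints 2.
print(select_action(q_row, legal_mask, epsilon=0.0))
```

Because illegal actions are never returned, the agent does not need a penalty reward to learn to avoid them. If the learning update takes a maximum over next-state Q-values, applying the same mask there keeps the target consistent with what the agent can actually do.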
