Determine the reward value to stop training an RL agent
Views: 22 (last 30 days)
I saw this sentence in an example of using an RL agent:
- Stop training when the agent receives an average cumulative reward greater than -355 over 100 consecutive episodes. At this point, the agent can control the level of water in the tank.
How was the exact reward threshold of -355 over 100 episodes calculated? Are there any tips that could help me know when to stop training at a specific point, before performance gets worse?
Thank you in advance
Comments: 0
Accepted Answer
Emmanouil Tzorakoleftherakis
25 Jan 2023
Edited: Emmanouil Tzorakoleftherakis
25 Jan 2023
For some problems you may be able to calculate the maximum reward that can be collected in an episode, and you can use that knowledge accordingly in the training settings. In general, though, there is no recipe that will tell you when it is good to stop training. You would typically need to train for a large number of episodes, observe how training goes, and use that to identify what a good average reward is. You could also simply train for a set number of episodes instead (similar to how you would train for a certain number of epochs in supervised learning).
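As a sketch of the fixed-budget approach, Reinforcement Learning Toolbox lets you stop on episode count through `rlTrainingOptions`. The episode budget below is an arbitrary illustrative value, not something from the original example:

```matlab
% Train for a fixed number of episodes, analogous to a fixed number of
% epochs in supervised learning. MaxEpisodes = 2000 is a placeholder.
trainOpts = rlTrainingOptions( ...
    MaxEpisodes=2000, ...
    MaxStepsPerEpisode=200, ...
    StopTrainingCriteria="EpisodeCount", ...
    StopTrainingValue=2000, ...
    Plots="training-progress");

% Assumes 'agent' and 'env' were created earlier, e.g. with rlDDPGAgent
% and rlSimulinkEnv:
% trainingStats = train(agent, env, trainOpts);
```

Inspecting `trainingStats.AverageReward` after such a run is one way to decide on a sensible `StopTrainingValue` for later trainings. (Newer MATLAB releases accept the `Name=Value` syntax shown; older ones use `'Name',Value` pairs.)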
Hope that helps
Comments: 0
More Answers (1)
Sam Chak
17 Oct 2022
Hi @Haitham M.
There is an option to set the StopTrainingValue.
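A minimal sketch of how `StopTrainingValue` maps onto the water-tank criterion quoted in the question (the -355 threshold and 100-episode window come from that example; the episode cap is a placeholder, and `agent`/`env` are assumed to exist already):

```matlab
% Stop when the average cumulative reward over 100 consecutive episodes
% exceeds -355, as in the water-tank example.
trainOpts = rlTrainingOptions( ...
    MaxEpisodes=5000, ...                  % placeholder upper bound
    ScoreAveragingWindowLength=100, ...    % average over 100 episodes
    StopTrainingCriteria="AverageReward", ...
    StopTrainingValue=-355);               % threshold from the example

% trainingStats = train(agent, env, trainOpts);
```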
Comments: 2
Francisco Serra
14 Dec 2023
For example, imagine you are using an RL agent for a control problem. You can use a classic controller as a reference and apply to it the same cost function you use for the RL agent. Then you run some simulations with that controller, see how it performs, and you get an idea of how well your RL agent should perform. However, if you don't have a working reference to guide you, you have to do what @Emmanouil Tzorakoleftherakis said.
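The baseline idea above can be sketched as follows. Everything here is an illustrative assumption (a first-order plant, a simple proportional controller, and a quadratic stage cost); the point is only that you accumulate the same per-step reward your RL setup uses, and treat the result as a candidate `StopTrainingValue`:

```matlab
% Simulate a classic baseline controller on the plant and accumulate the
% same stage cost the RL reward function uses, to get a reference score.
Ts = 0.1; T = 20; N = T/Ts;   % sample time, horizon, number of steps
x = 0; ref = 1; Kp = 2;       % initial state, setpoint, proportional gain
cumReward = 0;
for k = 1:N
    e = ref - x;
    u = Kp*e;                                 % baseline P controller
    x = x + Ts*(-x + u);                      % Euler step of dx/dt = -x + u
    cumReward = cumReward - (10*e^2 + 0.1*u^2);  % same cost as the RL reward
end
disp(cumReward)   % use this as a reference when choosing StopTrainingValue
```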