Problem simulating a trained DRL agent

Question

0 votes

Hello,

I implemented deep reinforcement learning in MATLAB based on a custom template and saved some agents with high rewards. During training I plotted the signals in each episode and could see the desired performance. I also saved all states and control efforts (actions) in each episode. My action space is defined as follows:

numAct = 1;

ActionInfo = rlNumericSpec([numAct 1], 'LowerLimit', -0.4189, 'UpperLimit', 0.4189);

I have a problem with the simulation of the trained agent.

The first figure shows one result from the training phase and part of the variation of its action value.

I then simulated the trained agent with the commands below:

simOptions = rlSimulationOptions('MaxSteps',maxSteps);

experience = sim(env,agent,simOptions);

or, for a saved agent:

experience = sim(env,saved_agent,simOptions);

The result is wrong, as the figure below shows.

I checked the final agent and some of the high-reward agents, but the results are all similar to the figure above.

For every agent I simulated, the action stays fixed at the lower or upper limit of the action space, as the figure above shows.

Thank you for any help you can offer.


Answer 1

Emmanouil Tzorakoleftherakis, 26 Dec 2020

1 vote

Hello,

Please see this post that goes over a few potential reasons for discrepancies between training results and simulation results.

Looking at the actions and plots above, it seems to me that the agent stopped exploring somewhere along the way (in which case you would need to adjust the exploration options in your custom algorithm). Make sure to also keep track of the individual episode rewards to get an idea of which agents lead to higher rewards.
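One way to put a number on this is to measure how often the logged actions sit exactly on a limit. The snippet below is only a minimal sketch: the limits are copied from the rlNumericSpec in the question, while the `actions` vector and the tolerance `tol` are hypothetical placeholders to be replaced with your own logged data.

```matlab
% Action limits copied from the rlNumericSpec in the question.
lowerLimit = -0.4189;
upperLimit =  0.4189;

% Hypothetical logged actions, one value per step. Replace this vector
% with the actions you saved during training or extracted after sim.
actions = [0.10 -0.25 0.4189 0.4189 -0.4189 0.4189 0.4189 0.4189];

% Tolerance for deciding that an action sits on a limit (assumed value).
tol = 1e-6;

% Logical mask: true where the action equals a limit within tol.
atLimit = abs(actions - lowerLimit) < tol | abs(actions - upperLimit) < tol;

% Fraction of steps spent at either limit.
saturatedFraction = mean(atLimit);

fprintf('%.0f%% of steps are at an action limit\n', 100 * saturatedFraction);
```

Running the same check on a training episode and on the simulation output makes the comparison concrete: a low fraction in training but a fraction close to 1 in simulation would suggest that the good training behaviour relied on exploration rather than on the learned policy itself.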

3 comments

Emmanouil Tzorakoleftherakis, 4 Jan 2021

Edited: Emmanouil Tzorakoleftherakis, 4 Jan 2021

If you have these settings right, it may not be an exploration issue. You are saying that if the target is farther away, the robot does not reach it. Could it be that the problem is not feasible, i.e. the target is too far away to reach within a single episode? If that's the case, increasing the episode duration or adjusting the action limits (if any) may help.
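A rough back-of-the-envelope feasibility check could look like the sketch below. Every number in it is a hypothetical placeholder (the thread does not give the sample time, the robot's top speed, or the target distance), and it assumes the robot could travel at constant top speed in a straight line, so it only gives an optimistic upper bound on the reachable distance.

```matlab
% All values are hypothetical placeholders, not taken from the question.
Ts             = 0.25;  % sample time per step [s]
maxSteps       = 200;   % episode length in steps
vMax           = 0.5;   % assumed top speed of the robot [m/s]
targetDistance = 30;    % assumed distance to the target [m]

% Optimistic upper bound: straight line at top speed for the whole episode.
maxReach = vMax * Ts * maxSteps;

% The target can only be reached if it lies within that bound.
isFeasible = targetDistance <= maxReach;

% Smallest episode length (in steps) that would make the target reachable.
minSteps = ceil(targetDistance / (vMax * Ts));

fprintf('Max reach: %g m, feasible: %d, steps needed: %d\n', ...
        maxReach, isFeasible, minSteps);
```

If `isFeasible` comes out false for the far targets, no policy can succeed within the episode, and increasing `maxSteps` to at least `minSteps` is the first thing to try.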

beni hadi, 4 Jan 2021

Thanks.
