Incorrect tanhLayer output in RL agent

Question

Mohammad Ashraful Islam, 5 April 2020

Direct link to this question:

https://kr.mathworks.com/matlabcentral/answers/515602-incorrect-tanhlayer-output-in-rl-agent

Commented: H. M., 20 October 2022

The last layer in my actor network is a tanhLayer. However, I am seeing output from the RL Agent block that goes above 1 or below -1. Is this normal behavior for the RL agent?

Comments: 4

Asvin Kumar, 11 April 2020

I am unable to reproduce the error. Here's what I got:

Each view corresponds to a leg of the bipedal robot. The three signals are the normalized torques applied to the ankle, knee and hip.

Would you mind sharing your model so I can have a look?

Mohammad Ashraful Islam, 13 April 2020

Just to make sure: did you make the following change?

Replace:

actInfo = rlNumericSpec([numAct 1],'LowerLimit',-1,'UpperLimit',1);

with:

actInfo = rlNumericSpec([numAct 1]);

Answer 1 (accepted)

Asvin Kumar, 13 April 2020

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/515602-incorrect-tanhlayer-output-in-rl-agent#answer_425717

I've tried this, and I still don't see the values going beyond [-1, 1]. However, I might be able to answer your question. If you look at the helper functions createTD3Agent.m and createDDPGAgent.m, you will notice the agentOptions object. Its ExplorationModel (TD3) or NoiseOptions (DDPG) property specifies the kind of noise added to the predicted action. This is either an OrnsteinUhlenbeckActionNoise object or a GaussianActionNoise object, each with its own set of parameters. For details, see the noise options documented under rlDDPGAgentOptions and rlTD3AgentOptions. This noise is added to encourage the agent to explore the environment.

The output action from the tanhLayer in the actorNetwork will still be in the range [-1, 1]. Once the noise is added, the new action values are saturated to the limits specified in the action specification (actInfo). By default these limits are [-Inf, Inf], so if you do not specify them, the action values are not saturated.
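The sequence described above (tanh output, then added noise, then saturation to the specification limits) can be imitated in plain MATLAB. This is only a rough sketch, not the toolbox's internal code: it uses no Reinforcement Learning Toolbox calls, the variable names are hypothetical, and numAct = 6 and a Gaussian noise scale of 0.6 are assumptions chosen for illustration.

```matlab
% Plain-MATLAB sketch; every name below is hypothetical, not a toolbox API.
rng(0);                                  % fixed seed so the run is repeatable
numAct   = 6;                            % assumed number of actions
preAct   = 3*randn(numAct,1);            % stand-in for the input to the last layer
actorOut = tanh(preAct);                 % tanh output: always within [-1, 1]

sigma    = 0.6;                          % assumed noise scale
noisyAct = actorOut + sigma*randn(numAct,1);   % exploration noise is added afterwards

% Case 1: limits of -1 and 1 given in the action specification -> values are clipped
lowerLimit = -1;
upperLimit = 1;
satAct = min(max(noisyAct, lowerLimit), upperLimit);

% Case 2: default limits [-Inf, Inf] -> nothing is clipped
freeAct = min(max(noisyAct, -Inf), Inf);

% Columns: tanh output, noisy action, clipped action, unclipped action
disp([actorOut noisyAct satAct freeAct]);
```

In the second case the noisy action passes through unchanged, which would explain values outside [-1, 1] at the agent output even though the tanhLayer itself stays within range.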

Comments: 3

Abdul Basith Ashraf, 5 April 2021

Edited: Abdul Basith Ashraf, 5 April 2021

If only I had known when that noise was added, I could have saved a ton of time. Now I finally know: noise is added to the output of the actor. I had used a default value of 0.6 for the variance, but my input was in the range -1e-4 to +1e-4, and because of that my output was always saturated.
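The scale mismatch in this comment can be checked numerically with a small sketch. It assumes that 0.6 acts directly as the scale of Gaussian noise and that the limits are ±1e-4; the variable names are hypothetical, and nothing here calls the toolbox.

```matlab
% Plain-MATLAB sketch; treating 0.6 as the noise scale is an assumption.
rng(1);                                  % fixed seed so the run is repeatable
N     = 1e5;                             % number of simulated actions
limit = 1e-4;                            % assumed action limit (range -1e-4 to +1e-4)
sigma = 0.6;                             % assumed noise scale

action  = limit*(2*rand(N,1) - 1);       % noise-free actions, uniform in [-limit, limit]
noisy   = action + sigma*randn(N,1);     % noise added to the actor output
clipped = min(max(noisy, -limit), limit);    % saturation to the limits

% Fraction of samples that ended up exactly on one of the limits
fracSaturated = mean(abs(clipped) == limit);
fprintf('Fraction of saturated samples: %.4f\n', fracSaturated);
```

Because the noise scale is several thousand times larger than the action range, almost every sample lands on one of the two limits, so scaling the noise to the action range is what matters here.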

H. M., 20 October 2022

@Asvin Kumar

@Abdul Basith Ashraf

Does that mean we can use only the action limits,

actInfo = rlNumericSpec([numAct 1],'LowerLimit',-1,'UpperLimit',1);

without using a tanhLayer, and this will guarantee that the action stays in the desired range?
