Noise parameters in Reinforcement learning DDPG

Surya teja Tunuguntla on 14 Jun 2019

https://kr.mathworks.com/matlabcentral/answers/467153-noise-parameters-in-reinforcement-learning-ddpg

Commented: Atikah Surriani on 8 May 2023

What should the values of the noise parameters (for the agent) be if my action range is between -0.5 and -5 in DDPG reinforcement learning and I want to explore the whole action range at each sample time? Also, is there any way to make the noise options (for the agent) independent of the sample time?

Accepted Answer

Drew Davis on 19 Jun 2019

Edited: Drew Davis on 19 Jun 2019

Hi Surya

For Ornstein-Uhlenbeck (OU) action noise, it is fairly common to set Variance*sqrt(SampleTime) to somewhere between 1% and 10% of your action range. Your action range spans 4.5 (from -0.5 to -5), so the variance can be set between 4.5*0.01/sqrt(SampleTime) and 4.5*0.10/sqrt(SampleTime). The other important factor is VarianceDecayRate, which dictates how fast the variance decays. You can calculate how many samples it takes for the variance to be halved with this simple formula:

halflife = log(0.5)/log(1-VarianceDecayRate)
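As a rough illustration of the two calculations above, here is a minimal Python sketch (not MATLAB toolbox code). The function names and the sample time of 0.01 are hypothetical and chosen only for the example; the arithmetic follows the 1% to 10% rule of thumb and the half-life formula given in this answer.

```python
import math


def ou_variance_bounds(action_low, action_high, sample_time,
                       low_frac=0.01, high_frac=0.10):
    """Return (min, max) Variance such that Variance*sqrt(sample_time)
    is between low_frac and high_frac of the action range."""
    action_range = abs(action_high - action_low)
    scale = math.sqrt(sample_time)
    return low_frac * action_range / scale, high_frac * action_range / scale


def half_life(variance_decay_rate):
    """Number of samples until the variance is halved."""
    if variance_decay_rate == 0:
        return math.inf  # no decay: the variance never halves
    return math.log(0.5) / math.log(1.0 - variance_decay_rate)


# Hypothetical sample time of 0.01 s; the action range is taken from the question.
var_min, var_max = ou_variance_bounds(-5.0, -0.5, sample_time=0.01)
print(f"Variance between {var_min:.3f} and {var_max:.3f}")
print(f"Half-life for a decay rate of 1e-5: {half_life(1e-5):.0f} samples")
```

A smaller sample time gives larger variance bounds, because the noise increment is scaled by sqrt(SampleTime).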

It is critically important for your agent to explore while learning, so keeping the VarianceDecayRate small (or even zero) is a good idea. The other noise parameters can usually be left at their defaults.
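To get a feel for what "small" means, the half-life formula can be inverted to find the decay rate for a target half-life. The Python sketch below assumes, as the half-life formula implies, that the variance is multiplied by (1 - VarianceDecayRate) once per sample; the function names and the example numbers are hypothetical.

```python
def decay_rate_for_half_life(n_samples):
    """Decay rate at which the variance halves after n_samples samples."""
    return 1.0 - 0.5 ** (1.0 / n_samples)


def variance_after(initial_variance, variance_decay_rate, n_steps):
    """Variance remaining after n_steps, assuming one multiplicative
    decay of (1 - variance_decay_rate) per step."""
    return initial_variance * (1.0 - variance_decay_rate) ** n_steps


# Example: keep at least half of the exploration noise for 100000 samples.
rate = decay_rate_for_half_life(100_000)
print(f"decay rate: {rate:.3e}")
print(f"variance left after 100000 samples: {variance_after(0.45, rate, 100_000):.4f}")
```

If training runs for many more samples than the half-life, the noise will have mostly vanished well before training ends, which is usually a sign that the decay rate is too large.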

You can check out this pendulum example which does a pretty good job of exploring during training.

The noise options inherit their sample time from the agent, so you do not need to configure it. By default, the noise model is queried at the same rate as the agent.
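The role of the sample time can be seen in a small simulation. The Python sketch below is not the toolbox implementation; it assumes a commonly used discretized OU update, x <- x + MeanAttractionConstant*(Mean - x)*Ts + Variance*sqrt(Ts)*N(0,1), with one update per agent step and a multiplicative variance decay. All names, default values, and the seed are assumptions made for illustration.

```python
import math
import random


def simulate_ou_noise(n_steps, sample_time, variance, mean=0.0,
                      mean_attraction=0.15, variance_decay_rate=0.0, seed=0):
    """Simulate OU action noise, drawing one sample per agent step."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    x = mean
    samples = []
    for _ in range(n_steps):
        # Pull back toward the mean, then add noise scaled by sqrt(sample_time).
        drift = mean_attraction * (mean - x) * sample_time
        diffusion = variance * math.sqrt(sample_time) * rng.gauss(0.0, 1.0)
        x += drift + diffusion
        samples.append(x)
        # Assumed decay: shrink the variance once per step.
        variance *= 1.0 - variance_decay_rate
    return samples


noise = simulate_ou_noise(n_steps=1000, sample_time=0.01, variance=0.45)
print(f"largest noise magnitude: {max(abs(v) for v in noise):.3f}")
```

Because the per-step noise is Variance*sqrt(Ts), changing the agent sample time changes the effective noise per step unless Variance is rescaled accordingly.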

Hope this helps

Drew