Solving critic overestimation and exploring a specific action range
Hello,
I'm using a DDPG agent to tune a robot controller. All of my rewards are negative. My critic learning rate is 0.01 and my actor learning rate is 0.0001, both with the Adam optimizer, and my gradient thresholds are 1. I have two questions:
1- When my action range is [0.00001 0.2], Q0 also predicts a negative value (although with a large bias relative to the actual value), but when my action range is [0.00001 0.5], my critic overestimates heavily, predicting large positive values. Why does this happen with a larger action range?
2- I define my action range as [0.00001 0.5], but I know my best action lies somewhere in [0.1 0.2] most of the time. How should I set up my actor to explore this range more? Is this related to the noise options? How should I set the Ornstein-Uhlenbeck noise options to explore this area?
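To make the second question concrete, here is a minimal Python sketch of what I understand Ornstein-Uhlenbeck exploration noise to be: a mean-reverting random process that is added to the actor output and then clipped to the action limits. It is not the toolbox implementation. The function names and the values of `theta`, `sigma`, `dt`, and the nominal actor output 0.15 are assumptions chosen for illustration only.

```python
import math
import random


def ou_noise(n_steps, mean=0.0, theta=0.15, sigma=0.03, dt=1.0, x0=0.0, seed=0):
    """Discretized Ornstein-Uhlenbeck process (Euler-Maruyama step).

    theta pulls the state back toward `mean`; sigma scales the random kick.
    All default values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        x += theta * (mean - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def explore(actor_action, noise, low=0.00001, high=0.5):
    """Add noise to the actor output and clip to the action range from the post."""
    return min(max(actor_action + noise, low), high)


# Hypothetical case: the actor currently outputs 0.15 at every step.
noise = ou_noise(1000)
actions = [explore(0.15, n) for n in noise]

# Fraction of explored actions that land in the band of interest [0.1, 0.2].
frac_in_band = sum(0.1 <= a <= 0.2 for a in actions) / len(actions)
print(f"fraction of actions in [0.1, 0.2]: {frac_in_band:.2f}")
```

In this sketch `sigma` sets how far the explored actions spread around the actor output, so a value that is small compared with the 0.5 range keeps more samples near [0.1 0.2], while `theta` controls how quickly the noise returns to its mean.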
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1493642/image.png)
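On the first question, this is roughly how I compare Q0 with the actual value: I compute the discounted return of an episode from its logged rewards. Below is a minimal sketch in Python. The reward list and the discount factor 0.99 are made-up placeholders, not values from my training run.

```python
def discounted_return(rewards, gamma=0.99):
    """Discounted sum of rewards from the first step: r0 + gamma*r1 + gamma^2*r2 + ..."""
    g = 0.0
    # Accumulate backwards so each step is a single multiply-add.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical episode in which every reward is negative, as in my setup.
rewards = [-1.0, -0.5, -0.25]
g0 = discounted_return(rewards)
print(f"discounted return from the initial state: {g0:.4f}")
```

Because every reward is negative, this return can never be above zero, so any positive Q0 is an overestimate regardless of how large the remaining bias is.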