number of look ahead steps in DDPG Agent Options
How does the parameter "NumStepsToLookAhead" in rlDDPGAgentOptions, from the Reinforcement Learning Toolbox in MATLAB R2019b, work?
- Is the look-ahead applied to the target networks (i.e., a change in the critic objective from {r + gamma*Qt - Q} to {r + sum(gamma**i*Qt) - Q})?
- Or is the look-ahead applied to the reward sampling itself (i.e., replacing the reward "r" of each sample with "r + gamma*r_t + gamma**2*r_t+1 + ...")?
Any help is highly appreciated.
Answers (1)
Anh Tran
1 Mar 2020
I am not sure what "reward sampling" means. "NumStepsToLookAhead" in rlDDPGAgentOptions changes the critic's target values in step 5 of the DDPG training algorithm.
Assuming g is the discount factor, the critic target is as follows:
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/274552/image.png)
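For readers who cannot see the image, here is a minimal Python sketch of the standard n-step critic target, on the assumption that this is what "NumStepsToLookAhead" corresponds to: the first n rewards are discounted and summed, and the target critic's value at the state reached after n steps is added with weight g^n. The function name and arguments are hypothetical and are not part of the toolbox API.

```python
def n_step_critic_target(rewards, g, bootstrap_q):
    """Standard n-step target (assumed form, not taken from the toolbox source).

    rewards     -- [R_t, R_t+1, ..., R_t+n-1], so n = len(rewards)
    g           -- discount factor
    bootstrap_q -- target critic value at the state reached after n steps
    """
    n = len(rewards)
    # Discounted sum of the n observed rewards: R_t + g*R_t+1 + ... + g^(n-1)*R_t+n-1
    discounted_rewards = sum((g ** k) * r for k, r in enumerate(rewards))
    # Bootstrap from the target critic, discounted by g^n
    return discounted_rewards + (g ** n) * bootstrap_q


# n = 1 recovers the usual one-step target r + g*Q'
one_step = n_step_critic_target([1.0], 0.5, 4.0)

# n = 3: 1 + 0.5*2 + 0.25*4 + 0.125*8
three_step = n_step_critic_target([1.0, 2.0, 4.0], 0.5, 8.0)
```

With n = 1 this reduces to the ordinary one-step DDPG target, so under this reading the look-ahead changes both the summed reward and the point from which the target networks bootstrap.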
Comments: 4
Dingshan Sun
1 Sep 2022
Could you give a hint on how R_t, R_t+1, R_t+2, ..., R_t+n-1 can be obtained in an online off-policy algorithm, especially for DRL methods that use experience replay?
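One common approach, sketched below in Python, is to hold the most recent n transitions in a short sliding window while interacting with the environment and to write one collapsed n-step experience to the replay buffer once the window is full. This is a general technique and not a description of how the toolbox does it internally; the class name and tuple layout are hypothetical. The intermediate rewards come from whatever policy was acting at collection time, an approximation that n-step off-policy methods commonly accept.

```python
from collections import deque


class NStepAccumulator:
    """Collapses consecutive transitions into n-step experiences (illustrative only)."""

    def __init__(self, n, g):
        self.n = n              # number of look-ahead steps
        self.g = g              # discount factor
        self.window = deque()   # pending (state, action, reward) tuples

    def _emit(self, next_state, done):
        # Discounted sum of the rewards currently in the window
        ret = sum((self.g ** k) * r for k, (_, _, r) in enumerate(self.window))
        state, action, _ = self.window[0]
        steps = len(self.window)
        self.window.popleft()
        # 'steps' tells the learner which power of g to apply when bootstrapping
        return (state, action, ret, next_state, done, steps)

    def push(self, state, action, reward, next_state, done):
        """Add one environment step; return experiences ready for the replay buffer."""
        self.window.append((state, action, reward))
        out = []
        if done:
            # Episode ended: flush the remaining, progressively shorter windows
            while self.window:
                out.append(self._emit(next_state, True))
        elif len(self.window) == self.n:
            out.append(self._emit(next_state, False))
        return out
```

Each stored tuple already contains the summed reward, so at training time the critic target would be `ret + (g ** steps) * Q_target(next_state)` for non-terminal experiences and just `ret` for terminal ones.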