DAMODARAN B.K

Last seen: almost 5 years ago | Active since 2021

Followers: 0 Following: 0

Statistics

Feeds

All (2)
MATLAB Answers (2)

Question

Why does the RL agent perform the same actions repeatedly, even though this does not constitute an optimal policy or a better episode Q0? Can anyone explain?

More than 5 years ago | Answers: 0 | 0

0

Answers

Question

Episode Q0 increases exponentially
Can anyone explain why episode Q0 in RL increases exponentially after convergence of reward to a suboptimal policy?

More than 5 years ago | Answers: 1 | 0

1

Answer