Changing how DQN agent explores
조회 수: 2 (최근 30일)
이전 댓글 표시
Hi,
I'm using a DQN agent with epsilon-greedy exploration. The problem is that my agent sees state 1 99% of the time, so it never learns to act in other states. By the time it learns to get to state 2 from state 1, epsilon has already decayed significantly and the agent gets stuck taking a sub-optimal action in state 2. Is there a way to implement some other form of exploration, like using a Boltzmann distribution? Thanks for your time.
댓글 수: 2
Tanay Gupta
2021년 7월 13일
Can you give a brief description of the states and the respective transitions?
답변 (0개)
참고 항목
카테고리
Help Center 및 File Exchange에서 Training and Simulation에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!