Is it possible to implement a prioritized replay buffer (PER) in a TD3 agent?
Hey,
I'm trying to implement a TD3 agent in MATLAB. Instead of a replay buffer that randomly chooses the samples for each mini-batch, I would like to implement a prioritized replay buffer. So far, I couldn't find an agent option to do so.
I would be very grateful if somebody could help me with my problem.
Thanks in advance for the answers.
Best regards,
Michael
Answers (1)
Ahmed R. Sayed
30 September 2022
By default, built-in off-policy agents (DQN, DDPG, TD3, SAC, MBPO) use an rlReplayMemory object as their experience buffer. Agents sample data uniformly from this buffer. To perform nonuniform prioritized sampling [1], which can improve sample efficiency when training your agent, use an rlPrioritizedReplayMemory object instead. Please refer to the rlPrioritizedReplayMemory documentation page.
[1] Schaul, Tom, John Quan, Ioannis Antonoglou, and David Silver. 'Prioritized experience replay'. arXiv:1511.05952 [Cs] 25 February 2016. https://arxiv.org/abs/1511.05952.
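A minimal sketch of the swap, assuming a release that ships rlPrioritizedReplayMemory (R2022b or later, as far as I know) with Reinforcement Learning Toolbox and Deep Learning Toolbox installed. The observation and action dimensions and the tuning values are placeholders chosen for illustration; they are not taken from the question.

```matlab
% Placeholder specs (assumed): 4-element observation, 1 continuous action.
% In practice, take these from your environment with
% getObservationInfo(env) and getActionInfo(env).
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1],'LowerLimit',-1,'UpperLimit',1);

% Create a TD3 agent with default actor and critic networks.
agent = rlTD3Agent(obsInfo,actInfo);

% Replace the default uniform replay buffer with a prioritized one.
agent.ExperienceBuffer = rlPrioritizedReplayMemory(obsInfo,actInfo);

% Optional tuning (illustrative values only).
agent.ExperienceBuffer.PriorityExponent = 0.6;                   % strength of prioritization
agent.ExperienceBuffer.InitialImportanceSamplingExponent = 0.4;  % initial bias correction
agent.ExperienceBuffer.NumAnnealingSteps = 1e4;                  % steps used to anneal the correction
```

After this, train the agent as usual. Note that this is a property of the agent object itself (ExperienceBuffer), not a setting in rlTD3AgentOptions, which may be why no matching agent option turned up.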