
Is it possible to implement a prioritized replay buffer (PER) in a TD3 agent?

Michael Müller on 18 Jun 2021
Answered: Ahmed R. Sayed on 30 Sep 2022
Hey,
I'm trying to implement a TD3 agent in MATLAB. Instead of a replay buffer that samples experiences uniformly at random for each mini-batch, I would like to use a prioritized replay buffer. So far, I haven't found an agent option that does this.
I would be very grateful if somebody could help me with my problem.
Thanks in advance for the answers.
Best regards,
Michael

Answers (1)

Ahmed R. Sayed on 30 Sep 2022
By default, the built-in off-policy agents (DQN, DDPG, TD3, SAC, MBPO) use an rlReplayMemory object as their experience buffer, and they sample data from this buffer uniformly. To perform nonuniform, prioritized sampling [1], which can improve sample efficiency when training your agent, use an rlPrioritizedReplayMemory object instead. Please refer to the rlPrioritizedReplayMemory documentation for details.
[1] Schaul, Tom, John Quan, Ioannis Antonoglou, and David Silver. 'Prioritized experience replay'. arXiv:1511.05952 [Cs] 25 February 2016. https://arxiv.org/abs/1511.05952.
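
As a rough illustration of the suggestion above, here is a minimal sketch of swapping the buffer on a TD3 agent. It assumes a Reinforcement Learning Toolbox release that ships rlPrioritizedReplayMemory (I believe this means a release newer than the R2021a listed for this question) and Deep Learning Toolbox for the default networks. The observation and action sizes, the buffer length, and the exponent values are placeholders chosen for the example, not recommendations.

```matlab
% Sketch: give a default TD3 agent a prioritized replay buffer.
% Assumption: the installed release includes rlPrioritizedReplayMemory.

% Hypothetical observation and action specifications
% (TD3 requires a continuous action space).
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);

% Create a TD3 agent with default actor and critic networks.
agent = rlTD3Agent(obsInfo, actInfo);

% Replace the default uniform-sampling buffer with a prioritized one.
maxLength = 1e5;  % illustrative buffer capacity
agent.ExperienceBuffer = rlPrioritizedReplayMemory(obsInfo, actInfo, maxLength);

% Optional tuning of the prioritization; values are illustrative only.
agent.ExperienceBuffer.PriorityExponent = 0.6;
agent.ExperienceBuffer.InitialImportanceSamplingExponent = 0.4;
agent.ExperienceBuffer.NumAnnealingSteps = 1e4;
```

After this, the agent can be trained with train as usual. Replace obsInfo and actInfo with the specifications of your own environment, for example those returned by getObservationInfo and getActionInfo.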

Release

R2021a
