DDPG algorithm/Experience Buffer/ rl.util.ExperienceBuffer

3 views (last 30 days)
hieu nguyen on 26 Apr 2023
Answered: Aravind on 12 Feb 2025
I want to code my own DDPG algorithm. In the initial steps, the batch size is bigger than the number of experiences in the experience buffer. How can I still get enough sampled data for my mini-batch?
I use rl.util.ExperienceBuffer to create my experience buffer and the createSampledExperienceMiniBatch(buffer, BatchSize) function to get data for the mini-batch. However, when the experience buffer holds fewer entries than BatchSize, the function returns a 0x0 cell.

Answers (1)

Aravind on 12 Feb 2025
To manage the initial phase, when your experience buffer has fewer experiences than the desired mini-batch size in a DDPG algorithm, you can consider one of these two options:
  1. Adjust the Mini-Batch Size Dynamically: When calling the “createSampledExperienceMiniBatch” function, use the current buffer size as the mini-batch size whenever it is smaller than the desired size. This lets the agent learn at every time step. The downside is that the agent might not explore sufficiently, potentially leading to a sub-optimal policy by the end of training.
  2. Warm-Up Phase: Implement an initial phase where the agent collects experiences without updating the policy, so the buffer is adequately filled. This is a common approach when training a DDPG agent. The agent takes random actions until the buffer holds enough entries to match the batch size. Once the buffer length meets the batch size, you can begin training, thus avoiding errors with the “createSampledExperienceMiniBatch” function (see the sketch after this list).
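Below is a minimal MATLAB sketch of both options inside a custom training loop. It assumes “buffer” is the rl.util.ExperienceBuffer you already created; since that class is an internal, undocumented utility, the Length property and the helpers randomAction/actorPolicy/storeExperience used here are assumptions for illustration. Verify the actual API on your release with methods(buffer) and properties(buffer).
batchSize   = 64;    % desired mini-batch size
warmUpSteps = 1000;  % random-action steps before learning starts (Option 2)

for step = 1:maxSteps
    if step <= warmUpSteps
        action = randomAction();            % hypothetical helper: pure exploration
    else
        action = actorPolicy(observation);  % hypothetical helper: your actor network
    end

    % ... step the environment, observe nextObservation and reward ...
    storeExperience(buffer, observation, action, reward, nextObservation); % hypothetical

    % Option 2 (warm-up): sample only once enough experiences exist, so
    % createSampledExperienceMiniBatch never returns an empty 0x0 cell.
    if buffer.Length >= batchSize           % Length: assumed property
        miniBatch = createSampledExperienceMiniBatch(buffer, batchSize);
        % ... DDPG critic and actor updates using miniBatch ...
    end

    % Option 1 (dynamic batch size) would instead sample, once the buffer
    % is non-empty, with:
    %   miniBatch = createSampledExperienceMiniBatch(buffer, min(batchSize, buffer.Length));
end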
I hope this addresses your query.
