Reinforcement Learning Toolbox - When does the algorithm train?

I am currently using the RL-Toolbox with a DQN-Agent built into a long-running process-simulation.
The maximum step count is currently 8000 steps per episode.
Unfortunately the documentation seems a little ambiguous to me, so here is my question:
Does the train function of the RL Toolbox train the agent at the end of an episode, or during the episode once the step count exceeds the minibatch size (like in the baseline algorithms)?
Thank you in advance.

Accepted Answer

Emmanouil Tzorakoleftherakis on 25 Sep 2019
The implementation is based on the algorithm listed here.
Weights are being updated at each time step.
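To make the timing concrete, here is a minimal, schematic sketch of a DQN-style loop in Python. It is not the toolbox's code: the function `run_episode`, the placeholder transitions, and the buffer size are all hypothetical, and it assumes (as in the standard DQN algorithm) that updates are skipped until the replay buffer holds at least one minibatch. It only counts how often a weight update would be triggered.

```python
import random
from collections import deque


def run_episode(max_steps, minibatch_size, buffer, seed=0):
    """Count the weight updates in one episode of a schematic DQN-style loop."""
    rng = random.Random(seed)
    updates = 0
    for step in range(max_steps):
        # Placeholder transition (state, action, reward, next_state).
        # A real agent would obtain this from the environment.
        buffer.append((step, rng.randrange(2), rng.random(), step + 1))

        # Learning happens inside the step loop, not after the episode ends.
        # Assumption: no update until the buffer holds one full minibatch.
        if len(buffer) >= minibatch_size:
            batch = rng.sample(list(buffer), minibatch_size)
            assert len(batch) == minibatch_size
            updates += 1  # stands in for one critic weight update
    return updates


# The replay buffer persists across episodes (hypothetical capacity).
buffer = deque(maxlen=10_000)
first = run_episode(max_steps=8000, minibatch_size=64, buffer=buffer)
second = run_episode(max_steps=8000, minibatch_size=64, buffer=buffer)
print(first, second)
```

Under these assumptions, only the first 63 steps of the very first episode pass without an update. Because the buffer carries over, every step of later episodes triggers one.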
  Comments: 1
Hans-Joachim Steinort on 26 Sep 2019
"For each training time step" - that was the line I was looking for (though looking into the source code led me to the same conclusion).
After double-checking the baseline algorithms I found that they do it the same way.
Thank you for your time!

Release

R2019a
