Hello!
I'm implementing the Double Q-Learning algorithm for the Pendulum-v0 environment. I'm looking at the pseudocode, and there's one part of the algorithm that I'm not sure how to write.
If you look at line 8 of the pseudocode, it states:
With Pr = 0.5:
Q_a(s,a)
else
Q_b(s,a)
Alternatively, the choice between Q_a and Q_b can be random. How can I implement this in my code so that, for each episode, either Q_a or Q_b is selected and run?
Thank you in advance!

Comments: 1

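P = 0.5;            % the split level (probability of taking the else branch)
if rand() >= P      % rand() is uniform on (0,1), so this is true with probability 1-P
    res = fnA();    % fnA is a placeholder for the Q_a branch
else
    res = fnB();    % fnB is a placeholder for the Q_b branch
end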


Accepted Answer

David Hill on 18 Aug 2020


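% Pre-draw one random bit (0 or 1) per episode, each with probability 0.5
randbit = randi(2,1,numEpisodes)-1;
for k = 1:numEpisodes
    if randbit(k)
        Q_a(s,a)    % placeholder: run the Q_a branch for this episode
    else
        Q_b(s,a)    % placeholder: run the Q_b branch for this episode
    end
end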
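
To make the placeholders above more concrete, here is a minimal sketch of what a single tabular Double Q-learning update could look like. As far as I know, the usual pseudocode makes the coin flip at every update step rather than once per episode, so the sketch flips per update; wrap it in the episode loop above if you prefer the per-episode choice. Everything apart from Q_a, Q_b, s and a is an assumption for illustration: the table sizes nS and nA, the values of alpha and gamma, and the sample transition are made up, and the Pendulum-v0 states and actions are assumed to be already discretized to integer indices.

```matlab
% Sketch of one tabular Double Q-learning update (not a full agent).
% Assumptions: states/actions are discretized to integer indices;
% nS, nA, alpha, gamma and the transition below are hypothetical values.
nS = 10; nA = 3;              % hypothetical table sizes
alpha = 0.1; gamma = 0.99;    % hypothetical learning rate and discount factor
Q_a = zeros(nS, nA);
Q_b = zeros(nS, nA);

% One observed transition (s, a, r, sNext), made up for illustration
s = 2; a = 1; r = -1.0; sNext = 3;

if rand() < 0.5
    % Update Q_a: choose the greedy next action with Q_a, evaluate it with Q_b
    [~, aStar] = max(Q_a(sNext, :));
    Q_a(s, a) = Q_a(s, a) + alpha * (r + gamma * Q_b(sNext, aStar) - Q_a(s, a));
else
    % Update Q_b: choose the greedy next action with Q_b, evaluate it with Q_a
    [~, bStar] = max(Q_b(sNext, :));
    Q_b(s, a) = Q_b(s, a) + alpha * (r + gamma * Q_a(sNext, bStar) - Q_b(s, a));
end
```

Only one of the two tables changes per update, which is the point of the 0.5 split: each table is trained on roughly half of the transitions and is evaluated using the other table.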


Asked: 18 Aug 2020

Commented: dpb on 18 Aug 2020
