Why does reinforcement learning give different actions from sim() and getAction()?

Question

Shuyue Li, 7 September 2023

https://kr.mathworks.com/matlabcentral/answers/2018301-why-reinforcement-learning-has-different-results-of-action-between-sim-and-getaction

Answered: Emmanouil Tzorakoleftherakis, 25 September 2023

Hi MATLAB Reinforcement Learning team,

I have a well-trained PPO actor-critic agent and set UseExplorationPolicy to 0. I then obtained actions from sim() and from getAction(), with no randomness in the environment. Both calls use the same observations and the same agent.

However, the actions returned by sim() and getAction() differ, although each set of actions is reproducible on its own.

I would therefore like to know how sim() generates actions. Do the actions come from the actor network? If so, why do the results differ when the network is the same?

Code:

actoraction = getAction(saved_agent,{testobstate});

ResetHandleT = @() myResetFunctionCNsim(testData,testobstate);

StepHandleT = @(Action,StockSaved) myStepFunctionCNsim(Action,StockSaved,testData,testobstate);

envT = rlFunctionEnv(observationInfo,actionInfo,StepHandleT,ResetHandleT);

experience = sim(envT,saved_agent,simOpts);

I look forward to your reply.

Sincerely,

Shuyue


Answer 1

Emmanouil Tzorakoleftherakis, 25 September 2023

https://kr.mathworks.com/matlabcentral/answers/2018301-why-reinforcement-learning-has-different-results-of-action-between-sim-and-getaction#answer_1317957

Hi,

Which release are you using? We tested in R2023a and R2023b with UseExplorationPolicy = 0, and getAction and sim gave the same results. A reproduction model would be a great help.
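
For reference, a minimal reproduction could look like the sketch below. It is not the original poster's code: it assumes the predefined "CartPole-Discrete" environment and an untrained default PPO agent as stand-ins for envT and saved_agent, and it assumes the logged timeseries store one sample per page along the third dimension. The idea is to replay every observation that sim logged through getAction and count how often the two actions differ.

```matlab
% Stand-in environment and agent (assumptions, not the poster's envT / saved_agent).
rng(0);                                   % fix the seed so the run is repeatable
env     = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
agent   = rlPPOAgent(obsInfo, actInfo);   % default, untrained PPO agent
agent.UseExplorationPolicy = false;       % same setting as in the question

simOpts    = rlSimulationOptions("MaxSteps", 20);
experience = sim(env, agent, simOpts);

% Field names follow the spec names, so look them up instead of hard-coding them.
obsField = fieldnames(experience.Observation);
actField = fieldnames(experience.Action);
obsData  = experience.Observation.(obsField{1}).Data;      % assumed 4-by-1-by-(N+1)
actData  = squeeze(experience.Action.(actField{1}).Data);  % N-by-1 for a scalar action

numSteps   = numel(actData);
numDiffers = 0;
for k = 1:numSteps
    % Observation that sim passed to the agent at step k.
    obs = reshape(obsData(:, :, k), obsInfo.Dimension);
    act = getAction(agent, {obs});        % getAction returns a cell array
    numDiffers = numDiffers + ~isequal(double(act{1}), double(actData(k)));
end
fprintf("%d of %d actions differ\n", numDiffers, numSteps);
```

One thing worth checking with a script like this: sim takes its first observation from the environment's reset function, so comparing against the logged observations, rather than a separately stored test observation, rules out a mismatch in the inputs themselves.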
