
Easy way to evaluate / compare the performance of an RL algorithm

5 views (last 30 days)
Saurav Sthapit on 29 Jul 2020
Edited: Saurav Sthapit on 6 Aug 2020
I have a trained RL agent and would like to compare its performance with a dumb agent's. I can run simout=sim(env,agent,simOpts) to evaluate the actual agent, but I would like to compare the simulation results with those of a couple of dumb agents that always take the same action or a random action. Is there an easy way to do this?
Currently, I have a separate Simulink model without the RL Agent block (replaced with a Constant block) and log the observations and rewards using the Simulation Data Inspector. Roughly, the workflow looks like the sketch below ('myModelDumb' and the option values are placeholders, not the actual names in my model):
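
% Sketch of the two-model workflow described above. 'myModelDumb' is a
% placeholder name for the model copy with the Constant block.
simOpts = rlSimulationOptions('MaxSteps', 1000);   % assumed option value
simout  = sim(env, agent, simOpts);   % trained agent in the original model
outDumb = sim('myModelDumb');         % constant-action copy of the model
Simulink.sdi.view                     % compare the logged runs side by side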
Thanks
Saurav

Answers (1)

Emmanouil Tzorakoleftherakis on 3 Aug 2020
Why not use a MATLAB Fcn block and implement the dummy agent in there? If you want random/constant actions, it should be just one line. For example, the block body could be as simple as the sketch below (assuming a scalar action bounded to [-1, 1]; the function name and bounds are placeholders):
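
function action = dummyAgent(observation)
% Dummy agent: ignore the observation and emit a "dumb" action.
action = -1 + 2*rand();   % random action, uniform on [-1, 1]
% action = 0;             % or a fixed/constant action instead
end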
1 Comment
Saurav Sthapit on 6 Aug 2020
Edited: Saurav Sthapit on 6 Aug 2020
Thanks, that's an excellent suggestion for evaluating random actions.
However, when I do that (or use Constant blocks), I have to run the two statements below: the first to evaluate the random/dumb agent and the second to evaluate the trained agent.
logsout=sim(mdl)
simout=sim(env,agent,simOpts)
logsout and simout are not directly comparable, but logsout is a field in the simout.SimulationInfo struct.
I am wondering if this is the best approach or if there is an easier way to do this.
Also, simout contains the actions, observations, and rewards, but if the reward is a weighted sum of multiple rewards, I can't access the individual rewards. (Of course, I can compare logsout with simout.logsout.) One way to line the two runs up programmatically is the sketch below ('reward1' is a placeholder signal name, assuming the individual reward terms are logged as signals in both models):
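
% Run both simulations; per the note above, the agent run's Simulink
% log lives inside simout.SimulationInfo.
outDumb      = sim(mdl);                       % Simulink.SimulationOutput
simout       = sim(env, agent, simOpts);       % trained agent
logsoutDumb  = outDumb.logsout;
logsoutAgent = simout.SimulationInfo.logsout;
% Compare one logged reward term by name ('reward1' is a placeholder).
rDumb  = logsoutDumb.getElement('reward1').Values;
rAgent = logsoutAgent.getElement('reward1').Values;
fprintf('Cumulative reward1 -- dumb: %.3f, agent: %.3f\n', ...
        sum(rDumb.Data), sum(rAgent.Data));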


Release

R2019a
