Problems with reward generation in reinforcement learning simulation

Question

Aaron Bramhasta on 10 Sep 2024

https://kr.mathworks.com/matlabcentral/answers/2151810-problems-with-reward-generation-in-reinforcement-learning-simulation

Commented: Aaron Bramhasta on 25 Sep 2024

Hi everyone,

I am currently running a reinforcement learning model integrated with SimEvents blocks in Simulink. I have both a reinforcement learning script and an RL Agent block in the Simulink model. My reward is computed by a MATLAB Function block connected to the reward input of the RL Agent block. The problem is that the reward stays constant across all episodes of the RL training. Any ideas why? I tried to keep the reward function (code below) as simple as possible: it takes values extracted from the SimEvents entities, which are supposed to differ with each iteration.

function r = w(u1, u2, u3) %#codegen
    % Extract Entities
    FH = u1 + 1;
    Cost = u2 + 1;
    Downtime = u3 + 1;
    % Reward calculation based on values
    r = (Downtime/Cost) * FH;
end

There also seems to be a problem because this reward area is shown in red, even though the simulation runs normally.

I uploaded my model and a screenshot of the reward from the RL training result. If you would like to replicate my results, here are the steps:

Run script A.mlx to generate random number set A
Run script B.mlx to generate random number set B
Run MainScript.mlx to run the simulation

Thank you so much in advance! Let me know should you require any further information.

Best,

Aaron.


Answer 1

Subhajyoti on 13 Sep 2024

https://kr.mathworks.com/matlabcentral/answers/2151810-problems-with-reward-generation-in-reinforcement-learning-simulation#answer_1515795

Hi @Aaron Bramhasta,

It is my understanding that you are trying to train an RL model, but the reward is not updating as expected.

This happens because the values of ‘FH’, ‘Cost’, and ‘Downtime’ are not being updated between iterations. In every episode, the model reads the initial default values when it requests them, which produces a constant reward.

To address this issue, you can either save the values to the workspace after each update or add a feedback loop that passes the updated values to the next iteration.

Refer to the following link for more information on ‘To Workspace’ block in Simulink:

https://www.mathworks.com/help/simulink/slref/toworkspace.html
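One way to check whether stale inputs are the cause is to export the three inputs and inspect them after a run. The snippet below is only a sketch under stated assumptions: `FHlog`, `CostLog`, and `DowntimeLog` are hypothetical stand-ins for data exported from the model (they are not names from the original model), and the reward formula is copied from the question in vectorized form.

```matlab
% Hypothetical logged inputs (stand-ins for data exported with To Workspace blocks).
FHlog       = [0; 0; 0; 0];
CostLog     = [2; 2; 2; 2];
DowntimeLog = [1; 1; 1; 1];

% Same formula as the reward function in the question, written element-wise.
rewardFcn = @(u1, u2, u3) ((u3 + 1) ./ (u2 + 1)) .* (u1 + 1);

r = rewardFcn(FHlog, CostLog, DowntimeLog);

% A signal is treated as constant when its range is zero.
isConstant = @(x) (max(x) - min(x)) == 0;

fprintf('FH constant: %d, Cost constant: %d, Downtime constant: %d\n', ...
    isConstant(FHlog), isConstant(CostLog), isConstant(DowntimeLog));
fprintf('Reward constant: %d\n', isConstant(r));
```

If all three inputs turn out to be constant, the reward cannot vary either. Note that because of the `+ 1` offsets, inputs that are stuck at zero would give a reward of exactly 1.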

Additionally, you can refer to the following resource to learn more about ‘Reward and Observation Signals in Custom Environments’ in MATLAB:

https://www.mathworks.com/help/reinforcement-learning/ug/define-reward-and-observation-signals.html

Comments (1)

Aaron Bramhasta on 25 Sep 2024

Hi @Subhajyoti, thank you for your reply, and apologies for my late response.

My model already forms a feedback loop, so the updated values should always be passed to the next iteration. I don't understand the suggestion about saving the values to the workspace: when should I read these values back?

Also, do you have any idea why the reward generated by the MATLAB Function block and the reward shown in the training manager differ so much? The MATLAB Function block outputs decimals below 1, as it should, but the training manager shows numbers around 70.

Thanks in advance!
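A possible explanation for the size difference, not confirmed for this particular model: the training plot reports the episode reward, which is the sum of the per-step rewards over an episode, rather than a single step's reward. The sketch below illustrates the arithmetic with assumed numbers; `numSteps` and the step reward of 0.7 are hypothetical and not taken from the model.

```matlab
% Hypothetical per-step rewards for one episode (each below 1, as described above).
numSteps    = 100;                       % assumed number of agent steps per episode
stepRewards = 0.7 * ones(numSteps, 1);   % assumed step reward

% The episode reward is the sum of all step rewards in the episode.
episodeReward = sum(stepRewards);

fprintf('Step reward: %.2f, episode reward: %.1f\n', stepRewards(1), episodeReward);
```

Under these assumptions, step rewards below 1 add up to an episode reward of about 70, so the two numbers can both be correct while measuring different things.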
