Why is the Reinforcement Learning Agent block's action output not reset?
Views: 4 (last 30 days)
Show older comments
I'm working on an RL Agent problem: learning a reference trajectory in a two-dimensional plane. The algorithm I use is DDPG. My agent is a missile, and that missile must learn to follow the reference trajectory. For this, I set up my states as follows:
Let the coordinates of the agent at time t be denoted by (x_m, y_m), and the coordinates of the reference trajectory by (x_r, y_r).
My states are:
x_m, y_m, (x_m - x_r), (y_m - y_r), and the integrals of these differences (inspired by the Water-Tank problem).
My action is:
a = lateral acceleration command (between -100 and 100)
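A minimal sketch of how such observation and action specifications are typically defined with rlNumericSpec (the 6-element dimension and the names below are illustrative, not the exact code from the model):

% Sketch only: 6 observations (x_m, y_m, the two tracking errors and
% their integrals) and one scalar action that the network outputs in (-1,1).
obsInfo = rlNumericSpec([6 1]);
obsInfo.Name = 'missile observations';
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);
actInfo.Name = 'lateral acceleration command';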
Here is my Simulink model:
[image: Simulink model]
Here is my reward function:
[image: reward function]
My problem is:
My network's output is in the range (-1, 1). I multiply the network output by 100 to create the appropriate input for my environment, but I do this multiplication after adding the OU noise, so the variance of my noise is 0.3.
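A minimal sketch of the noise setup described above (Ts stands in for the model's sample time):

agentOpts = rlDDPGAgentOptions('SampleTime', Ts);
agentOpts.NoiseOptions.Variance = 0.3;   % OU noise variance, as stated above
% The agent output stays in (-1,1); the scaling to (-100,100) is done in
% Simulink, e.g. with a Gain block of 100 placed after the RL Agent block.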
Everything is fine so far, but something caught my attention during the simulation. While defining the observationInformation, I deliberately set the limit values to the range (-2, 2). Then I saw that in each new episode, the output of the Agent block is the last value from the previous episode.

In the first episode, the action ends at -1.2 on the last step.
In the second episode, the action starts at -1.2. --> Why? The environment is reset, so why is the action not reset as well?
As a result, the environment continues to be explored around that value, with noise added on top of the last action. But shouldn't the Agent block's action output be reset when the environment is reset at the start of every episode?
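For context, a reset function is attached to a Simulink environment roughly like this (a sketch; mdl and x0 are placeholders, not the actual model's names):

env = rlSimulinkEnv(mdl, [mdl '/RL Agent'], obsInfo, actInfo);
env.ResetFcn = @(in) setVariable(in, 'x0', [0; 0], 'Workspace', mdl);
% ResetFcn resets variables/states on the environment side at the start
% of each training episode.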
When I pull the action limits in to the range (-1, 1), the action taken by my agent saturates at one of the limit values and the agent cannot learn anything.

I have spent 7 days trying everything to make the agent learn, but I am slowly starting to give up. I need some advice to understand where I went wrong.
Thanks in advance
1 Comment
Venu
22 November 2023
Could you share the reset function commands you used while defining the Simulink environment? That would help to take the debugging process further.
Please see this documentation for reference:
https://www.mathworks.com/help/reinforcement-learning/ref/rl.env.simulinkenvwithagent.html
Answers (0)