
Optimize an RL Agent for DC Motor Speed Control

32 views (last 30 days)
Franz Schnyder on 14 Apr 2023
Commented: 주원 on 28 May 2024
Hello everyone,
I am trying to replace a PI controller with an RL agent to achieve simple speed control of a motor (for now without current control). So far I have managed to get the RL agent to behave like a P controller: it holds the set speed well and corrects quickly after a step. However, a steady-state error of 10-200 rpm remains (depending on the specified target speed). The agent currently observes the measured rpm, the error, and the integrated error.
The reward penalizes the error linearly and adds a bonus once the speed gets within 50 rpm of the target.
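Roughly, that shaping looks like the following sketch (the function name, signal names, and bonus magnitude are illustrative only, not the actual model):

```matlab
% Illustrative sketch of the reward shaping described above (names and the
% bonus magnitude are placeholders, not the actual model).
function r = speedReward(speedRef, speedMeas)
    err = speedRef - speedMeas;   % speed error in rpm
    r   = -abs(err);              % linear penalty on the error
    if abs(err) < 50
        r = r + 10;               % extra reward inside the 50 rpm band
    end
end
```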
In the plot I simulated a braking load from second 2 onwards. The goal is to reach the desired speed despite the brake, but an error remains, together with some oscillation.
I am running out of ideas for what to change to teach the RL agent a robust PI-controller behavior, and would appreciate any suggestions. As a template for the actor and critic I used the water tank example.
Another problem is that so far the agent has only learned the behavior for positive speeds. Teaching it to behave the same way in the negative range, simply with negative voltage, has not worked yet.
Thanks in advance for any answer.
2 Comments
madhav on 7 Nov 2023
Hi Franz,
were you able to control the speed now? If so, please share the code for my reference.
주원 on 28 May 2024
Hi, can I get your project details?


Answers (1)

Yash Sharma on 20 Oct 2023
Hi Franz,
I understand that you want to replace the PI controller with an RL (reinforcement learning) agent and would like to increase the accuracy of the system. To achieve this, you can consider the following:
  • Adjust the reward function: Instead of penalizing the agent linearly for errors, use a reward function that penalizes larger errors more heavily, such as a quadratic or exponential penalty. This helps the agent prioritize reducing the error more effectively (see the reward sketch after this list).
  • Experiment with different network architectures: Try increasing the depth or width of the neural networks used for the actor and critic (see the network sketch after this list). This gives the agent more capacity to learn complex control strategies.
  • Tune the exploration: Try adjusting the exploration rate or using different exploration strategies, such as epsilon-greedy or noise-based exploration (see the agent-options sketch after this list). This lets the agent explore a wider range of actions and potentially discover better control strategies.
  • Explore different reward structures: In addition to the error, consider incorporating other factors into the reward. For example, include a term that rewards a stable and smooth response, such as penalizing large changes in the control output; the reward sketch after this list combines this with a quadratic error penalty. This encourages the agent to learn a more robust and stable control strategy.
  • Adjust hyperparameters: Hyperparameters such as the learning rate, discount factor, and exploration rate decay can significantly impact the learning process. Experiment with different values to find the ones that work best for your specific problem (see the agent-options sketch after this list).
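As a rough illustration of the reward-function points above, the sketch below combines a quadratic error penalty with a penalty on changes of the control output; the function name, signals, and all weights are assumptions that would need tuning for your motor:

```matlab
% Hypothetical shaped reward: quadratic error penalty plus a penalty on
% abrupt changes of the control voltage (all weights are assumptions to tune).
function r = shapedReward(speedRef, speedMeas, u, uPrev)
    err = speedRef - speedMeas;    % speed error in rpm
    r   = -1e-3*err^2 ...          % quadratic: large errors are penalized much harder
          - 0.1*(u - uPrev)^2;     % discourage large jumps in the control output
    if abs(err) < 50
        r = r + 1;                 % small bonus near the target band
    end
end
```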
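For the network-architecture point, a minimal sketch of a somewhat wider actor and critic for a DDPG-style agent (the agent type used in the water tank example) could look as follows; the observation/action specs, layer sizes, and voltage limits are assumptions:

```matlab
% Sketch of actor/critic networks (layer sizes and limits are assumptions).
obsInfo = rlNumericSpec([3 1]);   % [measured rpm; error; integrated error]
actInfo = rlNumericSpec([1 1], "LowerLimit", -24, "UpperLimit", 24);  % assumed voltage range

% Critic Q(s,a): separate observation and action paths joined by addition
statePath = [
    featureInputLayer(3, "Name", "obs")
    fullyConnectedLayer(64, "Name", "fcObs1")
    reluLayer("Name", "reluObs")
    fullyConnectedLayer(64, "Name", "fcObs2")];
actionPath = [
    featureInputLayer(1, "Name", "act")
    fullyConnectedLayer(64, "Name", "fcAct")];
commonPath = [
    additionLayer(2, "Name", "add")
    reluLayer("Name", "reluCommon")
    fullyConnectedLayer(1, "Name", "qValue")];

criticNet = layerGraph(statePath);
criticNet = addLayers(criticNet, actionPath);
criticNet = addLayers(criticNet, commonPath);
criticNet = connectLayers(criticNet, "fcObs2", "add/in1");
criticNet = connectLayers(criticNet, "fcAct",  "add/in2");
critic = rlQValueFunction(criticNet, obsInfo, actInfo, ...
    "ObservationInputNames", "obs", "ActionInputNames", "act");

% Actor: deterministic policy, tanh output scaled to the assumed voltage range
actorNet = [
    featureInputLayer(3)
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(1)
    tanhLayer
    scalingLayer("Scale", 24)];
actor = rlContinuousDeterministicActor(actorNet, obsInfo, actInfo);
```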
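And for the exploration and hyperparameter points, a sketch of DDPG agent options; every value here is an assumed starting point to experiment with, not a tuned recommendation:

```matlab
% Sketch of DDPG agent options: exploration noise, learning rates, discount
% factor, and buffer size are all assumed starting points.
agentOpts = rlDDPGAgentOptions( ...
    "SampleTime", 0.01, ...                 % assumed controller sample time
    "DiscountFactor", 0.995, ...            % favor long-term (steady-state) accuracy
    "MiniBatchSize", 128, ...
    "ExperienceBufferLength", 1e6);
agentOpts.NoiseOptions.StandardDeviation          = 0.3;   % wider exploration early on
agentOpts.NoiseOptions.StandardDeviationDecayRate = 1e-5;  % decay exploration over training
agentOpts.ActorOptimizerOptions.LearnRate  = 1e-4;
agentOpts.CriticOptimizerOptions.LearnRate = 1e-3;

agent = rlDDPGAgent(actor, critic, agentOpts);   % actor/critic from the previous sketch
```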
Please find below links to documentation which I believe will be helpful for further reference:
Hope this helps!

Release: R2022b
