Training Quadrotor using PPO agent

Mahmoud Chick Zaouali on 27 Apr 2022
I am trying to control a quadrotor model using reinforcement learning. The agent should make the quadrotor navigate to a desired position or follow a path. Right now I am trying to train a PPO agent to make the quadrotor hover. I built a dynamic model of the quadrotor with the 6DOF block, and then built the observation and reward functions for the agent.
I coded the actor and critic networks and set my parameters. The problem is that my reward is always equal to 0 and the agent is not learning, and I suspect that I didn't build the environment correctly. I have been working on this model for a long time and have not been able to make the agent learn anything. I would be really glad if someone could help me with this issue.
I have attached my quadrotor reinforcement learning model, along with the actor and critic code.

Answers (1)

Emmanouil Tzorakoleftherakis on 28 Apr 2022
Edited: Emmanouil Tzorakoleftherakis on 28 Apr 2022
Hello,
There are several things that are not set up properly, including:
1) The isdone flag seems to be 1 all the time, which terminates every episode early, after a single step.
2) The reward signal is often not a real scalar. One reason is that you are trying to take the square root of a negative number.
3) Your Simulink model has a lot of algebraic loops. I would get rid of those to make sure they don't interfere with training.
Hope that helps.
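
To make points 1 and 2 concrete, here is a minimal sketch written in Python rather than Simulink, purely for illustration. It assumes a hypothetical hover target and safety bounds; the names `hover_reward`, `is_done`, `TARGET`, `MAX_DISTANCE` and `MAX_TILT` are made up for this example and do not come from the attached model. The idea is that the reward takes the square root only of a sum of squares (which cannot be negative), and the termination flag is raised only when the state leaves a safe region.

```python
import math

# Hypothetical hover target and safety bounds; not taken from the attached model.
TARGET = (0.0, 0.0, 5.0)      # desired x, y, z in metres
MAX_DISTANCE = 10.0           # stop the episode if farther than this from the target
MAX_TILT = math.radians(60)   # stop the episode if roll or pitch exceeds this


def hover_reward(position, target=TARGET):
    """Return a real scalar reward that is 0 at the target and negative elsewhere."""
    # A sum of squares is never negative, so the square root is always real.
    squared_error = sum((p - t) ** 2 for p, t in zip(position, target))
    return -math.sqrt(squared_error)


def is_done(position, roll, pitch, target=TARGET):
    """Return True only when the state has left the safe region."""
    distance = math.sqrt(sum((p - t) ** 2 for p, t in zip(position, target)))
    too_far = distance > MAX_DISTANCE
    flipped = abs(roll) > MAX_TILT or abs(pitch) > MAX_TILT
    # Also stop if the simulation has produced NaN or Inf values.
    not_finite = not all(math.isfinite(v) for v in (*position, roll, pitch))
    return too_far or flipped or not_finite
```

With these assumed bounds, a quadrotor sitting at the target keeps the episode running, and the episode ends only after it drifts too far, tilts too much, or the state becomes non-finite. The same checks could be reproduced inside the model, for example in a MATLAB Function block feeding the reward and isdone signals, but that wiring is an assumption about how the model is structured.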
Comments: 1
Unmanned Aerial and Space Systems
Hi, I have a similar problem and have shared my model here:
https://www.mathworks.com/matlabcentral/answers/1708930-reinforcement-learning-based-quadrotor-control-using-soft-actor-critic-the-reward-is-not-converging?s_tid=prof_contriblnk


Release: R2021a
