I see a zero mean reward for the first agent in multi-agent RL Toolbox
Views: 6 (last 30 days)
Hello, I have extended the MATLAB PPO coverage path planning example to 5 agents. The first agent does receive a reward, but the toolbox always shows a zero mean reward for it, as in the following image, which is not correct. Do you have any idea what is happening there?

Comments: 0
Answers (1)
TARUN
22 April 2025
I understand that you are experiencing an issue with the reward for the first agent in your multi-agent PPO setup.
Here are a few things you can check to resolve the issue:
- Reward function: Inspect your environment's step function and ensure that the reward vector (or structure) includes a non-zero value for the first agent ("rlPPOAgent").
- Agent configuration: Make sure "rlPPOAgent" is correctly associated with its environment and policy.
- Environment setup: Double-check the environment setup to confirm that all agents are interacting with it as intended.
- Training parameters: Review the training parameters specific to the first agent, such as the learning rate and discount factor.
These checks may help you fix the problem. If not, please provide the code you are working with so that I can take a deeper look.
Feel free to refer to this documentation on "Agents":
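As a starting point for the first check, here is a minimal sketch of what a per-agent reward inspection in a multi-agent step function could look like. This is not the code from the coverage path planning example; the helper names (`computeObservation`, `coverageReward`, `checkTermination`) and the fixed agent count are illustrative assumptions for a custom environment whose step function returns one cell entry per agent.

```matlab
function [nextObs, reward, isDone, info] = step(this, action)
% Sketch of a 5-agent step function. In a multi-agent environment the
% observations, actions, and rewards are typically cell arrays with one
% entry per agent.
    numAgents = 5;
    nextObs = cell(1, numAgents);
    reward  = cell(1, numAgents);
    for k = 1:numAgents
        nextObs{k} = computeObservation(this, k, action{k}); % hypothetical helper
        reward{k}  = coverageReward(this, k);                % hypothetical helper
    end
    % Diagnostic: if this prints on every step, the zero mean reward for
    % agent 1 originates in the environment, not in the training plot.
    if reward{1} == 0
        disp('Agent 1 received zero reward this step');
    end
    isDone = checkTermination(this); % hypothetical helper
    info = [];
end
```

If the diagnostic shows non-zero rewards here while the toolbox still plots zero, the next place to look is the ordering of the agents passed to the training function, since the plotted statistics follow that order.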
Comments: 0