Custom DDPG Algorithm in MATLAB R2023b: Performing Gradient Ascent for Actor Network

Question

roham farhadi on 26 Dec 2023

https://kr.mathworks.com/matlabcentral/answers/2064177-custom-ddpg-algorithm-in-matlab-r2023b-performing-gradient-ascent-for-actor-network

Commented: Syed Adil Ahmed on 13 Aug 2024

Hello MATLAB community,

I am working on implementing a custom Deep Deterministic Policy Gradient (DDPG) algorithm in MATLAB R2023b. When training the actor network in DDPG, the Q value produced by the critic network serves as the actor's objective function, and the standard approach is to update the actor by gradient ascent on that Q value.

My question concerns the gradient function from the Reinforcement Learning Toolbox. After using it to calculate the gradients, how can I perform gradient ascent, given that the update function from the same toolbox seems to default to gradient descent? I would appreciate any insights or examples on implementing gradient ascent in this context.

Thank you for your assistance!

Answer 1

Venu on 8 Jan 2024

https://kr.mathworks.com/matlabcentral/answers/2064177-custom-ddpg-algorithm-in-matlab-r2023b-performing-gradient-ascent-for-actor-network#answer_1385111

Edited: Venu on 8 Jan 2024

Hi @roham farhadi,

Gradient ascent is the same as gradient descent except that you don't multiply your step (learning_rate * gradients) by a negative sign. So your step has the same sign as your gradient.
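
To make the sign convention concrete, the two updates can be written side by side. The notation is my own, not taken from the toolbox documentation: theta are the actor parameters, alpha is the learning rate, mu_theta is the actor, and J is the usual DDPG actor objective (the expected Q value of the actor's own actions).

```latex
J(\theta) = \mathbb{E}_{s}\!\left[ Q\big(s, \mu_\theta(s)\big) \right]

\text{ascent on } J:\qquad
\theta \leftarrow \theta + \alpha \, \nabla_\theta J(\theta)

\text{descent on } L = -J:\qquad
\theta \leftarrow \theta - \alpha \, \nabla_\theta L(\theta)
       = \theta - \alpha \big( -\nabla_\theta J(\theta) \big)
       = \theta + \alpha \, \nabla_\theta J(\theta)
```

Both rules produce the same step, so maximizing Q and minimizing -Q are interchangeable.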

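If the update function defaults to gradient descent, flip the sign of the gradients before passing them to it. Descending along -gradients is the same as ascending along gradients, because theta - learningRate * (-gradients) = theta + learningRate * gradients. This is also equivalent to defining the actor loss as the negative of the Q value and minimizing that loss.

If you apply the step yourself instead of calling update, add the gradients directly, without negating them (schematic, not literal toolbox syntax):

actorNetwork.Parameters = actorNetwork.Parameters + learningRate * gradients; % gradient ascent step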
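
As a sanity check on the sign handling, here is a minimal sketch in plain MATLAB with no toolbox calls. The quadratic Q below is a hypothetical stand-in for the critic output, not a real DDPG critic, and all variable names are made up for illustration. It compares an explicit ascent step with a descent-style step that is fed the negated gradient, then shows one way to negate gradients stored in a cell array, assuming that is how your gradients are stored.

```matlab
% Toy objective standing in for the critic's Q value (hypothetical):
% Q(theta) = -(theta - 3)^2, which is maximized at theta = 3.
gradQ = @(theta) -2 * (theta - 3);   % analytic dQ/dtheta

learningRate = 0.1;
thetaAscent  = 0;   % updated by explicit gradient ascent
thetaDescent = 0;   % updated by a descent step fed the negated gradient

for k = 1:200
    % Gradient ascent: step in the direction of the gradient.
    thetaAscent = thetaAscent + learningRate * gradQ(thetaAscent);

    % Descent-style rule (theta - lr*g) with g = -dQ/dtheta,
    % i.e. descending the loss L = -Q.
    negGrad      = -gradQ(thetaDescent);
    thetaDescent = thetaDescent - learningRate * negGrad;
end

fprintf('ascent: %.4f, negated descent: %.4f\n', thetaAscent, thetaDescent);

% If the gradients are a cell array with one entry per learnable
% parameter (an assumption about your setup), negate every entry:
grads    = {[1 -2; 3 4], [0.5; -0.5]};   % made-up values
negGrads = cellfun(@(g) -g, grads, 'UniformOutput', false);
```

Both variables should end up at the maximizer theta = 3. In a real training loop, negGrads is what you would hand to a descent-based update in place of the raw gradients.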

You can refer to the example in this documentation page for the 'gradient' function:

https://www.mathworks.com/help/reinforcement-learning/ug/train-reinforcement-learning-policy-using-custom-training.html

Hope this helps!

Syed Adil Ahmed on 13 Aug 2024

Hey @Venu,

Could you provide the documentation link again? It shows up as:

"The page you were looking for does not exist. Use the search box or browse topics below to find the page you were looking for."

Thank you
