DDPG agent low performance

2 views (last 30 days)
Armin Norouzi on 3 June 2021
Hello everyone,
I am trying to train a DDPG agent for my system; the goal is to generate actions (mf) that track a desired torque. The attached figure shows the episodic reward versus the number of episodes, along with plots of the system (output, error, reward, and action). I set the output range from 5 to 30, but the agent is still oscillating around these values. Although the training performance in terms of episodic reward seems to converge, I am still experiencing an oscillatory response.
I would appreciate it if someone could help me with this matter.
Here is my reward block in Simulink:
It is worth mentioning that I am using the standard parameters for the noise model:
agentOpts = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',64, ...
    'ExperienceBufferLength',1e6);
agentOpts.NoiseOptions.Variance = 0.05*(25/sqrt(Ts));
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
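(Since the response oscillates even after the episodic reward converges, one thing I have considered is how much exploration noise remains late in training: the Ornstein-Uhlenbeck noise variance is decayed roughly as v(k) = v0*(1 - VarianceDecayRate)^k per sample, so with a decay rate of 1e-5 the variance shrinks very slowly. A sketch of settings I could try instead; the values below are illustrative assumptions, not from my current setup:

```matlab
% Hypothetical alternative noise settings -- illustrative values only.
agentOpts.NoiseOptions.Variance = 0.1;            % smaller absolute noise
agentOpts.NoiseOptions.VarianceDecayRate = 1e-4;  % decays ~10x faster

% With v(k) = v0*(1 - rate)^k, the number of steps for the variance to
% halve is roughly log(0.5)/log(1 - rate):
halfLifeSteps = log(0.5)/log(1 - 1e-4);  % on the order of 7e3 steps
```
)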
Here are my actor and critic structures:
L = 500; % number of neurons
statePath = [
    featureInputLayer(numObservations, 'Normalization', 'none', 'Name', 'observation')
    fullyConnectedLayer(L, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(L, 'Name', 'fc2')
    additionLayer(2, 'Name', 'add')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(L, 'Name', 'fc3')
    reluLayer('Name', 'relu3')
    fullyConnectedLayer(1, 'Name', 'fc4')];
actionPath = [
    featureInputLayer(numActions, 'Normalization', 'none', 'Name', 'action')
    fullyConnectedLayer(L, 'Name', 'fc5')];
actorNetwork = [
    featureInputLayer(numObservations, 'Normalization', 'none', 'Name', 'observation')
    fullyConnectedLayer(L, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(L, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(L, 'Name', 'fc3')
    reluLayer('Name', 'relu3')
    fullyConnectedLayer(numActions, 'Name', 'fc4')
    tanhLayer('Name', 'tanh1')
    scalingLayer('Name', 'ActorScaling1', 'Scale', max(actInfo.UpperLimit))];
actorOptions = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',1,'L2RegularizationFactor',1e-4);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'ActorScaling1'},actorOptions);
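(The snippet above does not show how the critic is assembled from statePath and actionPath, or how the agent is created. A minimal sketch of the remaining steps, assuming the standard two-input Q-value critic from the MATLAB DDPG examples; the criticOptions learning rate here is an illustrative assumption, not from my code:

```matlab
% Assemble the two-input critic: state path feeds 'add/in1' implicitly,
% action path output 'fc5' feeds the second input of the addition layer.
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = connectLayers(criticNetwork,'fc5','add/in2');

criticOptions = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'action'},criticOptions);

% Combine actor, critic, and options into the DDPG agent.
agent = rlDDPGAgent(actor,critic,agentOpts);
```
)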

Answers (0)

Categories

Learn more about Reinforcement Learning in Help Center and File Exchange


Release

R2020b
