DDPG agent low performance

2 views (last 30 days)
Armin Norouzi on 3 June 2021
Hello everyone,
I am trying to train a DDPG agent for my system; the goal is to generate actions (mf) that track a desired torque. The attached figure shows the episodic reward versus the number of episodes, along with plots of the system (output, error, reward, and action). I set the output range from 5 to 30, but the agent is still oscillating around these values. Although the training performance in terms of episodic reward seems to converge, I am still experiencing an oscillatory response.
I would appreciate it if someone could help me with this matter.
Here is my reward block in Simulink:
It is worth mentioning that I am using the standard parameters for the noise model:
agentOpts = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',64, ...
    'ExperienceBufferLength',1e6);
agentOpts.NoiseOptions.Variance = 0.05*(25/sqrt(Ts));
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
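(Since the response oscillates even after the episodic reward converges, one thing I have considered is how much exploration noise remains late in training: the Ornstein-Uhlenbeck noise variance is decayed roughly as v(k) = v0*(1 - VarianceDecayRate)^k per sample, so with a decay rate of 1e-5 the variance shrinks very slowly. A sketch of settings I could try instead; the values below are illustrative assumptions, not from my current setup:

```matlab
% Hypothetical alternative noise settings -- illustrative values only.
agentOpts.NoiseOptions.Variance = 0.1;            % smaller absolute noise
agentOpts.NoiseOptions.VarianceDecayRate = 1e-4;  % decays ~10x faster

% With v(k) = v0*(1 - rate)^k, the number of steps for the variance to
% halve is roughly log(0.5)/log(1 - rate):
halfLifeSteps = log(0.5)/log(1 - 1e-4);  % on the order of 7e3 steps
```
)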
Here are my actor and critic structures:
L = 500; % number of neurons
statePath = [
    featureInputLayer(numObservations, 'Normalization', 'none', 'Name', 'observation')
    fullyConnectedLayer(L, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(L, 'Name', 'fc2')
    additionLayer(2, 'Name', 'add')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(L, 'Name', 'fc3')
    reluLayer('Name', 'relu3')
    fullyConnectedLayer(1, 'Name', 'fc4')];
actionPath = [
    featureInputLayer(numActions, 'Normalization', 'none', 'Name', 'action')
    fullyConnectedLayer(L, 'Name', 'fc5')];
actorNetwork = [
    featureInputLayer(numObservations, 'Normalization', 'none', 'Name', 'observation')
    fullyConnectedLayer(L, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(L, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(L, 'Name', 'fc3')
    reluLayer('Name', 'relu3')
    fullyConnectedLayer(numActions, 'Name', 'fc4')
    tanhLayer('Name', 'tanh1')
    scalingLayer('Name', 'ActorScaling1', 'Scale', max(actInfo.UpperLimit))];
actorOptions = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',1,'L2RegularizationFactor',1e-4);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'ActorScaling1'},actorOptions);
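(The snippet above does not show how the critic is assembled from statePath and actionPath, or how the agent is created. A minimal sketch of the remaining steps, assuming the standard two-input Q-value critic from the MATLAB DDPG examples; the criticOptions learning rate here is an illustrative assumption, not from my code:

```matlab
% Assemble the two-input critic: state path feeds 'add/in1' implicitly,
% action path output 'fc5' feeds the second input of the addition layer.
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = connectLayers(criticNetwork,'fc5','add/in2');

criticOptions = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'action'},criticOptions);

% Combine actor, critic, and options into the DDPG agent.
agent = rlDDPGAgent(actor,critic,agentOpts);
```
)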

Answers (0)

Categories

Learn more about Reinforcement Learning in Help Center and File Exchange


Release

R2020b
