Training is not efficient for RL agent
Hello,
I am trying to use the Reinforcement Learning Toolbox for an energy optimization problem. I started with a simple DQN agent; the critic network code is shown below:
nI = 4; % number of inputs (4)
nL = 400; % number of neurons
nO = 101; % number of possible outputs (101)
dnn = [
featureInputLayer(nI,'Normalization','none','Name','state')
fullyConnectedLayer(nL,'Name','fc1')
reluLayer('Name','relu1')
fullyConnectedLayer(nL/2,'Name','fc2')
reluLayer('Name','relu2')
fullyConnectedLayer(nL/4,'Name','fc3')
reluLayer('Name','relu3')
fullyConnectedLayer(nO,'Name','fc4')];
figure(1)
plot(layerGraph(dnn))
I used the following options for the critic, agent, and training, respectively:
criticOpts = rlRepresentationOptions('LearnRate',0.1,'GradientThreshold',1,...
'UseDevice','gpu');
agentOpts = rlDQNAgentOptions(...
'UseDoubleDQN',false, ...
'ExperienceBufferLength',1e5, ...
'DiscountFactor',0.99, ...
'MiniBatchSize',256,...
'SaveExperienceBufferWithAgent',true,...
'SampleTime',1);
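The post omits how the critic and agent were constructed from `dnn` and these options. A minimal sketch of that missing step might look like the following, where `obsInfo` and `actInfo` (the environment's observation and action specifications) are assumptions, and the exploration settings shown are the kind of `rlDQNAgentOptions` fields that often matter as much as the network size:

```matlab
% Assumed construction step (not in the original post):
% obsInfo/actInfo would come from the environment, e.g. getObservationInfo(env).
critic = rlQValueRepresentation(dnn,obsInfo,actInfo, ...
    'Observation',{'state'},criticOpts);

% Epsilon-greedy exploration settings are illustrative values only.
agentOpts.EpsilonGreedyExploration.Epsilon      = 1;     % initial exploration
agentOpts.EpsilonGreedyExploration.EpsilonMin   = 0.01;  % floor
agentOpts.EpsilonGreedyExploration.EpsilonDecay = 1e-4;  % per-step decay

agent = rlDQNAgent(critic,agentOpts);
```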
trainOpts = rlTrainingOptions(...
'MaxEpisodes',1000,...
'MaxStepsPerEpisode',3000,...
'StopTrainingCriteria',"AverageReward",...
'StopTrainingValue',0,...
'Verbose',false,...
'Plots',"training-progress",...
'SaveAgentDirectory','D:\ADVISOR_Exp\RL_Exp');
trainstats = train(agent,env,trainOpts);
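For reference, one way to test the trained agent's greedy policy (the step where the poor results were observed) is a short simulation; `MaxSteps` here simply mirrors the `MaxStepsPerEpisode` used in training:

```matlab
% Roll out the trained policy once and inspect the episode return.
simOpts = rlSimulationOptions('MaxSteps',3000);
experience = sim(env,agent,simOpts);
totalReward = sum(experience.Reward);
```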
However, I did not get good results when testing the agent: it does not improve over time, and after several hundred episodes the reward still oscillates, as shown in the figure.

I have tried different critic network architectures (with separate state and action paths) and different agents (a Q-learning agent and DDPG) with similar options, but no luck. I have also tried different rewards and tuned the reward function. What should I do to improve the training?
Answers (0)