Multi action agent programming in reinforcement learning
이전 댓글 표시
Please, how can I program or represent multi action agent in reinforcement learning (DQN), where I could construct the agent but I do not know how can represent it (action with three decision every stage of learning) in step function. The action has three decision that are charging battery, operating first generator and operating second generator. The first part of code below show how I construct the enviroment and in the second part I ask how can I add this actions to the my step function.
Thank you in advance.
first part
clc
ObservationInfo = rlNumericSpec([4 1]);
ObservationInfo.Name = 'EnergSolar States';
ObservationInfo.Description = 'T,SOC,SOF,Temp';
ActionInfo = rlFiniteSetSpec({[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0],[0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]});
ActionInfo.Name = 'EnergSolar Action';
env = rlFunctionEnv(ObservationInfo,ActionInfo,'myStepFunctionfuel','myResetFunctionfuel');
obsInfo = getObservationInfo(env);
numObservations = obsInfo.Dimension(1);
actInfo = getActionInfo(env);
statePath = [
imageInputLayer([4 1 1], 'Normalization', 'none', 'Name', 'state')
fullyConnectedLayer(200, 'Name', 'CriticStateFC1')
reluLayer('Name', 'CriticRelu1')
fullyConnectedLayer(200, 'Name', 'CriticStateFC2')];
actionPath = [
imageInputLayer([1 3 1], 'Normalization', 'none', 'Name', 'action')
fullyConnectedLayer(200, 'Name', 'CriticActionFC1')];
commonPath = [
additionLayer(2,'Name', 'add')
reluLayer('Name','CriticCommonRelu')
fullyConnectedLayer(1, 'Name', 'output')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = addLayers(criticNetwork, commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
criticOpts = rlRepresentationOptions('LearnRate',0.002,'GradientThreshold',1);
critic = rlRepresentation(criticNetwork,obsInfo,actInfo,...
'Observation',{'state'},'Action',{'action'},criticOpts);
agentOpts = rlDQNAgentOptions(...
'UseDoubleDQN',false, ...
'TargetUpdateMethod',"periodic", ...
'TargetUpdateFrequency',4, ...
'ExperienceBufferLength',100000, ...
'DiscountFactor',0.99, ...
'MiniBatchSize',1000);%500 to 1000
agent = rlDQNAgent(critic,agentOpts);
trainOpts = rlTrainingOptions(...
'MaxEpisodes', 1000, ...
'MaxStepsPerEpisode', 500, ...
'Verbose', false, ...
'Plots','training-progress',...
'StopTrainingCriteria','EpisodeReward',...
'StopTrainingValue',0,...
'ScoreAveragingWindowLength',5);
trainingStats = train(agent,env,trainOpts);
Second part
%Balance eq.
Pg=PL-Ppv-bpr*(Action1);
if(Pg>Z)
if(Pg-Z<=150)
PDG1=Pg(T)-Z;
PDG2=0;
F(T)=A*PDG1+B*Pr;
Pg=Z;
else
if(Pg-Z<350)
PDG2=Pg-Z;
F=A*PDG2+B*Pr2;
PDG1=0;
Pg=Z;
elseif(Pg-Z<500)
PDG2=350;
PDG1=(Pg-Z-PDG2)*Action2;
F=A*(PDG1+PDG2)+B*(Pr1*Action2+Pr2*Action3);
Pg=Pg-Z-PDG1-PDG2;
end
end
답변 (1개)
Emmanouil Tzorakoleftherakis
2020년 7월 13일
0 개 추천
This example shows how to create an environment with multiple discrete actions. Hope that helps
댓글 수: 3
Nabil Jalil Aklo
2020년 7월 13일
Emmanouil Tzorakoleftherakis
2020년 7월 14일
All the elements are in ActionInfo.Elements. Is that what you need?
Nabil Jalil Aklo
2020년 7월 14일
카테고리
도움말 센터 및 File Exchange에서 Reinforcement Learning에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!