Multi-action agent programming in reinforcement learning

16 views (last 30 days)
Nabil Jalil Aklo on 11 Jul 2020
Commented: Nabil Jalil Aklo on 14 Jul 2020
How can I program or represent a multi-action agent in reinforcement learning (DQN)? I can construct the agent, but I do not know how to represent the action (an action with three decisions at every step of learning) in the step function. The action consists of three decisions: charging the battery, operating the first generator, and operating the second generator. The first part of the code below shows how I construct the environment; in the second part I ask how I can apply these actions in my step function.
Thank you in advance.
First part:
clc
% Observations: 4 states (T, SOC, SOF, Temp)
ObservationInfo = rlNumericSpec([4 1]);
ObservationInfo.Name = 'EnergSolar States';
ObservationInfo.Description = 'T,SOC,SOF,Temp';
% Actions: every element is a 1x3 vector [battery, generator 1, generator 2]
ActionInfo = rlFiniteSetSpec({[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0],[0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]});
ActionInfo.Name = 'EnergSolar Action';
% Custom environment built from step and reset functions
env = rlFunctionEnv(ObservationInfo,ActionInfo,'myStepFunctionfuel','myResetFunctionfuel');
obsInfo = getObservationInfo(env);
numObservations = obsInfo.Dimension(1);
actInfo = getActionInfo(env);
% Critic network: state path and action path merged by an addition layer
statePath = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(200,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(200,'Name','CriticStateFC2')];
actionPath = [
    imageInputLayer([1 3 1],'Normalization','none','Name','action')
    fullyConnectedLayer(200,'Name','CriticActionFC1')];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','output')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
criticOpts = rlRepresentationOptions('LearnRate',0.002,'GradientThreshold',1);
critic = rlRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'state'},'Action',{'action'},criticOpts);
% DQN agent options
agentOpts = rlDQNAgentOptions(...
    'UseDoubleDQN',false, ...
    'TargetUpdateMethod',"periodic", ...
    'TargetUpdateFrequency',4, ...
    'ExperienceBufferLength',100000, ...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',1000); % 500 to 1000
agent = rlDQNAgent(critic,agentOpts);
% Training options
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',1000, ...
    'MaxStepsPerEpisode',500, ...
    'Verbose',false, ...
    'Plots','training-progress',...
    'StopTrainingCriteria','EpisodeReward',...
    'StopTrainingValue',0,...
    'ScoreAveragingWindowLength',5);
trainingStats = train(agent,env,trainOpts);
Second part:
% Balance eq.
Pg = PL - Ppv - bpr*(Action1);
if (Pg > Z)
    if (Pg - Z <= 150)
        PDG1 = Pg - Z;
        PDG2 = 0;
        F(T) = A*PDG1 + B*Pr;
        Pg = Z;
    else
        if (Pg - Z < 350)
            PDG2 = Pg - Z;
            F = A*PDG2 + B*Pr2;
            PDG1 = 0;
            Pg = Z;
        elseif (Pg - Z < 500)
            PDG2 = 350;
            PDG1 = (Pg - Z - PDG2)*Action2;
            F = A*(PDG1 + PDG2) + B*(Pr1*Action2 + Pr2*Action3);
            Pg = Pg - Z - PDG1 - PDG2;
        end
    end
end % closes the outer if (missing in the original snippet)
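For context, a custom rlFunctionEnv step function receives the chosen action (here, one of the 1x3 vectors from ActionInfo) as its first input, with the documented signature [NextObs,Reward,IsDone,LoggedSignals] = stepfcn(Action,LoggedSignals). Below is a minimal sketch of how the balance equation could sit inside such a step function; the LoggedSignals field names, reward, and terminal condition are placeholders for illustration, not part of the original question.
% Sketch only: assumes the state is carried in LoggedSignals as in the
% rlFunctionEnv documentation; field names and reward are placeholders.
function [NextObs,Reward,IsDone,LoggedSignals] = myStepFunctionfuel(Action,LoggedSignals)
    if iscell(Action)          % depending on the release, the action may arrive in a cell
        Action = Action{1};
    end
    Action1 = Action(1);       % battery charging decision (-1, 0 or 1)
    Action2 = Action(2);       % first generator decision (0 or 1)
    Action3 = Action(3);       % second generator decision (0 or 1)

    % ... balance equation from the "Second part" goes here, using Action1..Action3 ...

    % Update the stored state and build the outputs (placeholder values).
    LoggedSignals.SOC = LoggedSignals.SOC + 200*Action1;
    NextObs = [LoggedSignals.T; LoggedSignals.SOC; LoggedSignals.SOF; LoggedSignals.Temp];
    Reward  = 0;               % replace with the actual reward, e.g. -F
    IsDone  = false;           % replace with the actual terminal condition
end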

Answers (1)

Emmanouil Tzorakoleftherakis on 13 Jul 2020
This example shows how to create an environment with multiple discrete actions. Hope that helps.
3 Comments
Emmanouil Tzorakoleftherakis on 14 Jul 2020
All the elements are in ActionInfo.Elements. Is that what you need?
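As a quick sketch of what that property holds for the spec defined in the question (Elements should return the cell array passed to rlFiniteSetSpec; if your release returns a numeric array instead, index it with parentheses):
ActionInfo = rlFiniteSetSpec({[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0], ...
    [0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]});
acts = ActionInfo.Elements;   % all twelve 1x3 action vectors
a = acts{3};                  % e.g. the third element, [-1 0 1]
a(1)                          % its first entry, -1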
Nabil Jalil Aklo on 14 Jul 2020
Let me explain what I need with this example:
I have an action vector consisting of three elements at a time,
ActionInfo = rlFiniteSetSpec({[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0],[0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]});
Suppose at some time step the action vector becomes Action = [-1 0 1]. These elements represent three decisions that control battery charging, the first generator, and the second generator. At the same time, I want to apply the first element of this vector in the equation below:
SOC = SOC + 200*(first element of the action vector)
The question is how I can extract the first element from the vector.
Thank you in advance.
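Assuming the step function receives the selected 1x3 vector as Action (as in the rlFunctionEnv step-function signature), ordinary MATLAB indexing is enough; a minimal sketch:
% Inside myStepFunctionfuel(Action,LoggedSignals):
if iscell(Action)              % some releases pass the action wrapped in a cell
    Action = Action{1};
end
SOC = SOC + 200*Action(1);     % Action(1) is the battery decision; Action(2), Action(3) are the generators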
