Multi action agent programming in reinforcement learning

Question


How can I program or represent a multi-action agent in reinforcement learning (DQN)? I was able to construct the agent, but I do not know how to represent its action in the step function. The action consists of three decisions at every stage of learning: charging the battery, operating the first generator, and operating the second generator. The first part of the code below shows how I construct the environment; the second part is the step-function code to which I want to add these actions.

Thank you in advance.

First part

clc
ObservationInfo = rlNumericSpec([4 1]);
ObservationInfo.Name = 'EnergSolar States';
ObservationInfo.Description = 'T,SOC,SOF,Temp';

ActionInfo = rlFiniteSetSpec({[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0],[0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]});
ActionInfo.Name = 'EnergSolar Action';

env = rlFunctionEnv(ObservationInfo,ActionInfo,'myStepFunctionfuel','myResetFunctionfuel');
obsInfo = getObservationInfo(env);
numObservations = obsInfo.Dimension(1);
actInfo = getActionInfo(env);

statePath = [
    imageInputLayer([4 1 1], 'Normalization', 'none', 'Name', 'state')
    fullyConnectedLayer(200, 'Name', 'CriticStateFC1')
    reluLayer('Name', 'CriticRelu1')
    fullyConnectedLayer(200, 'Name', 'CriticStateFC2')];

actionPath = [
    imageInputLayer([1 3 1], 'Normalization', 'none', 'Name', 'action')
    fullyConnectedLayer(200, 'Name', 'CriticActionFC1')];

commonPath = [
    additionLayer(2,'Name', 'add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1, 'Name', 'output')];

criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = addLayers(criticNetwork, commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');

criticOpts = rlRepresentationOptions('LearnRate',0.002,'GradientThreshold',1);
critic = rlRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'state'},'Action',{'action'},criticOpts);

agentOpts = rlDQNAgentOptions(...
    'UseDoubleDQN',false, ...
    'TargetUpdateMethod',"periodic", ...
    'TargetUpdateFrequency',4, ...
    'ExperienceBufferLength',100000, ...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',1000);%500 to 1000
agent = rlDQNAgent(critic,agentOpts);

trainOpts = rlTrainingOptions(...
    'MaxEpisodes', 1000, ...
    'MaxStepsPerEpisode', 500, ...
    'Verbose', false, ...
    'Plots','training-progress',...
    'StopTrainingCriteria','EpisodeReward',...
    'StopTrainingValue',0,...
    'ScoreAveragingWindowLength',5);

trainingStats = train(agent,env,trainOpts);
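The twelve vectors passed to rlFiniteSetSpec above are every combination of one battery decision (-1, 0, 1) with two on/off generator decisions. As a minimal sketch, the same list could be generated instead of typed by hand. The variable names batteryModes, gen1Modes, gen2Modes, actionTable and actionSet are illustrative and not from the original post, and the rlFiniteSetSpec call is left as a comment so the snippet runs without the toolbox.

```matlab
% Sketch: enumerate every [battery, gen1, gen2] combination.
batteryModes = [-1 0 1];   % battery decision values used in the post
gen1Modes    = [0 1];      % first generator off/on
gen2Modes    = [0 1];      % second generator off/on

% ndgrid varies its first input fastest, so gen1 changes first,
% then gen2, then the battery decision (same order as the post).
[g1, g2, b] = ndgrid(gen1Modes, gen2Modes, batteryModes);
actionTable = [b(:), g1(:), g2(:)];      % 12-by-3, one action per row

% Turn each row into a 1-by-3 vector inside a 1-by-12 cell array.
actionSet = num2cell(actionTable, 2)';

% With Reinforcement Learning Toolbox available, this could be used as:
% ActionInfo = rlFiniteSetSpec(actionSet);
```

Keeping the combinations in a matrix such as actionTable also makes it easy to check that no combination is missing or duplicated.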

Second part

%Balance eq.
Pg=PL-Ppv-bpr*(Action1);
if(Pg>Z)
    if(Pg-Z<=150)
        PDG1=Pg(T)-Z;
        PDG2=0;
        F(T)=A*PDG1+B*Pr;
        Pg=Z;
    else
        if(Pg-Z<350)
            PDG2=Pg-Z;
            F=A*PDG2+B*Pr2;
            PDG1=0;
            Pg=Z;
        elseif(Pg-Z<500)
            PDG2=350;
            PDG1=(Pg-Z-PDG2)*Action2;
            F=A*(PDG1+PDG2)+B*(Pr1*Action2+Pr2*Action3);
            Pg=Pg-Z-PDG1-PDG2;
        end
    end % closes if(Pg-Z<=150)
end % closes if(Pg>Z)
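The snippet above uses Action1, Action2 and Action3 without showing where they come from. The following is a minimal sketch of a step function that derives them from the action vector, using the four-output, two-input signature that rlFunctionEnv expects for custom step functions. It assumes that LoggedSignals.State holds the observation [T; SOC; SOF; Temp] and that T is a time index; the dynamics, reward and termination condition are placeholders, not the author's model.

```matlab
function [NextObs, Reward, IsDone, LoggedSignals] = myStepFunctionfuel(Action, LoggedSignals)
% Sketch of a custom step function for rlFunctionEnv.
% Assumption: LoggedSignals.State = [T; SOC; SOF; Temp].

% Defensive unwrapping in case the action arrives inside a cell.
if iscell(Action)
    Action = Action{1};
end

% Split the 1-by-3 action vector into its three decisions.
Action1 = Action(1);   % battery decision: -1, 0 or 1
Action2 = Action(2);   % first generator: 0 or 1
Action3 = Action(3);   % second generator: 0 or 1

State = LoggedSignals.State;

% Placeholder dynamics (hypothetical): only the time index advances.
% The balance equations using Action1, Action2, Action3 would go here.
State(1) = State(1) + 1;

% Store the state and the decisions so they can be inspected later.
LoggedSignals.State = State;
LoggedSignals.LastAction = [Action1 Action2 Action3];

NextObs = State;
Reward  = 0;       % placeholder, for example the negative fuel cost
IsDone  = false;   % placeholder termination condition
end
```

For a quick check outside training, the function could be called directly with a struct such as `struct('State',[1;50;0;25])` and an action such as `[-1 0 1]`.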


Answer 1

Emmanouil Tzorakoleftherakis on 13 Jul 2020

This example shows how to create an environment with multiple discrete actions. Hope that helps.


Emmanouil Tzorakoleftherakis on 14 Jul 2020

All the elements are in ActionInfo.Elements. Is that what you need?
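As a minimal sketch of what working with that list could look like, the snippet below uses a plain cell array named elements as a stand-in for ActionInfo.Elements, so it runs without the toolbox. It assumes the property holds the same twelve vectors in a cell array, which should be checked against the installed release; the names elements, numActions, thirdAction and allActions are illustrative.

```matlab
% Stand-in for ActionInfo.Elements (assumption: a cell array of the
% same twelve 1-by-3 vectors that were passed to rlFiniteSetSpec).
elements = {[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0], ...
            [0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]};

numActions  = numel(elements);        % number of discrete actions
thirdAction = elements{3};            % one action vector, here [-1 0 1]
allActions  = vertcat(elements{:});   % 12-by-3 matrix, one action per row
```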

Nabil Jalil Aklo on 14 Jul 2020

Let me explain what I need with an example.

Suppose the action vector consists of three elements at each time step:

ActionInfo = rlFiniteSetSpec({[-1 0 0],[-1 1 0],[-1 0 1],[-1 1 1],[0 0 0],[0 1 0],[0 0 1],[0 1 1],[1 0 0],[1 1 0],[1 0 1],[1 1 1]});

At some time step, let the action vector be Action=[-1 0 1]. Its elements represent three decisions: battery charging control, first generator control, and second generator control. At the same time, I want to apply the first element of this vector in the equation below:

SOC=SOC+200*(first element of the action vector)

My question is: how can I extract the first element from the vector?

Thank you in advance.
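For the equation above, ordinary MATLAB indexing is enough: Action(1) is the first element of the vector. A minimal sketch, assuming Action is a numeric 1-by-3 vector; the starting SOC of 1000 and the name batteryDecision are made up for illustration.

```matlab
Action = [-1 0 1];   % example action vector from the post
SOC = 1000;          % hypothetical starting state of charge

batteryDecision = Action(1);         % first element of the vector
SOC = SOC + 200*batteryDecision;     % SOC update from the post
```

The second and third decisions can be read the same way with Action(2) and Action(3).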
