Saved agent always gives me constant output

Question

sungho park, 17 January 2022

https://kr.mathworks.com/matlabcentral/answers/1630155-saved-agent-gives-me-constatn-output-always

Answered: Yash Sharma, 22 January 2024

Hi, I'm using reinforcement learning in MATLAB and have run into an issue.

During training I can see that the input value is changing. However, when I run a simulation with the saved agent after training, the input value does not change the way it did during training.

%% Create observation specification
obsInfo = rlNumericSpec([3 1]);
obsInfo.Name = 'observations';
numObs = obsInfo.Dimension(1);

%% Create action specification
actInfo = rlNumericSpec([1 1],'LowerLimit',-15,'UpperLimit',15);
%actInfo = rlNumericSpec([1 1]);
actInfo.Name = 'current';
numActions = actInfo.Dimension(1);

%% Create the environment
blk = [mdl '/RL Agent'];
env = rlSimulinkEnv(mdl,blk,obsInfo,actInfo);
env.ResetFcn = @(in)setVariable(in,'current0',5,'Workspace',mdl);
env.UseFastRestart = 'off';
Ts = param.dt;
Tf = param.end_time;
rng(0)

%% Create DDPG Agent
statePath = [
    featureInputLayer(numObs,'Normalization','none','Name','observations')
    fullyConnectedLayer(200,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(200,'Name','CriticStateFC2')];
actionPath = [
    featureInputLayer(1,'Normalization','none','Name','action')
    fullyConnectedLayer(200,'Name','CriticActionFC1','BiasLearnRateFactor',0)];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','CriticOutput')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
figure
plot(criticNetwork)
criticOpts = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1);
%% Create the critic representation using the specified deep neural
% network and options
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,'Observation', ...
    {'observations'},'Action',{'action'},criticOpts);

%% Create the actor
actorNetwork = [
    featureInputLayer(numObs,'Normalization','none','Name','observations')
    fullyConnectedLayer(200,'Name','ActorFC1')
    reluLayer('Name','ActorRelu1')
    fullyConnectedLayer(200,'Name','ActorFC2')
    reluLayer('Name','ActorRelu2')
    fullyConnectedLayer(1,'Name','ActorFC3')
    tanhLayer('Name','ActorTanh')
    scalingLayer('Name','ActorScaling','Scale',max(actInfo.UpperLimit))];
actorOpts = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo, ...
    'Observation',{'observations'},'Action',{'ActorScaling'},actorOpts);

%% Create the DDPG agent options
agentOpts = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1,...
    'ExperienceBufferLength',1e6,...
    'DiscountFactor',0.99,...
    'MiniBatchSize',64);
agentOpts.NoiseOptions.Variance = 0.1;
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOpts);

%% Train Agent
maxepisodes = 2;
maxsteps = ceil(Tf/Ts);
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',maxepisodes,...
    'MaxStepsPerEpisode',maxsteps,...
    'ScoreAveragingWindowLength',50,...
    'Verbose',false,...
    'Plots','training-progress',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',500,...
    'SaveAgentCriteria','EpisodeReward',...
    'SaveAgentValue',0);
doTraining = true;
%if doTraining
% Train the agent.
% trainingStats = train(agent,env,trainOpts);
%else
% Load the pretrained agent for the example.
% load('agent_1000episodes.mat','agent')
%end
trainingStats = train(agent,env,trainOpts);

Comments: 1

Florian Rosner, 19 January 2022

The behaviour during training might not be the same as in the simulation afterwards, due to the way the network is updated. However, the number of episodes seems pretty low to me. Did you try training for more episodes?
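
One concrete difference between training and simulation for a DDPG agent is exploration noise: during training the agent adds noise (configured through agentOpts.NoiseOptions in the question) to the actor output, while a plain simulation applies the actor output on its own. The sketch below is a toolbox-free illustration of that effect, not the toolbox's implementation. It assumes a hypothetical actor whose output is constant, a hypothetical sample time and episode length, and an Ornstein-Uhlenbeck noise process with an assumed mean attraction constant of 0.15; variance decay is ignored.

```matlab
% Illustration only: a policy whose deterministic output is constant,
% with and without Ornstein-Uhlenbeck (OU) exploration noise.
rng(0)                      % reproducible noise
Ts       = 0.1;             % hypothetical sample time (param.dt is not shown in the question)
numSteps = 200;             % hypothetical episode length
sigma    = 0.1;             % same value as agentOpts.NoiseOptions.Variance in the question
theta    = 0.15;            % mean attraction constant (assumed value)
mu       = 0;               % noise mean
constantAction = 3;         % hypothetical actor output that ignores the observation

noise = zeros(1,numSteps);
for k = 2:numSteps
    % OU update: drift back toward mu plus a scaled Gaussian increment
    noise(k) = noise(k-1) + theta*(mu - noise(k-1))*Ts + sigma*sqrt(Ts)*randn;
end

% Training-like signal: actor output plus noise, clipped to the action limits
trainingActions = min(max(constantAction + noise,-15),15);
% Simulation-like signal: the deterministic actor output only
simulationActions = constantAction*ones(1,numSteps);

fprintf('std of training actions:   %.4f\n',std(trainingActions));
fprintf('std of simulation actions: %.4f\n',std(simulationActions));
```

If the actor has barely learned anything yet (for example after only two episodes), the variation seen during training could come mostly from this noise, which would be consistent with a constant output in the simulation afterwards.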

Answer 1

Yash Sharma, 22 January 2024

https://kr.mathworks.com/matlabcentral/answers/1630155-saved-agent-gives-me-constatn-output-always#answer_1394201

Hi Sungho Park,

I understand that you have a pretrained DDPG agent and want to load it in MATLAB. Loading the MAT file with the "load" function returns the saved agent object; to make sure the trained network weights are the ones actually used, you can also transfer them explicitly into a newly created agent.

To load the pretrained networks into a DDPG agent for training in Simulink, you can follow these steps:

Save the network weights separately: Before saving the agent to a MAT file, extract the network weights from the actor and critic (obtained with "getActor" and "getCritic") using the "getLearnableParameters" function, and save these weights in separate variables.
Load the network weights and agent configuration: When loading the pretrained agent, use the "load" function to load the network weights and agent configuration from the MAT file. Then assign the loaded weights to the actor and critic of a new DDPG agent using "setLearnableParameters".

% actor, critic, agentOpts, env and trainOpts are the variables created in the question's script
Pretrained_agent_flag = true;
if Pretrained_agent_flag
    % Load the pretrained agent (assumes MyAgent.mat stores it in a variable named 'agent')
    pretrainedAgentData = load('MyAgent.mat');
    pretrainedAgent = pretrainedAgentData.agent;

    % Extract the network weights from the loaded agent's actor and critic
    actorWeights = getLearnableParameters(getActor(pretrainedAgent));
    criticWeights = getLearnableParameters(getCritic(pretrainedAgent));

    % Copy the loaded weights into the actor and critic defined earlier
    pretrainedActor = setLearnableParameters(actor, actorWeights);
    pretrainedCritic = setLearnableParameters(critic, criticWeights);

    % Create a new DDPG agent with the loaded network weights and configuration
    agent = rlDDPGAgent(pretrainedActor, pretrainedCritic, agentOpts);
else
    % Create a new DDPG agent
    agent = rlDDPGAgent(actor, critic, agentOpts);
end
trainingResults = train(agent, env, trainOpts);
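
Once an agent has been loaded, one way to check whether its output really is constant is to simulate it and look at the spread of the logged action. The following is a minimal sketch, not a verified recipe. It assumes that env and maxsteps from the question are in the workspace and that training saved the agent under what I believe are the toolbox defaults (a savedAgents folder, files named Agent<episode>.mat, a variable called saved_agent); the episode number 2 is only an example.

```matlab
% Sketch: simulate a saved agent and inspect the logged action signal.
% Assumptions: 'env' and 'maxsteps' exist as in the question; the file name
% and the variable name 'saved_agent' follow the assumed toolbox defaults.
data = load(fullfile('savedAgents','Agent2.mat'),'saved_agent');
savedAgent = data.saved_agent;

simOpts = rlSimulationOptions('MaxSteps',maxsteps);
experience = sim(env,savedAgent,simOpts);

% The action is logged as a timeseries under the name set in actInfo.Name
actionData = squeeze(experience.Action.current.Data);
fprintf('action min %.3f, max %.3f, std %.3f\n', ...
    min(actionData),max(actionData),std(actionData));
```

If the minimum and maximum coincide at +15 or -15, the tanh layer in the actor has probably saturated; if they coincide at some other value, the actor may simply not have learned to react to the observations yet.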

The following documentation links may be useful for further reference:

getLearnableParameters: https://www.mathworks.com/help/reinforcement-learning/ref/rl.policy.rlmaxqpolicy.getlearnableparameters.html
setLearnableParameters: https://www.mathworks.com/help/reinforcement-learning/ref/rl.policy.rlmaxqpolicy.setlearnableparameters.html
rlDDPGAgent: https://www.mathworks.com/help/reinforcement-learning/ref/rl.agent.rlddpgagent.html?searchHighlight=plot&s_tid=doc_srchtitle

Hope this helps!
