MATLAB 2021a RL SAC Error

3 views (last 30 days)
요셉 이 on 31 Jan 2023
I got this error. Does anyone have an idea what the problem is?
Caused by:
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 667)
Invalid input argument type or size such as observation, reward, isdone or loggedSignals.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 667)
One input argument can have dimension labels only when the other input argument is an unformatted scalar. Use .* for element-wise multiplication.
-------------------------------------------
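% Define observation and action specifications and create the Simulink environment.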
numObs = 21; % Number of observations
numAct = 2;  % Number of actions
obsInfo = rlNumericSpec([numObs 1]);
actInfo = rlNumericSpec([numAct 1]);
actInfo.LowerLimit = -1;
actInfo.UpperLimit = 1;
mdl = 'RLagentsimlSAC';
agentblk = 'RLagentsimlSAC/CarMaker/VehicleControl/CreateBus VhclCtrl/RL Agent';
env = rlSimulinkEnv(mdl,agentblk,obsInfo,actInfo);
env.ResetFcn = @(in) setVariable(in,'x0',rand());
rng(0)
% open_system(mdl)
Ts = 0.1; % Agent sample time (s)
Tf = 150; % Simulation duration (s)
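% Create the critic network: an observation path and an action path joined by a concatenation layer.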
cnet = [
    featureInputLayer(numObs,"Normalization","none","Name","observation")
    fullyConnectedLayer(128,"Name","fc1")
    concatenationLayer(1,2,"Name","concat")
    reluLayer("Name","relu1")
    fullyConnectedLayer(64,"Name","fc3")
    reluLayer("Name","relu2")
    fullyConnectedLayer(32,"Name","fc4")
    reluLayer("Name","relu3")
    fullyConnectedLayer(1,"Name","CriticOutput")];
actionPath = [
    featureInputLayer(numAct,"Normalization","none","Name","action")
    fullyConnectedLayer(128,"Name","fc2")];
criticNetwork = layerGraph(cnet);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = connectLayers(criticNetwork,"fc2","concat/in2");
% plot(criticNetwork)
criticdlnet = dlnetwork(criticNetwork,'Initialize',false);
criticdlnet1 = initialize(criticdlnet);
criticdlnet2 = initialize(criticdlnet);
criticOptions = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1,'L2RegularizationFactor',1e-4);
critic1 = rlQValueRepresentation(criticdlnet1,obsInfo,actInfo, ...
    'Observation',{'observation'},'Action',{'action'},criticOptions);
critic2 = rlQValueRepresentation(criticdlnet2,obsInfo,actInfo, ...
    'Observation',{'observation'},'Action',{'action'},criticOptions);
% Create the actor network layers.
statePath = [
    featureInputLayer(numObs,"Normalization","none","Name","observation")
    fullyConnectedLayer(128,"Name","fc1")
    reluLayer("Name","relu1")
    fullyConnectedLayer(64,"Name","fc2")
    reluLayer("Name","relu2")];
meanPath = [
    fullyConnectedLayer(32,"Name","MeanFC1")
    reluLayer("Name","relu3")
    fullyConnectedLayer(numAct,"Name","Mean")];
stdPath = [
    fullyConnectedLayer(numAct,"Name","StdFC")
    reluLayer("Name","relu4")
    softplusLayer("Name","StandardDeviation")];
concatPath = concatenationLayer(1,2,'Name','GaussianParameters');
% Connect the layers.
actorNetwork = layerGraph(statePath);
actorNetwork = addLayers(actorNetwork,meanPath);
actorNetwork = addLayers(actorNetwork,stdPath);
actorNetwork = addLayers(actorNetwork,concatPath);
actorNetwork = connectLayers(actorNetwork,'relu2','MeanFC1/in');
actorNetwork = connectLayers(actorNetwork,'relu2','StdFC/in');
actorNetwork = connectLayers(actorNetwork,'Mean','GaussianParameters/in1');
actorNetwork = connectLayers(actorNetwork,'StandardDeviation','GaussianParameters/in2');
% plot(actorNetwork)
actordlnet = dlnetwork(actorNetwork);
actorOptions = rlRepresentationOptions('Optimizer','adam','LearnRate',1e-3, ...
    'GradientThreshold',1,'L2RegularizationFactor',1e-5);
actor = rlStochasticActorRepresentation(actordlnet,obsInfo,actInfo,actorOptions, ...
    'Observation',{'observation'});
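% Specify the SAC agent options.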
agentOpts = rlSACAgentOptions( ...
    "SampleTime",Ts, ...
    "TargetSmoothFactor",1e-3, ...
    "ExperienceBufferLength",1e6, ...
    "MiniBatchSize",512, ...
    "NumWarmStartSteps",1000, ...
    "DiscountFactor",0.99);
% agentOpts.ActorOptimizerOptions.Algorithm = 'adam';
% agentOpts.ActorOptimizerOptions.LearnRate = 1e-4;
% agentOpts.ActorOptimizerOptions.GradientThreshold = 1;
%
% for ct = 1:2
% agentOpts.CriticOptimizerOptions(ct).Algorithm = "adam";
% agentOpts.CriticOptimizerOptions(ct).LearnRate = 1e-4;
% agentOpts.CriticOptimizerOptions(ct).GradientThreshold = 1;
% end
agent = rlSACAgent(actor,[critic1,critic2],agentOpts);
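% Specify the training options.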
trainOpts = rlTrainingOptions( ...
    "MaxEpisodes",5000, ...
    "MaxStepsPerEpisode",floor(Tf/Ts), ...
    "ScoreAveragingWindowLength",100, ...
    "Plots","training-progress", ...
    "StopTrainingCriteria","AverageReward", ...
    "StopTrainingValue",100, ...
    "UseParallel",false);
if trainOpts.UseParallel
    trainOpts.ParallelizationOptions.AttachedFiles = [pwd,filesep] + ...
        ["bracelet_with_vision_link.STL";
         "half_arm_2_link.STL";
         "end_effector_link.STL";
         "shoulder_link.STL";
         "base_link.STL";
         "forearm_link.STL";
         "spherical_wrist_1_link.STL";
         "bracelet_no_vision_link.STL";
         "half_arm_1_link.STL";
         "spherical_wrist_2_link.STL"];
end
doTraining = true;
if doTraining
    stats = train(agent,env,trainOpts);
else
    load("kinovaBallBalanceAgent.mat")
end
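Before a full training run, the environment wiring can be checked in isolation with validateEnvironment, which resets the model and simulates it briefly, verifying the observation, reward, and isdone signals against obsInfo and actInfo. A minimal diagnostic sketch using the env defined above; a size mismatch here reproduces the "Invalid input argument type or size" error without a long training run:

% Reset the environment and run a short simulation to verify that the
% Simulink signals match the observation and action specifications.
validateEnvironment(env)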
---------------------------------------------------------------
3 Comments
요셉 이 on 31 Jan 2023
Error using rl.env.AbstractEnv/simWithPolicy (line 83)
Unable to simulate model 'RLagentsimlSAC' with the agent 'agent'.
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 166)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 170)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 194)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 424)
run(trainer);
Error in rl.train.TrainingManager/run (line 215)
train(this);
Error in rl.agent.AbstractAgent/train (line 77)
TrainingStatistics = run(trainMgr);
Error in RLagentSAC (line 133)
stats = train(agent,env,trainOpts);
Caused by:
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 667)
Invalid input argument type or size such as observation, reward, isdone or loggedSignals.
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 667)
One input argument can have dimension labels only when the other input argument is an unformatted scalar. Use .* for element-wise multiplication.
Emmanouil Tzorakoleftherakis on 31 Jan 2023
If you had this example working with TD3, the error is likely due to how you structured your actor and critic for SAC. I would use the default agent feature, which gives you an initial policy architecture, and then compare it with what you have to find the error. Without reproducible code, it's not easy to identify the problem.
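A minimal sketch of that suggestion, assuming the obsInfo and actInfo defined in the question (rlAgentInitializationOptions and the specification-only rlSACAgent syntax are available in R2021a):

% Let the toolbox generate a compatible default actor and two critics
% directly from the observation and action specifications.
initOpts = rlAgentInitializationOptions('NumHiddenUnit',128);
defaultAgent = rlSACAgent(obsInfo,actInfo,initOpts);
% List the layers of the generated actor and the first critic so their
% input/output sizes can be compared with the custom networks above.
actorNet = getModel(getActor(defaultAgent));
actorNet.Layers
criticReps = getCritic(defaultAgent);
criticNet = getModel(criticReps(1));
criticNet.Layers

Comparing the generated layer sizes and names against the custom actorNetwork and criticNetwork should show where the two architectures diverge.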


Answers (0)
