problems encountered in DDPG.

28 views (last 30 days)
邓龙京 on 8 Nov 2024 at 12:32
Edited: Walter Roberson on 18 Nov 2024 at 5:41
I set up a Simulink environment for using DDPG to suppress sub-oscillations. I followed the DDPG tutorial in MATLAB, but some errors occur when the program runs.
The error message is:
Error using rlQValueFunction
Number of input layers for state-action-value function deep neural network must equal the number of observation and action specifications.
Why is this happening? I tried setting my obsInfo and actInfo to [1 1] or 1, but all of these attempts failed with the same error message, so I am sure the error is not related to obsInfo = rlNumericSpec([3 1]); actInfo = rlNumericSpec([1 1]).
My code is:
mdl = 'rlVSG';
open_system(mdl);
% Define the observation and action spaces
obsInfo = rlNumericSpec([3 1]); % assume the observation space has dimension 3
actInfo = rlNumericSpec([1 1]); % DDPG works with continuous actions; assume the action space has dimension 1
% Set up the environment
env = rlSimulinkEnv(mdl, [mdl '/Subsystem2/RL Agent'], obsInfo, actInfo);
rng(0)
% Define the actor network
actorLayers = [
featureInputLayer(prod(obsInfo.Dimension))
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(1)]; % output a single continuous action value
actorNet = dlnetwork(actorLayers);
summary(actorNet)
% Create the actor object
actor = rlContinuousDeterministicActor( ...
actorNet, ...
obsInfo, ...
actInfo);
% Define the critic network
criticLayers = [
featureInputLayer(prod(actInfo.Dimension))
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(1)]; % output a single scalar value
% Pass all layers directly when creating the dlnetwork
criticNet = dlnetwork(criticLayers);
summary(criticNet);
% Create the critic object
critic = rlQValueFunction(...
criticNet,...
obsInfo, ...
actInfo);
% Set optimizer options
actorOpts = rlOptimizerOptions('LearnRate', 1e-4, 'GradientThreshold', 0.3);
criticOpts = rlOptimizerOptions('LearnRate', 1e-3, 'GradientThreshold', 0.2);
% Set DDPG agent options
agentOpts = rlDDPGAgentOptions( ...
'SampleTime', 0.05, ...
'MiniBatchSize', 256, ...
'DiscountFactor', 0.99, ...
'ExperienceBufferLength', 1e6, ...
'ActorOptimizerOptions', actorOpts, ...
'CriticOptimizerOptions', criticOpts, ...
'UseTargetNetwork', true, ...
'TargetSmoothFactor', 1e-3, ...
'LearnRate', 1e-4);
% Create the DDPG agent
agent = rlDDPGAgent(actor, critic, agentOpts); % use the actor and critic objects rather than the raw networks
% Set training options
trainOpts = rlTrainingOptions( ...
'MaxEpisodes', 1000, ...
'MaxStepsPerEpisode', 800, ...
'StopTrainingCriteria', 'AverageReward', ...
'StopTrainingValue', 2000, ...
'SaveAgentCriteria', 'AverageReward', ...
'SaveAgentValue', 2000);
% Train the agent
trainingStats = train(agent, env, trainOpts);
% Set simulation options
simOptions = rlSimulationOptions('MaxSteps', 1000);
% Simulate the agent
sim(env, agent, simOptions);
Following the example in the MATLAB documentation, I then set up separate observation and action input layers joined by common layers. Here is the part of the code that I modified:
% Define the critic network
% Observation and action input layers
obsInputLayer = featureInputLayer(prod(obsInfo.Dimension),Name="obsInput"); % observation input layer
actInputLayer = featureInputLayer(prod(actInfo.Dimension),Name="actInput"); % action input layer
% Merge the observation and action inputs with a concatenation layer
criticLayers = [concatenationLayer(1,2,Name="concat")
fullyConnectedLayer(200, 'Name', 'fc1')
reluLayer('Name', 'relu1')
fullyConnectedLayer(200, 'Name', 'fc2')
reluLayer('Name', 'relu2')
fullyConnectedLayer(1, 'Name', 'qValue')];
% Create the critic network
criticNet = dlnetwork;
criticNet = addLayers(criticNet, obsInputLayer);
criticNet = addLayers(criticNet, actInputLayer);
criticNet = addLayers(criticNet, criticLayers);
criticNet = connectLayers(criticNet,"obsInput","concat/in1");
criticNet = connectLayers(criticNet,"actInput","concat/in2");
summary(criticNet);
% Create the critic object
critic = rlQValueFunction(criticNet, obsInfo, actInfo);
The error message is:
Error using dlnetwork
Argument list is invalid. The function requires 1 additional input.
How can I solve this problem?

Answer (1)

MULI on 18 Nov 2024 at 4:42
The error message "Number of input layers for state-action-value function deep neural network must equal the number of observation and action specifications" suggests that:
  • The critic network should have distinct input layers for observations and actions.
  • These input layers must be combined using a concatenation layer before they proceed through the rest of the network.
You can modify your critic network setup as given below:
% Define observation and action input layers
obsInputLayer = featureInputLayer(prod(obsInfo.Dimension), 'Name', 'obsInput');
actInputLayer = featureInputLayer(prod(actInfo.Dimension), 'Name', 'actInput');
% Define the layers for the critic network
criticLayers = [
concatenationLayer(1, 2, 'Name', 'concat')
fullyConnectedLayer(200, 'Name', 'fc1')
reluLayer('Name', 'relu1')
fullyConnectedLayer(200, 'Name', 'fc2')
reluLayer('Name', 'relu2')
fullyConnectedLayer(1, 'Name', 'qValue')];
% Create the critic network
criticNet = layerGraph();
criticNet = addLayers(criticNet, obsInputLayer);
criticNet = addLayers(criticNet, actInputLayer);
criticNet = addLayers(criticNet, criticLayers);
% Connect the input layers to the concatenation layer
criticNet = connectLayers(criticNet, 'obsInput', 'concat/in1');
criticNet = connectLayers(criticNet, 'actInput', 'concat/in2');
% Convert to dlnetwork
criticNet = dlnetwork(criticNet);
% Create Critic object
critic = rlQValueFunction(criticNet, obsInfo, actInfo);
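If the observation and action inputs are not matched automatically, you can also name them explicitly when creating the critic (a sketch, assuming the layer names "obsInput" and "actInput" defined above):
% Optionally tell rlQValueFunction which input layer carries observations
% and which carries actions, so the matching is unambiguous
critic = rlQValueFunction(criticNet, obsInfo, actInfo, ...
    ObservationInputNames="obsInput", ...
    ActionInputNames="actInput");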
The second error message "Error using dlnetwork: argument list is invalid. The function requires 1 additional input." typically occurs when:
  • The layers are not passed to dlnetwork correctly. In your modified code, criticNet = dlnetwork; calls dlnetwork with no input argument, but the function expects a layer array or layer graph.
  • The solution above avoids this by assembling the layers in a layerGraph first and converting the complete graph to a dlnetwork in a single call.
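As a quick sanity check (a sketch, assuming the [3 1] observation and [1 1] action specifications from your question), you can push one random observation/action sample through the converted network and confirm it returns a single Q-value:
% The two-input critic should accept one observation sample and one action
% sample and return a scalar Q-value (inputs follow criticNet.InputNames order)
obsSample = dlarray(rand(prod(obsInfo.Dimension),1), "CB"); % channels-by-batch
actSample = dlarray(rand(prod(actInfo.Dimension),1), "CB");
qValue = predict(criticNet, obsSample, actSample);
size(qValue) % expected: 1 1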
Hope this helps!
