problems encountered in DDPG.

28 views (last 30 days)
邓龙京 on 8 Nov 2024 at 12:32
Edited: Walter Roberson on 18 Nov 2024 at 5:41
I set up a Simulink environment for using DDPG to suppress sub-oscillations. I followed the DDPG tutorial in MATLAB, but some errors occur when the program runs.
The error message is:
Error using rlQValueFunction
Number of input layers for state-action-value function deep neural network must equal the number of observation and action specifications.
Why is this happening? I tried setting my obsInfo and actInfo to [1 1] or 1, but all of these attempts failed with the same error message, so I am sure the error is not related to obsInfo = rlNumericSpec([3 1]); actInfo = rlNumericSpec([1 1]).
My code is:
mdl = 'rlVSG';
open_system(mdl);
% Define the observation and action spaces
obsInfo = rlNumericSpec([3 1]); % assume the observation space has dimension 3
actInfo = rlNumericSpec([1 1]); % DDPG works with continuous actions; assume the action space has dimension 1
% Set up the environment
env = rlSimulinkEnv(mdl, [mdl '/Subsystem2/RL Agent'], obsInfo, actInfo);
rng(0)
% Define the actor network
actorLayers = [
featureInputLayer(prod(obsInfo.Dimension))
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(1)]; % output a single continuous action value
actorNet = dlnetwork(actorLayers);
summary(actorNet)
% Create the actor object
actor = rlContinuousDeterministicActor( ...
actorNet, ...
obsInfo, ...
actInfo);
% Define the critic network
criticLayers = [
featureInputLayer(prod(actInfo.Dimension))
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(1)]; % output a single scalar value
% Pass all layers directly when creating the dlnetwork
criticNet = dlnetwork(criticLayers);
summary(criticNet);
% Create the critic object
critic = rlQValueFunction(...
criticNet,...
obsInfo, ...
actInfo);
% Set optimizer options
actorOpts = rlOptimizerOptions('LearnRate', 1e-4, 'GradientThreshold', 0.3);
criticOpts = rlOptimizerOptions('LearnRate', 1e-3, 'GradientThreshold', 0.2);
% Set DDPG agent options
agentOpts = rlDDPGAgentOptions( ...
'SampleTime', 0.05, ...
'MiniBatchSize', 256, ...
'DiscountFactor', 0.99, ...
'ExperienceBufferLength', 1e6, ...
'ActorOptimizerOptions', actorOpts, ...
'CriticOptimizerOptions', criticOpts, ...
'UseTargetNetwork', true, ...
'TargetSmoothFactor', 1e-3, ...
'LearnRate', 1e-4);
% Create the DDPG agent
agent = rlDDPGAgent(actor, critic, agentOpts); % use the actor and critic objects rather than the raw networks
% Set training options
trainOpts = rlTrainingOptions( ...
'MaxEpisodes', 1000, ...
'MaxStepsPerEpisode', 800, ...
'StopTrainingCriteria', 'AverageReward', ...
'StopTrainingValue', 2000, ...
'SaveAgentCriteria', 'AverageReward', ...
'SaveAgentValue', 2000);
% Train the agent
trainingStats = train(agent, env, trainOpts);
% Set simulation options
simOptions = rlSimulationOptions('MaxSteps', 1000);
% Simulate the agent
sim(env, agent, simOptions);
Following the example in the MATLAB documentation, I then set up separate observation and action input layers joined by common layers. Here is the part of the code that I modified:
% Define the critic network
% Observation and action input layers
obsInputLayer = featureInputLayer(prod(obsInfo.Dimension),Name="obsInput"); % observation input layer
actInputLayer = featureInputLayer(prod(actInfo.Dimension),Name="actInput"); % action input layer
% Merge the observation and action inputs with a concatenation layer
criticLayers = [concatenationLayer(1,2,Name="concat")
fullyConnectedLayer(200, 'Name', 'fc1')
reluLayer('Name', 'relu1')
fullyConnectedLayer(200, 'Name', 'fc2')
reluLayer('Name', 'relu2')
fullyConnectedLayer(1, 'Name', 'qValue')];
% Create the critic network
criticNet = dlnetwork;
criticNet = addLayers(criticNet, obsInputLayer);
criticNet = addLayers(criticNet, actInputLayer);
criticNet = addLayers(criticNet, criticLayers);
criticNet = connectLayers(criticNet,"obsInput","concat/in1");
criticNet = connectLayers(criticNet,"actInput","concat/in2");
summary(criticNet);
% Create the critic object
critic = rlQValueFunction(criticNet, obsInfo, actInfo);
The error message is:
Error using dlnetwork
Argument list is invalid. The function requires 1 additional input.
How can I solve this problem?

Answer (1)

MULI on 18 Nov 2024 at 4:42
The error message "Number of input layers for state-action-value function deep neural network must equal the number of observation and action specifications" suggests that:
  • The critic network should have distinct input layers for observations and actions.
  • These input layers must be combined using a concatenation layer before they proceed through the rest of the network.
You can modify your critic network setup as given below:
% Define observation and action input layers
obsInputLayer = featureInputLayer(prod(obsInfo.Dimension), 'Name', 'obsInput');
actInputLayer = featureInputLayer(prod(actInfo.Dimension), 'Name', 'actInput');
% Define the layers for the critic network
criticLayers = [
concatenationLayer(1, 2, 'Name', 'concat')
fullyConnectedLayer(200, 'Name', 'fc1')
reluLayer('Name', 'relu1')
fullyConnectedLayer(200, 'Name', 'fc2')
reluLayer('Name', 'relu2')
fullyConnectedLayer(1, 'Name', 'qValue')];
% Create the critic network
criticNet = layerGraph();
criticNet = addLayers(criticNet, obsInputLayer);
criticNet = addLayers(criticNet, actInputLayer);
criticNet = addLayers(criticNet, criticLayers);
% Connect the input layers to the concatenation layer
criticNet = connectLayers(criticNet, 'obsInput', 'concat/in1');
criticNet = connectLayers(criticNet, 'actInput', 'concat/in2');
% Convert to dlnetwork
criticNet = dlnetwork(criticNet);
% Create Critic object
critic = rlQValueFunction(criticNet, obsInfo, actInfo);
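If the observation and action inputs are not matched automatically, you can also name them explicitly when creating the critic (a sketch, assuming the layer names "obsInput" and "actInput" defined above):
% Optionally tell rlQValueFunction which input layer carries observations
% and which carries actions, so the matching is unambiguous
critic = rlQValueFunction(criticNet, obsInfo, actInfo, ...
    ObservationInputNames="obsInput", ...
    ActionInputNames="actInput");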
The second error message "Error using dlnetwork: argument list is invalid. The function requires 1 additional input." typically occurs when:
  • The layers are not passed to dlnetwork correctly. In your modified code, criticNet = dlnetwork; calls dlnetwork with no input argument, but the function expects a layer array or layer graph.
  • The solution above avoids this by assembling the layers in a layerGraph first and converting the complete graph to a dlnetwork in a single call.
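As a quick sanity check (a sketch, assuming the [3 1] observation and [1 1] action specifications from your question), you can push one random observation/action sample through the converted network and confirm it returns a single Q-value:
% The two-input critic should accept one observation sample and one action
% sample and return a scalar Q-value (inputs follow criticNet.InputNames order)
obsSample = dlarray(rand(prod(obsInfo.Dimension),1), "CB"); % channels-by-batch
actSample = dlarray(rand(prod(actInfo.Dimension),1), "CB");
qValue = predict(criticNet, obsSample, actSample);
size(qValue) % expected: 1 1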
Hope this helps!
