Before the final line trainingStats = train(agent, env, trainOpts); runs there is no error, and the DDPGAgent is created successfully

Views: 6 (last 30 days)

成奥 on 21 May 2024
Commented: liu lin on 30 Oct 2025
clear all
clc
addpath('matpower6.0');
define_constants;
% define named indices into bus, gen, branch matrices
[PQ, PV, REF, NONE, BUS_I, BUS_TYPE, PD, QD, GS, BS, BUS_AREA, VM, ...
VA, BASE_KV, ZONE, VMAX, VMIN, LAM_P, LAM_Q, MU_VMAX, MU_VMIN] = idx_bus;
[F_BUS, T_BUS, BR_R, BR_X, BR_B, RATE_A, RATE_B, RATE_C, ...
TAP, SHIFT, BR_STATUS, PF, QF, PT, QT, MU_SF, MU_ST, ...
ANGMIN, ANGMAX, MU_ANGMIN, MU_ANGMAX] = idx_brch;
[GEN_BUS, PG, QG, QMAX, QMIN, VG, MBASE, GEN_STATUS, PMAX, PMIN, ...
MU_PMAX, MU_PMIN, MU_QMAX, MU_QMIN, PC1, PC2, QC1MIN, QC1MAX, ...
QC2MIN, QC2MAX, RAMP_AGC, RAMP_10, RAMP_30, RAMP_Q, APF] = idx_gen;
%% Create the environment
env = PowerSystemEnv;
% Inspect the environment's observation and action specifications
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
% Fix the random seed for reproducibility
rng(0)
%% Create the critic network
% Observation path
obsPath = [
featureInputLayer(obsInfo.Dimension(1), 'Normalization', 'none', 'Name', 'state')
fullyConnectedLayer(200, 'Name', 'fc1')
reluLayer('Name', 'relu1')
fullyConnectedLayer(100, 'Name', 'fc2')
reluLayer('Name', 'relu2')];
% Action path
actPath = [
featureInputLayer(actInfo.Dimension(1), 'Normalization', 'none', 'Name', 'action')
fullyConnectedLayer(100, 'Name', 'fc3')
reluLayer('Name', 'relu3')];
% Common path
commonPath = [
concatenationLayer(1, 2, 'Name', 'concat')
fullyConnectedLayer(50, 'Name', 'fc4')
reluLayer('Name', 'relu4')
fullyConnectedLayer(1, 'Name', 'output')];
% Assemble and connect the network
net = layerGraph(obsPath);
net = addLayers(net, actPath);
net = addLayers(net, commonPath);
net = connectLayers(net, 'relu2', 'concat/in1');
net = connectLayers(net, 'relu3', 'concat/in2');
% Convert the layer graph to a dlnetwork
net = dlnetwork(net);
% Create a Q-value function for the continuous action space
critic = rlQValueFunction(net, obsInfo, actInfo, ...
'ObservationInputNames', 'state', 'ActionInputNames', 'action');
%% Create the actor network
layers = [
featureInputLayer(obsInfo.Dimension(1), 'Normalization', 'none', 'Name', 'state')
fullyConnectedLayer(200, 'Name', 'fc1')
reluLayer('Name', 'relu1')
fullyConnectedLayer(100, 'Name', 'fc2')
reluLayer('Name', 'relu2')
fullyConnectedLayer(actInfo.Dimension(1), 'Name', 'output')
tanhLayer('Name', 'tanh')];
net = dlnetwork(layers);
actor = rlContinuousDeterministicActor(net, obsInfo, actInfo);
%% Optimizer options
criticOpts = rlOptimizerOptions('LearnRate', 1e-3, 'GradientThreshold', 1);
actorOpts = rlOptimizerOptions('LearnRate', 1e-4, 'GradientThreshold', 1);
%% DDPG hyperparameters
agentOpts = rlDDPGAgentOptions( ...
'SampleTime', 0.1, ...
'CriticOptimizerOptions', criticOpts, ...
'ActorOptimizerOptions', actorOpts, ...
'ExperienceBufferLength', 1e4, ...
'MiniBatchSize', 256, ...
'DiscountFactor', 0.99, ...
'TargetSmoothFactor', 1e-3, ...
'TargetUpdateFrequency', 1 );
% Create the DDPG agent
agent = rlDDPGAgent(actor, critic, agentOpts);
%% Train the agent
trainOpts = rlTrainingOptions( ...
'MaxEpisodes', 2000, ...
'MaxStepsPerEpisode', 5000, ...
'StopTrainingCriteria', 'EpisodeReward', ...
'StopTrainingValue', 40, ...
'SaveAgentCriteria', 'EpisodeReward', ...
'SaveAgentValue', 40, ...
'Verbose', false, ...
'Plots', 'training-progress');
% Start training
trainingStats = train(agent, env, trainOpts);
% Running this produces the following error
Error using rl.train.SeriesTrainer/run
There was an error executing the ProcessExperienceFcn.
Caused by:
Error using rl.policy.rlAdditiveNoisePolicy/getAction_
Batch observations not supported for Ornstein-Uhlenbeck noise model.
Error in rl.policy.PolicyInterface/getAction (line 36)
[action,this] = getAction_(this,observation);
Error in rl.agent.AbstractOffPolicyAgent/getExplorationAction_ (line 116)
[action,this.ExplorationPolicy_] = getAction(this.ExplorationPolicy_,...
Error in rl.agent.AbstractAgent/getAction_ (line 90)
[action,this] = getExplorationAction_(this,observation);
Error in rl.policy.PolicyInterface/getAction (line 36)
[action,this] = getAction_(this,observation);
Error in rl.env.internal.PolicyExperienceProcessorInterface/evaluateAction_ (line 32)
[action,this.Policy_] = getAction(this.Policy_,observation);
Error in rl.env.internal.ExperienceProcessorInterface/evaluateAction (line 62)
action = evaluateAction_(this,observation);
Error in rl.env.internal.MATLABSimulator/simInternal_ (line 109)
act = evaluateAction(expProcessor,obs);
Error in rl.env.internal.MATLABSimulator/sim_ (line 67)
out = simInternal_(this,simPkg);
Error in rl.env.internal.AbstractSimulator/sim (line 30)
out = sim_(this,simData,policy,processExpFcn,processExpData);
Error in rl.env.AbstractEnv/runEpisode (line 144)
out = sim(simulator,simData,policy,processExpFcn,processExpData);
Error in rl.train.SeriesTrainer/run (line 64)
out = runEpisode(...
Error in rl.train.TrainingManager/train (line 516)
run(trainer);
Error in rl.train.TrainingManager/run (line 253)
train(this);
Error in rl.agent.AbstractAgent/train (line 187)
trainingResult = run(trainMgr,checkpoint);
Error in main (line 104)
trainingStats = train(agent, env, trainOpts);
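The failure can be reproduced without launching a full training run, which makes it much quicker to diagnose. A minimal check, using the env and agent already created in the script above (no new names assumed):

```matlab
% Reproduce the getAction failure outside of train().
% reset() returns the initial observation from the custom environment;
% if its size is not obsInfo.Dimension(1)-by-1, the additive-noise
% policy will treat it as a batch and throw the same error.
obs = reset(env);
size(obs)                       % expect a column vector, N-by-1
act = getAction(agent, {obs});  % errors here if obs looks like a batch
```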
1 Comment

liu lin on 30 Oct 2025
I can't see anything wrong in the main script; the problem is most likely somewhere in the environment.
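Consistent with that: "Batch observations not supported for Ornstein-Uhlenbeck noise model" is typically raised when the observation returned by the environment's step or reset is not a single column vector of size obsInfo.Dimension(1)-by-1 (for example a 1-by-N row, or an N-by-M matrix), so the exploration policy interprets it as a batch. A sketch of the usual fix inside PowerSystemEnv; the property name this.State and the placeholder reward/termination lines are assumptions to be replaced by the class's actual logic:

```matlab
function [Observation, Reward, IsDone, LoggedSignals] = step(this, Action)
    % ... compute the next power-system state as before ...

    % Force the observation into the N-by-1 column shape declared in the
    % observation rlNumericSpec; any other shape is seen as a "batch" by
    % the Ornstein-Uhlenbeck exploration policy.
    Observation = reshape(this.State(:), [], 1);

    Reward = 0;             % placeholder: keep the original reward logic
    IsDone = false;         % placeholder: keep the original termination logic
    LoggedSignals = [];
end
```

The same reshape should be applied to the value returned by reset().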


Answers (0)
