Problems with control of PID parameters using reinforcement learning

Question

jh j 2023년 5월 15일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1963919-problems-with-control-of-pid-parameters-using-reinforcement-learning

답변: Divyanshu 2024년 12월 26일

Hello,I am currently learning about reinforcement learning. I want to use reinforcement learning to generate my PID controller parameters. But when I build the simulink model and start training, my reward value keeps going to 0 and when I look at the actions generated by the model in simulink it keeps going to null.I have no idea why this is happening and I am looking for some help.Thank you in advance for your assistance.

clc;clear;close all;
open_system('BLDCRL')
Unable to find system or file 'BLDCRL'.
obsInfo = rlNumericSpec([3 1],...
    'LowerLimit',[-inf -inf -inf ]',...
    'UpperLimit',[ inf  inf inf]');
numObservations = obsInfo.Dimension(1);
actInfo = rlNumericSpec([3 1]);
numActions = actInfo.Dimension(1);
env = rlSimulinkEnv('BLDCRL','BLDCRL/RL Agent',...
    obsInfo,actInfo);
env.ResetFcn = @localResetFcn;
Ts = 5;
Tf = 100;
rng(0)
statePath = [
    featureInputLayer(numObservations,'Normalization','none','Name','State')
    fullyConnectedLayer(500,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(500,'Name','CriticStateFC2')];
actionPath = [
    featureInputLayer(numActions,'Normalization','none','Name','Action')
    fullyConnectedLayer(500,'Name','ActionFC1')];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','CriticOutput')];
criticNetwork = layerGraph();
criticNetwork = addLayers(criticNetwork,statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'ActionFC1','add/in2');
% figure
% plot(criticNetwork)
criticOpts = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1);
% critic = rlQValueFunction(criticNetwork,...
%              obsInfo,actInfo, ...
%              "ActionInputNames",{'Action'},"ObservationInputNames",{'State'},"UseDevice","gpu");
critic = rlRepresentation(criticNetwork,obsInfo,actInfo,'Observation',{'State'},'Action',{'Action'},criticOpts);
actorNetwork = [
    featureInputLayer(numObservations,'Normalization','none','Name','State')
    fullyConnectedLayer(500,'Name','ActorFC1')
    reluLayer('Name','ActorRelu1') 
    fullyConnectedLayer(500,'Name','ActorFC2')
    reluLayer('Name','ActorRelu2') 
    fullyConnectedLayer(numActions,'Name','Action')
    ];
actorOptions = rlRepresentationOptions('LearnRate',1e-05,'GradientThreshold',1);
actor = rlRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'State'},'Action',{'Action'},actorOptions);
% actor = rlContinuousDeterministicActor(actorNetwork,obsInfo,actInfo, ...
%     "ObservationInputNames",{'State'},"UseDevice","gpu");
% act = getAction(actor, ...
%     {rand(obsInfo.Dimension)}); 
% act{1}
agentOpts = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',1.0, ...
    'MiniBatchSize',64, ...
    'ExperienceBufferLength',1e6);
agentOpts.NoiseOptions.Variance = 0.3;
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOpts);
maxepisodes = 500;
maxsteps = ceil(Tf/Ts);
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',maxepisodes, ...
    'MaxStepsPerEpisode',maxsteps, ...
    'ScoreAveragingWindowLength',5, ...
    'Verbose',false, ...
    'Plots','training-progress',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',998);
trainingStats = train(agent,env,trainOpts);

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Divyanshu 2024년 12월 26일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1963919-problems-with-control-of-pid-parameters-using-reinforcement-learning#answer_1556495

Hello @jh j,

If you are following some official example of Reinforcement Learning Toolbox, then ensure that the Reinforcement Learning Toolbox is installed properly on your system along with MATLAB. Because the error indicates that the model file 'BLDCRL' is missing.

However, if this is a custom code then ensure that the model 'BLDCRL' is present on MATLAB path.

Additionally, you can take reference of the following example documentation to get more details.

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

Problems with control of PID parameters using reinforcement learning

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

답변 (1개)

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

Problems with control of PID parameters using reinforcement learning

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

답변 (1개)

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기