Randomized position of obstacles in Grid World

GCats on 9 Feb 2022
Edited: StevenKlein on 8 Aug 2022
Hello everyone!
I'm training a Q-learning agent in a standard 5x5 grid world environment. I would like the obstacle positions to change at every training episode, without ever coinciding with the target (terminal) state. Does anyone know how to do this?
Here is my code:
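% Build a 5x5 grid world with a fixed start, target, and obstacle layout.
GW = createGridWorld(5,5);
GW.CurrentState = '[1,1]';
GW.TerminalStates = '[3,3]';
GW.ObstacleStates = ["[3,2]";"[2,2]";"[2,3]";"[2,4]";"[3,4]"];
% Block transitions into the obstacle cells
% (the function name really is spelled "Transtion" in the toolbox).
updateStateTranstionForObstacles(GW);
% Reward: -1 per step, +10 for any transition into the terminal state.
nS = numel(GW.States);
nA = numel(GW.Actions);
GW.R = -1*ones(nS,nS,nA);
% GW.R(state2idx(GW,"[2,4]"),state2idx(GW,"[4,4]"),:) = 5;
GW.R(:,state2idx(GW,GW.TerminalStates),:) = 10;
% Wrap the model in an MDP environment; every episode starts in state index 1, i.e. [1,1].
env = rlMDPEnv(GW);
env.ResetFcn = @() 1;
rng(0)
% Tabular Q-learning agent.
qTable = rlTable(getObservationInfo(env),getActionInfo(env));
qRepresentation = rlQValueRepresentation(qTable,getObservationInfo(env),getActionInfo(env));
qRepresentation.Options.LearnRate = 1;
agentOpts = rlQAgentOptions;
agentOpts.EpsilonGreedyExploration.Epsilon = .04;
qAgent = rlQAgent(qRepresentation,agentOpts);
% Training options.
trainOpts = rlTrainingOptions;
trainOpts.MaxStepsPerEpisode = 50;
trainOpts.MaxEpisodes = 200;
trainOpts.StopTrainingCriteria = "AverageReward";
% Note: with the +5 jump reward above commented out, the best episode return
% for this layout appears to stay below 11, so training runs all 200 episodes.
trainOpts.StopTrainingValue = 11;
trainOpts.ScoreAveragingWindowLength = 30;
doTraining = true;
if doTraining
    % Train the agent.
    trainingStats = train(qAgent,env,trainOpts);
else
    % Load the pretrained agent for the example.
    load('basicGWQAgent.mat','qAgent')
end
% Visualize the environment and simulate the trained agent with the trace shown.
plot(env)
env.Model.Viewer.ShowTrace = true;
env.Model.Viewer.clearTrace;
sim(qAgent,env)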
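As a starting point for the question above, the sampling step can be sketched in plain MATLAB, without touching the environment. This is a minimal sketch, not a tested solution: `sampleObstacles`, `protected`, and `candidates` are hypothetical names introduced here, and the number of obstacles (5) and the protected cells ([1,1] and [3,3]) are taken from the posted code.

```matlab
% Sketch: draw a random obstacle layout that never hits the protected cells.
nRows = 5; nCols = 5; nObs = 5;
protected = ["[1,1]"; "[3,3]"];               % start and terminal states from the post

% Enumerate all cells as "[row,col]" strings, the naming used by GW.States.
[c, r] = meshgrid(1:nCols, 1:nRows);
allStates = "[" + r(:) + "," + c(:) + "]";

% Remove the protected cells from the pool of possible obstacle positions.
candidates = setdiff(allStates, protected);

% Hypothetical helper: sample nObs distinct cells without replacement.
sampleObstacles = @() candidates(randperm(numel(candidates), nObs));

obs = sampleObstacles();                      % one layout; call again for a new one
disp(obs)
```

To apply a layout, one would presumably assign `obs` to `GW.ObstacleStates` and call `updateStateTranstionForObstacles(GW)` again, as the posted code does once for the fixed layout. Whether this can be done safely from inside `ResetFcn`, and whether transitions blocked by an earlier layout must be restored first, is not confirmed here. Two further caveats: a random layout can wall off the target entirely, and a Q-table indexed only by the agent's position cannot tell one layout from another.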
Comments: 1
Francesco Rizzo on 23 May 2022
Did you manage to do it? I have the same problem.


Answers (1)

StevenKlein on 8 Aug 2022
Edited: StevenKlein on 8 Aug 2022
Same question here!
Release: R2021b
