Multiple goals in Grid World

Question

GCats, 15 February 2022

Direct link to this question:

https://kr.mathworks.com/matlabcentral/answers/1650440-multiple-goals-in-grid-world

Answered: Anshuman, 22 January 2024

Hello everyone,

Is it possible to train the agent to reach several goals in the same episode? I'm working with the standard GridWorld environment described here: https://nl.mathworks.com/help/reinforcement-learning/ref/creategridworld.html#d123e1207. I would like the agent to reach, for example, state [2,1], then [3,5], and finally [3,3], in that order.

If I try to modify the terminal states option as

GW.TerminalStates = ['[2,1]'; '[3,5]'; '[3,3]'];

The agent only reaches the one closest to the initial state, not all of them. Any hints?


Answer 1

Anshuman, 22 January 2024

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/1650440-multiple-goals-in-grid-world#answer_1394136

The "createGridWorld" function creates a simple grid environment in which the agent typically navigates to a single goal. By default, the episode terminates as soon as the agent reaches any of the specified terminal states.

To train an agent to reach multiple goals in sequence within the same episode, modifying the "TerminalStates" property alone won't suffice because, as you've observed, the episode ends once the agent reaches the first terminal state. Instead, you need to implement a custom reward function and modify the environment's step logic to handle sequential goals:

1. Don't set any terminal states initially, since the episode must continue after each goal is reached.
2. Define a custom reward function that gives a positive reward when the agent reaches the currently active goal, and then advances to the next goal in the sequence.
3. Extend the state representation to include the current goal (or the remaining sequence of goals), so the agent can tell which goal it is supposed to reach next.
4. Customize the step function of the environment to check whether the agent has reached the current goal. When it has, update the state to reflect the next goal in the sequence (but do not terminate the episode).
5. Terminate the episode only when all goals have been reached or a maximum number of steps is exceeded.
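
The steps above can be sketched in plain MATLAB as follows. This is only a minimal illustration under stated assumptions, not the toolbox's own implementation: the 5x5 grid size, the start cell [1,1], the reward values (+10 per goal, -1 per step), the action encoding, and the helper names resetEpisode, stepSequentialGoals and greedyAction are all hypothetical. The greedy rollout is there only to exercise the logic; a trained agent would choose the actions instead.

```matlab
% Sequential goals on a 5x5 grid (sketch; no toolbox required).
goals    = [2 1; 3 5; 3 3];   % goals from the question, in the required order
maxSteps = 100;               % assumed step limit per episode

state       = resetEpisode();
totalReward = 0;
isDone      = false;
while ~isDone
    % Scripted policy for demonstration: head straight for the active goal.
    action = greedyAction(state.Pos, goals(state.GoalIdx, :));
    [state, reward, isDone] = stepSequentialGoals(state, action, goals, maxSteps);
    totalReward = totalReward + reward;
end
fprintf('Reached %d goals in %d steps, total reward %d\n', ...
    state.GoalIdx - 1, state.StepCount, totalReward);

function state = resetEpisode()
    % The state carries the position AND the index of the active goal.
    state.Pos       = [1 1];  % assumed start cell [row col]
    state.GoalIdx   = 1;      % first goal is active
    state.StepCount = 0;
end

function [state, reward, isDone] = stepSequentialGoals(state, action, goals, maxSteps)
    % Assumed action encoding: 1 = N (row-1), 2 = S (row+1), 3 = E (col+1), 4 = W (col-1).
    moves    = [-1 0; 1 0; 0 1; 0 -1];
    gridSize = [5 5];
    newPos   = state.Pos + moves(action, :);
    state.Pos       = min(max(newPos, [1 1]), gridSize);  % clamp to the grid
    state.StepCount = state.StepCount + 1;

    reward = -1;  % small cost per step (assumed value)
    if isequal(state.Pos, goals(state.GoalIdx, :))
        reward = 10;                        % bonus only for the ACTIVE goal
        state.GoalIdx = state.GoalIdx + 1;  % advance to the next goal, do not terminate
    end

    % Terminate only when every goal was reached or the step limit is hit.
    allReached = state.GoalIdx > size(goals, 1);
    isDone     = allReached || state.StepCount >= maxSteps;
end

function action = greedyAction(pos, target)
    % Move along rows first, then along columns.
    d = target - pos;
    if d(1) < 0
        action = 1;
    elseif d(1) > 0
        action = 2;
    elseif d(2) > 0
        action = 3;
    else
        action = 4;
    end
end
```

Note that on the way to [3,5] the agent passes through [3,3] without being rewarded, because only the active goal counts; this is what enforces the order. To train with Reinforcement Learning Toolbox, logic along these lines could be placed in the step and reset functions passed to rlFunctionEnv, with an observation such as [row; col; goalIdx] so that the policy can condition on the active goal.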
