SARSA Reinforcement Learning

Version 1.0.0.0 (117 KB) by Bhartendu
Maze solving using SARSA, Reinforcement Learning
Downloads: 1.7K
Updated: 2017/5/24


Refer to Section 6.4 (Sarsa: On-Policy TD Control) of Reinforcement Learning: An Introduction, R. S. Sutton and A. G. Barto, MIT Press.
In this demo, two different mazes are solved with the reinforcement learning technique SARSA.
State-Action-Reward-State-Action (SARSA) is an algorithm for learning a policy for a Markov decision process, used in reinforcement learning.
SARSA update of the action-value function:

Q(S_t, A_t) := Q(S_t, A_t) + α * [ R_{t+1} + γ * Q(S_{t+1}, A_{t+1}) − Q(S_t, A_t) ]
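The update rule above can be sketched as a single function. This is a minimal Python illustration, not the code shipped with this submission (which is MATLAB); the function name `sarsa_update`, the nested-list Q-table, and the sample numbers are all assumptions made for the example.

```python
def sarsa_update(q, state, action, reward, next_state, next_action, alpha, gamma):
    """Apply one SARSA update to q in place and return the TD error."""
    # Target uses the action actually chosen in the next state (on-policy).
    td_target = reward + gamma * q[next_state][next_action]
    td_error = td_target - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


# Hypothetical Q-table: 2 states x 2 actions.
q = [[0.0, 0.0],
     [0.0, 1.0]]
error = sarsa_update(q, state=0, action=1, reward=-1.0,
                     next_state=1, next_action=1, alpha=0.5, gamma=0.9)
print("TD error:", error)
print("updated Q(0, 1):", q[0][1])
```

Here the target is -1 + 0.9 * 1 = -0.1, so Q(0, 1) moves halfway from 0 toward -0.1 (up to floating-point rounding).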

Learning rate (α)
The learning rate determines to what extent newly acquired information overrides old information. A value of 0 makes the agent learn nothing, while a value of 1 makes the agent consider only the most recent information.
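The two extremes can be checked numerically. The sketch below rewrites the update as a step from the old estimate toward the TD target; the helper name `blend` and the values 2.0 and 10.0 are made up for illustration.

```python
def blend(old_q, td_target, alpha):
    """Move old_q toward td_target by a fraction alpha of the gap."""
    return old_q + alpha * (td_target - old_q)


old_q, td_target = 2.0, 10.0  # hypothetical old estimate and new target
for alpha in (0.0, 0.5, 1.0):
    print(alpha, blend(old_q, td_target, alpha))
# prints:
# 0.0 2.0
# 0.5 6.0
# 1.0 10.0
```

With α = 0 the estimate never moves; with α = 1 it is replaced outright by the newest target.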

Discount factor (γ)
The discount factor determines the importance of future rewards. A value of 0 makes the agent myopic ("opportunistic"), considering only the immediate reward, while a value approaching 1 makes it strive for a high long-term reward. If the discount factor is 1 or greater, the Q-values may diverge.
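A small sketch of how γ weights a reward sequence. The function `discounted_return` and both reward sequences are hypothetical; the second loop shows why γ = 1 can be a problem when rewards keep arriving, since the sum then grows with the horizon instead of settling.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma**2 * r_2 + ... for a finite sequence."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# A reward of 10 that arrives only on the fourth step.
rewards = [0.0, 0.0, 0.0, 10.0]
for gamma in (0.0, 0.5, 1.0):
    print(gamma, discounted_return(rewards, gamma))
# prints:
# 0.0 0.0
# 0.5 1.25
# 1.0 10.0

# With gamma = 1 and a constant reward, the return grows without bound.
for n in (10, 100, 1000):
    print(n, discounted_return([1.0] * n, 1.0))
# prints:
# 10 10.0
# 100 100.0
# 1000 1000.0
```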

Note: Convergence was tested only on particular examples; in general, convergence is not guaranteed for this demo.
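To make the maze setting concrete, here is a minimal, self-contained SARSA sketch in Python. It is not the code from this submission: the 4x4 maze layout, the reward scheme (-1 per step, 0 on reaching the goal), the epsilon-greedy exploration, and every hyperparameter are assumptions chosen for illustration. Whether the greedy policy reaches the goal depends on those choices, which is why the final check returns `None` when no path is found.

```python
import random

# Hypothetical maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START = (0, 0)


def step(state, action):
    """Move if the target cell is free; return (next_state, reward, done)."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    inside = 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])
    if not inside or MAZE[nr][nc] == "#":
        nr, nc = r, c  # bumping into a wall or the border leaves the agent in place
    done = MAZE[nr][nc] == "G"
    return (nr, nc), (0.0 if done else -1.0), done


def choose(q, state, epsilon, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, max_steps=200, seed=0):
    """Run tabular SARSA and return the learned Q-table."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(len(MAZE)) for c in range(len(MAZE[0]))}
    for _ in range(episodes):
        state = START
        action = choose(q, state, epsilon, rng)
        for _ in range(max_steps):
            next_state, reward, done = step(state, action)
            next_action = choose(q, next_state, epsilon, rng)
            # The value of the terminal state is taken to be 0.
            future = 0.0 if done else gamma * q[next_state][next_action]
            q[state][action] += alpha * (reward + future - q[state][action])
            state, action = next_state, next_action
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START; return the path, or None if the goal is not reached."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            return path
    return None


q = train()
print(greedy_path(q))
```

The cap `max_steps` keeps early episodes from wandering indefinitely, and the fixed `seed` only makes runs repeatable; neither is part of SARSA itself.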

Cite As

Bhartendu (2026). SARSA Reinforcement Learning (https://kr.mathworks.com/matlabcentral/fileexchange/63089-sarsa-reinforcement-learning), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2016a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Find more on Labyrinth problems in Help Center and MATLAB Answers