Training and Simulation

Train and simulate reinforcement learning agents

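During training, an agent continuously updates its parameters to learn an optimal policy for a given environment. During simulation, an agent receives observations and rewards from the environment and returns actions to the environment without updating its parameters.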

Reinforcement Learning Toolbox™ provides functions to train agents and to validate the training results through simulation. For an introduction to training and simulating agents, see Train Reinforcement Learning Agents.
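The difference between training and simulation described above can be illustrated with a minimal, language-agnostic sketch. The Python code below is not the Reinforcement Learning Toolbox API: `ChainEnv`, `train`, `simulate`, and `greedy` are hypothetical names, the environment is a made-up five-state chain, and tabular Q-learning is assumed only for illustration. The point it shows is that the training loop writes to the agent parameters (here a Q-table), while the simulation loop only reads them.

```python
import random


class ChainEnv:
    """Hypothetical 1-D chain: states 0..4, start at 0, goal at state 4."""
    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the new state to the ends of the chain.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else -0.01  # small penalty for every non-goal step
        return self.state, reward, done


def greedy(q, state):
    """Return the action with the highest Q-value (ties go to the lowest index)."""
    row = q[state]
    return row.index(max(row))


def train(q, env, episodes=200, max_steps=50,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Training: act epsilon-greedily and UPDATE the Q-table after every step."""
    rng = random.Random(seed)
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(env.n_actions)  # explore
            else:
                a = greedy(q, s)                  # exploit
            s2, r, done = env.step(a)
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])  # parameter update
            s = s2
            if done:
                break
    return q


def simulate(q, env, max_steps=50):
    """Simulation: act greedily and NEVER modify the Q-table."""
    s = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        s, r, done = env.step(greedy(q, s))
        total_reward += r
        if done:
            break
    return total_reward
```

In this sketch, calling `simulate` before and after `train` on the same table leaves the table unchanged both times; only `train` alters it. This mirrors the role of simulation as a way to validate a training result without affecting the agent.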

Apps

Reinforcement Learning Designer

Design, train, and simulate reinforcement learning agents (since R2021a)

Functions

Train Agents

`train`	Train reinforcement learning agents within a specified environment
`rlTrainingOptions`	Options for training reinforcement learning agents
`rlMultiAgentTrainingOptions`	Options for training multiple reinforcement learning agents (since R2022a)
`trainWithEvolutionStrategy`	Train DDPG, TD3 or SAC agent using an evolutionary strategy within a specified environment (since R2023b)
`rlEvolutionStrategyTrainingOptions`	Options for training off-policy reinforcement learning agents using an evolutionary strategy (since R2023b)
`inspectTrainingResult`	Plot training information from a previous training session (since R2021a)

Train Agents Offline

`trainFromData`	Train off-policy reinforcement learning agent using existing data (since R2023a)
`rlTrainingFromDataOptions`	Options to train reinforcement learning agents using existing data (since R2023a)
`inspectTrainingResult`	Plot training information from a previous training session (since R2021a)

Evaluate Agents During Training

`rlEvaluator`	Options for evaluating reinforcement learning agents during training (since R2023b)
`rlCustomEvaluator`	Custom object for evaluating reinforcement learning agents during training (since R2023b)

Log Data

`rlDataLogger`	Create either a file logger object or a monitor logger object to log training data (since R2022b)
`rlDataViewer`	Open Reinforcement Learning Data Viewer tool (since R2023a)
`FileLogger`	Log reinforcement learning training data to MAT-files (since R2022b)
`MonitorLogger`	Log reinforcement learning training data to monitor window (since R2022b)
`trainingProgressMonitor`	Monitor and plot training progress for deep learning custom training loops (since R2022b)
`setup`	Set up reinforcement learning environment or initialize data logger object (since R2022a)
`store`	Store data in the internal memory of a (file or monitor) logger object (since R2022b)
`write`	Transfer stored data from the internal logger memory to the logging target (since R2022b)
`cleanup`	Clean up reinforcement learning environment or data logger object (since R2022a)

Simulate Agents

`sim`	Simulate trained reinforcement learning agents within specified environment
`rlSimulationOptions`	Options for simulating a reinforcement learning agent within an environment

Custom Training

`rlOptimizer`	Create an optimizer object for actors and critics (since R2022a)
`runEpisode`	Simulate reinforcement learning environment against policy or agent (since R2022a)
`setup`	Set up reinforcement learning environment or initialize data logger object (since R2022a)
`cleanup`	Clean up reinforcement learning environment or data logger object (since R2022a)
`Future`	Object that supports deferred outputs for reinforcement learning environment simulations running on workers (since R2022a)
`fetchNext`	Retrieve next available unread outputs from reinforcement learning environment simulations running on workers (since R2022a)
`fetchOutputs`	Retrieve results from all reinforcement learning environment simulations running on workers (since R2022a)
`cancel`	Cancel unfinished reinforcement learning environment simulations on workers (since R2022a)
`wait`	Wait for reinforcement learning environment simulations running on workers to finish (since R2022a)

Blocks

RL Agent	Reinforcement learning agent
Policy	Reinforcement learning policy (since R2022b)

Topics

Using the Reinforcement Learning Designer App

Design and Train Agent Using Reinforcement Learning Designer
Design and train a DQN agent for a cart-pole system using the Reinforcement Learning Designer app.
Specify Training Options in Reinforcement Learning Designer
Interactively specify options for training reinforcement learning agents using the Reinforcement Learning Designer app.
Specify Simulation Options in Reinforcement Learning Designer
Interactively specify options for simulating reinforcement learning agents using the Reinforcement Learning Designer app.

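Training and Simulation Basics

Train Reinforcement Learning Agents
Find the optimal policy by training your agent within a specified environment.
Train Reinforcement Learning Agent in Basic Grid World
Train Q-learning and SARSA agents to solve a grid world in MATLAB®.
Train Reinforcement Learning Agent in MDP Environment
Train a reinforcement learning agent in a generic Markov decision process environment.
Create Simulink Environment and Train Agent
Train a controller using reinforcement learning with a plant modeled in Simulink® as the training environment.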
Train Reinforcement Learning Agent for Simple Contextual Bandit Problem
Train Q and DQN agents to solve a contextual bandit problem.

Advanced Training and Simulation

Create DQN Agent Using Deep Network Designer and Train Using Image Observations
Create a reinforcement learning agent using the Deep Network Designer app from the Deep Learning Toolbox™.
Log Training Data to Disk
Log a variety of data to disk while training an agent.
Train Agent or Tune Environment Parameters Using Parameter Sweeping
Tune a DDPG agent using hyperparameter sweeping.
Train Reinforcement Learning Agent Offline to Control Quanser QUBE Pendulum
Train TD3 agent offline to control a Quanser QUBE pendulum.

Using Multiple Processes and GPUs

Train Agents Using Parallel Computing and GPUs
Accelerate agent training by running simulations in parallel on multiple cores, GPUs, clusters or cloud resources.
Train AC Agent to Balance Cart-Pole System Using Parallel Computing
Train an AC agent for a discrete action space environment using asynchronous parallel computing.
Train DQN Agent for Lane Keeping Assist Using Parallel Computing
Train a DQN agent for an automated driving application using parallel computing.

Multi-Agent Training

Train Multiple Agents to Perform Collaborative Task
Train two continuous action space PPO agents to collaboratively move an object.
Train Multiple Agents for Area Coverage
Train three discrete action space PPO agents to explore a grid-world environment in a collaborative-competitive manner.
Train Multiple Agents for Path Following Control
Train a DQN and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.

Developing Custom Agents and Training Algorithms

Train Reinforcement Learning Policy Using Custom Training Loop
Train a reinforcement learning policy using your own custom training loop.
Create and Train Custom PG Agent
Create a custom PG agent and train it using the built-in train function.
Create and Train Custom LQR Agent
Create a custom agent that solves an LQR problem and train it using the built-in train function.
Custom Training Loop with Simulink Action Noise
Use a custom training loop to train a continuous action space reinforcement learning policy in Simulink when action noise is generated within the model.

Training Model-Based Policy Optimization Agents

Train MBPO Agent to Balance Cart-Pole System
A model-based reinforcement learning agent learns a model of its environment that it can use to generate additional experiences for training.
Model-Based Reinforcement Learning Using Custom Training Loop
Create a model-based reinforcement learning agent using a custom training loop.