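Applications

Examples of how to apply reinforcement learning

Reinforcement learning can be applied to a wide variety of problems across different fields, such as control, robotics, scheduling, optimization, and finance. The following are a few examples.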

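Tutorials

Train Agents for Control Tasks

Control Water Level in a Tank Using a DDPG Agent
Train a controller using reinforcement learning, with a plant modeled in Simulink® as the training environment.
Tune PI Controller Using Reinforcement Learning
Tune the gains of a PI controller using a TD3 agent.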
Train SAC Agent for Ball Balance Control
Train a SAC agent to balance a ball on a flat surface using a robot arm.
Train Default TD3 Agent to Control Quanser QUBE Pendulum
Train a TD3 agent to balance the Quanser QUBE rotational inverted pendulum.
Train Reinforcement Learning Agent Offline to Control Quanser QUBE Pendulum
Train a TD3 agent offline to control a Quanser QUBE pendulum.
Train TD3 Agent for PMSM Control
Train a TD3 agent to control the currents in a permanent magnet synchronous motor.
Field-Oriented Control of PMSM Using Reinforcement Learning (Motor Control Blockset)
This example shows you how to use the control design method of reinforcement learning to implement field-oriented control (FOC) of a permanent magnet synchronous motor (PMSM).
Train DQN Agent with LSTM Network to Control House Heating System
Train a DQN agent with a recurrent network to control the temperature of a house.
Train Reinforcement Learning Agent with Constraint Enforcement (Simulink Control Design)
Train a reinforcement learning agent with actions constrained using the Constraint Enforcement block.
Create and Train Custom LQR Agent
Create a custom agent that solves an LQR problem and train it using the built-in train function.

Train Agents to Control Robots

Train DDPG Agent to Control Two-Thruster Sliding Vehicle
Train a DDPG agent to control a robot sliding over a frictionless 2-D plane.
Train Default PPO Agent for Discrete Lander Vehicle
Train a default PPO agent to land a discrete action space flying vehicle.
Train Soft Actor Critic Agent with Custom Networks for Discrete Lander Vehicle
Train a SAC agent to land a discrete action space flying vehicle.
Train Biped Robot to Walk Using Reinforcement Learning Agents
Compare DDPG and TD3 agents for the control of a biped walking robot modeled in Simscape™ Multibody™.
Add Safety Constraint to Simulate Two-Link Robot with SAC Agent
Add a high-order barrier function to safely simulate a two-link robot model with a SAC agent.
Train Biped Robot to Walk Using Evolution Strategy-Reinforcement Learning Agents
Train a TD3 agent using an evolution strategy.
Quadruped Robot Locomotion Using DDPG Agent
Train a DDPG agent to control a quadruped robot modeled in Simscape Multibody.

Generate Rewards from Control Specifications

Generate Reward Function from a Model Predictive Controller for a Servomotor
Generate a reward function from an MPC controller applied to a servomotor and use it to train a TD3 agent.
Generate Reward Function from a Model Verification Block for a Water Tank System
Generate a reward function from a model verification block applied to a water tank system and use it to train a TD3 agent.

Imitation Learning

Imitate MPC Controller for Lane Keeping Assist
Train a deep neural network to imitate the behavior of a model predictive controller within a lane keeping assist system.
Imitate Nonlinear MPC Controller for Sliding Robot
Train a deep neural network to imitate the behavior of a nonlinear model predictive controller for a robot sliding on a 2-D frictionless plane.
Train DDPG Agent with Pretrained Actor Network
Train a DDPG agent using an actor network that has been previously trained using supervised learning.

Train Agents for Automotive Applications

Train DQN Agent for Lane Keeping Assist
Train a DQN agent for a lane keeping assist application.
Train PPO Agent with Curriculum Learning for a Lane Keeping Application
Train a PPO agent for a lane keeping assist task by gradually increasing task complexity.
Train DDPG Agent for Adaptive Cruise Control
Train a DDPG agent for an adaptive cruise control application.
Train DDPG Agent for Path-Following Control
Train a DDPG agent for a lane following application.
Train Multiple Agents for Path Following Control
Train a DQN and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.
Train Hybrid SAC Agent for Path-Following Control
Train a hybrid SAC agent for lane following control.
Train Hybrid-Action PPO Agent for Path-Following Control
Train a hybrid PPO agent for lane following control.
Train PPO Agent for Automatic Parking Valet
Train a discrete action space PPO agent to park a car in an open parking space.

Contextual Bandit Problems

Train Reinforcement Learning Agent for Simple Contextual Bandit Problem
Train Q and DQN agents to solve a contextual bandit problem.
Why Solving Regression Using Reinforcement Learning is Not Recommended
Using a reinforcement learning agent to solve a regression problem is possible but not recommended.
Why Solving Classification Using Reinforcement Learning Is Not Recommended
Using a reinforcement learning agent to solve a classification problem is possible but not recommended.

Other Applications

Train Agent to Play Turn-Based Game
Train a DQN agent to play a turn-based game.
Deep Reinforcement Learning for Optimal Trade Execution
This example shows how to use the Reinforcement Learning Toolbox™ and Deep Learning Toolbox™ to design agents for optimal trade execution.
Multiperiod Goal-Based Wealth Management Using Reinforcement Learning
This example shows a reinforcement learning (RL) approach to maximize the probability of obtaining an investor's wealth goal at the end of the investment horizon.
Train DQN Agent for Beam Selection (5G Toolbox)
Train a deep Q-network (DQN) reinforcement learning agent for beam selection in a 5G new radio communications system. (Since R2022b)
Water Distribution System Scheduling Using Reinforcement Learning
Train a DQN agent to optimally activate pumps in a water distribution system.

Featured Examples

Identify Vulnerabilities in DC Microgrids

Optimizing Queue Selection Strategies Using Reinforcement Learning

Automatic Parking Valet with Unreal Engine Simulation

Quadruped Robot Locomotion Using DDPG Agent