Hi,
I am wondering whether it is possible to have time-varying (non-stationary) policy functions in the Reinforcement Learning Toolbox.
For example, say my episode lasts three periods (t=1,2,3). Then I would have a set of three policies, one per period, where each policy is some neural network structure indexed by a general vector of parameters ϑ, which will ultimately depend on the time period.
Is that possible to do with the toolbox?
Thank you so much!

Answers (1)

Emmanouil Tzorakoleftherakis on 25 May 2023

Why don't you just train 3 separate policies and pick and choose as needed?

Comments: 4

Matheus Silva on 25 May 2023
Edited: Matheus Silva on 25 May 2023
Thank you for your answer! My application has stochastic terms that may not be independent of past values, so the second period carries information about the first, and I do not want to lose that in the solution.
I could be misunderstanding, but assuming the first period has no dependencies, you train that policy first. Then you use the trained policy to train your second-period policy, and so on.
Matheus Silva on 28 May 2023
Edited: Matheus Silva on 28 May 2023
My problem is that my periods can be related in some arbitrary way. For example, I am thinking of a model where the state evolves according to some transition function that also depends on a stochastic term. However, I may want to allow some relation between the stochastic terms in periods 1 and 3. Solving the problem period by period would eliminate that dependence, no?
Honestly, I think your best bet would be to use the same policy throughout, but maybe use an input signal to the neural net to indicate which period you are in based on your state.
Another option, which is similar to what I mentioned earlier, is to train 3 different policies. To work around the period dependencies, you can place the RL policy block inside a triggered subsystem and only enable the subsystem for training when the system is in the appropriate period. Do that for each policy and then you can switch between the 3 as needed. See here
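The first suggestion above (one shared policy that receives the current period as an extra input) can be illustrated with a minimal, language-agnostic sketch. It is written in plain Python and is not Reinforcement Learning Toolbox code; the function name `augment_observation`, the constant `NUM_PERIODS`, and the two-element example state are all hypothetical. The idea is simply to append a one-hot period indicator to the state before it reaches the network, so that a single set of weights can produce period-specific behavior.

```python
NUM_PERIODS = 3  # episode length from the question (t = 1, 2, 3)


def augment_observation(state, period, num_periods=NUM_PERIODS):
    """Return the state with a one-hot period indicator appended.

    `state` is any sequence of numbers; `period` is 1-based.
    """
    if not 1 <= period <= num_periods:
        raise ValueError(f"period must be in 1..{num_periods}, got {period}")
    one_hot = [0.0] * num_periods
    one_hot[period - 1] = 1.0  # mark the active period
    return [float(x) for x in state] + one_hot


# Hypothetical 2-dimensional state, shown as it would look in each period.
state = [0.5, -1.2]
observations = [augment_observation(state, t) for t in (1, 2, 3)]
for t, obs in enumerate(observations, start=1):
    print(t, obs)
```

With this encoding the network input grows by three elements, so the observation definition given to the agent would need to be enlarged accordingly. Because the same policy sees all three periods within one episode, dependence between the stochastic terms of different periods is not discarded during training.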

Release

R2022b

Asked:

24 May 2023
