REINFORCE algorithm - unable to compute gradients on latest toolbox version
The LSTM actor network takes 50 time steps of three states as input, so each observation has dimension 3x50.
For computing gradients, the input data is reshaped into the following format:
num_states x batchsize x N_TIMESTEPS = (3x1) x 50 x 50.
In Reinforcement Learning Toolbox version 1.3, the following line works perfectly.
% actor - the custom actor network, actorLossFunction - custom loss function, lossData - custom variable
actorGradient = gradient(actor,@actorLossFunction,{reshape(observationBatch,[3 1 50 50])},lossData);
However, when I run the same code in the latest RL Toolbox version 2.2, I get the following error:
------------------------------------------------------------------------------------------------------------------------------------------------------
Error using rl.representation.rlAbstractRepresentation/gradient
Unable to compute gradient from representation.
Error in simpleRLTraj (line 184)
actorGradient= gradient(actor,@actorLossFunction,{reshape(observationBatch,[3 1 50 50])},lossData);
Caused by:
Error using extractBinaryBroadcastData
dlarray is supported only for full arrays of data type double, single, or logical, or for full gpuArrays of
these data types.
------------------------------------------------------------------------------------------------------------------------------------------------------
I tried tracing the error back, but it only gets more complicated. Why do I get an error for code that works perfectly on the earlier version of the RL Toolbox?
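For reference, here is my understanding of the restriction the error refers to (a minimal sketch, assuming standard dlarray behaviour; this is not code from my script):
% dlarray only wraps full double, single, or logical data
x = rand(3, 1, 50, 50);            % double - accepted by dlarray
dlx1 = dlarray(single(x));         % single - accepted
dlx2 = dlarray(logical(x > 0.5));  % logical - accepted
% dlarray(int32(round(255*x)))     % an integer (or other unsupported) underlying type
%                                  % throws a "dlarray is supported only for full arrays
%                                  % of data type double, single, or logical" error
So it seems something in observationBatch or lossData ends up with an underlying type outside double/single/logical once version 2.2 wraps the data in dlarray.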
Accepted Answer
Joss Knight on 5 Apr 2022
Edited: Joss Knight on 5 Apr 2022
What is
underlyingType(observationBatch)
underlyingType(lossData)
?
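If either of those calls returns something other than 'double', 'single', or 'logical', one possible follow-up (an assumption, not confirmed in this thread) is to cast the observation batch to a supported type before building the gradient input:
% Hypothetical fix, assuming observationBatch is numeric but not double/single/logical
observationBatch = single(observationBatch);  % cast to a dlarray-supported type
actorGradient = gradient(actor, @actorLossFunction, ...
    {reshape(observationBatch, [3 1 50 50])}, lossData);  % same call as in the question
Any numeric data stored in lossData that the loss function combines with dlarray values may need the same treatment.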