Using dlarray with betarnd/randg

5 views (last 30 days)
Jack Hunt on 20 Jun 2024
Commented: Jack Hunt on 22 Jun 2024
I am writing a custom layer with the Deep Learning Toolbox, and part of the layer's forward pass draws samples from a beta distribution whose b parameter is to be optimised as part of the network training. However, I am having difficulty using betarnd (and, by extension, randg) with a dlarray-valued parameter.
Consider the following, which works as expected.
>> betarnd(1, 0.1)
ans =
0.2678
However, if I instead do the following, then it does not work.
>> b = dlarray(0.1)
b =
1×1 dlarray
0.1000
>> betarnd(1, b)
Error using randg
SHAPE must be a full real double or single array.
Error in betarnd (line 34)
g2 = randg(b,sizeOut); % could be Infs or NaNs
Is it not possible to use such functions with parameters to be optimised via automatic differentiation (hence dlarray)?
Many thanks
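(For reference, an illustrative aside that is not part of the original post: the usual way to silence this particular error is to unwrap the parameter with extractdata, but that also removes b from the automatic-differentiation trace, so no gradient would flow back to it. A minimal sketch:)
b = dlarray(0.1);
bFull = extractdata(b);   % plain double, so betarnd/randg accept it
z = betarnd(1, bFull);    % runs, but z carries no gradient information w.r.t. b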

Accepted Answer

Matt J on 20 Jun 2024
Edited: Matt J on 20 Jun 2024
Random number generation operations do not have derivatives in the standard sense. You will have to define some approximate derivative for yourself by implementing a backward() method.
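As a rough sketch of where such a backward() method would live, here is the skeleton of a custom layer (the class name, the Beta(1, b) forward computation via betainv, and the finite-difference step are illustrative assumptions rather than a tested implementation; the forward/backward signatures follow the custom-layer template in the toolbox documentation):
classdef betaSampleLayer < nnet.layer.Layer   % hypothetical layer name
    properties (Learnable)
        B   % beta shape parameter to be optimised
    end
    methods
        function layer = betaSampleLayer(name)
            layer.Name = name;
            layer.B = 0.1;
        end
        function [Z, memory] = forward(layer, X)
            % Sample with plain numerics (betarnd/betainv cannot take dlarray)
            % and save the uniform draws that backward() will need.
            b = layer.B;
            if isa(b, 'dlarray'), b = extractdata(b); end
            U = rand(size(X));                     % saved for the backward pass
            Z = X + betainv(U, 1, b);              % illustrative forward computation
            memory = U;
        end
        function [dLdX, dLdB] = backward(layer, X, Z, dLdZ, memory)
            % Differentiate the sample w.r.t. B with the saved uniforms held fixed,
            % here via a crude finite difference on the inverse beta CDF.
            U = memory;
            b = layer.B;
            if isa(b, 'dlarray'), b = extractdata(b); end
            delta = 1e-6;
            dZdB = (betainv(U, 1, b + delta) - betainv(U, 1, b)) / delta;
            dLdB = sum(dLdZ .* dZdB, 'all');
            dLdX = dLdZ;                           % Z = X + sample, so dZ/dX = 1
        end
    end
end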
2 Comments
Matt J on 20 Jun 2024
Edited: Matt J on 22 Jun 2024
You will have to define some approximate derivative for yourself by implementing a backward() method.
One candidate would be to reparametrize the beta distribution in terms of uniform random variables, U1 and U2, which you would save during forward propagation:
function [Z, U1, U2] = forward_pass(alpha, beta)
% Generate uniform random variables
U1 = rand();
U2 = rand();
% Generate Gamma(alpha, 1) and Gamma(beta, 1) using the inverse CDF (ppf)
X = gaminv(U1, alpha, 1);
Y = gaminv(U2, beta, 1);
% Combine to get Beta(alpha, beta)
Z = X / (X + Y);
end
During back propagation, your backward() method would differentiate non-stochastically with respect to alpha and beta, using the saved U1 and U2 data as fixed, given values:
function [dZ_dalpha, dZ_dbeta] = backward_pass(alpha, beta, U1, U2, grad_gaminv)
% Differentiate gaminv with respect to the shape parameter alpha and beta
dX_dalpha = grad_gaminv(U1, alpha);
dY_dbeta = grad_gaminv(U2, beta);
% Compute partial derivatives of Z with respect to X and Y
X = gaminv(U1, alpha, 1);
Y = gaminv(U2, beta, 1);
dZ_dX = Y / (X + Y)^2;
dZ_dY = -X / (X + Y)^2;
% Use the chain rule to compute gradients with respect to alpha and beta
dZ_dalpha = dZ_dX * dX_dalpha;
dZ_dbeta = dZ_dY * dY_dbeta;
end
This assumes you have provided a function grad_gaminv() which can differentiate gaminv(), e.g.,
function grad = grad_gaminv(U, shape)
% Placeholder for the actual derivative computation of gaminv with respect to the shape parameter
% Here we use a numerical approximation for demonstration
delta = 1e-6;
grad = (gaminv(U, shape + delta, 1) - gaminv(U, shape, 1)) / delta;
end
DISCLAIMER: All code above was ChatGPT-generated.
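For completeness, a hypothetical call sequence tying the three sketches above together (names as defined there):
alpha = 1; beta = 0.1;
[Z, U1, U2] = forward_pass(alpha, beta);                                  % draw sample, keep uniforms
[dZ_dalpha, dZ_dbeta] = backward_pass(alpha, beta, U1, U2, @grad_gaminv); % gradients at those uniforms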
Jack Hunt on 22 Jun 2024
I see, so I do indeed need to use a closed-form gradient. I had naively assumed that the autodiff engine would treat the stochastic (rng) quantities as non-stochastic and essentially do what you have described above.
Thank you for the answer. I shall work through the maths (re-derive the derivatives) and implement the backward pass manually. I have been spoiled by autodiff for the last decade or so; it's been some time since I explicitly wrote a backward pass!


More Answers (0)
