Gradient of loss for variational autoencoder?

Question

LPep on 17 Nov 2022

Direct link to this question:

https://kr.mathworks.com/matlabcentral/answers/1854228-gradient-of-loss-for-variational-autoencoder

Edited: Richard on 25 Nov 2022

Hi, I have the following code for a variational autoencoder. My data is sequence data, not images: 'Train' consists of ~5,000 univariate sequences, each around 400 observations long. When I run the code below, 'genGrad' comes back as all zeros (not NaNs), and the loss stays at the same value across epochs. I'm very unfamiliar with deep learning in MATLAB and not sure where I've gone wrong.

inputsize = height(Train);

R = 2;

numLatentChannels = 2;

layersE1 = layerGraph([

sequenceInputLayer(inputsize,"Name","input",'Normalization','none')

fullyConnectedLayer(150*R,"Name","fc_1") %R can be any number/ factor

leakyReluLayer(0.01,"Name","leakyrelu_1")

fullyConnectedLayer(100*R,"Name","fc_2")

leakyReluLayer(0.01,"Name","leakyrelu_2")

fullyConnectedLayer(50*R,"Name","fc_3")

leakyReluLayer(0.01,"Name","leakyrelu_3")

fullyConnectedLayer(25*R,"Name","fc_4")

leakyReluLayer(0.01,"Name","leakyrelu_4")

fullyConnectedLayer(10*R,"Name","fc_5")

leakyReluLayer(0.01,"Name","leakyrelu_5")

fullyConnectedLayer(5*R,"Name","fc_6")

leakyReluLayer(0.01,"Name","leakyrelu_6")

fullyConnectedLayer(2*numLatentChannels)

]);

%% Decoder

numInputChannels = size(Train,1);

outputsize = height(Train);

layersD = layerGraph([

sequenceInputLayer(numLatentChannels,"Name","Dinput")

fullyConnectedLayer(5*R,"Name","fc_ou2")

leakyReluLayer(0.01,"Name","leakyrelu_ou2")

fullyConnectedLayer(10*R,"Name","fc_ou3")

leakyReluLayer(0.01,"Name","leakyrelu_ou3")

fullyConnectedLayer(25*R,"Name","fc_ou4")

leakyReluLayer(0.01,"Name","leakyrelu_ou4")

fullyConnectedLayer(50*R,"Name","fc_ou5")

leakyReluLayer(0.01,"Name","leakyrelu_ou5")

fullyConnectedLayer(100*R,"Name","fc_ou6")

leakyReluLayer(0.01,"Name","leakyrelu_ou6")

fullyConnectedLayer(150*R,"Name","fc_ou7")

leakyReluLayer(0.01,"Name","leakyrelu_ou7")

fullyConnectedLayer(outputsize,"Name","fc_16")

]);

%% create networks from layers

encoderNet1 = dlnetwork(layersE1);

decoderNet = dlnetwork(layersD);

%%

miniBatchSize = 64;

numTrainSeq = width(Train);

%Set training options

executionEnvironment = "auto"; % set execution environment

dsTrain = arrayDatastore(Train,IterationDimension=2);

numOutputs = 1;

mbq = minibatchqueue(dsTrain,numOutputs, ...

MiniBatchSize = miniBatchSize, ...

MiniBatchFormat="CT",...

MiniBatchFcn=@preprocessMiniBatch, ...

PartialMiniBatch="discard");

numEpochs = 50; % Num of epochs

lr = 1e-4; % Learning rate

numIterationsperEpoch = ceil(numTrainSeq/miniBatchSize); % Num of Iteration per epoch

numIterations = numEpochs * numIterationsperEpoch;

avgGradientsEncoder = [];

avgGradientsSquaredEncoder = [];

avgGradientsDecoder = [];

avgGradientsSquaredDecoder = [];

monitor = trainingProgressMonitor( ...

Metrics="Loss", ...

Info="Epoch", ...

XLabel="Iteration");

epoch = 0;

iteration = 0;

%Train the model

while epoch < numEpochs && ~monitor.Stop

epoch = epoch + 1

shuffle(mbq);

while hasdata(mbq) && ~monitor.Stop

iteration = iteration + 1

XBatch = next(mbq);

if (executionEnvironment == "auto" && canUseGPU) || executionEnvironment == "gpu"

XBatch = gpuArray(XBatch);

end

compressed = forward(encoderNet1, XBatch);

d = size(compressed,1)/2;

zMean = compressed(1:d,:);

zLogvar = compressed(1+d:end,:);

sz = size(zMean);

epsilon = randn(sz);

sigma = exp(.5 * zLogvar);

z = epsilon .* sigma + zMean;

z = reshape(z, [sz]);

zSampled = dlarray(z, 'CT');

% calculate gradient of loss

[infGrad, genGrad] = dlfeval(@modelGradients1, encoderNet1, decoderNet, XBatch, zSampled,zMean,zLogvar);

% update parameters of Encoder/Decoder

[decoderNet.Learnables, avgGradientsDecoder, avgGradientsSquaredDecoder] = ...

adamupdate(decoderNet.Learnables, ...

genGrad, avgGradientsDecoder, avgGradientsSquaredDecoder, iteration, lr);

[encoderNet1.Learnables, avgGradientsEncoder, avgGradientsSquaredEncoder] = ...

adamupdate(encoderNet1.Learnables, ...

infGrad, avgGradientsEncoder, avgGradientsSquaredEncoder, iteration, lr);

end

% Update the training progress monitor.

recordMetrics(monitor,iteration,Loss=loss);

updateInfo(monitor,Epoch=epoch + " of " + numEpochs);

monitor.Progress = 100*iteration/numIterations;

end

function [infGrad, genGrad] = modelGradients1(encoderNet1, decoderNet, XBatch, zSampled,zMean,zLogvar)

xPred = forward(decoderNet, zSampled);

xPred = dlarray(xPred, 'CT');

loss = elboLoss(XBatch, xPred, zMean, zLogvar);

[genGrad, infGrad] = dlgradient(loss, decoderNet.Learnables, ...

encoderNet1.Learnables);

end

function elbo = elboLoss(x,xPred,zMean,zLogvar)

reconstructionLoss = mse(x,xPred); % Reconstruction loss.

KL = -0.5 * sum(1 + zLogvar - zMean.^2 - exp(zLogvar),1); % KL divergence.

KL = mean(KL);

elbo = reconstructionLoss + KL; % Combined loss.

end
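
For reference, the sampling step and elboLoss in the code above appear to correspond to the usual reparameterization and KL term for a diagonal Gaussian posterior. This is a reading of the code rather than something stated in the thread. The symbols are assumptions chosen for illustration: \mu stands for zMean, \log\sigma^2 for zLogvar, d for numLatentChannels, \hat{x} for xPred, and mse for whatever MATLAB's mse function returns for the given inputs.

```latex
\begin{aligned}
\sigma &= \exp\left(\tfrac{1}{2}\log\sigma^{2}\right), \qquad
z = \mu + \sigma \odot \varepsilon, \qquad
\varepsilon \sim \mathcal{N}(0, I) \\
\mathrm{KL} &= -\tfrac{1}{2}\sum_{j=1}^{d}\left(1 + \log\sigma_{j}^{2} - \mu_{j}^{2} - \sigma_{j}^{2}\right) \\
\mathcal{L} &= \mathrm{mse}(x, \hat{x}) + \operatorname{mean}(\mathrm{KL})
\end{aligned}
```

Both \mu and \log\sigma^2 are encoder outputs, so the loss depends on the encoder parameters through z and through the KL term.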


Answer 1

Richard on 18 Nov 2022

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/1854228-gradient-of-loss-for-variational-autoencoder#answer_1103798

Edited: Richard on 25 Nov 2022

Zero gradients are normally caused by the computation between the inputs and the output loss not being traced. When dlgradient cannot see that the loss has a dependency on an input, it always assigns zero gradients for that input. Only computations that are inside the function that is passed to dlfeval are traced.
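
A minimal sketch of that behavior, assuming Deep Learning Toolbox is available. The variable names (w, hOutside, gradUntraced, gradTraced) are made up for illustration and do not come from the question. In the first call the dependence on w is computed before dlfeval runs, so, per the explanation above, the gradient with respect to w is expected to be zero. In the second call the same computation happens inside the function passed to dlfeval.

```matlab
% Requires Deep Learning Toolbox. All names here are illustrative.
w = dlarray(3);

% Case 1: h is computed from w before dlfeval is called, so that step is
% not traced and the loss looks independent of w.
hOutside = 2 * w;
gradUntraced = dlfeval(@(w, h) dlgradient(h.^2, w), w, hOutside);

% Case 2: the same computation runs inside the function passed to dlfeval,
% so the dependence on w is traced. Analytically, d/dw (2w)^2 = 8w.
gradTraced = dlfeval(@(w) dlgradient((2 * w).^2, w), w);

disp(extractdata(gradUntraced))
disp(extractdata(gradTraced))
```

Comparing the two displayed values is a quick way to check whether a given step in your own code is being traced.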

In this case, the code that computes zSampled, including the forward pass through the encoder, runs outside dlfeval. Try moving that code inside the modelGradients1 function.
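
One way the suggestion might look, as a sketch rather than the exact code the asker ended up with. It reuses the asker's own sampling lines unchanged and relies on the elboLoss function from the question. The changed signature (three inputs, with loss added as the first output so it can be logged) is an assumption made for illustration.

```matlab
% Sketch only: assumes encoderNet1, decoderNet and elboLoss are defined as
% in the question. Hypothetical call site inside the training loop:
%   [loss, infGrad, genGrad] = dlfeval(@modelGradients1, ...
%       encoderNet1, decoderNet, XBatch);
function [loss, infGrad, genGrad] = modelGradients1(encoderNet1, decoderNet, XBatch)

    % Encoder forward pass now runs inside the function passed to dlfeval,
    % so it is part of the traced computation.
    compressed = forward(encoderNet1, XBatch);

    % Split the encoder output into mean and log-variance halves.
    d = size(compressed,1)/2;
    zMean = compressed(1:d,:);
    zLogvar = compressed(1+d:end,:);

    % Reparameterization step, copied from the question's training loop.
    sz = size(zMean);
    epsilon = randn(sz);
    sigma = exp(.5 * zLogvar);
    z = epsilon .* sigma + zMean;
    z = reshape(z, [sz]);
    zSampled = dlarray(z, 'CT');

    % Decoder forward pass and loss, as in the original modelGradients1.
    xPred = forward(decoderNet, zSampled);
    xPred = dlarray(xPred, 'CT');
    loss = elboLoss(XBatch, xPred, zMean, zLogvar);

    % Gradients with respect to both sets of learnable parameters.
    [genGrad, infGrad] = dlgradient(loss, decoderNet.Learnables, ...
        encoderNet1.Learnables);
end
```

With this layout the training loop only fetches XBatch and calls dlfeval. The sampling lines between the forward call and the dlfeval call in the original loop would no longer be needed there.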

Comments: 1

LPep on 18 Nov 2022

That did it, thanks v much!
