Weights and Biases Not Updating in Custom MATLAB dlnetwork Training Loop

Views: 17 (last 30 days)
SYED on 30 Jun 2024 at 7:58
Edited: Ruth on 9 Jul 2024 at 8:59
Hello MATLAB Community,
I am currently working on training a custom autoencoder network using MATLAB's dlnetwork framework. Despite setting up a manual training loop with gradient computation and parameter updates using adamupdate, I've observed that the weights and biases of the network do not change between iterations. Additionally, all biases remain zero throughout training. I am using MATLAB R2024a. Here are the relevant parts of my code:
trailingAvgG = [];
trailingAvgSqG = [];
trailingAvgR = [];
trailingAvgSqR = [];
miniBatchSize = 600;
learnRate = 0.01;
layers = [
    sequenceInputLayer(1,MinLength = 2048)
    modwtLayer('Level',5,'IncludeLowpass',false,'SelectedLevels',2:5,"Wavelet","sym2")
    flattenLayer
    convolution1dLayer(128,8,Padding="same",Stride=8)
    batchNormalizationLayer()
    tanhLayer
    maxPooling1dLayer(2,Padding="same")
    convolution1dLayer(32,8,Padding="same",Stride=4)
    batchNormalizationLayer
    tanhLayer
    maxPooling1dLayer(2,Padding="same")
    transposedConv1dLayer(32,8,Cropping="same",Stride=4)
    tanhLayer
    transposedConv1dLayer(128,8,Cropping="same",Stride=8)
    tanhLayer
    bilstmLayer(8)
    fullyConnectedLayer(8)
    dropoutLayer(0.2)
    fullyConnectedLayer(4)
    dropoutLayer(0.2)
    fullyConnectedLayer(1)];
net = dlnetwork(layers);
numEpochs = 200;
%dataMat = 1x2048x22275
dldata = arrayDatastore(dataMat,IterationDimension=3);
mbq = minibatchqueue(dldata,...
    MiniBatchSize=miniBatchSize, ...
    OutputEnvironment="cpu");
iteration = 0;
for epoch = 1:numEpochs
    shuffle(mbq);
    while hasdata(mbq)
        iteration = iteration+1;
        [XTrain] = next(mbq);
        XTrain = dlarray(XTrain,"TBC"); % 1(C)x600(B)x2048(T)
        [datafromRNN,lossR] = RNN_model(XTrain,net);
        [gradientsR] = dlfeval(@gradientFunction,mean(lossR), net);
        [net,trailingAvgR,trailingAvgSqR] = adamupdate(net,gradientsR, ...
            trailingAvgR,trailingAvgSqR,iteration,learnRate);
        disp(['Iteration ', num2str(iteration), ', Loss: ', num2str(extractdata(lossR))]);
    end
end

function [gradientsR] = gradientFunction(lossR, net)
    gradientsR = dlgradient(lossR, net.Learnables);
end

function [datafromRNN,loss] = RNN_model(data,net)
    z = data;
    [coder, last] = forward(net, z, 'Outputs', {'maxpool1d_2', 'fc_3'});
    loss = mse(last,z);
end
Questions:
  1. Why are the weights and biases not updating, and why do the biases remain zero?
  2. How can I ensure that the gradients computed are correct and being applied effectively?
  3. Are there any specific settings or modifications I should consider to resolve this issue?
Any insights or suggestions would be greatly appreciated!

Answers (2)

Umar on 30 Jun 2024 at 14:29

Hi Syed,

To address the problem of weights and biases not updating during training, we need to ensure that the gradients are computed accurately and that the parameter updates are applied correctly. Let's make the necessary adjustments to the code:

% Initialize Adam optimizer parameters
trailingAvgG = [];
trailingAvgSqG = [];
trailingAvgR = [];
trailingAvgSqR = [];

% Define learning parameters
miniBatchSize = 600;
learnRate = 0.01;

% Define the neural network layers
layers = [
    % Your network layers here
    ];

net = dlnetwork(layers);
numEpochs = 200;

% Assuming 'dataMat' is your input data
dldata = arrayDatastore(dataMat, 'IterationDimension', 3);
mbq = minibatchqueue(dldata, 'MiniBatchSize', miniBatchSize, 'OutputEnvironment', 'cpu');

iteration = 0;
for epoch = 1:numEpochs
    shuffle(mbq);

    while hasdata(mbq)
        iteration = iteration + 1;
        [XTrain] = next(mbq);
        XTrain = dlarray(XTrain, 'TBC');
        % Call the RNN model function
        [datafromRNN, lossR] = RNN_model(XTrain, net);
        % Compute gradients and update parameters
        [gradientsR] = dlfeval(@gradientFunction, mean(lossR), net);
        [net, trailingAvgR, trailingAvgSqR] = adamupdate(net, gradientsR, trailingAvgR, trailingAvgSqR, iteration, learnRate);
        disp(['Iteration ', num2str(iteration), ', Loss: ', num2str(extractdata(lossR))]);
    end
end

function [gradientsR] = gradientFunction(lossR, net)
    gradientsR = dlgradient(lossR, net.Learnables);
end

function [datafromRNN, loss] = RNN_model(data, net)
    z = data;
    [coder, last] = forward(net, z, 'Outputs', {'maxpool1d_2', 'fc_3'});
    loss = mse(last, z);
end

By ensuring that the gradients are correctly computed and the Adam optimizer updates the parameters accordingly, the weights and biases of the network should now change between iterations, leading to effective training progress.

Now let’s answer your questions.

Why are the weights and biases not updating, and why do the biases remain zero?

If biases are not updating and remain zero, it could indicate a problem with the initialization or the learning rate being too low. Ensure that biases are initialized correctly, preferably with small random values to break symmetry. Additionally, consider adjusting the learning rate to a more suitable value that allows biases to update effectively.
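As a quick sanity check (a minimal sketch, not part of the original code), you can snapshot one entry of net.Learnables before the Adam update and compare it afterwards; if the difference is exactly zero, the gradients reaching adamupdate are zero or empty:

% Illustrative check: compare one learnable before and after the update.
before = extractdata(net.Learnables.Value{1});   % e.g. first conv layer weights

[net,trailingAvgR,trailingAvgSqR] = adamupdate(net,gradientsR, ...
    trailingAvgR,trailingAvgSqR,iteration,learnRate);

after = extractdata(net.Learnables.Value{1});
fprintf('Max parameter change this iteration: %g\n', max(abs(after(:) - before(:))));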

How can I ensure that the gradients computed are correct and being applied effectively?

To verify the correctness of computed gradients and their effective application, you can employ various techniques:

  1. Gradient Checking: Implement numerical gradient checking to compare computed gradients with numerical approximations. Discrepancies may indicate issues in gradient computation.
  2. Visualizing Gradients: Plot and analyze the gradients to ensure they follow expected patterns and magnitudes (a sketch follows this list).
  3. Debugging Gradient Functions: Review the gradient computation function (gradientFunction) to ensure it correctly calculates gradients with respect to the loss.
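For example, one simple way to inspect the gradients (a rough sketch, assuming gradientsR is the table returned by dlgradient over net.Learnables) is to print the largest absolute value per parameter; an all-zero listing usually means the forward pass was not traced by automatic differentiation:

% Illustrative check: print the magnitude of each computed gradient.
for k = 1:height(gradientsR)
    g = extractdata(gradientsR.Value{k});
    fprintf('%s / %s : max |grad| = %g\n', ...
        gradientsR.Layer(k), gradientsR.Parameter(k), max(abs(g(:))));
end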

Are there any specific settings or modifications I should consider to resolve this issue?

To enhance the training process and address the issues at hand, consider the following settings and modifications:

  1. Learning Rate Adjustment: Experiment with different learning rates to find an optimal value that facilitates weight and bias updates without causing instability.
  2. Regularization Techniques: Introduce regularization methods like L1 or L2 regularization to prevent overfitting and aid in smoother weight updates (a sketch follows this list).
  3. Batch Normalization: Verify the implementation of batch normalization layers to stabilize training and improve gradient flow.
  4. Network Architecture: Evaluate the complexity and design of your neural network architecture to ensure it is suitable for the task at hand and facilitates effective weight updates.
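If you try L2 regularization, one common pattern in custom training loops (sketched here; lambda is a hypothetical hyperparameter, not something from the original code) is to add a weight-decay term to each gradient before calling adamupdate:

% Illustrative L2 weight decay applied to the gradients before the Adam update.
% In practice you might skip the bias parameters.
lambda = 1e-4;   % hypothetical weight-decay factor
gradientsR = dlupdate(@(g,w) g + lambda*w, gradientsR, net.Learnables);
[net,trailingAvgR,trailingAvgSqR] = adamupdate(net,gradientsR, ...
    trailingAvgR,trailingAvgSqR,iteration,learnRate);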

I hope this will help resolve your issues.


Ruth on 9 Jul 2024 at 8:58
Edited: Ruth on 9 Jul 2024 at 8:59
Hi Syed,
The forward call should also be inside a function called by dlfeval to ensure auto diff occurs as expected. I would recommend combining the loss and gradient calculations into one function to do this:
function [loss,gradientsR] = RNN_model(data,net)
    z = data;
    [coder, last] = forward(net, z, 'Outputs', {'maxpool1d_2', 'fc_3'});
    loss = mean(mse(last,z));
    gradientsR = dlgradient(loss, net.Learnables);
end
This should be called inside the loop using dlfeval:
[lossR, gradientsR] = dlfeval(@RNN_model,XTrain,net);
You might need to edit this a bit, as I'm not completely sure of your code (for example, where datafromRNN comes from); however, moving the loss and gradient calculation into one function called by dlfeval should resolve the issue.
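For completeness, the loop body from the question might then look roughly like this (a sketch based on the posted code; adjust variable names as needed, and note that datafromRNN is no longer returned):

while hasdata(mbq)
    iteration = iteration + 1;
    XTrain = next(mbq);
    XTrain = dlarray(XTrain,"TBC");
    % Forward pass, loss and gradients are all evaluated inside dlfeval
    [lossR,gradientsR] = dlfeval(@RNN_model,XTrain,net);
    [net,trailingAvgR,trailingAvgSqR] = adamupdate(net,gradientsR, ...
        trailingAvgR,trailingAvgSqR,iteration,learnRate);
    disp(['Iteration ', num2str(iteration), ', Loss: ', num2str(extractdata(lossR))]);
end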
