
Negative variance of state when training.

4 views (last 30 days)
Alexander Resare on 13 Apr 2024
Commented: Malay Agarwal on 23 Apr 2024
For our image segmentation task, we are trying to implement a custom training loop for our network, giving us more freedom to visualize predictions while training. Below are the parts of the code that should be key to identifying the underlying issue:
%% Classes
classNames = ["bg", "live", "nk", "round", "blob", "other"];
labelIDs = [0 192 255 1 2 3];
numClasses = 6;
%% Create mobilenet
network = 'mobilenetv2';
lgraph = deeplabv3plusLayers([224 224 3],numClasses,network);
% X is our whole training data
[m, s] = calculate_input_params(single(X));
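% Replace the default input layer with one that z-score normalizes the
% input using the per-channel statistics of the training data.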
input_layer_new = imageInputLayer([224 224 3], "Normalization","zscore", "Mean",m, "StandardDeviation",s);
lgraph = replaceLayer(lgraph, "input_1", input_layer_new);
lgraph = removeLayers(lgraph, "classification");
%% Initialize network, training data and parameters
net = dlnetwork(lgraph);
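% ds_augmented is our augmented training datastore of images and pixel labels.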
mbq = minibatchqueue(ds_augmented, "MiniBatchSize",16, "MiniBatchFormat",["SSCB" "SSB"]);
numepochs = 3;
initialLearnRate = 0.01;
decay = 0.01;
momentum = 0.9;
vel = [];
%% Necessary code to avoid error
try
nnet.internal.cnngpu.reluForward(1);
catch ME
end
%% Train network
epoch = 0;
iteration = 0;
while epoch < numepochs
epoch = epoch + 1;
shuffle(mbq);
while hasdata(mbq)
iteration = iteration + 1;
epoch_iteration = [epoch iteration]
[X_b, Y_b] = next(mbq);
Y_b = adjust_dimensions(Y_b);
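% Evaluate the model loss, gradients, and updated network state.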
[loss,gradients,state] = dlfeval(@modelLoss,net,X_b,Y_b);
net.State = state;
loss
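% Time-based learning rate decay: lr = lr0 / (1 + decay*iteration).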
learnRate = initialLearnRate/(1 + decay*iteration);
[net, vel] = sgdmupdate(net, gradients, vel, learnRate, momentum);
end
end
function [loss,gradients,state] = modelLoss(net,X_b,Y_b)
classWeights = [1 10 10 10 10 10];
% Forward data through network.
[Y_p,state] = forward(net,X_b);
% Calculate cross-entropy loss.
loss = crossentropy(Y_p,Y_b,classWeights,'WeightsFormat','UC','TargetCategories','independent');
% Calculate gradients of loss with respect to learnable parameters.
gradients = dlgradient(loss,net.Learnables);
end
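For completeness, calculate_input_params only computes the statistics for the z-score normalization of the new input layer; roughly, it does the following (simplified sketch, the real helper may differ slightly):
function [m, s] = calculate_input_params(X)
% Simplified sketch -- per-channel (1x1x3) mean and standard deviation
% over the whole training set, for the "zscore" input normalization.
m = mean(X, [1 2 4]);
s = std(X, 0, [1 2 4]);
end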
Essentially, when we run the Train network section, we manage to run a couple of iterations (the number of iterations may vary) until we get the error from the title, complaining about a negative variance of the state.
Along with this, we have noticed that no matter how many iterations we run, when we access X_b, Y_b and Y_p and try to visualize the first and second image of the batch, we always get the same prediction regardless of X_b and Y_b. It seems that the Y_p generated by forward(net, X_b) is somehow constant.
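For reference, this is roughly how we visualize the prediction for the first image of a batch (simplified sketch):
% Simplified sketch of how we inspect one prediction from the batch.
Yp1 = gather(extractdata(Y_p(:,:,:,1)));           % H x W x numClasses scores
[~, predClass] = max(Yp1, [], 3);                   % per-pixel predicted class index
figure
subplot(1,2,1)
imshow(rescale(gather(extractdata(X_b(:,:,:,1)))))  % input image, rescaled for display
title("Input")
subplot(1,2,2)
imagesc(predClass), axis image off
title("Predicted class per pixel")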
Since my lab partner and I do not possess any formal training in deep learning or image segmentation, we find it challenging to connect the dots and overcome this problem. Any feedback regarding the code or our approach would be much appreciated.
1 Comment
Malay Agarwal on 23 Apr 2024
Could you please share your dataset or at least a part of it so that I can reproduce the issue?


Answers (0)
