
Negative variance of state when training.

4 views (last 30 days)
Alexander Resare on 13 Apr 2024
Commented: Malay Agarwal on 23 Apr 2024
For our image segmentation task, we are trying to implement a custom training loop for our network, giving us more freedom to visualize predictions while training. Below are the parts of the code that should be key to identifying the underlying issue:
%% Classes
classNames = ["bg", "live", "nk", "round", "blob", "other"];
labelIDs = [0 192 255 1 2 3];
numClasses = 6;
%% Create mobilenet
network = 'mobilenetv2';
lgraph = deeplabv3plusLayers([224 224 3],numClasses,network);
% X is our whole training data
[m, s] = calculate_input_params(single(X));
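% Replace the default input layer with one that z-score normalizes the
% input using the per-channel statistics of the training data.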
input_layer_new = imageInputLayer([224 224 3], "Normalization","zscore", "Mean",m, "StandardDeviation",s);
lgraph = replaceLayer(lgraph, "input_1", input_layer_new);
lgraph = removeLayers(lgraph, "classification");
%% Initialize network, training data and parameters
net = dlnetwork(lgraph);
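% ds_augmented is our augmented training datastore of images and pixel labels.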
mbq = minibatchqueue(ds_augmented, "MiniBatchSize",16, "MiniBatchFormat",["SSCB" "SSB"]);
numepochs = 3;
initialLearnRate = 0.01;
decay = 0.01;
momentum = 0.9;
vel = [];
%% Necessary code to avoid error
try
nnet.internal.cnngpu.reluForward(1);
catch ME
end
%% Train network
epoch = 0;
iteration = 0;
while epoch < numepochs
epoch = epoch + 1;
shuffle(mbq);
while hasdata(mbq)
iteration = iteration + 1;
epoch_iteration = [epoch iteration]
[X_b, Y_b] = next(mbq);
Y_b = adjust_dimensions(Y_b);
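% Evaluate the model loss, gradients, and updated network state.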
[loss,gradients,state] = dlfeval(@modelLoss,net,X_b,Y_b);
net.State = state;
loss
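% Time-based learning rate decay: lr = lr0 / (1 + decay*iteration).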
learnRate = initialLearnRate/(1 + decay*iteration);
[net, vel] = sgdmupdate(net, gradients, vel, learnRate, momentum);
end
end
function [loss,gradients,state] = modelLoss(net,X_b,Y_b)
classWeights = [1 10 10 10 10 10];
% Forward data through network.
[Y_p,state] = forward(net,X_b);
% Calculate cross-entropy loss.
loss = crossentropy(Y_p,Y_b,classWeights,'WeightsFormat','UC','TargetCategories','independent');
% Calculate gradients of loss with respect to learnable parameters.
gradients = dlgradient(loss,net.Learnables);
end
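For completeness, calculate_input_params only computes the statistics for the z-score normalization of the new input layer; roughly, it does the following (simplified sketch, the real helper may differ slightly):
function [m, s] = calculate_input_params(X)
% Simplified sketch -- per-channel (1x1x3) mean and standard deviation
% over the whole training set, for the "zscore" input normalization.
m = mean(X, [1 2 4]);
s = std(X, 0, [1 2 4]);
end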
Essentially, when we run the Train network section, we manage to run a couple of iterations (the number of iterations may vary) until we get the error from the title, complaining about a negative variance of the state.
Along with this, we have noticed that no matter how many iterations we run, when we access X_b, Y_b and Y_p and try to visualize the first and second image of the batch, we always get the same prediction regardless of X_b and Y_b. It seems that the Y_p generated by forward(net, X_b) is somehow constant.
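For reference, this is roughly how we visualize the prediction for the first image of a batch (simplified sketch):
% Simplified sketch of how we inspect one prediction from the batch.
Yp1 = gather(extractdata(Y_p(:,:,:,1)));           % H x W x numClasses scores
[~, predClass] = max(Yp1, [], 3);                   % per-pixel predicted class index
figure
subplot(1,2,1)
imshow(rescale(gather(extractdata(X_b(:,:,:,1)))))  % input image, rescaled for display
title("Input")
subplot(1,2,2)
imagesc(predClass), axis image off
title("Predicted class per pixel")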
Since my lab partner and I do not possess any formal training in deep learning or image segmentation, we find it challenging to connect the dots and overcome this problem. Any feedback regarding the code or our approach would be much appreciated.
1 Comment
Malay Agarwal on 23 Apr 2024
Could you please share your dataset or at least a part of it so that I can reproduce the issue?


Answers (0)
