Why do my validation RMSE and loss increase after some epochs while my training RMSE and loss decrease?

Hello everyone,
I am trying to predict future traffic flow from previously collected data, so I am using an LSTM. However, my validation loss and RMSE increase while my training loss and RMSE decrease. Because I am new to LSTMs, I don't know which parameters I should check to improve the model and its predictions.
[Training progress plot: training loss and RMSE decrease while validation loss and RMSE increase.]
I also use different lag times for my predictions; in the code below the lag is 4 steps.
% Standardize the training data using the training mean and std
XTrain_ZaMir = (XTrain_ZaMir - mu_ZaMir)/sig_ZaMir;
YTrain_ZaMir = (YTrain_ZaMir - mu_ZaMir)/sig_ZaMir;
% Shift inputs and targets by the 4-step lag
XTrain_ZaMir = XTrain_ZaMir(:,1:end-4);
YTrain_ZaMir = YTrain_ZaMir(:,5:end);
Test_ZaMir = [flowTe_ZaMir flowTeOther_ZaMir]';
nt = floor(0.7*length(Test_ZaMir));  % first 70% of the held-out data goes to validation
YTest_ZaMir = Test_ZaMir(1,1:end);
XTest_ZaMir = Test_ZaMir(1,1:end);   % one input feature
% XTest_ZaMir = Test_ZaMir(:,1:end); % more than one input feature
XTest_ZaMir = (XTest_ZaMir - mu_ZaMir)/sig_ZaMir;
YTest_ZaMir = (YTest_ZaMir - mu_ZaMir)/sig_ZaMir;
% Split into validation and test sets, applying the same 4-step lag
XVal_ZaMir = XTest_ZaMir(:,1:nt-4);
YVal_ZaMir = YTest_ZaMir(:,5:nt);
XTest_ZaMir = XTest_ZaMir(:,nt+4:end-1);
YTest_ZaMir = YTest_ZaMir(:,nt+5:end);
%% Layers and Options
numResponses = 1;
featureDimension = 1;
numHiddenUnits = 200;
layers = [ ...
    sequenceInputLayer(featureDimension)
    lstmLayer(numHiddenUnits)
    % dropoutLayer(0.002)
    fullyConnectedLayer(numResponses)
    regressionLayer];
maxepochs = 250;
minibatchsize = 128;
options = trainingOptions('adam', ...
    'MaxEpochs',maxepochs, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.005, ...
    'ValidationData',{XVal_ZaMir,YVal_ZaMir}, ...
    'ValidationFrequency',20, ...
    'Shuffle','every-epoch', ...
    'MiniBatchSize',minibatchsize, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropPeriod',150, ...
    'LearnRateDropFactor',0.005, ...
    'Verbose',1, ...
    'Plots','training-progress');
%% Train the Network
[net,info] = trainNetwork(XTrain_ZaMir,YTrain_ZaMir,layers,options);
% Closed-loop forecasting: prime the network state, then step through the test inputs
[net,YPred_ZaMir] = predictAndUpdateState(net,XTest_ZaMir);
numTimeStepsTest = floor(0.5*length(XTest_ZaMir));
for i = 2:numTimeStepsTest
    [net,YPred_ZaMir(:,i)] = predictAndUpdateState(net,XTest_ZaMir(:,i-1),'ExecutionEnvironment','cpu');
    % net = resetState(net);
end
% Undo the standardization to recover flows in the original units
YTest_ZaMir = sig_ZaMir*YTest_ZaMir + mu_ZaMir;
YPred_ZaMir = sig_ZaMir*YPred_ZaMir + mu_ZaMir;
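For reference, once the predictions are back in the original units, the forecast error over the predicted steps can be quantified with an RMSE; this is a sketch using a hypothetical variable name rmse_ZaMir and the quantities defined above:
% RMSE over the forecast horizon, comparing predictions to the matching test targets
rmse_ZaMir = sqrt(mean((YPred_ZaMir(:,1:numTimeStepsTest) - YTest_ZaMir(:,1:numTimeStepsTest)).^2))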

Answers (1)

Aneela on 10 September 2024
Edited: Aneela on 10 September 2024
Hi Arash,
You are experiencing overfitting: the LSTM's training loss keeps decreasing while its validation loss increases, which means the model is memorizing the training data rather than learning patterns that generalize.
  • Add a dropoutLayer after the LSTM layer to prevent overfitting, for example:
dropoutLayer(0.2)
  • The initial learning rate (0.005) is high and might overshoot the optimal weights. Reduce it to 0.001 or lower and see whether convergence improves.
  • Add L2 regularization to the fullyConnectedLayer, which penalizes large weights and discourages the model from fitting noise. Note that the name-value argument is 'WeightL2Factor', a per-layer multiplier on the global L2Regularization set in trainingOptions:
fullyConnectedLayer(numResponses,'WeightL2Factor',0.001)
  • Implement early stopping by monitoring the validation loss, so training halts once the validation loss stops improving; see the sketch after this list.
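A minimal sketch combining these suggestions, reusing the variable names from the question (featureDimension, numHiddenUnits, XVal_ZaMir, and so on); the dropout rate, L2 factor, and patience values are illustrative, not tuned:
layers = [ ...
    sequenceInputLayer(featureDimension)
    lstmLayer(numHiddenUnits)
    dropoutLayer(0.2)                         % randomly zero 20% of activations during training
    fullyConnectedLayer(numResponses,'WeightL2Factor',0.001)  % extra L2 penalty on this layer's weights
    regressionLayer];
options = trainingOptions('adam', ...
    'MaxEpochs',maxepochs, ...
    'InitialLearnRate',0.001, ...             % lower starting learning rate
    'L2Regularization',0.0001, ...            % global weight decay (the default value)
    'ValidationData',{XVal_ZaMir,YVal_ZaMir}, ...
    'ValidationFrequency',20, ...
    'ValidationPatience',5, ...               % stop after 5 validations with no improvement
    'Shuffle','every-epoch', ...
    'MiniBatchSize',minibatchsize, ...
    'Plots','training-progress');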
Refer to the following MathWorks documentation for more information on LSTMs: https://www.mathworks.com/discovery/lstm.html
