Hello everyone, I am trying to use an LSTM to predict and forecast the position of a vehicle and I would like to know how to train the system.
I have a dataset consisting of 230 vehicle samples, i.e. a 1 x 230 cell array in which each sample is a matrix of 4 features by the respective sequence length (60 to 300 timesteps). The objective is to forecast the next 1 to 5 timesteps of a given vehicle sample.
I am referring to this example to understand how to forecast, and this one to see how to train the model for prediction. But in both examples the LSTM model is used in a many-to-one configuration.
My features are expressed in x,y coordinates.
I would like to know how to train an LSTM model on multiple sequences containing multiple features so that it learns the behaviour of the vehicle model.
Thanks in advance

Accepted Answer

Asvin Kumar on 30 Dec 2019
Edited: Asvin Kumar on 30 Dec 2019

1 vote

Although this links to another example that uses the bilstmLayer, the underlying principles remain the same. You can use a fullyConnectedLayer with as many outputs as necessary for your use case. By setting the OutputMode to 'sequence' in your lstmLayer and preparing the predictors as described in the first example you linked, you should be able to achieve your desired result.
In your case, the output size of the fullyConnectedLayer would be 4, I suppose. Your responses would be the predictors shifted in time by 1 to 5 steps, depending on the horizon you're trying to forecast. Since this is a regression problem, it would make sense to drop the softmaxLayer and the classificationLayer from the example and end the network with a regressionLayer instead.
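As a rough illustration of the shifting described above, here is a minimal sketch of how the predictors and responses could be built for several sequences at once. The random `trackdata` cell and the `horizon` variable are hypothetical stand-ins, not the poster's data, and standardization is left out for brevity.

```matlab
% Hypothetical stand-in for the real data: a 1 x 3 cell, each entry 4 x T.
rng(0);
trackdata = {rand(4,60), rand(4,120), rand(4,300)};
horizon = 5;                      % assumed forecast horizon (1 to 5 steps)

numSeq = numel(trackdata);
XTrain = cell(numSeq,1);          % one predictor matrix per sequence
YTrain = cell(numSeq,1);          % one response matrix per sequence
for k = 1:numSeq
    seq = trackdata{k};
    XTrain{k} = seq(:, 1:end-horizon);    % predictors: steps 1 .. T-horizon
    YTrain{k} = seq(:, 1+horizon:end);    % responses: steps 1+horizon .. T
end

% Each response column is the predictor column `horizon` steps later.
disp(size(XTrain{1}));
disp(size(YTrain{1}));
```

N-by-1 cell arrays in this shape are what trainNetwork accepts for sequence-to-sequence regression, so they could be passed in place of the single XTrain and YTrain matrices, together with a fullyConnectedLayer of output size 4 and a regressionLayer.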

11 Comments

Sharan Magavi on 8 Jan 2020
Hello,
Thanks for the response. I have used a similar architecture for my model, but unfortunately it predicts/forecasts the SAME values over multiple timesteps. The model fails to learn the underlying relations between the various features of the dataset. Any recommendations on how to solve this problem?
Asvin Kumar on 9 Jan 2020
Can you please provide more details on your network architecture and how you're preparing your dataset?
Sharan Magavi on 13 Jan 2020
Hello Asvin,
My model is exactly as in this example; I have not made any changes. Also, as in that example, I'm using only 1 sequence right now, not the entire dataset.
My dataset for this training example is a sequence of 4 x 150 (4 features, 150 timesteps). As shown in the example, I'm using 90% of the timesteps to train the model and the remaining 10% as the test set. To evaluate the system, I use the predictAndUpdateState function on the network.
Hope this clarifies the question. Awaiting your reply.
Asvin Kumar on 13 Jan 2020
Hard to tell even with this information. Can you please share your code and sample data to test it on?
Sharan Magavi on 13 Jan 2020
Unfortunately, I am not authorised to share the data.
Sharan Magavi on 13 Jan 2020
The code is as follows:
n = randperm(size(trackdata,2), 1);   % pick one track at random
data = trackdata{1,n};                % 4 x T matrix for the selected track
% Split: first 90% of the time steps for training, the rest for testing
numTimeStepsTrain = floor(0.9*size(data, 2));
dataTrain = data(:,1:numTimeStepsTrain+1);
dataTest = data(:,numTimeStepsTrain+1:end);
% Standardize each feature using the training statistics only
mu = mean(dataTrain, 2);
sig = std(dataTrain, 0, 2);
dataTrainStandardized = (dataTrain - mu) ./ sig;
% Responses are the predictors shifted by one time step
XTrain = dataTrainStandardized(:,1:end-1);
YTrain = dataTrainStandardized(:,2:end);
numFeatures = size(XTrain, 1);
numResponses = size(YTrain, 1);
numHiddenUnits1 = 200;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits1,'OutputMode','sequence')
    fullyConnectedLayer(numResponses)
    regressionLayer];
options = trainingOptions('adam', ...
    'MaxEpochs',250, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.005, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropPeriod',125, ...
    'LearnRateDropFactor',0.2, ...
    'Verbose',0, ...
    'SequenceLength','shortest', ...
    'Plots','training-progress');
% Training
net = trainNetwork(XTrain,YTrain,layers,options);
% Forecasting: standardize the test data with the training statistics
dataTestStandardized = (dataTest - mu) ./ sig;
XTest = dataTestStandardized(:,1:end-1);
numTimeStepsTest = size(XTest,2);
YPred = zeros(numResponses, numTimeStepsTest);   % preallocate predictions
% Initialize the network state on the training data, then forecast in a
% closed loop, feeding each prediction back in as the next input
net = predictAndUpdateState(net,XTrain);
[net,YPred(:,1)] = predictAndUpdateState(net,YTrain(:,end));
for i = 2:numTimeStepsTest
    [net,YPred(:,i)] = predictAndUpdateState(net,YPred(:,i-1),'ExecutionEnvironment','cpu');
end
% Undo the standardization and compute the per-feature RMSE
YPred = sig.*YPred + mu;
YTest = dataTest(:,2:end);
rmse = sqrt(mean((YPred-YTest).^2, 2))
Asvin Kumar on 14 Jan 2020
Looking at the code and your dataset size, I would suggest decreasing the number of hidden units. The dataset used in the example has 498 time steps, while yours has only 150.
This still doesn't explain why the network didn't overfit. For that, I'd suggest tweaking the learning rate drop period and drop factor.
It's hard to provide further insights without looking at the data.
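To put the suggestion about hidden units in numbers, the following back-of-the-envelope sketch compares the count of learnable parameters with the count of training targets. It assumes the standard LSTM layout (input weights, recurrent weights, and a bias for each of the four gates); the smaller size of 20 hidden units is only a hypothetical value for comparison, not a recommendation from this thread.

```matlab
numFeatures  = 4;                       % inputs per time step
numResponses = 4;                       % outputs per time step
numTrainSteps = floor(0.9*150);         % 135 training time steps

% LSTM: 4 gates, each with input weights, recurrent weights and a bias.
% Fully connected layer: one weight per hidden unit plus a bias per output.
countParams = @(H) 4*H*(numFeatures + H + 1) + numResponses*(H + 1);

paramsLarge = countParams(200);         % size used in the posted code
paramsSmall = countParams(20);          % hypothetical smaller network
numTargets  = numResponses * numTrainSteps;

fprintf('200 hidden units: %d parameters\n', paramsLarge);
fprintf(' 20 hidden units: %d parameters\n', paramsSmall);
fprintf('training targets: %d values\n', numTargets);
```

With 200 hidden units the network has 164804 learnable parameters against only 540 target values, while 20 hidden units brings that down to 2084.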
Sharan Magavi on 15 Jan 2020
Thank you for the suggestion, Asvin. I suspected that I might need to make some changes to the hyperparameters.
Sharan Magavi on 15 Jan 2020
Also, Asvin, how would you suggest merging the regression and classification of sequences into one LSTM network?
To clarify: the data I'm using is about vehicles at a roundabout, so I need to predict the position of the vehicle and simultaneously classify which exit it might take. Right now my approach is to develop 2 networks (1 for position prediction and the other for classification of exits). Is there a way, such as an ensemble or a better architecture, to have a single network for this application?
Thanks!
Sharan Magavi on 16 Jan 2020
Hello Asvin,
I was able to solve the problem I had with the prediction. I used the 'sgdm' solver instead of the 'adam' solver and it made a big difference in my output.


Asked: 24 Dec 2019

Commented: 16 Jan 2020
