Neural network: train() behavior with earlier results

Akshay Joshi on 28 Jan 2018
Closed: MATLAB Answer Bot on 20 Aug 2021
I have a very large dataset of around 150 GB that I need to process using neural networks. Because the data is too large to load at once, I have to break it into chunks: for example, 5000 elements sent as 20 batches of 250 elements each. The following dummy code illustrates the idea:
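batch_size = 250;
num_batches = 20;
for count = 1:num_batches
    % Sample indices for this batch (a range, not a single element)
    idx = (1 + (count-1)*batch_size):(count*batch_size);
    inputs = entire_input(:, idx);     % assumes one sample per column, as train expects
    targets = entire_targets(:, idx);
    net = train(net, inputs, targets);
end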
Will the network start training from scratch with each fresh batch, or will it retain the weights calculated for the previous batch? From some of my discussions and findings, with each new batch the weights adapt to the current data and may overwrite the previous weights.
Please advise whether this method works well, or whether some other method should be used instead of train().

Answers (1)

Greg Heath on 29 Jan 2018
"Need to process" doesn't provide useful information.
What are you trying to design? Curvefitter/Regressor? PatternRecognizer/Unsupervised-Classifier/Supervised-Classifier? Timeseries??
In all cases, training, validation and test data should have similar summary statistics in all run batches. Otherwise training batch n will erase some of what is learned in batches 1 to n-1.
Your response should be far less vague than your original explanation.
Hope this helps.
Greg
Thank you for formally accepting my answer
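The point above about batches having similar summary statistics can be checked before any training. The following is a minimal sketch, not Greg's method: it assumes hypothetical stand-in labels (three classes, stored sorted by class), shuffles the sample order once, and compares the class proportions in each batch with those of the whole dataset. The variable names and sizes are illustrative only.

```matlab
% Hypothetical stand-in labels: 5000 samples, 3 classes, stored sorted by class,
% so contiguous chunks of the original order would each hold mostly one class.
rng(0);                                    % reproducible shuffle
num_samples = 5000;
batch_size  = 250;
num_batches = num_samples / batch_size;    % 20
labels = sort(randi(3, 1, num_samples));   % class index per sample

% Shuffle the sample order once, then cut the shuffled order into batches.
perm = randperm(num_samples);
batch_idx = reshape(perm, batch_size, num_batches);   % column k holds batch k

% Class proportions per batch versus the whole dataset.
edges = 0.5:1:3.5;
overall = histcounts(labels, edges) / num_samples;
per_batch = zeros(num_batches, 3);
for k = 1:num_batches
    per_batch(k, :) = histcounts(labels(batch_idx(:, k)), edges) / batch_size;
end

% Largest absolute gap between any batch's proportions and the overall ones.
max_dev = max(max(abs(bsxfun(@minus, per_batch, overall))));
fprintf('Largest deviation from overall class proportions: %.3f\n', max_dev);
```

For data that does not fit in memory, the same idea would apply to the order in which samples are read from disk rather than to an in-memory index vector; how to do that depends on the storage format, which the thread does not specify.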
  Comments: 1
Akshay Joshi on 29 Jan 2018
Edited: Akshay Joshi on 29 Jan 2018
Hi Greg,
I'm trying to design a supervised classifier using a multilayer perceptron (feedforwardnet). The input matrix is of 500,000 x 25 dimension, and the output matrix is 5,000 x 25.
Initially, I tried to train my network using nntool, but I was unable to feed a dataset this large (150 GB) into it due to memory constraints, so I decided to break the data into chunks. For this purpose, I'm writing a MATLAB script that creates the neural network and provides the input in chunks.
"In all cases, training, validation and test data should have similar summary statistics in all run batches. Otherwise training batch n will erase some of what is learned in batches 1 to n-1."
Can you suggest a method for retaining what was learned from batches 1 to n-1 and, based on that, calculating the results for, say, batches n to n+k?
Thanks for the earlier response. Hope I'm clear this time.
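One way to approach the question above is sketched below, under stated assumptions. It relies on the fact that calling train on an already configured network continues from the weights stored in net rather than reinitializing them. The sketch limits each call to a few epochs and makes several passes over reshuffled batches, so that no single batch is fitted to convergence. The data, layer size, epoch count, and number of passes are all hypothetical, and this does not by itself guarantee that later batches leave earlier learning intact. It requires the toolbox that provides feedforwardnet.

```matlab
% Hypothetical stand-in data; in practice each batch would be read from disk.
rng(0);
num_features = 25;  num_classes = 3;
num_samples  = 5000; batch_size = 250;
num_batches  = num_samples / batch_size;
X = rand(num_features, num_samples);               % one sample per column
class_idx = randi(num_classes, 1, num_samples);
T = zeros(num_classes, num_samples);               % one-hot targets
T(sub2ind(size(T), class_idx, 1:num_samples)) = 1;

net = feedforwardnet(10);              % hidden layer size is an arbitrary choice
net.divideFcn = 'dividetrain';         % use every sample of a batch for training
net.trainParam.epochs = 5;             % few epochs per call (assumed value)
net.trainParam.showWindow = false;     % no training GUI
% Size and initialize the network once, up front.
net = configure(net, X(:, 1:batch_size), T(:, 1:batch_size));

num_passes = 3;                        % assumed number of sweeps over the data
for pass = 1:num_passes
    order = randperm(num_samples);     % reshuffle so batches differ between passes
    for k = 1:num_batches
        idx = order((k-1)*batch_size + 1 : k*batch_size);
        % train starts from the weights currently stored in net
        net = train(net, X(:, idx), T(:, idx));
    end
end

Y = net(X(:, 1:10));                   % network outputs for the first 10 samples
```

Held-out data that is never used in any train call would be needed to judge whether accuracy on earlier batches survives training on later ones.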
