What drives the memory usage in the Neural Network Toolbox?

I am using the Neural Network Toolbox to build a regression model. This is an exploratory exercise: a preliminary test of model size and structural relationships, as well as a comparison of different input scaling choices. I am using R2017b with the Neural Network Toolbox.
I have 1005 inputs, 1 hidden layer, and 1 output. I have 158 samples of data to train with. All inputs and outputs are scaled and regularized.
I did some basic testing with 4000, 1000, and 100 neurons, just to get a feel for the network size needed to capture the effects. I used a variety of training methods and varied the training/testing division. All of this went as planned, but Bayesian regularization showed that even 100 neurons was probably overkill. So I went on to test a range of 10-100 neurons, expecting my speeds to increase and my memory use to come down.
The memory usage with 48 neurons was over 10x as high as with 100-4000 neurons. During the first iteration, the memory use just kept climbing past 29 GB, and the per-iteration time was several thousand times greater than at 4000 neurons. The memory usage with 10 neurons was about 4x as high as with 4000 neurons, and the per-iteration time was about 100x as high. This is really counter-intuitive, and the choice of training method had no real impact on the memory issue. Can anyone explain what is happening here? Why is MATLAB requesting ~30 GB for a smaller network? Why would reducing the number of neurons ever increase memory use at all?
Thanks in advance to anyone who can help. I appreciate any insights on this.

Answers (1)

Greg Heath on 13 Feb 2018
Edited: Greg Heath on 13 Feb 2018
N0 = 158 samples span, AT MOST, an N = 157-dimensional space.
Therefore, if this is serious work and you have no more data, you should reduce the number of components of each 1005-dimensional input vector to no more than 157.
However, the default number of training vectors is
Ntrn = N - 2*round(0.15*N) = 158 - 48 = 110
Therefore it might be wise to reduce the input dimensionality to no more than 110, the dimension of the space spanned by the training vectors.
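As a quick sanity check of that split arithmetic (a plain Python sketch, not the toolbox's exact internal partitioning; the 0.70/0.15/0.15 ratios are MATLAB's default dividerand settings):

```python
# Default train/val/test split is 0.70/0.15/0.15 of the N samples.
# The rounding below mirrors the reasoning in the answer.
N = 158                   # total samples
Nval = round(0.15 * N)    # 24 validation vectors
Ntst = round(0.15 * N)    # 24 test vectors
Ntrn = N - Nval - Ntst    # 110 training vectors
print(Ntrn)               # 110
```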
The most common approach to dimensionality reduction is Principal Component Analysis (PCA). However, PCA neglects the target dimension(s).
Partial Least Squares (PLS) is more appropriate for regression because it takes the target into account. However, it is not as well known.
Since the target dimensionality is only 1 and you are probably unfamiliar with PLS, use PCA (and perhaps investigate PLS later).
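A minimal sketch of the PCA route, written with NumPy only for illustration (in MATLAB you would reach for the pca function instead; the random matrix here is a stand-in for the real 158 x 1005 input data, and the variable names are mine):

```python
import numpy as np

# Reduce 1005-dimensional inputs to at most Ntrn = 110 principal components.
rng = np.random.default_rng(0)
X = rng.standard_normal((158, 1005))   # stand-in for the real inputs

Xc = X - X.mean(axis=0)                # center each feature
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 110                                # keep <= Ntrn components
Z = Xc @ Vt[:k].T                      # reduced inputs, shape (158, 110)
print(Z.shape)
```

The network would then be trained on Z instead of X, shrinking I from 1005 to 110.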
Whatever you do,
Ntrneq = Ntrn*O % 110 training equations
Nw = (I+1)*H+(H+1)*O % No. of unknown weights for an I-H-O node topology
For stable solutions you need
Nw <= Ntrneq = 110
or, equivalently,
H <= Hub = (Ntrneq-O)/(I+O+1) = 109/1007 ~ 0.11 !
==> H = 0 => a LINEAR MODEL.
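To make the bound concrete, here is the same weight-count arithmetic as a short Python check (my own sketch; note that with the full I = 1005 inputs the bound evaluates to 109/1007, roughly 0.11, still far below 1):

```python
# Weight count and stability bound for an I-H-O feedforward network.
I, O = 1005, 1                     # inputs, outputs
Ntrn = 110                         # default number of training vectors
Ntrneq = Ntrn * O                  # 110 training equations
H = 100                            # example hidden-layer size from the question
Nw = (I + 1) * H + (H + 1) * O     # unknown weights: 100701, far above 110
Hub = (Ntrneq - O) / (I + O + 1)   # upper bound on H for Nw <= Ntrneq
print(Nw, Hub)
```

Since Hub < 1, no hidden layer satisfies the bound, which is what forces the H = 0 (linear model) conclusion above.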
OTHERWISE: USE TRAINBR
help trainbr
doc trainbr
Hope this helps.
Thank you for formally accepting my answer
Greg


Asked: 9 Feb 2018
Edited: 13 Feb 2018
