Neural network backpropagation problem
I'm using 2 inputs and a single output. Then I apply the same network structure to 3 inputs and two outputs. However, the outputs don't come anywhere near the target values. What's wrong with this network? Or do I need to change it to another type of structure?
clear all; clc;

% Load data
% XOR case: p = [0 0 1 1; 0 1 0 1]; t = [0 1 1 0];
p = [0 0 0 0 1 1 1 1; 0 0 1 1 0 0 1 1; 0 1 0 1 0 1 0 1];
t = [0 1 0 0 0 1 1 1; 0 1 0 0 1 1 0 0];

net = newff(p, t, [15, 15], {'logsig','logsig'}, 'traingdm'); % traingdm supports momentum (mc); plain traingd does not
net.trainParam.perf     = 'mse';
net.trainParam.epochs   = 100;
net.trainParam.goal     = 0;
net.trainParam.lr       = 0.9;
net.trainParam.mc       = 0.95;
net.trainParam.min_grad = 0;

[net, tr] = train(net, p, t);
y = sim(net, p)'
Accepted Answer
Additional Answers (3)
Greg Heath
13 Jun 2013
0 votes
Why don't you just use the code in help newff?
Note that you have a 3-15-15-2 node topology with
Nw = (3+1)*15+(15+1)*15+(15+1)*2 = 332 Unknown weights
Ntrn = 8 - 2*round(0.15*8) = 6 training patterns
Ntrneq = Ntrn*2 = 12 training equations
If 12 equations for 332 unknowns makes you uneasy, remove one of the hidden layers and remove some of the hidden nodes from the remaining hidden layer.
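For reference, a minimal sketch of that reduction, reusing the p and t matrices from the question and the same newff interface (H = 2 is an illustrative starting value, not a recommendation):

```matlab
% One hidden layer with few nodes, so Nw stays near Ntrneq.
H = 2;
net = newff(p, t, H);           % defaults: tansig hidden layer, trainlm
net.divideFcn = 'dividetrain';  % only 8 patterns: too few to split
[net, tr] = train(net, p, t);
y = sim(net, p);
perf = mse(t - y)               % training-set performance
```

Increase H only if the training error stays unacceptably high.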
Hope this helps.
Thank you for formally accepting my answer
Greg
6 Comments
azie
13 Jun 2013
azie
13 Jun 2013
Greg Heath
13 Jun 2013
Edited: Greg Heath
13 Jun 2013
%azie about 12 hours ago
%Done it with 1 layer with 15 neurons. Still the same result: max validation fail. What is wrong with the
%code? Or can MATLAB not run it with three inputs?
Nval = round(0.15*8) = 1
1. Training stopped because the error of that single validation example increased for 6 epochs.
Obviously you don't have enough data to divide into trn/val/tst subsets of practical size. Overwrite 'dividerand' with 'dividetrain'.
2. With H=15 and Ntrn = 8
Ntrneq = 8*2 = 16
but
Nw = (3+1)*15+(15+1)*2 = 92 > 5*Ntrneq
whereas, ideally, you would like
either Ntrneq >> Nw
or Ntrn > Nval >> 1.
or net.trainFcn = 'trainbr'; % Nval = 0
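The weight count above follows directly from the layer sizes; as a quick check:

```matlab
% Weight count for an I-H-O = 3-15-2 net (biases included):
I = 3; H = 15; O = 2;
Ntrn = 8; Ntrneq = Ntrn*O;   % 16 training equations
Nw = (I+1)*H + (H+1)*O       % 92, well above 5*Ntrneq = 80
```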
Greg Heath
13 Jun 2013
%azie about 11 hours ago
%I reduced the layers to 1 and added the line below; the result comes out better.
%net.divideFcn = '';
%If I want the network to predict the output for different samples, can I use the same code or do I need to use patternnet?
NO! patternnet is for classification or pattern recognition!
net.divideFcn = '' is the same as net.divideFcn = 'dividetrain'. Good!
Next try net.trainFcn = 'trainbr' % (Nval = 0) Good for small data sets
Finally, try to reduce H as much as possible. To make sure that you do not have a disastrous choice of random initial weights, obtain 10 random weight initialization trials for each candidate value of H.
Start at the low end with H = [ [], 1:9 ]. This will probably yield 100 designs in less than a minute or two.
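A sketch of that multi-trial search (variable names are hypothetical, and the H = [] linear case is omitted for brevity):

```matlab
% For each candidate H, train Ntrials nets from random initial weights
% and keep the best; reuses p, t from the question.
Ntrials  = 10;
bestperf = Inf;
for H = 1:9
    for trial = 1:Ntrials
        net = newff(p, t, H);
        net.divideFcn = 'dividetrain';  % no val/test split
        net.trainFcn  = 'trainbr';      % regularization instead of Nval
        net = train(net, p, t);
        perf = mse(t - sim(net, p));
        if perf < bestperf
            bestperf = perf; bestH = H; bestnet = net;
        end
    end
end
```

The smallest H whose best trial meets the error goal is the one to keep.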
Greg Heath
13 Jun 2013
P.S. If you have patternnet, then newfit, newpr and newff are obsolete.
They should be replaced by fitnet, patternnet and feedforwardnet, respectively.
Use fitnet for regression and curve-fitting.
Use patternnet for classification and pattern recognition.
There is no reason to use feedforwardnet directly. It is called automatically by fitnet and patternnet.
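Assuming a toolbox version that has the newer functions, the replacements look like:

```matlab
% Modern equivalents (H = number of hidden nodes):
net = fitnet(H);       % regression / curve fitting (replaces newfit, newff)
net = patternnet(H);   % classification (replaces newpr); targets are 0/1 class indicators
% feedforwardnet(H) exists but is called internally by both of the above.
```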
Greg Heath
15 Jun 2013
1. You mean a net with 2 HIDDEN layers. The unmodified term "layers" means hidden AND output layers.
In the last 30 years of designing NNs, I have never encountered a net that needed 2 hidden layers. Nets with 1 hidden layer can be universal approximators if they have enough hidden nodes. Universal approximators tend to interpolate well at the expense of extrapolating badly, especially if they have too many hidden nodes.
2. If you look at the code in help newff and doc newff, you will see that you don't need to specify a long list of net properties. Always try the defaults first. They are usually sufficient.
3. Since the default and alternative input normalizations (mapminmax and mapstd) tend to center the data, 'tansig', NOT 'logsig' is the best choice for a MLP hidden layer transfer function.
4. Overfitting/Overtraining/Generalization
Ntrneq = prod(size(t)) = 7*2 = 14 % Training equations
Nw = (3+1)*3+(3+1)*3+(3+1)*2 = 32 % Unknown weights
Nw > Ntrneq % OVERFITTING
None of the following conditions are satisfied
Ntrneq >> Nw % Overfitting mitigation
Nval >> 1 % Overtraining mitigation via validation stopping
net.trainFcn = 'trainbr' % Overtraining mitigation via regularization
Consequently you have an over-trained over-fit net that is not expected to generalize well.
I have no idea what deterministic transformation the data is supposed to represent. Therefore, it is difficult to evaluate a single net with non-design data to see if it is any good (i.e., can generalize ).
The original data represent the 8 corners of a 3-D cube. If the target for all 8 corners is known, the generalization capability could be tested via leave-one-out cross-validation, where eight nets are designed with 7 corners and each is tested on the eighth corner.
However, if you visualize a 3-D cube, notice that any corner can be considered to be an OUTLIER with respect to the other 7. Therefore, it would not be surprising if a net designed with 7 corners could not extrapolate well to the eighth corner.
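A sketch of that leave-one-out test, assuming the p and t matrices from the question and fitnet (H = 2 is an illustrative choice):

```matlab
% Leave-one-out over the 8 cube corners: design on 7, test on the 8th.
N = size(p, 2);                               % 8 corners
etst = zeros(1, N);
for i = 1:N
    trn = setdiff(1:N, i);                    % indices of the 7 design corners
    net = fitnet(2);
    net.divideFcn = 'dividetrain';            % use all 7 for training
    net = train(net, p(:,trn), t(:,trn));
    e = t(:,i) - sim(net, p(:,i));            % error on the held-out corner
    etst(i) = mse(e);
end
etst   % expect large values: each corner is an outlier w.r.t. the other 7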
An interesting demonstration would be to vary the number of hidden nodes from H = 0 to a value BEYOND the upper bound value H=Hub, where the number of unknown weights is greater than the number of training equations.
To mitigate the existence of bad random weight configurations, design Ntrials = 10 nets for each value of H from 0 to Hmax (numH = Hmax+1). Since N = 8 and H are small, the N*numH*Ntrials = 80*numH designs can probably be run in less than 5 or 10 minutes.
[ I Ntrn ] = size(ptrn)                    % [ 3 7 ]
[ O Ntrn ] = size(ttrn)                    % [ 2 7 ]
Ntrneq     = prod(size(ttrn))              % 14
[ I Ntst ] = size(ptst)                    % [ 3 1 ]
[ O Ntst ] = size(ttst)                    % [ 2 1 ]
% Nw = (I+1)*H+(H+1)*O = O+(I+O+1)*H
Hub  = -1 + ceil( (Ntrneq-O) / (I+O+1) )   % 1
Hmin = 0, dH = 1, Hmax = 3                 % Choose numH = 4, Ndesigns = 320
More Later
azie
10 Jul 2013
0 votes
1 Comment
Greg Heath
22 Jul 2013
Your problem is not suitable as a regression or a classification problem where a model designed with a subset of the data can generalize to the rest of the data.
All you have to do is visualize the cube in 3 dimensions. None of the 8 corner points is characterized by the other 7.