Neural network backpropagation problem
I'm using 2 inputs and a single output. Then I apply the same network structure to 3 inputs and two outputs. However, the outputs don't come anywhere near the target values. What's wrong with this network? Or do I need to change it to another type of structure?
clear all; clc;

% Load data
% XOR case: p = [0 0 1 1; 0 1 0 1]; t = [0 1 1 0];
p = [0 0 0 0 1 1 1 1; 0 0 1 1 0 0 1 1; 0 1 0 1 0 1 0 1];
t = [0 1 0 0 0 1 1 1; 0 1 0 0 1 1 0 0];

net = newff(p, t, [15, 15], {'logsig','logsig'}, 'traingdm'); % traingdm supports momentum (mc); plain traingd does not
net.trainParam.perf     = 'mse';
net.trainParam.epochs   = 100;
net.trainParam.goal     = 0;
net.trainParam.lr       = 0.9;
net.trainParam.mc       = 0.95;
net.trainParam.min_grad = 0;

[net, tr] = train(net, p, t);
y = sim(net, p)'
Accepted Answer
Additional Answers (3)
Greg Heath
13 Jun 2013
0 votes
Why don't you just use the code in help newff?
Note that you have a 3-15-15-2 node topology with
Nw = (3+1)*15+(15+1)*15+(15+1)*2 = 332 Unknown weights
Ntrn = 8 - 2*round(0.15*8) = 6 training patterns
Ntrneq = Ntrn*2 = 12 training equations
If 12 equations for 332 unknowns makes you uneasy, remove one of the hidden layers and remove some of the hidden nodes from the remaining hidden layer.
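For reference, a minimal sketch of that reduction, reusing the p and t matrices from the question and the same newff interface (H = 2 is an illustrative starting value, not a recommendation):

```matlab
% One hidden layer with few nodes, so Nw stays near Ntrneq.
H = 2;
net = newff(p, t, H);           % defaults: tansig hidden layer, trainlm
net.divideFcn = 'dividetrain';  % only 8 patterns: too few to split
[net, tr] = train(net, p, t);
y = sim(net, p);
perf = mse(t - y)               % training-set performance
```

Increase H only if the training error stays unacceptably high.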
Hope this helps.
Thank you for formally accepting my answer
Greg
6 Comments
azie
13 Jun 2013
azie
13 Jun 2013
Greg Heath
13 Jun 2013
Edited: Greg Heath
13 Jun 2013
%azie about 12 hours ago
%Done it with 1 layer with 15 neurons. Still the same result: max validation fail. What is wrong with the
%code? Or can MATLAB not run it with three inputs?
Nval = round(0.15*8) = 1
1. Training stopped because the error of that single validation example increased for 6 epochs.
Obviously you don't have enough data to divide into trn/val/tst subsets of practical size. Overwrite 'dividerand' with 'dividetrain'.
2. With H=15 and Ntrn = 8
Ntrneq = 8*2 = 16
but
Nw = (3+1)*15+(15+1)*2 = 92 > 5*Ntrneq
whereas, ideally, you would like
either Ntrneq >> Nw
or Ntrn > Nval >> 1.
or net.trainFcn = 'trainbr'; % Nval = 0
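The weight count above follows directly from the layer sizes; as a quick check:

```matlab
% Weight count for an I-H-O = 3-15-2 net (biases included):
I = 3; H = 15; O = 2;
Ntrn = 8; Ntrneq = Ntrn*O;   % 16 training equations
Nw = (I+1)*H + (H+1)*O       % 92, well above 5*Ntrneq = 80
```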
Greg Heath
13 Jun 2013
%azie about 11 hours ago
%I reduced the layers to 1 and added the line below; the result comes out better.
%net.divideFcn = '';
%If I want the network to predict the output for different samples, can I use the same code or do I need to use patternnet?
NO! patternnet is for classification or pattern recognition!
net.divideFcn = '' is the same as net.divideFcn = 'dividetrain'. Good!
Next try net.trainFcn = 'trainbr' % (Nval = 0) Good for small data sets
Finally, try to reduce H as much as possible. To make sure that you do not have a disastrous choice of random initial weights, obtain 10 random weight initialization trials for each candidate value of H.
Start at the low end with H = [ [], 1:9 ]. This will probably yield 100 designs in less than a minute or two.
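A sketch of that multi-trial search (variable names are hypothetical, and the H = [] linear case is omitted for brevity):

```matlab
% For each candidate H, train Ntrials nets from random initial weights
% and keep the best; reuses p, t from the question.
Ntrials  = 10;
bestperf = Inf;
for H = 1:9
    for trial = 1:Ntrials
        net = newff(p, t, H);
        net.divideFcn = 'dividetrain';  % no val/test split
        net.trainFcn  = 'trainbr';      % regularization instead of Nval
        net = train(net, p, t);
        perf = mse(t - sim(net, p));
        if perf < bestperf
            bestperf = perf; bestH = H; bestnet = net;
        end
    end
end
```

The smallest H whose best trial meets the error goal is the one to keep.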
Greg Heath
13 Jun 2013
P.S. If you have patternnet, then newfit, newpr and newff are obsolete.
They should be replaced by fitnet, patternnet and feedforwardnet, respectively.
Use fitnet for regression and curve-fitting.
Use patternnet for classification and pattern recognition.
There is no reason to use feedforwardnet directly. It is called automatically by fitnet and patternnet.
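Assuming a toolbox version that has the newer functions, the replacements look like:

```matlab
% Modern equivalents (H = number of hidden nodes):
net = fitnet(H);       % regression / curve fitting (replaces newfit, newff)
net = patternnet(H);   % classification (replaces newpr); targets are 0/1 class indicators
% feedforwardnet(H) exists but is called internally by both of the above.
```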
Greg Heath
15 Jun 2013
1. You mean a net with 2 HIDDEN layers. The unmodified term "layers" means hidden AND output layers.
In the last 30 years of designing NNs, I have never encountered a net that needed 2 hidden layers. Nets with 1 hidden layer can be universal approximators if they have enough hidden nodes. Universal approximators tend to interpolate well at the expense of extrapolating badly, especially if they have too many hidden nodes.
2. If you look at the code in help newff and doc newff, you will see that you don't need to specify a long list of net properties. Always try the defaults first. They are usually sufficient.
3. Since the default and alternative input normalizations (mapminmax and mapstd) tend to center the data, 'tansig', NOT 'logsig' is the best choice for a MLP hidden layer transfer function.
4. Overfitting/Overtraining/Generalization
Ntrneq = prod(size(t)) = 7*2 = 14 % Training equations
Nw = (3+1)*3+(3+1)*3+(3+1)*2 = 32 % Unknown weights
Nw > Ntrneq % OVERFITTING
None of the following conditions are satisfied
Ntrneq >> Nw % Overfitting mitigation
Nval >> 1 % Overtraining mitigation via validation stopping
net.trainFcn = 'trainbr' % Overtraining mitigation via regularization
Consequently you have an over-trained over-fit net that is not expected to generalize well.
I have no idea what deterministic transformation the data is supposed to represent. Therefore, it is difficult to evaluate a single net with non-design data to see if it is any good (i.e., can generalize ).
The original data represent the 8 corners of a 3-D cube. If the target for all 8 corners is known, the generalization capability could be tested via leave-one-out cross-validation, where eight nets are designed with 7 corners and each is tested on the eighth corner.
However, if you visualize a 3-D cube, notice that any corner can be considered to be an OUTLIER with respect to the other 7. Therefore, it would not be surprising if a net designed with 7 corners could not extrapolate well to the eighth corner.
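A sketch of that leave-one-out test, assuming the p and t matrices from the question and fitnet (H = 2 is an illustrative choice):

```matlab
% Leave-one-out over the 8 cube corners: design on 7, test on the 8th.
N = size(p, 2);                               % 8 corners
etst = zeros(1, N);
for i = 1:N
    trn = setdiff(1:N, i);                    % indices of the 7 design corners
    net = fitnet(2);
    net.divideFcn = 'dividetrain';            % use all 7 for training
    net = train(net, p(:,trn), t(:,trn));
    e = t(:,i) - sim(net, p(:,i));            % error on the held-out corner
    etst(i) = mse(e);
end
etst   % expect large values: each corner is an outlier w.r.t. the other 7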
An interesting demonstration would be to vary the number of hidden nodes from H = 0 to a value BEYOND the upper bound value H=Hub, where the number of unknown weights is greater than the number of training equations.
To mitigate the existence of bad random weight configurations, design Ntrials = 10 nets for each value of H from 0 to Hmax (numH = Hmax+1). Since N = 8 and H are small, the N*numH*Ntrials = 80*numH designs can probably be run in less than 5 or 10 minutes.
[ I Ntrn ] = size(ptrn)                    % [ 3 7 ]
[ O Ntrn ] = size(ttrn)                    % [ 2 7 ]
Ntrneq     = prod(size(ttrn))              % 14
[ I Ntst ] = size(ptst)                    % [ 3 1 ]
[ O Ntst ] = size(ttst)                    % [ 2 1 ]
% Nw = (I+1)*H+(H+1)*O = O+(I+O+1)*H
Hub  = -1 + ceil( (Ntrneq-O) / (I+O+1) )   % 1
Hmin = 0, dH = 1, Hmax = 3                 % Choose numH = 4, Ndesigns = 320
More Later
azie
10 Jul 2013
0 votes
1 Comment
Greg Heath
22 Jul 2013
Your problem is not suitable as a regression or a classification problem where a model designed with a subset of the data can generalize to the rest of the data.
All you have to do is visualize the cube in 3 dimensions. None of the 8 corner points is characterized by the other 7.