NARX delays problem & multi-step-ahead predictions

4 views (last 30 days)
Oguz BEKTAS on 22 Oct 2015
Latest comment: Greg Heath on 25 Oct 2015
Attached are my datasets and NARX network architecture. The data comprises two subsets (training and test), and I am trying to make multi-step predictions for the external test subset using the internal training subset. Because all of the data curves grow exponentially, the NARX model is confused by this kind of growth and gave very inaccurate results. I therefore used the ratio transform for exponential growth (a = x(i)/x(i-1)) to put the data in a form from which NARX can make meaningful predictions, and that worked well in the model. However, the results are not always coherent, and training performance can be low. I suspect the problem lies in the cross-correlation between the neural network time series (nncorr). I looked for a solution in Greg's answers and tutorials and found the following code:
X = zscore(cell2mat(x));
T = zscore(cell2mat(t));
[ I N ] = size(X)
[ O N ] = size(T)
crosscorrXT = nncorr(X,T,N-1);
autocorrT = nncorr(T,T,N-1);
crosscorrXT(1:N-1) = []; % Delete negative delays
autocorrT(1:N-1) = [];
sigthresh95 = 0.21 % Significance threshold
sigcrossind = crosscorrXT( crosscorrXT >= sigthresh95 )
sigautoind = autocorrT( autocorrT >= sigthresh95 )
inputdelays = sigcrossind(sigcrossind <= 35)
feedbackdelays = sigautoind(sigautoind <= 35)
feedbackdelays(1)=[] % Delete zero delay
In the original code, with its simple data set, the feedback delays came out as only 1, 2 and 3 and were entered as 1:3, but in my case the results look very different:
feedbackdelays=[0.227215651811241,0.233970150901862,0.284917894683197,0.264096206558765,0.393205223678322,0.574922519886786,0.294921921143733,0.270700384072885];
What am I missing? Can anyone tell me what I can do with the code above?

Accepted Answer

Greg Heath on 23 Oct 2015
The version of the code you are using is both dated and error prone. Check both the NEWSGROUP and ANSWERS for the latest version.
Also: You are mistaking correlation values for correlation lags.
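For illustration only (this snippet is not in the original answer): the threshold test should return the lag indices at which the correlations are significant, e.g. via find, rather than the correlation values themselves. Assuming crosscorrXT and autocorrT hold the nonnegative-lag correlations as in the question's code, a corrected selection might look like:

```matlab
% Positions of significant correlations (index 1 corresponds to lag 0,
% since the negative lags were deleted above):
sigcrossind = find(crosscorrXT >= sigthresh95);
sigautoind  = find(autocorrT   >= sigthresh95);
% Convert positions to lags, capping lags at 35 as in the question:
inputdelays    = sigcrossind(sigcrossind <= 36) - 1;
feedbackdelays = sigautoind(sigautoind   <= 36) - 1;
feedbackdelays(feedbackdelays == 0) = []; % delete the zero lag
```

This yields integer delay vectors suitable for narxnet, rather than the fractional correlation values shown in the question.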
Hope this helps.
Thank you for formally accepting my answer
Greg
4 Comments
Oguz BEKTAS on 24 Oct 2015
I think the example you mentioned is this:
X = tonndata(TransformedTrainInput,false,false);
T = tonndata(TransformedTrainOutput,false,false);
Xtest = tonndata(TransformedTestInput,false,false);
Ttest = tonndata(TransformedTestOutput,false,false);
x = cell2mat(X);
t = cell2mat(T);
[ I N ] = size(x); % use the matrix forms; size of the cell array would give I = 1
[ O N ] = size(t);
MSE00 = mean(var(t',1))
MSE00a = mean(var(t',0))
zx = zscore(cell2mat(X), 1);
zt = zscore(cell2mat(T), 1);
Ntrn = N-2*round(0.15*N)
trnind = 1:Ntrn
Ttrn = T(trnind)
Neq = prod(size(Ttrn))
rng('default')
tic % start timer for the Totaltime = toc below
% %
% %
FD = 1:8;
ID = 1:8;
NFD = length(FD) %
NID = length(ID) %
MXFD = max(FD)
MXID = max(ID)
Ntrneq = prod(size(t))
Hub = -1+ceil( (Ntrneq-O) / ((NID*I)+(NFD*O)+1))
Hmax = floor(Hub/10) %
Hmin = 0
dH = 1
Ntrials = 25
j=0
rng(4151941)
trainFcn = 'trainbr'; % Bayesian Regularization backpropagation.
for h = Hmin:dH:Hmax
j = j+1
if h == 0
net = narxnet( ID, FD, [],'open',trainFcn);
Nw = ( NID*I + NFD*O + 1)*O
else
net = narxnet( ID, FD, h, 'open',trainFcn);
Nw = ( NID*I + NFD*O + 1)*h + ( h + 1)*O
end
Ndof = Ntrn-Nw
[ Xs Xi Ai Ts ] = preparets(net,X,{},T);
ts = cell2mat(Ts);
xs = cell2mat(Xs);
MSE00s = mean(var(ts',1))
MSE00as = mean(var(ts'))
MSEgoal = 0.01*Ndof*MSE00as/Neq
MinGrad = MSEgoal/10
net.trainParam.goal = MSEgoal;
net.trainParam.min_grad = MinGrad;
net.divideFcn = 'dividetrain';
for i = 1:Ntrials
net = configure(net,Xs,Ts);
[ net tr Ys ] = train(net,Xs,Ts,Xi,Ai);
ys = cell2mat(Ys);
stopcrit{i,j} = tr.stop;
bestepoch(i,j) = tr.best_epoch;
MSE = mse(ts-ys);
MSEa = Neq*MSE/Ndof;
R2(i,j) = 1-MSE/MSE00s;
R2a(i,j) = 1-MSEa/MSE00as;
end
end
stopcrit = stopcrit %Min grad reached (for all).
bestepoch = bestepoch
R2 = R2
R2a = R2a
Totaltime = toc
Many thanks, that works perfectly and finds ys accurately. By the way, network training is much better with Bayesian regularization backpropagation. But still, when I change the values of
FD = 1:?;
ID = 1:?;
the results change significantly. FD = 1:2; ID = 1:2 gave unexpected results. Then I tried 8 and 20; both worked relatively well, but when I increased it to 30 the results broke down completely.
So, how can I determine the FD and ID values at the outset? And is it related to the significant lags?
Greg Heath on 25 Oct 2015
When you refer to my code, PLEASE give the EXACT reference. (Since I have thousands of posts, the reasons are obvious).
AFAIK, there is no direct method for optimizing the ID, FD, H combination. My common sense indicates that the choices of FD and ID should be GUIDED by the knowledge of all of the significant lags. Plots of the correlation functions with the significant correlations highlighted with red circles can be quite helpful in making the choice.
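As a sketch of such a plot (this code is not from the thread; it assumes autocorrT holds the nonnegative-lag autocorrelations and the 0.21 threshold from the question's code):

```matlab
lags = 0:length(autocorrT)-1;           % lag 0 sits at index 1
sig  = autocorrT >= 0.21;               % 95% significance threshold
stem(lags, autocorrT), hold on
plot(lags(sig), autocorrT(sig), 'ro')   % red circles on significant lags
xlabel('lag'), ylabel('autocorrelation of T'), hold off
```

The same plot with crosscorrXT guides the choice of ID; the autocorrelation plot guides FD.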
Just keep in mind that the number of unknown weights increases with the number of delays and hidden nodes. That is why I defined Hub as a way of quantifying the onset of overfitting, and I use the val subset to prevent overtraining an overfit net. If H << Hub is not feasible, an alternative, of course, is to use regularization via MSEREG and/or TRAINBR.
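To make the weight count concrete, here is the Hub formula from the code above evaluated for hypothetical dimensions (I = O = 1, Ntrneq = 100, ID = FD = 1:8; these numbers are illustrative, not from the thread):

```matlab
I = 1; O = 1; Ntrneq = 100;  % hypothetical sizes
NID = 8; NFD = 8;            % lengths of ID = 1:8 and FD = 1:8
% An open-loop NARX net with H hidden nodes has
%   Nw = (NID*I + NFD*O + 1)*H + (H + 1)*O
% unknown weights; Hub is the largest H for which Nw <= Ntrneq:
Hub = -1 + ceil( (Ntrneq - O) / (NID*I + NFD*O + 1) )  % = 5 here
```

With 16 delayed inputs per hidden node, even a small hidden layer exhausts 100 training equations, which is why keeping H well below Hub (or using trainbr) matters as the delay ranges grow.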


More Answers (0)
