Nonlinear regression with neural network
I would like to share how I approached a nonlinear regression problem (2 inputs, 1 output) and get your advice.
After some quick reading I settled on a network with a single hidden layer using the tansig transfer function, and purelin for the output, as this seems to be the most common approach for such problems.
I used trainbr in order to determine the regularization parameter automatically. However, I didn't find out how to automatically determine the number of hidden neurons (which should normally be possible in the Bayesian framework, if I'm not mistaken). So I couldn't merge the training and validation sets; I kept the validation set to compare architectures with increasing numbers of neurons.
So, within a for loop going from 1 to 20, I trained networks with 1 to 20 neurons in the hidden layer. Then I applied each of them to the validation set and computed the mean squared error.
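The loop described above might look roughly like the following sketch. It assumes the toolbox functions fitnet and train, trains each net on the training columns only (dividetrain assigns every sample passed to train to the training set), and scores it on a separately held-out validation subset. The synthetic data, the 70/15/15 split and all variable names are illustrative assumptions, not the original poster's code.

```matlab
% Illustrative data: 2 inputs, 1 output (assumed; replace with the real data)
rng(0);                                   % reproducible random numbers
x = 2*rand(2, 500) - 1;                   % inputs, one column per sample
t = sin(pi*x(1,:)) .* x(2,:).^2;          % smooth nonlinear target

% Fixed index split (assumed 70/15/15)
N      = size(x, 2);
idx    = randperm(N);
trnInd = idx(1:350);
valInd = idx(351:425);
tstInd = idx(426:end);                    % untouched until the final test

Hmax   = 20;
MSEval = zeros(1, Hmax);                  % validation MSE for each hidden size

for H = 1:Hmax
    net = fitnet(H, 'trainbr');           % tansig hidden layer, purelin output
    net.divideFcn = 'dividetrain';        % no internal data division
    net.trainParam.showWindow = false;    % suppress the training GUI
    net = train(net, x(:, trnInd), t(trnInd));

    yVal      = net(x(:, valInd));        % outputs on the holdout validation subset
    MSEval(H) = mean((t(valInd) - yVal).^2);
end

[~, Hbest] = min(MSEval);                 % hidden size with the lowest validation MSE
fprintf('Lowest validation MSE at H = %d\n', Hbest);
```

Note that each hidden size is trained only once here, so the comparison also depends on the random initial weights of each net.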
First question: is this the most appropriate way to do it? Would you have done it differently?
The MSE keeps getting smaller as the number of neurons increases. I stopped at 20 as there seems to be no real benefit in going further. Then I applied the 2-20-1 net to the test set and got a very small MSE of 4e-6 and a correlation of 0.99999 between the test labels and the output of the network.
Second question: isn't it suspicious to get such high performance? What do you think about this?
I'll be looking forward to your responses in order to validate or dismiss my approach.
Accepted Answer
Greg Heath
20 May 2013
TRAINBR does not use a validation set. Therefore I am not quite sure what you are doing.
Are you using TRAINBR's default 15% test subset as a holdout validation subset (with NO validation stopping) for choosing the best of multiple designs?
Then, I assume you have a third holdout subset that you use for testing, i.e., you use the performance of the "best net" on the test set as an unbiased estimate of its performance on non-design operational data.
My advice is to use the smallest number of hidden nodes, H, that will yield a degree-of-freedom-adjusted coefficient of determination exceeding 99%. Use a double loop with ~10 random weight initialization designs (inner loop) for each value of H (outer loop). Ten values of H should be sufficient.
I have posted many double loop designs in ANSWERS and NEWSGROUP.
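A minimal sketch in the spirit of such a double-loop design (not one of the posted designs themselves) is shown below. It assumes a single-hidden-layer fitnet trained with trainbr on all of the design data, counts the weights as Nw = (I+1)*H + (H+1)*O, and computes the adjusted coefficient of determination as R2a = 1 - (SSE/Ndof)/MSE00a, where Ndof = N*O - Nw and MSE00a is the unbiased variance of the target. The synthetic data and the variable names are assumptions for illustration.

```matlab
% Illustrative data: 2 inputs, 1 output (assumed; replace with the real data)
rng(0);
x = 2*rand(2, 300) - 1;
t = sin(pi*x(1,:)) .* x(2,:).^2;

[I, N] = size(x);                         % I inputs, N samples
O      = size(t, 1);                      % O outputs (1 here)
MSE00a = var(t, 0, 2);                    % unbiased target variance (reference error)

Hvals   = 1:10;                           % outer loop: ten candidate values of H
Ntrials = 10;                             % inner loop: random initializations per H
R2a     = -inf(Ntrials, numel(Hvals));    % adjusted R^2 for every design

for j = 1:numel(Hvals)
    H    = Hvals(j);
    Nw   = (I + 1)*H + (H + 1)*O;         % number of weights and biases
    Ndof = N*O - Nw;                      % residual degrees of freedom
    for i = 1:Ntrials
        net = fitnet(H, 'trainbr');       % new net, new random initial weights
        net.divideFcn = 'dividetrain';    % all samples used for training
        net.trainParam.showWindow = false;
        net = train(net, x, t);

        SSE       = sum((t - net(x)).^2); % sum of squared training errors
        R2a(i, j) = 1 - (SSE/Ndof) / MSE00a;
    end
end

% Smallest H for which at least one trial exceeds the 99% criterion
jBest = find(max(R2a, [], 1) > 0.99, 1, 'first');
if isempty(jBest)
    disp('No H in the tested range reached R2a > 0.99');
else
    fprintf('Smallest adequate H = %d\n', Hvals(jBest));
end
```

Keeping the full Ntrials-by-numel(Hvals) matrix, rather than only the best value, makes it easy to see how much of the spread comes from the initialization and how much from H.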
If you have more questions, please include your code with comments.
Hope this helps.
Thank you for formally accepting my answer
Greg
Comments: 2
Greg Heath
31 May 2013
1. To get an unbiased estimate of performance on unseen data, use MSEtst from the net with the best MSEval.
2. To choose one net to use on future data, I would choose the net with the smallest number of hidden nodes that has an acceptable performance on all of the data.
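The two selection rules above can be sketched as follows, assuming a design loop has already filled matrices of validation, test and all-data MSE with one row per trial and one column per value of H. The numbers below are made-up placeholders, and MSEgoal is a hypothetical threshold for "acceptable performance"; neither comes from the thread.

```matlab
% Placeholder results (made up for illustration): rows = trials, columns = H
Hvals   = [2 4 6];
MSEval  = [3e-2 4e-4 5e-4; 2e-2 6e-4 3e-4];
MSEtst  = [4e-2 5e-4 6e-4; 3e-2 7e-4 4e-4];
MSEall  = [3e-2 4e-4 4e-4; 2e-2 5e-4 3e-4];
MSEgoal = 1e-3;                           % hypothetical acceptability threshold

% 1. Unbiased performance estimate: MSEtst of the net with the best MSEval
[~, k]      = min(MSEval(:));
[iV, jV]    = ind2sub(size(MSEval), k);
MSEunbiased = MSEtst(iV, jV);

% 2. Net for future data: smallest H with acceptable error on all of the data
jD      = find(min(MSEall, [], 1) <= MSEgoal, 1, 'first');
[~, iD] = min(MSEall(:, jD));             % best trial for that H

fprintf('Estimate of unseen-data MSE: %g (trial %d, H = %d)\n', ...
        MSEunbiased, iV, Hvals(jV));
fprintf('Net chosen for deployment: trial %d, H = %d\n', iD, Hvals(jD));
```

The two rules can pick different nets: the first reports an honest error estimate, while the second prefers the simplest design that is good enough.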