Optimal hidden nodes number
Hello everyone, I would like to find the optimal number of hidden nodes using structured trial and error. I ran the following simulation:
Hmin = 1;
Hmax = 30;
dH = 1;
NTrials = 5;
For each value of H, I took the minimum error over the 5 trials to plot the following graph:

My question is how to determine the optimal number of hidden nodes from this graph? Thank you.
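The structured search described above could be scripted along the following lines. This is a minimal sketch using fitnet from the Deep Learning Toolbox; x and t stand for your input and target matrices, which are not shown in the post:

```matlab
% Structured trial-and-error over the number of hidden nodes.
% x: I-by-N input matrix, t: O-by-N target matrix (your data).
Hmin = 1; Hmax = 30; dH = 1; NTrials = 5;
Hvals  = Hmin:dH:Hmax;
minErr = zeros(size(Hvals));
for i = 1:numel(Hvals)
    err = zeros(1, NTrials);
    for k = 1:NTrials
        net = fitnet(Hvals(i));          % new random initial weights each trial
        net.trainParam.showWindow = false;
        net = train(net, x, t);
        y = net(x);
        err(k) = mse(net, t, y);         % mean-squared error
    end
    minErr(i) = min(err);                % best of NTrials trials for this H
end
plot(Hvals, minErr);
xlabel('Hidden nodes H'); ylabel('min MSE over trials');
```

Note that judging H by the error on the training data alone rewards overfitting; comparing validation-set performance (e.g. tr.best_vperf from [net,tr] = train(...)) is usually a safer basis for picking H.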
4 Comments
Jan
8 Jul 2017
I have no idea what you are doing or how you created the diagram from those 4 variables. Perhaps some more details would help to answer your question.
Hamza Ali
8 Jul 2017
Joshua
8 Jul 2017
Hamza, what specifically are you trying to get from the graph (minimum, inflection point, zero slope, etc.)? Unless you provide detailed information about what you're actually looking for and what algorithm was used to make the graph, we can't help you.
Hamza Ali
9 Jul 2017
Accepted Answer
More Answers (2)
Walter Roberson
8 Jul 2017
0 votes
When you use neural networks, the lowest theoretical error always occurs at the point where the state matrices are large enough to include exact copies of every sample that you ever trained on, plus the known output for each of those samples. For example if you train on 50000 samples each of 17 features, then a neural network that is 50000 * 17 large (exact copy of input data) + 50000 large (exact copy of output data) will have an error rate of 0 for that data.
Such a system might be pretty useless on other data.
Likewise, if you were doing clustering, then you could achieve 100% accuracy by using one cluster per unique input sample.
So... before you can talk about "optimal", you need to define exactly what you mean by that.
There are a lot of things for which the Pareto Rule applies: "for many events, roughly 80% of the effects come from 20% of the causes". This applies recursively -- of the 20% that remains after the first pass, 80% will be explained by 20% of the second layer of causes. And you can keep going with that. But it is common that the cost of each layer you go through is roughly the same, so addressing the first 80% of the second 20% of the original costs about as much as dealing with the original 80% did, and doing the next step costs about as much as everything already spent, and so on. Basically, for each step closer to 100% accuracy you get, the costs double.
Where is the "optimal"? Well that depends on whether you have resource limitations or if you prize 100% accuracy more than anything.
Greg Heath
13 Jul 2017
Edited: Greg Heath, 13 Jul 2017
0 votes
I = 5, O = 1, N = 46824
Ntrn ~ 0.7*N = 32777
Ntrneq = Ntrn*O = 32777
Hub = (Ntrneq-O)/(I+O+1) = 32776/7 ~ 4682
For H << Hub, try H <= Hub/10, i.e. H < 468
The reason for the quirky numbers is that your database is HUGE!
I would just start with the default H = 10 with Ntrials = 10 and keep doubling H until success. Then I would consider reducing H by filling in the gaps between values already tried.
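The double-until-success search Greg describes might look like this. A sketch only; MSEgoal stands for whatever error target you choose, and x, t are your data:

```matlab
% Coarse search: double H until the error goal is met, capped at Hub/10.
H = 10; Ntrials = 10;
bestErr = Inf;
while H <= 468                           % Hub/10 bound from above
    trialErr = zeros(1, Ntrials);
    for k = 1:Ntrials
        net = fitnet(H);                 % fresh random weights each trial
        net.trainParam.showWindow = false;
        net = train(net, x, t);
        trialErr(k) = perform(net, t, net(x));
    end
    bestErr = min(trialErr);
    if bestErr <= MSEgoal, break; end    % success: stop doubling
    H = 2*H;                             % otherwise double H and retry
end
% Then refine by filling in the gaps, e.g. values between H/2 and H.
```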
Hope this helps.
Greg
3 Comments
Greg Heath
13 Jul 2017
With I = 17 input features and O = 1 output feature, I am usually comfortable with
Ntrn > 100*max(I,O) = 1700.
Then
N >~ Ntrn/0.7 ~ 2430
In your case that would yield
H << (1700 - 1)/(17 + 1 + 1) ~ 89
So, I would start with the default H = 10 and use the old double-or-half strategy to narrow in. Or just set up a grid Hmin:dH:Hmax with Ntrials = 10 for each value of H that is tested.
Greg
Hamza Ali
9 Oct 2017
Greg Heath
10 Oct 2017
The best way to judge is to state, a priori, how much error you will accept.
The simplest model is output = constant. To minimize the mean-square error, that constant should be the target mean
output = mean(target,2)
and the resulting MSE is the mean biased target variance.
vart1 = mse(target - mean(target,2))
= mean(var(target',1))
I am usually satisfied with the goal
MSEgoal = 0.01 * vart1
which yields the R-squared statistic (see Wikipedia)
Rsquare = 1 - MSE/vart1 = 0.99
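Greg's reference-model calculation can be written out directly. A sketch, where t is your O-by-N target matrix and net is an already-trained network:

```matlab
% Naive constant model: predict the row-wise mean of the targets.
ybar  = mean(t, 2);                % O-by-1 target mean
vart1 = mean(var(t', 1));          % mean biased target variance
% Equivalent: mse(t - ybar), the constant model's mean-square error.

MSEgoal = 0.01 * vart1;            % accept 1% of the naive model's MSE

% After training, judge the fit against the naive baseline:
MSE     = mse(net, t, net(x));
Rsquare = 1 - MSE/vart1;           % = 0.99 when MSE == MSEgoal
```

The point of normalizing by vart1 is that Rsquare measures how much better the network is than simply predicting the mean, independent of the targets' scale.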
Hope this helps.
Greg