Choosing the best set of initial weights of a neural network to train all dataset

Question

Mirko Job, 8 August 2019

Direct link to this question:

https://kr.mathworks.com/matlabcentral/answers/475370-choosing-the-best-set-of-initial-weights-of-a-neural-network-to-train-all-dataset

Commented: Jonathan De Sousa, 30 January 2022

Accepted Answer: Sourav Bairagya

I am developing a neural network for pattern recognition in MATLAB.

Currently:

1) I divide my dataset into 6 folds (5 folds for CV + 1 test fold, which represents unseen data);

2) I choose 10 different numbers of hidden neurons;

3) I choose 10 different sets of random initial weights;

4) For each choice of test fold (k):

- For each number of hidden neurons (i):

- - For each set of initial weights (j):

- - - I perform 5-fold CV (4 folds for training and 1 for early stopping), saving the average performance (R^2) on the training, validation and test sets and the average number of training epochs across all iterations of the cross-validation (the [i,j,k] element of the result matrices);

5) Averaging across the 6 different choices of test fold (k) (10x10x6 -> 10x10), I obtain a general estimate of how the different models perform across the entire dataset, treated as unseen data;

6) I choose the optimal number of hidden neurons as the value whose model performs best on average across the 10 sets of initial weights (j);

7) I choose the number of training epochs as the average number of training epochs found across the ten sets of initial weights (j) and all possible choices of test fold (k).
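Steps 5 to 7 above can be sketched as follows. This is only an illustration in plain Python (not MATLAB), and the numbers in it are synthetic placeholders, not results from this thread. The names `hidden_sizes`, `r2` and `epochs` are hypothetical, and the sketch assumes that the epoch average in step 7 is taken only over the chosen number of hidden neurons, which the description does not state.

```python
import random
from statistics import mean

random.seed(0)
hidden_sizes = [2 * (i + 1) for i in range(10)]  # hypothetical candidates: 2, 4, ..., 20
n_inits, n_test_folds = 10, 6

# Synthetic stand-ins for the result matrices: r2[i][j][k] and epochs[i][j][k].
r2 = [[[random.uniform(0.5, 0.9) for _ in range(n_test_folds)]
       for _ in range(n_inits)] for _ in hidden_sizes]
epochs = [[[random.randint(20, 200) for _ in range(n_test_folds)]
           for _ in range(n_inits)] for _ in hidden_sizes]

# Step 5: average over the test folds k (10x10x6 -> 10x10).
r2_ij = [[mean(r2[i][j]) for j in range(n_inits)]
         for i in range(len(hidden_sizes))]

# Step 6: pick the size with the best mean score over the initialisations j.
score_per_size = [mean(row) for row in r2_ij]
best_i = max(range(len(hidden_sizes)), key=score_per_size.__getitem__)
best_hidden = hidden_sizes[best_i]

# Step 7: average epoch count over all j and k
# (assumption: restricted to the chosen size best_i).
best_epochs = round(mean(e for per_init in epochs[best_i] for e in per_init))

print(best_hidden, best_epochs)
```

Because step 6 already averages over the initialisations j, the selected size is the one that is least sensitive to the starting weights, not the one that happened to get a lucky draw.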

Now I have the number of hidden neurons and the number of epochs to train the final model on all the data.

My question is: how should I choose the initial set of weights? Should I again choose ten sets of initial weights and train 10 different networks with the previously defined parameters to find the best one? In that case, since I have no validation or test set, won't the resulting network be overfitted?


Answer 1

Sourav Bairagya, 12 August 2019

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/475370-choosing-the-best-set-of-initial-weights-of-a-neural-network-to-train-all-dataset#answer_387226

The simplest way to initialize weights and biases is to set them to small uniform random values, which works well for neural networks with a single hidden layer. When there is more than one hidden layer, however, you can use an initialization scheme such as Glorot (also known as Xavier) initialization.

Since we know nothing about the dataset beforehand, a good approach is to draw the weights from a Gaussian distribution with zero mean and a finite variance, chosen so that the variance of the signal stays roughly the same from one layer to the next. This keeps the signal from exploding to a large value or vanishing to zero as it passes through the network. In other words, it keeps the variance of a hidden layer's input and output approximately equal, which makes the network easier to train. Note that this helps convergence; on its own it does not prevent overfitting.
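To make the variance argument concrete, here is a small experiment in plain Python (standard library only, not MATLAB). It is a sketch under simplifying assumptions: the layers are purely linear, all have the same width, and the sizes `width`, `depth` and `n_samples` are illustrative values that do not come from this thread. It compares the output variance after several layers for a Glorot-scaled standard deviation and for a standard deviation of 1.

```python
import math
import random
from statistics import pvariance

random.seed(0)
width, depth, n_samples = 32, 5, 200  # illustrative sizes only

def random_layer(fan_in, fan_out, stddev):
    """fan_out x fan_in matrix with entries drawn from N(0, stddev^2)."""
    return [[random.gauss(0.0, stddev) for _ in range(fan_in)]
            for _ in range(fan_out)]

def forward(weights, x):
    # Purely linear layer (no activation) to isolate the effect of the weight scale.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def output_variance(stddev):
    """Variance of the last layer's outputs for unit-variance random inputs."""
    layers = [random_layer(width, width, stddev) for _ in range(depth)]
    outputs = []
    for _ in range(n_samples):
        x = [random.gauss(0.0, 1.0) for _ in range(width)]
        for w in layers:
            x = forward(w, x)
        outputs.extend(x)
    return pvariance(outputs)

glorot_std = math.sqrt(2.0 / (width + width))
var_glorot = output_variance(glorot_std)
var_unit = output_variance(1.0)
print("Glorot stddev:", var_glorot)
print("stddev = 1.0 :", var_unit)
```

With the Glorot scale each layer multiplies the variance by about fan_in * 2 / (fan_in + fan_out) = 1, so the output variance stays close to the input variance. With a standard deviation of 1 each layer multiplies it by about 32, so it grows by several orders of magnitude over five layers.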

According to the Glorot/Xavier initialization scheme (shown here in its Gaussian variant), the weights are initialized as follows, in pseudo-code:

for each hidden layer:
    variance = 2.0 / (number of inputs to the layer + number of outputs of the layer)
    stddev = sqrt(variance)
    for each weight in the layer:
        weight = gaussian(mean = 0.0, stddev)
    end for
end for
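A runnable version of the pseudo-code above might look like the following. It is a minimal sketch in plain Python (standard library only) and not MATLAB or Deep Learning Toolbox code. The function name `glorot_normal`, the layer sizes and the choice of zero biases are assumptions made for illustration.

```python
import math
import random

def glorot_normal(fan_in, fan_out, rng=random):
    """Return a fan_out x fan_in matrix drawn from N(0, 2 / (fan_in + fan_out))."""
    stddev = math.sqrt(2.0 / (fan_in + fan_out))
    return [[rng.gauss(0.0, stddev) for _ in range(fan_in)]
            for _ in range(fan_out)]

random.seed(42)  # fixed seed so the same initial weights can be reproduced

# Hypothetical network: 8 inputs, one hidden layer of 16 neurons, 3 outputs.
layer_sizes = [8, 16, 3]
weights = [glorot_normal(n_in, n_out)
           for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
biases = [[0.0] * n_out for n_out in layer_sizes[1:]]  # biases commonly start at zero

for w in weights:
    print(len(w), "x", len(w[0]))  # rows = outputs, columns = inputs
```

Fixing the seed is what makes a particular set of initial weights reproducible, which is useful if you want to retrain the final model from a known starting point.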

You can try this approach in your model to initialize the weights before training. Since this weight initialization does not depend on the dataset, there is no need to choose ten sets of initial weights again and train ten different networks with the previously defined parameters to find the best one.

You can also use fullyConnectedLayer from Deep Learning Toolbox, whose default weights initializer is 'glorot'. For more information, see:

https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.fullyconnectedlayer.html

Comments: 1

Jonathan De Sousa, 30 January 2022

The Glorot initialisation scheme does not actually use a Gaussian distribution. Rather, the weights are sampled from a uniform distribution. Have a look at: https://uk.mathworks.com/help/deeplearning/ug/initialize-learnable-parameters-for-custom-training-loop.html#mw_1bd0f2c3-c7df-4841-89ce-a7574d2db8d9
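For comparison, the uniform variant can be sketched as below, again in plain Python rather than MATLAB. The bound sqrt(6 / (fan_in + fan_out)) is the one given for the Glorot initializer on the linked documentation page. The function name `glorot_uniform` and the layer sizes are assumptions for illustration. A uniform distribution on [-b, b] has variance b^2 / 3, so this bound gives the same variance, 2 / (fan_in + fan_out), as the Gaussian version in the answer.

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng=random):
    """Return a fan_out x fan_in matrix sampled from U(-bound, bound)."""
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-bound, bound) for _ in range(fan_in)]
            for _ in range(fan_out)]

fan_in, fan_out = 100, 50  # illustrative layer sizes
rng = random.Random(0)
w = glorot_uniform(fan_in, fan_out, rng)
flat = [x for row in w for x in row]

target_var = 2.0 / (fan_in + fan_out)  # variance used by the Gaussian variant
# The mean is close to zero, so the mean square estimates the variance.
sample_var = sum(x * x for x in flat) / len(flat)
print(target_var, sample_var)
```

The two printed values should be close, which shows that the uniform and Gaussian variants differ in the shape of the distribution but not in the variance they target.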
