About net.divideParam.valRatio

Question

mike mike, 22 September 2018

https://kr.mathworks.com/matlabcentral/answers/420262-about-net-divideparam-valratio

Commented: mike mike, 26 September 2018

I know it's possible to use

net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;

to divide the data into training, validation and test subsets. Now, in a classification problem, I didn't want the validation to be too low, so I set

net.divideParam.valRatio = 0/100;

As a result, the neural network no longer seemed to stop early after 6 validation iterations. By chance, I left the other parameters unchanged, so the code I wrote was

net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 0/100;
net.divideParam.testRatio = 15/100;

Here the training, validation and test percentages do not sum to 100%, yet the neural network runs as before, with no problems and no error messages. I ran other tests in which the percentages also did not sum to 100%, as in the following cases:

net.divideParam.trainRatio = 35/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 25/100;

or

net.divideParam.trainRatio = 35/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 65/100;

My question is how to interpret the division of the data set into training, validation and test subsets when the ratios do not sum to 100% and/or one of them is set to 0%. If, for example, I set the training ratio to 0%, does this mean that the network is not trained? If I set the test ratio to 0%, does it mean that the network is not tested? If the ratios sum to less than 100%, does that mean that the remaining share of the data set is not used? And if the ratios sum to more than 100%, does this mean that some inputs are used both for testing and, for example, for validation?

Answer 1

Greg Heath, 23 September 2018

https://kr.mathworks.com/matlabcentral/answers/420262-about-net-divideparam-valratio#answer_337940

Edited: Greg Heath, 23 September 2018

1. Now, in a classification problem, I didn't want the validation to be too low and I set

net.divideParam.valRatio = 0/100;

 % Your statement makes no sense: You have eliminated the val subset!

2. In fact, the neural network seemed not to use early stopping after 6 iterations of validation;

% Of course! valRatio = 0 eliminates the val subset!

3. In fact, the neural network seemed not to use early stopping after 6 iterations of validation; by chance, I left the other parameters unchanged and so, in the code I wrote,

    net.divideParam.trainRatio = 70/100;
    net.divideParam.valRatio   = 0/100;
    net.divideParam.testRatio  = 15/100;
 % The program will AUTOMATICALLY CHANGE the fractions to have a unit sum. To find out what they are, use
     a = net.divideParam.trainRatio
     b = net.divideParam.valRatio
     c = net.divideParam.testRatio

4. My question is how to interpret the division of the data set into training, validation and test subsets when the ratios do not sum to 100% and/or one of them is set to 0%. If, for example, I set the training ratio to 0%, does this mean that the network is not trained? If I set the test ratio to 0%, does it mean that the network is not tested? If the ratios sum to less than 100%, does that mean that the remaining share of the data set is not used? And if the ratios sum to more than 100%, does this mean that some inputs are used both for testing and, for example, for validation?

 See my answer to question 3.
 Hope this helps.
 %%Thank you for formally accepting my answer%%

Greg
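As a rough illustration of the rescaling described in point 3, the arithmetic can be sketched in plain MATLAB. This is only a sketch under the assumption that the division function divides each ratio by their sum; the sample count `Q` and the rounding of subset sizes are hypothetical and may not match what the toolbox does internally.

```matlab
% Ratios as set in the question; they sum to 0.85, not 1.
ratios = [70 0 15] / 100;             % [train, val, test]

% Assumption: the ratios are rescaled so that they sum to one.
normalized = ratios / sum(ratios);

% Hypothetical data set size, used only to illustrate subset sizes.
Q = 200;
counts = round(normalized * Q);       % approximate sizes; toolbox rounding may differ

fprintf('train %.4f, val %.4f, test %.4f\n', normalized);
fprintf('of %d samples: about %d train, %d val, %d test\n', Q, counts);
```

Under this assumption, a zero ratio stays zero after rescaling, so that subset is simply empty. To check what a real training run actually used, the training record returned by `[net,tr] = train(net,x,t)` holds the index vectors `tr.trainInd`, `tr.valInd` and `tr.testInd`.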

Answer 2

mike mike, 23 September 2018

https://kr.mathworks.com/matlabcentral/answers/420262-about-net-divideparam-valratio#answer_337959

Thank you, Greg, everything is clear, but I want to explain why I want to delete the validation set. I am trying to create a neural network that predicts the direction of a stock index, using some technical analysis indicators as inputs. I was inspired by the following work: _Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange [Yakup Kara, Melek Acar Boyacioglu, Ömer Kaan Baykan, 2010]_

You can easily find it on the internet.

That work speaks of a training data set and a hold-out data set. I did some research and found that hold-out data is synonymous with test data, so I thought I should delete the validation data and divide the data into 50% training data and 50% hold-out data (as the authors of the article do).

When I build the neural network studied in the article in MATLAB, using the data included in the article and similar parameter settings but MATLAB's default data division (training, validation and test), I do not come even remotely close to the performance the article reports for the training phase (I am not concerned with overfitting at the moment). If instead I divide the data into 50% training and 50% test, then at least for the training phase I get very high performance, comparable to the training performance of the network in the article. Obviously it is important that the net does not overfit and does not extrapolate, but I want to look at that in the next phase, once I understand the meaning of hold-out.
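The 50% training / 50% hold-out split described above might be configured as follows. This is a minimal sketch, not the article's method: it assumes the Deep Learning Toolbox (formerly Neural Network Toolbox) is installed, and the hidden layer size and sample count are hypothetical.

```matlab
% Requires Deep Learning Toolbox. The hidden layer size (10) is hypothetical.
net = patternnet(10);

% Random division with no validation subset: 50% training, 50% hold-out (test).
net.divideFcn = 'dividerand';
net.divideParam.trainRatio = 50/100;
net.divideParam.valRatio   = 0/100;
net.divideParam.testRatio  = 50/100;

% Preview the split for a hypothetical data set of Q samples.
Q = 100;
[trainInd, valInd, testInd] = dividerand(Q, 0.5, 0, 0.5);
fprintf('%d train, %d val, %d test\n', ...
        numel(trainInd), numel(valInd), numel(testInd));
```

With an empty validation subset there is nothing for validation-based early stopping to monitor, so training ends on the other stopping criteria (for example the epoch limit or the performance goal).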

Comments: 2

Greg Heath, 26 September 2018

The question is:

Do you understand the purpose of the validation subset?

Greg

mike mike, 26 September 2018

Yes, the validation data set is intended to avoid overfitting.
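The way a validation subset guards against overfitting can be sketched with a simple early-stopping counter in plain MATLAB. The validation errors below are made-up numbers, and the counting rule is a simplification; `maxFail` only plays the role of `net.trainParam.max_fail`, and the toolbox's exact rule may differ.

```matlab
% Hypothetical validation errors per epoch (made-up numbers for illustration).
valErr  = [0.90 0.70 0.55 0.50 0.52 0.53 0.51 0.54 0.56 0.58 0.60];
maxFail = 6;                      % stands in for net.trainParam.max_fail

bestErr   = inf;                  % lowest validation error seen so far
bestEpoch = 0;                    % epoch at which it occurred
fails     = 0;                    % epochs since the last improvement
stopEpoch = numel(valErr);        % default: run to the last epoch

for epoch = 1:numel(valErr)
    if valErr(epoch) < bestErr
        bestErr   = valErr(epoch);
        bestEpoch = epoch;
        fails     = 0;            % improvement resets the counter
    else
        fails = fails + 1;
        if fails >= maxFail
            stopEpoch = epoch;    % stop; keep the weights from bestEpoch
            break
        end
    end
end

fprintf('stopped at epoch %d, best epoch %d\n', stopEpoch, bestEpoch);
```

In this sketch, an empty validation subset leaves nothing to count, which is consistent with the observation in the question that early stopping no longer occurred once `valRatio` was set to 0.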
