How does the validation check work in a neural net?

I'm learning about neural networks in MATLAB. In what I have studied so far, I have not seen anything about a validation check (data is usually divided into 2 parts: training and testing). MATLAB, however, has a separate validation part and shows a Validation check (= 6 in the figure).
So what I want to know is why we need the validation check and how it works.
Accepted Answer
Greg Heath
7 Aug 2017
Edited: Greg Heath
11 Jul 2018
design = train + validate
train : Weight Estimation
validate: Not directly involved in weight estimation. Protects the ability to generalize to nontraining data by stopping training when the nontraining val subset error rate increases CONTINUOUSLY for 6 (default) epochs.
val subset error rate is therefore SLIGHTLY biased.
test subset error rate is COMPLETELY unbiased
default division ratio = 0.7/0.15/0.15
If val stopping occurs, take a look at the error rate curves and you will see why training was stopped.
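The stopping rule described above can be sketched in a few lines. This is only an illustration in Python, not MATLAB's implementation: the function name `validation_stop_epoch` and the sample error curve are made up, and it assumes that a validation check is counted for every epoch whose validation error fails to improve on the best value so far, with the counter reset on any improvement. In MATLAB the corresponding limit is set through `net.trainParam.max_fail`.

```python
def validation_stop_epoch(val_errors, max_fail=6):
    """Return (stop_epoch, best_epoch); stop_epoch is None if training never stops early.

    Assumption (simplified): an epoch counts as a failed validation check
    when its validation error is not lower than the best error seen so far.
    """
    best_error = float("inf")
    best_epoch = None
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            # Validation error improved: remember this epoch and reset the counter.
            best_error = err
            best_epoch = epoch
            fails = 0
        else:
            # No improvement: one more failed validation check.
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return None, best_epoch


# Made-up validation error curve: improves until epoch 3, then keeps rising.
val_curve = [0.90, 0.60, 0.45, 0.40, 0.41, 0.43, 0.44, 0.47, 0.50, 0.55, 0.60]
stop_epoch, best_epoch = validation_stop_epoch(val_curve)
print(stop_epoch, best_epoch)  # 9 3
```

So the 6 is a count of epochs, not a number of misclassified samples and not a percentage of the data.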
OBVIOUSLY, the most unbiased approach for constant timestep timeseries prediction is to use DIVIDEBLOCK data division with the validation subset in the middle.
Hope this helps
Thank you for formally accepting my answer
Greg
Comments: 8
R G
7 Aug 2017
Sorry, I don't get your point. You mentioned the "nontraining val subset error rate": how can I calculate or obtain it, and why do we need it?
I ask because I already tried deleting the validation part so that training is not stopped by that error. The whole process can then train longer and reach higher accuracy. So, is the validation part useless?
Greg Heath
7 Aug 2017
Edited: Greg Heath
7 Aug 2017
Reread my post.
What good does it do to get excellent performance on training data if the net doesn't generalize well and get good performance on ALL ( train + val + tst + unseen ) data ?
The nondesign test subset error rate is the unbiased estimate of net performance on unseen data.
The val subset helps prevent the nontraining error rate from getting too high.
So, why in the world would you want to delete it?
Hope this helps.
Greg
Firstly, I didn't know what validation is, so I deleted it to see the difference. After deleting it, the net generalizes well and gets better performance on both the training and the testing data (there is no validation part because I deleted it).
Secondly, my data set is small, so I need to try k-fold cross validation, and by convention k-fold uses only 2 types of data (training and testing).
Thirdly, I don't understand that error number (= 6): does it mean I have 6 errors in the validation data, or that 6% of the data is in error, or something else?
One more thing: do you have any reference for your point (paper, book, etc.) that I can read for more detail and use to convince my advisor not to delete it (he is the main reason I have to delete the validation part)? It would be very helpful.
1. Validation is a guard against overtraining an overfit net.
2. It is not necessary. However, it is so useful that it is a MATLAB default.
3. If you use a val subset and it stops training, look at the 2 NONTRAINING error rate curves and it will become obvious why training was stopped.
4. k-fold crossval can be done with a val set. I have posted several examples in the NEWSGROUP and/or ANSWERS.
5. Reread my post! Training stops if the val subset error increases CONTINUOUSLY for 6 (default) epochs. This is interpreted as the net becoming unable to accurately estimate outputs for nontraining (val + tst + unseen) data.
6. If you want to delete the val subset, then use BAYESIAN REGULARIZATION via TRAINBR.
7. This is information that has been known and used for decades. ANY decent book on NNs will explain it.
8. I'm sure you will find many discussions on it in the COMP.AI.NEURAL-NETS NEWSGROUP as well as any decent NN text.
Hope this helps.
Greg
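Regarding point 4, here is a rough sketch of how k-fold cross validation can carry a val subset as well. It shows one possible arrangement only, not necessarily the scheme used in the posted examples: in each round one fold is held out for testing, the following fold is used for validation stopping, and the remaining folds are used for training. The function name `kfold_with_validation` and the contiguous fold boundaries are assumptions made for illustration.

```python
def kfold_with_validation(n_samples, k):
    """Yield (train, val, test) index lists for each of the k rounds.

    Assumed arrangement: fold i is the test subset, fold (i + 1) % k is the
    validation subset, and all remaining folds form the training subset.
    """
    if k < 3:
        raise ValueError("k must be at least 3 to have train, val and test folds")
    # Contiguous folds of (nearly) equal size.
    bounds = [round(i * n_samples / k) for i in range(k + 1)]
    folds = [list(range(bounds[i], bounds[i + 1])) for i in range(k)]
    for i in range(k):
        test = folds[i]
        val = folds[(i + 1) % k]
        train = [idx for j, fold in enumerate(folds)
                 if j not in (i, (i + 1) % k) for idx in fold]
        yield train, val, test


splits = list(kfold_with_validation(10, 5))
for train, val, test in splits:
    print(train, val, test)
```

Every sample still ends up in the test subset exactly once, so the averaged test error keeps its usual k-fold meaning, while each round can still use validation stopping.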
R G
9 Aug 2017
Quite clear for me now, thank you!
ErikaZ
10 Jul 2018
Hi Greg, I am using DIVIDEBLOCK data division for my NARX net. Can you explain briefly why "val stopping is the most important for timeseries design using DIVIDEBLOCK data division"? Thanks.
Greg Heath
11 Jul 2018
Edited: Greg Heath
11 Jul 2018
Thanks for the heads up! Changed to:
OBVIOUSLY, the most UNBIASED approach for constant timestep timeseries prediction is to use DIVIDEBLOCK data division with the validation subset in the middle.
GREG
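To make the block layout concrete, here is a small Python sketch of a contiguous train/val/test split with the validation block in the middle, using the default 0.7/0.15/0.15 ratios mentioned in the answer. The function name `divide_block` and the rounding of the block sizes are assumptions for illustration; MATLAB's own DIVIDEBLOCK may round differently.

```python
def divide_block(n_samples, train_ratio=0.7, val_ratio=0.15, test_ratio=0.15):
    """Split indices 0..n_samples-1 into contiguous train, val, test blocks (in that order)."""
    total = train_ratio + val_ratio + test_ratio
    # Assumed rounding rule; the test block takes whatever is left over.
    n_train = round(n_samples * train_ratio / total)
    n_val = round(n_samples * val_ratio / total)
    train = list(range(0, n_train))
    val = list(range(n_train, n_train + n_val))
    test = list(range(n_train + n_val, n_samples))
    return train, val, test


train_idx, val_idx, test_idx = divide_block(100)
print(len(train_idx), len(val_idx), len(test_idx))  # 70 15 15
print(val_idx[0], val_idx[-1])                      # 70 84
```

For a time series with constant timestep, this keeps the time order intact: the net is trained on the earliest samples, validated on the block right after them, and tested on the most recent block.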
There is also a MathWorks article on this here: https://uk.mathworks.com/help/deeplearning/ug/train-and-apply-multilayer-neural-networks.html#bss331l-17
