Machine learning and data normalization - how data should(?) be normalized.

Michał Kowalczyk on 4 Mar 2012
Answered: Mostafa Nakhaei on 18 Oct 2019
Hello, I have a general question about data normalization for classification algorithms: if I have a training set and a testing set, should I normalize them separately or join them for the normalization step? And what if I later want to use this classifier on a totally new batch of data? Should I keep the extreme values of each feature and use them for normalization?
My second question: is normalization really necessary? Does an SVM need it?
Thank you in advance for any help. Cheers, Michael

Answers (2)

Mostafa Nakhaei on 18 Oct 2019
Please note that the best practice in machine learning is to keep the distributions of the training and testing data the same. So, if you want to normalize your data, it is good to normalize the whole dataset first and then split it. That way, your training and testing sets will have the same distribution. A common error is to split the data and then normalize each part individually.
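
To make the "common error" concrete, here is a minimal sketch, assuming min-max scaling (the answer does not say which normalization is meant). The helper names `min_max_params` and `min_max_apply` and the one-feature data are hypothetical. It shows that when each subset is scaled with its own minimum and maximum, the same raw value ends up at different normalized positions, whereas a single shared set of bounds maps it consistently.

```python
def min_max_params(values):
    """Return the (min, max) pair that defines a min-max scaling."""
    return min(values), max(values)


def min_max_apply(values, lo, hi):
    """Scale values relative to the given lo/hi bounds (lo -> 0, hi -> 1)."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical one-feature data; the raw value 10.0 appears in both subsets.
train = [0.0, 10.0, 20.0]
test = [10.0, 15.0, 30.0]

# Separate normalization: each subset uses its own min and max.
train_sep = min_max_apply(train, *min_max_params(train))
test_sep = min_max_apply(test, *min_max_params(test))

# Shared normalization: one min and max used for both subsets.
lo, hi = min_max_params(train + test)
train_shared = min_max_apply(train, lo, hi)
test_shared = min_max_apply(test, lo, hi)

print(train_sep[1], test_sep[0])        # raw 10.0 under separate scaling
print(train_shared[1], test_shared[0])  # raw 10.0 under shared scaling
```

With separate scaling the raw value 10.0 becomes 0.5 in the training subset but 0.0 in the testing subset, so a classifier would see two different inputs for the same measurement. With shared bounds both copies get the same normalized value.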

BERGHOUT Tarek on 3 Feb 2019
1- You can normalize them separately or together, but the best way is to normalize them inside the training function; if you put the normalization function inside the training function, you can use it for any dataset after that.
2- Yes, normalization is necessary whenever the activation functions of your model are bounded; otherwise you don't have to normalize the data.
And for an SVM, if the kernel function is bounded, you must normalize your data.
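
One way to read point 1 is that the training function computes the normalization parameters, stores them with the model, and reuses them on any later data. The sketch below illustrates that idea under stated assumptions: min-max scaling is assumed, all function names (`fit_scaler`, `scale`, `train`, `predict`) and the data are hypothetical, and a nearest-centroid classifier stands in for the real model (it is not an SVM).

```python
def fit_scaler(rows):
    """Compute per-feature (min, max) from the training rows."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]


def scale(row, params):
    """Min-max scale one sample with stored parameters.

    Constant features (max == min) are mapped to 0.0 to avoid division by zero.
    """
    return [
        (x - lo) / (hi - lo) if hi > lo else 0.0
        for x, (lo, hi) in zip(row, params)
    ]


def train(rows, labels):
    """Normalize inside training and keep the parameters with the model."""
    params = fit_scaler(rows)
    scaled = [scale(r, params) for r in rows]
    # Placeholder model: one centroid (mean point) per class.
    centroids = {}
    for label in set(labels):
        members = [s for s, l in zip(scaled, labels) if l == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return {"params": params, "centroids": centroids}


def predict(model, row):
    """Scale a new sample with the stored parameters, then classify it."""
    x = scale(row, model["params"])

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(model["centroids"], key=lambda lbl: sq_dist(model["centroids"][lbl]))


# Hypothetical two-feature data on very different scales.
X_train = [[1.0, 1000.0], [2.0, 1100.0], [8.0, 5000.0], [9.0, 5200.0]]
y_train = ["a", "a", "b", "b"]

model = train(X_train, y_train)
print(predict(model, [1.5, 1050.0]))
print(predict(model, [8.5, 5100.0]))
```

Because the stored minimum and maximum come from the training data only, a new sample that falls outside the training range is scaled to a value below 0 or above 1 rather than being rescaled to fit; this is the sense in which the "extreme values of each feature" from the question are kept and reused.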
