How to do cross-validation with PLS feature extraction before SVM?

Juuso Korhonen on 17 Jun 2021
Commented: Rishik Ramena on 30 Aug 2021
Hi,
I would like to know the best way to do cross-validation with a pipeline where PLS feature extraction is done before fitting an SVM. Here is my current code:
% Hold-out split (train: 80%, test: 20%)
rng default;
cv = cvpartition(size(X,1),'HoldOut',0.2); % the 'HoldOut' value is the test fraction
idx = cv.test;
% Split into training and test data
XTrain = X(~idx,:);
YTrain = Y(~idx, :);
XTest = X(idx,:);
YTest = Y(idx, :);
n_components = 10; % We should optimize this
[XL,YL,XS,YS,beta,PCTVAR,MSE,stats] = plsregress(XTrain,YTrain,n_components);
W = stats.W; % PLS weights, so that XS = X0 * W
SVMModel = fitcsvm(XS,YTrain,'Standardize',false,'KernelFunction','rbf',...
'KernelScale','auto'); % I would like to have parameter optimization here
% PLS centers the data: X0 = X - mean(X), and XS = X0 * W.
% Project the test data using the training mean and the training weights.
XS_test = (XTest - mean(XTrain,1)) * W;
YPred = predict(SVMModel, XS_test);
accuracy = mean(YPred == YTest)
Using fitcsvm(..., 'OptimizeHyperparameters', 'all') isn't suitable here: plsregress is fitted on the whole of XTrain to get XS, so information leaks between the k folds during the optimization. Is there a hyperparameter optimization function in MATLAB where I could use the whole PLS+SVM pipeline as the fitting function?
Comments: 1
Rishik Ramena on 30 Aug 2021
Yes, your analysis is correct. The hyperparameter optimization built into fitcsvm isn't suitable here because of the information leakage between the k folds. There are no built-in hyperparameter optimization functions in MATLAB that can be applied directly to the whole PLS+SVM setup.
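One way to avoid the leakage is to write the k-fold loop by hand, so that plsregress is refitted on each training fold and the validation fold is only projected with that fold's mean and weights. The following is a minimal sketch under assumptions, not a definitive implementation: the synthetic XTrain and YTrain are hypothetical stand-ins for the variables in the question, and the grids ncompGrid and boxGrid, the fold count K, and the choice to tune BoxConstraint are illustrative and do not come from the original post.

```matlab
% Hypothetical stand-in data so the sketch runs on its own;
% in the question's setting, use the real XTrain and YTrain instead.
rng default;
XTrain = randn(200, 30);
YTrain = double(XTrain(:,1) + 0.5*XTrain(:,2) + 0.3*randn(200,1) > 0); % labels 0/1

K = 5;                                   % number of folds (assumption)
cvo = cvpartition(YTrain, 'KFold', K);   % stratified k-fold partition
ncompGrid = [2 5 10];                    % candidate numbers of PLS components
boxGrid   = [0.1 1 10];                  % candidate SVM box constraints
cvErr = zeros(numel(ncompGrid), numel(boxGrid));

for a = 1:numel(ncompGrid)
    for b = 1:numel(boxGrid)
        nWrong = 0;
        for k = 1:K
            tr = training(cvo, k);       % logical index of the training fold
            va = test(cvo, k);           % logical index of the validation fold
            % Fit PLS on the training fold only, so nothing leaks from va
            [~,~,XS,~,~,~,~,stats] = plsregress(XTrain(tr,:), YTrain(tr), ncompGrid(a));
            % Center with the training-fold mean, then apply the PLS weights
            XSva = (XTrain(va,:) - mean(XTrain(tr,:), 1)) * stats.W;
            mdl = fitcsvm(XS, YTrain(tr), 'Standardize', false, ...
                'KernelFunction', 'rbf', 'KernelScale', 'auto', ...
                'BoxConstraint', boxGrid(b));
            nWrong = nWrong + sum(predict(mdl, XSva) ~= YTrain(va));
        end
        % Misclassification rate pooled over all validation folds
        cvErr(a, b) = nWrong / numel(YTrain);
    end
end

% Pick the grid point with the lowest cross-validated error
[bestErr, idx] = min(cvErr(:));
[ia, ib] = ind2sub(size(cvErr), idx);
fprintf('Best: %d components, BoxConstraint %g, CV error %.3f\n', ...
    ncompGrid(ia), boxGrid(ib), bestErr);
```

The selected pair can then be used to refit PLS and the SVM on all of XTrain and to score XTest once, as in the question. The same fold loop could presumably also be wrapped in a function handle and passed to bayesopt instead of a grid, though that is not shown here.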

Release: R2021a