Why are fitcsvm hyperparameters trained on the whole dataset and used for cross-validation?

Hi everyone,
I'm currently working with SVMs for data separation and I noticed something conspicuous in a MATLAB example. The example code goes as follows:
%data generation and plotting
% rng default % For reproducibility
grnpop = mvnrnd([1,0],eye(2),10);
redpop = mvnrnd([0,1],eye(2),10);
plot(grnpop(:,1),grnpop(:,2),'go')
hold on
plot(redpop(:,1),redpop(:,2),'ro')
hold off
redpts = zeros(100,2); grnpts = redpts;
for i = 1:100
    grnpts(i,:) = mvnrnd(grnpop(randi(10),:),eye(2)*0.02);
    redpts(i,:) = mvnrnd(redpop(randi(10),:),eye(2)*0.02);
end
figure
plot(grnpts(:,1),grnpts(:,2),'go')
hold on
plot(redpts(:,1),redpts(:,2),'ro')
hold off
cdata = [grnpts;redpts];
grp = ones(200,1);
% Green label 1, red label -1
grp(101:200) = -1;
%%%
%here starts the interesting part
%%%
% Set up a partition for cross-validation.
c = cvpartition(200,'KFold',10);
% Optimize the SVM hyperparameters using cdata and grp, i.e. all the data we have
opts = struct('Optimizer','bayesopt','ShowPlots',true,'CVPartition',c,...
    'AcquisitionFunctionName','expected-improvement-plus');
svmmod = fitcsvm(cdata,grp,'KernelFunction','rbf',...
    'OptimizeHyperparameters','auto','HyperparameterOptimizationOptions',opts)
% Calculate the loss using the same partitions, but with the hyperparameters
% from svmmod.HyperparameterOptimizationResults that were optimized on the whole dataset!
lossnew = kfoldLoss(fitcsvm(cdata,grp,'CVPartition',c,'KernelFunction','rbf',...
    'BoxConstraint',svmmod.HyperparameterOptimizationResults.XAtMinObjective.BoxConstraint,...
    'KernelScale',svmmod.HyperparameterOptimizationResults.XAtMinObjective.KernelScale))
The whole example can also be found in the MATLAB documentation.
As is already apparent from my comment lines in the code example, my problem is the following:
The hyperparameters were optimized using all the data we have, which means that they have already "seen" the test partitions of the cross-validation model and adapted to them during the optimization process. So this cross-validation does not validate on test data that is entirely new to the trained SVM model, and hence the cross-validation error should be artificially low in some cases. I have also run some experiments which seem to confirm this.
My question now is whether I have misunderstood something, and if not, why are the hyperparameters trained this way?
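Roughly, the kind of check I ran looks like the sketch below (the test-set variable names are mine, not from the example): draw a fresh sample from the same mixture distributions, train on all of cdata with the tuned hyperparameters, and score on the fresh data.
% Sketch of the check: build a fresh test set the same way cdata was built
grnTest = zeros(100,2); redTest = grnTest;
for i = 1:100
    grnTest(i,:) = mvnrnd(grnpop(randi(10),:),eye(2)*0.02);
    redTest(i,:) = mvnrnd(redpop(randi(10),:),eye(2)*0.02);
end
testData = [grnTest;redTest];
testGrp = ones(200,1);
testGrp(101:200) = -1; % green = 1, red = -1, as above
% Train on all of cdata with the optimized hyperparameters, then evaluate
% on data that neither the optimizer nor the model has ever seen
finalmod = fitcsvm(cdata,grp,'KernelFunction','rbf',...
    'BoxConstraint',svmmod.HyperparameterOptimizationResults.XAtMinObjective.BoxConstraint,...
    'KernelScale',svmmod.HyperparameterOptimizationResults.XAtMinObjective.KernelScale);
lossFresh = loss(finalmod,testData,testGrp) % compare against lossnew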

Accepted Answer

Alan Weiss on 23 Apr 2019
Perhaps I didn't explain well what the example is supposed to show. The second "fitting" step that you object to is not fitting anything at all, as you noticed. It is just the way I thought of to calculate the cross-validation loss using the hyperparameters that were already found. In the example I point out that the objective function value returned in the first fitting step is exactly the same as lossnew, and that is the point I was trying to make: you would never run the second "fit" in your own work, because it is entirely redundant.
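For instance, the number can be read directly off the stored optimization results, without any second fit (a minimal sketch; MinObjective is the cross-validated objective that bayesopt minimized):
results = svmmod.HyperparameterOptimizationResults; % BayesianOptimization object
results.MinObjective % same value as lossnew in the example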
Sorry that I confused you.
Alan Weiss
MATLAB mathematical toolbox documentation
3 Comments
Alan Weiss on 23 Apr 2019
Please carefully read the description of the Mdl output argument that fitcsvm returns. The returned svmmod is a ClassificationSVM object, not a ClassificationPartitionedModel, even though it was optimized using a cross-validation procedure, because the arguments to fitcsvm do not include an explicit cross-validation name-value pair. If you want to get the partitioned model back, well, you have to jump through some hoops, like I did in the example.
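A quick way to see the distinction (a minimal sketch using the variables from the example):
class(svmmod) % 'ClassificationSVM' -- tuned via cross-validation, but not partitioned
% Passing a cross-validation name-value pair directly changes the return type
cvmod = fitcsvm(cdata,grp,'CVPartition',c,'KernelFunction','rbf');
class(cvmod) % 'ClassificationPartitionedModel'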
Alan Weiss
MATLAB mathematical toolbox documentation
Patrick Schlegel on 25 Apr 2019
So the exact type of Mdl changes depending on the input, but the lines of code above already produce a fully trained, cross-validated model. This answers my question.
Thank you for your help!


More Answers (0)
