SVM optimizer fails with large matrix
I am trying to optimize a binary SVM, but when I use all the data (approximately 10 GB), the optimization figures show up and then nothing happens, even after several days. The computer has enough memory (about 25% usage) and CPU (about 20% usage), so it is not running out of resources. I suspect it has entered some sort of internal loop, because when I use only 1% of the data, it works fine. Is there some step I'm missing? Thanks for your time.
cvp = cvpartition(response, 'Holdout', 0.2);
trainingPredictors = predictors(cvp.training, :);
trainingResponse = response(cvp.training, :);
validationPredictors = predictors(cvp.test, :);
validationResponse = response(cvp.test, :);
% Train model
rng default
MdlT = fitcsvm(trainingPredictors, ...
    trainingResponse, ...
    'KernelFunction','rbf', ...
    'OptimizeHyperparameters','auto', ...
    'HyperparameterOptimizationOptions', ...
    struct('AcquisitionFunctionName','expected-improvement-plus'))
Answers (1)
Don Mathis
10 Jan 2020
My guess is that it is trying to fit an SVM using the first set of hyperparameters it tried, and that fit is taking a long time. Try putting 'Verbose',2 inside the struct, and it will display the hyperparameters it is about to try next. Then you can pass those values explicitly and see whether the fit takes a long time with them.
For example:
>> load ionosphere
>> rng(0);
>> fitcsvm(X,Y,'OptimizeHyperparameters','auto','HyperparameterOptimizationOptions',struct('Verbose',2,'Holdout',.2))
Output:
Performing function evaluation on point: 64.836 | 0.0015729 |
Time to select the next point: 0.09863
Time to fit the model(s): 0.05019
|=====================================================================================================|
| Iter | Eval | Objective | Objective | BestSoFar | BestSoFar | BoxConstraint| KernelScale |
| | result | | runtime | (observed) | (estim.) | | |
|=====================================================================================================|
| 1 | Best | 0.38571 | 3.539 | 0.38571 | 0.38571 | 64.836 | 0.0015729 |
Performing function evaluation on point: 0.036335 | 5.5755 |
Time to select the next point: 0.096786
Time to fit the model(s): 0.034137
| 2 | Best | 0.35714 | 0.034081 | 0.35714 | 0.35892 | 0.036335 | 5.5755 |
Performing function evaluation on point: 0.0022147 | 0.0023957 |
Notice that the first iteration took about 100 times longer than the second (3.539 s versus 0.034 s), because of the hyperparameter values used. You can then run those values manually to check the runtime by itself:
>> tic; fitcsvm(X,Y,'BoxConstraint',64.836,'KernelScale',.0015729,'Holdout',.2); toc
Elapsed time is 3.365133 seconds.
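Applied to the call in the question, the change might look like the sketch below. It uses the ionosphere sample data as a stand-in for the asker's predictors and response. MaxObjectiveEvaluations and ShowPlots are added here only to keep the run short and are not part of the original call.

```matlab
% Sketch: the question's fitcsvm call with 'Verbose',2 added to the options struct.
load ionosphere            % stand-in data: X (predictors), Y (response)
rng default                % reproducible search, as in the question

opts = struct( ...
    'AcquisitionFunctionName', 'expected-improvement-plus', ...
    'Verbose', 2, ...                   % print each point before it is evaluated
    'MaxObjectiveEvaluations', 5, ...   % assumption: short run for illustration only
    'ShowPlots', false);                % assumption: suppress figures for this sketch

Mdl = fitcsvm(X, Y, ...
    'KernelFunction', 'rbf', ...
    'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', opts);
```

If the last line printed is "Performing function evaluation on point: ..." and nothing follows for a long time, those are the values to time separately.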
1 Comment
Don Mathis
10 Jan 2020
If that is the problem, a solution is to limit the ranges of the hyperparameters: disallow very small values of KernelScale and very large values of BoxConstraint. It's a bit complicated, but you do it as follows. Look at 'params' for your own case first, because the default ranges depend on an analysis of your data.
>> params = hyperparameters('fitcsvm', X, Y);
>> params(1).Range = [.001 10];
>> params(2).Range = [.1 100];
>> rng(0); fitcsvm(X,Y,'OptimizeHyperparameters',params,'HyperparameterOptimizationOptions',struct('Verbose',2,'Holdout',.2))
Performing function evaluation on point: 1.6139 | 0.12541 |
Time to select the next point: 0.095223
Time to fit the model(s): 0.047184
|=====================================================================================================|
| Iter | Eval | Objective | Objective | BestSoFar | BestSoFar | BoxConstraint| KernelScale |
| | result | | runtime | (observed) | (estim.) | | |
|=====================================================================================================|
| 1 | Best | 0.1 | 0.24221 | 0.1 | 0.1 | 1.6139 | 0.12541 |
Performing function evaluation on point: 0.01097 | 7.4669 |
Time to select the next point: 0.21631
Time to fit the model(s): 0.036374
| 2 | Accept | 0.35714 | 0.036359 | 0.1 | 0.11601 | 0.01097 | 7.4669 |
Performing function evaluation on point: 0.0016991 | 0.15478 |
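Because the default ranges in 'params' depend on the data, it may be safer to list the variables first and set ranges by name rather than by index. A minimal sketch, again using the ionosphere data; the range values are the ones from the example above and are not a recommendation for other data sets:

```matlab
load ionosphere
params = hyperparameters('fitcsvm', X, Y);

% List each variable with its default range and whether it is optimized.
for k = 1:numel(params)
    r = params(k).Range;
    if isnumeric(r)
        rangeText = sprintf('[%g %g]', r(1), r(end));
    else
        rangeText = strjoin(cellstr(r), ', ');   % categorical variables
    end
    fprintf('%d: %-16s %-30s Optimize=%d\n', ...
        k, params(k).Name, rangeText, params(k).Optimize);
end

% Set the ranges by variable name instead of relying on positions 1 and 2.
names = string({params.Name});
params(names == "BoxConstraint").Range = [0.001 10];
params(names == "KernelScale").Range   = [0.1 100];
```

The modified 'params' array can then be passed as the value of 'OptimizeHyperparameters', as in the call above.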