Optimizing several Gaussian Process models in parallel

Robert on 25 Apr 2022
Answered: Ayush Anand on 20 Oct 2023
Dear community,
I have 80 datasets with 5000 data points each, and I want to fit a GP to each of them. Is there a good way to parallelize this (HPC user here)?
My initial plan is roughly the following (it will run inside a batch script):
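p = gcp();  % parallel pool; p was not defined in the original snippet, so gcp is an assumption
for i = 1:80
    % queue one fitrgp call per dataset and keep the future so the model can be fetched later
    f(i) = parfeval(p, @fitrgp, 1, X{i}, Y{i}, ...
        'OptimizeHyperparameters', 'all', ...
        'HyperparameterOptimizationOptions', ...
        struct( ...
            'MaxObjectiveEvaluations', 500, ...
            'Optimizer', 'bayesopt', ...
            'Verbose', 0, ...
            'MaxTime', 60*60, ...
            'Repartition', true, ...
            'UseParallel', true, ...
            'Kfold', 15));
end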
% read results after it's finished
How does the 'UseParallel' option scale? Does it take effect at all if I let it run on a single worker? Is there any way to have multiple workers working on one fitrgp evaluation?
Best regards and thank you,
Robert
PS: I have up to ~500 cores available and I have up to 4 predictors.
Comments: 1
Robert on 25 Apr 2022
Which submit arguments for a batch job would make sense and get the most out of our computing resources?
--ntasks=81 --cpus-per-task=5?


Answers (1)

Ayush Anand on 20 Oct 2023
Hi Robert,
I understand you are trying to fit several Gaussian Process models in parallel, and that you want to know more about the "UseParallel" argument and whether several workers can work on one fitrgp evaluation.
You can use parallel computing to speed up fitting a Gaussian Process (GP) to multiple datasets. The UseParallel option in the HyperparameterOptimizationOptions of MATLAB's fitrgp function parallelizes the hyperparameter optimization: per the documentation, it runs the Bayesian optimization in parallel. It does not parallelize the fitting of a single GP model.
Here's how it works:
  • When you set UseParallel to true, MATLAB evaluates several candidate hyperparameter settings simultaneously on the workers of the pool. Each worker runs the complete cross-validated fit for the candidate assigned to it.
  • The UseParallel option has no effect when fitrgp itself runs on a single worker, as it does inside parfeval. A worker cannot open a pool of its own, so the optimization runs serially there. The option is specifically designed to take advantage of multiple workers.
  • You can't split a single model fit across multiple workers. The UseParallel option only parallelizes the hyperparameter search within a single fitrgp call, not the fitting process itself.
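Since UseParallel does nothing inside a worker, one way to use many cores is to parallelize across datasets instead: one parfeval call per dataset, with the hyperparameter search kept serial inside each call. The following is a minimal sketch of that idea, not a tuned setup. The synthetic data, the reduced sizes (4 datasets, 200 points, 10 evaluations, 5 folds), and 'OptimizeHyperparameters', 'auto' are placeholder assumptions that keep the example quick; substitute your own X, Y, and the settings from the question.

```matlab
% Sketch: one worker per dataset, serial hyperparameter search inside each worker.
% All sizes and the synthetic data are illustrative assumptions, not the question's values.
nSets = 4;                                   % 80 in the question
X = cell(1, nSets);
Y = cell(1, nSets);
rng(0);                                      % reproducible placeholder data
for i = 1:nSets
    X{i} = rand(200, 4);                     % 4 predictors, as in the question
    Y{i} = sin(sum(X{i}, 2)) + 0.05*randn(200, 1);
end

opts = struct( ...
    'MaxObjectiveEvaluations', 10, ...       % 500 in the question
    'Optimizer', 'bayesopt', ...
    'Verbose', 0, ...
    'ShowPlots', false, ...                  % no figures on workers
    'Kfold', 5, ...                          % 15 in the question
    'UseParallel', false);                   % a worker has no pool of its own

p = gcp();                                   % start or reuse a parallel pool
f(1:nSets) = parallel.FevalFuture;           % preallocate the array of futures
for i = 1:nSets
    f(i) = parfeval(p, @fitrgp, 1, X{i}, Y{i}, ...
        'OptimizeHyperparameters', 'auto', ...
        'HyperparameterOptimizationOptions', opts);
end

% Collect models in completion order; idx maps each result back to its dataset.
models = cell(1, nSets);
for k = 1:nSets
    [idx, mdl] = fetchNext(f);               % blocks until the next fit finishes
    models{idx} = mdl;
end
```

With 80 datasets this keeps up to 80 workers busy at once. The alternative is to loop over the datasets on the client and set 'UseParallel' to true, so that each fitrgp call spreads its search over the whole pool; which of the two is faster for your data is something to measure rather than assume.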
For more information on the "fitrgp" function and the "UseParallel" argument, you can refer to the fitrgp documentation page.
I hope this helps!
