Improving performance of parallel code sections

A.B. on 31 Jan 2024
Edited: Matt J on 1 Feb 2024
Consider the following test code which benchmarks different methods of parallelization within MATLAB:
function partest()
    delete(gcp("nocreate"));
    pool = parpool("threads", 3);
    njob = pool.NumWorkers;
    ndim = 10000000;
    state = rand(ndim, njob);
    dummy = 0;
    % parfor
    tic
    func = zeros(njob, 1);
    parfor ijob = 1 : njob
        func(ijob) = getFunc(state(:, ijob));
    end
    disp(" parfor = " + toc + " seconds.")
    dummy = dummy + sum(func);
    % spmd
    tic
    func = zeros(njob, 1);
    spmd
        indx = spmdIndex;
        func(indx) = getFunc(state(:, indx));
    end
    disp(" spmd = " + toc + " seconds.")
    %dummy = dummy + sum([func{:}]);
    % parfeval
    tic
    clear func
    fout(1 : njob) = parallel.FevalFuture;
    for ijob = 1 : njob
        fout(ijob) = parfeval(pool, @getFunc, 1, state(:, ijob));
    end
    func = fetchOutputs(fout);
    disp("parfeval = " + toc + " seconds.")
    dummy = dummy + sum(func);
    % serial
    tic
    func = zeros(njob, 1);
    for ijob = 1 : njob
        func(ijob) = getFunc(state(:, ijob));
    end
    disp(" serial = " + toc + " seconds.")
    dummy = dummy + sum(func);
    disp(dummy)
    delete(gcp("nocreate"));
end

function func = getFunc(state)
    %func = -sum(state.^2);
    func = 0;
    for idim = 1 : length(state)
        func = func - state(idim)^2;
    end
end
Here is the benchmark output:
parfor = 0.092802 seconds.
spmd = 0.044583 seconds.
parfeval = 0.032457 seconds.
serial = 0.050159 seconds.
Firstly, why does parfeval outperform all the others?
Secondly, is there anything that could be done to any of these parallel constructs to improve their performance relative to the last (serial) case?
Comments: 2
Edric Ellis on 1 Feb 2024
Can I ask
  • What OS are you using?
  • Which release of MATLAB?
  • How many physical cores do you have, and what is the result of maxNumCompThreads?
A.B. on 1 Feb 2024
WSL, 2023b, 12 per cpu, 12


Answers (1)

Matt J on 31 Jan 2024
Edited: Matt J on 31 Jan 2024
The task is too small for parallelization to have any meaningful effect. Similarly, the relative performance numbers for the different methods are not meaningful. You need a task that takes at least 30 seconds serially for any of the comparisons to make sense.
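One way to make such timings less sensitive to one-off startup costs is to run an untimed warm-up pass and then repeat each measurement several times. The following is only a minimal sketch under that assumption; repeatTiming and sumSquares are hypothetical names that do not appear in the thread, and the problem size is deliberately small so the sketch runs quickly (increase it for a real comparison).

```matlab
function [tPar, tSer] = repeatTiming(nrep)
% Sketch: time parfor and a serial loop nrep times each, after one warm-up pass.
% repeatTiming and sumSquares are hypothetical names, not from the thread.
pool = gcp("nocreate");
if isempty(pool)
    pool = parpool("threads");      % start a thread pool only if none is running
end
njob  = pool.NumWorkers;
state = rand(1e6, njob);            % one column of work per worker
tPar  = zeros(nrep, 1);
tSer  = zeros(nrep, 1);
for irep = 0 : nrep                 % irep == 0 is the untimed warm-up pass
    func = zeros(njob, 1);
    tstart = tic;
    parfor ijob = 1 : njob
        func(ijob) = sumSquares(state(:, ijob));
    end
    if irep > 0
        tPar(irep) = toc(tstart);
    end

    func = zeros(njob, 1);
    tstart = tic;
    for ijob = 1 : njob
        func(ijob) = sumSquares(state(:, ijob));
    end
    if irep > 0
        tSer(irep) = toc(tstart);
    end
end
end

function s = sumSquares(x)
% Same kind of explicit loop as getFunc in the question.
s = 0;
for k = 1 : length(x)
    s = s - x(k)^2;
end
end
```

Calling [tPar, tSer] = repeatTiming(5) and comparing median(tPar) with median(tSer) gives a steadier picture than a single tic/toc pair, although it does not replace testing with a workload of realistic size.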
Comments: 9
A.B. on 1 Feb 2024
No, virtually. That was the response to the strangeness issue I mentioned and you disregarded as normal. The real issue that prompted this post is the performance difference between the constructs, which of course diminishes with workload size. I just moved on with what's pragmatic. I'd appreciate your explanation of the phenomenon if you have one.
Matt J on 1 Feb 2024
Edited: Matt J on 1 Feb 2024
What about my earlier comment about having more than one loop iteration per worker? It makes intuitive sense to me that parfeval would be optimal when you have only one function call per worker. Otherwise, if parfor were optimal for everything, why would they provide parfeval?
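A test with more than one task per worker could look like the sketch below. It is an illustration only: manyTasksPerWorker and negSumSquares are hypothetical names, the sizes are arbitrary assumptions, and the timings it produces will depend on the machine and release.

```matlab
function [tParfor, tParfeval] = manyTasksPerWorker(tasksPerWorker)
% Sketch: give every worker several tasks, then time parfor and parfeval.
% manyTasksPerWorker and negSumSquares are hypothetical names.
pool = gcp("nocreate");
if isempty(pool)
    pool = parpool("threads");
end
ntask = tasksPerWorker * pool.NumWorkers;
state = rand(1e5, ntask);           % one column per task (size is an assumption)

% parfor: MATLAB decides how iterations are grouped and sent to workers
tstart = tic;
resA = zeros(ntask, 1);
parfor itask = 1 : ntask
    resA(itask) = negSumSquares(state(:, itask));
end
tParfor = toc(tstart);

% parfeval: one explicitly queued future per task
tstart = tic;
fout(1 : ntask) = parallel.FevalFuture;
for itask = 1 : ntask
    fout(itask) = parfeval(pool, @negSumSquares, 1, state(:, itask));
end
resB = fetchOutputs(fout);          % scalar outputs are stacked into a column
tParfeval = toc(tstart);

% Both constructs should produce the same values.
assert(max(abs(resA - resB)) < 1e-6)
end

function s = negSumSquares(x)
% Same kind of explicit loop as getFunc in the question.
s = 0;
for k = 1 : length(x)
    s = s - x(k)^2;
end
end
```

For example, [tParfor, tParfeval] = manyTasksPerWorker(8) queues eight tasks per worker, so the comparison no longer rests on a single function call per worker.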

Release: R2023b
