Best way to speed up (parallelize) a large function that takes three large 3D arrays as input

I have a function that I generated: the double integral, in x and y, of another function, evaluated via the trapezoidal rule using 1000 segments. This very large function is saved to a file, and the inputs I'm now feeding it are 3D arrays. Right now, with the arrays I'm using, the computation takes roughly a minute, and in the future I'd like to pass it far larger arrays. I've been trying to find a way to speed this up using parallel processing. The issue is that every way I've tried to parallelize it has slowed it down. What I start with is:
s = 10;
xi = -1:2/(s-1):1;
[xi1,xi2,xi3] = ndgrid(xi,xi,xi);
I've tried the following things:
spmd with distributed()
XI1 = distributed(xi1);
XI2 = distributed(xi2);
XI3 = distributed(xi3);
spmd
Z = myfunc(XI1,XI2,XI3);
end
However this made the processing take roughly 30 minutes.
spmd with codistributed()
spmd
XI1 = codistributed(xi1);
XI2 = codistributed(xi2);
XI3 = codistributed(xi3);
Z = myfunc(XI1,XI2,XI3);
end
Z = gather(Z);
This made the computation take roughly 40 minutes.
parfor loop with mat2tiles()
XI1 = mat2tiles(xi1,[2,2,2]);
XI2 = mat2tiles(xi2,[2,2,2]);
XI3 = mat2tiles(xi3,[2,2,2]);
Z = mat2tiles(zeros(s,s,s),[2,2,2]);
N = numel(XI1);
parfor i=1:N
Z{i} = myfunc(XI1{i},XI2{i},XI3{i});
end
This took about 6 minutes to run. mat2tiles() is available on the File Exchange: MAT2TILES: divide array into equal-sized sub-arrays - File Exchange - MATLAB Central (mathworks.com)
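With s = 10 and 2x2x2 tiles, the loop above creates 125 very small tasks, so per-task overhead may dominate. A minimal sketch of a coarser split, assuming myfunc acts element-wise on its inputs (not confirmed in the post), is to slice along the third dimension with plain indexing. The anonymous myfunc below is a hypothetical stand-in for the real generated function.

```matlab
% Hypothetical stand-in for the generated function (assumption: element-wise).
myfunc = @(a,b,c) a.^2 + b.*c;

s = 10;
xi = -1:2/(s-1):1;
[xi1,xi2,xi3] = ndgrid(xi,xi,xi);

Z = zeros(s,s,s);
parfor k = 1:s
    % xi1, xi2, xi3 and Z are sliced on k, so each worker only
    % receives the slabs it needs. feval avoids parfor treating the
    % function handle as an indexed variable.
    Z(:,:,k) = feval(myfunc, xi1(:,:,k), xi2(:,:,k), xi3(:,:,k));
end
```

If myfunc is a regular function file rather than a handle, calling it directly inside the loop works the same way.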
parfeval
work = parfeval(@myfunc,1,xi1,xi2,xi3); % second argument: number of outputs requested
Z = fetchOutputs(work);
I eventually cancelled this after waiting over an hour for fetchOutputs to finish.
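A single parfeval call runs the whole computation on one worker, so it cannot be faster than the serial version. A rough sketch of splitting the work into one future per chunk, again assuming myfunc is element-wise and using a hypothetical stand-in for it, might look like this:

```matlab
% Hypothetical stand-in for the generated function (assumption: element-wise).
myfunc = @(a,b,c) a.^2 + b.*c;

s = 10;
xi = -1:2/(s-1):1;
[xi1,xi2,xi3] = ndgrid(xi,xi,xi);

pool = gcp;                                 % start or reuse a parallel pool
nChunks = pool.NumWorkers;                  % one chunk per worker
edges = round(linspace(0, s, nChunks+1));   % chunk boundaries along dim 3

F(1:nChunks) = parallel.FevalFuture;        % preallocate the futures
for c = 1:nChunks
    k = edges(c)+1:edges(c+1);
    F(c) = parfeval(pool, myfunc, 1, xi1(:,:,k), xi2(:,:,k), xi3(:,:,k));
end

Z = zeros(s,s,s);
for n = 1:nChunks
    [c, zc] = fetchNext(F);                 % c is the index of the finished future
    Z(:,:,edges(c)+1:edges(c+1)) = zc;
end
```

fetchNext returns results in completion order, which is why the chunk index c is used to place each slab.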
I'm not too skilled with parallelization, but I imagine there must be a faster way to have my different workers work on different parts of my array inputs. This is what I thought distributed() and codistributed() did, but they took far too much extra time for it to be that simple.
Comments: 3
David Gillcrist, November 4, 2022
Sorry, I mixed up my functions. I'm fixing the original post.
Matt J, November 4, 2022 (edited November 4, 2022)
How long does it take to process a single triplet xi1,xi2,xi3 without parallelization? How many workers, N, are you using?
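One way to collect these numbers, sketched here with a hypothetical element-wise stand-in for myfunc (the real function is not shown in the post):

```matlab
% Hypothetical stand-in for the generated function.
myfunc = @(a,b,c) a.^2 + b.*c;

s = 10;
xi = -1:2/(s-1):1;
[xi1,xi2,xi3] = ndgrid(xi,xi,xi);

% Time one scalar triplet, then the full arrays, without any parallelization.
tic; z1 = myfunc(xi1(1), xi2(1), xi3(1)); tSingle = toc;
tic; Z  = myfunc(xi1, xi2, xi3);          tFull   = toc;

% Number of workers in the current pool (0 if no pool is open).
pool = gcp('nocreate');
if isempty(pool)
    nWorkers = 0;
else
    nWorkers = pool.NumWorkers;
end

fprintf('single triplet: %.3g s, full arrays: %.3g s, workers: %d\n', ...
        tSingle, tFull, nWorkers);
```

Comparing the per-triplet time multiplied by numel(xi1) against the full-array time gives a rough idea of how much the function already benefits from vectorization.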

Answers (1)

Matt J, November 4, 2022 (edited November 4, 2022)
I think you need to extract the local part, like in the following:
XI1 = distributed(xi1);
XI2 = distributed(xi2);
XI3 = distributed(xi3);
glp=@getLocalPart;
spmd
Z = myfunc(glp(XI1),glp(XI2),glp(XI3));
end
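After the spmd block above, Z is a Composite holding one local result per worker. A possible way to reassemble the full array, assuming myfunc is element-wise so each local result has the same size as the local input part, is to rebuild a codistributed array from the local pieces and gather it. The anonymous myfunc here is a hypothetical stand-in.

```matlab
% Hypothetical stand-in for the generated function (assumption: element-wise).
myfunc = @(a,b,c) a.^2 + b.*c;

s = 10;
xi = -1:2/(s-1):1;
[xi1,xi2,xi3] = ndgrid(xi,xi,xi);

XI1 = distributed(xi1);
XI2 = distributed(xi2);
XI3 = distributed(xi3);

spmd
    % Each worker evaluates myfunc on its own ordinary (non-distributed) piece.
    zLocal = feval(myfunc, getLocalPart(XI1), getLocalPart(XI2), getLocalPart(XI3));
    % Reuse the input's distribution scheme to describe the output.
    ZD = codistributed.build(zLocal, getCodistributor(XI1));
end

Z = gather(ZD);   % full s-by-s-by-s array on the client
```

Working on local parts means myfunc operates on regular arrays on each worker, which may avoid the overhead of calling it on codistributed inputs directly.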

Release: R2022a