Distributed array to solve a system of linear equations on a cluster

2 views (last 30 days)
Melissa Zirps on 2 Dec 2021
Commented: Oli Tissot on 12 Jan 2023
I'm trying to solve a system of linear equations in parallel on a computing cluster using iterative methods and distributed arrays. Right now my code looks like:
cores = 42;                                          % number of workers to request
cluster = parcluster;                                % load the default cluster profile
parpool(cluster,cores);                              % open a parallel pool on the cluster
K_solve_dist = distributed(K_solve);                 % spread the system matrix across the workers
force_vec_solv_dist = distributed(force_vec_solv);   % spread the right-hand side across the workers
[res_disp, flag_solv{ii,kk}(n,1)] = cgs(K_solve_dist,force_vec_solv_dist,tol_iter,max_iter);
However, regardless of how many cores I use, the run time stays the same (it is also the same as when I don't use distributed arrays at all). If I run it without the line "parpool(cluster,cores)" it runs almost 50% faster, but only uses 12 cores, even though more cores are available. I'm trying to figure out whether there's a way to use more than 12 cores and speed up the time it takes to perform this calculation.

Answers (2)

Sam Marshalik on 7 Dec 2021
Hey Melissa,
I would not think of distributed arrays as a way to speed up your computation. Distributed arrays are useful when something does not fit into your machine's memory, so you spread the contents of the matrix across multiple machines. This will not make the code run faster and, as you saw, will probably make it slower, since you are introducing communication overhead into the equation.
It is worth pointing out that using distributed arrays on a single machine will not give you any benefit, since you are still limited to that one computer's hardware. If you have access to MATLAB Parallel Server on your cluster, then using distributed arrays with your computation will be helpful.
In short, you will truly see the benefit of distributed arrays when you are working with data so large that it cannot fit on one machine. If you want to speed things up, take a look at parallel constructs such as parfor, parfeval, and gpuArray.
  2 Comments
Melissa Zirps on 7 Dec 2021
Hi Sam,
Thanks for your response. However, as far as I can see, parfor, parfeval, and gpuArray can't be used to solve a system of linear equations? Is there a way to use parallelization while solving a system of linear equations?
Joss Knight on 10 Dec 2021
gpuArray supports all the iterative solvers, including cgs. However, these are mainly optimized for sparse matrices. If your matrix is dense you'll be better off using a direct solver (i.e. mldivide), which is of course also supported for gpuArray.
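As a rough illustration of this suggestion (a minimal sketch only; it reuses the variable names from the original question, and whether the GPU path pays off depends on your problem's size and sparsity):
% Iterative solve on the GPU -- cgs accepts gpuArray inputs
K_gpu = gpuArray(K_solve);          % K_solve assumed sparse here
b_gpu = gpuArray(force_vec_solv);
[x_gpu, flag] = cgs(K_gpu, b_gpu, tol_iter, max_iter);
x = gather(x_gpu);                  % copy the solution back to host memory
% For a dense matrix, a direct solve on the GPU may be faster:
% x = gather(gpuArray(K_solve) \ gpuArray(force_vec_solv));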



Eric Machorro on 11 Jan 2023
Piggy-backing on this question:
Setting aside the speed-up factor momentarily, how can I use CGS (or almost any Krylov-type solver, for that matter) with very big/long vectors which I can't hold all in memory? There are four variants to the problem:
  1. I have a sparse matrix
  2. I have a symmetric, sparse matrix (think Cholesky factor)
  3. I have a function handle that serves as the matrix operator
  4. (revisiting the speed-up issue) I'd like to use this also in conjunction with non-GPU parallelization. Is this possible?
Does anyone have advice on any one of these?
Respectfully,
  1 Comment
Oli Tissot on 12 Jan 2023
All the Krylov methods are supported for distributed arrays, so 1. and 3. work just as they do in standard MATLAB; you can also supply your own preconditioner through a function handle. However, 2. is not supported as such, because MATLAB has no built-in notion of a "symmetric matrix": a Cholesky factor is treated as a triangular matrix, not a symmetric one, so where there is an ambiguity between symmetric and triangular MATLAB resolves it in favor of triangular. Of course, 3. is more general than 2., so you can achieve 2. via 3. by implementing the operator yourself, if that makes sense.
Regarding 4., distributed arrays are multi-threaded: they will use NumThreads as set in your cluster profile configuration. Note that the default is 1, so no multi-threading.
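For example, a minimal sketch of setting this on the cluster object before opening the pool (the value 4 is arbitrary; check your release's documentation for where NumThreads is exposed in your profile):
c = parcluster;      % load the cluster profile
c.NumThreads = 4;    % let each worker use 4 computational threads (default is 1)
parpool(c, 42);      % open the pool with the updated setting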
To use the distributed version, you simply need to call the Krylov method you'd like to use but with A being a distributed array.
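For instance, a minimal sketch reusing the variable names from the original question (it assumes a pool is already open and that K_solve is sparse):
Kd = distributed(K_solve);            % system matrix spread across the workers
bd = distributed(force_vec_solv);     % distributed right-hand side
[x, flag] = cgs(Kd, bd, tol_iter, max_iter);
% The operator can also be supplied as a function handle (item 3. above),
% which is the route to handling item 2. yourself; apply_K here is a
% hypothetical user routine that returns K*v without forming K explicitly.
[x2, flag2] = cgs(@(v) apply_K(v), force_vec_solv, tol_iter, max_iter);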
Finally, regarding speed-up and performance, the most expensive operations are usually the matrix-vector product and the preconditioner. If you know a clever way to apply your operator, you should use it; the same goes for a clever problem-specific preconditioner. There is usually a non-trivial balance to find between an extremely good preconditioner that is very costly to apply (the extreme case here is \) and a poor preconditioner that leads to very slow convergence or even no convergence to the prescribed accuracy (the extreme case here is no preconditioner at all).
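As an illustration of that trade-off (a sketch that assumes K_solve is sparse and symmetric positive definite, so an incomplete Cholesky factor exists; for a general matrix, ilu would play the same role):
% No preconditioner: cheap per iteration, but convergence may be slow or fail.
[x0, flag0, relres0, iter0] = cgs(K_solve, force_vec_solv, tol_iter, max_iter);
% Incomplete Cholesky preconditioner: more work per iteration, usually far fewer iterations.
L = ichol(K_solve);                                  % assumes K_solve is sparse SPD
[x1, flag1, relres1, iter1] = cgs(K_solve, force_vec_solv, tol_iter, max_iter, L, L');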

