Distributed and spmd not running faster

Views: 4 (last 30 days)
James on 25 January 2025
Commented: siyu guo on 26 March 2025
I think I'm missing something fundamental about using distributed arrays with spmd. If I run the following, the non-distributed version completes in ~0.04 s while the distributed version takes ~0.2 s (with a process pool matching the number of cores on my machine).
x = ones(10000, 10000);
tic
x = x * 2.3;
toc
Elapsed time is 0.035079 seconds.
x
x = 10000×10000
    2.3000    2.3000    2.3000    2.3000    ...    (display truncated; every element is 2.3000)
x = distributed(ones(10000, 10000));
tic
spmd
x = x * 2.3;
end
toc
Elapsed time is 0.226569 seconds.
gather(x)
ans = 10000×10000
    2.3000    2.3000    2.3000    2.3000    ...    (display truncated; every element is 2.3000)
What am I missing?
Edit: I moved tic and toc to after the array initialization and before the display of x, so they are not included in the timing, since I realized the call to distributed takes extra time while it spreads the array across the worker processes, and gather also takes time.
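For reference, here is a minimal sketch (not part of the original post) of one way to time the three stages separately, so the construction and gather costs do not get mixed into the spmd measurement:
tic
x = ones(10000, 10000, 'distributed');   % spread the array across the pool workers
tConstruct = toc;
tic
spmd
    x = x * 2.3;                         % each worker scales only its local portion
end
tCompute = toc;
tic
y = gather(x);                           % copy the result back to the client
tGather = toc;
fprintf('construct %.3f s, compute %.3f s, gather %.3f s\n', tConstruct, tCompute, tGather);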
2 Comments
Walter Roberson on 25 January 2025
x = distributed(ones(10000, 10000));
Better would be
x = ones(10000, 10000, 'distributed');
That should reduce the overall execution time, but should not change the parts you are timing.
James on 25 January 2025
True, thanks!


Accepted Answer

Edric Ellis on 27 January 2025
You're not missing anything. If you're only using the cores on your local machine, distributed is unlikely to be much use to you. The primary goal of distributed is to use the memory of multiple machines and enable computations that would otherwise not be possible. A simple breakdown would be:
  • Desktop MATLAB is generally good for large array operations that fit in memory
  • gpuArray can be even better, if you have a suitable GPU (better still if you can run in lower precision such as single); a minimal sketch follows this list
  • distributed is best for array operations that only fit in the combined memory of multiple machines
  • tall works well for operations on data backed by some form of storage (e.g. disk), where the whole array can never fit in memory even across a cluster
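As a rough illustration of the gpuArray bullet (not part of the original answer, and assuming a supported GPU is available), the same scaling operation can be run on the GPU; single precision is used because, as noted above, lower precision often helps:
x = ones(10000, 10000, 'single', 'gpuArray');   % allocate the array directly in GPU memory
tic
x = x * 2.3;             % the multiply runs on the GPU
wait(gpuDevice)          % GPU calls are asynchronous, so wait before stopping the timer
toc
y = gather(x);           % copy the result back to host memory if it is needed there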
Desktop MATLAB already runs many suitable operations in a multi-threaded manner, so there is no way, even in principle, for distributed to perform better on a single machine. In fact, for basic operations, if desktop MATLAB cannot multi-thread an operation, that may well mean that a distributed implementation is either not possible or not efficient.
In your case, one of the main things you're timing is the overhead of going into and out of an spmd context.
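One way to see that fixed cost (a sketch, not from the thread) is to time an empty spmd block: it does no work, so its elapsed time is almost entirely the cost of entering and leaving the spmd context.
tic
spmd
    % intentionally empty: no computation is sent to the workers
end
fprintf('empty spmd round trip: %.3f s\n', toc);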
2 Comments
James on 27 January 2025
Perfect, thanks for breaking it all down.
siyu guo on 26 March 2025
Hello, it was not easy to find an expert who also uses MATLAB for spmd. Recently I have been trying to use spmd in MATLAB to implement a parallel conjugate gradient method to speed up iterative solves with large sparse matrices. I just posted a question about spmdCat: my question about spmdCat.
I sincerely hope you can help answer it after reading this message, and I would also like to ask you about the speed-ups achievable in MATLAB. The main body of my program is written in MATLAB, but as the mesh gets denser the computation time grows geometrically.
So I would like your advice: should I continue writing parallel programs in MATLAB, or switch to another language (or software) as soon as possible?


More Answers (0)
