Why is vectorization faster than parallel computing?

Question

wei zhang, 4 Sep 2019


https://kr.mathworks.com/matlabcentral/answers/478841-why-is-vectorization-faster-than-the-parallel-computing

Commented: Sterling Baird on 26 Oct 2020

I am trying to speed up my code, which processes a large geometric surface. Searching the website, I found two ways to optimize it: vectorization and the Parallel Computing Toolbox. My computer has 16 cores, but with a relatively low clock speed of 2.0 GHz. In my experience, vectorization is always faster than parallel computing, and I wonder why. Do MATLAB's built-in functions run vectorized operations on many "virtual small processors", far more than the number of CPU cores, like a GPU or something similar? I would like to understand a little of the mechanism behind it.

Thank you very much for any hints and explanations.

Comments: 2

Jan, 4 Sep 2019

The more specific a question is, the easier it is to answer. If you post some code, it is possible to explain what happens. Asking about the general mechanism demands an exhaustive answer, which will most likely not match your case.

Adam, 4 Sep 2019

Vectorisation takes advantage of the fact that the same (usually fairly simple, at a component level) operation is being performed on many elements of the array, which can be processed very efficiently at a low level.

Parallelisation has data copying overheads and many other considerations, as discussed by Jan in his answer below.

Where vectorisation is possible, it is usually preferable to (faster than) parallelisation.
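As a rough illustration of the point above, the following sketch times an explicit loop against the equivalent vectorized expression. The array size and the arithmetic are arbitrary choices made for this example, not taken from the original poster's code, and the measured gap depends on the MATLAB release and the machine (recent JIT-compiled loops can come close to the vectorized version).

```matlab
% Compare an explicit loop with the equivalent vectorized expression.
n = 1e6;                      % illustrative size (assumption)
x = rand(n, 1);

% Loop version: one element is processed per iteration.
yLoop = zeros(n, 1);          % preallocate so the comparison is fair
tic
for k = 1:n
    yLoop(k) = sqrt(x(k)) * 2 + 1;
end
tLoop = toc;

% Vectorized version: each operation is applied to the whole array.
tic
yVec = sqrt(x) * 2 + 1;
tVec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
```

Both versions compute the same values; only the way the work is handed to MATLAB differs.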


Answer 1

Jan, 4 Sep 2019


https://kr.mathworks.com/matlabcentral/answers/478841-why-is-vectorization-faster-than-the-parallel-computing#answer_390355

It depends on the problem. Parallelization is not trivial. If you use, for example, 16 cores and write the results to neighboring elements of a UINT8 vector, the threads collide on the same cache line (false sharing). As a result, the total computing time can exceed that of serial code, because the threads wait for each other. Such collisions can also occur on other resources, e.g. if memory is exhausted and expensive disk caching is used, or if data are requested over a network.

Many MATLAB functions are multi-threaded, e.g. sum: for large inputs MATLAB computes the sum in several parts using different threads. For a 1e5 x 1e5 matrix, all cores are (most likely) used. Computing this in a parfor loop is less efficient, because there is some overhead for starting the workers and transferring the data to them. The multi-threaded functions are written so that resource collisions are avoided, at least in most cases; in some, e.g. logical indexing or cell2mat, there is still potential for improvement.
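A minimal sketch of that comparison, assuming a vector that fits comfortably in memory: it sums the same data once with the built-in sum and once by splitting it into chunks inside a parfor loop. The size, the chunk count and all variable names are hypothetical. parfor runs in parallel only with Parallel Computing Toolbox; without it the loop executes serially.

```matlab
% Sum a large vector with the built-in sum and with a manual parfor split.
n = 2e7;                        % illustrative size (assumption)
x = rand(n, 1);
nChunks = 16;                   % hypothetical chunk count
edges = round(linspace(0, n, nChunks + 1));   % chunk boundaries

% Built-in sum: may already use several threads internally.
tic
sBuiltin = sum(x);
tBuiltin = toc;

% Manual split: each iteration sums one chunk of x.
tic
partial = zeros(nChunks, 1);
parfor c = 1:nChunks
    idx = edges(c) + 1 : edges(c + 1);
    partial(c) = sum(x(idx));   % x is sent to every worker as a whole
end
sParfor = sum(partial);
tParfor = toc;

fprintf('sum: %.4f s, parfor: %.4f s\n', tBuiltin, tParfor);
```

If no pool is open, the first parfor call may also start one, and that start-up time is included in tParfor. Running the timed section a second time separates the pool start-up from the per-call overhead.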

So before you start to parallelize a function, check whether it already uses many cores in the sequential version. If it does, starting multiple workers of a parpool on a single machine will not improve the efficiency.
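One possible way to make this check, sketched under the assumption that timing differences are a good enough indicator: limit MATLAB to a single computational thread with maxNumCompThreads, time the call with timeit, then repeat with the default thread count. The test data and the function handle f are placeholders chosen for this example.

```matlab
% Time one call with a single computational thread and with the default
% number of threads. A clear speed-up with more threads suggests that the
% function is already multi-threaded internally.
A = rand(2000);                   % illustrative test data (assumption)
b = rand(2000, 1);
f = @() A \ b;                    % function under test; swap in your own

nDefault = maxNumCompThreads;     % current thread limit

maxNumCompThreads(1);             % restrict MATLAB to one thread
tSingle = timeit(f);

maxNumCompThreads(nDefault);      % restore the original limit
tMulti = timeit(f);

fprintf('1 thread: %.4f s, %d threads: %.4f s, ratio %.2f\n', ...
        tSingle, nDefault, tMulti, tSingle / tMulti);
```

A ratio close to 1 hints that the call runs on one core and might benefit from parfor; a ratio well above 1 hints that it already spreads its work over several cores. Watching the CPU load in the operating system's task manager during a long call gives a similar indication.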

Comments: 1

Sterling Baird, 26 Oct 2020

How does one check if many cores are already used in the sequential version?

For example:

mldivide ('\')
pdist (Statistics & Machine Learning Toolbox)
dsearchn
fitrgp (without hyperoptimization)

