Why does parallel.pool.Constant create a copy of the variable in memory for each worker sequentially instead of in parallel?

When creating a parallel.pool.Constant on 9 workers before a parfor loop, I noticed that memory usage ramps up in 9 successive steps instead of all at once. The attached image shows these steps in memory usage before entering the parfor, as seen in Resource Monitor on Windows 7. This suggests that the copies for each worker are created sequentially rather than in parallel, which takes a lot of time. Why are these copies not created in parallel for faster execution? I am running R2017a.

Accepted Answer

Edric Ellis on 30 Jun 2017
I suspect you're creating the parallel.pool.Constant using data created on the client. It's much more efficient to have the workers create the data, if possible. Consider two cases:
% Case 1: data created on the client
parallel.pool.Constant(ones(1e4));
% Case 2: use the Constant constructor with a function handle to create
% the contents directly on the worker
parallel.pool.Constant(@() ones(1e4));
This results in the following memory usage pattern. In the screenshot, case 1 is indicated with a red arrow, and case 2 with a green arrow.
As you can see, case 2 happens in parallel, and avoids the data transfer from the client to the workers (it's the data transfer that really causes the lack of parallelism).
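When the data already exists in a file that every worker can read, the function-handle form from case 2 can load it directly on the workers. The following is only a minimal sketch under stated assumptions: the file name `constant_demo_data.mat` and the variable name `data` are hypothetical, a small matrix of ones stands in for real data, and the file is assumed to sit at a path visible to all workers (true for a local pool, not necessarily for a cluster).

```matlab
% Open a pool if none is running (subject to the parallel preferences)
pool = gcp();

% Hypothetical file; assumed to be readable from every worker
dataFile = fullfile(tempdir, 'constant_demo_data.mat');
data = ones(1e3);              % stand-in for real-world data
save(dataFile, 'data');

% Each worker evaluates the function handle itself, so the file is read
% on the workers rather than being sent from the client
C = parallel.pool.Constant(@() getfield(load(dataFile, 'data'), 'data'));

% Access the worker-side copy through the Value property
total = 0;
parfor i = 1:4
    total = total + sum(C.Value(:, i));
end
```

Whether this is faster than transferring from the client depends on the file system: many workers reading one large file from a slow or shared disk may simply move the bottleneck.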
If you really cannot create the data on the workers, you can use the parallel.pool.Constant constructor that accepts a Composite, like this:
% Build an empty Composite
c = Composite();
% Transfer the data from client only to worker 1
c{1} = ones(1e4);
c(2:end) = {[]};
spmd
    % Use labBroadcast to copy data to all workers (labBroadcast
    % is more efficient than the client/worker communication)
    c = labBroadcast(1, c);
end
% Build the Constant from the Composite
c = parallel.pool.Constant(c);
% Flush memory on the workers by executing an empty SPMD block
spmd, end
Comments: 1
Joseph Hall on 30 Jun 2017
Thank you. Unfortunately, I am using real-world data, so the data cannot be created inside the workers. Are there other modes of data transfer that could be done in parallel, such as accessing files on disk? I don't quite understand why the data transfer cannot be done in parallel.
