Assigning gpuArrays to different graphics cards

RWS on 19 Feb 2020
Commented: Adam Karboski on 19 May 2020
In the example below, I use a parfor loop to assign a different 256x256x256 (random k-space) matrix to each of my 2 GPUs. In theory, I can then process these matrices in parallel on the 2 GPUs (here I apply an ifftn). The problem is that the parfor loop is very slow, presumably because of overhead: the operation takes 6.5 seconds on my machine. If I replace 'parfor' with a plain 'for' (so that a single GPU processes both matrices sequentially), it takes 0.2 seconds.
Is there an easy (fast) way to assign a gpuArray to a specific graphics card, so that future operations on that gpuArray run on the specified card? Rather than using parfor, I would prefer to launch asynchronous CUDA kernels (invoking one directly after the other in MATLAB code) so that both GPUs run in parallel.
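Dims = [256,256,256,2];                                       % two 256^3 volumes, one per GPU
Kspace = complex(rand(Dims,'single'),rand(Dims,'single'));    % random complex k-space data on the host
Image = gpuArray(complex(zeros(Dims,'single')));              % preallocated output
tic
parfor n = 1:2
    % Upload one volume to the GPU used by this worker
    gKspace = gpuArray(Kspace(:,:,:,n));
    % Inverse 3-D FFT of the shifted k-space data
    Image(:,:,:,n) = ifftn(fftshift(gKspace));
end
toc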

Accepted Answer

Joss Knight on 22 Feb 2020
There is no way to do what you ask. Selecting a GPU is the only way to move data there, and selecting a GPU resets all GPU data.
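As a rough illustration of that constraint, the sketch below selects a device, creates an array on it, and copies the array to host memory before switching devices. It assumes Parallel Computing Toolbox and at least one supported GPU; the device indices 1 and 2 and the variable names are illustrative only, not taken from the question.

```matlab
% Sketch only: assumes at least one supported GPU; device indices are illustrative.
nGpus = gpuDeviceCount;                  % number of GPUs MATLAB can see
if nGpus >= 1
    gpuDevice(1);                        % select device 1 (this also resets it)
    A = gpuArray(rand(4, 'single'));     % A is created on the selected device
    hostA = gather(A);                   % copy to host memory before switching
    if nGpus >= 2
        gpuDevice(2);                    % select device 2; A should no longer be usable
        B = gpuArray(hostA);             % re-upload the host copy to device 2
    end
end
```

Going through host memory like this is what makes switching devices inside a single MATLAB session expensive, which is why one worker per GPU is the usual pattern.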
The bottleneck here is the way you're sending all the data to each worker and then indexing it (and, equivalently, moving all the results back). You need to amortise this communication cost, either by doing more work inside the loop or by loading the data you need directly onto each worker without first loading it onto the client.
Presumably you have more than two 256^3 arrays. Put another loop inside your parfor and process all those arrays together. Move the results back to the CPU to save GPU memory. Eventually the communication overhead will become irrelevant and you'll see the gain from using both your GPUs.
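A minimal sketch of that batching idea follows. The names numArrays, numWorkers, batches and results are hypothetical, and the random data generated on each worker stands in for whatever loading step the real application would use. It also assumes that each pool worker ends up on a different GPU, as the original parfor code does.

```matlab
% Sketch only: numArrays and the random test data are hypothetical.
Dims       = [256, 256, 256];
numArrays  = 8;                          % assumed total number of volumes
numWorkers = 2;                          % one worker per GPU
if isempty(gcp('nocreate'))
    parpool(numWorkers);                 % start a pool if none is running
end
% Interleave the volume indices into one batch per worker.
batches = arrayfun(@(w) w:numWorkers:numArrays, 1:numWorkers, ...
                   'UniformOutput', false);
results = cell(1, numWorkers);
tic
parfor w = 1:numWorkers
    idx = batches{w};
    out = complex(zeros([Dims, numel(idx)], 'single'));   % host-side results
    for k = 1:numel(idx)
        % Create (or load) the data directly on the worker's GPU,
        % so nothing large is sent from the client.
        gK = complex(rand(Dims, 'single', 'gpuArray'), ...
                     rand(Dims, 'single', 'gpuArray'));
        % Process on the GPU, then gather to free GPU memory.
        out(:,:,:,k) = gather(ifftn(fftshift(gK)));
    end
    results{w} = out;
end
toc
```

Returning every volume to the client still costs transfer time, so in practice each worker might save its results or return only a reduced quantity; the point of the sketch is that the fixed parfor overhead is spread over many volumes per worker.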
Comments: 6
Adam Karboski on 15 May 2020
Same result. I also created a profile from scratch, same again.
Adam Karboski on 19 May 2020
Solved. The solution was to disable nvidia-persistenced.
