Slice into gpuArray and perform functions on the GPU with arrayfun

I would like to know how to index into a given matrix to form all pairwise combinations of its column-vectors, and perform operations on those vectors, all on the GPU. So consider the simple function below:
function out = sum2Vecs(in1,in2)   %in1 and in2 are (n x 1) vectors.
    out = sum(in1,1) + sum(in2,1); %Output is a scalar double.
end
Quick example: an array such as
fullMatrix = rand(3000,100);
Now I choose all pairwise column-vector combinations of "fullMatrix":
idxArray = nchoosek(1:100,2); %All possible pairwise index combinations of "fullMatrix".
nCombinations = size(idxArray,1); %Number of pairs, i.e. rows of idxArray.
And a simple for-loop applies "sum2Vecs" to each pair of column-vectors:
outArray = zeros(1,nCombinations); %Preallocate for speed.
for idx = 1:nCombinations
    outArray(idx) = sum2Vecs( fullMatrix(:,idxArray(idx,1)), fullMatrix(:,idxArray(idx,2)) );
end
Also, a parfor-loop with slicing works fine:
outArray = zeros(1,nCombinations); %Preallocate; parfor slices outArray.
parfor idx = 1:nCombinations
    in1 = fullMatrix(:,idxArray(idx,1));
    in2 = fullMatrix(:,idxArray(idx,2));
    outArray(idx) = sum2Vecs(in1,in2);
end
My goal is to perform this loop on the GPU using e.g. "arrayfun". I am relatively inexperienced with this, so I would appreciate any helpful pointers. What I am particularly interested in learning is how to index efficiently into an array like "fullMatrix" and send parts of it to each GPU worker.
Thanks very much. Hamad.

Answers (1)

Matt J on 11 Jan 2015
Edited: Matt J on 11 Jan 2015
In the generality that you've described, that computation doesn't look well-suited to the GPU. The GPU is for situations where you have lots of parallel tasks involving small chunks of data. The chunks in your example, two 3000x1 vectors, are probably not small enough unless the operation can be subdivided further.
For that specific example, I would probably try to vectorize on the GPU as follows:
idxArray = gpuArray( nchoosek(1:100,2).' ); %2 x nCombinations; each column holds one pair of indices
A = gpuArray(fullMatrix);
[m,n] = size(A);
outArray = sum( reshape(A(:,idxArray), 2*m, []), 1 ); %Stack each pair of columns, then sum each stack
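As a quick CPU-side check (my own sketch; the variable names are mine) that the reshape trick is equivalent to summing each pair of columns directly:

```matlab
%Sanity check on the CPU: the reshape trick vs. a direct pairwise sum.
fullMatrix = rand(3000,100);
idxArray   = nchoosek(1:100,2);            %nCombinations x 2
pairs      = idxArray.';                   %2 x nCombinations, pairs in columns
m          = size(fullMatrix,1);

%Vectorized: stack each pair of columns into one 2m x 1 column, then sum.
vecOut  = sum( reshape(fullMatrix(:,pairs), 2*m, []), 1 );

%Reference using column sums directly.
colSums = sum(fullMatrix,1);
refOut  = colSums(idxArray(:,1)) + colSums(idxArray(:,2));

max(abs(vecOut - refOut))                  %Should be at round-off level
```

The key point is that indexing with the 2 x nCombinations matrix `pairs` returns the columns in pair order, so the reshape groups exactly the right two columns into each 2m-element stack.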

4 Comments

Thanks very much for that, Matt. The last line of your answer works very well when the objective is to parallelize "sum2Vecs" across the GPU cores, but it does not extend to cases where the function is not built from an existing vectorized MATLAB function such as "sum". As I explained, what I am really looking for is a way to efficiently index into a gpuArray and send "chunks" of the data to different GPU cores, on which to perform an arbitrary @function.
Finally, I'm not sure that sending two 3000x1 arrays to each GPU core should be difficult. Crude calculation: suppose you have 2500 CUDA cores; then 3000 x 2500 x 2 x 8 bytes = around 114 MB of RAM per cycle.
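For what it's worth, the arithmetic behind that estimate can be spelled out (a back-of-the-envelope sketch):

```matlab
%Back-of-the-envelope memory estimate for one "cycle" of 2500 cores,
%each holding two 3000 x 1 double (8-byte) vectors.
nCores    = 2500;
vecLen    = 3000;
bytes     = vecLen * nCores * 2 * 8;   %= 120,000,000 bytes
megabytes = bytes / 2^20               %~114.4 MB (binary megabytes)
```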
In any case, thanks very much for your help. Hamad.
Matt J on 12 Jan 2015
Edited: Matt J on 12 Jan 2015
As I explained, what I am really looking for is a way to efficiently index into a gpuArray and send "chunks" of the data to different GPU cores on which to perform an arbitrary @function.
Yes, I understood that. But as I said, I don't think you can do it! (Not efficiently).
Finally, I'm not sure that two 3000x1 arrays sent to each GPU core should be difficult.
I'd have to know more about your graphics card. You only get the benefit of the GPU if the data can be cached in the local memory used by the "cores", and arrayfun needs to be smart enough to do so. According to the tables here, the latest cards have 96KB of shared memory per multiprocessor, but each of your threads requires about 48KB (two 3000x1 double vectors). That means each multiprocessor would only be able to run 2 such threads in parallel.
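Spelled out (my own arithmetic, using the shared-memory figure quoted above):

```matlab
%Shared-memory budget per multiprocessor vs. per-thread working set.
perThread = 2 * 3000 * 8;              %Two 3000x1 double vectors = 48,000 bytes
perSM     = 96 * 1024;                 %96KB shared memory per multiprocessor
floor(perSM / perThread)               %= 2 threads fit per multiprocessor
```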
Anyway, it's just my speculation. We'll see if any TMW contributors have better insights...
Thanks very much, Matt.
arrayfun can take a user-defined function, as long as that function carries out scalar operations. You can also index into arrays in that function as long as the array is passed in as an upvalue - see for instance here, the Mandelbrot example on this page and the Monte Carlo example here.
You need to remember that GPU cores are not like parallel workers: individually they cannot perform complex vector operations; only taken together can they. In PCT, a large number of complex algorithms have been implemented in such a way as to take maximum advantage of the GPU. If you are having trouble formulating your problem in a data-parallel way, then post your real code and we can have a look at whether it is inherently parallelisable. The example you gave, summing vectors, is easily vectorizable, as Matt showed above.
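As a concrete illustration of the upvalue-indexing approach mentioned above, a sketch along the following lines may work (untested here; it assumes Parallel Computing Toolbox with a supported GPU, and a release recent enough that GPU arrayfun allows indexing outer-scope variables from a nested function; the function and variable names are mine):

```matlab
function out = pairwiseSumsGPU(fullMatrix, idxArray)
%Sketch: run a scalar-only user function over all column pairs on the GPU.
%fullMatrix is captured and indexed as an upvalue inside the nested function.
A  = gpuArray(fullMatrix);
i1 = gpuArray(idxArray(:,1));          %First column index of each pair
i2 = gpuArray(idxArray(:,2));          %Second column index of each pair
m  = size(A,1);

    function s = sumPair(c1,c2)
        %Scalar operations only: loop over rows, indexing A element-wise.
        s = 0;
        for r = 1:m
            s = s + A(r,c1) + A(r,c2);
        end
    end

out = arrayfun(@sumPair, i1, i2);      %One GPU thread per pair of columns
end
```

Note the body of sumPair is restricted to scalar arithmetic and scalar indexing, which is what GPU arrayfun requires; whether this beats Matt's vectorized reshape will depend on the card and the actual function being applied.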



Question asked: 11 Jan 2015
Last comment: 23 Feb 2015
