
GPU arrayfun with shared arrays

Ray on 11 Nov 2014
Edited: Matt J on 14 Nov 2014
Hi all,
I'm trying to speed up some code by using the GPU functionality that comes with arrayfun.
I know arrayfun operates in an element-wise fashion; however, my function also involves some shared arrays. For example, I have a function like:
f = f(a,b,A,B,C), where a and b are (n x 1) arrays, i.e. the element-wise portion of the function. A, B, and C are arrays that remain constant during each element-wise evaluation over a and b.
I've tried searching for how to implement this, but the results don't look too promising. Is it possible to do this using arrayfun? If not, is there another way I can speed up such a function? I've tried using parfor, but this actually turned out to be slower than a normal for-loop.
Thanks,
Ray

Answers (3)

Matt J on 11 Nov 2014
Edited: Matt J on 11 Nov 2014
The only hope, I think, would be to write your own CUDA kernel implementation of f(), putting A, B, and C in constant memory if they are small enough to fit there. You could manage this through MATLAB using a CUDAKernel object and its setConstantMemory method.
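
As a rough sketch of the constant-memory approach described above, the MATLAB side might look something like the following. The file names sharedEval.cu and sharedEval.ptx, the kernel body, the fixed length M = 64, and the formula for f are all assumptions made up for illustration; none of them come from the original question. The kernel would have to be compiled separately (for example with nvcc -ptx sharedEval.cu) before this script can run.

```matlab
% Hypothetical kernel source, saved as sharedEval.cu (shown here as comments):
%
%   #define M 64
%   __constant__ double A[M];
%   __constant__ double B[M];
%   __constant__ double C[M];
%
%   __global__ void sharedEval(double *out, const double *a,
%                              const double *b, int n)
%   {
%       int i = blockIdx.x * blockDim.x + threadIdx.x;
%       if (i < n) {
%           double y = 0.0;
%           for (int k = 0; k < M; ++k)
%               y += A[k] * exp(-B[k] * a[i]) + C[k] * b[i];
%           out[i] = y;
%       }
%   }

n = 100000;                          % number of element-wise evaluations
m = 64;                              % must match M in the kernel source
a = rand(n, 1, 'gpuArray');
b = rand(n, 1, 'gpuArray');
A = rand(m, 1);                      % shared arrays start on the host
B = rand(m, 1);
C = rand(m, 1);

% Load the compiled kernel; the .cu file supplies the argument prototype.
kern = parallel.gpu.CUDAKernel('sharedEval.ptx', 'sharedEval.cu');

% Copy the shared arrays into the kernel's __constant__ symbols.
setConstantMemory(kern, 'A', A);
setConstantMemory(kern, 'B', B);
setConstantMemory(kern, 'C', C);

% One thread per element of a and b.
kern.ThreadBlockSize = 256;
kern.GridSize = ceil(n / 256);

% The non-const pointer argument (out) is returned as the output.
f = feval(kern, zeros(n, 1, 'gpuArray'), a, b, int32(n));
```

Constant memory is small (64 KB on typical CUDA devices), so this only applies when A, B, and C together fit within that limit.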

Mikhail on 11 Nov 2014
You can try calling your function without arrayfun. If at least one of the arguments is on the GPU, the calculations will be performed on the GPU.
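
To illustrate the suggestion above, here is a minimal sketch of a vectorized evaluation on gpuArray inputs with no arrayfun call. The formula for f is a hypothetical stand-in, since the original question does not say what f computes, and the approach only works when every function used supports gpuArray inputs. bsxfun is used rather than implicit expansion so that the sketch also runs on older releases.

```matlab
% Hypothetical per-element function, used only for illustration:
%   f(i) = sum over k of ( A(k) * exp(-B(k) * a(i)) + C(k) * b(i) )
n = 1000;                            % length of the element-wise inputs
m = 64;                              % length of the shared arrays
a = rand(n, 1, 'gpuArray');
b = rand(n, 1, 'gpuArray');
A = rand(m, 1, 'gpuArray');
B = rand(m, 1, 'gpuArray');
C = rand(m, 1, 'gpuArray');

% n-by-m matrix whose (i,k) entry is exp(-B(k) * a(i)).
E = exp(-bsxfun(@times, a, B.'));

% The matrix-vector product sums over k; the C term factors out of the sum.
f = E * A + sum(C) .* b;             % n-by-1 gpuArray
```

The trade-off is memory: the intermediate matrix E has n*m elements, so for large n and m it may be necessary to process a and b in chunks.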

Edric Ellis on 12 Nov 2014
Can you give a more concrete example of what you'd like to do with A, B, and C? You might be able to use a nested function with up-level variables. This example is quite complex, but it shows some of the more advanced things you can do with nested functions and arrayfun. In particular, the nested function updateParentGrid accesses the up-level variable grid and indexes into it to perform the stencil computation.
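
A minimal sketch of the nested-function idea, assuming a made-up formula for f (the original question does not give one) and a hypothetical function name evalWithSharedArrays. The nested function reads the up-level variables A, B, C, and m, and indexes into the arrays with scalar indices only; it never assigns to them.

```matlab
function f = evalWithSharedArrays(a, b, A, B, C)
% Hypothetical wrapper: evaluates, for each pair (a(i), b(i)),
%   f(i) = sum over k of ( A(k) * exp(-B(k) * a(i)) + C(k) * b(i) )
% on the GPU. A, B, and C are assumed to be double vectors of equal length.

    % Put the shared arrays on the GPU; they become up-level variables.
    A = gpuArray(A);
    B = gpuArray(B);
    C = gpuArray(C);
    m = numel(A);

    % arrayfun calls perElement once per element of a and b.
    f = arrayfun(@perElement, gpuArray(a), gpuArray(b));

    function y = perElement(ai, bi)
        % ai and bi are scalars; A, B, C, and m come from the parent.
        y = 0;
        for k = 1:m
            y = y + A(k) * exp(-B(k) * ai) + C(k) * bi;
        end
    end
end
```

A call such as f = evalWithSharedArrays(rand(1000,1), rand(1000,1), rand(64,1), rand(64,1), rand(64,1)) would return a 1000-by-1 gpuArray. Whether this outperforms a vectorized version depends on the sizes involved and is worth timing with gputimeit.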
  Comments: 1
Matt J on 14 Nov 2014
Edited: Matt J on 14 Nov 2014
But can it be efficient to do this? I assume that there are CUDA threads doing each element-wise computation under the hood. If all threads need the variables A, B, and C, then surely those variables would need to be stored in constant memory in order for all threads to access them quickly enough.
