MATLAB GPU: arrayfun with indexing

Hi,
I am new to MATLAB GPU computing and have run some initial tests. Now I am looking to parallelize the following code.
for i=1:n ;where n~1'000'000 and a, b,c of size ~300'000x1
currindices = indices(24,i);
a(currindices ) = a(currindices ) + A(24x24)*(b(currindices )+B(24x24)*c(currindices ));
end
In a test I parallelized this code without any of the indexing by using arrayfun, and it worked well. That is, I just had the following code in a function that was called by arrayfun:
for i=1:n
a=a+A*(b+B*c)
end
I wonder how to deal with the indexing of the vectors and whether arrayfun still makes sense. The matrices A and B are constant. I read that indexing is rather slow on a GPU.
What would be the best way to parallelize the above code?
Thanks for any help. This whole parallelization does not come naturally to me yet.
BR

Comments: 6

Walter Roberson on 22 Oct 2017
Edited: Walter Roberson on 24 Oct 2017
? currindices appears to be unused before you assign to it.
Markus Ess on 22 Oct 2017
Sorry, that was a mistake. The indexing should use currindices. I fixed the code in the sample.
Joss Knight on 24 Oct 2017
I'm not sure what language you've written your code in so it's difficult to interpret. What is A(24x24)? And if this were MATLAB code then indices(24,i) would be a scalar. But then your algebra doesn't make sense.
Markus Ess on 24 Oct 2017
Edited: Walter Roberson on 24 Oct 2017
It wasn't meant to be real code. It was just to show that A is of size 24x24 and that I read 24 values for currindices. So currindices is indices(:,i) in MATLAB code, and the multiplication with A and B is simply a matrix multiplication.
for i=1:n % where n ~ 1'000'000 and a, b, c are of size ~300'000x1
currindices = indices(:,i);
a(currindices ) = a(currindices ) + A*(b(currindices )+B*c(currindices ));
end
Well, one of the things I learnt anyway is that I have to use pagefun. The problem is still the indexing.
However, my main feeling is that I will have to rewrite the math anyway to parallelize this well.
I don't think you need pagefun. Can't you just do this with indexing and matrix multiplication? It seems indices is the correct shape, namely 24-by-n. So b(indices) and c(indices) return 24-by-n, the multiplies return 24-by-n, and the addition works.
a(indices) = a(indices) + A * (b(indices) + B * c(indices));
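A small check of the shapes involved may help here. The sketch below uses made-up sizes (24 rows as in the question, but only 5 columns and vectors of length 200, all hypothetical) and draws the indices with randperm so that no index repeats anywhere. Under that assumption, the one-line update should give the same result as the loop, up to rounding.

```matlab
% Toy sizes: m = 24 as in the question; n and N are made up for illustration.
m = 24; n = 5; N = 200;
A = rand(m); B = rand(m);
a = rand(N,1); b = rand(N,1); c = rand(N,1);

% Assumption: no index repeats anywhere, so each element of a is updated once.
indices = reshape(randperm(N, m*n), m, n);

% Indexing a vector with a matrix returns an array shaped like the index matrix.
disp(size(b(indices)))            % m-by-n

% Reference: the original loop.
aLoop = a;
for i = 1:n
    idx = indices(:,i);
    aLoop(idx) = aLoop(idx) + A*(b(idx) + B*c(idx));
end

% Vectorized: one matrix product over all columns at once.
aVec = a;
aVec(indices) = aVec(indices) + A*(b(indices) + B*c(indices));

disp(max(abs(aLoop - aVec)))      % expected to be at rounding-error level
```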
If the indices repeat, this may not work as you intended: an indexed assignment stores only one value per repeated index, so the other contributions to that element of a are lost rather than summed. You might have to use accumarray in that case.
result = A * (b(indices) + B * c(indices));
a = a + accumarray(indices(:), result(:), size(a));
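To see the effect of repeated indices, here is a minimal sketch with made-up sizes (N is deliberately small so that different columns of indices overlap). It relies on accumarray taking the subscripts as its first argument and the values as its second, and on the increments depending only on b and c, not on a, so they can all be computed up front and then summed per element. The variable names aLoop, aAcc, aPlain and delta are just for illustration.

```matlab
% Toy sizes (hypothetical); N is small so that indices repeat across columns.
m = 24; n = 50; N = 100;
A = rand(m); B = rand(m);
a = rand(N,1); b = rand(N,1); c = rand(N,1);

% Each column has distinct entries, but different columns overlap.
indices = zeros(m, n);
for i = 1:n
    indices(:,i) = randperm(N, m)';
end

% Reference: the original loop, which adds every column's contribution.
aLoop = a;
for i = 1:n
    idx = indices(:,i);
    aLoop(idx) = aLoop(idx) + A*(b(idx) + B*c(idx));
end

% The increments do not depend on a, so compute them all at once ...
delta = A*(b(indices) + B*c(indices));          % m-by-n
% ... and sum the increments that land on the same element of a.
aAcc = a + accumarray(indices(:), delta(:), [N 1]);

% Plain indexed assignment keeps only one write per repeated index.
aPlain = a;
aPlain(indices) = aPlain(indices) + delta;

disp(max(abs(aLoop - aAcc)))      % expected to be at rounding-error level
disp(max(abs(aLoop - aPlain)))    % expected to be large: contributions were lost
```

If an index could repeat within a single column, the loop itself would keep only one of the writes for that column, so the accumulated version would no longer match it exactly.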
Markus Ess on 31 Oct 2017
Got it. At least on the CPU, the matrix multiplication is 10 times faster than the for loop. Anyway, I now need to rewrite the code and see how that could work on a GPU.
thanks!
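For the GPU step, one possible starting point is to move the arrays to the device with gpuArray and run the same vectorized update there. This is only a sketch under assumptions: it presumes Parallel Computing Toolbox is installed, that matrix indexing and accumarray accept gpuArray inputs in your release, and it uses toy sizes rather than the ones from the question. It falls back to the CPU when no GPU is found, and it makes no claim about the speed-up you would see.

```matlab
% Toy data (hypothetical sizes; the question has n ~ 1e6 and vectors of ~3e5).
m = 24; n = 1000; N = 500;
A = rand(m); B = rand(m);
a = rand(N,1); b = rand(N,1); c = rand(N,1);
indices = zeros(m, n);
for i = 1:n
    indices(:,i) = randperm(N, m)';   % distinct within a column
end

% CPU reference for comparison.
deltaRef = A*(b(indices) + B*c(indices));
aRef = a + accumarray(indices(:), deltaRef(:), [N 1]);

% Assumption: gpuArray, gather and gpuDeviceCount come from Parallel
% Computing Toolbox; stay on the CPU when they are not available.
useGPU = exist('gpuDeviceCount') > 0 && gpuDeviceCount > 0;
if useGPU
    A = gpuArray(A); B = gpuArray(B);
    a = gpuArray(a); b = gpuArray(b); c = gpuArray(c);
    indices = gpuArray(indices);
end

% Same vectorized update; it runs on the GPU when the inputs are gpuArrays.
delta = A*(b(indices) + B*c(indices));
aNew  = a + accumarray(indices(:), delta(:), [N 1]);

if useGPU
    aNew = gather(aNew);              % copy the result back to host memory
end

disp(max(abs(aNew - aRef)))           % difference from the CPU reference
```

Keeping A, B, b, c and indices on the device across repeated updates, and calling gather only once at the end, avoids paying the host-to-device transfer on every step.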


Answers (0)


Asked: 22 Oct 2017

Commented: 31 Oct 2017
