BLAS or LAPACK in CUDA kernel

Rodrigo, June 5, 2013
Hi, I need to do x=A\b several hundred million times, along with other trivial arithmetic, for an A that is 4x4 and dense. I was thinking about writing a little CUDA kernel that would get called within MATLAB to do this, but I don't know how I would call something like DGETRS or SGETRS within a thread. CUBLAS, MAGMA, and things of that kind seem to parallelize this operation for a single, massive A, but I don't know how they would help me. Is this possible?
Thanks!

Accepted Answer

James Lebak, June 5, 2013
You're correct that you can't call DGETRS or SGETRS directly from a kernel, as those are CPU-side routines. The CUDA 5 version of CUBLAS has a batched LU factorization API, and an ability to call BLAS routines on the device, either of which might be helpful. You can call CUBLAS CPU-side routines from a GPU MEX file in R2013a.
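Editorial sketch, not part of the answer above: because each A is only 4x4, another option is the one the question itself describes, where every thread solves its own system with hand-written elimination instead of calling a library. The code below is a minimal illustration under stated assumptions. The names `solve4x4` and `solveBatch` are hypothetical. The matrices are assumed to be single precision, stored back to back in column-major order (MATLAB's layout), and reasonably well conditioned. The `HD` macro lets the same function compile as plain C++ for checking on the CPU, or as device code under nvcc.

```cpp
#include <math.h>
#include <stddef.h>

// Under nvcc the solver is usable from both host and device code;
// with an ordinary C++ compiler the qualifier expands to nothing.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Solve one dense 4x4 system A*x = b by Gaussian elimination with
// partial pivoting. A holds 16 floats in column-major order, b holds 4.
// Returns false if a pivot is exactly zero (matrix treated as singular).
HD inline bool solve4x4(const float* A, const float* b, float* x)
{
    float M[4][5];  // local augmented matrix [A | b], indexed [row][col]
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) M[r][c] = A[c * 4 + r];
        M[r][4] = b[r];
    }

    for (int k = 0; k < 4; ++k) {
        // Pivot: largest magnitude entry in column k, rows k..3.
        int p = k;
        for (int r = k + 1; r < 4; ++r)
            if (fabsf(M[r][k]) > fabsf(M[p][k])) p = r;
        if (M[p][k] == 0.0f) return false;
        if (p != k) {
            for (int c = k; c < 5; ++c) {
                float t = M[k][c]; M[k][c] = M[p][c]; M[p][c] = t;
            }
        }
        // Eliminate column k from the rows below the pivot row.
        for (int r = k + 1; r < 4; ++r) {
            float f = M[r][k] / M[k][k];
            for (int c = k; c < 5; ++c) M[r][c] -= f * M[k][c];
        }
    }

    // Back substitution on the upper-triangular system.
    for (int r = 3; r >= 0; --r) {
        float s = M[r][4];
        for (int c = r + 1; c < 4; ++c) s -= M[r][c] * x[c];
        x[r] = s / M[r][r];
    }
    return true;
}

#ifdef __CUDACC__
// One thread per system. System i is assumed to occupy A[16*i .. 16*i+15]
// and b[4*i .. 4*i+3]; ok[i] is set to 1 on success, 0 if singular.
__global__ void solveBatch(const float* A, const float* b,
                           float* x, int* ok, size_t n)
{
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) ok[i] = solve4x4(A + 16 * i, b + 4 * i, x + 4 * i) ? 1 : 0;
}
#endif
```

From MATLAB, a kernel like this could be compiled to PTX and launched through a `parallel.gpu.CUDAKernel` object, with the batch passed as `gpuArray` data. With several hundred million systems, the work would likely need to be split into chunks that fit in GPU memory. Storing each system contiguously keeps the indexing simple but means neighboring threads read addresses 16 floats apart, so an interleaved layout may perform better.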
  1 Comment
Rodrigo, June 5, 2013
Thanks. I guess I'll have to get the new version of MATLAB from my department and play around with the new CUDA toolbox.
