BLAS or LAPACK in CUDA kernel
Hi, I need to do x=A\b several hundred million times, along with other trivial arithmetic, for an A that is 4x4 and dense. I was thinking about writing a little CUDA kernel that would get called within MATLAB to do this, but I don't know how I would call something like DGETRS or SGETRS within a thread. CUBLAS, MAGMA, and things of that kind seem to parallelize this operation for a single, massive A, but I don't see how they would help me. Is this possible?
Thanks!
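One possible approach, sketched here under assumptions rather than taken from the thread: because A is only 4x4, each GPU thread can solve its own system with a hand-written Gaussian elimination with partial pivoting, so no LAPACK call is needed inside the kernel. The names `solve4x4` and `solveMany`, and the packed layout (system i stored at `As[16*i]`, `bs[4*i]`, `xs[4*i]`, column-major as in MATLAB), are hypothetical choices for illustration. The solver compiles as plain C++ and, under nvcc, as a device function.

```cpp
// Under nvcc, make the solver callable from both host and device code.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Solve one 4x4 system A*x = b by Gaussian elimination with partial pivoting.
// A is column-major: element (r, c) is A[r + 4*c]. A and b are overwritten.
// Returns false if a pivot is exactly zero (singular matrix).
HD inline bool solve4x4(double* A, double* b, double* x) {
    const int n = 4;
    for (int k = 0; k < n; ++k) {
        // Pick the row with the largest magnitude in column k.
        int p = k;
        double best = A[k + n * k] < 0 ? -A[k + n * k] : A[k + n * k];
        for (int r = k + 1; r < n; ++r) {
            double v = A[r + n * k] < 0 ? -A[r + n * k] : A[r + n * k];
            if (v > best) { best = v; p = r; }
        }
        if (best == 0.0) return false;
        if (p != k) {
            // Swap rows k and p (only the columns still in play) and b.
            for (int c = k; c < n; ++c) {
                double t = A[k + n * c]; A[k + n * c] = A[p + n * c]; A[p + n * c] = t;
            }
            double t = b[k]; b[k] = b[p]; b[p] = t;
        }
        // Eliminate column k below the pivot.
        for (int r = k + 1; r < n; ++r) {
            double f = A[r + n * k] / A[k + n * k];
            for (int c = k + 1; c < n; ++c) A[r + n * c] -= f * A[k + n * c];
            b[r] -= f * b[k];
        }
    }
    // Back substitution on the upper-triangular system.
    for (int r = n - 1; r >= 0; --r) {
        double s = b[r];
        for (int c = r + 1; c < n; ++c) s -= A[r + n * c] * x[c];
        x[r] = s / A[r + n * r];
    }
    return true;
}

#ifdef __CUDACC__
// One thread per system; the packed layout is an assumption of this sketch.
__global__ void solveMany(const double* As, const double* bs, double* xs,
                          long long count) {
    long long i = blockIdx.x * (long long)blockDim.x + threadIdx.x;
    if (i >= count) return;
    double A[16], b[4], x[4] = {0.0, 0.0, 0.0, 0.0};
    for (int j = 0; j < 16; ++j) A[j] = As[16 * i + j];
    for (int j = 0; j < 4; ++j) b[j] = bs[4 * i + j];
    solve4x4(A, b, x);  // a singular system may leave x only partly filled
    for (int j = 0; j < 4; ++j) xs[4 * i + j] = x[j];
}
#endif
```

With this layout, a launch such as `solveMany<<<(count + 255) / 256, 256>>>(As, bs, xs, count)` would assign one system to each thread. cuBLAS also provides batched LU routines (`cublasDgetrfBatched` and `cublasDgetrsBatched`) intended for many small matrices, which may be worth comparing against a hand-written kernel like this one.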