Using a templated CUDA kernel via MATLAB
Hello,
Is it possible to use a C++-style templated CUDA kernel via MATLAB's GPU Computing interface?
For example, consider the following (useless) toy code:
template<typename T>
__global__ void get_nans(T*, const int*);

template<>
__global__ void get_nans<double>(double* out, const int* dims)
{
    const int tx = blockIdx.x*blockDim.x + threadIdx.x;
    const int ty = blockIdx.y*blockDim.y + threadIdx.y;
    if ((tx < dims[1]) && (ty < dims[0]))
        out[tx*dims[0] + ty] = nan(0);
}

template<>
__global__ void get_nans<float>(float* out, const int* dims)
{
    const int tx = blockIdx.x*blockDim.x + threadIdx.x;
    const int ty = blockIdx.y*blockDim.y + threadIdx.y;
    if ((tx < dims[1]) && (ty < dims[0]))
        out[tx*dims[0] + ty] = nanf(0);
}
I then compile this into PTX code, but when I try to instantiate the kernel object in MATLAB I get the following error:
>> k = parallel.gpu.CUDAKernel( 'get_nans.ptx', 'get_nans.cu' );
Error using handleKernelArgs (line 61)
Found multiple matching entries in the PTX code. Matches found:
_Z16get_nansIdEvPT_PKS0_S3_S3_PKiS5_
_Z16get_nansIfEvPT_PKS0_S3_S3_PKiS5_
Thank you,
Alex
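
Note: the error above appears to arise because each template specialization is emitted into the PTX file as a separate entry with a C++-mangled name, so a lookup by the base name matches more than one entry. One possible workaround, sketched here under assumptions rather than as a confirmed fix, is to keep the templated logic in a __device__ function and expose one extern "C" __global__ wrapper per type, so that every PTX entry has a distinct, unmangled name. The names get_nans_impl, make_nan, get_nans_double and get_nans_float are hypothetical and do not come from the original post.

```cuda
// Hypothetical helper (not in the original post): returns a quiet NaN of type T.
template<typename T> __device__ T make_nan();
template<> __device__ double make_nan<double>() { return nan(""); }
template<> __device__ float  make_nan<float>()  { return nanf(""); }

// Templated implementation shared by all wrappers. It is a __device__
// function, so it does not produce a kernel entry point of its own.
template<typename T>
__device__ void get_nans_impl(T* out, const int* dims)
{
    const int tx = blockIdx.x*blockDim.x + threadIdx.x;
    const int ty = blockIdx.y*blockDim.y + threadIdx.y;
    // Same bounds check and column-major indexing as in the question.
    if ((tx < dims[1]) && (ty < dims[0]))
        out[tx*dims[0] + ty] = make_nan<T>();
}

// extern "C" disables C++ name mangling, so each kernel should appear
// in the PTX under exactly the name written here.
extern "C" __global__ void get_nans_double(double* out, const int* dims)
{
    get_nans_impl<double>(out, dims);
}

extern "C" __global__ void get_nans_float(float* out, const int* dims)
{
    get_nans_impl<float>(out, dims);
}
```

With wrappers like these, the entry point can be named explicitly through the optional third argument of the constructor, for example parallel.gpu.CUDAKernel('get_nans.ptx', 'get_nans.cu', 'get_nans_double'). The same third argument may also accept one of the mangled names listed in the error message, which would avoid changing the .cu file, although that has not been checked against this particular code.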