GPU CUDA kernel malloc error

Gaszton on 10 May 2011
Hello, I have a GeForce 425M card with compute capability 2.1. I wrote a kernel that calls malloc inside the kernel. At first the PTX file did not compile. Then I set the nvcc parameter arch=sm_21:
nvcc -I "D:\...VC\include" -arch=sm_21 -use_fast_math -ptx SR2.cu
With this it compiled successfully, though I wonder why I need to specify that. After that I tried to create the kernel in MATLAB:
ckernel=parallel.gpu.CUDAKernel('SR2.ptx', 'SR2.cu');
But I get the error:
??? Error using ==> parallel.gpu.CUDAKernel
An error occurred during PTX compilation of <image>.
The information log was:
: Considering profile 'compute_20' for gpu='sm_21' in
'cuModuleLoadDataEx_2a9
The error log was:
The CUDA error code was: CUDA_ERROR_INVALID_IMAGE.
Before I modified the kernel to use malloc, and without specifying nvcc arch=sm_21, I was able to run my kernel from MATLAB without any problem.
I think there is some configuration problem with CUDA. I hope someone has an idea how to solve this.
Thanks,
Gaszton
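On the "why do I need to specify that" part: in-kernel malloc is only available on devices of compute capability 2.0 and higher, and as far as I know nvcc from the 3.x toolkits targets sm_10 unless an -arch option is given, which would explain the first compile failure. A minimal sketch of a kernel that makes this requirement explicit. This is not the SR2.cu kernel from the question; the names scratch_sum and run_scratch_sum and the sizes are made up for illustration.

```cuda
// scratch_sum.cu: illustrative only; names and sizes are hypothetical.
// Build for a Fermi-class card, e.g.: nvcc -arch=sm_20 scratch_sum.cu
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Fail early with a readable message if device code is built for an
// architecture that has no in-kernel malloc.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#error "In-kernel malloc needs compute capability 2.0+; pass -arch=sm_20 or higher"
#endif

// Each thread allocates a small scratch buffer on the device heap,
// fills it with 0..n-1, sums it, and frees it again.
__global__ void scratch_sum(int *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int *scratch = (int *)malloc(n * sizeof(int));
    if (scratch == NULL) {          // device heap exhausted
        out[tid] = -1;
        return;
    }
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        scratch[i] = i;
        sum += scratch[i];
    }
    out[tid] = sum;
    free(scratch);
}

// Launch one block and return thread 0's result, or -2 if any CUDA call
// (including the launch itself) failed.
int run_scratch_sum(int n, int threads)
{
    int *d_out = NULL;
    if (cudaMalloc((void **)&d_out, threads * sizeof(int)) != cudaSuccess)
        return -2;
    scratch_sum<<<1, threads>>>(d_out, n);
    int first = -2;
    cudaError_t err = cudaMemcpy(&first, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? first : -2;
}

int main()
{
    // 0 + 1 + ... + 7 = 28 when the allocation succeeds.
    printf("thread 0 result: %d\n", run_scratch_sum(8, 32));
    return 0;
}
```

Checking the pointer returned by the in-kernel malloc matters because the device heap has a fixed size, so allocations can fail once many threads request memory at the same time.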
Comments: 1
Gaszton on 10 May 2011
It seems that cuModuleLoadDataEx has no option for compute capability 2.1. The possible values of CUjit_target_enum are:
CU_TARGET_COMPUTE_10
CU_TARGET_COMPUTE_11
CU_TARGET_COMPUTE_12
CU_TARGET_COMPUTE_13
CU_TARGET_COMPUTE_20
http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/online/group__CUDA__MODULE_g9e8047e9dbf725f0cd7cafd18bfd4d12.html#g9e8047e9dbf725f0cd7cafd18bfd4d12
But in the CUDA Toolkit 3.2 release notes I found:
Added CU_TARGET_COMPUTE_21 to JIT options.
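The log in the question names cuModuleLoadDataEx, the driver API call that JIT-compiles PTX. As a rough sketch, a load like that can be reproduced outside MATLAB to read the JIT info and error logs directly. This is not MATLAB's actual code: the helper read_text_file and the default file name are assumptions for illustration, and the primary-context calls come from newer toolkits than 3.2.

```cuda
// load_ptx.cu: sketch only; build with: nvcc load_ptx.cu -lcuda
#include <cstdio>
#include <cstdlib>
#include <cuda.h>

// Hypothetical helper: read a whole file into a NUL-terminated buffer.
// Returns NULL on failure; the caller frees the result.
static char *read_text_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    char *buf = (size >= 0) ? (char *)malloc((size_t)size + 1) : NULL;
    if (buf && fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf);
        buf = NULL;
    }
    if (buf) buf[size] = '\0';
    fclose(f);
    return buf;
}

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "SR2.ptx";
    char *ptx = read_text_file(path);
    if (!ptx) { fprintf(stderr, "cannot read %s\n", path); return 1; }

    CUdevice dev;
    CUcontext ctx;
    if (cuInit(0) != CUDA_SUCCESS ||
        cuDeviceGet(&dev, 0) != CUDA_SUCCESS ||
        cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS ||
        cuCtxSetCurrent(ctx) != CUDA_SUCCESS) {
        fprintf(stderr, "no usable CUDA device or driver\n");
        free(ptx);
        return 1;
    }

    // Ask the JIT for its logs. No CU_JIT_TARGET is passed, so the target
    // is taken from the current context's device; passing CU_JIT_TARGET
    // with a CUjit_target value would pin it instead.
    char info_log[4096] = "";
    char error_log[4096] = "";
    CUjit_option opts[] = {
        CU_JIT_INFO_LOG_BUFFER,  CU_JIT_INFO_LOG_BUFFER_SIZE_BYTES,
        CU_JIT_ERROR_LOG_BUFFER, CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES,
    };
    void *vals[] = {
        info_log,  (void *)(size_t)sizeof(info_log),
        error_log, (void *)(size_t)sizeof(error_log),
    };

    CUmodule mod;
    CUresult rc = cuModuleLoadDataEx(&mod, ptx, 4, opts, vals);
    printf("result code: %d%s\n", (int)rc,
           rc == CUDA_ERROR_INVALID_IMAGE ? " (CUDA_ERROR_INVALID_IMAGE)" : "");
    printf("info log:  %s\n", info_log);
    printf("error log: %s\n", error_log);

    if (rc == CUDA_SUCCESS) cuModuleUnload(mod);
    cuDevicePrimaryCtxRelease(dev);
    free(ptx);
    return rc == CUDA_SUCCESS ? 0 : 1;
}
```

Running this against the same .ptx file with the driver that MATLAB uses should show whether the rejection comes from the PTX itself or from the way it is being loaded.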


Accepted Answer

Edric Ellis on 11 May 2011
You can get that error message if you have a mismatch between the CUDA runtime in use by Parallel Computing Toolbox and the version of nvcc that you're using. If you're using R2010b, you need to use CUDA-3.1; for R2011a, you can use CUDA-3.2. I was able to compile and use the following trivial kernel:
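// simple.cu
__global__ void fcn( double * out ) {
    // Allocate 1024 bytes from the device heap (needs compute capability 2.x).
    int * x = (int *) malloc( 1024 );
    // x[0] is never written, so this copies an arbitrary value.
    out[0] = x[0];
    free( x );
}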
By compiling like so:
$ /usr/local/cuda32/cuda/bin/nvcc -arch compute_20 -ptx simple.cu
and then using within MATLAB R2011a like so:
>> k = parallel.gpu.CUDAKernel( 'simple.ptx' );
>> gather(k.feval(0))
ans =
1.768515945000000e+09
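To check for the kind of version mismatch described above, the toolkit version that nvcc compiled against can be compared with the runtime and driver versions found when the program runs. A minimal sketch, assuming a standalone program; it reports the runtime this program links against, not the one bundled with Parallel Computing Toolbox, and the helper names version_major and version_minor are made up here.

```cuda
// versions.cu: sketch; build with the same nvcc that produces the PTX.
#include <cstdio>
#include <cuda_runtime.h>

// CUDA encodes versions as 1000*major + 10*minor (3020 means 3.2).
static int version_major(int v) { return v / 1000; }
static int version_minor(int v) { return (v % 1000) / 10; }

int main()
{
    int runtime = 0, driver = 0;
    cudaRuntimeGetVersion(&runtime);   // runtime library loaded at run time
    cudaDriverGetVersion(&driver);     // newest CUDA version the driver supports

    // CUDART_VERSION is fixed at compile time by the toolkit headers.
    printf("toolkit headers (nvcc): %d.%d\n",
           version_major(CUDART_VERSION), version_minor(CUDART_VERSION));
    printf("runtime library:        %d.%d\n",
           version_major(runtime), version_minor(runtime));
    printf("driver supports up to:  %d.%d\n",
           version_major(driver), version_minor(driver));
    return 0;
}
```

If the first line reports a newer toolkit than the runtime that will load the PTX, that is the situation the answer describes.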
Comments: 2
Gaszton on 11 May 2011
Thank you for your help.
I have R2010b and CUDA Toolkit 3.2.
Everything worked until I specified the -arch option to nvcc.
If I don't specify it, what is the default? I wonder why it is not 2.1, given that my card has compute capability 2.1.
If I compile my .cu file with -arch compute_20 or sm_20, I still get the error from MATLAB.
Should I install CUDA Toolkit 3.1 and see whether that works?
With CUDA 3.1, will I be able to use kernel malloc?
Thank you,
Gaszton
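On the question of the default: as far as I know, nvcc does not look at the installed card at all, and the 3.x toolkits target sm_10 when no -arch option is given. One way to see what a given build actually targeted is to have a kernel report the __CUDA_ARCH__ value it was compiled with. A small sketch; the names report_arch and encode_capability are hypothetical.

```cuda
// which_arch.cu: sketch; compile once with no -arch flag and once with
// an explicit one (for example -arch=sm_20), then compare the output.
#include <cstdio>
#include <cuda_runtime.h>

// __CUDA_ARCH__ is only defined while nvcc compiles device code. It holds
// the architecture being compiled for as major*100 + minor*10, so 200
// corresponds to compute_20.
__global__ void report_arch(int *out)
{
#ifdef __CUDA_ARCH__
    *out = __CUDA_ARCH__;
#endif
}

// Fold a device's reported capability into the same encoding.
static int encode_capability(int major, int minor)
{
    return major * 100 + minor * 10;
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "no usable CUDA device\n");
        return 1;
    }

    int *d_arch = NULL;
    int h_arch = 0;
    if (cudaMalloc((void **)&d_arch, sizeof(int)) != cudaSuccess) return 1;
    cudaMemset(d_arch, 0, sizeof(int));
    report_arch<<<1, 1>>>(d_arch);
    cudaMemcpy(&h_arch, d_arch, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_arch);

    printf("device capability:   %d.%d (%d)\n", prop.major, prop.minor,
           encode_capability(prop.major, prop.minor));
    printf("kernel compiled for: %d\n", h_arch);
    return 0;
}
```

The two numbers need not match: code built for a lower virtual architecture can still run on a newer card, but it cannot use features such as in-kernel malloc that the lower architecture lacks.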
Gaszton on 11 May 2011
It seems that CUDA 3.1 does not support kernel malloc.
Other than that, with 3.1 I am able to use sm_21 code in MATLAB.


More Answers (0)
