GPU CUDA kernel malloc error

Question

Gaszton, May 10, 2011

https://kr.mathworks.com/matlabcentral/answers/7172-gpu-cuda-kernel-malloc-error

Hello, I have a GeForce 425M card with compute capability 2.1. I wrote a kernel that uses malloc inside the kernel. At first the PTX file did not compile. Then I set the nvcc parameter -arch=sm_21 ( nvcc -I "D:\...VC\include" -arch=sm_21 -use_fast_math -ptx SR2.cu ). With this it compiled successfully, but I was wondering why I need to specify that. After that I tried to create the kernel in MATLAB:

ckernel=parallel.gpu.CUDAKernel('SR2.ptx', 'SR2.cu');

But I get the error:

    ??? Error using ==> parallel.gpu.CUDAKernel
    An error occurred during PTX compilation of <image>.
    The information log was:
    : Considering profile 'compute_20' for gpu='sm_21' in
    'cuModuleLoadDataEx_2a9
    The error log was:
    The CUDA error code was: CUDA_ERROR_INVALID_IMAGE.

Before I modified the kernel to use malloc, and without specifying -arch=sm_21 to nvcc, I was able to run my kernel from MATLAB without any problem.

I think that there is some configuration problem with CUDA. I hope someone has some idea how to solve this.

Thanks,

Gaszton

Comments: 1

Gaszton, May 10, 2011

It seems that there is no option in cuModuleLoadDataEx for compute capability 2.1:

CUjit_target_enum; possible values are:

CU_TARGET_COMPUTE_10

CU_TARGET_COMPUTE_11

CU_TARGET_COMPUTE_12

CU_TARGET_COMPUTE_13

CU_TARGET_COMPUTE_20

http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/online/group__CUDA__MODULE_g9e8047e9dbf725f0cd7cafd18bfd4d12.html#g9e8047e9dbf725f0cd7cafd18bfd4d12

But in the CUDA Toolkit 3.2 release notes I found:

Added CU_TARGET_COMPUTE_21 to JIT options.
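To look at the JIT logs outside MATLAB, the PTX can be loaded directly through the driver API. Below is a rough sketch, not what MATLAB does internally. It assumes a toolkit recent enough to provide the primary-context calls (they are not part of CUDA 3.2), and `load_ptx` is a hypothetical helper name. No `CU_JIT_TARGET` option is passed, so the JIT picks the target from the current context's device.

```cuda
// load_ptx.cu (hypothetical file name); build with: nvcc load_ptx.cu -lcuda
#include <cstdint>
#include <string>
#include <cuda.h>

// JIT-loads a PTX string on device 0 and returns the driver result code.
// The info and error logs produced by the JIT are copied into the two strings.
CUresult load_ptx( const std::string & ptx, std::string & info_log, std::string & error_log ) {
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;
    CUresult rc;
    if ( ( rc = cuInit( 0 ) ) != CUDA_SUCCESS ) return rc;
    if ( ( rc = cuDeviceGet( &dev, 0 ) ) != CUDA_SUCCESS ) return rc;
    if ( ( rc = cuDevicePrimaryCtxRetain( &ctx, dev ) ) != CUDA_SUCCESS ) return rc;
    cuCtxSetCurrent( ctx );

    char info[4096] = "";
    char err[4096] = "";
    // Buffer sizes are passed by value, cast into the void* option slot.
    CUjit_option opts[] = { CU_JIT_INFO_LOG_BUFFER, CU_JIT_INFO_LOG_BUFFER_SIZE_BYTES,
                            CU_JIT_ERROR_LOG_BUFFER, CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES };
    void * vals[] = { info, (void *) (uintptr_t) sizeof( info ),
                      err, (void *) (uintptr_t) sizeof( err ) };

    rc = cuModuleLoadDataEx( &mod, ptx.c_str(), 4, opts, vals );
    info_log = info;
    error_log = err;

    if ( rc == CUDA_SUCCESS ) cuModuleUnload( mod );
    cuDevicePrimaryCtxRelease( dev );
    return rc;
}
```

Reading SR2.ptx into a string and passing it to `load_ptx` should show whether the installed driver accepts the image, independently of the CUDA version that MATLAB was built against.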


Answer 1

Edric Ellis, May 11, 2011


You can get that error message if you have a mismatch between the CUDA runtime in use by Parallel Computing Toolbox and the version of nvcc that you're using. If you're using R2010b, you need to use CUDA-3.1; for R2011a, you can use CUDA-3.2. I was able to compile and use the following trivial kernel:

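    // simple.cu
    __global__ void fcn( double * out ) {
        // In-kernel malloc: allocates 1024 bytes from the device heap.
        int * x = (int *) malloc( 1024 );
        // x[0] is never written, so the value copied out is arbitrary.
        out[0] = x[0];
        free( x );
    }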

By compiling like so:

$ /usr/local/cuda32/cuda/bin/nvcc -arch compute_20 -ptx simple.cu

and then using within MATLAB R2011a like so:

>> k = parallel.gpu.CUDAKernel( 'simple.ptx' );
>> gather(k.feval(0))
ans =
       1.768515945000000e+09
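The number printed above is arbitrary, since the kernel reads x[0] before anything has been written to it. A slightly more defensive variant, as a sketch only: it checks the pointer returned by the in-kernel malloc (which is NULL when the device heap cannot satisfy the request) and stores a known value before reading it back. The file name, the sentinel values -1 and -2, and the host helper `run_fcn_once` are assumptions made for illustration.

```cuda
// simple_checked.cu (hypothetical file name)
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void fcn( double * out ) {
    // In-kernel malloc draws from the device heap and returns NULL on failure.
    int * x = (int *) malloc( 1024 );
    if ( x == NULL ) {
        out[0] = -1.0;   // sentinel chosen for this sketch: allocation failed
        return;
    }
    x[0] = 42;           // write a known value before reading it back
    out[0] = x[0];
    free( x );
}

// Hypothetical host helper for trying the kernel outside MATLAB.
// Returns the kernel result, or -2 if any runtime call fails.
double run_fcn_once() {
    double * d_out = NULL;
    double h_out = 0.0;
    if ( cudaMalloc( (void **) &d_out, sizeof( double ) ) != cudaSuccess ) return -2.0;
    fcn<<<1, 1>>>( d_out );
    // Launch failures (for example, no kernel image for this GPU) show up here.
    cudaError_t err = cudaGetLastError();
    if ( err == cudaSuccess ) {
        // cudaMemcpy waits for the kernel to finish before copying.
        err = cudaMemcpy( &h_out, d_out, sizeof( double ), cudaMemcpyDeviceToHost );
    }
    cudaFree( d_out );
    return ( err == cudaSuccess ) ? h_out : -2.0;
}
```

Compiled to PTX with the same nvcc line as simple.cu and loaded the same way, gather(k.feval(0)) should then give 42, or -1 if the allocation failed.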

Comments: 2

Gaszton, May 11, 2011

Thank you for your help,

I have R2010b and CUDA Toolkit 3.2.

Everything worked until I specified the -arch option to nvcc.

If I don't specify it, what is the default? I wonder why it is not 2.1, given that my card has compute capability 2.1.

If I compile my .cu file with -arch compute_20 or sm_20, I still get the error from MATLAB.

Should I install CUDA Toolkit 3.1 and check whether it works?

With CUDA 3.1, am I able to use kernel malloc?

Thank you,

Gaszton

Gaszton, May 11, 2011

It seems that CUDA 3.1 does not support kernel malloc.

Otherwise, with 3.1 I am able to use sm_21 code in MATLAB.
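On the open question about the default target: as far as I know, nvcc in the CUDA 3.x toolkits builds for compute capability 1.0 unless -arch is given, whatever card is installed, which would explain why the malloc kernel only compiled once -arch was set. A minimal sketch of catching this at compile time, assuming the standard `__CUDA_ARCH__` macro. The file name, the kernel `report_arch` and the host helper `query_arch` are made up for illustration.

```cuda
// arch_guard.cu (hypothetical file name)
#include <cuda_runtime.h>

// __CUDA_ARCH__ is defined only while nvcc compiles device code,
// for example 200 for -arch compute_20. Refuse to build for anything older.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#error "In-kernel malloc needs compute capability 2.0+; pass -arch compute_20 to nvcc"
#endif

__global__ void report_arch( double * out ) {
#ifdef __CUDA_ARCH__
    out[0] = __CUDA_ARCH__;   // virtual architecture this device code was built for
#endif
}

// Hypothetical host helper: runs the kernel once and returns the reported
// architecture number, or -1 if any runtime call fails.
double query_arch() {
    double * d_out = NULL;
    double h_out = 0.0;
    if ( cudaMalloc( (void **) &d_out, sizeof( double ) ) != cudaSuccess ) return -1.0;
    report_arch<<<1, 1>>>( d_out );
    cudaError_t err = cudaGetLastError();
    if ( err == cudaSuccess ) {
        err = cudaMemcpy( &h_out, d_out, sizeof( double ), cudaMemcpyDeviceToHost );
    }
    cudaFree( d_out );
    return ( err == cudaSuccess ) ? h_out : -1.0;
}
```

With the guard in place, a build without a suitable -arch stops with a readable message instead of a malloc-related compile error, and the kernel can be called from MATLAB to confirm which architecture the loaded PTX was generated for.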
