parallel.gpu.CUDAKernel() not working

King Fisher on 27 Nov 2011
I have written the following .cu code:
__global__ void addToVector(float * pi, float c, int vecLen) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if ( idx < vecLen ) {
        pi[idx] += c;
    }
}
I then compiled it with:
nvcc -ptx code.cu //which generates a code.ptx file
The problem is that when I use these files to create a kernel object with
k = parallel.gpu.CUDAKernel('code.ptx','code.cu')
the following message appears:
??? Error using ==> iParseToken at 266
Unsupported type in argument specification "fahad.cu".
Error in ==>
/usr/local*/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>iParseCPrototype at 192
Error in ==>
/usr/local/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>handleKernelArgs at 79
Note that there is no problem with the hardware or drivers; the gpuArray command works fine.
Could anybody help me with this?
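One way to confirm that the kernel itself is sound, independently of MATLAB, is to launch it from a small host function using the CUDA runtime API. The following is a minimal sketch: the helper name addToVectorOnDevice and the block size of 256 are assumptions made for illustration, not part of the original post.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Kernel from the question: adds c to each of the first vecLen elements.
__global__ void addToVector(float *pi, float c, int vecLen) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < vecLen) {   // guard: the last block may have spare threads
        pi[idx] += c;
    }
}

// Hypothetical helper (not from the thread): applies the kernel to a host
// vector in place. Returns false if any CUDA runtime call fails.
bool addToVectorOnDevice(std::vector<float> &v, float c) {
    const int n = static_cast<int>(v.size());
    if (n == 0) return true;                      // nothing to do
    const size_t bytes = v.size() * sizeof(float);

    float *d = nullptr;
    if (cudaMalloc(&d, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(d, v.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 256;                  // assumed block size
        const int blocks = (n + threads - 1) / threads;  // round up
        addToVector<<<blocks, threads>>>(d, c, n);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(v.data(), d, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d);
    return ok;
}
```

If this succeeds when called from a main function on a vector of known values, the kernel and the device are working, and the problem lies in how the .ptx and .cu files are handed to CUDAKernel.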

Answers (1)

Edric Ellis on 28 Nov 2011
CUDAKernel is very restrictive about precisely which types of argument it accepts; in particular, it only accepts basic "C" types (and const/pointer variants). Do you have any other __global__ entry points in the .cu file that you compiled to .ptx?
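To illustrate the distinction drawn above, here is a sketch of a .cu file with two __global__ entry points: one takes a struct by value, the other takes the same values as basic C scalars. Every name in it (ScaleParams, scaleShiftStruct, scaleShiftFlat, flatMatchesStruct) is hypothetical and does not come from this thread. Both kernels are valid CUDA; only the flattened signature stays within the basic types described above.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical struct: an argument type that is NOT a basic C type.
struct ScaleParams {
    float scale;
    float shift;
};

// Entry point 1: struct passed by value (valid CUDA, non-basic argument type).
__global__ void scaleShiftStruct(float *data, ScaleParams p, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * p.scale + p.shift;
}

// Entry point 2: same computation, struct flattened into basic C scalars.
__global__ void scaleShiftFlat(float *data, float scale, float shift, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * scale + shift;
}

// Hypothetical host check: runs both kernels on identical input (n > 0)
// and reports whether they produce the same output.
bool flatMatchesStruct(int n) {
    std::vector<float> a(n, 2.0f), b(n, 2.0f);
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *da = nullptr, *db = nullptr;
    if (cudaMalloc(&da, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&db, bytes) != cudaSuccess) { cudaFree(da); return false; }

    bool ok = cudaMemcpy(da, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemcpy(db, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const ScaleParams p = {3.0f, 1.0f};
        const int threads = 128;                          // assumed block size
        const int blocks = (n + threads - 1) / threads;
        scaleShiftStruct<<<blocks, threads>>>(da, p, n);
        scaleShiftFlat<<<blocks, threads>>>(db, p.scale, p.shift, n);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(a.data(), da, bytes, cudaMemcpyDeviceToHost) == cudaSuccess &&
             cudaMemcpy(b.data(), db, bytes, cudaMemcpyDeviceToHost) == cudaSuccess &&
             a == b;
    }
    cudaFree(da);
    cudaFree(db);
    return ok;
}
```

Flattening a struct into scalars changes nothing in the computation, so it is a cheap way to keep every entry point in the file within the argument types CUDAKernel is said to accept.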
  2 Comments
King Fisher on 29 Nov 2011
I have the following simple .cu code:
__global__ void addToVector(float * pi, float c, int vecLen) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if ( idx < vecLen ) {
        pi[idx] += c;
    }
}
After compiling it with the command nvcc -ptx code.cu, I got the following .ptx code:
.version 1.4
.target sm_10, map_f64_to_f32
// compiled with /usr/local/cuda/open64/lib//be
// nvopencc 4.0 built on 2011-05-12
//-----------------------------------------------------------
// Compiling /tmp/tmpxft_00006747_00000000-9_fahad.cpp3.i (/tmp/ccBI#.8BZJzq)
//-----------------------------------------------------------
//-----------------------------------------------------------
// Options:
//-----------------------------------------------------------
// Target:ptx, ISA:sm_10, Endian:little, Pointer Size:64
// -O3 (Optimization level)
// -g0 (Debug level)
// -m2 (Report advisories)
//-----------------------------------------------------------
.file 1 "<command-line>"
.file 2 "/tmp/tmpxft_00006747_00000000-8_fahad.cudafe2.gpu"
.file 3 "/usr/lib/gcc/x86_64-redhat-linux/4.4.5/include/stddef.h"
.file 4 "/usr/local/cuda/bin/../include/crt/device_runtime.h"
.file 5 "/usr/local/cuda/bin/../include/host_defines.h"
.file 6 "/usr/local/cuda/bin/../include/builtin_types.h"
.file 7 "/usr/local/cuda/bin/../include/device_types.h"
.file 8 "/usr/local/cuda/bin/../include/driver_types.h"
.file 9 "/usr/local/cuda/bin/../include/surface_types.h"
.file 10 "/usr/local/cuda/bin/../include/texture_types.h"
.file 11 "/usr/local/cuda/bin/../include/vector_types.h"
.file 12 "/usr/local/cuda/bin/../include/device_launch_parameters.h"
.file 13 "/usr/local/cuda/bin/../include/crt/storage_class.h"
.file 14 "/usr/include/bits/types.h"
.file 15 "/usr/include/time.h"
.file 16 "fahad.cu"
.file 17 "/usr/local/cuda/bin/../include/common_functions.h"
.file 18 "/usr/local/cuda/bin/../include/math_functions.h"
.file 19 "/usr/local/cuda/bin/../include/math_constants.h"
.file 20 "/usr/local/cuda/bin/../include/device_functions.h"
.file 21 "/usr/local/cuda/bin/../include/sm_11_atomic_functions.h"
.file 22 "/usr/local/cuda/bin/../include/sm_12_atomic_functions.h"
.file 23 "/usr/local/cuda/bin/../include/sm_13_double_functions.h"
.file 24 "/usr/local/cuda/bin/../include/sm_20_atomic_functions.h"
.file 25 "/usr/local/cuda/bin/../include/sm_20_intrinsics.h"
.file 26 "/usr/local/cuda/bin/../include/surface_functions.h"
.file 27 "/usr/local/cuda/bin/../include/texture_fetch_functions.h"
.file 28 "/usr/local/cuda/bin/../include/math_functions_dbl_ptx1.h"
.entry _Z11addToVectorPffi (
.param .u64 __cudaparm__Z11addToVectorPffi_pi,
.param .f32 __cudaparm__Z11addToVectorPffi_c,
.param .s32 __cudaparm__Z11addToVectorPffi_vecLen)
{
.reg .u16 %rh<4>;
.reg .u32 %r<6>;
.reg .u64 %rd<6>;
.reg .f32 %f<5>;
.reg .pred %p<3>;
.loc 16 1 0
$LDWbegin__Z11addToVectorPffi:
mov.u16 %rh1, %ctaid.x;
mov.u16 %rh2, %ntid.x;
mul.wide.u16 %r1, %rh1, %rh2;
cvt.u32.u16 %r2, %tid.x;
add.u32 %r3, %r2, %r1;
ld.param.s32 %r4, [__cudaparm__Z11addToVectorPffi_vecLen];
setp.le.s32 %p1, %r4, %r3;
@%p1 bra $Lt_0_1026;
.loc 16 4 0
ld.param.u64 %rd1, [__cudaparm__Z11addToVectorPffi_pi];
cvt.s64.s32 %rd2, %r3;
mul.wide.s32 %rd3, %r3, 4;
add.u64 %rd4, %rd1, %rd3;
ld.global.f32 %f1, [%rd4+0];
ld.param.f32 %f2, [__cudaparm__Z11addToVectorPffi_c];
add.f32 %f3, %f1, %f2;
st.global.f32 [%rd4+0], %f3;
$Lt_0_1026:
.loc 16 6 0
exit;
$LDWend__Z11addToVectorPffi:
} // _Z11addToVectorPffi
This is the error I get when I run
k = parallel.gpu.CUDAKernel('code.ptx','code.cu')
The following message appears:
??? Error using ==> iParseToken at 266
Unsupported type in argument specification "fahad.cu".
Error in ==>
/usr/local*/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>iParseCPrototype at 192
Error in ==>
/usr/local/Matlab/2011a/toolbox/distcomp/gpu/+parallel/+internal/+gpu/handleKernelArgs.p>handleKernelArgs at 79
Edric Ellis on 29 Nov 2011
I'm not sure, but it looks like you're using a CUDA-4 compiler and R2011a. R2011a shipped with the CUDA 3.2 toolkit, so you might have better luck using that version of NVCC.
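To check which toolkit a given nvcc belongs to, one option is to decode the CUDART_VERSION macro that the toolkit headers define, encoded as 1000 * major + 10 * minor (so 3020 means CUDA 3.2 and 4000 means CUDA 4.0). The sketch below does that; the helper names and the ToolkitVersion struct are assumptions made for illustration.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper type: a decoded CUDA version number.
struct ToolkitVersion {
    int majorVer;
    int minorVer;
};

// CUDA encodes versions as 1000 * major + 10 * minor.
ToolkitVersion decodeCudaVersion(int v) {
    ToolkitVersion t = {v / 1000, (v % 1000) / 10};
    return t;
}

// Version of the toolkit whose headers nvcc used when compiling this file.
ToolkitVersion compiledToolkitVersion() {
    return decodeCudaVersion(CUDART_VERSION);
}

// Prints the compile-time toolkit version and the newest CUDA version
// supported by the installed driver.
void printToolkitVersions() {
    const ToolkitVersion c = compiledToolkitVersion();
    std::printf("compiled with CUDA toolkit %d.%d\n", c.majorVer, c.minorVer);

    int driver = 0;
    if (cudaDriverGetVersion(&driver) == cudaSuccess) {
        const ToolkitVersion d = decodeCudaVersion(driver);
        std::printf("driver supports up to CUDA %d.%d\n", d.majorVer, d.minorVer);
    }
}
```

Calling printToolkitVersions() from a main function shows which toolkit produced the binary, and that can be compared with the version the MATLAB release shipped with (CUDA 3.2 for R2011a, per the comment above).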
