CUDAKernel Object crashes GPU

Omer Hamburger, 27 Sep 2018
Commented: Omer Hamburger, 8 Oct 2018
Hi,
I am running a calculation in MATLAB on the GPU using a CUDAKernel object.
It worked fine with a grid size of 41x41, but with other grid sizes the GPU crashes. It does not look like a memory problem, since it works with 61x61 but crashes with 55x55. The results are correct (I compared them with a CPU calculation).
After loading all my data onto the GPU, and before executing the kernel, I still had about 1.5 GB free out of the 2 GB of 'Dedicated GPU Memory'.
Does the size of the "Result" vector that I send to the GPU change during the calculation? I send a vector of zeros, and each thread then calculates the value of a different cell in that vector.
The error message I get when I close MATLAB is:
NVIDIA OpenGL Driver Unable to recover from a kernel exception. The application must close.
Error code: 3 (subcode 2)
I tried changing the global setting in the NVIDIA Control Panel to "3D App - Visual Simulation".
This trick worked for the 55x55 grid but did not solve the problem for other sizes such as 71x71, which makes me think it is a step in the right direction but not quite sufficient.
Thank you very much. I am looking forward to your help.
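Editor's note: the free-memory figure and the driver error mentioned above can be cross-checked from within MATLAB. The following is a minimal sketch, assuming Parallel Computing Toolbox is installed; it only reads properties of the currently selected device and does not diagnose the crash. The KernelExecutionTimeout property reports whether the device can abort long-running kernels, which may be relevant when the same GPU also drives the display.

```matlab
% Query the selected GPU (requires Parallel Computing Toolbox).
hasGpu = gpuDeviceCount > 0;
if hasGpu
    g = gpuDevice;   % handle to the currently selected device
    fprintf('Device: %s\n', g.Name);
    % AvailableMemory and TotalMemory are reported in bytes.
    fprintf('Free memory: %.2f of %.2f GiB\n', ...
        g.AvailableMemory / 2^30, g.TotalMemory / 2^30);
    % True (1) if the device can abort long-running kernels.
    fprintf('KernelExecutionTimeout: %d\n', g.KernelExecutionTimeout);
else
    disp('No supported GPU device found.');
end
```

Running this immediately before the feval call gives a memory reading comparable to the one quoted from the 'Dedicated GPU Memory' display.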
8 comments
Joss Knight, 6 Oct 2018
Sorry, I can't interpret all that. Please just display the CUDAKernel object so I can see all its properties, show the line of code where you call feval, call size on each of the array input arguments and show me the results, and give me the values of all the scalar input arguments (mWidth, mHeight, colShift, SizeSparse, numReplica).
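Editor's note: the information requested here can be gathered with a short loop. This is only a sketch: the struct below holds hypothetical stand-in values (the real ones are not given in the thread), and only three of the arguments are included for brevity. With the real workspace, the same size and class calls work on gpuArray inputs, and disp(kernel1) lists the kernel's properties.

```matlab
% Hypothetical stand-ins for some of the kernel inputs (illustration only).
args = struct('b', rand(12, 1), 'mHeight', 8, 'mVal', rand(5, 1));

names = fieldnames(args);
for k = 1:numel(names)
    v = args.(names{k});
    if isscalar(v)
        % Scalar arguments: report the value itself.
        fprintf('%-8s scalar, value = %g\n', names{k}, v);
    else
        % Array arguments: report class and dimensions.
        fprintf('%-8s %s array, size = [%s]\n', ...
            names{k}, class(v), num2str(size(v)));
    end
end
```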
Omer Hamburger, 8 Oct 2018
Hi Joss, please find below the code that calls the kernel:
% Create the kernel object and set the thread block size
kernel1 = parallel.gpu.CUDAKernel('SpMV_Omer.ptx', 'SpMV_Omer.cu', 'Non_Transpose_matrix');
kernel1.ThreadBlockSize = [1024 1];

numReplica = N_p;           % Np = 4*Ny + 1
Nk = N_range;               % Nk = Nx + Np - 1
Ny = N_x;
b = gpuArray(rand(Ny*Nk, 1));
mHeight = size(A_mat, 1);   % Np*Nt
mWidth  = size(A_mat, 2);   % Ny*Nx
colShift = Ny;

% Nonzero entries of A_mat as (row, column, value) triplets, copied to the GPU
[mRows, mCols, mVal] = find(A_mat);
mRows = gpuArray(mRows);
mCols = gpuArray(mCols);
mVal  = gpuArray(mVal);
SizeSparse = size(mVal, 1);

% Output vector, pre-filled with zeros on the GPU
Result = zeros(N_p*N_p*N_t, 1, 'gpuArray');
Result = feval(kernel1, b, mHeight, mWidth, colShift, mVal, mRows, mCols, SizeSparse, numReplica, Result);
Thank you for your help,
Omer
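Editor's note: the posted snippet sets ThreadBlockSize but shows no GridSize assignment. A CUDAKernel object's GridSize defaults to [1 1 1], so unless it is set elsewhere the launch would cover a single block of 1024 threads. The sketch below shows the usual arithmetic for covering every output element. The output length is a hypothetical value chosen for illustration, and whether this relates to the crash cannot be determined from the thread, since the contents of SpMV_Omer.cu are not shown.

```matlab
% Hypothetical output length (illustration only); with the posted code
% this would be numel(Result).
numOutputs      = 55 * 55;
threadsPerBlock = 1024;    % first element of ThreadBlockSize in the post

% Round up so that every output element gets a thread.
numBlocks = ceil(numOutputs / threadsPerBlock);
launched  = numBlocks * threadsPerBlock;
surplus   = launched - numOutputs;   % threads with no element to compute

fprintf('%d blocks, %d threads launched, %d surplus\n', ...
    numBlocks, launched, surplus);

% With a real kernel object the corresponding assignment would be:
%   kernel1.GridSize = [numBlocks 1];
```

Because the block count is rounded up, some threads usually have no element to work on. The kernel source would need to compare each thread's index with the array length before reading or writing, otherwise those threads access memory outside the arrays.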

Answers (0)

Release

R2017b