GPU problem CUDA_ERROR_UNKNOWN

조회 수: 18 (최근 30일)
Peter
Peter 2017년 1월 4일
이동: Matt J 2023년 3월 30일
I'm running a matlab simulation code using an iterative matrix equation solver. This solver is called on the GPU every few time steps in a time stepping loop. This goes well for some dozens of time steps (although the computations gradually slow down...) until the screen goes black for a short instant of time and the simulation crashes with the following error message:
Error using gpuArray/subsasgn
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_UNKNOWN
After this, Matlab does not recognize the GPU device anymore: the command
gpuDevice
results in:
Error using gpuDevice (line 26)
An unexpected error occurred trying to retrieve CUDA device properties. The CUDA error was:
CUDA_ERROR_UNKNOWN
Restarting matlab is not sufficient to restore the GPU. Restarting the PC is.
I'm running matlab 2016b on windows 10, using an Nvidia TITAN X (Pascal) GPU with the newest driver installed.
Do the above symptoms inspire anyone for a diagnosis of this problem?
  댓글 수: 4
Xubin Lin
Xubin Lin 2020년 6월 13일
Dear Joss,
I also have the same problem.
An error occurred during PTX compilation of <image>.
The information log was:
The error log was:
The CUDA error code was: CUDA_ERROR_ILLEGAL_ADDRESS.
My output of gpuDevice is as follows(matlabR2019a and CUDA 10.2):
Name: 'GeForce GTX 1060'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 11
ToolkitVersion: 10
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 6.4425e+09
MultiprocessorCount: 10
ClockRateKHz: 1670500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Swati Jain
Swati Jain 2023년 3월 30일
이동: Matt J 2023년 3월 30일
I'm facing this error while working on Deep Network Designer.
Please help me in solving this error.

댓글을 달려면 로그인하십시오.

채택된 답변

Peter
Peter 2017년 1월 6일
Monitoring the GPU performance revealed that most probably the temperature is causing the issue: Slowing down of performance goes with rising of temperature and performance is capped by temperature.
Crash of the GPU occurred when GPU reached 95 degrees...
  댓글 수: 2
Vaclav Bocek
Vaclav Bocek 2018년 4월 19일
이동: Matt J 2023년 3월 30일
How did you solved it please?
Peter
Peter 2018년 4월 23일
이동: Matt J 2023년 3월 30일
I solved it by: 1) a smarter placement of the GPU in the pc casing, allowing for better air-flow 2) change the behavior of the cooling fan: generally it only reacts to CPU activity. can be set in BIOS I believe. just made it blow a little harder. This is all very machine specific so it will take some investigating on your part to try these options.

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Matt J
Matt J 2017년 1월 4일
I've had symptoms like that before. Re-installing/updating the GPU driver fixed it for me, but it was never clear to me what the root cause was.
  댓글 수: 1
Peter
Peter 2017년 1월 5일
thanks Matt, I did install the latest drivers (several times now) hoping for it to solve the issue but unfortunately without success.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 GPU Computing에 대해 자세히 알아보기

제품

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by