Neural networks - CUDAKernel/setConstantMemory - the data supplied is too big for constant 'hintsD'

On R2015a with Parallel Computing Toolbox and Neural Network Toolbox.
Using the following code with GPU Nvidia GeForce GTX980 Ti:
net1 = feedforwardnet(20);
net1.trainFcn = 'trainscg';
x = inputs(1:4284,2:2000)'; % if I reduce this to 2:1900, it will work
t = double(targets'); % casting to double for GPU
t = t(:,1:4284);
% preparing for GPU
xg = nndata2gpu(x);
tg = nndata2gpu(t);
net1.input.processFcns = {'mapminmax'};
net1.output.processFcns = {'mapminmax'};
net2 = configure(net1,x,t); % Configure with MATLAB arrays
net2 = train(net2,xg,tg);
As you can see, this is not a big dataset. When I run this, it generates this error:
Error using parallel.gpu.CUDAKernel/setConstantMemory
The data supplied is too big for constant 'hintsD'.
Error in nnGPU.codeHints (line 33)
setConstantMemory(hints.yKernel,'hintsD',hints.double);
Error in nncalc.setup2 (line 13)
calcHints = calcMode.codeHints(calcHints);
Error in nncalc.setup (line 17)
[calcLib,calcNet] = nncalc.setup2(calcMode,calcNet,calcData,calcHints);
Error in network/train (line 357)
[calcLib,calcNet,net,resourceText] = nncalc.setup(calcMode,net,data);
The output of gpuDevice shows:
Name: 'GeForce GTX 980 Ti'
Index: 1
ComputeCapability: '5.2'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 6.5000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 6.4425e+09
AvailableMemory: 5.1520e+09
MultiprocessorCount: 22
ClockRateKHz: 1139500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
As noted in the code above, if I reduce x marginally, it will run.
I don't understand why data of this size would generate a memory error.
Am I missing a step in preparing this for GPU?

Accepted Answer

Amanjit Dulai, 3 January 2017
I was able to reproduce your issue. The best solution is to do the GPU training a different way, using the 'useGPU' flag. This path does not place the data in constant memory, and so sidesteps the issue. Your example code would look like this:
net1 = feedforwardnet(20);
net1.trainFcn = 'trainscg';
x = inputs(1:4284,2:2000)';
t = double(targets'); % casting to double for GPU
t = t(:,1:4284);
net1.input.processFcns = {'mapminmax'};
net1.output.processFcns = {'mapminmax'};
net1 = train(net1,x,t,'useGPU','yes');

More Answers (2)

Joss Knight, 19 December 2016
Constant memory is a special fast read-only cache with 64 KB of space. That's enough to store about 8000 elements of double-precision data. Perhaps you want to use shared memory, which gives you 16 or 48 KB per block depending on your device configuration.
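The ~8000-element figure follows from simple arithmetic; a minimal sketch (variable names here are illustrative, not from the toolbox):

```matlab
constMemBytes = 64 * 1024;                    % CUDA constant memory: 64 KB per device
bytesPerDouble = 8;                           % size of one double-precision value
maxDoubles = constMemBytes / bytesPerDouble;  % 8192 elements
fprintf('Constant memory holds at most %d doubles\n', maxDoubles);
```

Presumably the 'hintsD' array grows with the number of input columns, so 1999 columns pushes it past this 8192-element budget while 1899 still fits, which would match the behaviour you observed.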
2 comments
Jacob Townsend, 19 December 2016
Thanks a lot for the suggestion, Joss. It seems the default method is trying to allocate this to constant memory (the error chain runs from nncalc all the way through to nnGPU.codeHints). Are you aware of a way to change these options so it defaults to using shared memory instead?
Joss Knight, 20 December 2016
Sorry, I don't understand this code well enough to know what hints.double is or why it needs to be so large. I'll see if I can get someone who knows to help you.



Jacob Townsend, 19 February 2017
Thanks for that - solved that step!
