필터 지우기
필터 지우기

CUDA_ERROR​_ILLEGAL_A​DDRESS when runnin Faster R-CNN on Matlab 2018b

조회 수: 2 (최근 30일)
Ghada Khaled
Ghada Khaled 2018년 11월 10일
댓글: Ghada Khaled 2018년 11월 11일
I'm running faster R-CNN in `matlab 2018b` on a Windows 10. I face an exception `CUDA_ERROR_ILLEGAL_ADDRESS` when I increase the number of my training items or when I increase the `MaxEpoch`.
Below are the information of my `gpuDevice`
CUDADevice with properties:
Name: 'GeForce GTX 1050'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 9.2000
ToolkitVersion: 9.1000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 4.2950e+09
AvailableMemory: 3.4635e+09
MultiprocessorCount: 5
ClockRateKHz: 1493000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
And this is my code
latest_index =0;
for i=1:6
load (strcat('newDataset', int2str(i), '.mat'));
len =length(vehicleDataset.imageFilename);
for j=1:len
filename = vehicleDataset.imageFilename{j};
latest_index=latest_index+1;
fulldata.imageFilename{latest_index} = filename;
fulldata.vehicle{latest_index} = vehicleDataset.vehicle{j};
end
end
trainingDataTable = table(fulldata.imageFilename', fulldata.vehicle');
trainingDataTable.Properties.VariableNames = {'imageFilename','vehicle'};
data.trainingDataTable = trainingDataTable;
trainingDataTable(1:4,:)
% Split data into a training and test set.
idx = floor(0.6 * height(trainingDataTable));
trainingData = trainingDataTable(1:idx,:);
testData = trainingDataTable(idx:end,:);
% Create image input layer.
inputLayer = imageInputLayer([32 32 3]);
% Define the convolutional layer parameters.
filterSize = [3 3];
numFilters = 64;
% Create the middle layers.
middleLayers = [
convolution2dLayer(filterSize, numFilters, 'Padding', 1)
reluLayer()
convolution2dLayer(filterSize, numFilters, 'Padding', 1)
reluLayer()
maxPooling2dLayer(3, 'Stride',2)
];
finalLayers = [
fullyConnectedLayer(128)
% Add a ReLU non-linearity.
reluLayer()
fullyConnectedLayer(width(trainingDataTable))
% Add the softmax loss layer and classification layer.
softmaxLayer()
classificationLayer()
];
layers = [
inputLayer
middleLayers
finalLayers
];
% Options for step 1.
optionsStage1 = trainingOptions('sgdm', ...
'MaxEpochs', 2, ...
'MiniBatchSize', 1, ...
'InitialLearnRate', 1e-3, ...
'CheckpointPath', tempdir);
% Options for step 2.
optionsStage2 = trainingOptions('sgdm', ...
'MaxEpochs', 2, ...
'MiniBatchSize', 1, ...
'InitialLearnRate', 1e-3, ...
'CheckpointPath', tempdir);
% Options for step 3.
optionsStage3 = trainingOptions('sgdm', ...
'MaxEpochs', 2, ...
'MiniBatchSize', 1, ...
'InitialLearnRate', 1e-3, ...
'CheckpointPath', tempdir);
% Options for step 4.
optionsStage4 = trainingOptions('sgdm', ...
'MaxEpochs', 2, ...
'MiniBatchSize', 1, ...
'InitialLearnRate', 1e-3, ...
'CheckpointPath', tempdir);
options = [
optionsStage1
optionsStage2
optionsStage3
optionsStage4
];
doTrainingAndEval = true;
if doTrainingAndEval
% Set random seed to ensure example training reproducibility.
rng(0);
% Train Faster R-CNN detector. Select a BoxPyramidScale of 1.2 to allow
% for finer resolution for multiscale object detection.
detector = trainFasterRCNNObjectDetector(trainingData, layers, options, ...
'NegativeOverlapRange', [0 0.3], ...
'PositiveOverlapRange', [0.6 1], ...
'BoxPyramidScale', 1.2);
data.detector= detector;
else
% Load pretrained detector for the example.
detector = data.detector;
end
save mix_data data
if doTrainingAndEval
% Run detector on each image in the test set and collect results.
resultsStruct = struct([]);
for i = 1:height(testData)
% Read the image.
I = imread(testData.imageFilename{i});
% Run the detector.
[bboxes, scores, labels] = detect(detector, I);
% Collect the results.
resultsStruct(i).Boxes = bboxes;
resultsStruct(i).Scores = scores;
resultsStruct(i).Labels = labels;
end
% Convert the results into a table.
results = struct2table(resultsStruct);
data.results = results;
save mix_data data
else
% Load results from disk.
results = data.results;
end
% Extract expected bounding box locations from test data.
expectedResults = testData(:, 2:end);
% Evaluate the object detector using Average Precision metric.
[ap, recall, precision] = evaluateDetectionPrecision(results, expectedResults);
% Plot precision/recall curve
figure
plot(recall,precision)
xlabel('Recall')
ylabel('Precision')
grid on
title(sprintf('Average Precision = %.2f', ap))
First it prints the warning multiple time and throws the below exception
> Warning: An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_ILLEGAL_ADDRESS
> In trainFasterRCNNObjectDetector (line 320)
In rcnn_trail (line 184)
> Error using -
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_ILLEGAL_ADDRESS
> Error in vision.internal.cnn.layer.SmoothL1Loss/backwardLoss (line 156)
idx = (X > -one) & (X < one);
> Error in nnet.internal.cnn.DAGNetwork/computeGradientsForTraining/efficientBackProp (line 585)
dLossdX = thisLayer.backwardLoss( ...
> Error in nnet.internal.cnn.DAGNetwork>@()efficientBackProp(i) (line 661)
@() efficientBackProp(i), ...
> Error in nnet.internal.cnn.util.executeWithStagedGPUOOMRecovery (line 11)
[ varargout{1:nOutputs} ] = computeFun();
> Error in nnet.internal.cnn.DAGNetwork>iExecuteWithStagedGPUOOMRecovery (line 1195)
[varargout{1:nargout}] = nnet.internal.cnn.util.executeWithStagedGPUOOMRecovery(varargin{:});
> Error in nnet.internal.cnn.DAGNetwork/computeGradientsForTraining (line 660)
theseGradients = iExecuteWithStagedGPUOOMRecovery( ...
> Error in nnet.internal.cnn.Trainer/computeGradients (line 184)
[gradients, predictions, states] = net.computeGradientsForTraining(X, Y,
needsStatefulTraining, propagateState);
> Error in nnet.internal.cnn.Trainer/train (line 85)
[gradients, predictions, states] = this.computeGradients(net, X, response,
needsStatefulTraining, propagateState);
> Error in vision.internal.cnn.trainNetwork (line 47)
trainedNet = trainer.train(trainedNet, trainingDispatcher);
> Error in fastRCNNObjectDetector.train (line 190)
[network, info] = vision.internal.cnn.trainNetwork(ds, lgraph, opts, mapping,
checkpointSaver);
> Error in trainFasterRCNNObjectDetector (line 410)
[stage2Detector, fastRCNN, ~, info(2)] = fastRCNNObjectDetector.train(trainingData, fastRCNN,
options(2), iStageTwoParams(params), checkpointSaver);
> Error in rcnn_trail (line 184)
detector = trainFasterRCNNObjectDetector(trainingData, layers, options, ...
  댓글 수: 4
Walter Roberson
Walter Roberson 2018년 11월 10일
That should be okay. You could experiment with CUDA 10 driver instead, as long as you are not using cudamex() or GPU Coder.
Ghada Khaled
Ghada Khaled 2018년 11월 11일
This is what I expected but still, I’m getting this error. Could the memory size be an issue? Since I only have 8 Gig of memory

댓글을 달려면 로그인하십시오.

답변 (0개)

카테고리

Help CenterFile Exchange에서 GPU Computing에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by