GPU utilization is not 100%.
Views: 37 (last 30 days)
GPU usage is only about 40% while running my deep learning network.
It sometimes goes up to 80% for a while, but usually stays at 40%.
I want to know why.
1 Comment
Walter Roberson
22 May 2019
GPU can only run at full speed if the entire problem fits into memory. That is seldom the case for deep learning: those networks are updated incrementally, so transferring images in from disc and memory uses a fair bit of time.
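As a rough illustration of Walter's point, you can compare the cost of one disk read against a GPU operation (a sketch; the folder path is a placeholder, not from the original question):

```matlab
% Compare disk-read time with GPU compute time (illustrative sketch;
% "C:\data\train" is a placeholder for your own image folder).
imds = imageDatastore("C:\data\train");
tic; img = read(imds); tRead = toc;            % one image read from disk
g = gpuArray(single(img));
k = gpuArray(ones(3, 'single'));
tic; c = conv2(g(:,:,1), k); wait(gpuDevice); tGpu = toc;
fprintf("disk read: %.3f s, GPU conv: %.3f s\n", tRead, tGpu);
```

If the read time dominates, the GPU is simply idling while it waits for data.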
Answers (4)
Joss Knight
31 May 2019
Your question is very hard to answer in its current form. You want to know why GPU utilisation is not 100%? The answer is, because the GPU isn't running kernels 100% of the time. Why? I don't know, because you haven't provided any information about what you're doing. Maybe, as Walter says, a lot of time is being spent doing file I/O, perhaps because you have a very slow disk or slow network file access. Maybe you have a transformed datastore, or an imageDatastore with a custom ReadFcn, and the data processing is very complex and takes place on the CPU, blocking GPU execution while it is carried out. Maybe you have a very small network, or a low resolution network, or you don't have a high enough mini-batch size, and so you are not successfully occupying all the cores on the GPU. Maybe your network is so small that the amount of time spent running the MATLAB interpreter in order to generate the GPU kernels to do the computation outweighs the amount of time it takes to run those kernels.
If you want to know more, run the MATLAB profiler and find out where time is being spent during training.
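A minimal profiling sketch (the training call and its arguments are placeholders for your own setup):

```matlab
% Wrap the training call in the MATLAB profiler to see where time goes.
profile on
net = trainNetwork(trainingData, layers, options);  % your own training call
profile viewer   % compare time in file I/O and ReadFcn/transform functions
                 % against time in the actual GPU computation
```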
2 Comments
Ali Al-Saegh
5 Dec 2020
Dear Joss,
I kindly invite you to help me by giving some advice on my question at
https://www.mathworks.com/matlabcentral/answers/680293-gpu-vs-cpu-in-training-time
yan gao
25 Sep 2021
Dear Joss,
I kindly invite you to help me by giving some advice on my question at
Abolfazl Nejatian
15 Dec 2019
Dear Joss,
Thank you for the information you provided.
The strange thing is that when I tested my code on Linux, building my network with Python, GPU utilization went up to around 100 percent, but on Windows with MATLAB it stays around 45 percent.
1 Comment
Joss Knight
15 Dec 2019
It's not strange. Windows is a different operating system, with a different file system and a completely different (and considerably slower at allocating memory) GPU driver. Do you have a different card in your Windows machine too? Any of these could be a problem.
Plus, if you start with a model defined in a Python framework and optimized for that, and then adapt it, we've no idea how good a job you did. If you took a MATLAB example and then converted it to Python you might have the same problem with Python. Maybe you're not successfully prefetching your data from the file system. Maybe you're not using MEX acceleration when you should be. Maybe your GPU could be put in TCC mode. That's why it's so difficult to answer your question when you're not telling us what you're doing.
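Two of the points above can be sketched in code (parameter values are illustrative, not a tuned recommendation):

```matlab
% Prefetch mini-batches on background workers so the GPU is not starved
% waiting for data (DispatchInBackground needs Parallel Computing Toolbox).
options = trainingOptions("sgdm", ...
    MiniBatchSize=128, ...           % larger batches occupy more GPU cores
    DispatchInBackground=true, ...   % fetch/preprocess data in the background
    ExecutionEnvironment="gpu");
```

Switching a Tesla card into TCC mode happens outside MATLAB, e.g. `nvidia-smi -g 0 -dm 1` from an elevated Windows command prompt (the device index 0 is an assumption; check `nvidia-smi -L` for yours).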
Abolfazl Nejatian
16 Dec 2019
Well, I know these are different operating systems, but the confusing point is that with the same hardware (both use a Tesla V100; I actually installed both operating systems on one machine), why can't they use the GPU at a similar percentage?
Yes, I did use MEX code in MATLAB.
Then I trained a ResNet with Python (all of the initial values were the same: input size, network layers, etc.).
There is no code conversion between MATLAB and Python: I used MATLAB functions and a pretrained network for this work, and for Python I used Keras and PyCharm.
But in a Windows environment with MATLAB my GPU utilization is around 45%, while in Python on Linux it was around 90%!
So the question is: do you recommend reinstalling MATLAB on Linux so I can get more out of my hardware?
3 Comments
Joss Knight
16 Dec 2019
Hi Abolfazl. I can't really recommend anything until I've seen your code. It may be as simple as changing the way you access your data; it may be that you should move to Linux; or it may be that there's nothing you can do. Maybe your Python code is grotesquely inefficient with GPU resources or spins up a lot of worthless kernels during spare cycles! It's just impossible to say. Give us your code, and run the MATLAB profiler and show us the profile report.
Markus Walser
26 Sep 2024
Edited: Markus Walser
26 Sep 2024
Hi
I'm having the same problem with low GPU usage on a Windows Server 2019 machine with the current MATLAB R2024b while training a YOLOX network. The load on the GPU looks like this: (screenshot of GPU utilization omitted)
The top profile entries of the training call trainYOLOXObjectDetector are: (profiler screenshot omitted)
And the code is like this:
% Load images and box labels
oldDataPath = "L:\DataStore";
basePath = "C:\DataStore";
paths = {
    fullfile(basePath, '10_ImageFolder', 'gTruthA.mat');
    fullfile(basePath, '10_ImageFolder', 'gtruthB.mat');
    fullfile(basePath, '20_ImageFolder', 'gtruthC.mat');
    };
rng(0);
for idx = 1:numel(paths)
    load(paths{idx}, 'gTruth');
    alternativePaths = {[oldDataPath basePath]};
    changeFilePaths(gTruth, alternativePaths);
    [gTruthTrain, gTruthVal] = partitionGroundTruth(gTruth, 0.8);
    if ~exist('gTruthTemp', 'var')
        gTruthTemp = gTruth;
        gTruthTrainTemp = gTruthTrain;
        gTruthValTemp = gTruthVal;
    else
        gTruthTemp = merge(gTruthTemp, gTruth);
        gTruthTrainTemp = merge(gTruthTrainTemp, gTruthTrain);
        if ~isempty(gTruthVal)
            gTruthValTemp = merge(gTruthValTemp, gTruthVal);
        end
    end
end
gTruth = gTruthTemp;
gTruthTrain = gTruthTrainTemp;
gTruthVal = gTruthValTemp;
clear gTruthTemp gTruthTrainTemp gTruthValTemp;

% Generate and combine datastores
classNames = gTruth.LabelDefinitions.Name;
imbxtrainds = combine(imageDatastore(gTruthTrain.DataSource.Source), boxLabelDatastore(gTruthTrain.LabelData));
if ~isempty(gTruthVal)
    imbxvalds = combine(imageDatastore(gTruthVal.DataSource.Source), boxLabelDatastore(gTruthVal.LabelData));
else
    imbxvalds = [];
end

% Image processing
imgSize = [96 576 3];
imbxtrainds = imbxtrainds.transform(@(x) imbxdsPreprocess(x, imgSize));
imbxtrainaugds = imbxtrainds.transform(@imbxdsAugmenter);
if ~isempty(imbxvalds)
    imbxvalds = imbxvalds.transform(@(x) imbxdsPreprocess(x, imgSize));
    imbxvalaugds = imbxvalds.transform(@imbxdsAugmenter);
end

% Create new YOLOX detector
net = yoloxObjectDetector('nano-coco', classNames, 'InputSize', imgSize);

% Train (transfer learning) -- note: train on the augmented datastore
imbxtrainaugds.reset();
options = trainingOptions("sgdm", ...
    InitialLearnRate=1e-3, ...
    MiniBatchSize=64, ...
    MaxEpochs=500, ...
    BatchNormalizationStatistics="moving", ...
    ResetInputNormalization=false, ...
    VerboseFrequency=10, ...
    Plots="training-progress", ...
    Shuffle="every-epoch", ...
    ValidationData=imbxvalds, ...
    ExecutionEnvironment="auto", ...
    PreprocessingEnvironment="parallel");
[net, info] = trainYOLOXObjectDetector(imbxtrainaugds, net, options);
Do you have any idea how to increase GPU usage and speed up the training process?
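One way to narrow down where the time goes before re-running the full training (a sketch using the datastore names from the code above):

```matlab
% Confirm which device MATLAB selected, then time one preprocessed read.
gpuDevice                        % check the expected GPU is in use
tic
sample = read(imbxtrainaugds);   % one preprocessed + augmented observation
tRead = toc;
fprintf("one datastore read: %.3f s\n", tRead);
% If this is large relative to one training iteration, the CPU-side
% transform/augmentation functions (not the GPU) are the bottleneck.
```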
Lamya Mohammad
29 Feb 2020
Did you solve the problem? My utilization is 29% and I would like to increase it.
0 Comments