Deep Learning: higher training loss using GPU. Why?

2 views (last 30 days)
EK_47 on 25 September 2022
Answered: Piyush Dubey on 6 September 2023
Hi,
I am training a ResNet-50 network for object detection using about 3,000 images. I have tried it in two ways, on the CPU and on the GPU.
1 - CPU: Intel Xeon Processor E5-2687W v3 (10 cores); training took 70 hours; training and validation losses at epoch 40 were 0.0532 and 0.0004.
2 - GPU: NVIDIA GeForce RTX 3070 Ti 8 GB; training took 6 hours; training and validation losses at epoch 40 were 0.0764 and 0.0013.
As you can see, training on the GPU takes much less time, but the training loss is higher. The model trained on the GPU also performs worse when predicting unseen data.
Why is this? How can I get the same accuracy on the GPU?
Thanks
2 Comments
Walter Roberson on 25 September 2022
On the GPU, is it being trained in single precision or in double precision?
EK_47 on 25 September 2022
I do not know, but I think it is in single precision. I read somewhere that it is not possible to change the default, which is single precision, for deep learning in MATLAB?


Answers (1)

Piyush Dubey on 6 September 2023
Hi @EK_47,
I understand that you are training a ResNet-50 model for object detection on both the CPU and the GPU, and that training on the GPU is faster but produces higher training and validation losses than training on the CPU.
The training and validation sets are typically drawn at random, so the losses will vary slightly from run to run; the differences you see between CPU and GPU training may therefore not be significant. A better comparison is to average the training and validation losses over several training sessions with different random splits; those averages for the CPU and the GPU should come out roughly equal.
To make the comparison fairer, you can fix the random seed so that the training and validation sets stay identical across training sessions. With the randomness pinned down, the results are more consistent and the losses from CPU and GPU training can be compared directly.
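A minimal sketch of this, assuming the images are held in an imageDatastore named imds and split with splitEachLabel (placeholder names; the actual datastore setup for object-detection ground truth will differ):
% Fix the global random seed so the training/validation split (and other
% randomized steps such as shuffling) is identical on every run, whether
% training happens on the CPU or the GPU.
rng(0);   % any fixed seed works
% Hypothetical datastore holding the ~3,000 labelled images.
imds = imageDatastore("trainingImages", ...
    "IncludeSubfolders", true, ...
    "LabelSource", "foldernames");
% The same 90/10 split is reproduced on every run because the seed is fixed.
[imdsTrain, imdsVal] = splitEachLabel(imds, 0.9, "randomized");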
Additionally, you can use cross-validation to compare the networks trained on the CPU and the GPU. Cross-validation splits the dataset into several subsets (folds) and repeats training and validation on different combinations of them, which gives a more reliable comparison of CPU and GPU training than a single split.
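A rough k-fold sketch, assuming your existing detector-training code can be called for a given split (cvpartition is part of the Statistics and Machine Learning Toolbox; the variable names below are placeholders):
numImages = 3000;                   % assumed dataset size
k = 5;                              % number of folds
rng(0);                             % same folds for the CPU and the GPU runs
cv = cvpartition(numImages, "KFold", k);
valLoss = zeros(k, 1);
for i = 1:k
    trainIdx = training(cv, i);     % logical index of training images
    valIdx   = test(cv, i);         % logical index of validation images
    % ... build training/validation datastores from trainIdx/valIdx and run
    % the existing ResNet-50 training here, once with
    % 'ExecutionEnvironment','cpu' and once with 'gpu' ...
    % valLoss(i) = <validation loss reported for this fold>;
end
meanValLoss = mean(valLoss);        % compare this average for CPU vs GPU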
For more information on cross-validation, refer to the MathWorks documentation on cvpartition.
By fixing the random seed and using cross-validation, you can obtain more reliable and comparable results when training the ResNet-50 model on both the CPU and the GPU.
Hope this helps.
