Performance of GPU & CPU during deep learning
Hello,
I need to know the role of both the GPU and the CPU during deep learning, i.e., what does each of them do during training?
Also, is it possible to measure the overhead time required for transferring data between memory and GPU?
Any help is appreciated!
Accepted Answer
David Willingham
22 October 2020
Edited: David Willingham, 22 October 2020
For training, you can use either the CPU or the GPU. For certain deep learning problems that take a long time to train on a CPU, such as image classification, training on a GPU is orders of magnitude faster. In general, if you have access to a reasonably powerful GPU, use it for training. The CPU can help when you need to scale up and run parallel jobs to utilize multiple cores on a machine that doesn't have access to multiple GPUs, for example, running an experiment that tunes hyperparameters by training multiple networks at once in parallel using the Experiment Manager.
For inference (calling a trained model), the CPU is generally sufficient. For example, for image classification you can get roughly 30 predictions per second on a CPU. However, if you require very high frame rates or need to run a batch job, using a GPU will help you achieve faster results.
On the overhead of transferring data, here is a post that discusses it. It shows a few ways you can use timing functions to measure execution time on a GPU. However, I don't see data transfer time being a deciding factor between CPU and GPU.
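As a rough illustration, you can separate the host-to-GPU transfer cost from the on-GPU compute cost with gputimeit (Parallel Computing Toolbox). The matrix size here is just a placeholder; use data shaped like your own.

```matlab
% Placeholder data; substitute an array representative of your workload.
X = rand(4096, 'single');

% Median time for the host-to-GPU transfer alone.
transferTime = gputimeit(@() gpuArray(X));

% Median time for a computation that stays entirely on the GPU.
G = gpuArray(X);
computeTime = gputimeit(@() G * G);

fprintf('Transfer: %.4f s, Compute: %.4f s\n', transferTime, computeTime);
```

gputimeit synchronizes with the GPU internally, so you don't need to call wait(gpuDevice) yourself as you would with tic/toc.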
My recommendation:
Training a single model
If you have access to a GPU, either locally or through a cloud platform, run a test to check whether your training converges more quickly. The good news is that this is easy to do. Simply:
- Set the 'ExecutionEnvironment' option to 'cpu' in your trainingOptions.
- Start training your model and monitor the training progress plot. Make note of how long each epoch takes to train.
- Stop the training.
- Set the 'ExecutionEnvironment' option to 'gpu' in your trainingOptions.
- Start training your model and monitor the training progress plot. Make note of how long each epoch takes to train.
- Stop the training.
In most cases you'll observe that the GPU is faster.
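The steps above can be sketched as follows. The variables XTrain, YTrain, and layers are placeholders for your own data and network; only the trainingOptions arguments shown are the point of the comparison.

```matlab
% CPU run: watch the time per epoch on the training progress plot.
optsCPU = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'cpu', ...
    'Plots', 'training-progress');
netCPU = trainNetwork(XTrain, YTrain, layers, optsCPU);

% GPU run: same network and data, only the execution environment changes.
optsGPU = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'gpu', ...
    'Plots', 'training-progress');
netGPU = trainNetwork(XTrain, YTrain, layers, optsGPU);
```

You can stop each run after a few epochs; a per-epoch comparison is usually enough to see which environment is faster for your network.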
Hopefully this helps answer your question.
Regards,
5 Comments
Ali Al-Saegh
22 October 2020
Edited: Ali Al-Saegh, 22 October 2020
Thanks David Willingham, I really appreciate your help.
I am still wondering about the job of the CPU during the training of a deep network using a GPU.
Hi Ali,
Can I ask what you're looking to find out? The GPU will be doing the training; all other operations, e.g. updating the training progress plots, will be handled by the CPU.
David
Hello David,
Thank you very much for your effort and interest.
Actually, I have been asked, as part of my research work, to train a deep CNN on a heterogeneous system. Hence, I wanted to train my network on the GPU + CPU. I do not know if my idea is reasonable and applicable. I hope you can help further.
Regards,
Ali
Hi Ali,
It doesn't seem unreasonable at all. My suggestion is to look at which workflows are best served by each, i.e. the CPU and the GPU. CPUs are better suited for the parallel workflows that surround the training, GPUs for training an individual network. MATLAB has out-of-the-box support for both of these when using common networks.
However, if you really want to tinker and combine them both, I suggest you build your own custom training loop. Here's an example of parallel computing with custom training loops that you could start from.
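For orientation, a custom training loop in Deep Learning Toolbox typically looks like the skeleton below. The layer array, batch variables (XBatch, TBatch), and loop bounds are placeholders, not from the original post; the structural pieces (dlnetwork, dlfeval, dlgradient, sgdmupdate) are the standard API.

```matlab
% Minimal skeleton of a custom training loop (placeholder network/data).
layers = [featureInputLayer(10)
          fullyConnectedLayer(2)
          softmaxLayer];
net = dlnetwork(layers);

vel = [];  % SGDM solver state
for epoch = 1:numEpochs
    for i = 1:numIterations
        % XBatch/TBatch are your mini-batch; 'CB' = channel x batch.
        X = gpuArray(dlarray(XBatch, 'CB'));  % move the batch to the GPU
        [loss, grad] = dlfeval(@modelLoss, net, X, TBatch);
        [net, vel] = sgdmupdate(net, grad, vel);
    end
end

function [loss, gradients] = modelLoss(net, X, T)
    Y = forward(net, X);
    loss = crossentropy(Y, T);
    gradients = dlgradient(loss, net.Learnables);
end
```

Once you own the loop like this, you decide explicitly what runs on the GPU (the forward/backward pass) and what stays on the CPU (data preparation, logging, orchestration), which is the natural place to experiment with a heterogeneous setup.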
David
@David Willingham, just reviving this post for a quick question. I know that in such scenarios a new post is usually preferred, but this is such a quick and direct follow-up that it felt appropriate to ask here.
Can a classifier trained on a CPU be used on a GPU? For instance, suppose a classifier is trained on a CPU and I later load it in different code with
load classifier;
for later use with the predict function as follows:
class = predict(classifier,imageFeatures,'ObservationsIn','columns');
Would it be valid, if after loading the classifier, I wrote:
classifier = gpuArray(classifier);
And similarly, for the image (img) whose features I am extracting into the imageFeatures variable, wrote:
img = gpuArray(img);
Would this be a valid implementation, or do I need to go back and retrain the classifier on a GPU? Further, can the variable class, above, be used directly in the rest of my code if it runs on a CPU, or do I need to do:
class = gather(class);
I realize these might be very basic questions for GPU computing, but I don't fully understand how it works yet.
More Answers (0)