Volatile GPU-Util is 0% during Neural network training
Views: 9 (last 30 days)
Hello.
I would like to train my neural network with 4 GPUs (on a remote server). To utilize the GPUs, I set ExecutionEnvironment to 'multi-gpu' in the training options. However, Volatile GPU-Util remains at 0% during training, even though the data appears to have been loaded into GPU memory.
I would appreciate your help.
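For context, a minimal sketch of the setup being described, assuming an image-classification workflow; imdsTrain and layers are hypothetical placeholders, not the asker's actual code:

% Sketch only: imdsTrain and layers stand in for the real data and network.
options = trainingOptions("sgdm", ...
    ExecutionEnvironment="multi-gpu", ... % train on all available local GPUs
    MiniBatchSize=128, ...
    MaxEpochs=10);
net = trainNetwork(imdsTrain, layers, options);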
8 Comments
Joss Knight
12 Sep 2023
Edited: Joss Knight on 12 Sep 2023
Right, so the parfor is opening a pool with a lot of workers (presumably you have a large number of CPU cores), but unfortunately these are then not used for your preprocessing during training. You need to enable DispatchInBackground as well. Try that. You should have received a warning on the first run telling you that most of your workers were not going to be used for training.
It does look as though the general problem is that your data preprocessing is dominating the training time, meaning only a small proportion of each second is spent computing gradients, and that is what the utilization figure measures. If DispatchInBackground doesn't help, we can explore further how to vectorize your transform functions; you might also consider using augmentedImageDatastore, which provides most of what you need. Or you could preprocess the data on the GPU.
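A hedged sketch of both suggestions, assuming an image-classification setup; imdsTrain, layers, and the 224-by-224 input size are placeholder assumptions:

% Enable background dispatch so pool workers preprocess batches
% while the GPUs compute gradients.
options = trainingOptions("sgdm", ...
    ExecutionEnvironment="multi-gpu", ...
    DispatchInBackground=true, ...
    MiniBatchSize=128);

% Alternative: let augmentedImageDatastore handle resizing and augmentation.
aug = imageDataAugmenter(RandXReflection=true);
augimds = augmentedImageDatastore([224 224], imdsTrain, DataAugmentation=aug);
net = trainNetwork(augimds, layers, options);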
Accepted Answer
aditi bagora
25 Sep 2023
The error message indicates that there is an issue with distributing the data in parallel in the background. To fix it, the class "CustomImageDatastore" also needs to inherit from the mixin class "matlab.io.datastore.Subsettable", which adds the subsetting support required for parallel and multi-GPU environments.
For further details, refer to the documentation on developing custom datastores; a rough sketch of the change follows below.
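A minimal sketch of such a class, assuming the datastore simply reads images from a list of files; the properties, the read logic, and the exact set of abstract methods (subsetByReadIndices and maxpartitions here) are assumptions to verify against the Subsettable reference page for your release:

% Hypothetical stand-in for the asker's CustomImageDatastore.
classdef CustomImageDatastore < matlab.io.Datastore & ...
        matlab.io.datastore.Subsettable
    properties
        Files              % cell array of image file paths (assumption)
        CurrentIdx = 1     % index of the next file to read
    end
    methods
        function ds = CustomImageDatastore(files)
            ds.Files = files;
        end
        function tf = hasdata(ds)
            tf = ds.CurrentIdx <= numel(ds.Files);
        end
        function [data, info] = read(ds)
            info.Filename = ds.Files{ds.CurrentIdx};
            data = imread(info.Filename);
            ds.CurrentIdx = ds.CurrentIdx + 1;
        end
        function reset(ds)
            ds.CurrentIdx = 1;
        end
        function frac = progress(ds)
            frac = (ds.CurrentIdx - 1) / numel(ds.Files);
        end
    end
    methods (Access = protected)
        % Required for subsettability: return a datastore restricted to
        % the given read indices. This is what lets MATLAB split the data
        % across background workers and multiple GPUs.
        function subds = subsetByReadIndices(ds, indices)
            subds = CustomImageDatastore(ds.Files(indices));
        end
        function n = maxpartitions(ds)
            n = numel(ds.Files);
        end
    end
end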
Hope this helps you resolve the error.
More Answers (0)