A problem when using "multi-gpu" as "ExecutionEnvironment" for training a CNN

Question

Hyung Joon on 7 Aug 2022

https://kr.mathworks.com/matlabcentral/answers/1775390-a-problem-when-using-multi-gpu-as-executionenvironment-for-training-a-cnn

Commented: Joss Knight on 8 Aug 2022

Hello,

I am seeing strange behavior when I set "ExecutionEnvironment" to "multi-gpu" in the training options for training a CNN.

I am using the simple CNN example from the MathWorks website: https://www.mathworks.com/help/deeplearning/ug/create-simple-deep-learning-network-for-classification.html

The only changes I made to the training options were setting "ExecutionEnvironment" to "multi-gpu" and changing "MaxEpochs" to 50.
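For readers following along, the two changes described above might look like the sketch below. It assumes the digit dataset, the 750-images-per-class split, and the variable names (`imdsTrain`, `imdsValidation`) from the linked MathWorks example; the exact code used in the original post was not shown, so treat this as an approximation.

```matlab
% Sketch only: data setup and option names follow the linked MathWorks
% example; the poster's actual script is not shown in the thread.
digitDatasetPath = fullfile(matlabroot,'toolbox','nnet','nndemos', ...
    'nndatasets','DigitDataset');
imds = imageDatastore(digitDatasetPath, ...
    'IncludeSubfolders',true,'LabelSource','foldernames');
[imdsTrain,imdsValidation] = splitEachLabel(imds,750,'randomize');

options = trainingOptions('sgdm', ...
    'InitialLearnRate',0.01, ...
    'MaxEpochs',50, ...                      % changed from the example's value
    'Shuffle','every-epoch', ...
    'ValidationData',imdsValidation, ...
    'ValidationFrequency',30, ...
    'Verbose',false, ...
    'Plots','training-progress', ...
    'ExecutionEnvironment','multi-gpu');     % added: train on all local GPUs
```

These options would then be passed to `trainNetwork(imdsTrain,layers,options)`, with `layers` being the layer array from the same example. The "multi-gpu" setting requires Parallel Computing Toolbox.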

I am using two NVIDIA RTX 2080 Ti 11GB cards. Both are the same model from the same manufacturer (Galax).

In MATLAB R2020a, the CNN example works fine and trains faster when I set "ExecutionEnvironment" to "multi-gpu". The training progress plot is attached below.

But in MATLAB R2022a, training behaves very strangely when I run the same code. The validation curve rises, while the training curve fluctuates around the middle of the plot and never goes up. A screenshot showing this is attached below.

The validation accuracy printed in the Command Window after training is also poor (63.92%), even though the validation accuracy shown in the training progress plot is good.

This issue also occurs in other MATLAB versions later than R2020a. When I tested each GPU individually as a single GPU, training works fine.
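A single-GPU check of this kind could be sketched roughly as follows. The loop and variable names are illustrative assumptions, not the poster's code; it relies on `gpuDevice(idx)` to select one card at a time and on "gpu" as the execution environment so that only the selected card is used.

```matlab
% Sketch: select one GPU at a time for a single-GPU training run.
% Assumes Parallel Computing Toolbox and at least one supported GPU.
for idx = 1:gpuDeviceCount
    gpu = gpuDevice(idx);   % selects device idx; this also resets that device
    fprintf('Selected GPU %d: %s\n', gpu.Index, gpu.Name);

    options = trainingOptions('sgdm', ...
        'MaxEpochs',50, ...
        'ExecutionEnvironment','gpu');   % train on the selected GPU only

    % Training call left commented out; imdsTrain and layers would come
    % from the linked example.
    % net = trainNetwork(imdsTrain,layers,options);
end
```

Note that calling `gpuDevice(idx)` clears any `gpuArray` data held on that device, so it is best done at the start of a session.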

Can anyone please help me with this issue? I need to use both GPUs for "multi-gpu" training in MATLAB R2022a for my work, so I first need to fix this problem in the example.

I would appreciate any help. Thank you very much.


Answer 1

Joss Knight on 7 Aug 2022

https://kr.mathworks.com/matlabcentral/answers/1775390-a-problem-when-using-multi-gpu-as-executionenvironment-for-training-a-cnn#answer_1022590

Edited: Joss Knight on 8 Aug 2022

Most likely this is a known issue that is fixed in the latest update to R2022a. You can also try downgrading your GPU drivers.
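To see which MATLAB update and driver are currently in use before changing anything, something like the following sketch may help. It assumes Parallel Computing Toolbox and a supported GPU. Note that `DriverVersion` reports the CUDA version supported by the installed driver, not the GeForce driver package number, which is typically read outside MATLAB (for example with NVIDIA's `nvidia-smi` tool).

```matlab
% Sketch: report the MATLAB release/update and the CUDA driver in use.
disp(version)        % version string; typically includes the update level if one is installed

gpu = gpuDevice;     % currently selected GPU (selects the default if none is selected)
fprintf('%s: CUDA driver %g, CUDA toolkit %g\n', ...
    gpu.Name, gpu.DriverVersion, gpu.ToolkitVersion);
```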

2 Comments

Hyung Joon on 8 Aug 2022

Thank you very much for your support. I uninstalled my GeForce driver (a 500-series version, probably quite a recent one) and installed an older version (one of the 400-series versions). Now "multi-gpu" works!! I installed MATLAB R2022a on July 21st, so I think it did not have the latest update. Thank you very much again for your help!!

Joss Knight on 8 Aug 2022

Glad to hear it!
