How to transfer data from GPU to workspace faster?

Han F on 6 Jan 2021
Commented: Walter Roberson on 10 Jan 2021
Hi everyone,
There is a bottleneck in my code.
I use CUDA to accelerate some parts of my algorithm, and the results have to be returned to the workspace for the calculations that follow.
I first use gather() to transfer data from the GPU to the workspace. However, the profiler shows that the execution time of gather() (1000 s) accounts for almost 1/3 of the total time (3000 s). My question: are there any strategies that can speed up the data transfer?
Also, I posted a similar question: https://www.mathworks.com/matlabcentral/answers/703592-why-does-the-gather-function-only-take-around-0-001-seconds-at-command-window-while-1s-in-a-loo . But I am still confused about how gputimeit() and wait(gpuDevice) work.
Thank you in advance!
  Comments: 3
Han F on 6 Jan 2021
Hi Walter, thank you for your comment. The following profiler output is the latest result. Lines 649 and 651 take 650 seconds, and the total time is 2900 s. [dk1,...,dk10] is a 1000*9 matrix.
Do you mean there is no way to reduce the time spent waiting for the GPU to finish?
Joss Knight on 10 Jan 2021
Can you explain how you expect to reduce the time spent waiting for the GPU to finish? The GPU has to finish computing a result before it can return a result! The only way to make that go faster is to get a faster GPU - or maybe write your own CUDA code, if you think you can implement a faster algorithm.

Accepted Answer

Walter Roberson on 6 Jan 2021
These are the strategies to reduce the time spent in gather():
  • Do less complex operations on the GPU so that the GPU finishes the task faster
  • Use smaller output matrices -- ideally only a scalar
  • Use a faster GPU with faster memory and a faster I/O bus
  • Use faster memory on your CPU
  • Don't use the GPU for operations that would complete about as quickly on the CPU. For example, even though addition is faster on the GPU, if you use large arrays, most of the time is spent transferring the data to the GPU and the results back from the GPU
There is no
gpuprefs(gpudevice, 'gather_rate', 'ludicrous_speed')
You have to wait for the GPU to finish executing the task given to it, and the driver has to do a DMA operation to get the data back.
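To see how much of the time charged to gather() is really waiting for outstanding GPU work, one option is to synchronize explicitly with wait(gpuDevice) before the transfer and time each phase separately. The following is only a minimal sketch, assuming Parallel Computing Toolbox and a supported GPU; the array size and the matrix multiply are placeholders, not the code from the question.

```matlab
% Minimal sketch: separate GPU compute time from transfer time.
% Assumes Parallel Computing Toolbox and a supported GPU are available.
% The size and the matrix multiply are placeholders for the real workload.
dev = gpuDevice;                 % handle to the currently selected GPU
A = rand(4000, 'gpuArray');      % placeholder input, created on the GPU

tic;
B = A * A;                       % queued on the GPU; may return before it completes
tQueue = toc;

tic;
wait(dev);                       % block until the GPU has finished pending work
tCompute = toc;                  % time that would otherwise show up inside gather()

tic;
Bhost = gather(B);               % only the device-to-host copy remains
tTransfer = toc;

fprintf('queue %.4f s, compute %.4f s, transfer %.4f s\n', ...
        tQueue, tCompute, tTransfer);

% gputimeit waits for the GPU itself, so it reports the full execution time.
tFull = gputimeit(@() A * A);
fprintf('gputimeit: %.4f s\n', tFull);
```

If tCompute is much larger than tTransfer, the profiler's gather() time is mostly the computation finishing rather than the copy, and speeding up the transfer itself would not help much.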
Oh yes, there is one more strategy to reduce the time spent in gather():
  • use Linux instead of Windows. The Windows drivers are less efficient (for reasons having to do with the architecture requirements that Windows places.)
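The "smaller output matrices" strategy from the first list can be sketched as follows: reduce the data on the GPU and gather only the summary. This is a hedged illustration, assuming Parallel Computing Toolbox and a supported GPU; the 1000-by-9 placeholder matrix and the column mean stand in for whatever reduction the later CPU code actually needs.

```matlab
% Minimal sketch: reduce on the GPU, then gather only the small result.
% X is placeholder data; the column mean is an assumed example reduction.
X = rand(1000, 9, 'gpuArray');

% Option A: copy the whole matrix back, then reduce on the CPU.
Xhost = gather(X);
colMeansCpu = mean(Xhost, 1);

% Option B: reduce on the GPU and copy back only a 1-by-9 row.
colMeansGpu = gather(mean(X, 1));

% Both options give the same values up to floating-point rounding.
maxDiff = max(abs(colMeansCpu - colMeansGpu));
fprintf('largest difference: %g\n', maxDiff);
```

This only pays off when the later CPU steps need just the summary. A 1000-by-9 double matrix is only about 72 KB, so for an array of that size the copy itself is unlikely to be the dominant cost.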
  Comments: 3
Han F on 6 Jan 2021
By the way, what does gpuprefs() mean? I did not find it in the help.
Walter Roberson on 10 Jan 2021
https://youtu.be/NAWL8ejf2nM
