IFFT slowdown when using gpuArray
Two data sets, A (a 4096 x 1024 matrix) and B (a 32768 x 1024 matrix), have been transferred to the GPU using gpuArray. A is passed to the FFT function and shows a significant speed increase compared with the FFT of A on the CPU. B is passed to the IFFT function and shows approximately a 50% decrease in efficiency compared with the IFFT of B on the CPU. Is there a reason why the IFFT function does not get a speed increase proportional to that of the FFT function? I understand that the sizes differ, but I do not understand why the GPU IFFT is slower than the CPU IFFT. Both tic/toc and Run and Time were used to time the results. Thank you for your help.
Comments: 4
Jill Reese
May 3, 2013
What version of MATLAB are you running?
Michael
May 3, 2013
James Lebak
May 3, 2013
Edited: James Lebak
May 3, 2013
When I time this on MATLAB R2013a with a 3.5 GHz Xeon and a Tesla C2075 GPU, I see 0.36 s for the IFFT of a 32768x1024 matrix on the CPU and 0.051 s on the GPU. Here is the code I used:
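x=gpuArray.ones(32768,1024);      % double-precision test matrix created on the GPU
gd=gpuDevice;                     % handle to the currently selected GPU
tic;y=ifft(x);wait(gd);toc        % wait for the GPU to finish before stopping the timer
xc=gather(x);                     % copy the same data back to host memory
tic;y=ifft(xc);toc                % CPU timing for comparison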
And the output:
Elapsed time is 0.050705 seconds.
Elapsed time is 0.364836 seconds.
I would be interested to know what this code shows you, and also whether having the other array that you mentioned in memory changes the performance. I didn't see a change, but I don't have access to this specific card that you have.
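A single tic/toc run can include one-time setup cost on the first call, so repeated measurements tend to be more reliable. The following is a minimal sketch rather than the code used above. It assumes a supported GPU and a release that provides `gputimeit` and `timeit` (R2013b or later, which is newer than the R2013a mentioned in this thread). The variable names `tGpu` and `tCpu` are illustrative.

```matlab
% Sketch only: assumes Parallel Computing Toolbox, a supported GPU,
% and a release that ships gputimeit/timeit (R2013b or later).
x  = gpuArray(ones(32768, 1024));   % double-precision input on the GPU
xc = gather(x);                     % same data in host memory

% gputimeit calls the function several times and synchronizes with
% the GPU itself, so no explicit wait(gd) is needed here.
tGpu = gputimeit(@() ifft(x));
tCpu = timeit(@() ifft(xc));

fprintf('GPU ifft: %.4f s\nCPU ifft: %.4f s\nCPU/GPU ratio: %.1f\n', ...
        tGpu, tCpu, tCpu / tGpu);
```

A ratio below 1 means the GPU version is slower than the CPU version for this size and data type.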
Michael
May 3, 2013
Accepted Answer
More Answers (1)
James Lebak
May 3, 2013
Edited: James Lebak
May 4, 2013
Votes: 1
The GeForce GT630M is a mobile graphics card. Frequently, these cards don't perform as well in double-precision as they do in single-precision. If your application can handle single-precision, you can try the IFFT in single and see if that gives you better performance. If you need double precision performance, you might want to try a different card.
This especially applies if the card in question is compute capability 3.0. You can find out the compute capability of the card in MATLAB from the structure returned by 'gpuDevice'.
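As a small illustration of that lookup, the sketch below prints the device name and compute capability. It assumes Parallel Computing Toolbox and a supported GPU are present, and uses only the `Name` and `ComputeCapability` properties of the object returned by `gpuDevice`.

```matlab
% Sketch only: requires Parallel Computing Toolbox and a supported GPU.
gd = gpuDevice;                                             % currently selected GPU
fprintf('Device: %s\n', gd.Name);
fprintf('Compute capability: %s\n', gd.ComputeCapability);  % text such as '3.0'
```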
Edit: removed incorrect identification of the 630M.
Comments: 5
Michael
May 3, 2013
James Lebak
May 3, 2013
You are correct that the card can compute in double precision, but that doesn't always mean it can compute faster than your CPU. I apologize for getting the compute capability of your card wrong (I misread the chart), but the point stands: many GeForce and mobile cards are good at single-precision computation and less good at double-precision.
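One way to check this on a particular card is to time the same IFFT in both precisions. This is a rough sketch under the assumption that a supported GPU with enough free memory for both arrays is available. The names `xd`, `xs`, `tDouble`, and `tSingle` are illustrative, and single precision carries only about 7 significant decimal digits, so it is an option only if the application tolerates that.

```matlab
% Sketch only: compares double- and single-precision IFFT on the GPU.
gd = gpuDevice;
xd = gpuArray(ones(32768, 1024));             % double-precision input
xs = gpuArray(ones(32768, 1024, 'single'));   % single-precision input

% Warm-up calls so one-time setup cost is not included in the timings.
y = ifft(xd); y = ifft(xs); wait(gd);

tic; y = ifft(xd); wait(gd); tDouble = toc;   % double-precision timing
tic; y = ifft(xs); wait(gd); tSingle = toc;   % single-precision timing

fprintf('double: %.4f s, single: %.4f s\n', tDouble, tSingle);
```

If the single-precision time is much lower than the double-precision time, the card's double-precision throughput is the likely bottleneck.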
Michael
May 4, 2013
Michael
May 5, 2013