Unexpected speed decrease of 2D Fourier Transform on GPU when iFFTed

Tutu on 3 Jun 2019
Answered: Joss Knight on 8 Jun 2019
I apply a first FFT2 to a stack of images, crop part of the result, and apply iFFT2 to that part.
For example, on GPU: FFT2(1920*1240*30 (single)) -> crop to 320*207*30 (single) -> iFFT2(320*207*30 (single))
1920/6=320
1240/6≈207
The plot below shows the execution time of each function, normalized to the number of single-precision values processed:
timeexeeval.png
Note that the yellow line (FFT2 + crop to 1/6 + iFFT2) is more than an order of magnitude slower than the purple line, which has 36 times more data to process with iFFT2!
Any idea what is happening here?
Here is the script I have used:
clear
n=10;
cx=1920;
cy=1240;
FPT=2:5:50;
fpt=size(FPT,2);
b=zeros(1,fpt);
for kk=1:8
    for ii=1:fpt
        ii
        I=gpuArray(single(rand(cy,cx,FPT(1,ii))));
        Ia=gpuArray(single(rand(round(cy/6),round(cx/6),FPT(1,ii))+1i.*rand(round(cy/6),round(cx/6),FPT(1,ii))));
        mask=zeros(cy,cx,FPT(1,ii));
        mask(round(cy/2)-round(cy/12):round(cy/2)+round(cy/12),round(cx/2)-round(cx/12):round(cx/2)+round(cx/12))...
            =(ones(size(round(cy/2)-round(cy/12):round(cy/2)+round(cy/12),2),size(round(cx/2)-round(cx/12):round(cx/2)+round(cx/12),2)));
        mask=gpuArray(single(mask));
        tic
        for jj=1:n
            switch kk
                case 1
                    tic
                    B=fft2(I);
                case 2
                    tic
                    B=fft2(I);
                    C=B(((cy/2)-round(cy/12)):((cy/2)+round(cy/12)),...
                        ((cx/2)-round(cx/12)):((cx/2)+round(cx/12)),:);
                case 3
                    tic
                    B=fft2(I);
                    C=B(((cy/2)-round(cy/12)):((cy/2)+round(cy/12)),...
                        ((cx/2)-round(cx/12)):((cx/2)+round(cx/12)),:);
                    D=ifft2(C);
                case 4
                    tic
                    B=fft2(I);
                    C=ifft2(B);
                case 5
                    tic
                    B=fft2(I);
                    C=B.*mask;
                    D=ifft2(C);
                case 6
                    tic
                    B=fft2(I);
                    C=B.*mask;
                    D=ifft2(C);
                    E1=imresize(abs(D),1/6);
                    E2=imresize(angle(D),1/6);
                case 7
                    tic
                    C=fft2(I);
                    B=ifft2(Ia);
                case 8
                    tic
                    B=ifft2(Ia);
            end
        end
        b(1,ii)=toc/n; % b is the time of execution normalized to
        %the amount of data and the number of time a case has been evaluated
    end
    hold on
    plot(b)
    clear A B C D I E1 E2
end
b is the variable plotted in the figure above.
My graphics card is a GeForce RTX 2080 Ti.
Any help will be appreciated.
Thanks,
Tual

Accepted Answer

Joss Knight on 8 Jun 2019
I modified your code by inserting wait(gpuDevice) before each tic and toc, and got a much more sensible graph:
Capture.PNG
The GPU runs asynchronously, so tic and toc on their own often do not measure the actual execution time. See the documentation here.
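
For readers who want to try this, here is a minimal sketch of the synchronized timing for the FFT2 + crop + iFFT2 case. It is not the modified script itself. It assumes Parallel Computing Toolbox and a supported GPU; the sizes come from the question, and the names `rows`, `cols`, `nFrames` and `tPerRun` are illustrative.

```matlab
% Sketch: time fft2 + crop + ifft2 on the GPU with explicit synchronization.
% Assumes Parallel Computing Toolbox and a supported GPU device.
cy = 1240; cx = 1920; nFrames = 30;   % sizes taken from the question
n  = 10;                              % repetitions to average over

I    = gpuArray(rand(cy, cx, nFrames, 'single'));
rows = (cy/2 - round(cy/12)):(cy/2 + round(cy/12));
cols = (cx/2 - round(cx/12)):(cx/2 + round(cx/12));

dev = gpuDevice;

% Warm-up run so that one-time setup cost is not part of the measurement.
B = fft2(I);
D = ifft2(B(rows, cols, :));

wait(dev);                 % finish pending GPU work before starting the clock
t0 = tic;
for jj = 1:n
    B = fft2(I);
    C = B(rows, cols, :);
    D = ifft2(C);
end
wait(dev);                 % make sure the queued GPU work has actually finished
tPerRun = toc(t0) / n;

fprintf('fft2 + crop + ifft2: %.3g s per run\n', tPerRun);
```

Using a single `tic` outside the loop, rather than one per iteration, means the average covers all `n` runs. `gputimeit` is an alternative that takes care of the synchronization and repetition itself.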

More Answers (1)

Bruno Luong on 3 Jun 2019
If you want a fast FFT, make your data length a power of 2, or a product of small integers.
166 is bad since its prime factorization is 2 * 83.
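
A quick way to inspect candidate lengths is to list their prime factors. The sketch below checks the 166 quoted above, plus the crop sizes that the index ranges in the posted script appear to produce (207 rows and 321 columns, as far as I can tell from the code). The variable names and the padding step are illustrative assumptions.

```matlab
% Sketch: inspect the prime factors of candidate FFT lengths.
cy = 1240; cx = 1920;                                         % image size from the question
nRows = numel((cy/2 - round(cy/12)):(cy/2 + round(cy/12)));   % 207
nCols = numel((cx/2 - round(cx/12)):(cx/2 + round(cx/12)));   % 321

lengths = [166, nRows, nCols, 256];    % 166 is the length quoted in the answer
for len = lengths
    f = factor(len);                   % prime factors in ascending order
    fprintf('%4d = %s (largest prime factor %d)\n', ...
        len, join(string(f), " * "), max(f));
end

% One option (an assumption, not something the answer prescribes):
% pad each dimension up to the next power of 2.
padded = 2.^nextpow2([nRows nCols]);   % [256 512]
```

`fft2(X, m, n)` and `ifft2(Y, m, n)` accept target sizes and zero-pad the input, so the padded sizes can be passed directly, at the cost of changing the output grid.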
2 Comments
Tutu on 3 Jun 2019
Thank you for the answer, though I knew this. Besides, on a GPU it does not make a big difference whether or not the dimensions are powers of 2, provided you process a large amount of data.
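
One way to check this on a given card is to compare the per-element time of ifft2 for a power-of-2 stack and for a stack with the crop size. This is only a sketch, assuming Parallel Computing Toolbox and a supported GPU; the 256 x 512 comparison size and the 30-frame depth are arbitrary choices for illustration.

```matlab
% Sketch: compare ifft2 cost for a power-of-2 size and the cropped size.
% Assumes Parallel Computing Toolbox and a supported GPU device.
nFrames = 30;
sizes   = [256 512; 207 321];     % [rows cols]; second row: crop size produced by the script

for k = 1:size(sizes, 1)
    r = sizes(k, 1);
    c = sizes(k, 2);
    X = gpuArray(complex(rand(r, c, nFrames, 'single'), ...
                         rand(r, c, nFrames, 'single')));
    t = gputimeit(@() ifft2(X));  % gputimeit synchronizes the device itself
    fprintf('%3d x %3d x %d: %.3g ns per element\n', ...
        r, c, nFrames, 1e9 * t / numel(X));
end
```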
Tutu on 3 Jun 2019
Bruno Luong, see:
untitledTotalt2.png
Note that the "y" axis is in hertz.
