Why is x(:) so much slower than reshape(x,N,1) with complex arrays?

Matt J on 27 Jul 2021
Commented: Walter Roberson on 14 Aug 2021
The two for loops below differ only in the flattening operation used to obtain A_1D. Why is the run time so much worse with A_3D(:) than with a call to reshape()?
Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(N,1);
tic
for k = 1:20
B = reshape( A0, [Nz,Ny,Nx] ) ;
A_3D = fftn(B);
A_1D = reshape( A_3D, N,1); %<--- Version 1
end
toc
Elapsed time is 3.770859 seconds.
tic
for k = 1:20
B = reshape( A0, [Nz,Ny,Nx] ) ;
A_3D = fftn(B);
A_1D = A_3D(:); %<--- Version 2
end
toc
Elapsed time is 5.056827 seconds.
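One way to separate the cost of the flattening step from the cost of fftn() is to compute the transform once and time only the two flattening expressions with timeit(). The following is a minimal sketch under that assumption; the array size is reduced, and the names f_reshape, f_colon, t_reshape, t_colon and same are hypothetical, not from the original post. Timings will vary with machine and release.

```matlab
% Sketch: time only the flattening step on a complex array.
Nx = 64; Ny = 64; Nz = 32;            % smaller than the original sizes (assumption)
N  = Nx*Ny*Nz;
A_3D = fftn(rand(Nz, Ny, Nx));        % complex result, computed once

f_reshape = @() reshape(A_3D, N, 1);  % Version 1
f_colon   = @() A_3D(:);              % Version 2

t_reshape = timeit(f_reshape);
t_colon   = timeit(f_colon);
fprintf('reshape: %.3g s, (:): %.3g s\n', t_reshape, t_colon);

% Both expressions must produce the same N-by-1 column vector.
same = isequal(f_reshape(), f_colon());
```

If the gap between the two loops comes from the flattening step alone, it should still be visible here, since fftn() is no longer inside the timed code.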
Bruno Luong on 28 Jul 2021
I must admit that, for a few years now, it has been obscure to me why and when MATLAB makes a data copy. I have not reached a full understanding of how it works.


Accepted Answer

Matt J on 28 Jul 2021
The following simple test seems to support @Bruno Luong's conjecture that (:) results in data copying. The data of B1 resulting from reshape() has the same data pointer location as A, but B2 generated with (:) points to different data.
format debug
A=complex(rand(2),rand(2))
A =
Structure address = 7f3f47f4e0e0
m = 2
n = 2
pr = 7f3fcb0112e0
   0.5114 + 0.6181i   0.5881 + 0.4450i
   0.5713 + 0.9018i   0.3682 + 0.8103i
B1=reshape(A,4,1),
B1 =
Structure address = 7f3fcf1f4be0
m = 4
n = 1
pr = 7f3fcb0112e0
   0.5114 + 0.6181i
   0.5713 + 0.9018i
   0.5881 + 0.4450i
   0.3682 + 0.8103i
B2=A(:)
B2 =
Structure address = 7f3f47e45a20
m = 4
n = 1
pr = 7f3faff0b980
   0.5114 + 0.6181i
   0.5713 + 0.9018i
   0.5881 + 0.4450i
   0.3682 + 0.8103i
Walter Roberson on 14 Aug 2021
The (:) options are the slowest. reshape(abs(A),N,1) might be the fastest, although there is notable variation between runs.
Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = complex(randn(Nx, Ny, Nz), randn(Nx, Ny, Nz));
t(1) = timeit(@() use_abs_all(A0, N), 0)
t = 0.0937
t(2) = timeit(@() use_abs_colon(A0, N), 0)
t = 1×2
0.0937 0.1727
t(3) = timeit(@() use_abs_reshape_null(A0, N), 0)
t = 1×3
0.0937 0.1727 0.0994
t(4) = timeit(@() use_abs_reshape_N(A0, N), 0)
t = 1×4
0.0937 0.1727 0.0994 0.0935
t(5) = timeit(@() use_all(A0, N), 0)
t = 1×5
0.0937 0.1727 0.0994 0.0935 0.1012
t(6) = timeit(@() use_colon(A0, N), 0)
t = 1×6
0.0937 0.1727 0.0994 0.0935 0.1012 0.1802
t(7) = timeit(@() use_reshape_null(A0, N), 0)
t = 1×7
0.0937 0.1727 0.0994 0.0935 0.1012 0.1802 0.1013
t(8) = timeit(@() use_reshape_N(A0, N), 0)
t = 1×8
0.0937 0.1727 0.0994 0.0935 0.1012 0.1802 0.1013 0.1018
cats = categorical({'abs(all)', 'abs(:)', 'reshape(abs,[])','reshape(abs,N)', 'all', '(:)', 'reshape([])', 'reshape(N)'});
bar(cats, t)
function B = use_abs_all(A, N)
B = max(abs(A), [], 'all');
end
function B = use_abs_colon(A, N)
B = max(abs(A(:)));
end
function B = use_abs_reshape_null(A, N)
B = max(reshape(abs(A), [], 1));
end
function B = use_abs_reshape_N(A, N)
B = max(reshape(abs(A), N, 1));
end
function B = use_all(A, N)
B = max(A, [], 'all');
end
function B = use_colon(A, N)
B = max(A(:));
end
function B = use_reshape_null(A, N)
B = max(reshape(A, [], 1));
end
function B = use_reshape_N(A, N)
B = max(reshape(A, N, 1));
end
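Before comparing timings, it can help to confirm that the variants really compute the same result. The following is a minimal sketch assuming a small test array; the variable names (m_all, c_all, agree_abs, agree_complex, and so on) are hypothetical and not from the original post.

```matlab
% Sketch: confirm that the flattening variants return the same maximum.
A = complex(randn(8, 8, 4), randn(8, 8, 4));   % small test array (assumption)
N = numel(A);

% Variants that take abs() first, so max() sees real data.
m_all   = max(abs(A), [], 'all');
m_colon = max(abs(A(:)));
m_null  = max(reshape(abs(A), [], 1));
m_N     = max(reshape(abs(A), N, 1));
agree_abs = isequal(m_all, m_colon, m_null, m_N);

% Variants that pass complex data to max(), which compares by magnitude.
c_all   = max(A, [], 'all');
c_colon = max(A(:));
c_null  = max(reshape(A, [], 1));
c_N     = max(reshape(A, N, 1));
agree_complex = isequal(c_all, c_colon, c_null, c_N);
```

The first group returns a real magnitude, while the second returns the complex element of largest magnitude, so the two groups are only expected to agree within themselves.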


More Answers (1)

Walter Roberson on 28 Jul 2021
Nx = 256;
Ny = 256;
Nz = 128;
N = Nx*Ny*Nz;
A0 = rand(Nx, Ny, Nz);
timeit(@() use_colon(A0, N), 0)
ans = 8.3490e-06
timeit(@() use_reshape_null(A0, N), 0)
ans = 6.5490e-06
timeit(@() use_reshape_N(A0, N), 0)
ans = 6.0925e-06
function use_colon(A, N)
B = A(:);
end
function use_reshape_null(A, N)
B = reshape(A, [], 1);
end
function use_reshape_N(A, N)
B = reshape(A, N, 1);
end
In this particular test, the timings are close enough that we can speculate about some reasons:
Using an explicit size in reshape() is faster than reshape(A, [], 1) because the [] form has to spend time calculating the missing size by dividing numel() by the product of the known dimensions.
The difference between (:) and reshape() is less clear. The model for (:) is that it invokes subsref() with struct('type', {'()'}, 'subs', {{':'}}), and subsref() then has to invoke reshape(). I say "model" because the Execution Engine could potentially optimize all of this, and one would tend to think that optimization of (:) should be especially good.
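The two points above can be illustrated with a short sketch. It only mirrors the model described here and says nothing about what the Execution Engine actually does internally; the variable names (known, inferred, B_model, and so on) are hypothetical, and substruct() is used as a convenient way to build the subscript structure.

```matlab
% Sketch: the size inference done by reshape(A, [], 1) and the subsref() model of A(:).
A = complex(rand(3, 4, 5), rand(3, 4, 5));

% reshape(A, [], 1): the empty dimension is inferred as
% numel(A) divided by the product of the known dimensions.
known    = 1;
inferred = numel(A) / known;
B_null   = reshape(A, [], 1);
B_N      = reshape(A, inferred, 1);

% Model of A(:): an explicit subsref() call with a '()' subscript of ':'.
S       = substruct('()', {':'});
B_model = subsref(A, S);
B_colon = A(:);

same_shape  = isequal(size(B_null), size(B_N), size(B_colon), [inferred 1]);
same_values = isequal(B_null, B_N, B_model, B_colon);
```

All four results have the same shape and values, so any timing difference between them comes from how each form is dispatched and whether the data is copied, not from what is computed.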
I noticed that when I re-run it within a script without clearing variables, the second peak at x=5 vanishes. Still curious but out of ideas.


Release: R2021a
