Efficient way of Vectorization

조회 수: 2 (최근 30일)
Nadeem Ahmed
Nadeem Ahmed 2022년 11월 29일
댓글: Matt J 2022년 11월 30일
Hello, I searched everywhere for the efficient explaination of vectorization, I would like to know how can we use the technique of vectorization efficiently if we have this kind of problem
clc
clear
close all
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
tic
for i=2:n-1
for j=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
toc
Elapsed time is 0.049266 seconds.
%VECTORIZATION
tic
C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
toc;
Elapsed time is 0.014871 seconds.
norm(C1-C2)
ans = 0
This is a very basic example, although it is showing the improvement after vectorization but not that enough. If I make more divison and multiplication in the same function, "vectorization" will become even worse than "for loop". If anybody have any suggestion regarding this, it would be very helpful for me.
  댓글 수: 8
Mike Croucher
Mike Croucher 2022년 11월 30일
Thanks. So for N,M=50, the code runs in 0.01 seconds on my machine.
Increasing to N,M=100, the code runs in 0.22 seconds
Trying N,M=200, I run out of memory on my 32Gb laptop.
What values of N and M are you interested in and how fast do you need the code to be?
Nadeem Ahmed
Nadeem Ahmed 2022년 11월 30일
I will use this function for N,M>80 and I need to call this function more than thousands times therefore it should be of negligible time. Any suggestions are welcome.

댓글을 달려면 로그인하십시오.

채택된 답변

Matt J
Matt J 2022년 11월 30일
편집: Matt J 2022년 11월 30일
Unfortunately, this is a situation where the for loop is the fastest option. This is because vectorized solution does much more memory allocation than it should. I have raised this issue with MathWorks staff, but am not sure what is being done on it.
function test
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
timeit(@()method1)
timeit(@()method2)
ans =
0.0161
ans =
0.0210
function method1
for i=2:n-1
for j=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
end
function method2
C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
end
end
  댓글 수: 9
Bruno Luong
Bruno Luong 2022년 11월 30일
편집: Bruno Luong 2022년 11월 30일
I don't think the problem is allocating memory, but actually indexing with truncation index, which requires elements in memory to be rearranged.
I'm not surprised that to make a vectorize code as fast as the for-loop requires a big development of the internal engine (for instant using meta data that describe subarray of an array without copying the data).
Indexing is always the bottleneck of MATLAB.
Matt J
Matt J 2022년 11월 30일
I don't think the problem is allocating memory, but actually indexing with truncation index
Not sure what a "truncation index" refers to here. In any case, the subsref operations are definitely to blame, since when we revise the test with the indexing done offline, the vectorized version is much more competitive with the loops:
function test
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
[Q1,Q2,Q3,Q4,Q5,Q6]=...
deal( A(2:n-1,2:n-1) , B(2:n-1,1:n-2), A(1:n-2,2:n-1),...
B(3:n,1:n-2), A(2:n-1,3:n), B(3:n,2:n-1) );
timeit(@()method1)
timeit(@()method2)
ans =
0.0149
ans =
0.0051
function method1
for i=2:n-1
for j=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
end
function method2
C2(2:n-1,2:n-1)=(Q1.*Q2 + Q3.*Q4)./(Q5.*Q6);
end
end

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Mike Croucher
Mike Croucher 2022년 11월 30일
Switch the order of the loops around. It will be faster because you'll be operating on the matrix column-wise
test
loops
ans = 0.0884
loops 2
ans = 0.0203
vector
ans = 0.0928
function test
n=2000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
disp('loops')
timeit(@()loops)
disp('loops 2')
timeit(@()loops2)
disp('vector')
timeit(@()vector)
function loops
for i=2:n-1
for j=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
end
function loops2
for j=2:n-1
for i=2:n-1
C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
end
end
end
function vector
C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
end
end
  댓글 수: 4
Nadeem Ahmed
Nadeem Ahmed 2022년 11월 30일
Yes, you are right, beacsue I changed all my for loop but still I didn't get any improvement.
Dyuman Joshi
Dyuman Joshi 2022년 11월 30일
This is neat, @Mike Croucher! Learned something new today :D

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Time Series Objects에 대해 자세히 알아보기

제품


릴리스

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by