Vectorized code is much slower than the for-loop version

Why is the vectorized version of this simple local-maxima detection code significantly slower (about 2-3 times) than its for-loop version?
% test data
X = rand(100000,1000);
% finding local maxima over columns of X
% for-loop version
tic;
[I,J] = size(X);
Ind = false(I,J);
for j = 1:J
Ind(:,j) = diff( sign( diff([0; X(:,j); 0]) ) ) < 0;
end
toc
% vectorized version (~3 times slower than for-loop)
tic;
Ind_ = diff(sign(diff([zeros(1,J);X;zeros(1,J)],1,1)),1,1) < 0;
toc
% result identity test
isequal(Ind,Ind_)
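A single tic/toc pair can be noisy, so the comparison may be easier to trust when each version is timed several times and the best run is kept. Below is a minimal sketch of that idea, not a definitive benchmark. The matrix sizes and the repetition count nRep are assumptions chosen so the script runs quickly; they are not from the question. The built-in timeit function is an alternative if each version is wrapped in a function.

```matlab
% Repeat both versions and keep the fastest run of each.
I = 20000; J = 200;        % assumed sizes, smaller than in the question
X = rand(I, J);
nRep = 5;                  % assumed number of repetitions

tLoop = inf;
tVec  = inf;
for r = 1:nRep
    % for-loop version: one column at a time
    t0 = tic;
    Ind = false(I, J);
    for j = 1:J
        Ind(:,j) = diff(sign(diff([0; X(:,j); 0]))) < 0;
    end
    tLoop = min(tLoop, toc(t0));

    % vectorized version: whole matrix at once
    t0 = tic;
    Ind_ = diff(sign(diff([zeros(1,J); X; zeros(1,J)], 1, 1)), 1, 1) < 0;
    tVec = min(tVec, toc(t0));
end

fprintf('loop: %.4f s, vectorized: %.4f s, ratio: %.2f\n', ...
        tLoop, tVec, tVec / tLoop);
assert(isequal(Ind, Ind_));   % both versions must mark the same maxima
```

Scaling I and J back up to 100000 and 1000 should reproduce the original measurement, memory permitting.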

Comments: 6

I guess it is because of
[zeros(1,J);X;zeros(1,J)]
MATLAB needs to allocate a big chunk of memory for it (and copy the data segment by segment, although that also happens in the for-loop).
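To put rough numbers on that, here is a back-of-the-envelope sketch using the dimensions from the question and 8 bytes per double. The variable names are only for illustration. Whether the per-column working set actually stays in CPU cache depends on the machine, so that part is a hypothesis, not a measurement.

```matlab
% Approximate memory footprint of the arrays involved.
I = 100000; J = 1000;          % sizes from the question
bytesPerDouble = 8;

bytesX      = I * J * bytesPerDouble;         % X itself
bytesPadded = (I + 2) * J * bytesPerDouble;   % [zeros(1,J); X; zeros(1,J)]
bytesColumn = (I + 2) * bytesPerDouble;       % [0; X(:,j); 0] in the loop

fprintf('X:             %.1f MB\n', bytesX / 1e6);
fprintf('padded copy:   %.1f MB\n', bytesPadded / 1e6);
fprintf('padded column: %.3f MB\n', bytesColumn / 1e6);
```

The vectorized expression needs a full padded copy of about 800 MB before any diff is computed, while the loop only ever pads one column of under 1 MB at a time.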
Michal on 9 Sep 2019
Edited: Michal on 9 Sep 2019
@Bruno I think the problem could be in the built-in diff function, which does not seem to be implemented efficiently for the dim = 1 option. See the timing of the following code:
%% test data
X = rand(100000,1000);
%% finding local maxima over columns of X
[I,J] = size(X);
array = [zeros(1,J);X;zeros(1,J)];
% for-loop version
tic;
Ind = false(I,J);
for j = 1:J
Ind(:,j) = diff( sign( diff(array(:,j)) ) ) < 0;
end
toc
% vectorized version (~2 times slower than for-loop)
tic;
Ind_ = diff(sign(diff(array,1,1)),1,1) < 0;
toc
%% result identity test
isequal(Ind,Ind_)
Bruno Luong on 9 Sep 2019
Edited: Bruno Luong on 9 Sep 2019
Not entirely convinced. I still stick with a memory-related cause, because not only the vertical CAT but also DIFF, SIGN and DIFF create 3 big (hidden) temporary arrays.
If you add the 1,1 arguments in the for-loop:
tic;
[I,J] = size(X);
Ind = false(I,J);
for j = 1:J
Ind(:,j) = diff( sign( diff(array(:,j),1,1) ),1,1) < 0;
end
toc
it's still fast. How do you explain that?
Note also that the relative difference in CPU time is smaller if you reduce the first dimension of X.
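The dependence on the first dimension can be checked with a small sweep over the number of rows. This is only a sketch: the row counts and the column count J are assumed values, not taken from the thread, and single tic/toc measurements are noisy, so the ratios are indicative at best.

```matlab
% Ratio of vectorized time to loop time for different row counts.
J = 200;                             % assumed column count
rowCounts = [1000 10000 100000];     % assumed first dimensions to try
ratio = zeros(size(rowCounts));

for k = 1:numel(rowCounts)
    I = rowCounts(k);
    array = [zeros(1,J); rand(I,J); zeros(1,J)];

    % for-loop version
    t0 = tic;
    Ind = false(I, J);
    for j = 1:J
        Ind(:,j) = diff(sign(diff(array(:,j)))) < 0;
    end
    tLoop = toc(t0);

    % vectorized version
    t0 = tic;
    Ind_ = diff(sign(diff(array, 1, 1)), 1, 1) < 0;
    tVec = toc(t0);

    assert(isequal(Ind, Ind_));
    ratio(k) = tVec / tLoop;
    fprintf('I = %6d: loop %.4f s, vectorized %.4f s, ratio %.2f\n', ...
            I, tLoop, tVec, ratio(k));
end
```

If the memory explanation holds, the ratio should move toward 1 (or below) for the smaller row counts.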
Michal on 9 Sep 2019
Edited: Michal on 9 Sep 2019
I guess that in this case I call diff(array(:,j),1,1), where array(:,j) is a vector, not a matrix, so diff does not have to process separate columns of the array. Maybe the built-in diff function does not use multithreading properly here? But you are right, the memory allocation in the vectorized code could really be one (!) of the causes of the slowness.
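One way to probe the multithreading guess is to run the vectorized expression once with the default thread limit and once restricted to a single computational thread via maxNumCompThreads. This is a rough sketch with assumed matrix sizes; it only shows whether the thread count changes the timing, not why.

```matlab
% Compare the vectorized version with default threads and with one thread.
I = 20000; J = 200;                      % assumed sizes, smaller than in the thread
array = [zeros(1,J); rand(I,J); zeros(1,J)];

nDefault = maxNumCompThreads;            % current thread limit

t0 = tic;
IndMulti = diff(sign(diff(array, 1, 1)), 1, 1) < 0;
tMulti = toc(t0);

maxNumCompThreads(1);                    % restrict to a single thread
t0 = tic;
IndSingle = diff(sign(diff(array, 1, 1)), 1, 1) < 0;
tSingle = toc(t0);

maxNumCompThreads(nDefault);             % restore the previous limit

fprintf('%d thread(s): %.4f s, 1 thread: %.4f s\n', nDefault, tMulti, tSingle);
assert(isequal(IndMulti, IndSingle));
```

If both timings are about the same, multithreading is probably not a factor in either direction for this expression.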
Bruno Luong on 9 Sep 2019
It is possible that the DIFF implementation for 2D arrays does not access memory sequentially but goes through the array row by row, which might slow it down.
I don't think the multithreading is wrongly implemented.
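To separate DIFF from the concatenation and from SIGN, it can be timed on its own: once on the whole matrix along dimension 1, and once column by column. A minimal sketch with assumed sizes follows. It cannot show how DIFF traverses memory internally, only whether the matrix call by itself is slower than the column loop.

```matlab
% Time diff alone: whole matrix along dim 1 versus one column at a time.
I = 20000; J = 200;              % assumed sizes
A = rand(I, J);

t0 = tic;
D1 = diff(A, 1, 1);              % matrix input, differences down the columns
tMatrix = toc(t0);

t0 = tic;
D2 = zeros(I - 1, J);
for j = 1:J
    D2(:,j) = diff(A(:,j));      % vector input, one column per call
end
tColumns = toc(t0);

fprintf('diff on matrix: %.4f s, diff per column: %.4f s\n', tMatrix, tColumns);
assert(isequal(D1, D2));         % both approaches give identical differences
```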
Michal on 9 Sep 2019
The main problem is that the continuous development of the JIT engine keeps changing MATLAB's performance characteristics for vectorized code. In general, standard for-loop code becomes faster and faster.
I have plenty of highly vectorized MATLAB code written during the last 10 years which, over the last few years, has become slower than its for-loop counterparts. So there is no stability in code performance.


Answers (0)


Asked: 9 Sep 2019

Commented: 9 Sep 2019
