BitShift, Pow2 or standard multiplication? Code Optimization

covfefe on 31 October 2024
Edited: 埃博拉酱 on 5 November 2024
I was looking to optimize a code base, only to be surprised by the results of comparing these different MATLAB methods.
While I was expecting bitshift or pow2 to be orders of magnitude faster, the standard multiplication was actually 100x faster.
Am I missing something here? How can a standard multiplication be faster than the dedicated functions?
n = 1e6; x = floor(rand(1,n)*2^8);
tic, for k = 1:n, x1(k) = bitshift(x(k),16); end; toc
tic, for k = 1:n, x2(k) = pow2(x(k),16); end; toc
tic, for k = 1:n, x3(k) = x(k)*2^16; end; toc
Results:
bitshift: Elapsed time is 0.156905 seconds.
pow2: Elapsed time is 0.146554 seconds.
*: Elapsed time is 0.001691 seconds.
Vectorized code shows the same ordering:
n = 1e8; x = floor(rand(1,n)*2^8);
tic, x1 = bitshift(x,16); toc
tic, x2 = pow2(x,16); toc
tic, x3 = x*2^16; toc
bitshift: Elapsed time is 0.279544 seconds.
pow2: Elapsed time is 0.070829 seconds.
*: Elapsed time is 0.029266 seconds.

Answers (1)

埃博拉酱 on 1 November 2024
Edited: 埃博拉酱 on 1 November 2024
If these functions executed their algorithms literally, bitshift would effectively add to the exponent field of the double, and pow2 would do the same thing as the * operator. While a bit shift should in theory draw less power, the difference in execution time is tiny and far smaller than the overhead of the function call itself.
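To make the exponent-field point concrete: in IEEE 754, multiplying a double by a power of two only moves the 11-bit biased exponent and leaves the 52-bit mantissa untouched. A small sketch (in Python, only because the bit layout is language-independent; the helper names are my own) shows that multiplying by 2^16 adds exactly 16 to the stored exponent:

```python
import struct

def double_bits(x):
    # Reinterpret the 8 bytes of an IEEE 754 double as an unsigned 64-bit int.
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def exponent_field(x):
    # Bits 52..62 hold the biased 11-bit exponent.
    return (double_bits(x) >> 52) & 0x7FF

MANTISSA_MASK = (1 << 52) - 1

x = 200.0
y = x * 2**16

print(exponent_field(y) - exponent_field(x))  # 16: only the exponent moved
mx = double_bits(x) & MANTISSA_MASK
my = double_bits(y) & MANTISSA_MASK
print(mx == my)  # True: mantissa bits are unchanged
```

So a power-of-two multiply really is just an exponent adjustment at the hardware level, which is why the choice of bitshift, pow2 or * cannot explain a 100x gap by arithmetic alone.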
Yes, most of the time you observe for bitshift and pow2 is spent on the function calls themselves. MATLAB function calls are very expensive because, unlike C++, MATLAB does not inline simple functions.
If you compute with vectorization instead:
n = 1e6;
x = floor(rand(1,n)*2^8);
tic;
x1 = bitshift(x, 16);
toc
tic;
x2 = pow2(x, 16);
toc
x = uint32(x);
tic;
x3 = x*2^16;
toc
You will observe that, as I said, pow2 and the operator take almost the same amount of time. However, bitshift still takes the longest, which may be because MATLAB performs extra logic checks inside bitshift.
An interesting way to compare them is to use the doc command to look at the documentation for bitshift and pow2: you will find that the bitshift page is longer. While not an absolute rule, in general a longer documentation page suggests more complex logic. This means bitshift has to handle more, and more varied, cases than pow2, and those checks take extra time.
The lesson of MATLAB's expensive function-call overhead is to avoid calling functions repeatedly inside loops and to make the most of the functions' built-in vectorization support.
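The same effect reproduces outside MATLAB. As a rough, language-agnostic sketch (Python here, with a hypothetical wrapper `mult` of my own), putting one trivial multiply behind a function call makes the per-call overhead, not the arithmetic, dominate the measured time:

```python
import timeit

def mult(a, b):
    # Trivial wrapper: the real work is a single multiply,
    # so per-call overhead dominates the measured time.
    return a * b

x = 123.0
N = 200_000

# Bare operator versus the same multiply behind a function call.
t_op = timeit.timeit(lambda: x * 65536.0, number=N)
t_fn = timeit.timeit(lambda: mult(x, 65536.0), number=N)

print(f"operator: {t_op:.4f}s  function call: {t_fn:.4f}s")
```

The exact numbers depend on the machine and interpreter, but the function-call version is consistently slower even though both compute the identical product.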
Comments: 2
covfefe on 4 November 2024
Edited: covfefe on 4 November 2024
While function calls do slow things down, even after wrapping the multiplication AND the power in dedicated functions, I still can't get execution times as slow as bitshift or pow2:
function x3 = pow(x1,x2)
x3 = x1^x2;
end
function x3 = mult(x1,x2)
x3 = x1*x2;
end
n = 1e6; x = floor(rand(1,n)*2^8);
tic, for k = 1:n, x1(k) = bitshift(x(k),16); end; toc
tic, for k = 1:n, x2(k) = pow2(x(k),16); end; toc
tic, for k = 1:n, x3(k) = x(k)*2^16; end; toc
tic, for k = 1:n, x4(k) = mult(x(k),2^16); end; toc
tic, for k = 1:n, x5(k) = mult(x(k),pow(2,16)); end; toc
Results:
bitshift: Elapsed time is 0.156905 seconds.
pow2: Elapsed time is 0.146554 seconds.
*: Elapsed time is 0.001691 seconds.
mult: Elapsed time is 0.015167 seconds.
mult + pow: Elapsed time is 0.107384 seconds.
埃博拉酱 on 5 November 2024
Edited: 埃博拉酱 on 5 November 2024
There may also be some overhead from logic checks, such as checking whether the input is floating point. But I agree with you: these built-in functions shouldn't be this much slower than a handwritten implementation. If you open the code files for these built-ins, you'll see they don't appear to have been updated in more than 10 years. Perhaps their internal implementations rest on outdated techniques that newer technology was never applied to. Perhaps the input-argument checking is excessive and inefficient. Perhaps their performance tests covered only efficiency on large array inputs, not frequent calls on small data.
In my experience, MATLAB does tend to optimize vectorized computation on big data while heavily sacrificing the performance of frequent calls on small data. Even if the developers know there may be room to optimize small calls, they may lack the motivation to do so given the absence of performance testing for that case.
