# Which one is faster: trace(a' * b) or sum(sum(a .* b', 2))?

조회 수: 11(최근 30일)
Xiaohan Du 21 Jun 2018
Hi all,
If I have 2 matrices with same size, I first perform SVD to each, then calculate the trace differently as follows:
clear; clc;
% this script tests what's the fastest way to compute trace with SVD
% vectors.
ma = rand(30000, 100); % large, rectangular matrices.
mb = rand(30000, 100);
[ua, sa, va] = svd(ma, 0);
[ub, sb, vb] = svd(mb, 0);
% case 1: trace(a' * b)
tic
for i = 1:1000
otptTr = trace((vb' * va) * sa' * (ua' * ub) * sb);
end
toc
% case 2: sum(sum(a.*b', 2))
tic
for i = 1:1000
otptSum = sum(sum(((vb' * va) * sa') .* ((ua' * ub) * sb)', 2));
end
toc
I'd like to know if these operations are repeated for many times, which one is faster? I thought case 2 would be faster, but my test results are:
Elapsed time is 8.242850 seconds.
Elapsed time is 8.267731 seconds
which is roughly the same.
Since trace only cares about the diagonal values, and the nth element in the diagonal is the scalar product of the nth row of A with the nth column of B, is there any ways faster by only computing the elements on the diagonal?

#### 댓글 수: 2

Which one is faster is whichever one you find to be faster. It won't necessarily be the same on different computers or for different sizes of inputs etc etc.
You should use
doc timeit
to get a more accurate idea of the speed of different approaches. I have loads of quick script files in our repository in which I run tests on the speed of different ways to achieve the same thing using this. It's a lot more useful than asking others which is faster because it is often impossible to answer.
John D'Errico 21 Jun 2018
Feel free to close the question again as you wish, as that would be your choice. I added an answer which explains the issues that you need to consider, and also showed how to use timeit, instead of tic and toc.

로그인 to comment.

### 답변 수 (2)

John D'Errico 21 Jun 2018
Well, I could tell you, but then...
Seriously, I cannot tell you, because this is something I cannot know. Nor can anyone else really know, unless they have exactly the same computer that you have, set up the same way, same OS, etc.
You have already seen that it is quite close, almost insignificantly different. I would suggest that a better test is to start working with functions. LEARN TO USE FUNCTIONS, NOT SCRIPTS. Functions can be sometimes better optimized by MATLAB. MATLAB treats scripts differently than it does functions, since in the latter the parser can sometimes make optimizations that will speed up the code. Scripts don't get that treatment. And of course, these issues MAY change with every MATLAB release, or with possibly different versions of the BLAS, which could strongly impact your results, since this is all matrix related computation.
But that also gives you the capability to use timeit, instead of tic and toc. The timeit function tests these things much better, more accurately than tic and toc. Timeit worries about problems with warming up a function, making sure it is properly in the function cache. And of course, do NOT do things on the side while you test it, because then you take away processing capability from MATLAB. So avoid use of tic and toc if you can.
And, yes, I could do the same test for you, then use timeit. But I won't do so. Because my test would not be valid for you. I won't have the same CPU, the same version of MATLAB, possibly, the same OS release, the same amount of RAM, cache sizes, number of CPUs, etc. All these things MIGHT be pertinent.
Anyway, I might run the comparison like this:
ma = rand(30000, 100); % large, rectangular matrices.
mb = rand(30000, 100);
[ua, sa, va] = svd(ma, 0);
[ub, sb, vb] = svd(mb, 0);
% case 1: trace(a' * b)
timeit(@() trace((vb' * va) * sa' * (ua' * ub) * sb))
% case 2: sum(sum(a.*b', 2))
timeit(@() sum(sum(((vb' * va) * sa') .* ((ua' * ub) * sb)', 2)))
Again, be careful though. Since I do have multiple other processes running on my machine as I write this, along with my open web browser, I can get variable results. It looks like the second call may be slightly consistently faster, on MY machine, but by a difference that is on the order of 1% when I ran repeated tests.
That is not relevant though, and is why I will not even report the numbers that I get. RUN IT YOURSELF. Make sure the matrices are the same size that you will always have, because the BLAS will interact with cache sizes, with the number of CPUs that you have available. In fact, the exact size of an array will even be important for something as trivial as a matrix multiply.

#### 댓글 수: 0

로그인 to comment.

tic
for i=1:10000
end
toc
tic
for i=1:10000
end
toc
so you kann make a realistic Idea wich code is faster.

#### 댓글 수: 1

Xiaohan Du 21 Jun 2018
this answer is useless, besides, why so many typos?

로그인 to comment.

이 질문에 답변하려면 로그인을(를) 수행하십시오.