Can I speed up this matrix multiplication?

조회 수: 63 (최근 30일)
Zenan Li
Zenan Li 2021년 5월 7일
편집: Bruno Luong 2021년 5월 18일
Sorry for this stupid question. I know that the matrix multiplication has been highly optimized by Matlab, but I am still want to figure out that can I further speed up only for this special case?
I have no idea for this bottleneck in my code. The simplified code is uploaded in the following.
% X, Q are given symmetric matrix
[P, ~] = eig(X);
ans = P*(Q.*(P'*Y*P))*P'; % Y will change in each loop.
The X, Y, and Q are 1000x1000 matrices. In each loop, I have to compute four matrix multiplications,
But actually, P and Q are constant matrices in each loop, so I want to know what I could do to further speed up this code?
I have tried gpuArray, but it cannot be faster than normal Matlab * operation (maybe my GPU is not good enough..)
Any suggestion will be appreciated.
  댓글 수: 6
Matt J
Matt J 2021년 5월 10일
Is Q low rank, by any chance?
Zenan Li
Zenan Li 2021년 5월 10일
Thanks, it seems Q can be low rank..

댓글을 달려면 로그인하십시오.

채택된 답변

Matt J
Matt J 2021년 5월 10일
편집: Matt J 2021년 5월 10일
For example, the size of Q is 1000x1000, but its rank is 10.
If so, then we can decompose Q into a short sum where and are the non-zero eigenvalues and vectors respectively of Q. Then the computation becomes, with representing the Kronecker product and representing diag(v_i),
or
So, you can now pre-compute the matrices and reduce the number of matrix multiplications in each of the above terms to two. Also, you have a 16-core CPU, so each of the terms can be computed in parallel and summed using parfor.
  댓글 수: 10
Nguyen Van Hieu
Nguyen Van Hieu 2021년 5월 18일
My problem is very similar this question. I need to compute:
ans=A*P*A', where A is sparse, P is symetric and number of zero elements in P are 0.
How can I speed up the calculation using paralell toolbox
Bruno Luong
Bruno Luong 2021년 5월 18일
편집: Bruno Luong 2021년 5월 18일
MATLAB matruix multiplication is perhaps multi-threaded.
You won't be able to speed up with putting parrallel on the single processor (PC).

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Matt J
Matt J 2021년 5월 8일
편집: Matt J 2021년 5월 8일
I have tried gpuArray, but it cannot be faster than normal Matlab * operation (maybe my GPU is not good enough..)
Depending on your GPU, and your precision requirements, it may be worth pre-casting the matrices to single precision. Some GPUs are not well-optimized for double float operations. For example, on my machine with the GeForce GTX 1050,
dtype={'double','single'};
for i=1:2
[P,Q,Y]=deal(rand(1000,dtype{i}));
timeit(@() P*(Q.*(P'*Y*P))*P')
[P,Q,Y]=deal(gpuArray.rand(1000,dtype{i}));
gputimeit(@() P*(Q.*(P'*Y*P))*P.')
end
I obtain,
ans = %double CPU
0.0607
ans = %double GPU
0.1678
ans = %single CPU
0.0261
ans = %single GPU
0.0065
  댓글 수: 13
Zenan Li
Zenan Li 2021년 5월 10일
Thanks, and I find that although Q is not sparse but it is low rank. How could I do?
Zenan Li
Zenan Li 2021년 5월 10일
For example, the size of Q is 1000x1000, but its rank is 10.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 GPU Computing에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by