Can I speed up this matrix multiplication?
조회 수: 63 (최근 30일)
이전 댓글 표시
Sorry for this stupid question. I know that the matrix multiplication has been highly optimized by Matlab, but I am still want to figure out that can I further speed up only for this special case?
I have no idea for this bottleneck in my code. The simplified code is uploaded in the following.
% X, Q are given symmetric matrix
[P, ~] = eig(X);
ans = P*(Q.*(P'*Y*P))*P'; % Y will change in each loop.
The X, Y, and Q are 1000x1000 matrices. In each loop, I have to compute four matrix multiplications,
But actually, P and Q are constant matrices in each loop, so I want to know what I could do to further speed up this code?
I have tried gpuArray, but it cannot be faster than normal Matlab * operation (maybe my GPU is not good enough..)
Any suggestion will be appreciated.
댓글 수: 6
채택된 답변
Matt J
2021년 5월 10일
편집: Matt J
2021년 5월 10일
For example, the size of Q is 1000x1000, but its rank is 10.
If so, then we can decompose Q into a short sum where and are the non-zero eigenvalues and vectors respectively of Q. Then the computation becomes, with ⊗ representing the Kronecker product and representing diag(v_i),
or
So, you can now pre-compute the matrices and reduce the number of matrix multiplications in each of the above terms to two. Also, you have a 16-core CPU, so each of the terms can be computed in parallel and summed using parfor.
댓글 수: 10
Nguyen Van Hieu
2021년 5월 18일
My problem is very similar this question. I need to compute:
ans=A*P*A', where A is sparse, P is symetric and number of zero elements in P are 0.
How can I speed up the calculation using paralell toolbox
Bruno Luong
2021년 5월 18일
편집: Bruno Luong
2021년 5월 18일
MATLAB matruix multiplication is perhaps multi-threaded.
You won't be able to speed up with putting parrallel on the single processor (PC).
추가 답변 (1개)
Matt J
2021년 5월 8일
편집: Matt J
2021년 5월 8일
I have tried gpuArray, but it cannot be faster than normal Matlab * operation (maybe my GPU is not good enough..)
Depending on your GPU, and your precision requirements, it may be worth pre-casting the matrices to single precision. Some GPUs are not well-optimized for double float operations. For example, on my machine with the GeForce GTX 1050,
dtype={'double','single'};
for i=1:2
[P,Q,Y]=deal(rand(1000,dtype{i}));
timeit(@() P*(Q.*(P'*Y*P))*P')
[P,Q,Y]=deal(gpuArray.rand(1000,dtype{i}));
gputimeit(@() P*(Q.*(P'*Y*P))*P.')
end
I obtain,
ans = %double CPU
0.0607
ans = %double GPU
0.1678
ans = %single CPU
0.0261
ans = %single GPU
0.0065
참고 항목
카테고리
Help Center 및 File Exchange에서 GPU Computing에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!