Massive slowdown for Apple Silicon in computing SVD
I recently noticed an extreme slowdown in my version of MATLAB when computing an SVD once the size of the matrix crosses some threshold. The following example demonstrates the issue:
N = [10000 11000 12000 13000];
for i = 1:4
    A = randn(N(i),3);
    tic;
    [U,S,V] = svd(A,0);
    toc;
end
When I run this in MATLAB R2024b (macOS, Apple silicon), the output is:
Elapsed time is 0.000396 seconds.
Elapsed time is 0.000275 seconds.
Elapsed time is 0.000264 seconds.
Elapsed time is 0.083150 seconds.
Of course the exact numbers vary from trial to trial, but the last run (where N = 13000) is consistently orders of magnitude slower.
When I run the same code in MATLAB R2024b (Intel build) on the same computer, the slowdown does not happen. I was able to replicate the issue on two different Macs (one with an M1 and another with an M3) and in different versions of MATLAB (going back to R2023b).
Any idea why this might be happening in the Apple silicon version?
Edit: I'm running macOS 15.1.1
Accepted Answer
Mike Croucher
5 Dec 2024
Edited: Mike Croucher
29 Jul 2025
Update: This has now been fixed as of R2025a Update 1.
Your script on my M2 in R2025a Update 1:
>> slowSVD
Elapsed time is 0.000368 seconds.
Elapsed time is 0.000253 seconds.
Elapsed time is 0.000261 seconds.
Elapsed time is 0.000290 seconds.
Compared to R2024b:
Elapsed time is 0.000408 seconds.
Elapsed time is 0.000397 seconds.
Elapsed time is 0.000278 seconds.
Elapsed time is 0.145339 seconds.
In both cases I ran the script twice and reported the second run's times to ensure that first-run costs are not included.
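If you would rather not rerun the script by hand, MATLAB's `timeit` handles warm-up and repeats the measurement for you. The following is only a sketch of how the benchmark could be written that way; it is not the script used for the numbers above, and the variable `t` is just an illustrative name.

```matlab
% Sketch: time the economy-size SVD with timeit instead of a single tic/toc.
N = [10000 11000 12000 13000];
t = zeros(size(N));
for i = 1:numel(N)
    A = randn(N(i), 3);
    % Request 3 outputs so that U, S and V are all computed,
    % as in the original [U,S,V] = svd(A,0) call.
    t(i) = timeit(@() svd(A, 0), 3);
end
for i = 1:numel(N)
    fprintf('N = %5d: %.6f s\n', N(i), t(i));
end
```

On an affected release, the N = 13000 entry would be expected to stand out in the same way as in the tic/toc output.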
Thanks again for reporting this.
[My original response is below]
Hi
I have reproduced your times on my M2 MacBook Pro in R2024b, using both the default BLAS and the Apple Silicon BLAS as described in my blog post https://blogs.mathworks.com/matlab/2023/12/13/life-in-the-fast-lane-making-matlab-even-faster-on-apple-silicon-with-apple-accelerate/
I am not sure what causes this, but I have reported it to development.
Thanks for the report.
Mike
More Answers (1)
Heiko Weichelt
21 Dec 2024
Thanks for reporting this.
We identified the problem and are working on improving this in a future release.
As a temporary workaround, we recommend replacing:
[U,S,V]=svd(A,0);
with
[Q,R]=qr(A,"econ"); [U,S,V]=svd(R); U=Q*U;
In general, this extra step is not needed because the SVD routine performs a QR factorization internally. However, the LAPACK library currently used on Apple Silicon had suboptimal tuning parameters for this case.
On my machine, the time for the largest example improved as follows:
>> tic; [U,S,V]=svd(A,0); toc
Elapsed time is 0.086851 seconds.
>> tic; [Q,R]=qr(A,"econ"); [U,S,V]=svd(R); U=Q*U; toc
Elapsed time is 0.000977 seconds.
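To convince yourself that the workaround returns the same factorization, a quick check along these lines may help. This is a minimal sketch: the variable names (`Ur`, `relErrSigma`, `relErrRecon`, `orthErr`) are only illustrative, and the matrix is the same random 13000-by-3 case as above.

```matlab
% Sketch: QR-then-SVD workaround, compared against the direct call.
A = randn(13000, 3);

[Q, R] = qr(A, "econ");   % Q is 13000-by-3, R is 3-by-3
[Ur, S, V] = svd(R);      % SVD of the small triangular factor
U = Q * Ur;               % map the left singular vectors back to the tall basis

% Reference: direct economy-size SVD (the slow path on affected releases).
[~, S0, ~] = svd(A, 0);

relErrSigma = norm(diag(S) - diag(S0)) / norm(diag(S0));  % singular values agree?
relErrRecon = norm(A - U*S*V', 'fro') / norm(A, 'fro');   % A = U*S*V' ?
orthErr     = norm(U'*U - eye(3));                        % U has orthonormal columns?

fprintf('sigma: %.2e, reconstruction: %.2e, orthogonality: %.2e\n', ...
    relErrSigma, relErrRecon, orthErr);
```

All three quantities should be close to machine precision. Individual columns of U and V may differ in sign from the direct call, which is why the comparison uses the singular values and the reconstruction rather than U itself.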
Comments: 3
Heiko Weichelt
26 Mar 2025
In the initial example, we also compute U, which has the same dimensions as A, i.e., tall and skinny. Your solution isn't computing that yet.
Furthermore, the condition number of A.'*A can be as bad as the square of the condition number of A itself, which can cause additional trouble for EIG. So I wouldn't advise this workaround as a general solution.
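The squaring of the condition number is easy to see on a constructed example. The sketch below builds a tall matrix with prescribed singular values (the values 1, 1e-3, 1e-6 and the 1000-by-3 size are arbitrary choices for illustration) and compares the two routes.

```matlab
% Sketch: cond(A.'*A) is roughly cond(A)^2 for a tall matrix.
rng(0);                                 % reproducible random factors
[Q, ~] = qr(randn(1000, 3), "econ");    % orthonormal columns, 1000-by-3
[W, ~] = qr(randn(3));                  % 3-by-3 orthogonal matrix
sigma  = [1; 1e-3; 1e-6];               % prescribed singular values
A = Q * diag(sigma) * W';

condA   = cond(A);                      % about 1e6
G       = A.'*A;
G       = (G + G.')/2;                  % enforce exact symmetry before eig
condAtA = cond(G);                      % about 1e12

% Smallest singular value via each route, relative to the prescribed value.
sSvd = svd(A);
sEig = sqrt(sort(eig(G), 'descend'));
relErrSvd = abs(sSvd(end) - sigma(end)) / sigma(end);
relErrEig = abs(sEig(end) - sigma(end)) / sigma(end);

fprintf('cond(A) = %.2e, cond(A''*A) = %.2e\n', condA, condAtA);
fprintf('rel. error in smallest sigma: svd %.2e, eig %.2e\n', relErrSvd, relErrEig);
```

The smallest singular value recovered through EIG would typically be noticeably less accurate than the one from SVD, which is the concern for ill-conditioned inputs. For well-conditioned A the difference is negligible.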
P Jeffrey Ungar
26 Mar 2025
Edited: P Jeffrey Ungar
26 Mar 2025
I neglected part of the solution. Yours is more complete, but the condition number is hardly a problem for a small number of vectors, even very long ones. My application is to get an orthonormal basis for a small set of long vectors that are guaranteed to be linearly independent. They are, in fact, a set of eigenvectors for a degenerate eigenvalue already obtained by the likes of eigs(), and these are not guaranteed to be orthogonal. The timings below show that finishing the work still gives much faster performance.
The performance of svd() right now in R2025a (prerelease) makes it virtually unusable. For laughs, give it a single vector of length 1000000 and watch it take 12 seconds on an M4 Max! I sincerely hope this problem is addressed properly by the time it is released.
>> A = randn(10000000,10);
tm = tic(); [V,S] = eig(A.'*A); U = A*V./sqrt(diag(S).'); delt=toc(tm)
delt =
0.1727
>> tm = tic(); [Q,R] = qr(A,"econ"); [U,~,~] = svd(R); U = Q*U; toc(tm);
Elapsed time is 0.390255 seconds.
>>
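For anyone who wants to check that both routes really produce an orthonormal basis, a sketch along these lines could be used. It uses a smaller matrix (100000-by-10, an arbitrary choice to keep memory modest) than the timing example above, and the names `Ueig`, `Uqr`, `orthEig` and `orthQR` are only illustrative.

```matlab
% Sketch: orthonormality of the basis from the Gram-matrix route
% versus the QR-then-SVD route.
rng(1);
A = randn(100000, 10);

% Gram-matrix route: eigenvectors of A.'*A, then scale the columns of A*V.
[V, D] = eig(A.'*A);
Ueig = A*V ./ sqrt(diag(D).');

% QR-then-SVD route.
[Q, R] = qr(A, "econ");
[Ur, ~, ~] = svd(R);
Uqr = Q*Ur;

orthEig = norm(Ueig'*Ueig - eye(10));
orthQR  = norm(Uqr'*Uqr  - eye(10));
fprintf('orthogonality error: eig route %.2e, qr route %.2e\n', orthEig, orthQR);
```

For a well-conditioned random matrix like this one, both errors should be tiny. The gap is expected to open up only as the columns of A become closer to linearly dependent.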