How can I speed up this function? #vectorize

Fredrik P on 12 Jan 2024
Edited: Fredrik P on 14 Jan 2024
I'm struggling to speed up the code inside the version1 function. The only vectorization I have managed to come up with is version2, but that actually makes the function slower. Help would be much appreciated :-)
n1 = 400;
n2 = 3;
n3 = 7;
n4 = 2;
A = rand(n1, n2, n3);
B = randi(n2, [n4, n1, n3]);
% C is allocated inside each function, so it is not passed in.
disp(timeit(@() version1(A, B, n1, n2, n3, n4)));
disp(timeit(@() version2(A, B, n1, n2, n3, n4)));
function C = version1(A, B, n1, n2, n3, n4)
    % Fully scalar triple loop.
    C = zeros(n4, n1, n3);
    for ii = 1:n3
        for jj = 1:n1
            for kk = 1:n4
                C(kk, jj, ii) = A(jj, B(kk, jj, ii), ii);
            end
        end
    end
end
function C = version2(A, B, n1, n2, n3, n4)
    % Innermost loop replaced by a vector of column indices.
    C = zeros(n4, n1, n3);
    for ii = 1:n3
        for jj = 1:n1
            C(:, jj, ii) = A(jj, B(:, jj, ii), ii);
        end
    end
end

Accepted Answer

Matt J on 12 Jan 2024
Edited: Matt J on 13 Jan 2024
I doubt a vectorized solution would be faster than the loop, but here is one way to vectorize it.
[K,J,I]=ndgrid(1:n4,1:n1,1:n3); %Recycle this, if possible
Bvals=B(sub2ind(size(B),K,J,I));
C=A( sub2ind(size(A), J,Bvals,I ));
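As a quick sanity check, the following minimal sketch compares the linear-index expression above with the triple loop from the question, using the same random test sizes. The names Cloop and Cvec are introduced here for illustration only. It also checks an observation that is not stated in the thread: because K, J and I enumerate every subscript of B, Bvals should come out identical to B.

```matlab
% Sizes and random test data as in the question.
n1 = 400; n2 = 3; n3 = 7; n4 = 2;
A = rand(n1, n2, n3);
B = randi(n2, [n4, n1, n3]);

% Reference result from the triple loop (version1 in the question).
Cloop = zeros(n4, n1, n3);
for ii = 1:n3
    for jj = 1:n1
        for kk = 1:n4
            Cloop(kk, jj, ii) = A(jj, B(kk, jj, ii), ii);
        end
    end
end

% Vectorized result via linear indices, as in the answer.
[K, J, I] = ndgrid(1:n4, 1:n1, 1:n3);
Bvals = B(sub2ind(size(B), K, J, I));
Cvec = A(sub2ind(size(A), J, Bvals, I));

% The grid covers all of B, so the lookup returns B unchanged.
assert(isequal(Bvals, B));
% Both approaches copy the same elements of A, so the match is exact.
assert(isequal(Cvec, Cloop));
```

If both assertions pass, the two approaches are interchangeable for this data, and any choice between them comes down to timing.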
  Comments: 4
Matt J on 13 Jan 2024
Edited: Matt J on 13 Jan 2024
No, I would do this instead:
function C = version3(A, B, n1, n2, n3, n4)
    [K,J,I]=ndgrid(1:n4,1:n1,1:n3); %Recycle this, if possible
    Bvals=B(sub2ind(size(B),K,J,I));
    C=A( sub2ind(size(A), J,Bvals,I ));
end
It doesn't matter though. Surely you've found by now that the for-loop is fastest.
Fredrik P on 13 Jan 2024
Edited: Fredrik P on 14 Jan 2024
For me, your latest code is a lot faster. Perhaps it comes down to my actual data not being random. But more likely, the code I initially posted was a bit unfair. There, I preallocated the C matrix outside the functions, but the preallocation really ought to be done inside the functions version1 and version2, as your version3 does not need any preallocation. Also, there is not really just one A matrix and one C matrix, but rather three of each: A1, A2, and A3 and C1, C2, and C3, which can all use a single variable indices=sub2ind(size(A), J,Bvals,I ). Perhaps I was too eager to minimize my minimal working example :-/
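The reuse described in this comment might look roughly like the sketch below. It assumes A1, A2 and A3 all have the same size, fills them with random placeholder data, and passes B directly as the middle subscript instead of going through Bvals; that last step is a simplification introduced here, not something taken from the thread.

```matlab
% Sizes as in the question; the three A arrays are random placeholders.
n1 = 400; n2 = 3; n3 = 7; n4 = 2;
B  = randi(n2, [n4, n1, n3]);
A1 = rand(n1, n2, n3);
A2 = rand(n1, n2, n3);
A3 = rand(n1, n2, n3);

% Build the linear indices once. Only the row (J) and page (I) grids are
% needed, because B already holds the column subscript at every position.
[~, J, I] = ndgrid(1:n4, 1:n1, 1:n3);
indices = sub2ind(size(A1), J, B, I);

% Reuse the same indices for every array of that size.
C1 = A1(indices);
C2 = A2(indices);
C3 = A3(indices);

% Spot check one element against the definition used in the loop version.
assert(C1(2, 5, 3) == A1(5, B(2, 5, 3), 3));
```

With this layout the cost of ndgrid and sub2ind is paid once, and each further array costs only a single indexing operation.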


More Answers (1)

Hassaan on 12 Jan 2024
It seems that the indexing in this specific case is not easily amenable to vectorization without running into shape mismatches or broadcasting issues. It might be more practical to focus on optimizing the original loop-based implementations (version1 and version2). Often the simplest solution is the most efficient, especially if the code is already well optimized and the added complexity does not bring significant performance gains.
If performance is critical, and these functions are a bottleneck in your application, you might consider other strategies, such as:
  1. Profiling: Use MATLAB's built-in profiling tools to identify the exact bottlenecks in your code.
  2. Parallel Computing: If you have access to MATLAB's Parallel Computing Toolbox, you might gain performance by distributing some of these operations across multiple cores or GPUs.
  3. Compiled Languages: For the most intensive computational tasks, rewriting the critical parts in a compiled language like C++ and integrating it with MATLAB might be beneficial.



Release

R2023b
