Hello,
Is there a way to speed this up? This if within a function and i can use codegen - and I do. But I need to speed this up more. I think the problem is that one variable is huge and the access to it takes long? What should be done in such a case?
xElements = 1201;
maxN = 100;
umnHolder = complex(zeros(maxN + 1, maxN + 1));
betaSumSq1 = zeros(xElements, xElements); % preallocate
besselsFisher = zeros(1201, 1201, 101); % just to show the size LARGE, ~780 MB
XY = zeros(xElements, xElements); % just to show the size
acosContainer = XY; % just to show the size
parfor i = 1 : xElements
for j = 1 : xElements
umn = umnHolder;
for n = 0:maxN
mm = 1;
for m = -n:2:n
nn = n + 1; % for indexing
if m > 0
umn(nn, mm) = sqrt(n+1) * XY(i, j) * besselsFisher(i, j, nn) * cos( abs(m)*acosContainer(i, j) );
end
if m < 0
umn(nn, mm) = sqrt(n+1) * XY(i, j) * besselsFisher(i, j, nn) * sin( abs(m)*sign(x(i))*acosContainer(i, j) );
end
if m == 0
umn(nn, mm) = sqrt(n+1) * XY(i, j) * besselsFisher(i, j, nn);
end
mm = mm + 1;
end % m
end % n
beta1 = sum(sum(Aj1.*umn));
betaSumSq1(i, j) = abs(beta1).^2;
beta2 = sum(sum(Aj2.*umn));
betaSumSq2(i, j) = abs(beta2).^2;
end % j
end % i
Best regards, Alex

댓글 수: 4

Walter Roberson
Walter Roberson 2016년 7월 21일
Any hope for a meaningful speed-up is probably going to depend upon the extent to which the unknown function besselsFisher can be vectorized. I do not find that routine anywhere the internet.
Alex Kurek
Alex Kurek 2016년 7월 22일
besselsFisher i something I wrote. It is precomputed before, so in the loop I only refer to already computed values.
Thorsten
Thorsten 2016년 7월 22일
편집: Thorsten 2016년 7월 22일
The first step before optimising would be to identify where most of the time is spent using profile.
Most of the time is spent here:
umn(nn, mm) = sqrt(n+1) * XY(i, j) * besselsFisher(i, j, nn) * cos( abs(m)*acosContainer(i, j) );
umn(nn, mm) = sqrt(n+1) * XY(i, j) * besselsFisher(i, j, nn) * sin( abs(m)*sign(x(i))*acosContainer(i, j) );
This makes sense, since if you multiply
besselsFisher = zeros(1201, 1201, 101);
by e.g. 2 it takes ~1.6 seconds.

댓글을 달려면 로그인하십시오.

 채택된 답변

Walter Roberson
Walter Roberson 2016년 7월 22일

0 개 추천

You should factor out common sub-expressions. acosContainer(i, j) is the same for all m and n so assign it to a variable outside the m loop. Taking abs(m) is a waste of time when you know that m > 0 . sign(x(1)) is the same for all j, m, n so assign it to a variable. Multiplying by sign(x(1)) is done for the vector -n to -1 so you can vectorize to precalculate, sin((-n : 2 : -1) .* sign(x(1)) .* acosContainer(i, j)); you can probably vectorize the rest of that case as well.
All of the cases for any one n are multiplied by sqrt(n+1) so hold off on that multiplication until you have done the entire set of m values, and then multiply them all by sqrt(n+1) to get economy of scale.
And so on.

추가 답변 (0개)

카테고리

도움말 센터File Exchange에서 Execution Speed에 대해 자세히 알아보기

제품

질문:

2016년 7월 21일

답변:

2016년 7월 22일

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by