Speeding up integration within nested for loops
이 질문을 팔로우합니다.
- 팔로우하는 게시물 피드에서 업데이트를 확인할 수 있습니다.
- 정보 수신 기본 설정에 따라 이메일을 받을 수 있습니다.
오류 발생
페이지가 변경되었기 때문에 동작을 완료할 수 없습니다. 업데이트된 상태를 보려면 페이지를 다시 불러오십시오.
이전 댓글 표시
I have to find the elements of the following matrix, each element of which is

where K itself involves an integral
where f and q are known (I've left them out so as not to overload the question with math); they involve trigonometric functions of theta and square roots of theta,x,x'.
To find J I tried two approaches:
1) Numerical integration: I defined K(x,x') as K = @(x,x') integral(@theta ..Integrand(theta,x,x').., 0, pi)
and J = @(x,limit1,limit2) integral(@(x') K(x,x'), limit1, limit2) and evaluate them as
for i = 1:numel(x)
for j = 1:numel(x)
J_arr(i,j) = (1/(x(j+1) - x(j)))*J(x(i),x_arr(j),x_arr(j+1));
end
end
This works but the issue is that it is too slow for my purpose - the array x has 2047 elements so this script is carrying out both integrals 2047*2047 times. Is there a faster way to do this integration numerically? (Have thought along the lines of integral2, arrayfun so far).
2) Symbolic integration: Define K and J as above, but try to symbolically integrate them before the loops so that inside the loop I would just be substituting values for x_i and the limits. The issue with this is that the functions f and q in the definition of K are sufficiently complicated that Matlab has trouble finding the analytic symbolic integral, even for the first inner (theta) integration.
Any help in speeding up the numerical calculation/implementing a symbolic integration would be much appreciated!
Thanks
채택된 답변
Johan
2022년 6월 20일
One trick I know to increase speed of calculation is to vectorize your code. I'm not sure that would work for you but you could try using the ArrayValued option of the integral function. Also it looks like the constant factor in Jij can be computed as an array once and multiplied with the result of the integral afterward.
x = rand(10,1);
Kint = @(x1,x2) integral(@(theta) sin(theta+x2+x1), 0, pi,'ArrayValued',true); %I put in a filler function here
Jint = @(x1,limit1,limit2) integral(@(x2) Kint(x1,x2), limit1, limit2,'ArrayValued',true);
prefactor = ones(numel(x),1).*(1./(x(2:end) - x(1:end-1)))'; %Compute prefactor for all ij
J_arr = zeros(numel(x),numel(x)-1); %initialize result array
for iJ = 1:numel(x)-1
J_arr(:,iJ) = Jint(x, x(iJ), x(iJ+1));
end
J_arr.*prefactor
ans = 10×9
1.5154 1.2476 1.0782 1.3314 1.3181 0.7666 1.1930 1.3678 1.4834
0.9472 0.5979 0.3974 0.7129 0.6969 0.0473 0.5474 0.7552 0.9151
0.9196 0.5679 0.3666 0.6839 0.6678 0.0159 0.5177 0.7263 0.8877
0.5684 0.1938 -0.0125 0.3196 0.3030 -0.3635 0.1486 0.3628 0.5382
1.4513 1.1705 0.9956 1.2592 1.2454 0.6764 1.1160 1.2966 1.4191
0.5364 0.1604 -0.0459 0.2869 0.2703 -0.3964 0.1158 0.3300 0.5064
0.2857 -0.0973 -0.3023 0.0330 0.0164 -0.6450 -0.1375 0.0756 0.2576
1.4131 1.1251 0.9472 1.2165 1.2025 0.6240 1.0707 1.2545 1.3808
0.7013 0.3336 0.1283 0.4563 0.4399 -0.2241 0.2864 0.4994 0.6703
1.6477 1.4118 1.2564 1.4840 1.4719 0.9648 1.3574 1.5176 1.6167
Hope this helps,
댓글 수: 5
Aravindh Shankar
2022년 6월 20일
편집: Aravindh Shankar
2022년 6월 20일
Hello Johan, thanks for the answer. I tried implementing your suggestion but observed that doing it this way is actually slower. Comparison results:
- Doing it my way with nested for loops: First outer loop iteration (i = 1, inner j loop executed ~2000 times) took 71.986 seconds, so the entire script would take ~2000*72 seconds to finish.
- Doing it your way with the single for loop: First loop iteration (iJ = 1) took 178.845 seconds, so the entire script would take ~2000*179 seconds to finish.
So there seems to be a ~2.5x slowdown when changing it to what you suggested. I suspect that the prefactor simplification is not really significant and that ArrayValued is implicitly doing an additional loop anyway so that could be why it is slower that way.
Are you sure about that ? When I compare the two method on small arrays the nested loops is slower than the arrayed method, and it looks like it grows worse the larger the input array.
Could you try to time the full calculation on a sub sample of your X array input ? Also posting the code you use for the computation would be helpful.
X = rand(5,1);
fun1 = @() arrayed(X);
fun2 = @() nested(X);
timed(1,1) = timeit(fun1);
timed(2,1) = timeit(fun2);
X = rand(10,1);
fun1 = @() arrayed(X);
fun2 = @() nested(X);
timed(1,2) = timeit(fun1);
timed(2,2) = timeit(fun2);
hold on
plot([5,10],timed(1,:),'-x','DisplayName','Arrayed')
plot([5,10],timed(2,:),'-x','DisplayName','Nested')
hold off
legend('location','northwest')
xlabel('Input array size')
ylabel('Average computing time (s)')

function arrayed(x)
Kint = @(x1,x2) integral(@(theta) sin(theta).*cos(x2)./sqrt(x1), 0, pi,'ArrayValued',true);
Jint = @(x1,limit1,limit2) integral(@(x2) Kint(x1,x2), limit1, limit2,'ArrayValued',true);
J_arr = zeros(numel(x)-1,numel(x));
for iJ = 1:numel(x)-1
J_arr(:,iJ) = (1./(x(2:end) - x(1:end-1))).*Jint(x(1:end-1), x(iJ), x(iJ+1));
end
end
function nested(x)
Kint = @(x1,x2) integral(@(theta) sin(theta).*cos(x2)./sqrt(x1), 0, pi);
Jint = @(x1,limit1,limit2) integral(@(x2) Kint(x1,x2), limit1, limit2,'ArrayValued',true);
J_arr = zeros(numel(x)-1,numel(x));
for iI = 1:numel(x)
for iJ = 1:numel(x)-1
J_arr(iI,iJ) = (1./(x(iJ+1) - x(iJ))).*Jint(x(iI), x(iJ), x(iJ+1));
end
end
end
First part of code is same as yours, with the array X taken to be -0.999:0.451:1 first and then X = -0.999:0.251:1, so the input array size is 5 and 8. Second part of code is attached below. Note: The x_i_bar thing is a small correction to my original post (I had left it out for the sake of readability).
Sorry for the delay in replying to you. I had been suffering from COVID last week, just recovered.
Kindly let me know your thoughts on why this behavior is different from your code.

function nested(x)
%Constants
mstar = 1;
kappa = 1;
g_v = 1;
el = 1.38*1e4;
r_s = 5;
n = mstar^2*el^4/(pi*kappa^2*r_s^2);
k_F = sqrt(2*pi*n);
E_F = (k_F^2)/(2*mstar);
q_TF = 2*g_v*el^2*mstar/kappa;
q = @(theta,x_1,x_2) (sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - (4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))));
w_p = @(theta,x_1,x_2) (sqrt(2*pi*el^2*n*q(theta,x_1,x_2)/(kappa*mstar)));
w_pp = @(theta,x_1,x_2) (w_p(theta,x_1,x_2).*sqrt(1 + (q(theta,x_1,x_2)/q_TF)));
theta_integrand = @(theta,x_1,x_2) ((mstar/(2*pi^2)).*(2*pi*el^2./(kappa.*q(theta,x_1,x_2))).*(1 - (w_p(theta,x_1,x_2).^2./(w_pp(theta,x_1,x_2).*(w_pp(theta,x_1,x_2) + E_F*abs(x_1) + E_F.*abs(x_2))))));
Kint = @(x_1,x_2) (integral(@(theta) theta_integrand(theta,x_1,x_2),0,pi));
Jint = @(x_1,lim1,lim2) (integral(@(x_2) Kint(x_1,x_2),lim1,lim2,'ArrayValued',true));
J_arr = zeros(numel(x)-1);
for i = 1:length(x)-1
x_i_bar = (x(i) + x(i+1))/2;
for j = 1:numel(x)-1
J_arr(i,j) = (1./(x(j+1) - x(j))).*(Jint(x_i_bar,x(j),x(j+1)));
end
end
end
function arrayed(x)
%Constants
mstar = 1;
kappa = 1;
g_v = 1;
el = 1.38*1e4;
r_s = 5;
n = mstar^2*el^4/(pi*kappa^2*r_s^2);
k_F = sqrt(2*pi*n);
E_F = (k_F^2)/(2*mstar);
q_TF = 2*g_v*el^2*mstar/kappa;
q = @(theta,x_1,x_2) (sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - (4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))));
w_p = @(theta,x_1,x_2) (sqrt(2*pi*el^2*n*q(theta,x_1,x_2)/(kappa*mstar)));
w_pp = @(theta,x_1,x_2) (w_p(theta,x_1,x_2).*sqrt(1 + (q(theta,x_1,x_2)/q_TF)));
theta_integrand = @(theta,x_1,x_2) ((mstar/(2*pi^2)).*(2*pi*el^2./(kappa.*q(theta,x_1,x_2))).*(1 - (w_p(theta,x_1,x_2).^2./(w_pp(theta,x_1,x_2).*(w_pp(theta,x_1,x_2) + E_F*abs(x_1) + E_F.*abs(x_2))))));
Kint = @(x_1,x_2) (integral(@(theta) theta_integrand(theta,x_1,x_2),0,pi,'ArrayValued',true));
Jint = @(x_1,lim1,lim2) (integral(@(x_2) Kint(x_1,x_2),lim1,lim2,'ArrayValued',true));
J_arr = zeros(numel(x)-1);
x_i_bar = (x(1:end-1) + x(2:end))/2;
for j = 1:numel(x)-1
J_arr(:,j) = (1./(x(2:end)-x(1:end-1))).*Jint(x_i_bar,x(j),x(j+1));
end
end
The integral function seems to have trouble reaching its default tolerance value for your function, I reduced the tolerance to 1e-4 I don't know if that is satisfactory for you or not.
With larger tolerance, your code is still faster for nested than for arrayed for small input size, however, I see the nested way seems to increase quadraticaly with input size whereas the arrayed increases somewhat linearly:

A small optimisation I did was un-nest your functions in the Kint integration. I noticed Matlab tends to slow down then you have function calling function I'm not sure why. Its much less readable and you should be extra careful the equation is right but that leads to faster computation.
step = [0.451,0.251,0.151,0.1];
timed = zeros(2,2);
N = zeros(2,1);
for i_step = 1:length(step)
X = -0.999:step(i_step):1;
%evaluate only once because of computation time limit
tic
nested(X);
timed(i_step, 1) = toc;
tic
arrayed(X);
timed(i_step, 2) = toc;
% Version I used on my computer for the graph above
% funnest = @() nested(X);
% funarr = @() arrayed(X);
% timed(i_step, 1) = timeit(funnest);
% timed(i_step, 2) = timeit(funarr);
N(i_step) = size(X,2);
end
plot(N,timed,'-*')
legend({'nested','arrayed'})
xlabel('X array size')
ylabel('Computing time (s)')
axis square

function nested(x)
%Constants
mstar = 1;
kappa = 1;
g_v = 1;
el = 1.38*1e4;
r_s = 5;
n = mstar^2*el^4/(pi*kappa^2*r_s^2);
k_F = sqrt(2*pi*n);
E_F = (k_F^2)/(2*mstar);
q_TF = 2*g_v*el^2*mstar/kappa;
Kint = @(x_1,x_2) (integral(@(theta) ((mstar/(2*pi^2)).*...
(2*pi*el^2./(kappa.*(sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - (4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta)))))).*...
(1 - ((sqrt(2*pi*el^2*n*(sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) -...
(4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))))/(kappa*mstar))).^2 ...
./(((sqrt(2*pi*el^2*n*(sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) -...
(4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))))/(kappa*mstar))).*...
sqrt(1 + ((sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - (4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))))/q_TF))).*...
(((sqrt(2*pi*el^2*n*(sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - (4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*...
cos(theta))))/(kappa*mstar))).*sqrt(1 + ((sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - ...
(4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))))/q_TF))) + E_F*abs(x_1) + E_F.*abs(x_2))))))...
,0,pi,'ArrayValued',true,'RelTol',0,'AbsTol',1e-4));
Jint = @(x_1,lim1,lim2) (integral(@(x_2) Kint(x_1,x_2),lim1,lim2,'ArrayValued',true,'RelTol',0,'AbsTol',1e-4));
J_arr = zeros(numel(x)-1);
for i = 1:length(x)-1
x_i_bar = (x(i) + x(i+1))/2;
for j = 1:numel(x)-1
J_arr(i,j) = (1./(x(j+1) - x(j))).*(Jint(x_i_bar,x(j),x(j+1)));
end
end
end
function arrayed(x)
%Constants
mstar = 1;
kappa = 1;
g_v = 1;
el = 1.38*1e4;
r_s = 5;
n = mstar^2*el^4/(pi*kappa^2*r_s^2);
k_F = sqrt(2*pi*n);
E_F = (k_F^2)/(2*mstar);
q_TF = 2*g_v*el^2*mstar/kappa;
Kint = @(x_1,x_2) (integral(@(theta) ((mstar/(2*pi^2)).*...
(2*pi*el^2./(kappa.*(sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - (4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta)))))).*...
(1 - ((sqrt(2*pi*el^2*n*(sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) -...
(4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))))/(kappa*mstar))).^2 ...
./(((sqrt(2*pi*el^2*n*(sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) -...
(4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))))/(kappa*mstar))).*...
sqrt(1 + ((sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - (4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))))/q_TF))).*...
(((sqrt(2*pi*el^2*n*(sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - (4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*...
cos(theta))))/(kappa*mstar))).*sqrt(1 + ((sqrt((2*mstar*E_F*(x_1 + x_2 + 2)) - ...
(4*mstar*E_F*sqrt(x_1 + 1)*sqrt(x_2 + 1).*cos(theta))))/q_TF))) + E_F*abs(x_1) + E_F.*abs(x_2))))))...
,0,pi,'ArrayValued',true,'RelTol',0,'AbsTol',1e-4));
Jint = @(x_1,lim1,lim2) (integral(@(x_2) Kint(x_1,x_2),lim1,lim2,'ArrayValued',true,'RelTol',0,'AbsTol',1e-4));
J_arr = zeros(numel(x)-1);
x_i_bar = (x(1:end-1) + x(2:end))/2;
for j = 1:numel(x)-1
J_arr(:,j) = (1./(x(2:end)-x(1:end-1))).*Jint(x_i_bar,x(j),x(j+1));
end
end
Aravindh Shankar
2022년 7월 5일
편집: Aravindh Shankar
2022년 7월 5일
I agree with all your comments Johan, glad that it has been improved. Thanks for your help with optimizing the code, I'm really grateful for your effort!
As a future step I will try parallelizing using parfor once I have server access to see if that improves speed as well.
추가 답변 (0개)
카테고리
도움말 센터 및 File Exchange에서 Numerical Integration and Differentiation에 대해 자세히 알아보기
참고 항목
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!웹사이트 선택
번역된 콘텐츠를 보고 지역별 이벤트와 혜택을 살펴보려면 웹사이트를 선택하십시오. 현재 계신 지역에 따라 다음 웹사이트를 권장합니다:
또한 다음 목록에서 웹사이트를 선택하실 수도 있습니다.
사이트 성능 최적화 방법
최고의 사이트 성능을 위해 중국 사이트(중국어 또는 영어)를 선택하십시오. 현재 계신 지역에서는 다른 국가의 MathWorks 사이트 방문이 최적화되지 않았습니다.
미주
- América Latina (Español)
- Canada (English)
- United States (English)
유럽
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
