필터 지우기
필터 지우기

codes runs too long

조회 수: 1 (최근 30일)
zamri
zamri 2013년 1월 5일
Hi, i wonder if there is somebody who can help. I have this program where it calculates pressure inside a cavity. For normal case, the program just took 30 minutes to give result. I did modification to the program where the concerned codes a shown below, could this codes are the culprit where it took 15 hrs to run and give result. a, b, c, d are bessel functions that i dont bother to write here. Note: this is just partial of the program, i doubt other part is the reason because i only change numeric values for them.
HHvp=1e-10;
HHvp1=1e-10;
HHvp2=1e-10;
HHvp3=1e-10;
HHvp4=1e-10;
HHvp5=1e-10;
for m1=1:5
for n1=1:5
for l1=1:5
omega_n(m1,n1,l1)=m1*n1*l1 % a lot more to this, to long to write)
end
end
end
for n1=1:5
for l1=1:5
HHvp1= HHvp1 + a*n1*l1
end
end
for n1=1:5
for l1=1:5
HHvp2= HHvp2 + b*n1*l1
end
end
for n1=1:5
for l1=1:5
HHvp3= HHvp3 + c*n1*l1
end
end
for n1=1:5
for l1=1:5
HHvp4= HHvp4 + d*n1*l1
end
end
HHvp=HHvp1+HHvp2+HHvp3+HHvp4;
Hvp(mmm2,nnn2,lll2,num2)= HHvp;
[EDITED, Jan, code formatted]
  댓글 수: 4
Azzi Abdelmalek
Azzi Abdelmalek 2013년 1월 5일
Simon, I did'nt see the comment, maybe because the question was not well formated. But what is the aim to post a part of code that took 0.02 s?
Jan
Jan 2013년 1월 6일
@Azzi: I hope that the OP will explain this.

댓글을 달려면 로그인하십시오.

채택된 답변

Jan
Jan 2013년 1월 6일
편집: Jan 2013년 1월 6일
In reply to your posted code:
When you use the profile to find the most time consuming lines, a standard procedure gets obvious: Avoid repeated calculations inside a loop. Here the Bessel function waste a lot of time and using temporary variables accelerate the code by a factor > 100:
r=1e-10+(lll2-1)*dr; % After this line insert this:
tmp1 = 7.7*besselj(0,48.3*r)-0.32*bessely(0,48.3*r);
tmp2 = -22*besselj(0,96.7*r)-18.2*bessely(0,96.7*r);
tmp3 = -3*besselj(0,145*r)-17.5*bessely(0,145*r);
tmp4 = -0.8*besselj(0,193*r)-1.6*bessely(0,193*r);
tmp5 = -1.8*besselj(0,242*r)-0.37*bessely(0,242*r);
% And now replace the correspondig calculations by "tmpX" using the
% Replace dialog of the editor.
Btw., the code is extremely ugly. A substantial cleanup would at first increase the speed and at second allow a successful debugging. Most of all consider the MLint warnings (see the orange bars in the Editor). Then add comments and split the very large lines into readable pieces. Then it will be much easier to identify the parts, which are calculated repeatedly. Finally I guess, that a speedup of factor 1000 is possible without any magic or sophisticated tricks.
  • Is sqrt(m1*pi/(R2-R1))^2 really wanted?
  • In the calculation of Hvp_0_1_0 you assume that n1 has the last value from the last FOR loop, which uses n1 as counter. Better define n1 and l1 explicitly outside the loops.
  • Is "3.142" a crude approximation of pi? If so, don't do it. Never loose accuracy, when there is no benefit.
  댓글 수: 3
zamri
zamri 2013년 1월 6일
dear Jan,
thanks so much for your advise, i was able to run it in 5 minutes. I appreciate it so much. Sorry for being like 5 yr old on this.
Jan
Jan 2013년 1월 6일
편집: Jan 2013년 1월 6일
I've replaced some constants in the loops and abbreviated whatever I found logical. Of course I cannot debug the results, so I leave this up to you. Some details look strange, like the mentioned sqrt(x)^2.
With less redundancies, the code runs in 100 seconds. Perhaps the code is even less readable as before, sorry. When this code is used for serious science, I would suggest to rewrite it from scratch after mathematical simplifications.
clc
%Torus Dimension
R2=0.275;
R1=0.21;
W=0.205;
ht=0.015;
Theta=2*pi;
%Young's Modulus
E=7.5E+7;
% Poisson's Ratio
GAMA=0.40;
% Plate density
Lo=1350;
Lo0=1.3;
%Plate loss factor
Eta1=0.03;
Eta2=0.03;
c=343;
%excitation force
%F=F0*sin(2*pi*f);
F0=1;
% Excitation force location
thetaf=0;
zf=W/2;
%Plate longitudinal wave velocity
cLt=sqrt(E/Lo/(1-GAMA^2));
% Plate Bending Stiffness
B=E*ht^3/(12*(1-GAMA^2));
dr0=(R2-R1)/20
dtheta0=2*pi/20;
dz0=W/20;
df=100/100;
dr=(R2-R1)/2;
dtheta=2*pi/3;
dz=W/2;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
HFv=zeros(3,3,51);
HFvv=zeros(3,3,51);
Hvp=zeros(3,3,3,51);
CHvp=zeros(3,3,3,51);
omega_mn1=zeros(5,5);
omega_mn=zeros(5,5);
p_omega=zeros(3,3,3,51);
mean_p2=zeros(3,3,3,51);
omega_n=zeros(5,5,5);
Freq=zeros(51,1);
tmp6 = (1-GAMA^2);
h = waitbar(0,'Please wait...');
for num2 = 200:250
disp(num2);
f=1e-10+(num2-1)*df;
omega=2*3.1415928*f;
omega2 = omega ^ 2;
Freq(num2,1)=f;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
coef_Hvp=0.0005i*(Theta/2)*(W/2)*omega*c^2*Lo0;
coef_HFv=2i*omega*F0/(Lo*ht*pi*W);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
coef_1_0_0=0.105i*Theta*W*omega*c^2*Lo0;
coef_0_1_0=0.041i*(Theta/2)*W*omega*c^2*Lo0;
coef_0_0_1=0.0002i*Theta*(W/2)*omega*c^2*Lo0;
coef_1_1_0=2.3i*(Theta/2)*W*omega*c^2*Lo0;
coef_0_1_1=0.0002i*(Theta/2)*(W/2)*omega*c^2*Lo0;
coef_1_0_1=0.0006i*Theta*(W/2)*omega*c^2*Lo0;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
EEc=1e-10;
EEp=1e-10;
for mmm2=1:3
theta=1e-10+(mmm2-1)*dtheta;
for nnn2=1:3
z=1e-10+(nnn2-1)*dz;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
HFFVV=1e-10;
gama=W*sqrt(ht)/((R2)^(3/2));
q2 = (1/sqrt(tmp6*Lo/E));
for m=1:5
q1 = sin(m*pi*z/W);
q3 = sin(m*pi*zf/W);
for n=1:5
omega_mn1(mmm2,nnn2)=sqrt(tmp6*((1+(m*3.142*R2/(n*W))^2)^-2)*(m*3.142/(n*gama)^4)+ ...
(1+(m*3.142*R2/(n*W))^2)^2*(n^4)/12);
omega_mn(mmm2,nnn2)=omega_mn1(mmm2,nnn2)*(ht)/R2^2*q2;
HFFVV=HFFVV+q3*sin(n*pi*thetaf/Theta)*q1*sin(n*pi*theta/Theta)/ ...
(omega_mn(m,n)^2-omega2+1i*Eta1*omega_mn(m,n)^2);
end
end
HFvv(mmm2,nnn2,num2)=coef_HFv*HFFVV;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
q3 = (ht)/R2^2*(1/sqrt((tmp6)*Lo/E));
gama=W*sqrt(ht)/((R2)^(3/2));
q5 = sin((1:5)*pi*thetaf/Theta) .* sin((1:5)*pi*theta0/Theta);
for lll2=1:3
r=1e-10+(lll2-1)*dr;
tmp1 = 7.7*besselj(0,48.3*r)-0.32*bessely(0,48.3*r);
tmp2 = -22*besselj(0,96.7*r)-18.2*bessely(0,96.7*r);
tmp3 = -3*besselj(0,145*r)-17.5*bessely(0,145*r);
tmp4 = -0.8*besselj(0,193*r)-1.6*bessely(0,193*r);
tmp5 = -1.8*besselj(0,242*r)-0.37*bessely(0,242*r);
pw=1e-10;
for nn= 1:21
theta0=1e-10+(nn-1)*dtheta0;
for ll=1:21
z0=1e-10+(ll-1)*dz0;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
HHFv=1e-10;
q12 = (1:5) .^ 4;
q13 = ((1:5) * gama) .^ 4;
for m=1:5
q1 = sin(m*pi*zf/W);
q2 = q1 * sin(m*pi*z0/W);
q11 = 1 / q6^2;
for n=1:5
q6 = 1+(m*3.142*R2/(n*W))^2;
omega_mn1(mmm2,nnn2) = sqrt(tmp6 * q11 * ...
(m*3.142/q13(n)) + q6^2*q12(n)/12);
omega_mn(mmm2,nnn2) = omega_mn1(mmm2,nnn2)*q3;
q4 = omega_mn(m,n)^2;
HHFv=HHFv + q2 * q5(n) / (q4-omega2+1i*Eta1*q4);
end
end
HFv(mmm2,nnn2,num2)=coef_HFv*HHFv;
HHvp=1e-10;
HHvp1=1e-10;
HHvp2=1e-10;
HHvp3=1e-10;
HHvp4=1e-10;
HHvp5=1e-10;
q7 = ((1:5) * pi / W) .^ 2;
for m1=1:5
q1 = abs(m1*pi/(R2-R1));
for n1=1:5
q2 = c * q1 + (2*n1/(R2+R1))^2;
for l1=1:5
omega_n(m1,n1,l1) = q2 + q7(l1);
end
end
end
omega_n2 = omega_n .^ 2;
omega_x = omega_n2 - omega2+1i*Eta2*omega_n2;
q7 = cos((1:5)*pi*z0/W) .* cos((1:5)*pi*zf/W);
q8 = (2/(R2+R1))*theta0;
q9 = (2/(R2+R1))*thetaf;
for n1=1:5
q1 = cos(n1*q8)* cos(n1*q9);
for l1=1:5
q10 = q1 * q7(l1) / omega_x(m1,n1,l1);
HHvp1 = HHvp1 + tmp1 * q10;
HHvp2 = HHvp2 + tmp2 * q10;
HHvp3 = HHvp3 + tmp3 * q10;
HHvp4 = HHvp4 + tmp4 * q10;
HHvp5 = HHvp5 + tmp5 * q10;
end
end
HHvp=HHvp1+HHvp2+HHvp3+HHvp4+HHvp5;
Hvp(mmm2,nnn2,lll2,num2)= HHvp;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
n1_ = 5;
l1_ = 5;
Hvp_1_0_0=tmp5/((c^2*(pi/(R2-R1))^2*(1+1i*Eta2))-omega2);
Hvp_0_1_0=(cos(n1_*q8)*cos(n1_*q9))/((c^2*(2/(R2+R1))^2*(1+1i*Eta2))-omega2);
Hvp_0_0_1=(cos(l1_*pi*z0/W)*cos(l1_*pi*zf/W))/((c^2*(pi/W)^2*(1+1i*Eta2))-omega2);
Hvp_1_1_0=((tmp1)*(cos(n1_*q8)*cos(n1_*q9)))/((c^2*((pi/(R2-R1))^2+(2/(R2+R1))^2)*(1+1i*Eta2))-omega2);
Hvp_0_1_1=(cos(n1_*q8)*cos(n1_*q9)*((cos(l1_*pi*z0/W)*cos(l1_*pi*zf/W))))/((c^2*((2/(R2+R1))^2+(pi/W)^2)*(1+1i*Eta2))-omega2);
Hvp_1_0_1=((tmp1)*(cos(l1_*pi*z0/W)*cos(l1_*pi*zf/W)))/((c^2*((pi/(R2-R1))^2+(pi/W)^2)*(1+1i*Eta2))-omega2);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
CHvp(mmm2,nnn2,lll2,num2)=coef_Hvp*HHvp+coef_1_0_0*Hvp_1_0_0+coef_0_1_0*Hvp_0_1_0+coef_0_0_1*Hvp_0_0_1+coef_1_1_0*Hvp_1_1_0+coef_0_1_1*Hvp_0_1_1+coef_1_0_1*Hvp_1_0_1;
IntFunction=HFv(mmm2,nnn2,num2)*CHvp(mmm2,nnn2,lll2,num2);
pw=pw+dtheta0*dz0*IntFunction;
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
p_omega(mmm2,nnn2,lll2,num2)=pw;
mean_p2(mmm2,nnn2,lll2,num2)=abs(p_omega(mmm2,nnn2,lll2,num2)*conj(p_omega(mmm2,nnn2,lll2,num2))/2);
end
end
end
waitbar(num2/250,h)
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
m_p2_CLT1_1_1=zeros(51,1);
m_p2_CLT1_2_1=zeros(51,1);
m_p2_CLT1_3_1=zeros(51,1);
m_p2_CLT2_1_1=zeros(51,1);
m_p2_CLT2_2_1=zeros(51,1);
m_p2_CLT2_3_1=zeros(51,1);
m_p2_CLT3_1_1=zeros(51,1);
m_p2_CLT3_2_1=zeros(51,1);
m_p2_CLT3_3_1=zeros(51,1);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for m2=200:250
m_p2_CLT1_1_1(m2,1)=10*log10(mean_p2(1,1,1,m2));
m_p2_CLT1_2_1(m2,1)=10*log10(mean_p2(1,2,1,m2));
m_p2_CLT1_3_1(m2,1)=10*log10(mean_p2(1,3,1,m2));
m_p2_CLT2_1_1(m2,1)=10*log10(mean_p2(2,1,1,m2));
m_p2_CLT2_2_1(m2,1)=10*log10(mean_p2(2,2,1,m2));
m_p2_CLT2_3_1(m2,1)=10*log10(mean_p2(2,3,1,m2));
m_p2_CLT3_1_1(m2,1)=10*log10(mean_p2(3,1,1,m2));
m_p2_CLT3_2_1(m2,1)=10*log10(mean_p2(3,2,1,m2));
m_p2_CLT3_3_1(m2,1)=10*log10(mean_p2(3,3,1,m2));
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
figure(1)
plot(Freq(:,1), m_p2_CLT1_1_1(:,1), Freq(:,1), m_p2_CLT1_2_1(:,1), Freq(:,1), m_p2_CLT1_3_1(:,1), 'LineWidth',2);
grid on
figure(2)
plot(Freq(:,1), m_p2_CLT2_1_1(:,1), Freq(:,1), m_p2_CLT2_2_1(:,1), Freq(:,1), m_p2_CLT2_3_1(:,1), 'LineWidth',2);
grid on
figure(3)
plot(Freq(:,1), m_p2_CLT3_1_1(:,1), Freq(:,1), m_p2_CLT3_2_1(:,1), Freq(:,1), m_p2_CLT3_3_1(:,1), 'LineWidth',2);
grid on

댓글을 달려면 로그인하십시오.

추가 답변 (1개)

Jan
Jan 2013년 1월 5일
편집: Jan 2013년 1월 5일
Look at this piece of code:
for n1=1:5
for l1=1:5
HHvp4= HHvp4 + d*n1*l1
end
end
This can be abbreviated to:
HHvp4 = HHvp4 + 225 * d;
A pre-allocation is strongly recommended, when you fill an array like omega_n iteratively. But your example runs in 0.000032 seconds on my computer, such that I do not think that this is a bottleneck - except if you call this millions of times, and if so, please explain this. Anyhow, I would create this array like this:
omega_n = zeros(5, 5, 5);
omega_n(:, :, 1) = (1:5)' * (1:5);
for ii = 2:5
omega_n(:, :, ii) = ii * omega_n(:, :, 1);
end
Puh, looks more like Matlab, but it takes 0.000020 also. I do not assume that 0.01 thousandth seconds have been worth to improve this part. But who cares about minutes of programming time, when the execution time is 0.1 seconds shorter?! This is the fastest method I've found under R2009a/64 on a Windows7/Core2Duo:
omega_n = zeros(5, 5, 5);
for l1=1:5
for n1=1:5
for m1 = 1:5
omega_n(m1, n1, l1) = m1 * n1 * l1;
end
end
end
0.0000089 seconds. The order of loops has been reversed, such that neighboring elements are written. But as long as 125*8 bytes match into the processor cache, this is not essential. Using a temporary variable c = n1 * l1 in the inner loop is more efficient in theory, but the improvement is not significant.
  댓글 수: 4
Azzi Abdelmalek
Azzi Abdelmalek 2013년 1월 5일
Zamri, there are some missing data, which allows to test your code (like F0). Also there are parts of your code which are not well formated.
Jan
Jan 2013년 1월 6일
@Zamri: I've formatted your code again. Please follow the linked instructions. Thanks.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Mathematics에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by