Parallel calculating for fast execution
I wrote code in two parts: 1- a function that returns a required value, Tmax, which depends on six input variables; 2- a script that calculates several other quantities and calls the Tmax function. The function has to be called many times, and that takes a long time. I am looking for a way to reduce the computation time.
In the second script, I replaced every for loop with a parfor loop where possible. I saw an improvement and the run time dropped, but not by much. I have a powerful machine (the configuration is attached). I hoped to divide the execution time by 32, since I have 32 cores. That is not happening, and I am wondering why. The points at which I calculate Tmax are independent, so I think it should be possible to give each core n/32 points, where n is the number of points (the sample size). Could you explain this issue? Can a function be called n times in parallel, with n/32 calls executed on each core?
The two scripts are below:
1- The function:
function T_max = EPO_OILS_SEMIBATCH(Par)
global UA Tj0 Taj F tadd CHP_initial
F=Par(1);
tadd=Par(2);
UA=Par(3);
Taj=Par(4);
Tj0=Par(5);
CHP_initial=Par(6);
tspan=[0:10:10000];
y0=[0 CHP_initial 0 ((1-(0.14*CHP_initial)*0.24285)*1000)/18 0.5 1.70 0.00 0.00 0.00 Tj0 0.26];
[t, y]=ode23s(@semibatch,tspan,y0);
T_max = max(y(:,10));
function dydt=semibatch(t,y)
global UA Tj0 Taj F tadd
if t > tadd
F=0;
end
dydt=[(1/(1+((1-((y(11)-0.12)/y(11)))/(((y(11)-0.12)/y(11))*9))))*((F/(y(11)-0.12))*(24-y(1))-((0.15*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*((y(1)/y(4))^0.5)*(y(1)*y(2)-(1/(0.96*exp((671.157)*((1/y(10))-(0.003298)))))*y(3)*y(4)))+((0.0009*exp(-(2429.636)*((1/y(10))-(0.0029411))))*y(3))+(1-((y(11)-0.12)/y(11)))*(((0.00576*exp(-(7409.189)*((1/y(10))-(0.0029411))))*y(3)*y(5))+((0.00437*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(6))+((0.004*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(7))-((0.00339*exp(-(5063.74789)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*y(1)*((y(1)/y(4))^0.5)))/((y(11)-0.12)/y(11)));...
-((0.15*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*((y(1)/y(4))^0.5)*(y(1)*y(2)-(1/(0.96*exp((671.157)*((1/y(10))-(0.003298)))))*y(3)*y(4)))-(F*y(2)/(y(11)-0.12)); ((-F*y(3)/(y(11)-0.12))+((0.15*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*((y(1)/y(4))^0.5)*(y(1)*y(2)-(1/(0.96*exp((671.157)*((1/y(10))-(0.003298)))))*y(3)*y(4)))-((0.0009*exp(-(2429.636)*((1/y(10))-(0.0029411))))*y(3))-((0.001*exp(-(2405.581)*((1/y(10))-(0.0029411))))*y(3))-(1-((y(11)-0.12)/y(11)))*(((0.00576*exp(-(7409.189)*((1/y(10))-(0.0029411))))*y(3)*y(5))+((0.00437*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(6))+((0.004*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(7))+((0.0592*exp(-(8419.53331)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*y(3)*((y(1)/y(4))^0.5)))/((y(11)-0.12)/y(11)));...
((0.15*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*((y(1)/y(4))^0.5)*(y(1)*y(2)-(1/(0.96*exp((671.157)*((1/y(10))-(0.003298)))))*y(3)*y(4)))+((0.001*exp(-(2405.581)*((1/y(10))-(0.0029411))))*y(3))-(F*y(4)/(y(11)-0.12))-(1-((y(11)-0.12)/y(11)))*((0.000237*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*((y(1)*y(4))^0.5))/((y(11)-0.12)/y(11)); -((0.00576*exp(-(7409.189)*((1/y(10))-(0.0029411))))*y(3)*y(5)); -((0.00437*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(6)); ((0.00437*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(6))-((0.004*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(7)); (((0.00576*exp(-(7409.189)*((1/y(10))-(0.0029411))))*y(3)*y(5))+((0.00437*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(6))+((0.004*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(7)))-((0.000237*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*((y(1)*y(4))^0.5))-((0.00339*exp(-(5063.74789)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*y(1)*((y(1)/y(4))^0.5))-((0.0592*exp(-(8419.53331)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*y(3)*((y(1)/y(4))^0.5));...
((0.000237*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*((y(1)*y(4))^0.5))+((0.00339*exp(-(5063.74789)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*y(1)*((y(1)/y(4))^0.5))+((0.0592*exp(-(8419.53331)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*y(3)*((y(1)/y(4))^0.5)); (1/(((y(11)-0.12)*1.00+0.12*0.93)*2000))*((-(y(11)-0.12)*(((0.15*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*((y(1)/y(4))^0.5)*(y(1)*y(2)-(1/(0.96*exp((671.157)*((1/y(10))-(0.003298)))))*y(3)*y(4)))*-5580+((0.001*exp(-(2405.581)*((1/y(10))-(0.0029411))))*y(3))*-359000+((0.0009*exp(-(2429.636)*((1/y(10))-(0.0029411))))*y(3))*-163000)-0.12*((((0.00576*exp(-(7409.189)*((1/y(10))-(0.0029411))))*y(3)*y(5))+((0.00437*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(6))+((0.004*exp(-(1804.1857)*((1/y(10))-(0.0029411))))*y(3)*y(7)))*-230000+(((0.000237*exp(-(18041.857)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*((y(1)*y(4))^0.5))+((0.00339*exp(-(5063.74789)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*y(1)*((y(1)/y(4))^0.5))+((0.0592*exp(-(8419.53331)*((1/y(10))-(0.0029411))))*( 0.0017029)*y(8)*y(3)*((y(1)/y(4))^0.5)))*-90000))+UA*(Tj0-y(10))+24*F*20*(Taj-y(10))); F];
2- The calculation script:
tic
n=100; % n can be large (up to 1000000 or more)
p=6;
F_max= 0.002;
F_min= 0.001;
tadd_max=1200;
tadd_min= 600;
UA_max= 100;
UA_min= 1;
Taj_max= 308.15;
Taj_min=293.15;
Tj0_max= 343.15;
Tj0_min= 313.15;
CHP_initial_max=8;
CHP_initial_min=2.9;
sob1 = sobolset(p);
An = net(sob1,n);
Par_max=[F_max tadd_max UA_max Taj_max Tj0_max CHP_initial_max];
Par_min=[F_min tadd_min UA_min Taj_min Tj0_min CHP_initial_min];
A=zeros(size(An,1),size(An,2));
parfor i=1:size(An,1)
A(i,:)=An(i,:).*(Par_max-Par_min)+Par_min;
end
T_max_A=zeros(1,n); % preallocate the sliced output before the parfor loop
parfor i=1:n
T_max_A(i)= EPO_OILS_SEMIBATCH(A(i,:));
end
f_0 = (1/n)*sum(T_max_A)
D_T = ((1/n)*sum(T_max_A.^2))- f_0^2
Answers (2)
Jan
27 January 2017
Sorry, this will not solve the problem:
For the speed part: Wow, this is cruel code! I would not dare to simplify it manually. So just some ideas:
- sqrt() is cheaper than ^0.5.
- There are a lot of terms of the kind exp(a*((1/y(10))-b)). Because the exp() function is very expensive, you can try to combine these terms to reduce the number of calls.
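As a rough sketch of both ideas, the snippet below takes one reaction term that recurs in the posted dydt expression and evaluates it twice: once as written, and once with the shared exp() factors and the square root computed a single time. The state values in y and the helper names dT, k1, Keq and s14 are assumptions made only for illustration; they do not come from the original post.

```matlab
% Hypothetical state values, chosen only so that the snippet runs
y = zeros(11,1);
y(1) = 2.0; y(2) = 3.0; y(3) = 0.5; y(4) = 50; y(10) = 340;

% Original style: exp() and ^0.5 are re-evaluated at every occurrence
r_orig = (0.15*exp(-(18041.857)*((1/y(10))-(0.0029411))))*(0.0017029) ...
    *((y(1)/y(4))^0.5) ...
    *(y(1)*y(2)-(1/(0.96*exp((671.157)*((1/y(10))-(0.003298)))))*y(3)*y(4));

% Refactored: compute the shared pieces once and reuse them
dT  = 1/y(10) - 0.0029411;                      % argument shared by most exp() calls
k1  = 0.15*exp(-18041.857*dT);                  % factor that recurs in several rows
Keq = 0.96*exp(671.157*(1/y(10) - 0.003298));   % equilibrium-like factor
s14 = sqrt(y(1)/y(4));                          % sqrt() instead of ^0.5
r1  = k1*0.0017029*s14*(y(1)*y(2) - y(3)*y(4)/Keq);

% Both forms should agree to rounding error
relErr = abs(r1 - r_orig)/abs(r_orig);
```

Inside the real semibatch function, r1 would be computed once per call and then inserted wherever the long expression currently appears; the same pattern applies to the other recurring rate terms.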
By the way, you can omit the square brackets in tspan=[0:10:10000], see why-not-use-square-brackets. But here the saved microseconds will not matter.
The statement "if t > tadd F=0; end" adds a discontinuity to the integration. MATLAB's integrators handle smooth functions only, see http://www.mathworks.com/matlabcentral/answers/59582#answer_72047 .
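One common way around such a switch is to stop the integration at the switching time and restart it with the new parameter value, so that each piece sees a smooth right-hand side. The sketch below uses a toy equation dy/dt = F - k*y, not the posted model; k, F0, tadd and tend are assumed values, and F is passed as an argument rather than through a global.

```matlab
% Toy problem (hypothetical, not the poster's model): dy/dt = F - k*y,
% with the feed F switched off at t = tadd.
k    = 0.01;    % assumed decay constant
F0   = 0.002;   % assumed feed rate while dosing
tadd = 600;     % assumed end of dosing
tend = 2000;    % assumed end of the simulation

rhs = @(t, y, F) F - k*y;   % F is an argument: no global, no if inside

% Stage 1: feed on, integrate exactly up to tadd
[t1, y1] = ode23s(@(t, y) rhs(t, y, F0), [0 tadd], 0);

% Stage 2: feed off, restart from the final state of stage 1
[t2, y2] = ode23s(@(t, y) rhs(t, y, 0), [tadd tend], y1(end,:));

% Join both pieces, dropping the duplicated point at t = tadd
t = [t1; t2(2:end)];
y = [y1; y2(2:end,:)];
y_max = max(y);
```

For the posted function, the same split would mean calling ode23s once on [0 tadd] with the given F and once on [tadd 10000] with F = 0, then taking the maximum of column 10 over both pieces.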
I wonder whether you can trust the results: most of the constants have only 3 or 5 significant digits, some have 8. The formula has about 100 terms. Without any analysis, I guess that cancellation error might dominate the solution.
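To illustrate what cancellation means here, the following sketch adds two terms of similar size and opposite sign, first with all their digits and then rounded to 5 significant digits. The numbers are made up for the demonstration and are unrelated to the constants in the posted model.

```matlab
% Two hypothetical terms of similar size and opposite sign
a =  1.2345678;
b = -1.2345612;

exactSum = a + b;                       % about 6.6e-6

% The same terms if they were only known to 5 significant digits
a5 = round(a, 5, 'significant');
b5 = round(b, 5, 'significant');
roundedSum = a5 + b5;                   % both round to 1.2346 in magnitude

% Relative error of one input versus relative error of the sum
relErrInputs = abs(a5 - a)/abs(a);      % small, on the order of 1e-5
relErrSum    = abs(roundedSum - exactSum)/abs(exactSum);
```

The inputs are off by only a few parts in 100000, yet the rounded sum loses essentially all of its information, which is the kind of effect that a proper error analysis of the model would have to rule out.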