Execution time with or without parfor
I have a simple piece of code for testing parfor with my local profile (4 cores):
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% code 1: parfor version
matlabpool open 4   % also run with 2 and 1
tic;
parfor i = 1:30
    res = 0;
    for n = 1:3000000
        res = res + sin(n) + cos(n);
    end
    A(i) = res;
end
toc;
matlabpool close
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% code 2: plain for version
tic;
for i = 1:30
    res = 0;
    for n = 1:3000000
        res = res + sin(n) + cos(n);
    end
    A(i) = res;
end
toc;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
I ran code 1 with different numbers of labs, and I also ran code 2. The results are:
code-1 - 8 labs (4 cores plus 4 hyperthreads) --> 15 sec
code-1 - 4 labs --> 22 sec
code-1 - 2 labs --> 35 sec
code-1 - 1 lab --> 65 sec
code-2 --> 18 sec
Given these results, it is better to use code 2 and leave all the other cores free (and that is before counting the time needed to run 'matlabpool open' and 'matlabpool close'). I have read this: http://www.mathworks.co.uk/matlabcentral/answers/44734-there-is-aproblem-in-parfor
but in this case the execution time is much longer than the setup time of the parallel mechanism.
If there is nothing wrong with my results, my main question is: when is it better to use parfor?
17 Comments
Matt J
3 Feb 2014
I can't reproduce that, I'm afraid. I see close to linear speed-up with 2, 4, and 12 workers in the pool. What version of MATLAB are you using, and on what CPU(s)?
Edric Ellis
4 Feb 2014
NUMLABS is designed to return 1 inside PARFOR because you cannot use labSend/labReceive there. This is described in the documentation.
amir
4 Feb 2014
Matt J
4 Feb 2014
NUMLABS will only return a meaningful value inside an SPMD...END block.
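The difference between the two contexts can be checked directly. The following is a minimal sketch, assuming Parallel Computing Toolbox is installed and a pool is already open (for example via 'matlabpool open 4' as in the question). The variable names insideParfor and insideSpmd are illustrative and do not come from the thread.

```matlab
% Assumes Parallel Computing Toolbox and an already open pool.
insideParfor = zeros(1, 4);
parfor k = 1:4
    insideParfor(k) = numlabs;   % numlabs is designed to report 1 inside parfor
end
disp(insideParfor)

spmd
    insideSpmd = numlabs;        % number of labs running this spmd block
end
disp(insideSpmd{1})              % insideSpmd is a Composite; index it per lab
```

Outside the spmd block, insideSpmd holds one value per lab, so it has to be indexed with braces to read the value from a particular lab.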
Matt J
4 Feb 2014
@mohammad
Are there any other machines available to you that you could test it on, to check whether the problem is platform-dependent?
amir
4 Feb 2014
amir
5 Feb 2014
As I mentioned here, I ran the first version of the code and successfully achieved near linear speed-up with PARFOR. That was with R2013b. I haven't run the second version of the code yet, but I don't see any significant modification in it that would lead me to expect a different result.
So, the slow behavior you're seeing has to be environment-related.
Here are my results when I run the modified version of the test code for poolsize=0:12. The three columns correspond to R2011b, R2012b, and R2013b; one row per pool size, in order.
Times (tic/toc, in seconds) =
poolsize    R2011b     R2012b     R2013b
    0      19.9430    20.4689    21.0302
    1      21.1632    21.8318    23.0208
    2      10.6021    10.7968    11.5326
    3       7.0738     7.3209     7.9293
    4       5.7969     5.9354     6.1944
    5       4.3994     4.5522     4.9174
    6       3.7105     3.8611     4.1811
    7       3.6653     3.7533     3.9924
    8       3.0179     3.1299     3.2726
    9       2.9612     3.0899     3.2563
   10       2.3155     2.3643     2.5791
   11       2.3111     2.3792     2.5677
   12       2.3000     2.3633     2.6129
Interestingly, performance gets a bit slower with more recent releases. Not sure if that's a significant trend, though. This is on a dual hexa-core Intel Xeon X5680 @ 3.33 GHz.
So... still baffled.
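To make the scaling in these timings easier to read, the speed-up and per-worker efficiency can be computed from one column. This is only a sketch: it uses the R2013b column copied from the results above, takes the poolsize=0 row as the baseline, and the variable names (tR2013b, speedup, efficiency) are illustrative.

```matlab
% R2013b timings from the table above, rows ordered as poolsize = 0:12.
poolsize = (0:12)';
tR2013b = [21.0302; 23.0208; 11.5326; 7.9293; 6.1944; 4.9174; 4.1811; ...
            3.9924;  3.2726;  3.2563; 2.5791; 2.5677; 2.6129];

% Speed-up relative to the poolsize = 0 run.
speedup = tR2013b(1) ./ tR2013b;

% Efficiency = speed-up per worker (skips poolsize = 0 to avoid dividing by zero).
efficiency = speedup(2:end) ./ poolsize(2:end);

disp([poolsize, speedup]);
disp([poolsize(2:end), efficiency]);
```

The same calculation applied to the timings in the question (18 sec without parfor against 22 sec with 4 labs) gives a speed-up below 1, which is the anomaly being discussed.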
Matt J
6 Feb 2014
Any difference if you pre-allocate A first?
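For reference, a pre-allocated variant of code 1 might look like the sketch below. It assumes Parallel Computing Toolbox is available and a pool is already open. The names numIter and numTerms are illustrative, and numTerms is deliberately much smaller than the 3000000 used in the question so the sketch runs quickly.

```matlab
% Sketch of code 1 with A pre-allocated before the parfor loop.
numIter  = 30;
numTerms = 3000;            % the question uses 3000000; reduced here for a quick run

A = zeros(1, numIter);      % pre-allocate the sliced output variable

tic;
parfor i = 1:numIter
    res = 0;                % temporary variable, private to each iteration
    for n = 1:numTerms
        res = res + sin(n) + cos(n);
    end
    A(i) = res;             % each iteration writes only its own element
end
elapsed = toc;
fprintf('Elapsed time: %.3f s\n', elapsed);
```

Whether pre-allocation changes the timings reported in the question is exactly what this comment asks; the sketch only shows where the pre-allocation would go.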
Matt J
6 Feb 2014
I wish I had a system like that. Do you fly with it?
Not always. Like you, I've also had cases where PARFOR mysteriously under-performs in environment-dependent ways. See this thread, for instance
Matt J
6 Feb 2014
You're not doing any of this over a network, are you? This is all on a local CPU?