I want to optimize a mex-compiled function (Fortran 90 source) defined over a 1D interval by computing its values on a sufficiently fine sampling. It works fine with a for-loop, but when I try parfor (for speed) the mex-compiled code crashes (I get an error from one of the workers). Is this a documented problem, and does anyone have suggestions for how to localize what goes wrong?
I run MATLAB R2013a on Ubuntu 13.10 on a 16-core (32 virtual) machine, and I get 12 workers when I run matlabpool.

Comments: 1

Matt J on 6 Feb 2014
No, there is no general prohibition against using mex files with parfor. Show us the plain for-loop and the parfor version.


Accepted Answer

Matt J on 6 Feb 2014
Edited: Matt J on 6 Feb 2014


You should try running a plain for-loop first, but with the iterations in random order, i.e., instead of
for i=1:n
...
end
run as
for i=randperm(n)
...
end
This is a good way to test whether your code is independent of the order of the iterations (a basic requirement of parfor) before the Parallel Computing Toolbox even gets involved.
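
A minimal sketch of that check, under stated assumptions: the function handle f below is only a hypothetical stand-in for the real per-iteration call (the mex function is not available here), and n = 360 is taken from the loop length in the question. The idea is to store the results of both loops by index and compare them afterwards.

```matlab
% Sketch: check that results do not depend on iteration order.
% f is a hypothetical placeholder for the real per-iteration function.
f = @(k) (k - 121).^2;
n = 360;

% Reference run, iterations in natural order.
inOrder = zeros(1, n);
for i = 1:n
    inOrder(i) = f(i);
end

% Same work, iterations in random order; results stored by index.
shuffled = zeros(1, n);
for i = randperm(n)
    shuffled(i) = f(i);
end

% For an order-independent computation the two vectors are identical.
orderIndependent = isequal(inOrder, shuffled);
disp(orderIndependent)
```

If the shuffled run crashes or returns different values, the function probably keeps state between calls, and that is worth fixing before moving to parfor.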

Comments: 5

martin on 6 Feb 2014
Edited: Matt J on 6 Feb 2014
My script:
cat timeme_straightsearch.m
stupidvector=zeros(1,360);
disp('search with conventional for-loop:')
tic
for countindex = 1:360
stupidvector(countindex)=localsearchstrul_valuespectral(countindex);
end
[foundpsi,whichelement]=min(stupidvector)
toc
disp('search with parfor-loop:')
tic
parfor countindex = 1:360
stupidvector(countindex)=localsearchstrul_valuespectral(countindex);
end
[foundpsi,whichelement]=min(stupidvector)
toc
and in matlab:
>> matlabpool
Starting matlabpool using the 'local' profile ... connected to 12 workers.
>> timeme_straightsearch
search with conventional for-loop:
foundpsi =
-5.3948
whichelement =
121
Elapsed time is 186.328337 seconds.
search with parfor-loop:
Error using distcomp.remoteparfor/getCompleteIntervals (line 22)
The session that parfor is using has shut down.
Error in timeme_straightsearch (line 13)
parfor countindex = 1:360
Caused by:
Error using distcomp.remoteparfor/getCompleteIntervals (line 22)
The session that parfor is using has shut down.
The client lost connection to lab 1. This might be due to network
problems, or the interactive matlabpool job might have errored.
Matt J on 6 Feb 2014
Are your parallel labs on remote machines? It does look to me like a network error, as the error message suggests.
martin on 6 Feb 2014
No, it's one machine. My interpretation of the error was that the process reporting it had simply noticed that the process on the worker had died.
Matt J on 6 Feb 2014
Can you try it on a different machine to see if it's a hardware problem? I don't see anything wrong with the code.
martin on 9 Feb 2014
Thanks for your input, I will try another machine as soon as possible. Just an additional observation: the program crashes on the Fortran 90 statement "call mxCopyPtrToReal8(inptr_xdim,realxdim,1)", i.e. a standard construction straight out of the manual pages for mex.
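
One possible way to narrow down which iteration takes a worker down, sketched under assumptions: a process that dies inside the mex call cannot report anything itself, so the loop leaves a marker file on disk before each call and deletes it only after the call returns. The handle f, the scratch directory logDir and the marker file names are all hypothetical and chosen only for illustration; the loop is written as a plain for so that it runs without the Parallel Computing Toolbox.

```matlab
% Sketch: leave a marker file per iteration so a hard crash can be traced.
f = @(k) (k - 121).^2;          % hypothetical stand-in for the mex call
n = 360;
logDir = tempname;              % scratch directory for the marker files
mkdir(logDir);

results = zeros(1, n);
for k = 1:n                     % switch to parfor once this runs cleanly
    marker = fullfile(logDir, sprintf('iter_%03d.started', k));
    fid = fopen(marker, 'w');   % marker is created before the risky call
    fclose(fid);
    results(k) = f(k);
    delete(marker);             % removed only if the call returned
end

% Any file left behind names an iteration that never finished.
leftover = dir(fullfile(logDir, '*.started'));
fprintf('%d unfinished iteration(s)\n', numel(leftover));
```

After a crash, the names of the remaining files in logDir give the indices that were in progress, and those can then be rerun one at a time in a plain for-loop.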


More Answers (0)


Asked: 6 Feb 2014
Commented: 9 Feb 2014
