Parfor Error: lost connection

George on 7 Nov 2012
Dear all,
I am using a parfor loop in my script. The script works fine on my own computer with MATLAB R2012a, but when I run it on another computer with MATLAB R2011b, it gives me this error:
Error using parallel_function (line 598)
The session that parfor is using has shut down

Error in Myscript (line 97)
parfor k=1:px

The client lost connection to lab 6. This might be due to network problems, or the interactive matlabpool job might have errored. This is causing:
java.io.IOException: An existing connection was forcibly closed by the remote host
Can anyone give me a clue about this problem?
Thank you in advance.
Kind regards,
George
  Comments: 2
Walter Roberson on 7 Nov 2012
Are both machines running 32-bit MATLAB, or both 64-bit?
Are you attempting to transfer more than 2 GB of data?
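One way to answer the second question is to look at how large the workspace variables are before the parfor loop starts. The following is only a rough sketch: it uses the standard `whos` function, the 2 GB figure comes from the question above, and the "half the limit" reporting threshold is an arbitrary choice for illustration. It reports in-memory size, which is not necessarily the same as the amount actually sent to each worker.

```matlab
% Sketch: list large workspace variables before entering parfor.
limitBytes = 2 * 1024^3;   % 2 GB, the transfer size asked about above
info = whos;               % struct array with fields name, bytes, class, ...

for i = 1:numel(info)
    % Arbitrary threshold (assumption): flag anything over half the limit.
    if info(i).bytes > 0.5 * limitBytes
        fprintf('%s: %.2f GB (%s)\n', ...
            info(i).name, info(i).bytes / 1024^3, info(i).class);
    end
end

totalBytes = sum([info.bytes]);
fprintf('Total workspace: %.2f GB\n', totalBytes / 1024^3);
```

Run this in the same workspace as the script, just before the parfor line.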
George on 8 Nov 2012
Dear Walter,
Both are 64-bit MATLAB running on 64-bit Windows 7 with 24 GB of RAM.
The dataset is around 1.5 GB.
Thank you.
Regards,
George

Answers (2)

Jason Ross on 7 Nov 2012
There are a number of logs you can look at to try to gain some insight. By default they are located in /var/log/mdce on Linux/Mac and in %TEMP%\MDCE\log on Windows. They might tell you what was going on.
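As a quick way to see whether any of those logs exist and when they were last written, something like the following could be run from MATLAB. It is a minimal sketch that assumes the default locations given above; if MDCE is not in use (for example with the local scheduler), the directory may simply not be there.

```matlab
% Sketch: list files in the default MDCE log directory.
if ispc
    logDir = fullfile(getenv('TEMP'), 'MDCE', 'log');  % %TEMP%\MDCE\log
else
    logDir = '/var/log/mdce';                          % Linux/Mac default
end

if exist(logDir, 'dir') == 7
    files = dir(logDir);
    files = files(~[files.isdir]);   % drop '.', '..' and subfolders
    for i = 1:numel(files)
        fprintf('%s  %s  (%d bytes)\n', ...
            files(i).date, files(i).name, files(i).bytes);
    end
else
    fprintf('No log directory found at %s\n', logDir);
end
```

The modification dates help tell which logs belong to the failing run and which are left over from earlier sessions.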
Since the connection was "forcibly closed", that could be the result of something at the OS level, and you could take a look at the system logs / event viewer for any clues as to what might be going on.
Without reviewing the logs, though, there's not a lot to go on. There are many reasons something could forcibly shut down.
You can also try running a validation of the cluster (Parallel, Manage, select cluster profile, validate) or run the connectivity tests in Admin Center (matlabroot/toolbox/distcomp/bin/admincenter) to see if there's something off with respect to your setup.
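Separately from the built-in validation, a tiny smoke test can show whether the pool itself stays up when no large data is involved. This is a sketch, not a replacement for the validation steps above: it uses the `matlabpool` syntax of the releases discussed in this thread (later releases use `parpool` instead), and the pool size of 2 and the loop body are arbitrary choices.

```matlab
% Sketch: open a small local pool and run a trivial parfor loop.
matlabpool('open', 'local', 2);   % assumed pool size of 2

n = 8;
r = zeros(1, n);                  % preallocated, sliced output variable
parfor k = 1:n
    r(k) = k^2;                   % trivial work, almost no data transfer
end

matlabpool('close');
disp(r);
```

If this runs reliably several times in a row but the real script still fails, the data size or the work done inside the loop becomes the more likely suspect.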
Note I'm assuming that you are using MDCE. It would also be helpful if you could list what OS you are on, too.
  Comments: 4
George on 12 Nov 2012
Yes, I am using the local scheduler, not a job manager.
I tried running the script on R2012b. Strangely, the first two runs completed successfully, but the third gave me a similar error:
The client lost connection to lab 4. This might be due to network problems, or the interactive matlabpool job might have errored.
I found MatlabDesktopCreateError.log in AppData\Local\Temp, but it was created in September.
Any suggestions?
Thank you.
Jason Ross on 19 Nov 2012
I was out of town for a little while -- unfortunately I don't have much of a general suggestion. You might want to contact support, as it might be related to your unique situation somehow.
Francisco on 5 Feb 2013
Edited: Francisco on 5 Feb 2013
Maybe you are not working entirely within the MATLAB environment. For example, you might be making system calls to get results from a 32-bit application that runs in parallel outside 64-bit MATLAB.
If so, a solution could be to export less data to that application, so that it is not always working close to the limit of the memory it can use. Even if it worked for two or three cycles, with a huge amount of data in memory a tiny increase in working memory could disrupt the task executed outside MATLAB.