Loop through files of remote server

CP, 10 October 2011
I'm using MATLAB to run simulations on a cluster. The jobs are submitted from a local machine, through MATLAB, to the cluster.
A sample submission script might look like this:
% Get a handle to the job scheduler
sched = findResource();
% Create a job
job = createJob(sched, 'FileDependencies', {'Analysis.m'});
% Create the tasks
filelist = dir('/dir1/dir2/');
for tidx = 1:length(filelist)
    tasks(tidx) = createTask(job, @Analysis, 1, {tidx});
end
% Submit the job
submit(job);
I now want to list some files that live on the cluster, loop over them, and run a script (say, Analysis.m) on each one.
How do I get the file list on the cluster, rather than on the local machine that schedules the job, and then pass each file to Analysis.m one at a time?

Accepted Answer

Edric Ellis, 10 October 2011
Perhaps a PARFOR loop on the cluster would be simplest. Something like this:
function x = doStuff
% List the files on the cluster
d = dir( '/path/on/cluster/*.dat' );
% Loop over the files; parfor spreads the work among the workers
parfor ii = 1:numel( d )
    x(ii) = someFcn( d(ii).name );
end
Then you need to submit 'doStuff' as a matlabpool job to the cluster, something like this:
job = createMatlabPoolJob( sched, 'MaximumNumberOfWorkers', 4 );
createTask( job, @doStuff, 1, ... );
Or, in this situation, you can even use the BATCH command to submit the job:
job = batch( @doStuff, 1, {}, 'Matlabpool', 4 );
Comments (4)
CP, 10 October 2011
OK, my attempt is below. It seems one of the workers exited, and the local MATLAB session showed the following message:
"Field reference for multiple structure elements that is followed by more reference blocks is an error"
PrevTargAnalyze.m, the function being called, doesn't use any structures, so any idea what is causing the message? (The output of the dir command is probably the only structure I can see.)
%%%%%%%%%%%%%%%%%%%%%%%%
%%----startJobs.m-----%%
%%%%%%%%%%%%%%%%%%%%%%%%
sched = findResource();
nlabs = 4;
job = createMatlabPoolJob(sched, 'FileDependencies', {'doJobs.m','PrevTargAnalyze.m'}, 'MinimumNumberOfWorkers', nlabs, 'MaximumNumberOfWorkers', nlabs);
task = createTask(job, @doJobs,1,{});
submit(job)
waitForState(job, 'finished')
if ~isempty(task.ErrorMessage)
    disp(task.ErrorMessage)
else
    y = getAllOutputArguments(job)
end
%%%%%%%%%%%%%%%%%%%%%%%
%%-----doJobs.m------%%
%%%%%%%%%%%%%%%%%%%%%%%
function x = doJobs
dirlist=dir('/scratch/harry/SpatialWM/delay/');
for idxd = 1:length(dirlist)
    % grab current target neuron upper and lower range from directory name
    [i j]=strread(dirlist.name(idxd), '%s %s', 'delimiter', '-');
    % Set slice time to analyze
    sliceTime=2500;
    filedir=['/scratch/harry/SpatialWM/delay/' dirlist.name(idxd)];
    filelist = dir([filedir '*.dat']);
    parfor idxf = 1:length(filelist)
        PrevTargAnalyze(['/scratch/harry/SpatialWM/delay/' dirlist(idxd).name '/' filelist(idxf).name],i,j,sliceTime);
    end
end
CP, 12 October 2011
OK, I managed to fix everything and got this to work. The only remaining issue is that it is limited to 16 workers per job (a server setting). How can I split the work into several jobs using this method?
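One possible way around a per-job worker limit, offered only as a sketch since nothing in this thread confirms it, is to split the list of directories into contiguous chunks and submit one matlabpool job per chunk. The partitioning step is plain MATLAB and is shown below. The names nItems, nJobs, edges, and chunks are hypothetical, and doJobs would need to be changed to accept a list of indices, which is an assumption not covered above.

```matlab
% Hypothetical sizes: 10 directories to process, split across 3 jobs.
nItems = 10;
nJobs = 3;

% Boundaries that divide 1:nItems into nJobs nearly equal, contiguous ranges.
edges = round(linspace(0, nItems, nJobs + 1));

% chunks{k} holds the directory indices that job k would process.
chunks = cell(1, nJobs);
for k = 1:nJobs
    chunks{k} = (edges(k) + 1):edges(k + 1);
end

% Each chunk could then be passed as the input argument of its own job,
% assuming doJobs is modified to take a list of indices (an assumption).
```

Every index from 1 to nItems lands in exactly one chunk, so no directory is skipped or processed twice.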


More Answers (1)

Walter Roberson, 10 October 2011
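You have dirlist.name(idxd), but that should be dirlist(idxd).name. dirlist is a struct array, so you have to index the array first and then read the field of that one element.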
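To illustrate the difference, here is a minimal sketch that uses a hand-built struct array in place of the real output of dir. The directory names in it are hypothetical. It shows the working form, what dirlist.name expands to on its own, and that the original form raises an error (the exact wording of the message depends on the MATLAB release).

```matlab
% Hypothetical struct array standing in for the output of dir().
dirlist = struct('name', {'10-20', '30-40', '50-60'});
idxd = 2;

% Correct: index the array first, then read the field of that one element.
thisName = dirlist(idxd).name;

% On its own, dirlist.name expands to a comma-separated list,
% one value per element; braces collect the values into a cell array.
allNames = {dirlist.name};

% Indexing that list directly, as in dirlist.name(idxd), raises an error.
try
    bad = dirlist.name(idxd); %#ok<NASGU>
    failed = false;
catch err
    failed = true;
    disp(err.message);
end
```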
Comments (2)
CP, 10 October 2011
Thanks. I then got the message 'Output argument "x" not assigned', and fixed that by changing function x = doJobs to function doJobs(). Now I'm getting "Too many output arguments." This is extremely hard to debug without line numbers. Is there some way I can retrieve them? I don't want to spam the forum with every trivial error.
CP, 10 October 2011
I also tried reverting to:
function x = doJobs
and adding an output to the PrevTargAnalyze function as well, and then doing x = PrevTargAnalyze inside the parfor, and it still complains that x is not assigned =/
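A possible explanation, stated as an assumption rather than a diagnosis: a plain x = ... inside a parfor body makes x a temporary variable that does not survive the loop, and if the file list is empty the loop body never runs at all. The sketch below preallocates the output and writes one element per iteration. analyzeFile and names are hypothetical stand-ins for PrevTargAnalyze and the file list returned by dir.

```matlab
% Hypothetical stand-in for PrevTargAnalyze; returns one value per file.
analyzeFile = @(fname) numel(fname);

% Hypothetical file names; on the cluster these would come from dir().
names = {'a.dat', 'bb.dat', 'ccc.dat'};

% Preallocate so x exists even if the list is empty and the loop never runs.
x = zeros(1, numel(names));
parfor k = 1:numel(names)
    % Sliced assignment: each iteration writes its own element of x.
    x(k) = analyzeFile(names{k});
end
```

With the output assigned this way, a task created with one output argument has a value to return. If the function is declared without outputs instead, the task would need to request zero outputs.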
