Share Code with Workers
When you submit a job, the software evaluates the tasks of the job on different machines. Each machine must have access to all the files it needs to evaluate its tasks. The following sections explain the basic mechanisms for sharing code with the workers.
Note
For an example that shows how to share code with workers using batch, see Run Batch Job and Access Files from Workers.
Workers Access Files Directly
If the workers all have access to the same drives on the network, they can access the necessary files that reside on these shared resources. This is the preferred method for sharing data, as it minimizes network traffic.
You must define each worker session's search path so that it looks for files in the right places. You can define the path:
By using the job's AdditionalPaths property. This is the preferred method for setting the path, because it is specific to the job.

AdditionalPaths identifies folders to be added to the top of the command search path of worker sessions for this job. If you also specify AttachedFiles, the AttachedFiles are above AdditionalPaths on the workers' path.

When you specify AdditionalPaths at the time of creating a job, the settings are combined with those specified in the applicable cluster profile. Setting AdditionalPaths on a job object after it is created does not combine the new setting with the profile settings, but overwrites existing settings for that job.

AdditionalPaths is empty by default. For a mixed-platform environment, the character vectors can specify both UNIX® and Microsoft® Windows® style paths; those settings that are not appropriate or not found for a particular machine generate warnings and are ignored.

This example sets the MATLAB® worker path in a mixed-platform environment to use functions in both the central repository /central/funcs and the department archive /dept1/funcs, which each also have a Windows UNC path.

    c = parcluster(); % Use default
    job1 = createJob(c);
    ap = {'/central/funcs','/dept1/funcs', ...
          '\\OurDomain\central\funcs','\\OurDomain\dept1\funcs'};
    job1.AdditionalPaths = ap;
By putting the path command in any of the appropriate startup files for the worker:

    matlabroot\toolbox\local\startup.m
    matlabroot\toolbox\parallel\user\jobStartup.m
    matlabroot\toolbox\parallel\user\taskStartup.m
Access to these files can be passed to the worker by the job's AttachedFiles or AdditionalPaths property. Otherwise, the version of each of these files that is used is the one highest on the worker's path.
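The difference between setting AdditionalPaths when the job is created and setting it afterward can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes the default cluster profile, reuses the folder names from the example above, and assumes the property value can be normalized with cellstr.

```matlab
c = parcluster();   % default cluster profile

% Set at creation: combined with any AdditionalPaths in the profile.
job1 = createJob(c, 'AdditionalPaths', {'/central/funcs'});

% Set after creation: assignment overwrites the job's current value,
% so append to the existing value explicitly if you want to keep it.
existing = cellstr(job1.AdditionalPaths);
job1.AdditionalPaths = [reshape(existing, 1, []), {'/dept1/funcs'}];

disp(job1.AdditionalPaths)
```

Appending to the existing value preserves whatever the profile contributed at creation time, which a plain assignment would discard.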
Access to files among shared resources can depend upon permissions based on the user name. You can set the user name with which the MATLAB Job Scheduler and worker services of MATLAB Parallel Server™ software run by setting the MJSUSER value in the mjs_def file before starting the services. For Microsoft Windows operating systems, there is also MJSPASS for providing the account password for the specified user. For an explanation of service default settings and the mjs_def file, see Modify Script Defaults (MATLAB Parallel Server) in the MATLAB Parallel Server System Administrator's Guide.
Pass Data to and from Worker Sessions
A number of properties on task and job objects are designed for passing code or data from client to scheduler to worker, and back. This information could include MATLAB code necessary for task evaluation, or the input data for processing or output data resulting from task evaluation. The following properties facilitate this communication:
InputArguments: This property of each task contains the input data you specified when creating the task. This data gets passed into the function when the worker performs its evaluation.

OutputArguments: This property of each task contains the results of the function's evaluation.

JobData: This property of the job object contains data that gets sent to every worker that evaluates tasks for that job. This property works efficiently because the data is passed to a worker only once per job, saving time if that worker is evaluating more than one task for the job. (Note: Do not confuse this property with the UserData property on any objects in the MATLAB client. Information in UserData is available only in the client, and is not available to the scheduler or workers.)

AttachedFiles: This property of the job object is a cell array in which you manually specify all the folders and files that get sent to the workers. On the worker, the files are installed and the entries specified in the property are added to the search path of the worker session.

AttachedFiles contains a list of folders and files that the workers need to access for evaluating a job's tasks. The value of the property (empty by default) is defined in the cluster profile or in the client session. You set the value for the property as a cell array of character vectors. Each character vector is an absolute or relative pathname to a folder or file. (Note: If these files or folders change while they are being transferred, or if any of the folders are empty, a failure or error can result. If you specify a pathname that does not exist, an error is generated.)

The first time a worker evaluates a task for a particular job, the scheduler passes to the worker the files and folders in the AttachedFiles property. On the worker machine, a folder structure is created that is exactly the same as that accessed on the client machine where the property was set. Those entries listed in the property value are added to the top of the command search path in the worker session. (Subfolders of the entries are not added to the path, even though they are included in the folder structure.) To find out where the files are placed on the worker machine, use the function getAttachedFilesFolder in code that runs on the worker.

When the worker runs subsequent tasks for the same job, it uses the folder structure already set up by the job's AttachedFiles property for the first task it ran for that job.

When you specify AttachedFiles at the time of creating a job, the settings are combined with those specified in the applicable profile. Setting AttachedFiles on a job object after it is created does not combine the new setting with the profile settings, but overwrites the existing settings for that job.

The transfer of AttachedFiles occurs for each worker running a task for that particular job on a machine, regardless of how many workers run on that machine. Normally, the attached files are deleted from the worker machine when the job is completed, or when the next job begins.

AutoAttachFiles: This property of the job object uses a logical value to specify that you want MATLAB to perform an analysis on the task functions in the job and on manually attached files to determine which code files are necessary for the workers, and to automatically send those files to the workers. You can set this property value in a cluster profile using the Profile Manager, or you can set it programmatically on a job object at the command line.

    c = parcluster();
    j = createJob(c);
    j.AutoAttachFiles = true;

The supported code file formats for automatic attachment are MATLAB files (.m extension), P-code files (.p), and MEX-files (.mex). Note that AutoAttachFiles does not include data files for your job; use the AttachedFiles property to explicitly transfer these files to the workers.

Use listAutoAttachedFiles to get a listing of the code files that are automatically attached to a job.

If the AutoAttachFiles setting is true for the cluster profile used when starting a parallel pool, MATLAB performs an analysis on spmd blocks, parfor-loops, and other attached files to determine what other code files are necessary for execution, then automatically attaches those files to the parallel pool so that the code is available to the workers.
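The round trip through InputArguments, OutputArguments, and JobData can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the default cluster profile, the built-in rand as the task function, and a hypothetical JobData payload that rand itself does not read (a custom task function could retrieve it on the worker through getCurrentJob).

```matlab
c = parcluster();                 % default cluster profile
j = createJob(c);
j.JobData = struct('scale', 10);  % hypothetical payload, sent once per worker per job

% The cell array passed to createTask becomes the task's InputArguments.
t1 = createTask(j, @rand, 1, {3, 3});
t2 = createTask(j, @rand, 1, {2, 4});

submit(j);
wait(j);

results = fetchOutputs(j);        % one row per task, one column per output
disp(size(results{1}))            % size of the first task's result
disp(t2.InputArguments)           % the inputs stored on the second task
disp(size(t2.OutputArguments{1})) % size of the second task's result

delete(j);                        % remove the job from the cluster's storage
```

fetchOutputs gathers the OutputArguments of every task in one call, which is usually more convenient than reading each task object in turn.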
Note
There is a default maximum amount of data that can be sent in a single call for setting properties. This limit applies to the OutputArguments property as well as to data passed into a job as input arguments or AttachedFiles. If the limit is exceeded, you get an error message. For more information about this data transfer size limit, see Attached Files Size Limitations.
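As a sketch of how AttachedFiles and getAttachedFilesFolder fit together, the following example attaches a small data file and asks a worker where the file was placed. The file name demoData.mat and its contents are hypothetical, and the default cluster profile is assumed.

```matlab
% Create a small data file on the client (hypothetical name and contents).
dataFile = fullfile(tempdir, 'demoData.mat');
x = 1:5;
save(dataFile, 'x');

c = parcluster();                 % default cluster profile

% Data files are not attached automatically, so list them explicitly.
% Setting the property at creation combines it with the profile settings.
j = createJob(c, 'AttachedFiles', {dataFile});

% Run getAttachedFilesFolder on the worker to report where files landed.
createTask(j, @getAttachedFilesFolder, 1, {});

submit(j);
wait(j);
out = fetchOutputs(j);
disp(out{1})                      % folder on the worker machine

delete(j);
```

Because the reported folder is on the worker machine, it generally differs from the client-side location of the file.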
Pass MATLAB Code for Startup and Finish
As a session of MATLAB, a worker session executes its startup.m file each time it starts. You can place the startup.m file in any folder on the worker's MATLAB search path, such as toolbox/parallel/user.
These additional files can initialize and clean up a worker session as it begins or completes evaluations of tasks for a job:
jobStartup.m automatically executes on a worker when the worker runs its first task of a job.

taskStartup.m automatically executes on a worker each time the worker begins evaluation of a task.

poolStartup.m automatically executes on a worker each time the worker is included in a newly started parallel pool.

taskFinish.m automatically executes on a worker each time the worker completes evaluation of a task.
Empty versions of these files are provided in the folder:
matlabroot/toolbox/parallel/user
You can edit these files to include whatever MATLAB code you want the worker to execute at the indicated times.
Alternatively, you can create your own versions of these files and pass them to the job as part of the AttachedFiles property, or include the path names to their locations in the AdditionalPaths property.
The worker gives precedence to the versions provided in the AttachedFiles property, then to those pointed to in the AdditionalPaths property. If any of these files is not included in these properties, the worker uses the version of the file in the toolbox/parallel/user folder of the worker's MATLAB installation.
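A custom version of one of these files might look like the following sketch. It assumes that taskStartup receives the task object as its single input, as in the empty version shipped in toolbox/parallel/user, and the message it prints is purely illustrative.

```matlab
function taskStartup(task)
% Hypothetical custom taskStartup.m: runs on a worker before each task.
% Assumes the file reaches the worker through AttachedFiles or
% AdditionalPaths, so that it takes precedence over the default version.
    fprintf('Starting task %d\n', task.ID);
end
```

Saving this as taskStartup.m and listing that file in the job's AttachedFiles property is one way to make the worker use it instead of the version in its own installation.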
See Also
jobStartup | taskStartup | poolStartup | taskFinish