Create and Manage Clusters

Create a Cloud Cluster

If you are new to cloud clusters, see Getting Started with Cloud Center.

  1. In Cloud Center, click Create a Cluster.

  2. On the Create Cluster page, specify your cluster options.

    Tip

    Specify a cluster name and click Create Cluster to try a default cluster. Cloud Center prompts you if you need to create a new SSH key. You might want to configure other cluster settings, such as cluster size, machine types and storage options. For example, for deep learning, choose a Machine type with GPUs such as the P2 or G3 instances.

    Option | Description
    Give this cluster a name | Specify a name.
    MATLAB Version | Select the same version as your local desktop client MATLAB®.
    Automatically terminate cluster | Select a timeout for the cluster so that it shuts down automatically.
    Cluster Log Level | Change the cluster log level. If you need to diagnose cluster issues with support engineers, increase the log level for more detail. Log levels above Medium can negatively impact performance.
    Location & Network

    Select the Region where your cluster will run. Consider your location and connectivity. Select a Network that meets the requirements for Connecting a Desktop Computer (Client Machine) to MATLAB Parallel Server Running on the Amazon EC2 Cloud. You can only use the Amazon Virtual Private Cloud (VPC) network type with Cloud Center. For more information, see Configure AWS VPC for Cloud Center.

    Use a dedicated headnode | Enabled (default) - Add a headnode instance that only runs management services (for example, the job manager), and does not host any MATLAB workers. Cloud Center uses the instance type shown in Headnode Machine Type (read-only). This mode improves performance. For details, see Use a Dedicated Headnode Instance for Management Services.

    Disabled - The headnode shares the job manager and workers. This mode minimizes machine cost, but can reduce performance. For details, see Use a Shared Instance for Management Services.

    Worker Machine Type

    Choose an instance that suits your application. Types vary by hardware specification, including number of cores, memory, and GPU support. For details, see Choose Supported EC2 Instance Machine Types.

    Tip

    For deep learning, choose a machine type with GPUs such as the P2 or G3 instances. P2s have GPUs with high performance for general computation. G3s have GPUs with high single-precision performance for deep learning, image processing, and computer vision.

    Workers per Machine | The maximum number of workers per machine depends on your selected Worker Machine Type, and it corresponds to the number of physical CPU cores.
    Allow cluster to auto-resize | Enabled - The number of machines in your cluster shrinks or grows depending on the amount of work submitted to the cluster. Set Workers in Cluster to the maximum number of workers you want in the cluster. You must use a dedicated headnode to enable the auto-resize feature. For more information, see Resize Clusters Automatically.

    Disabled (default) - The number of machines in your cluster remains fixed, as determined by the number of workers set in Workers in Cluster.

    Workers in Cluster

    Choose the number of workers, using the Upper Limit menu. If you select a number greater than the Workers per Machine, you see the Machines in Cluster information update to show more than one machine. Cloud Center supports a maximum of 1024 workers per cluster.

    The Initial Count field shows the number of workers your cluster will start with. If Allow cluster to auto-resize is disabled, the Initial Count field matches your Upper Limit selection.

    If Allow cluster to auto-resize is enabled, the Upper Limit menu sets the maximum number of workers for your cluster, in increments of Workers per Machine. The Initial Count field is zero. Your cluster starts with zero workers and can resize up to the maximum number of workers. For more information, see Resize Clusters Automatically.

    Cluster Shared Storage, Local Machine Storage | If you have data stored in an Amazon S3 bucket, then you can use datastores in MATLAB to directly access the data without needing any storage on the cluster. For details, see Transfer Data To Amazon S3 Buckets. You can also select the cluster and local storage options when creating your cluster. For details, see Cluster File System and Storage.
    SSH key

    If you do not have a key, Cloud Center prompts you to create one. AWS requires an SSH key to start EC2 instances. Click create a new key. In the dialog box, enter a name and click Download Key. Your browser might require you to identify a location. You get a root access key file with the extension .pem. Store this file in a safe place, because you cannot download it again. However, you can always create a new key and download its key file. You can specify the same SSH key for multiple clusters.

    If you want to log in as root to your cloud cluster machines, you need the SSH key. Cluster machines have no password, so you use a key to log in using SSH. Cloud Center also provides a nonroot user access key file, which is unique to each cluster. For details on the user access key file, see Download SSH Key Identity File.

    If you have existing keys, select from the keys for the specified region of your AWS account, or create a new key. If you do not make a selection, Cloud Center uses the previously selected key or the first key listed alphabetically in the AWS account.

    Operating System Image (AMI) | If you have created a custom AMI, you can select it. See Create a Custom Amazon Machine Image (AMI).
  3. Click Create Cluster to create and start your cluster machines. The cluster starts a number of machines (instances) determined by your choices of number of workers and workers per machine. Cloud Center displays the cluster status Starting, and indicates the interim status of all the cluster machines.

    It can take several minutes for a cluster to start up completely. The status indicates the stages of the process. To get the status of an individual cluster machine, click the Headnode or Worker expanders.

    When the cluster is started and ready for use, Cloud Center displays the cluster status as Online.

Tip

For next steps using your new cluster, discover the cluster from MATLAB. See Access Cloud Cluster from MATLAB.

This figure shows an example of Create Cluster settings.

This figure shows a typical cluster status after starting a standard 18-worker cluster with a 2-hour time limit.

If the cluster fails to start completely, its status indicates the failure. For information on the failure, click the appropriate Headnode or Worker expander to read the respective log. Often you can shut down your failed cluster and attempt to start it again.

Discover Clusters on Local Machine

To access a cluster created in your account, use Discover Clusters from MATLAB. See Access Cloud Cluster from MATLAB.

Alternatively, it can be useful to download and share the cluster profile. When your cloud cluster is starting or online, click MATLAB Cluster Profile to save a cluster profile from Cloud Center onto your local machine, allowing you to access that cluster from MATLAB and the Cluster Profile Manager. Save the profile in a folder accessible from your client MATLAB. Any user who imports this cluster profile can access the cluster. See Import Cluster Profiles.

View, Edit, Start, or Stop Clusters

To view, edit, start up, or shut down your clusters, click My Clusters to see a list of your clusters. You can have more than one cluster, some running (online) and some shut down (offline). The following listing shows a pair of clusters, one currently online and ready, the other offline:

To stop a cluster, click Shut Down in the Actions column. Shutting down a cluster does not remove it from your list. You can start the cluster again at a later time. To permanently remove a cluster, click Delete. You can shut down a cluster during its startup if it fails to start, takes too long, or you change your mind.

For detailed information about a particular cluster, click its name in the list.

If the cluster is online, you can select Edit to change the Cluster Timeout and Cluster Log Level.

If the cluster is offline, you can select Edit to change the following cluster characteristics:

  • Cluster Timeout

  • Cluster Log Level

  • Use a dedicated headnode

  • Worker Machine Type

  • Allow cluster to auto-resize

  • Workers in Cluster

  • Workers per Machine

  • Local Storage Volume Size

  • EBS Snapshot ID

  • SSH Key

Click Update to save your changes.

Cluster File System and Storage

Tip

If you have data stored in an Amazon S3 bucket, then you can use datastores in MATLAB to directly access the data without needing any storage on the cluster. For details, see Transfer Data To Amazon S3 Buckets. You can also select the following storage options when creating your cluster.

  • Cluster Shared Storage

    • Persisted Storage. To request shared disk space that remains after you shut down the cluster, select a disk size. The shared storage is mounted at /shared/persisted. For details, see the table below.

    • Amazon S3 Data. To transfer individual files from an Amazon S3 bucket to the cluster machines, click Add Files. You can specify S3 files only when creating your cluster and starting it for the first time. When the cluster starts up, files are copied to /shared/imported. See Copy Data from Amazon S3 Account to Your Cluster.

  • Local Machine Storage

    • Volume Size: To request an Amazon EBS Volume, enter a number of GB in the box, e.g. 100. This requests a local SSD, created on each worker machine of your cluster. The SSD is mounted at /mnt/localdata. Use this option when read/write performance is critical.

    • EBS Snapshot ID: If you previously saved an EBS snapshot of your data on Amazon, then enter the ID. The data is copied to the SSD volume attached to each worker machine. If you provide a formatted snapshot, then the file system type must be ext3, ext4, or xfs. For ext3 and ext4, the full volume size of the file system might not be immediately available when the instance comes online. Growing the file system to full capacity can take up to 30 minutes after the instance is online, depending on the size of the extN volume and the instance type. You can access all data in the original snapshot as soon as the cluster is online.

After selecting your storage options, click Create Cluster. For details on other cluster settings, see Create a Cloud Cluster.

All worker machines have access to local and cluster shared storage. You can use these folders for storing data generated by your jobs, and for data you want to transfer between the cluster and your client location. See Transfer Data to or from a Cloud Cluster. The paths are the same for all worker machines of the cluster. Changes to files and folders under /mnt/localdata are not visible to other machines. Files and folders under the /shared mount point are shared by all worker machines of your cluster. Changes made by any machine are visible to all other machines. Each folder has different longevity, as shown in the table.

Location | Size | Usage
/mnt/localdata | Specified in cluster configuration | The location of the local machine storage volume. Each worker machine gets its own copy of the data. Temporary and intermediate data can also be written to this location. Deleted when cluster is stopped. The data is not retained between cluster runs. If you have specified an EBS snapshot, then the data is copied again when the cluster is started.
/shared/persisted | Specified at cluster creation | The location of the cluster shared persisted storage and MATLAB Job Scheduler data. This folder is shared among worker machines and is retained between cluster runs. Save data you want to retrieve on the next start of the cluster in folders and files under /shared/persisted. Deleted when cluster is deleted.
/shared/tmp | Varies with instance type | This folder is shared among worker machines and is not retained between cluster runs. Use it to store temporary and intermediate data that must be visible or accessible from multiple cluster machines. The available storage space depends on the ephemeral storage available on the selected machine instance type. Deleted when cluster is stopped.
/shared/imported | Part of allocation for /shared/tmp | The location of the cluster shared Amazon S3 data. Selected Amazon S3 objects are copied to this location as part of the cluster start up and are deleted on shut down. Deleted when cluster is stopped; copied again when cluster is started.

Note:

  • To use /shared/tmp or /shared/imported, you must disable the dedicated headnode mode (see Use a Dedicated Headnode Instance for Management Services), and use an instance with ephemeral storage. Consult the table in Choose Supported EC2 Instance Machine Types to find out which instances have ephemeral storage.

  • Cloud cluster machines can share these folders only with machines of the same cluster; that is, there is no file sharing between different clusters.

  • You create, start, stop, and delete your cloud clusters independent of your local MATLAB session. Deleting an associated cluster object in MATLAB does not affect the cloud cluster or its persistent storage.

  • When a cluster times out, it shuts down and clears the contents of /shared/tmp, /shared/imported and /mnt/localdata, but preserves the content of /shared/persisted. If you use an automatic shutdown setting for your cluster, ensure that you have all data you need from /shared/tmp and /mnt/localdata before this timeout occurs.

  • The contents of /shared/tmp are built using ephemeral storage.

Choose Supported EC2 Instance Machine Types

On the Create Cluster page, under Machine Configuration, select a Worker Machine Type from the list. You can also edit the instance type on existing clusters. Amazon EC2 provides instance types with various combinations of CPU, GPU, memory, network performance, and storage. Choose an instance that suits your application.

Tip

For deep learning, choose an instance with GPUs such as the P2, P3, or G3 instances. The different machine types provide a range of performance at different prices. For example, P2s have K80 GPUs and P3s have V100 GPUs, both with high performance for general computation including deep learning. G3s have M60 GPUs with high single-precision performance well suited for deep learning, image processing and computer vision.

Cloud Center supports the following instance types:

Instance Type | Physical CPU cores - Max workers per machine | GPUs | Memory (GB) | Network Performance (Gigabit) | Ephemeral Storage (GB)

Compute Optimized

c4.xlarge | 2 | 0 | 7.5 | High | 0
c4.2xlarge | 4 | 0 | 15 | High | 0
c4.4xlarge | 8 | 0 | 30 | High | 0
c4.8xlarge | 18 | 0 | 60 | 10 | 0
c3.8xlarge | 16 | 0 | 60 | 10 | 640

GPU

p3.2xlarge | 4 | 4 | 61 | Up to 10 Gigabit | 0
p3.8xlarge | 16 | 4 | 244 | 10 Gbps | 0
p3.16xlarge | 32 | 8 | 488 | 25 Gbps | 0
g3.4xlarge | 8 | 1 | 122 | Up to 10 Gigabit | 0
g3.8xlarge | 16 | 2 | 244 | 10 | 0
g3.16xlarge | 32 | 4 | 488 | 20 | 0
p2.xlarge | 2 | 1 | 61 | High | 0
p2.8xlarge | 16 | 8 | 488 | 10 | 0
p2.16xlarge | 32 | 16 | 732 | 10 | 0
g2.2xlarge | 4 | 1 | 15 | High | 60 (SSD)
g2.8xlarge | 16 | 4 | 60 | 10 | 240 (SSD)

Memory Optimized

r4.xlarge | 2 | 0 | 30.5 | Up to 10 | 0
r4.2xlarge | 4 | 0 | 61 | Up to 10 | 0
r4.4xlarge | 8 | 0 | 122 | Up to 10 | 0
r4.8xlarge | 16 | 0 | 244 | 10 | 0
r4.16xlarge | 32 | 0 | 488 | 20 | 0
r3.xlarge | 2 | 0 | 30.5 | Moderate | 80 (SSD)
r3.2xlarge | 4 | 0 | 61 | High | 160 (SSD)
r3.4xlarge | 8 | 0 | 122 | High | 320 (SSD)
r3.8xlarge | 16 | 0 | 244 | 10 | 640 (SSD)

Storage Optimized

i3.xlarge | 2 | 0 | 30.5 | Up to 10 | 950 (NVMe SSD), /mnt/localdata0 (950)
i3.2xlarge | 4 | 0 | 61 | Up to 10 | 1900 (NVMe SSD), /mnt/localdata0 (1900)
i3.4xlarge | 8 | 0 | 122 | Up to 10 | 3800 (NVMe SSD), /mnt/localdata[0-1] (1900 x 2)
i3.8xlarge | 16 | 0 | 244 | 10 | 7600 (NVMe SSD), /mnt/localdata[0-3] (1900 x 4)
i3.16xlarge | 32 | 0 | 488 | 20 | 15200 (NVMe SSD), /mnt/localdata[0-7] (1900 x 8)

For details on other cluster settings, see Create a Cloud Cluster.

For more details on newly added compute optimized instances and regional availability, see the Amazon Web Services web site: Amazon EC2 Instance Types. Note that Amazon Web Services describes instances in terms of vCPUs, where v means virtual core (or logical core). Each physical core has two virtual cores. For example, a c3.8xlarge has 16 physical CPU cores, which corresponds to 32 vCPUs.
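The vCPU relationship described above can be sketched in a few lines of Python. This is only an illustration, assuming two virtual cores per physical core and one worker per physical core as stated in this section. The function name is hypothetical and is not part of any Cloud Center or AWS API.

```python
def max_workers_from_vcpus(vcpus):
    """Return the physical core count implied by an AWS vCPU count.

    Assumption (from this section): each physical core has two virtual
    cores, and at most one worker runs per physical core.
    """
    if vcpus < 2 or vcpus % 2 != 0:
        raise ValueError("expected a positive, even vCPU count")
    return vcpus // 2


# A c3.8xlarge is described by AWS as having 32 vCPUs.
print(max_workers_from_vcpus(32))  # 16
```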

Note

  • Cloud Center only supports Linux on-demand instances.

  • c4.8xlarge is the suggested default instance.

  • Not all instances are available in all regions.

  • Cloud Center currently supports the following regions:

    • US East (N. Virginia)

    • EU West (Ireland)

    • AP Northeast (Tokyo)

  • Cloud Center supports reserved instances in addition to on-demand. Cloud Center does not support dedicated or spot instances.

  • Cloud Center supports at most one worker per physical core. Although Amazon Web Services machines can have many virtual cores, Cloud Center restricts use to at most one worker per physical core for optimal performance. Each physical core has two virtual cores with a shared Floating Point Unit. Most MATLAB computations use this unit because they are double-precision floating point. Restricting to one worker per physical core ensures that each worker has exclusive access to a Floating Point Unit and optimizes performance.

To use reserved instances with Cloud Center, you need to purchase reserved instances with the following Cloud Center supported attributes:

  • Instance type: one of the machine types supported by Cloud Center, listed in the table in this section: Choose Supported EC2 Instance Machine Types.

  • Platform description: Linux.

  • Tenancy: default.

  • Region: one of the regions supported by Cloud Center (US East (N. Virginia), EU West (Ireland), AP Northeast (Tokyo)).

  • Availability Zone: an Availability Zone within the selected region that matches the Availability Zone of the selected subnet.

Amazon enforces service resource limits on a per-region basis. To view Amazon EC2 and VPC limits for your account in the Amazon EC2 console, see Amazon EC2 Service Limits.

For pricing and billing information, see the Amazon web site: Amazon EC2 Pricing.

Use a Dedicated Headnode Instance for Management Services

In dedicated headnode mode, any management services (for example, the job manager or the file server) run on a separate instance called the headnode. The headnode does not host any workers. This approach is useful when workers run computations that use many system resources, such as memory, processor, network, or local storage. Using many system resources can negatively impact the performance of the job manager. In that case, if the job manager becomes unresponsive, then the MATLAB client can lose communication with the cluster.

In dedicated headnode mode, the job manager is optimized for the instance types that Cloud Center allows you to choose. A dedicated headnode adds an additional instance to your cluster, and so adds to your overall cost.

In dedicated headnode mode, the cluster machines are in the same availability zone, which improves the communication speed between the headnode and the workers. For instructions on enabling or disabling the dedicated headnode, see Create a Cloud Cluster.

Use a Shared Instance for Management Services

If you disable the dedicated headnode mode, any management services (for example, the job manager) run on one instance in your cluster, along with workers. Use this mode if you want to reduce the number of machines in your cluster by one. This mode uses the same machine type for the headnode and the workers.

For instructions on enabling or disabling the dedicated headnode, see Create a Cloud Cluster.

Resize Clusters Automatically

Your cluster can resize automatically based on the amount of work submitted to the cluster. You must use a dedicated headnode to use automatic resizing.

To enable automatic resizing, select the option Allow cluster to auto-resize on the Create Cluster page. Specify the maximum number of workers that you require in the cluster using the Upper Limit menu next to Workers in Cluster. To ensure that all machines are started with the same number of workers, the Upper Limit menu options are multiples of the Workers per Machine value.

Based on your Upper Limit selection and the Workers per Machine value, the Machines in Cluster field displays the maximum number of machines for your cluster, including the headnode machine. Your cluster will not auto-resize above the number of machines displayed in the Machines in Cluster field or the number of workers specified by Workers in Cluster. You can use this to set an upper limit of the costs you are willing to pay for the cluster.
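As a rough sketch of the arithmetic behind the Machines in Cluster field, the following Python snippet computes the machine count from the Upper Limit and Workers per Machine values. It assumes a dedicated headnode (required for auto-resize) and that the Upper Limit is a multiple of Workers per Machine. The function name is hypothetical, and this is not how Cloud Center itself is implemented.

```python
import math


def max_machines_in_cluster(upper_limit, workers_per_machine):
    """Estimate the maximum machine count, including the dedicated headnode."""
    if upper_limit < 0 or workers_per_machine < 1:
        raise ValueError("invalid worker counts")
    # Worker machines needed to host the requested number of workers.
    worker_machines = math.ceil(upper_limit / workers_per_machine)
    # One extra machine for the dedicated headnode, which hosts no workers.
    return worker_machines + 1


print(max_machines_in_cluster(16, 4))  # 5
```

Multiplying the result by the hourly instance price gives a simple upper bound on what the cluster can cost per hour.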

Note

Set the Workers in Cluster field to provide a maximum cluster size that you are prepared to pay for.

You can view the maximum number of workers and current requested number of workers on the Cluster Summary page in Cloud Center. You can also view these properties from your cluster object in MATLAB, using the properties MaxNumWorkers and NumWorkersRequested. For more information, see parallel.Cluster.

Tip

To avoid your cluster shutting down when all workers become idle and no jobs are in the queue, set the termination policy of your cluster to After a fixed time period or Never. Your cluster will remain online with only the dedicated headnode machine until more jobs are submitted or the cluster times out.

Cluster Growing and Shrinking

Your cluster starts with the dedicated headnode machine and zero workers. When you submit jobs to the cluster, the cluster will grow to accommodate the next queued job by adding machines, up to the maximum number set when you created the cluster. The cluster continues to grow until it either runs out of queued jobs or the upper limit for the number of workers prevents it from growing to accommodate the next queued job. Workers are added in increments of Workers per Machine.

As workers become available, they can be assigned to the next job in the queue. A queued job is scheduled as soon as there are enough available workers to run the job.

Machines no longer in use are removed from the cluster. If even one worker on a machine is busy, that machine will not be removed until all workers on the machine are idle. Cloud Center removes machines which are idle for at least five minutes, checking for idle workers every five minutes. It can take up to 15 minutes to remove a machine when all workers on that machine become idle. Your cluster can be reduced to zero workers when no jobs are running. In this case, only the headnode remains in the cluster.

Distributing Jobs Across Machines

Workers on a new machine or from a finishing job do not necessarily become available to the cluster at the same time. The cluster schedules jobs to idle workers as soon as the minimum requirements of the job are met. Consequently, as running jobs finish and queued jobs start on the cluster, jobs can be distributed across several machines. In such cases, the cluster might not shrink even though the active workers would fit on fewer machines than the cluster is currently using, because each machine still hosts at least one active worker.

The following example shows one of several ways that the cluster can distribute these jobs among workers. The actual distribution depends on the order in which workers become available on new machines and after jobs finish running. Suppose you create a cluster with a maximum of 16 workers, with four workers per machine. The cluster starts with zero workers. You submit four jobs: one six-worker job, two four-worker jobs, and one five-worker job. The jobs finish in the order they are submitted.

First, the cluster grows to eight workers on two machines to run the first, six-worker job. To run the second job, the cluster needs two additional workers. A third machine is requested to provide the additional workers. The job is assigned to the two free workers on the existing machine and two workers on the new machine. Similarly, two additional workers are required to run the third job, so the cluster requests a fourth machine. Now, 14 out of 16 workers on four machines are in use. There are not enough workers available to run the final, five-worker job. This job remains in the queue while the first three jobs run.

When the first job finishes, the six workers that ran the job become idle. They do not necessarily become idle at the same time. As soon as three additional workers are available, the cluster assigns workers for the final, five-worker job.

When the second job finishes, there are still active workers across all four machines. The cluster cannot shrink, even though there are seven idle workers.

When the third job finishes, all workers on one machine become idle. When they have been idle for over five minutes, that machine can be removed. The cluster shrinks to three machines.

When the fourth and final job is finished, all workers on the three remaining machines become idle. If no further jobs are submitted, the cluster is reduced to zero workers. Only the dedicated headnode machine remains in the cluster.
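The growth phase of the example above can be replayed with a small Python model. This is a simplified sketch, not Cloud Center's scheduler: it assumes jobs are considered strictly in submission order, that no job finishes while the cluster is growing, and that machines are added in whole increments of Workers per Machine. The function and variable names are hypothetical.

```python
import math


def grow_for_jobs(job_sizes, workers_per_machine, max_workers):
    """Return (machines, busy_workers, still_queued) after the growth phase."""
    machines = 0
    busy = 0
    for index, size in enumerate(job_sizes):
        free = machines * workers_per_machine - busy
        shortfall = max(0, size - free)
        extra_machines = math.ceil(shortfall / workers_per_machine)
        # Stop growing if the next queued job would exceed the upper limit.
        if (machines + extra_machines) * workers_per_machine > max_workers:
            return machines, busy, list(job_sizes[index:])
        machines += extra_machines
        busy += size
    return machines, busy, []


# Four jobs (6, 4, 4, and 5 workers), four workers per machine, 16 at most.
print(grow_for_jobs([6, 4, 4, 5], 4, 16))  # (4, 14, [5])
```

The result matches the walkthrough: four machines, 14 of 16 workers in use, and the five-worker job still queued.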

AWS Resource Limits

If you encounter an AWS quota limit error or other resource constraint during the lifetime of the cluster, Cloud Center reduces the maximum number of workers to the number of workers that Cloud Center allocated successfully before encountering the error. Queued jobs that the reduced maximum cluster size cannot support are cancelled and removed from the queue. If you stop and restart the cluster, the restriction is removed and the cluster attempts to grow to the maximum you specified.

AWS resource limits are per region. The limits apply to resources shared across different IAM users. You can find information on your AWS instance limits by selecting the Limits navigation element from the EC2 service page.

Cluster Access and Security Groups

Set Cluster Access

Cluster Access settings control which computers can access your Amazon cluster from the Internet. To open the Cluster Access settings in Cloud Center, click Cluster Access beneath Preferences in the navigator. Cluster access comprises a list of IP ranges for the computers that can access your cloud cluster. These ranges might already be set up for your Amazon Web Services account, or you might have to create or modify them here.

The IP addresses in the listing must be those of the machines as seen from the Internet, which is often different from their local IP addresses. To be sure you get the proper IP address, see your administrator, or use one of the many available websites that can return this information to you.

The format for an access listing is a 4-field IP address, optionally followed by a slash (/) and a value identifying the number of bits of the address to use for matching starting from the left of the address. There are eight bits per field in the IP address. For example, suppose the IP address of your machine is 123.123.234.56. The format to allow only that exact IP address access to your cluster is:

123.123.234.56/32

The /32 indicates 32 bits, which requires matching on all four fields of the address. (If no field matching bits are specified, the default is 32, matching the entire address exactly.)

Matching only part of the address allows a range of IP addresses to access your cluster. This might be useful when accessing the cluster from different client machines on the same network, or if your client machine has an assigned IP address that might change.

For example, if you want to allow other machines from your network to access your cluster if their IP addresses start with 123.123, regardless of what the last two fields are, you could format the address this way:

123.123.0.0/16
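To see how the two example formats above match addresses, you can test them with Python's standard ipaddress module. This is an illustrative sketch only. The is_allowed helper is hypothetical and does not reflect how Cloud Center or AWS security groups evaluate rules internally.

```python
import ipaddress


def is_allowed(client_ip, rules):
    """Return True if client_ip falls inside any of the CIDR rules."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(rule) for rule in rules)


exact_rule = ["123.123.234.56/32"]   # matches all four fields
range_rule = ["123.123.0.0/16"]      # matches only the first two fields

print(is_allowed("123.123.234.56", exact_rule))  # True
print(is_allowed("123.123.234.57", exact_rule))  # False
print(is_allowed("123.123.7.9", range_rule))     # True
print(is_allowed("123.124.0.1", range_rule))     # False
```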

Caution

Make your address formats as strict as possible, using as many fields and bits as you can. Address formats that are too open can increase the risk of unauthorized access to your cluster. A format that uses no bits for matching (e.g., 0.0.0.0/0, or 123.123.234.56/0) allows all machines on the Internet to access your cluster.

The computer from which you are currently accessing Cloud Center is automatically added to the access list.

To add machines to the allowed listing, add the IP address or range in the blank field and click Add. To remove an allowed address, click Remove next to the address in the list. You can have up to eight rules in your list; if you already have eight when you add a rule, the oldest is deleted.
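The eight-rule limit behaves like a fixed-size queue. The following Python sketch models that behavior with a bounded deque, assuming that "oldest" means the rule that was added first. It is only an analogy for the documented behavior, and the example rules are made up.

```python
from collections import deque

MAX_RULES = 8  # limit stated in this section

# A bounded deque silently drops its oldest entry when a new one is appended.
access_rules = deque(maxlen=MAX_RULES)

# Add nine hypothetical rules; the first one is pushed out by the ninth.
for i in range(1, 10):
    access_rules.append(f"10.0.{i}.0/24")

print(len(access_rules))  # 8
print(access_rules[0])    # 10.0.2.0/24
```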

When you start a cluster, Cloud Center creates a security group called cluster-access-<identifier> if it does not already exist. Whether the group already exists or is new, Cloud Center then opens the necessary ports for cluster communications. See also Server Sockets Accessed by Client for information about port usage.

The cluster access rules listed are propagated to security groups when an associated cluster is started. You can also apply updates to the cluster access rules to security groups for online clusters, replacing any previous rule set.

The cluster access list is saved between sessions, and is the same for all clusters that share the same network configuration. Clusters configured for Amazon Virtual Private Cloud (VPC) will share a security group per VPC. The security group associated with your cluster is identified on the Cluster Details page when the cluster is online.

Do not modify the security groups created and managed by Cloud Center. You can manage additional security group rules via a separate security group that you attach once the cluster machines are online.

Server Sockets Accessed by Client

MATLAB Parallel Server™ and Cloud Center require the client to contact servers listening on certain sockets within the cloud. If you limit access to remote ports from your client machines, make sure you allow access to the following remote ports for communication with the cluster resources:

TCP Ports | Usage
443 | Web access to Cloud Center, online licensing, and Amazon Web Services
22 | SSH

Depending on the MATLAB release that you are using with Cloud Center, make sure that you allow access to the following additional remote ports:

  • For MATLAB releases up to R2016b (includes support for up to 32 workers per machine):

    TCP Ports | Usage
    27355 | Access to MATLAB Job Scheduler on head node
    14350–14479 | Parallel pool workers

  • For MATLAB R2017a (includes support for up to 32 workers per machine):

    TCP Ports | Usage
    27355–27453 | Access to MATLAB Job Scheduler on head node

In addition, all ports are open for communication between machines within the same cloud cluster, as defined by rules in your AWS security group.

Security Within Clusters

Users with access to a cluster can perform all supported cluster activity. More specifically, anyone with access to the cluster can see or manipulate all the files, processes, and jobs in the cluster, regardless of ownership. If security is a concern, consider limiting who has access to shared clusters or providing users with their own clusters.

Configure AWS VPC for Cloud Center

This section provides guidelines for configuring your VPC to work with Cloud Center. With EC2-VPC, instances run in a virtual private cloud (VPC) that is logically isolated to only one AWS account. MATLAB Parallel Server for Amazon EC2® supports configurations with the Headnode and workers in the same subnet. You need Public IP addressing and internet access.

VPC and Subnet Configuration

Create a VPC and subnet, if you do not have them already. You can create a simple VPC as follows:

Assign Classless Inter-Domain Routing (CIDR) block sizes that support at least the number of IP addresses required for the maximum number of cluster machines you want to create, plus 5. Amazon reserves the first four IP addresses and the last IP address of every subnet for IP networking purposes. For example, if you want to run 254 cluster machines, your CIDR block must be /23 or larger (that is, a prefix length of 23 or less), because a /24 block leaves only 251 usable addresses. A /23 block allows a maximum of 507 hosts after subtracting the five addresses reserved for Amazon use. The table below illustrates some options. Your network engineering group can also help you determine the CIDR blocks needed for your VPC and subnets.

Maximum number of cluster machines desired    Suitable CIDR block
59                                            10.0.0.0/26
123                                           10.0.0.0/25
251                                           10.0.0.0/24
507                                           10.0.0.0/23
1019                                          10.0.0.0/22
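
The arithmetic behind the table can be reproduced with a short Python sketch. It assumes only what the text states (five addresses reserved per subnet) plus the general AWS limit that subnets range from /28 to /16. The function names are illustrative and are not part of Cloud Center.

```python
import ipaddress

# Amazon reserves the first four addresses and the last address of each subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_hosts(cidr: str) -> int:
    """Addresses left for cluster machines after the AWS reservation."""
    network = ipaddress.ip_network(cidr, strict=True)
    return network.num_addresses - AWS_RESERVED_PER_SUBNET


def smallest_prefix_for(machines: int) -> int:
    """Return the longest IPv4 prefix (smallest subnet) that fits `machines`."""
    # Assumption: AWS subnets can be no smaller than /28 and no larger than /16.
    for prefix in range(28, 15, -1):
        if 2 ** (32 - prefix) - AWS_RESERVED_PER_SUBNET >= machines:
            return prefix
    raise ValueError("too many machines for a single /16 subnet")


if __name__ == "__main__":
    for cidr in ("10.0.0.0/26", "10.0.0.0/25", "10.0.0.0/24",
                 "10.0.0.0/23", "10.0.0.0/22"):
        print(cidr, usable_hosts(cidr))
    print("254 machines need at least a /%d" % smallest_prefix_for(254))
```

For 254 machines the sketch selects /23, which agrees with the example in the paragraph above.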

Set the Enable DNS hostnames option to Yes to ensure that instances receive a DNS hostname. For more information, see Using DNS with Your VPC.

An internet gateway allows your instances to communicate with the internet. An internet gateway should be attached to your VPC.

The route tables control VPC networking. You must define a route to enable traffic destined for an IP address outside the VPC (0.0.0.0/0) to flow from the subnet to the Internet gateway.
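
As a rough illustration of the routing requirement, the following Python sketch checks a list of routes for a default route that targets an internet gateway. It assumes the routes are dictionaries using the `DestinationCidrBlock` and `GatewayId` keys found in EC2 route table descriptions, and that internet gateway IDs start with `igw-`. The sample data and the function name are hypothetical.

```python
DEFAULT_ROUTE = "0.0.0.0/0"


def has_default_internet_route(routes) -> bool:
    """Return True if any route sends 0.0.0.0/0 to an internet gateway."""
    for route in routes:
        destination = route.get("DestinationCidrBlock")
        gateway = route.get("GatewayId", "")
        if destination == DEFAULT_ROUTE and gateway.startswith("igw-"):
            return True
    return False


# Hypothetical route tables: one for a public subnet, one with only the local route.
public_routes = [
    {"DestinationCidrBlock": "10.0.0.0/23", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0123456789abcdef0"},
]
isolated_routes = [
    {"DestinationCidrBlock": "10.0.0.0/23", "GatewayId": "local"},
]

if __name__ == "__main__":
    print(has_default_internet_route(public_routes))
    print(has_default_internet_route(isolated_routes))
```

The check only inspects data you pass in. It does not contact AWS, so you would need to obtain the route list from the console or from your own tooling.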

Connecting a Desktop Computer (Client Machine) to MATLAB Parallel Server Running on the Amazon EC2 Cloud

  • The client machine must be able to make outgoing connections to any DNS name in the mathworks.com and amazonaws.com domains on port 443 (HTTPS), or have a properly configured SSL-capable proxy server that can contact those domains.

  • The client machine must be able to make outgoing connections to the cluster machines in the amazonaws.com domain directly on ports 27355 and 14350 to 14351 + 4*N, where N is the maximum number of workers on a single machine. For example, if there were 8 workers per machine, you should ensure that ports 14350 to 14383 can be contacted.

    Note that the ability to "make outgoing connections" means that the client machine must be able to instantiate a socket to the cluster. At a TCP level, this means that the initial SYN packet for the TCP/IP communication comes from the client. Most NAT and general firewalls allow this type of communication, but if you have more stringent rules, you might need to enable such outgoing communication.

  • You must connect the client and the cluster running in the cloud via "always connected" TCP communications. Should a network device between the client computer and the cluster reset the TCP stream, then any open interactive parpool sessions will be shut down.

  • You must configure Cloud Center Cluster Access to allow connections from your computer's external IP address. In most local networks, policies in place mean that the public Internet address of the computer, as seen from other places on the Internet, differs from the local address. Contact your administrator or visit https://whatismyipaddress.com to determine the public Internet address of your computer.
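
The port formula in the list above (27355, plus 14350 to 14351 + 4*N) can be sketched in Python as follows. The helper names are illustrative. The `can_connect` probe is an assumption about how you might test reachability from a client and is not a MathWorks tool.

```python
import socket

JOB_SCHEDULER_PORT = 27355
FIRST_WORKER_PORT = 14350


def required_client_ports(workers_per_machine: int):
    """Return (scheduler_port, worker_port_range) for N workers per machine."""
    if workers_per_machine < 1:
        raise ValueError("workers_per_machine must be at least 1")
    last_worker_port = 14351 + 4 * workers_per_machine
    # range() excludes its end value, so add 1 to include the last port.
    return JOB_SCHEDULER_PORT, range(FIRST_WORKER_PORT, last_worker_port + 1)


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try an outgoing TCP connection; the client sends the initial SYN."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    scheduler, workers = required_client_ports(8)
    print(scheduler, workers[0], workers[-1])
```

With 8 workers per machine the range ends at port 14383, and with 32 workers it ends at 14479, matching the figures given in this document.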

Troubleshooting

Problems and Symptoms

Potential Causes and Solutions

Cluster startup fails due to timeout and no Cluster Start Messages are visible

  • Verify the network access control list (ACL) associated with the cluster’s subnet allows all required inbound and outbound traffic. For more information regarding Network ACLs, see Amazon VPC User Guide.

  • Verify the cluster’s VPC has an Internet gateway attached to enable cluster instances to communicate with the Internet. For more information on configuring an Internet gateway for your VPC, see Amazon VPC Internet Gateway.

  • Verify that the cluster’s subnet route table enables traffic to be routed to the Internet. In a public subnet, this is ensured via a route to an Internet gateway. For information on configuring VPC route tables, see Amazon VPC Route Table.

  • Verify that cluster instances are able to resolve the Fully Qualified Domain Names (FQDN) of all cluster instances. Ensure that the Dynamic Host Configuration Protocol (DHCP) Options Sets associated with the cluster’s VPC are configured correctly. For more information about configuring DHCP Options Sets, see Amazon VPC DHCP Options.

Client is unable to connect to the cluster

  • Ensure that the connectivity checker completes all tests successfully.

    • Ensure that your network firewall allows outbound HTTP and HTTPS traffic to MathWorks and Amazon Web Services domains.

    • Ensure that your network firewall allows outbound traffic on all required ports.

  • Ensure that the cluster profile validation succeeds. This tool verifies connectivity from the client to the cluster instances.

  • Ensure that your client machine's IP address is registered on the cluster access page in Cloud Center.

  • Verify that the cluster is online.

AWS Identity and Access Management (IAM)

Create New IAM Role

To manage MATLAB Distributed Computing Server clusters in Amazon Web Services (AWS), MathWorks Cloud Center needs access to your AWS resources. You can use an IAM role to establish a trusted relationship between your AWS account and the account belonging to MathWorks Cloud Center. After this relationship is established, the Cloud Center application can obtain temporary security credentials that can then be used to access AWS resources in your account.

Note

AWS GovCloud accounts are not supported in Cloud Center.

To create a role, in Cloud Center, click User Preferences, and follow the on-screen instructions to guide you through the steps.

  1. Click the link in Step 1 to open the Identity and Access Management (IAM) console in a new browser window. Log in to Amazon Web Services (AWS) if prompted. It is easier to complete the steps if you can position both the Cloud Center and AWS console windows to be visible at the same time.

  2. Follow the Cloud Center on-screen instructions.

  3. In the last step, return to the Cloud Center User Preferences window and paste your Role ARN in the Role ARN box.

    Click Save and check that you see your updated AWS account credentials.

Create Custom IAM Access Policy

If you are an intermediate or advanced user of Amazon Web Services, and you are not comfortable granting the AdministratorAccess policy, you can create a custom IAM policy for finer-grained access control.

  1. When you log into Cloud Center, go to the User Preferences page to set up access to your Amazon Web Service (AWS) account. See image under step 11 in the previous section.

  2. On the User Preferences page, you see the MathWorks AWS Account ID and External ID. You will need to copy these IDs in step 10 below.

  3. Log in to the Amazon Web Service (AWS) management console.

  4. Under Security & Identity, click Identity & Access Management to navigate to the IAM dashboard.

  5. In the IAM console, go to the Policies node and select Create Policy. If this is the first time you have worked with IAM policies, select Get Started, and then Create Policy.

  6. In Review Policy, enter a Policy Name and Description (optional). Copy the text below in the Policy Document box:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "cloudformation:*",
            "sns:*",
            "ec2:*",
            "s3:*",
            "sqs:*",
            "iam:*",
            "autoscaling:*"
          ],
          "Resource": "*"
        }
      ]
    } 

    Click Create Policy.

  7. Switch to the Roles page in the left hand navigation pane and click Create New Role.

  8. Enter a Role Name and click Next Step.

  9. On the Select Role Type page, select Role for Cross-Account Access > Allows IAM users from a 3rd party AWS Account to access this account. Click Select > Next Step.

  10. On the Establish Trust page, paste the MathWorks AWS Account ID and the External ID copied from the User Preferences > Add Amazon Web Services Credentials page in Cloud Center. Ensure Require MFA is not selected. Click Next Step.

  11. On the Attach Policy screen, search for the policy you created in step 6. Select this policy and click Next Step.

  12. On the Review screen, you see a summary of the IAM Role you have just created. Copy your Role ARN. You will need this Role ARN in step 14 below. Click Create Role to save your work.

  13. On the page listing IAM Roles in your account, you now see the role you created for MathWorks Cloud Center.

  14. Return to the Cloud Center User Preferences window and paste your Role ARN (copied in step 12) in the Role ARN box. Click Save and check that your AWS account credentials have been updated.
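
Before pasting a policy document into the IAM console, you might want to confirm that it grants every service listed in the policy from this procedure. The following Python sketch is one rough way to do that. The function names are hypothetical, the check assumes `Statement` is a list, and it recognizes only the `service:*` and `*` action forms.

```python
import json

# Services granted by the policy document shown in this procedure.
REQUIRED_SERVICES = ("cloudformation", "sns", "ec2", "s3", "sqs", "iam", "autoscaling")


def build_policy(services=REQUIRED_SERVICES) -> str:
    """Serialize a policy that allows every action of the given services."""
    document = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [f"{service}:*" for service in services],
                "Resource": "*",
            }
        ],
    }
    return json.dumps(document, indent=2)


def missing_services(policy_json: str):
    """List required services that the policy does not fully allow."""
    document = json.loads(policy_json)
    allowed = set()
    for statement in document["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM permits a single string here
            actions = [actions]
        allowed.update(actions)
    if "*" in allowed:
        return []
    return [s for s in REQUIRED_SERVICES if f"{s}:*" not in allowed]


if __name__ == "__main__":
    print(missing_services(build_policy()))
    print(missing_services(build_policy(("ec2", "s3"))))
```

A policy restricted to narrower actions (for example, individual EC2 calls) would be reported as missing that service, so treat the result as a prompt for review and not as a verdict.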

Edit IAM Role

You can update your AWS Credentials and modify your IAM Role settings as follows:

  1. Navigate to the Edit AWS Credentials page in Cloud Center.

  2. Open a new browser window and log into your AWS Console.

  3. Click on Identity & Access Management to enter the IAM Console.

  4. Click on Roles in the left hand navigation pane.

  5. Click the Role Name you want to edit.

  6. On the Trust Relationships tab, you can modify the trusted entities and conditions of the trust relationship. Click the Show policy document link to see the current policy document. Click Edit Trust Relationship to edit the policy document. Insert the correct values for the AWS account ID and ExternalId shown in italics in the policy document template below:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::MathWorks's_AWS_Account_ID:root"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "External_ID"
            }
          }
        }
      ]
    }
    

    Below, you see an example of a policy with both substitutions in place. The AWS account ID shown below is the AWS account MathWorks uses for Cloud Center. The ExternalId value must match the External ID you see on the User Preferences page for AWS credentials in Cloud Center.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::123456789012:root"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "5b7a6de3-9be1-4554-a740-c861f80ff1f"
            }
          }
        }
      ]
    }
    

    Click Update Trust Policy.

  7. Click the Permissions tab to set the permissions granted to users who assume the role. You can attach a custom policy or use the built-in AdministratorAccess managed policy.

  8. Confirm that the settings in your Amazon account match the configuration you have supplied to Cloud Center. Save your changes on the Cloud Center Update AWS Credentials page. See the “Update Amazon Web Services Credentials” figure below.

  9. You are directed to User Preferences and you see a confirmation message.
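
The trust policy template in step 6 can also be filled in programmatically, which avoids hand-editing mistakes in the JSON. The Python sketch below is a minimal illustration. The function name is hypothetical, and the account ID and External ID passed in the example are placeholders, so substitute the values shown on your own User Preferences page.

```python
import json


def build_trust_policy(account_id: str, external_id: str) -> str:
    """Fill the trust policy template with an account ID and External ID."""
    # AWS account IDs are 12-digit numbers.
    if not (account_id.isascii() and account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    if not external_id:
        raise ValueError("external_id must not be empty")
    document = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_trust_policy("123456789012", "example-external-id"))
```

Generating the document with `json.dumps` guarantees well-formed JSON, but it does not verify that the IDs are the ones Cloud Center expects.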

If you are updating your AWS credentials in Cloud Center to integrate with a different AWS account, note the following points:

  • Stop all clusters and wait for them to be completely stopped before updating or deleting your AWS settings in Cloud Center. Otherwise, Cloud Center may not be able to shut down your resources appropriately.

  • When switching AWS accounts, you must update the SSH key name for any existing cluster before attempting to restart the cluster via Cloud Center in the new AWS account.

  • When switching AWS accounts, any existing data on your persistent storage will not be copied to clusters in the new AWS account.

  • When switching AWS accounts, Amazon S3 data from the previous AWS account will not be downloaded to clusters started in the new AWS account.

Create a Custom Amazon Machine Image (AMI)

You can use an Amazon® Machine Image (AMI) when creating a cloud cluster. To create your custom AMI for this purpose, you modify an AMI provided in Cloud Center. Then you can install drivers, libraries, or other utilities, so that they are available for subsequent clusters without having to reinstall them each time.

Use the following procedure to create and customize a cluster AMI in the Amazon EC2 Dashboard of the AWS Management Console.

  1. Start a cluster in Cloud Center as usual, specifying the MATLAB Version you need. This cluster needs only one worker, and under Advanced Options you can choose None for persisted storage space.

  2. In a separate browser window, navigate to the AWS Management Console at https://console.aws.amazon.com. Select Services > EC2, then click Running Instances. Your instances will include clusters you start in Cloud Center.

  3. Select the running instance you started in step 1. It has the same name as the cluster in Cloud Center.

  4. In the Instance Details section (lower half of page), look for the AMI ID of your new instance. Note or copy the value shown in parentheses. It will be the string starting with ami- followed by some hexadecimal code.

  5. Shut down the cluster in Cloud Center. You no longer need it; you can delete it if you want to.

  6. In the AWS Management Console, click Launch Instance at the top of the page. This starts a wizard with the steps shown in tabs at the top of the page; the first tab is Choose AMI.

  7. On the left side, choose the following settings:

    1. Select My AMIs.

    2. Select Ownership: Shared with me.

  8. In the Search My AMIs field, enter the AMI ID value noted above (starts with ami-, do not include parentheses). When your instance is shown, click Select.

  9. Click the tab Choose Instance Type, and select a type.

  10. Click the tab Configure Security Groups.

  11. Modify or add a security rule with the following settings:

    Type          SSH
    Protocol      TCP
    Port Range    22
    Source        My IP

  12. Click Review and Launch.

    If you see a dialog box asking about booting from General Purpose (SSD), select your preferred option and click Next.

  13. If everything looks correct in the review dialog box, click Launch.

  14. You will be asked to select a key pair. You can use an existing key pair that you have access to. After acknowledging, click Launch Instances.

    You can track the progress of your instance. Click View Instances. The Instance State for your new instance should say Running before you proceed.

  15. If necessary, log on to the new instance via SSH and install any libraries, drivers, etc.

  16. Stop the running instance by selecting it in the AWS Management Console, then clicking Actions > Stop.

  17. In the AWS Management Console select the instance (it might still be selected) and click Actions > Create Image.

  18. Provide a name and description that will help you identify your new AMI. Use a name that suggests the MATLAB version, installed libraries or drivers, etc. Click Create Image. Note its AMI ID.

In Cloud Center, you can now use that AMI when starting a new cluster. It will be available in the Operating System Image drop-down list in the Advanced Options of the Create Cluster dialog box.
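
When you copy the AMI ID from the Instance Details section, the value appears in parentheses after the image name. The following Python sketch extracts the ID from such a string, using the description from this procedure (the prefix ami- followed by hexadecimal characters). The sample text and the function name are hypothetical.

```python
import re

# "ami-" followed by one or more lowercase hexadecimal characters.
AMI_ID_PATTERN = re.compile(r"ami-[0-9a-f]+")


def extract_ami_id(text: str) -> str:
    """Return the first AMI ID found in `text`, without parentheses."""
    match = AMI_ID_PATTERN.search(text)
    if match is None:
        raise ValueError("no AMI ID found")
    return match.group(0)


if __name__ == "__main__":
    # Hypothetical value as it might be copied from the console.
    print(extract_ami_id("my-cluster-image (ami-0abc1234def567890)"))
```

The result can be pasted directly into the Search My AMIs field, which expects the ID without parentheses.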