title | titleSuffix | description | services | author | ms.author | ms.reviewer | ms.service | ms.subservice | ms.date | ms.topic | ms.custom |
---|---|---|---|---|---|---|---|---|---|---|---|
Create training & deploy computes (studio) | Azure Machine Learning | Use studio to create training and deployment compute resources (compute targets) for machine learning | machine-learning | sdgilley | sgilley | sgilley | machine-learning | core | 10/21/2021 | how-to | contperf-fy21q1, sdkv1, event-tier1-build-2022 |
In this article, learn how to create and manage compute targets in Azure Machine Learning studio. You can also create and manage compute targets with:
- The Azure Machine Learning SDK or the CLI extension for Azure Machine Learning
- The VS Code extension for Azure Machine Learning
Important
Items marked (preview) in this article are currently in public preview. The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
- If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning today.
- An Azure Machine Learning workspace
With Azure Machine Learning, you can train your model on a variety of resources or environments, collectively referred to as compute targets. A compute target can be a local machine or a cloud resource, such as Azure Machine Learning Compute, Azure HDInsight, or a remote virtual machine. You can also create compute targets for model deployment as described in "Where and how to deploy your models".
To see all compute targets for your workspace, use the following steps:
- Navigate to Azure Machine Learning studio.
- Under Manage, select Compute.
- Select tabs at the top to show each type of compute target.
:::image type="content" source="media/how-to-create-attach-studio/view-compute-targets.png" alt-text="View list of compute targets":::
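The same list can also be read programmatically. The following is a minimal sketch, assuming the SDK v1 package (`azureml-core`) is installed and a `config.json` for your workspace is available locally. The helper names `group_by_type` and `list_workspace_computes` are illustrative and not part of the SDK.

```python
from collections import defaultdict


def group_by_type(compute_targets):
    """Group compute target names by type, similar to the tabs in studio.

    `compute_targets` is a mapping of name -> object with a `type`
    attribute, such as the dict exposed by `Workspace.compute_targets`.
    """
    groups = defaultdict(list)
    for name, target in compute_targets.items():
        groups[target.type].append(name)
    return {kind: sorted(names) for kind, names in groups.items()}


def list_workspace_computes():
    """Print the compute targets of the workspace described by config.json."""
    # Imported inside the function so group_by_type can be used on its own.
    from azureml.core import Workspace

    ws = Workspace.from_config()  # assumption: a local config.json exists
    for kind, names in group_by_type(ws.compute_targets).items():
        print(f"{kind}: {', '.join(names)}")
```

Calling `list_workspace_computes()` prints one line per compute type found in the workspace.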
Follow the previous steps to view the list of compute targets. Then use these steps to create a compute target:
- Select the tab at the top corresponding to the type of compute you will create.
- If you have no compute targets, select Create in the middle of the page.
  :::image type="content" source="media/how-to-create-attach-studio/create-compute-target.png" alt-text="Create compute target":::
- If you see a list of compute resources, select +New above the list.
  :::image type="content" source="media/how-to-create-attach-studio/select-new.png" alt-text="Select new":::
- Fill out the form for your compute type, as described in the sections that follow.
- Select Create.
- View the status of the create operation by selecting the compute target from the list:
  :::image type="content" source="media/how-to-create-attach-studio/view-list.png" alt-text="View compute status from a list":::
Follow the steps in Create and manage an Azure Machine Learning compute instance.
Create a single-node or multi-node compute cluster for your training, batch inferencing, or reinforcement learning workloads. Use the steps above to create the compute cluster. Then fill out the form as follows:
Field | Description |
---|---|
Location | The Azure region where the compute cluster will be created. By default, this is the same location as the workspace. Setting the location to a different region than the workspace is in preview, and is only available for compute clusters, not compute instances. When using a different region than your workspace or datastores, you may see increased network latency and data transfer costs. The latency and costs can occur when creating the cluster, and when running jobs on it. |
Virtual machine type | Choose CPU or GPU. This type cannot be changed after creation. |
Virtual machine priority | Choose Dedicated or Low priority. Low priority virtual machines are cheaper but don't guarantee the compute nodes, so your job may be preempted. |
Virtual machine size | Supported virtual machine sizes might be restricted in your region. Check the availability list. |
Select Next to proceed to Advanced Settings and fill out the form as follows:
Field | Description |
---|---|
Compute name | |
Minimum number of nodes | Minimum number of nodes that you want to provision. If you want a dedicated number of nodes, set that count here. Save money by setting the minimum to 0, so you won't pay for any nodes when the cluster is idle. |
Maximum number of nodes | Maximum number of nodes that you want to provision. The compute will autoscale to a maximum of this node count when a job is submitted. |
Idle seconds before scale down | Idle time before scaling the cluster down to the minimum node count. |
Enable SSH access | Use the same instructions as Enable SSH access for a compute instance (above). |
Advanced settings | Optional. Configure a virtual network. Specify the Resource group, Virtual network, and Subnet to create the compute cluster inside an Azure Virtual Network (vnet). For more information, see these network requirements for vnet. You can also attach managed identities to grant access to resources. |
SSH access is disabled by default. SSH access cannot be changed after creation. Make sure to enable access if you plan to debug interactively with VS Code Remote.
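For comparison, the form fields above map fairly directly onto `AmlCompute.provisioning_configuration` in SDK v1. The sketch below is one possible way to collect them. The helper names `cluster_settings` and `create_cluster`, and the default values shown, are illustrative assumptions rather than studio defaults.

```python
def cluster_settings(vm_size, max_nodes, min_nodes=0, idle_seconds=120,
                     low_priority=False, ssh_public_key=None,
                     admin_username="azureuser"):
    """Build keyword arguments for AmlCompute.provisioning_configuration."""
    if not 0 <= min_nodes <= max_nodes:
        raise ValueError("need 0 <= min_nodes <= max_nodes")
    settings = {
        "vm_size": vm_size,
        "vm_priority": "lowpriority" if low_priority else "dedicated",
        "min_nodes": min_nodes,  # 0 means no nodes are billed while idle
        "max_nodes": max_nodes,
        "idle_seconds_before_scaledown": idle_seconds,
    }
    if ssh_public_key:
        # SSH access cannot be changed later, so decide at creation time.
        settings["admin_username"] = admin_username
        settings["admin_user_ssh_key"] = ssh_public_key
    return settings


def create_cluster(ws, name, settings):
    """Create the cluster in workspace `ws` and wait until it is provisioned."""
    # Imported inside the function so cluster_settings can be used on its own.
    from azureml.core.compute import AmlCompute, ComputeTarget

    config = AmlCompute.provisioning_configuration(**settings)
    cluster = ComputeTarget.create(ws, name, config)
    cluster.wait_for_completion(show_output=True)
    return cluster
```

A call such as `create_cluster(ws, "cpu-cluster", cluster_settings("STANDARD_D2_V2", max_nodes=4))` would request a dedicated cluster that scales between 0 and 4 nodes; the cluster name and VM size here are placeholders.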
[!INCLUDE amlinclude-info]
Once the compute cluster is created and running, see Connect with SSH access.
[!INCLUDE aml-clone-in-azure-notebook]
During cluster creation or when editing compute cluster details, in the Advanced settings, toggle Assign a managed identity and specify a system-assigned identity or user-assigned identity.
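When a cluster is created from code instead, SDK v1 exposes the same choice through the `identity_type` and `identity_id` arguments of `AmlCompute.provisioning_configuration`. The following is a small sketch; the helper names are hypothetical, and the resource IDs you pass would be those of your own user-assigned identities.

```python
def identity_settings(user_assigned_ids=None):
    """Return identity keyword arguments for a compute cluster.

    With no IDs, a system-assigned identity is requested. Otherwise the
    given user-assigned identity resource IDs are attached.
    """
    if user_assigned_ids:
        return {
            "identity_type": "UserAssigned",
            "identity_id": list(user_assigned_ids),
        }
    return {"identity_type": "SystemAssigned"}


def provisioning_config_with_identity(vm_size, max_nodes, user_assigned_ids=None):
    """Build a provisioning configuration that includes a managed identity."""
    # Imported inside the function so identity_settings can be used on its own.
    from azureml.core.compute import AmlCompute

    return AmlCompute.provisioning_configuration(
        vm_size=vm_size,
        max_nodes=max_nodes,
        **identity_settings(user_assigned_ids),
    )
```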
[!INCLUDE aml-clone-in-azure-notebook]
Important
Using Azure Kubernetes Service with Azure Machine Learning has multiple configuration options. Some scenarios, such as networking, require additional setup and configuration. For more information on using AKS with Azure ML, see Create and attach an Azure Kubernetes Service cluster.
Create or attach an Azure Kubernetes Service (AKS) cluster for large-scale inferencing. Use the steps above to create the AKS cluster. Then fill out the form as follows:
Field | Description |
---|---|
Compute name | |
Kubernetes Service | Select Create New and fill out the rest of the form. Or select Use existing and then select an existing AKS cluster from your subscription. |
Region | Select the region where the cluster will be created. |
Virtual machine size | Supported virtual machine sizes might be restricted in your region. Check the availability list. |
Cluster purpose | Select Production or Dev-test. |
Number of nodes | The number of nodes multiplied by the virtual machine’s number of cores (vCPUs) must be greater than or equal to 12. |
Network configuration | Select Advanced to create the compute within an existing virtual network. For more information about AKS in a virtual network, see Network isolation during training and inference with private endpoints and virtual networks. |
Enable SSL configuration | Use this to configure an SSL certificate on the compute. |
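The Number of nodes rule in the table is simple arithmetic, and it can be checked before you submit the form. This sketch only restates that rule; the function names are illustrative, and the vCPU count for a given virtual machine size has to come from the size's specification.

```python
import math

MIN_TOTAL_VCPUS = 12  # nodes x vCPUs per node must be at least this value


def meets_core_requirement(node_count, vcpus_per_node):
    """Return True if the cluster satisfies the 12 vCPU minimum."""
    return node_count * vcpus_per_node >= MIN_TOTAL_VCPUS


def minimum_nodes(vcpus_per_node):
    """Return the smallest node count that satisfies the minimum."""
    if vcpus_per_node <= 0:
        raise ValueError("vcpus_per_node must be positive")
    return math.ceil(MIN_TOTAL_VCPUS / vcpus_per_node)
```

For example, with a 4 vCPU size, 2 nodes give 8 vCPUs and fail the check, while 3 nodes give 12 and pass.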
To use compute targets created outside the Azure Machine Learning workspace, you must attach them. Attaching a compute target makes it available to your workspace. Use Attached compute to attach a compute target for training. Use Inference clusters to attach an AKS cluster for inferencing.
Use the steps above to attach a compute. Then fill out the form as follows:
- Enter a name for the compute target.
- Select the type of compute to attach. Not all compute types can be attached from Azure Machine Learning studio. The compute types that can currently be attached for training include:
  - An Azure Virtual Machine (to attach a Data Science Virtual Machine)
  - Azure Databricks (for use in machine learning pipelines)
  - Azure Data Lake Analytics (for use in machine learning pipelines)
  - Azure HDInsight
  - Kubernetes (preview)
- Fill out the form and provide values for the required properties.
  [!NOTE] Microsoft recommends that you use SSH keys, which are more secure than passwords. Passwords are vulnerable to brute force attacks. SSH keys rely on cryptographic signatures. For information on how to create SSH keys for use with Azure Virtual Machines, see the following documents:
- Select Attach.
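As a point of comparison, attaching an Azure Virtual Machine can also be done with SDK v1 through `RemoteCompute.attach_configuration` and `ComputeTarget.attach`. The sketch below assumes key-based SSH authentication, in line with the note above. The helper names `vm_resource_id` and `attach_vm` and every argument value are placeholders you would replace with your own.

```python
def vm_resource_id(subscription_id, resource_group, vm_name):
    """Build the Azure Resource Manager ID of a virtual machine."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Compute/virtualMachines/{vm_name}"
    )


def attach_vm(ws, compute_name, resource_id, username, private_key_file,
              ssh_port=22):
    """Attach an existing VM to workspace `ws` as a training compute target."""
    # Imported inside the function so vm_resource_id can be used on its own.
    from azureml.core.compute import ComputeTarget, RemoteCompute

    config = RemoteCompute.attach_configuration(
        resource_id=resource_id,
        ssh_port=ssh_port,
        username=username,
        private_key_file=private_key_file,  # SSH key rather than a password
    )
    target = ComputeTarget.attach(ws, compute_name, config)
    target.wait_for_completion(show_output=True)
    return target
```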
[!INCLUDE arc-enabled-machine-learning-create-training-compute]
Important
To attach an Azure Kubernetes Service (AKS) or Azure Arc-enabled Kubernetes cluster, you must be the subscription owner or have permission to access AKS cluster resources under the subscription. Otherwise, the cluster list on the "attach new compute" page will be blank.
To detach your compute, use the following steps:
- In Azure Machine Learning studio, select Compute, Attached compute, and the compute you wish to remove.
- Use the Detach link to detach your compute.
If you created your compute instance or compute cluster with SSH access enabled, use these steps for access.
- Find the compute in your workspace resources:
  - On the left, select Compute.
  - Use the tabs at the top to select Compute instance or Compute cluster to find your machine.
- Select the compute name in the list of resources.
- Find the connection string:
  - For a compute instance, select Connect at the top of the Details section.
    :::image type="content" source="media/how-to-create-attach-studio/details.png" alt-text="Screenshot: Connect tool at the top of the Details page.":::
  - For a compute cluster, select Nodes at the top, then select the Connection string in the table for your node.
    :::image type="content" source="media/how-to-create-attach-studio/compute-nodes.png" alt-text="Screenshot: Connection string for a node in a compute cluster.":::
- Copy the connection string.
- For Windows, open PowerShell or a command prompt:
  - Go to the directory or folder where your key is stored.
  - Add the -i flag to the connection string to point to the private key file:
    ssh -i <keyname.pem> azureuser@... (rest of connection string)
- For Linux users, follow the steps from Create and use an SSH key pair for Linux VMs in Azure.
- For SCP use:
  scp -i key.pem -P {port} {fileToCopyFromLocal} azureuser@yourComputeInstancePublicIP:~/{destination}
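If you connect often, the two commands above can be assembled by a small script. The following sketch assumes the copied connection string starts with `ssh` and has a form like `ssh azureuser@<address> -p <port>`; that format, the helper names, and the example values are assumptions for illustration. The functions return argument lists that are suitable for `subprocess.run`.

```python
import shlex


def ssh_command(connection_string, key_path):
    """Insert `-i key_path` right after `ssh` in a copied connection string."""
    parts = shlex.split(connection_string)
    if not parts or parts[0] != "ssh":
        raise ValueError("expected a connection string starting with 'ssh'")
    return [parts[0], "-i", key_path] + parts[1:]


def scp_command(key_path, port, local_file, host, destination,
                user="azureuser"):
    """Build the scp invocation that copies a local file to the compute."""
    return [
        "scp", "-i", key_path,
        "-P", str(port),  # scp uses an uppercase -P for the port
        local_file,
        f"{user}@{host}:~/{destination}",
    ]
```

For example, `subprocess.run(ssh_command(copied_string, "key.pem"))` would open the session with the key file in the current directory.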
After a target is created and attached to your workspace, you use it in your run configuration with a ComputeTarget object:
[!INCLUDE sdk v1]
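from azureml.core import Workspace
from azureml.core.compute import ComputeTarget
ws = Workspace.from_config()  # assumes a local config.json for your workspace
myvm = ComputeTarget(workspace=ws, name='my-vm-name')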
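To show where the `ComputeTarget` object goes next, here is a minimal sketch of a run submission using the SDK v1 classes `ScriptRunConfig` and `Experiment`. The function name `submit_training` and all argument values are hypothetical. The sketch also leaves out environment setup, which a real training run usually needs.

```python
def submit_training(ws, compute_name, source_directory, script,
                    experiment_name):
    """Submit `script` from `source_directory` to the named compute target."""
    # Imported inside the function so the module loads without the SDK.
    from azureml.core import Experiment, ScriptRunConfig
    from azureml.core.compute import ComputeTarget

    # Look up the existing compute target by name.
    target = ComputeTarget(workspace=ws, name=compute_name)
    # Describe what to run and where to run it.
    config = ScriptRunConfig(
        source_directory=source_directory,
        script=script,
        compute_target=target,
    )
    # Submit the run under the given experiment and return the Run object.
    return Experiment(workspace=ws, name=experiment_name).submit(config)
```

A call such as `submit_training(ws, "my-vm-name", "./src", "train.py", "my-experiment")` returns a run object whose progress you can follow in studio.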
- Use the compute resource to submit a training run.
- Learn how to efficiently tune hyperparameters to build better models.
- Once you have a trained model, learn how and where to deploy models.
- Use Azure Machine Learning with Azure Virtual Networks.