---
title: Secure network traffic flow
titleSuffix: Azure Machine Learning
description: Learn how network traffic flows between components when your Azure Machine Learning workspace is in a secured virtual network.
services: machine-learning
ms.service: machine-learning
ms.subservice: enterprise-readiness
ms.custom: event-tier1-build-2022
ms.topic: conceptual
ms.author: jhirono
author: jhirono
ms.reviewer: larryfr
ms.date: 04/08/2022
---

# Secure network traffic flow
When your Azure Machine Learning workspace and associated resources are secured in an Azure Virtual Network, it changes the network traffic between resources. Without a virtual network, network traffic flows over the public internet or within an Azure data center. Once a virtual network (VNet) is introduced, you may also want to harden network security. For example, you may want to block inbound and outbound communications between the VNet and the public internet. However, Azure Machine Learning requires access to some resources on the public internet. For example, Azure Resource Manager is used for deployments and management operations.
This article lists the required traffic to/from the public internet. It also explains how network traffic flows between your client development environment and a secured Azure Machine Learning workspace in the following scenarios:
- Using Azure Machine Learning studio to work with:

    - Your workspace
    - AutoML
    - Designer
    - Datasets and datastores

    > [!TIP]
    > Azure Machine Learning studio is a web-based UI that runs partially in your web browser, and makes calls to Azure services to perform tasks such as training a model, using designer, or viewing datasets. Some of these calls use a different communication flow than if you're using the SDK, CLI, REST API, or VS Code.

- Using Azure Machine Learning studio, SDK, CLI, or REST API to work with:

    - Compute instances and clusters
    - Azure Kubernetes Service
    - Docker images managed by Azure Machine Learning
> [!TIP]
> If a scenario or task isn't listed here, it should work the same with or without a secured workspace.
This article assumes the following configuration:
- Azure Machine Learning workspace using a private endpoint to communicate with the VNet.
- The Azure Storage Account, Key Vault, and Container Registry used by the workspace also use a private endpoint to communicate with the VNet.
- A VPN gateway or ExpressRoute is used by the client workstations to access the VNet.
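One way to sanity-check this assumed configuration is to confirm that the workspace and dependency service FQDNs resolve to private IP addresses when queried from inside the VNet (or over the VPN/ExpressRoute connection). The following is a minimal sketch, not an official Azure tool; the host names are hypothetical placeholders for your own resources:

```python
import ipaddress
import socket

# Hypothetical FQDNs; substitute the names of your own workspace,
# storage account, and key vault.
PRIVATE_ENDPOINT_HOSTS = [
    "myworkspace.workspace.eastus.api.azureml.ms",
    "mystorageaccount.blob.core.windows.net",
    "mykeyvault.vault.azure.net",
]

def is_private_ip(address: str) -> bool:
    """Return True if the address is in a private (RFC 1918/4193) range."""
    return ipaddress.ip_address(address).is_private

def resolves_privately(host: str) -> bool:
    """Resolve a host and verify every answer is a private IP, which
    indicates the private endpoint (not the public endpoint) is in use."""
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return all(is_private_ip(info[4][0]) for info in infos)

# Classification examples that don't touch the network:
print(is_private_ip("10.0.0.4"))    # True  (typical private endpoint NIC)
print(is_private_ip("52.168.1.1"))  # False (a public Azure address)
```

If `resolves_privately` returns `False` from inside the VNet, your DNS configuration is likely still pointing at the public endpoints; see the custom DNS guidance referenced below.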
| Scenario | Required inbound | Required outbound | Additional configuration |
| ----- | ----- | ----- | ----- |
| Access workspace from studio | NA | <ul><li>Azure Active Directory</li><li>Azure Front Door</li><li>Azure Machine Learning service</li></ul> | You may need to use a custom DNS server. For more information, see Use your workspace with a custom DNS. |
| Use AutoML, designer, dataset, and datastore from studio | NA | NA | <ul><li>Workspace service principal configuration</li><li>Allow access from trusted Azure services</li></ul> For more information, see How to secure a workspace in a virtual network. |
| Use compute instance and compute cluster | <ul><li>Azure Machine Learning service on port 44224</li><li>Azure Batch service on ports 29876-29877</li></ul> | <ul><li>Azure Active Directory</li><li>Azure Resource Manager</li><li>Azure Machine Learning service</li><li>`Storage.region` service tag</li><li>`Keyvault.region` service tag</li></ul> | If you use a firewall, create user-defined routes. For more information, see Configure inbound and outbound traffic. |
| Use Azure Kubernetes Service | NA | For information on the outbound configuration for AKS, see How to deploy to Azure Kubernetes Service. | Configure the internal load balancer. For more information, see How to deploy to Azure Kubernetes Service. |
| Use Docker images managed by Azure Machine Learning | NA | <ul><li>Microsoft Container Registry</li><li>`viennaglobal.azurecr.io` global container registry</li></ul> | If the Azure Container Registry for your workspace is behind the VNet, configure the workspace to use a compute cluster to build images. For more information, see How to secure a workspace in a virtual network. |
> [!IMPORTANT]
> Azure Machine Learning uses multiple storage accounts. Each stores different data and has a different purpose:
>
> - __Your storage__: The Azure Storage Account(s) in your Azure subscription are used to store your data and artifacts such as models, training data, training logs, and Python scripts. For example, the default storage account for your workspace is in your subscription. The Azure Machine Learning compute instance and compute clusters access file and blob data in this storage over ports 445 (SMB) and 443 (HTTPS).
>
>     When using a compute instance or compute cluster, your storage account is mounted as a file share using the SMB protocol. The compute instance and cluster use this file share to store the data, models, Jupyter notebooks, datasets, etc. The compute instance and cluster use the private endpoint when accessing the storage account.
>
> - __Microsoft storage__: The Azure Machine Learning compute instance and compute clusters rely on Azure Batch, and access storage located in a Microsoft subscription. This storage is used only for the management of the compute instance/cluster. None of your data is stored here. The compute instance and compute cluster access the blob, table, and queue data in this storage, using port 443 (HTTPS).
>
> Machine Learning also stores metadata in an Azure Cosmos DB instance. By default, this instance is hosted in a Microsoft subscription and managed by Microsoft. You can optionally use an Azure Cosmos DB instance in your Azure subscription. For more information, see Data encryption with Azure Machine Learning.
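The port requirements described above can be captured as data when documenting firewall rules. This is a small illustrative sketch of our own, not an Azure API; the storage-class labels are ours:

```python
# TCP ports a compute instance/cluster uses to reach each storage class,
# per the description above.
STORAGE_ACCESS_PORTS = {
    "your_storage": {445: "SMB (file share mount)", 443: "HTTPS (blob)"},
    "microsoft_storage": {443: "HTTPS (blob, table, queue)"},
}

def required_ports(storage_kind: str) -> list[int]:
    """Return the sorted TCP ports that must be reachable."""
    return sorted(STORAGE_ACCESS_PORTS[storage_kind])

print(required_ports("your_storage"))       # [443, 445]
print(required_ports("microsoft_storage"))  # [443]
```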
## Scenario: Access workspace from studio

> [!NOTE]
> The information in this section is specific to using the workspace from the Azure Machine Learning studio. If you use the Azure Machine Learning SDK, REST API, CLI, or Visual Studio Code, the information in this section doesn't apply to you.

When accessing your workspace from studio, the network traffic flows are as follows:

- To authenticate to resources, Azure Active Directory is used.
- For management and deployment operations, Azure Resource Manager is used.
- For Azure Machine Learning specific tasks, the Azure Machine Learning service is used.
- For access to Azure Machine Learning studio (https://ml.azure.com), Azure Front Door is used.
- For most storage operations, traffic flows through the private endpoint of the default storage for your workspace. Exceptions are discussed in the Use AutoML, designer, dataset, and datastore section.
- You also need to configure a DNS solution that allows you to resolve the names of the resources within the VNet. For more information, see Use your workspace with a custom DNS.
:::image type="content" source="./media/concept-secure-network-traffic-flow/workspace-traffic-studio.png" alt-text="Diagram of network traffic between client and workspace when using studio":::
## Scenario: Use AutoML, designer, dataset, and datastore from studio

The following features of Azure Machine Learning studio use data profiling:
- Dataset: Explore the dataset from studio.
- Designer: Visualize module output data.
- AutoML: View a data preview/profile and choose a target column.
- Labeling
Data profiling depends on the Azure Machine Learning managed service being able to access the default Azure Storage Account for your workspace. The managed service doesn't exist in your VNet, so can’t directly access the storage account in the VNet. Instead, the workspace uses a service principal to access storage.
> [!TIP]
> You can provide a service principal when creating the workspace. If you don't, one is created for you and will have the same name as your workspace.
To allow access to the storage account, configure the storage account to allow a resource instance for your workspace, or select __Allow Azure services on the trusted services list to access this storage account__. This setting allows the managed service to access storage through the Azure data center network.

Next, assign the workspace's service principal the Reader role for the private endpoint of the storage account. This role is used to verify the workspace and storage subnet information. If they're the same, access is allowed. Finally, the service principal also requires Storage Blob Data Contributor access to the storage account.
For more information, see the Azure Storage Account section of How to secure a workspace in a virtual network.
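The two grants the workspace's service principal needs can be expressed as data and audited. The following is an illustrative sketch of ours, not an Azure SDK call; the scope labels are hypothetical shorthand, not Azure resource IDs:

```python
# Required role assignments for the workspace service principal, per the
# paragraphs above: Reader on the storage account's private endpoint, and
# Storage Blob Data Contributor on the storage account itself.
REQUIRED_GRANTS = {
    ("Reader", "storage-account-private-endpoint"),
    ("Storage Blob Data Contributor", "storage-account"),
}

def missing_grants(actual: set) -> set:
    """Return the (role, scope) pairs the service principal still needs."""
    return REQUIRED_GRANTS - actual

# Example: only the Reader assignment exists so far.
current = {("Reader", "storage-account-private-endpoint")}
print(missing_grants(current))
# {('Storage Blob Data Contributor', 'storage-account')}
```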
:::image type="content" source="./media/concept-secure-network-traffic-flow/storage-traffic-studio.png" alt-text="Diagram of traffic between client, data profiling, and storage":::
## Scenario: Use compute instance and compute cluster

Azure Machine Learning compute instance and compute cluster are managed services hosted by Microsoft. They're built on top of the Azure Batch service. While they exist in a Microsoft-managed environment, they're also injected into your VNet.
When you create a compute instance or compute cluster, the following resources are also created in your VNet:
- A Network Security Group with required inbound rules. These rules allow inbound access from the Azure Machine Learning service (TCP on port 44224) and the Azure Batch service (TCP on ports 29876-29877).

    > [!IMPORTANT]
    > If you use a firewall to block internet access into the VNet, you must configure the firewall to allow this traffic. For example, with Azure Firewall you can create user-defined routes. For more information, see How to use Azure Machine Learning with a firewall.

- A load balancer with a public IP.
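When cross-checking a firewall or NSG configuration against these requirements, it can help to express the required inbound rules as data. This is a sketch of ours, not an Azure SDK; the service tag names follow Azure's published tags:

```python
# Inbound rules required for compute instance/cluster, per the list above:
# (service tag, protocol, inclusive port range).
REQUIRED_INBOUND = [
    ("AzureMachineLearning", "TCP", (44224, 44224)),
    ("BatchNodeManagement", "TCP", (29876, 29877)),
]

def port_allowed(tag: str, port: int) -> bool:
    """Check whether a given service tag and port is covered by the rules."""
    return any(
        t == tag and lo <= port <= hi
        for t, _proto, (lo, hi) in REQUIRED_INBOUND
    )

print(port_allowed("AzureMachineLearning", 44224))  # True
print(port_allowed("BatchNodeManagement", 29999))   # False
```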
Also allow outbound access to the following service tags. For each tag, replace `region` with the Azure region of your compute instance/cluster:

- `Storage.region`: This outbound access is used to connect to the Azure Storage Account inside the Azure Batch service-managed VNet.
- `Keyvault.region`: This outbound access is used to connect to the Azure Key Vault account inside the Azure Batch service-managed VNet.
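Since both outbound tags are region-scoped, a small helper can expand them for a given region. This helper is ours for illustration, not part of any Azure SDK:

```python
# Region-scoped service tags needed for outbound access, per the list above.
OUTBOUND_TAG_TEMPLATES = ["Storage.{region}", "Keyvault.{region}"]

def outbound_service_tags(region: str) -> list[str]:
    """Expand the outbound service tags for a compute instance/cluster
    deployed in the given Azure region."""
    return [t.format(region=region) for t in OUTBOUND_TAG_TEMPLATES]

print(outbound_service_tags("eastus"))
# ['Storage.eastus', 'Keyvault.eastus']
```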
Data access from your compute instance or cluster goes through the private endpoint of the Storage Account for your VNet.
If you use Visual Studio Code on a compute instance, you must allow other outbound traffic. For more information, see How to use Azure Machine Learning with a firewall.
:::image type="content" source="./media/concept-secure-network-traffic-flow/compute-instance-and-cluster.png" alt-text="Diagram of traffic flow when using compute instance or cluster":::
## Scenario: Use online endpoints

Securing an online endpoint with a private endpoint is a preview feature.
[!INCLUDE preview disclaimer]
Inbound communication with the scoring URL of the online endpoint can be secured using the `public_network_access` flag on the endpoint. Setting the flag to `disabled` restricts the online endpoint to receiving traffic only from the virtual network. For secure inbound communications, the Azure Machine Learning workspace's private endpoint is used.

Outbound communication from a deployment can be secured on a per-deployment basis by using the `egress_public_network_access` flag. Outbound communication in this case is from the deployment to Azure Container Registry, storage blob, and workspace. Setting the flag to `disabled` restricts communication with these resources to the virtual network.
> [!NOTE]
> For secure outbound communication, a private endpoint is created for each deployment where `egress_public_network_access` is set to `disabled`.
Visibility of the endpoint is also governed by the `public_network_access` flag of the Azure Machine Learning workspace. If this flag is `disabled`, then the scoring endpoints can only be accessed from virtual networks that contain a private endpoint for the workspace. If it's `enabled`, then the scoring endpoint can be accessed from the virtual network and public networks.
| Configuration | Inbound <br> (Endpoint property) | Outbound <br> (Deployment property) | Supported? |
| ----- | ----- | ----- | ----- |
| secure inbound with secure outbound | `public_network_access` is disabled | `egress_public_network_access` is disabled | Yes |
| secure inbound with public outbound | `public_network_access` is disabled | `egress_public_network_access` is enabled | Yes |
| public inbound with secure outbound | `public_network_access` is enabled | `egress_public_network_access` is disabled | Yes |
| public inbound with public outbound | `public_network_access` is enabled | `egress_public_network_access` is enabled | Yes |
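The visibility rules above can be summarized in a few lines of logic. This is our own sketch of the behavior described in this section, not an Azure API:

```python
def scoring_access(workspace_pna: str, endpoint_pna: str) -> set:
    """Given the workspace-level and endpoint-level public_network_access
    flags ('enabled' or 'disabled'), return the networks from which the
    scoring endpoint can be reached."""
    if workspace_pna == "disabled" or endpoint_pna == "disabled":
        # Traffic must arrive through the workspace's private endpoint.
        return {"virtual network"}
    return {"virtual network", "public networks"}

print(scoring_access("enabled", "enabled"))   # both
print(scoring_access("disabled", "enabled"))  # VNet only
print(scoring_access("enabled", "disabled"))  # VNet only
```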
## Scenario: Use Azure Kubernetes Service

For information on the outbound configuration required for Azure Kubernetes Service, see the connectivity requirements section of How to deploy to Azure Kubernetes Service.
> [!NOTE]
> The Azure Kubernetes Service load balancer isn't the same as the load balancer created by Azure Machine Learning. If you want to host your model as a secured application, only available on the VNet, use the internal load balancer created by Azure Machine Learning. If you want to allow public access, use the public load balancer created by Azure Machine Learning.
If your model requires extra inbound or outbound connectivity, such as to an external data source, use a network security group or your firewall to allow the traffic.
## Scenario: Use Docker images managed by Azure Machine Learning

Azure Machine Learning provides Docker images that can be used to train models or perform inference. If you don't specify your own images, the ones provided by Azure Machine Learning are used. These images are hosted on the Microsoft Container Registry (MCR). They're also hosted on a geo-replicated Azure Container Registry named `viennaglobal.azurecr.io`.

If you provide your own Docker images, such as on an Azure Container Registry that you provide, you don't need the outbound communication with MCR or `viennaglobal.azurecr.io`.
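The decision above reduces to a simple rule, sketched here for illustration (ours, not an Azure API); the registry hosts are the ones named in this section:

```python
def required_registry_hosts(uses_custom_images: bool) -> list:
    """Outbound registry endpoints needed to pull Azure Machine Learning
    base images. With your own images in your own registry, no outbound
    access to the Microsoft-hosted registries is required."""
    if uses_custom_images:
        return []
    return ["mcr.microsoft.com", "viennaglobal.azurecr.io"]

print(required_registry_hosts(False))
# ['mcr.microsoft.com', 'viennaglobal.azurecr.io']
print(required_registry_hosts(True))
# []
```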
> [!TIP]
> If your Azure Container Registry is secured in the VNet, it can't be used by Azure Machine Learning to build Docker images. Instead, you must designate an Azure Machine Learning compute cluster to build images. For more information, see How to secure a workspace in a virtual network.
:::image type="content" source="./media/concept-secure-network-traffic-flow/azure-machine-learning-docker-images.png" alt-text="Diagram of traffic flow when using provided Docker images":::
## Next steps

Now that you've learned how network traffic flows in a secured configuration, learn more about securing Azure Machine Learning in a virtual network by reading the Virtual network isolation and privacy overview article.
For information on best practices, see the Azure Machine Learning best practices for enterprise security article.