---
title: Deploy machine learning models
titleSuffix: Azure Machine Learning
description: Learn how and where to deploy machine learning models. Deploy to Azure Container Instances, Azure Kubernetes Service, and FPGA.
services: machine-learning
ms.service: machine-learning
ms.subservice: core
ms.reviewer: larryfr
ms.author: ssambare
author: shivanissambare
ms.date: 11/12/2021
ms.topic: how-to
ms.custom: devx-track-python, deploy, devx-track-azurecli, contperf-fy21q2, contperf-fy21q4, mktng-kw-nov2021, cliv1, sdkv1, event-tier1-build-2022
adobe-target: true
---
[!INCLUDE sdk & cli v1]
Learn how to deploy your machine learning or deep learning model as a web service in the Azure cloud.
[!INCLUDE endpoints-option]
The workflow is similar no matter where you deploy your model:
- Register the model.
- Prepare an entry script.
- Prepare an inference configuration.
- Deploy the model locally to ensure everything works.
- Choose a compute target.
- Deploy the model to the cloud.
- Test the resulting web service.
For more information on the concepts involved in the machine learning deployment workflow, see Manage, deploy, and monitor models with Azure Machine Learning.
[!INCLUDE cli v1]
[!INCLUDE cli10-only]
- An Azure Machine Learning workspace. For more information, see Create an Azure Machine Learning workspace.
- A model. The examples in this article use a pre-trained model.
- A machine that can run Docker, such as a compute instance.
- An Azure Machine Learning workspace. For more information, see Create an Azure Machine Learning workspace.
- A model. The examples in this article use a pre-trained model.
- The Azure Machine Learning software development kit (SDK) for Python.
- A machine that can run Docker, such as a compute instance.
[!INCLUDE cli v1]
To see the workspaces that you have access to, use the following commands:
```azurecli
az login
az account set -s <subscription>
az ml workspace list --resource-group=<resource-group>
```
[!INCLUDE sdk v1]
```python
from azureml.core import Workspace

ws = Workspace(subscription_id="<subscription_id>",
               resource_group="<resource_group>",
               workspace_name="<workspace_name>")
```
For more information on using the SDK to connect to a workspace, see the Azure Machine Learning SDK for Python documentation.
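If you've downloaded a `config.json` file for your workspace from the Azure portal, `Workspace.from_config()` is a common alternative; a minimal sketch, assuming the file sits in your working directory:

```python
from azureml.core import Workspace

# Reads subscription ID, resource group, and workspace name from ./config.json
ws = Workspace.from_config()
```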
A typical deployed machine learning service needs the following components:

- Resources representing the specific model that you want deployed (for example, a PyTorch model file).
- Code that runs in the service and executes the model on a given input.

Azure Machine Learning allows you to separate the deployment into two components, so that you can keep the same code and merely update the model. The mechanism by which you upload a model separately from your code is called "registering the model".

When you register a model, the model is uploaded to the cloud (in your workspace's default storage account) and then mounted to the same compute where your webservice is running.
The following examples demonstrate how to register a model.
[!INCLUDE trusted models]
[!INCLUDE cli v1]
The following commands download a model and then register it with your Azure Machine Learning workspace:
```bash
wget https://aka.ms/bidaf-9-model -O model.onnx --show-progress
az ml model register -n bidaf_onnx \
    -p ./model.onnx \
    -g <resource-group> \
    -w <workspace-name>
```
Set `-p` to the path of a folder or a file that you want to register.

For more information on `az ml model register`, see the reference documentation.
If you need to register a model that was created previously through an Azure Machine Learning training job, you can specify the experiment, run, and path to the model:
```azurecli
az ml model register -n bidaf_onnx --asset-path outputs/model.onnx --experiment-name myexperiment --run-id myrunid --tag area=qna
```
The `--asset-path` parameter refers to the cloud location of the model. In this example, the path of a single file is used. To include multiple files in the model registration, set `--asset-path` to the path of a folder that contains the files.

For more information on `az ml model register`, see the reference documentation.
You can register a model by providing the local path of the model. You can provide the path of either a folder or a single file on your local machine.
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=register-model-from-local-file-code)]
To include multiple files in the model registration, set `model_path` to the path of a folder that contains the files.

For more information, see the documentation for the Model class.
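If you aren't following along in the notebook, a registration from a local file with the SDK v1 `Model.register` method looks roughly like this sketch (the `./model.onnx` path and tag values are assumptions carried over from the CLI example above):

```python
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()

# Upload the local file to the workspace's storage and register it by name.
model = Model.register(workspace=ws,
                       model_name="bidaf_onnx",
                       model_path="./model.onnx",
                       tags={"area": "qna"})
print(model.name, model.id, model.version, sep="\t")
```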
When you use the SDK to train a model, you can receive either a Run object or an AutoMLRun object, depending on how you trained the model. Each object can be used to register a model created by an experiment run.
- Register a model from an `azureml.core.Run` object:

  [!INCLUDE sdk v1]

  ```python
  model = run.register_model(model_name='bidaf_onnx',
                             tags={'area': 'qna'},
                             model_path='outputs/model.onnx')
  print(model.name, model.id, model.version, sep='\t')
  ```

  The `model_path` parameter refers to the cloud location of the model. In this example, the path of a single file is used. To include multiple files in the model registration, set `model_path` to the path of a folder that contains the files. For more information, see the Run.register_model documentation.

- Register a model from an `azureml.train.automl.run.AutoMLRun` object:

  [!INCLUDE sdk v1]

  ```python
  description = 'My AutoML Model'
  model = run.register_model(description=description,
                             tags={'area': 'qna'})
  print(run.model_id)
  ```

  In this example, the `metric` and `iteration` parameters aren't specified, so the iteration with the best primary metric will be registered. The `model_id` value returned from the run is used instead of a model name.

  For more information, see the AutoMLRun.register_model documentation.

  To deploy a registered model from an `AutoMLRun`, we recommend using the one-click deploy button in Azure Machine Learning studio.
The entry script receives data submitted to a deployed web service and passes it to the model. It then returns the model's response to the client. The script is specific to your model. The entry script must understand the data that the model expects and returns.
The two things you need to accomplish in your entry script are:

- Loading your model (using a function called `init()`)
- Running your model on input data (using a function called `run()`)
For your initial deployment, use a dummy entry script that prints the data it receives.
:::code language="python" source="~/azureml-examples-main/python-sdk/tutorials/deploy-local/source_dir/echo_score.py":::
Save this file as `echo_score.py` inside of a directory called `source_dir`. This dummy script returns the data you send to it, so it doesn't use the model. But it's useful for testing that the scoring script is running.
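If you aren't pulling the file from the samples repository, a minimal echo script of this shape works; this is a sketch of the `init()`/`run()` contract, and the exact sample contents may differ:

```python
import json

def init():
    # Runs once when the container starts; a real script would load the model here.
    print("This is init")

def run(data):
    # Runs for every scoring request; echo the request body back to the caller.
    payload = json.loads(data)
    print(f"received data {payload}")
    return f"test is {payload}"
```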
An inference configuration describes the Docker container and files to use when initializing your web service. All of the files within your source directory, including subdirectories, will be zipped up and uploaded to the cloud when you deploy your web service.
The inference configuration below specifies that the machine learning deployment will use the file `echo_score.py` in the `./source_dir` directory to process incoming requests and that it will use the Docker image with the Python packages specified in the `project_environment` environment.
You can use any Azure Machine Learning inference curated environment as the base Docker image when creating your project environment. We install the required dependencies on top of it and store the resulting Docker image in the repository that is associated with your workspace.
Note
When Azure Machine Learning uploads the inference source directory, it doesn't respect `.gitignore` or `.amlignore`.
[!INCLUDE cli v1]
A minimal inference configuration can be written as:
:::code language="json" source="~/azureml-examples-main/python-sdk/tutorials/deploy-local/dummyinferenceconfig.json":::
Save this file with the name `dummyinferenceconfig.json`.
See this article for a more thorough discussion of inference configurations.
The following example demonstrates how to create a minimal environment with no pip dependencies, using the dummy scoring script you defined above.
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=inference-configuration-code)]
For more information on environments, see Create and manage environments for training and deployment.
For more information on inference configuration, see the InferenceConfig class documentation.
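If you aren't running the notebook, the environment and inference configuration amount to roughly this sketch (the environment name `project_environment` is an assumption):

```python
from azureml.core import Environment
from azureml.core.model import InferenceConfig

# No pip dependencies are needed for the echo script.
env = Environment(name="project_environment")
dummy_inference_config = InferenceConfig(
    environment=env,
    source_directory="./source_dir",
    entry_script="./echo_score.py",
)
```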
A deployment configuration specifies the amount of memory and the number of cores your webservice needs in order to run. It also provides configuration details for the underlying webservice. For example, a deployment configuration lets you specify that your service needs 2 gigabytes of memory, 2 CPU cores, and 1 GPU core, and that you want to enable autoscaling.
The options available for a deployment configuration differ depending on the compute target you choose. In a local deployment, all you can specify is which port your webservice will be served on.
[!INCLUDE cli v1]
[!INCLUDE aml-local-deploy-config]
For more information, see the deployment schema.
The following Python code demonstrates how to create a local deployment configuration:
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=deployment-configuration-code)]
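Outside the notebook, the local deployment configuration is a one-liner with `LocalWebservice.deploy_configuration`; port 6789 here is an arbitrary free port:

```python
from azureml.core.webservice import LocalWebservice

# Serve the local webservice on the chosen port.
deployment_config = LocalWebservice.deploy_configuration(port=6789)
```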
You are now ready to deploy your model.
[!INCLUDE cli v1]
Replace `bidaf_onnx:1` with the name of your model and its version number.

```azurecli
az ml model deploy -n myservice \
    -m bidaf_onnx:1 \
    --overwrite \
    --ic dummyinferenceconfig.json \
    --dc deploymentconfig.json \
    -g <resource-group> \
    -w <workspace-name>
```
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=deploy-model-code)]
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=deploy-model-print-logs)]
For more information, see the documentation for Model.deploy() and Webservice.
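For readers not following the notebook, the SDK v1 deployment call has roughly this shape; a sketch, where `model`, `dummy_inference_config`, and `deployment_config` come from the earlier steps:

```python
from azureml.core.model import Model

# Deploy the registered model as a local webservice.
service = Model.deploy(
    workspace=ws,
    name="myservice",
    models=[model],
    inference_config=dummy_inference_config,
    deployment_config=deployment_config,
    overwrite=True,
)
service.wait_for_deployment(show_output=True)
```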
Let's check that your echo model deployed successfully. You should be able to do a simple liveness request, as well as a scoring request:
[!INCLUDE cli v1]
```bash
curl -v http://localhost:32267

curl -v -X POST -H "content-type:application/json" \
    -d '{"query": "What color is the fox", "context": "The quick brown fox jumped over the lazy dog."}' \
    http://localhost:32267/score
```
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=call-into-model-code)]
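The equivalent call from Python is a short `requests` snippet; a sketch, where the scoring URI comes from the deployed service object rather than a hard-coded port:

```python
import requests

# The local service exposes its scoring URI, for example http://localhost:6789/score.
uri = service.scoring_uri
headers = {"Content-Type": "application/json"}
data = {
    "query": "What color is the fox",
    "context": "The quick brown fox jumped over the lazy dog.",
}
response = requests.post(uri, json=data, headers=headers)
print(response.text)
```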
Now it's time to actually load your model. First, modify your entry script:
:::code language="python" source="~/azureml-examples-main/python-sdk/tutorials/deploy-local/source_dir/score.py":::
Save this file as `score.py` inside of `source_dir`.

Notice the use of the `AZUREML_MODEL_DIR` environment variable to locate your registered model. Now that you've added some pip packages, you also need to update the inference configuration to install them.
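The pattern inside `init()` looks roughly like this sketch, assuming the registered model file is named `model.onnx`:

```python
import os
import onnxruntime

sess = None

def init():
    global sess
    # AZUREML_MODEL_DIR points at the directory where registered model
    # files are mounted inside the service's container.
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model.onnx")
    sess = onnxruntime.InferenceSession(model_path)
```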
[!INCLUDE cli v1]
:::code language="json" source="~/azureml-examples-main/python-sdk/tutorials/deploy-local/inferenceconfig.json":::
Save this file as `inferenceconfig.json`.
[!INCLUDE sdk v1]
```python
from azureml.core import Environment
from azureml.core.model import InferenceConfig

# Build an environment that installs the pip packages the scoring script needs.
env = Environment(name='myenv')
python_packages = ['nltk', 'numpy', 'onnxruntime']
for package in python_packages:
    env.python.conda_dependencies.add_pip_package(package)

inference_config = InferenceConfig(environment=env,
                                   source_directory='./source_dir',
                                   entry_script='./score.py')
```
For more information, see the documentation for LocalWebservice, Model.deploy(), and Webservice.
Deploy your service again:
[!INCLUDE cli v1]
Replace `bidaf_onnx:1` with the name of your model and its version number.

```azurecli
az ml model deploy -n myservice \
    -m bidaf_onnx:1 \
    --overwrite \
    --ic inferenceconfig.json \
    --dc deploymentconfig.json \
    -g <resource-group> \
    -w <workspace-name>
```
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=re-deploy-model-code)]
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=re-deploy-model-print-logs)]
For more information, see the documentation for Model.deploy() and Webservice.
Then ensure you can send a POST request to the service:
[!INCLUDE cli v1]
```bash
curl -v -X POST -H "content-type:application/json" \
    -d '{"query": "What color is the fox", "context": "The quick brown fox jumped over the lazy dog."}' \
    http://localhost:32267/score
```
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=send-post-request-code)]
[!INCLUDE aml-deploy-target]
Once you've confirmed your service works locally and chosen a remote compute target, you are ready to deploy to the cloud.
Change your deploy configuration to correspond to the compute target you've chosen, in this case Azure Container Instances:
[!INCLUDE cli v1]
The options available for a deployment configuration differ depending on the compute target you choose.
:::code language="json" source="~/azureml-examples-main/python-sdk/tutorials/deploy-local/re-deploymentconfig.json":::
Save this file as `re-deploymentconfig.json`.
For more information, see this reference.
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=deploy-model-on-cloud-code)]
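In SDK terms, the Azure Container Instances deployment configuration is roughly this sketch; the sizing values are assumptions, so match them to your model's needs:

```python
from azureml.core.webservice import AciWebservice

# Request CPU and memory for the container, and turn on key authentication.
deployment_config = AciWebservice.deploy_configuration(cpu_cores=0.5,
                                                       memory_gb=1,
                                                       auth_enabled=True)
```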
Deploy your service again:
[!INCLUDE cli v1]
Replace `bidaf_onnx:1` with the name of your model and its version number.

```azurecli
az ml model deploy -n myservice \
    -m bidaf_onnx:1 \
    --overwrite \
    --ic inferenceconfig.json \
    --dc re-deploymentconfig.json \
    -g <resource-group> \
    -w <workspace-name>
```
To view the service logs, use the following command:
```azurecli
az ml service get-logs -n myservice \
    -g <resource-group> \
    -w <workspace-name>
```
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=re-deploy-service-code)]
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=re-deploy-service-print-logs)]
For more information, see the documentation for Model.deploy() and Webservice.
When you deploy remotely, you may have key authentication enabled. The example below shows how to get your service key with Python in order to make an inference request.
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=call-remote-web-service-code)]
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=call-remote-webservice-print-logs)]
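Outside the notebook, the key-retrieval and request pattern is roughly this sketch:

```python
import requests

# For a key-authenticated service, retrieve the primary and secondary keys.
primary_key, secondary_key = service.get_keys()

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {primary_key}",
}
data = {
    "query": "What color is the fox",
    "context": "The quick brown fox jumped over the lazy dog.",
}
response = requests.post(service.scoring_uri, json=data, headers=headers)
print(response.text)
```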
See the article on client applications to consume web services for more example clients in other languages.
[!INCLUDE Email Notification Include]
During model deployment, you may see the service state change while it fully deploys.
The following table describes the different service states:
Webservice state | Description | Final state? |
---|---|---|
Transitioning | The service is in the process of deployment. | No |
Unhealthy | The service has deployed but is currently unreachable. | No |
Unschedulable | The service cannot be deployed at this time due to lack of resources. | No |
Failed | The service has failed to deploy due to an error or crash. | Yes |
Healthy | The service is healthy and the endpoint is available. | Yes |
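From the SDK, you can watch for a final state and pull logs while you wait; a quick sketch:

```python
# Block until the service reaches a final state, then inspect it.
service.wait_for_deployment(show_output=True)
print(service.state)       # "Healthy" or "Failed"
print(service.get_logs())  # Container logs help diagnose Unhealthy or Failed states.
```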
Tip
When deploying, Docker images for compute targets are built and loaded from Azure Container Registry (ACR). By default, Azure Machine Learning creates an ACR that uses the basic service tier. Changing the ACR for your workspace to standard or premium tier may reduce the time it takes to build and deploy images to your compute targets. For more information, see Azure Container Registry service tiers.
Note
If you're deploying a model to Azure Kubernetes Service (AKS), we advise that you enable Azure Monitor for that cluster. This will help you understand overall cluster health and resource usage.

If you try to deploy a model to an unhealthy or overloaded cluster, you can expect to experience issues. If you need help troubleshooting AKS cluster problems, contact AKS Support.
[!INCLUDE cli v1]
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/2.deploy-local-cli.ipynb?name=delete-resource-code)]
```azurecli
az ml service delete -n myservice

az ml service delete -n myaciservice

az ml model delete --model-id=<MODEL_ID>
```
To delete a deployed webservice, use `az ml service delete <name of webservice>`.

To delete a registered model from your workspace, use `az ml model delete <model id>`.
Read more about deleting a webservice and deleting a model.
[!Notebook-python[] (~/azureml-examples-main/python-sdk/tutorials/deploy-local/1.deploy-local.ipynb?name=delete-resource-code)]
To delete a deployed web service, use `service.delete()`.

To delete a registered model, use `model.delete()`.
For more information, see the documentation for WebService.delete() and Model.delete().
- Troubleshoot a failed deployment
- Update web service
- One click deployment for automated ML runs in the Azure Machine Learning studio
- Use TLS to secure a web service through Azure Machine Learning
- Monitor your Azure Machine Learning models with Application Insights
- Create event alerts and triggers for model deployments