---
title: Use batch endpoints for batch scoring using Python SDK v2 (preview)
titleSuffix: Azure Machine Learning
description: In this article, learn how to create a batch endpoint to continuously batch score large data using Python SDK v2 (preview).
services: machine-learning
ms.service: machine-learning
ms.subservice: mlops
ms.topic: how-to
author: shivanissambare
ms.author: ssambare
ms.reviewer: larryfr
ms.date: 05/25/2022
ms.custom: how-to, devplatv2, sdkv2
---

# Use batch endpoints for batch scoring using Python SDK v2 (preview)

[!INCLUDE sdk v2]

> [!IMPORTANT]
> SDK v2 is currently in public preview. The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).

Learn how to use batch endpoints for batch scoring with the Python SDK v2. Batch endpoints simplify the process of hosting your models for batch scoring, so you can focus on machine learning, not infrastructure. For more information, see What are Azure Machine Learning endpoints?.

In this article, you'll learn to:

* Connect to your Azure Machine Learning workspace from the Python SDK v2.
* Create a batch endpoint from the Python SDK v2.
* Create deployments on that endpoint from the Python SDK v2.
* Test a deployment with a sample request.

## Prerequisites

## 1. Connect to Azure Machine Learning workspace

The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, we'll connect to the workspace in which the job will be run.

  1. Import the required libraries:

    # import required libraries
    from azure.ai.ml import MLClient, Input
    from azure.ai.ml.entities import (
        BatchEndpoint,
        BatchDeployment,
        Model,
        Environment,
        BatchRetrySettings,
    )
    from azure.identity import DefaultAzureCredential
    from azure.ai.ml.constants import BatchDeploymentOutputAction
  2. Configure workspace details and get a handle to the workspace:

    To connect to a workspace, we need identifier parameters - a subscription ID, resource group, and workspace name. We'll use these details in the MLClient from azure.ai.ml to get a handle to the required Azure Machine Learning workspace. This example uses the default Azure authentication.

    # enter details of your AML workspace
    subscription_id = "<SUBSCRIPTION_ID>"
    resource_group = "<RESOURCE_GROUP>"
    workspace = "<AML_WORKSPACE_NAME>"
    # get a handle to the workspace
    ml_client = MLClient(
        DefaultAzureCredential(), subscription_id, resource_group, workspace
    )

## Create batch endpoint

Batch endpoints are endpoints that are used to do batch inferencing on large volumes of data over a period of time. Batch endpoints receive pointers to data and run jobs asynchronously to process the data in parallel on compute clusters. Batch endpoints store outputs to a data store for further analysis.

To create a batch endpoint, we'll use BatchEndpoint. This class allows the user to configure the following key aspects:

* name - Name of the endpoint. Needs to be unique at the Azure region level.
* auth_mode - The authentication method for the endpoint. Currently only Azure Active Directory (Azure AD) token-based (aad_token) authentication is supported.
* identity - The managed identity configuration for accessing Azure resources for endpoint provisioning and inference.
* defaults - Default settings for the endpoint.
  * deployment_name - Name of the deployment that will serve as the default deployment for the endpoint.
* description - Description of the endpoint.

  1. Configure the endpoint:

    # Creating a unique endpoint name with current datetime to avoid conflicts
    import datetime
    
    batch_endpoint_name = "my-batch-endpoint-" + datetime.datetime.now().strftime(
        "%Y%m%d%H%M"
    )
    
    # create a batch endpoint
    endpoint = BatchEndpoint(
        name=batch_endpoint_name,
        description="this is a sample batch endpoint",
        tags={"foo": "bar"},
    )
  2. Create the endpoint:

    Using the MLClient created earlier, we'll now create the Endpoint in the workspace. This command will start the endpoint creation and return a confirmation response while the endpoint creation continues.

    ml_client.begin_create_or_update(endpoint)
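
  3. (Optional) Check the endpoint's provisioning state:

    As a quick sanity check, you can retrieve the endpoint after creation. This is a minimal sketch; it assumes the returned BatchEndpoint exposes a provisioning_state property.

    # retrieve the endpoint and print its provisioning state (sketch)
    endpoint = ml_client.batch_endpoints.get(name=batch_endpoint_name)
    print(endpoint.name, endpoint.provisioning_state)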

## Create a deployment

A deployment is a set of resources required for hosting the model that does the actual inferencing. We'll create a deployment for our endpoint using the BatchDeployment class. This class allows the user to configure the following key aspects:

* name - Name of the deployment.
* endpoint_name - Name of the endpoint to create the deployment under.
* model - The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.
* environment - The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification.
* code_path - Path to the source code directory for scoring the model.
* scoring_script - Relative path to the scoring file in the source code directory (a sketch of this script's expected shape follows this list).
* compute - Name of the compute target to execute the batch scoring jobs on.
* instance_count - The number of nodes to use for each batch scoring job.
* max_concurrency_per_instance - The maximum number of parallel scoring_script runs per instance.
* mini_batch_size - The number of files the code_configuration.scoring_script can process in one run() call.
* retry_settings - Retry settings for scoring each mini batch.
  * max_retries - The maximum number of retries for a failed or timed-out mini batch (default is 3).
  * timeout - The timeout in seconds for scoring a mini batch (default is 30).
* output_action - Indicates how the output should be organized in the output file. Allowed values are append_row or summary_only. Default is append_row.
* output_file_name - Name of the batch scoring output file. Default is predictions.csv.
* environment_variables - Dictionary of environment variable name-value pairs to set for each batch scoring job.
* logging_level - The log verbosity level. Allowed values are warning, info, debug. Default is info.

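Before configuring the deployment, it helps to know the shape of the scoring script itself. The sketch below is illustrative rather than the article's actual digit_identification.py: batch scoring calls init() once per worker and then run() once per mini-batch, where the mini-batch is a list of file paths; with output_action set to append_row, each item returned by run() becomes a row in the output file. The model-loading and prediction details here are placeholders.

    # digit_identification.py (illustrative sketch of the scoring-script contract)
    import os

    model = None

    def init():
        # called once per worker before scoring starts; load the model here
        global model
        model_dir = os.environ["AZUREML_MODEL_DIR"]
        # model = load_model(model_dir)  # placeholder; replace with framework-specific loading

    def run(mini_batch):
        # called once per mini-batch; mini_batch is a list of file paths
        # (up to mini_batch_size files per call)
        results = []
        for file_path in mini_batch:
            # score each file with the model; a placeholder result is shown here
            results.append(f"{os.path.basename(file_path)}: <prediction>")
        # with output_action=append_row, each returned item becomes a row in
        # predictions.csv; return one result per input file
        return results
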
  1. Configure the deployment:

    # create a batch deployment
    model = Model(path="./mnist/model/")
    env = Environment(
        conda_file="./mnist/environment/conda.yml",
        image="mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest",
    )
    deployment = BatchDeployment(
        name="non-mlflow-deployment",
        description="this is a sample non-mlflow deployment",
        endpoint_name=batch_endpoint_name,
        model=model,
        code_path="./mnist/code/",
        scoring_script="digit_identification.py",
        environment=env,
        compute="cpu-cluster",
        instance_count=2,
        max_concurrency_per_instance=2,
        mini_batch_size=10,
        output_action=BatchDeploymentOutputAction.APPEND_ROW,
        output_file_name="predictions.csv",
        retry_settings=BatchRetrySettings(max_retries=3, timeout=30),
        logging_level="info",
    )
  2. Create the deployment:

    Using the MLClient created earlier, we'll now create the deployment in the workspace. This command will start the deployment creation and return a confirmation response while the deployment creation continues.

    ml_client.begin_create_or_update(deployment)
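
  3. (Optional) Set the deployment as the endpoint default:

    The endpoint was created without a default deployment, so invoking it requires an explicit deployment_name (note the comment in the invoke step below). As a minimal sketch, assuming the endpoint's defaults setting exposes deployment_name as described in the endpoint configuration list, you could promote the new deployment like this:

    # set the default deployment (sketch; assumes defaults.deployment_name is settable)
    endpoint = ml_client.batch_endpoints.get(name=batch_endpoint_name)
    endpoint.defaults.deployment_name = deployment.name
    ml_client.begin_create_or_update(endpoint)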

## Test the endpoint with sample data

Using the MLClient created earlier, we'll get a handle to the endpoint. The endpoint can be invoked using the invoke command with the following parameters:

* endpoint_name - Name of the endpoint to invoke.
* input_data - The input data to score (here, a path to the folder where the sample data is present).
* deployment_name - Name of the specific deployment to test in an endpoint.

  1. Invoke the endpoint:

    # create an input from the folder path
    input = Input(path="https://pipelinedata.blob.core.windows.net/sampledata/mnist")
    
    # invoke the endpoint for batch scoring job
    job = ml_client.batch_endpoints.invoke(
        endpoint_name=batch_endpoint_name,
        input_data=input,
        deployment_name="non-mlflow-deployment",  # name is required as default deployment is not set
        params_override=[{"mini_batch_size": "20"}, {"compute.instance_count": "4"}],
    )
  2. Get the details of the invoked job:

    Let's get the details and logs of the invoked job:

    # get the details of the job
    job_name = job.name
    batch_job = ml_client.jobs.get(name=job_name)
    print(batch_job.status)
    # stream the job logs
    ml_client.jobs.stream(name=job_name)
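
  3. Download the scoring results:

    Once the job completes, you can download its output. This is a minimal sketch; the output name score is an assumption about how the batch scoring output (the predictions.csv configured earlier) is registered on the job.

    # download the scoring output to the current directory
    # (output_name="score" is an assumed default, not confirmed by this article)
    ml_client.jobs.download(name=job_name, download_path=".", output_name="score")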

## Clean up resources

Delete the endpoint:

    ml_client.batch_endpoints.begin_delete(name=batch_endpoint_name)

## Next steps

If you encounter problems using batch endpoints, see Troubleshooting batch endpoints.