title	titleSuffix	description	services	author	ms.author	ms.service	ms.subservice	ms.custom	ms.topic	ms.date
Set up AutoML for computer vision	Azure Machine Learning	Set up Azure Machine Learning automated ML to train computer vision models with the CLI v2 and Python SDK v2 (preview).	machine-learning	swatig007	swatig	machine-learning	automl	event-tier1-build-2022	how-to	05/26/2022

Set up AutoML to train computer vision models

[!INCLUDE sdk v2]

[!div class="op_single_selector" title1="Select the version of Azure Machine Learning CLI extension you are using:"]

v1

v2 (current version)

Important

This feature is currently in public preview. This preview version is provided without a service-level agreement. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

In this article, you learn how to train computer vision models on image data with automated ML with the Azure Machine Learning CLI extension v2 or the Azure Machine Learning Python SDK v2 (preview).

Automated ML supports model training for computer vision tasks like image classification, object detection, and instance segmentation. Authoring AutoML models for computer vision tasks is currently supported via the Azure Machine Learning Python SDK. The resulting experimentation runs, models, and outputs are accessible from the Azure Machine Learning studio UI. Learn more about automated ml for computer vision tasks on image data.

Prerequisites

CLI v2

An Azure Machine Learning workspace. To create the workspace, see Create an Azure Machine Learning workspace.
Install and set up CLI (v2) and make sure you install the ml extension.

Python SDK v2 (preview)

An Azure Machine Learning workspace. To create the workspace, see Create an Azure Machine Learning workspace.
The Azure Machine Learning Python SDK v2 (preview) installed.

To install the SDK you can either,
- Create a compute instance, which automatically installs the SDK and is pre-configured for ML workflows. For more information, see Create and manage an Azure Machine Learning compute instance.
- Use the following commands to install Azure ML Python SDK v2:
  - Uninstall previous preview version:
```
pip uninstall azure-ai-ml
```
  - Install the Azure ML Python SDK v2:
```
pip install azure-ai-ml
```
[!NOTE] Only Python 3.6 and 3.7 are compatible with automated ML support for computer vision tasks.

Select your task type

Automated ML for images supports the following task types:

Task type	AutoML Job syntax
image classification	CLI v2: `image_classification` SDK v2: `image_classification()`
image classification multi-label	CLI v2: `image_classification_multilabel` SDK v2: `image_classification_multilabel()`
image object detection	CLI v2: `image_object_detection` SDK v2: `image_object_detection()`
image instance segmentation	CLI v2: `image_instance_segmentation` SDK v2: `image_instance_segmentation()`

CLI v2

[!INCLUDE cli v2]

This task type is a required parameter and can be set using the task key.

For example:

task: image_object_detection

Python SDK v2 (preview)

Based on the task type, you can create automl image jobs using task specific automl functions.

For example:

from azure.ai.ml import automl
image_object_detection_job = automl.image_object_detection()

Training and validation data

In order to generate computer vision models, you need to bring labeled image data as input for model training in the form of an MLTable. You can create an MLTable from training data in JSONL format.

If your training data is in a different format (like, pascal VOC or COCO), you can apply the helper scripts included with the sample notebooks to convert the data to JSONL. Learn more about how to prepare data for computer vision tasks with automated ML.

Note

The training data needs to have at least 10 images in order to be able to submit an AutoML run.

Warning

Creation of MLTable is only supported using the SDK and CLI to create from data in JSONL format for this capability. Creating the MLTable via UI is not supported at this time.

JSONL schema samples

The structure of the TabularDataset depends upon the task at hand. For computer vision task types, it consists of the following fields:

Field	Description
`image_url`	Contains filepath as a StreamInfo object
`image_details`	Image metadata information consists of height, width, and format. This field is optional and hence may or may not exist.
`label`	A json representation of the image label, based on the task type.

The following is a sample JSONL file for image classification:

{
      "image_url": "AmlDatastore://image_data/Image_01.png",
      "image_details":
      {
          "format": "png",
          "width": "2230px",
          "height": "4356px"
      },
      "label": "cat"
  }
  {
      "image_url": "AmlDatastore://image_data/Image_02.jpeg",
      "image_details":
      {
          "format": "jpeg",
          "width": "3456px",
          "height": "3467px"
      },
      "label": "dog"
  }

The following code is a sample JSONL file for object detection:

{
    "image_url": "AmlDatastore://image_data/Image_01.png",
    "image_details":
    {
        "format": "png",
        "width": "2230px",
        "height": "4356px"
    },
    "label":
    {
        "label": "cat",
        "topX": "1",
        "topY": "0",
        "bottomX": "0",
        "bottomY": "1",
        "isCrowd": "true",
    }
}
{
    "image_url": "AmlDatastore://image_data/Image_02.png",
    "image_details":
    {
        "format": "jpeg",
        "width": "1230px",
        "height": "2356px"
    },
    "label":
    {
        "label": "dog",
        "topX": "0",
        "topY": "1",
        "bottomX": "0",
        "bottomY": "1",
        "isCrowd": "false",
    }
}

Consume data

Once your data is in JSONL format, you can create training and validation MLTable as shown below.

:::code language="yaml" source="~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/data/training-mltable-folder/MLTable":::

Automated ML doesn't impose any constraints on training or validation data size for computer vision tasks. Maximum dataset size is only limited by the storage layer behind the dataset (i.e. blob store). There's no minimum number of images or labels. However, we recommend starting with a minimum of 10-15 samples per label to ensure the output model is sufficiently trained. The higher the total number of labels/classes, the more samples you need per label.

CLI v2

[!INCLUDE cli v2]

Training data is a required parameter and is passed in using the training key of the data section. You can optionally specify another MLtable as a validation data with the validation key. If no validation data is specified, 20% of your training data will be used for validation by default, unless you pass validation_data_size argument with a different value.

Target column name is a required parameter and used as target for supervised ML task. It's passed in using the target_column_name key in the data section. For example,

target_column_name: label
training_data:
  path: data/training-mltable-folder
  type: mltable
validation_data:
  path: data/validation-mltable-folder
  type: mltable

Python SDK v2 (preview)

You can create data inputs from training and validation MLTable from your local directory or cloud storage with the following code:

[!Notebook-python[] (~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=data-load)]

Training data is a required parameter and is passed in using the training_data parameter of the task specific automl type function. You can optionally specify another MLTable as a validation data with the validation_data parameter. If no validation data is specified, 20% of your training data will be used for validation by default, unless you pass validation_data_size argument with a different value.

Target column name is a required parameter and used as target for supervised ML task. It's passed in using the target_column_name parameter of the task specific automl function. For example,

from azure.ai.ml import automl
image_object_detection_job = automl.image_object_detection(
    training_data=my_training_data_input,
    validation_data=my_validation_data_input,
    target_column_name="label"
)

Compute to run experiment

Provide a compute target for automated ML to conduct model training. Automated ML models for computer vision tasks require GPU SKUs and support NC and ND families. We recommend the NCsv3-series (with v100 GPUs) for faster training. A compute target with a multi-GPU VM SKU leverages multiple GPUs to also speed up training. Additionally, when you set up a compute target with multiple nodes you can conduct faster model training through parallelism when tuning hyperparameters for your model.

The compute target is passed in using the compute parameter. For example:

CLI v2

[!INCLUDE cli v2]

compute: azureml:gpu-cluster

Python SDK v2 (preview)

from azure.ai.ml import automl

compute_name = "gpu-cluster"
image_object_detection_job = automl.image_object_detection(
    compute=compute_name,
)

Configure model algorithms and hyperparameters

With support for computer vision tasks, you can control the model algorithm and sweep hyperparameters. These model algorithms and hyperparameters are passed in as the parameter space for the sweep.

The model algorithm is required and is passed in via model_name parameter. You can either specify a single model_name or choose between multiple.

Supported model algorithms

The following table summarizes the supported models for each computer vision task.

Task	Model algorithms	String literal syntax `default_model`* denoted with *
Image classification (multi-class and multi-label)	MobileNet: Light-weighted models for mobile applications ResNet: Residual networks ResNeSt: Split attention networks SE-ResNeXt50: Squeeze-and-Excitation networks ViT: Vision transformer networks	`mobilenetv2` `resnet18` `resnet34` `resnet50` `resnet101` `resnet152` `resnest50` `resnest101` `seresnext` `vits16r224` (small) `vitb16r224`* (base) `vitl16r224` (large)
Object detection	YOLOv5: One stage object detection model Faster RCNN ResNet FPN: Two stage object detection models RetinaNet ResNet FPN: address class imbalance with Focal Loss Note: Refer to `model_size` hyperparameter for YOLOv5 model sizes.	`yolov5`* `fasterrcnn_resnet18_fpn` `fasterrcnn_resnet34_fpn` `fasterrcnn_resnet50_fpn` `fasterrcnn_resnet101_fpn` `fasterrcnn_resnet152_fpn` `retinanet_resnet50_fpn`
Instance segmentation	MaskRCNN ResNet FPN	`maskrcnn_resnet18_fpn` `maskrcnn_resnet34_fpn` `maskrcnn_resnet50_fpn`* `maskrcnn_resnet101_fpn` `maskrcnn_resnet152_fpn` `maskrcnn_resnet50_fpn`

In addition to controlling the model algorithm, you can also tune hyperparameters used for model training. While many of the hyperparameters exposed are model-agnostic, there are instances where hyperparameters are task-specific or model-specific. Learn more about the available hyperparameters for these instances.

Data augmentation

In general, deep learning model performance can often improve with more data. Data augmentation is a practical technique to amplify the data size and variability of a dataset which helps to prevent overfitting and improve the model’s generalization ability on unseen data. Automated ML applies different data augmentation techniques based on the computer vision task, before feeding input images to the model. Currently, there is no exposed hyperparameter to control data augmentations.

Task	Impacted dataset	Data augmentation technique(s) applied
Image classification (multi-class and multi-label)	Training Validation & Test	Random resize and crop, horizontal flip, color jitter (brightness, contrast, saturation, and hue), normalization using channel-wise ImageNet’s mean and standard deviation Resize, center crop, normalization
Object detection, instance segmentation	Training Validation & Test	Random crop around bounding boxes, expand, horizontal flip, normalization, resize Normalization, resize
Object detection using yolov5	Training Validation & Test	Mosaic, random affine (rotation, translation, scale, shear), horizontal flip Letterbox resizing

Configure your experiment settings

Before doing a large sweep to search for the optimal models and hyperparameters, we recommend trying the default values to get a first baseline. Next, you can explore multiple hyperparameters for the same model before sweeping over multiple models and their parameters. This way, you can employ a more iterative approach, because with multiple models and multiple hyperparameters for each, the search space grows exponentially and you need more iterations to find optimal configurations.

CLI v2

[!INCLUDE cli v2]

If you wish to use the default hyperparameter values for a given algorithm (say yolov5), you can specify it using model_name key in image_model section. For example,

image_model:
    model_name: "yolov5"

Python SDK v2 (preview)

If you wish to use the default hyperparameter values for a given algorithm (say yolov5), you can specify it using model_name parameter in set_image_model method of the task specific automl job. For example,

image_object_detection_job.set_image_model(model_name="yolov5")

Once you've built a baseline model, you might want to optimize model performance in order to sweep over the model algorithm and hyperparameter space. You can use the following sample config to sweep over the hyperparameters for each algorithm, choosing from a range of values for learning_rate, optimizer, lr_scheduler, etc., to generate a model with the optimal primary metric. If hyperparameter values are not specified, then default values are used for the specified algorithm.

Primary metric

The primary metric used for model optimization and hyperparameter tuning depends on the task type. Using other primary metric values is currently not supported.

accuracy for IMAGE_CLASSIFICATION
iou for IMAGE_CLASSIFICATION_MULTILABEL
mean_average_precision for IMAGE_OBJECT_DETECTION
mean_average_precision for IMAGE_INSTANCE_SEGMENTATION

Experiment budget

You can optionally specify the maximum time budget for your AutoML Vision training job using the timeout parameter in the limits - the amount of time in minutes before the experiment terminates. If none specified, default experiment timeout is seven days (maximum 60 days). For example,

CLI v2

[!INCLUDE cli v2]

limits:
  timeout: 60

Python SDK v2 (preview)

[!Notebook-python[] (~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=limit-settings)]

Sweeping hyperparameters for your model

When training computer vision models, model performance depends heavily on the hyperparameter values selected. Often, you might want to tune the hyperparameters to get optimal performance. With support for computer vision tasks in automated ML, you can sweep hyperparameters to find the optimal settings for your model. This feature applies the hyperparameter tuning capabilities in Azure Machine Learning. Learn how to tune hyperparameters.

Define the parameter search space

You can define the model algorithms and hyperparameters to sweep in the parameter space.

See Configure model algorithms and hyperparameters for the list of supported model algorithms for each task type.
See Hyperparameters for computer vision tasks hyperparameters for each computer vision task type.
See details on supported distributions for discrete and continuous hyperparameters.

Sampling methods for the sweep

When sweeping hyperparameters, you need to specify the sampling method to use for sweeping over the defined parameter space. Currently, the following sampling methods are supported with the sampling_algorithm parameter:

Sampling type	AutoML Job syntax
Random Sampling	`random`
Grid Sampling	`grid`
Bayesian Sampling	`bayesian`

Note

Currently only random sampling supports conditional hyperparameter spaces.

Early termination policies

You can automatically end poorly performing runs with an early termination policy. Early termination improves computational efficiency, saving compute resources that would have been otherwise spent on less promising configurations. Automated ML for images supports the following early termination policies using the early_termination parameter. If no termination policy is specified, all configurations are run to completion.

Early termination policy	AutoML Job syntax
Bandit policy	CLI v2: `bandit` SDK v2: `BanditPolicy()`
Median stopping policy	CLI v2: `median_stopping` SDK v2: `MedianStoppingPolicy()`
Truncation selection policy	CLI v2: `truncation_selection` SDK v2: `TruncationSelectionPolicy()`

Learn more about how to configure the early termination policy for your hyperparameter sweep.

Resources for the sweep

You can control the resources spent on your hyperparameter sweep by specifying the max_trials and the max_concurrent_trials for the sweep.

Note

For a complete sweep configuration sample, please refer to this tutorial.

Parameter	Detail
`max_trials`	Required parameter for maximum number of configurations to sweep. Must be an integer between 1 and 1000. When exploring just the default hyperparameters for a given model algorithm, set this parameter to 1.
`max_concurrent_trials`	Maximum number of runs that can run concurrently. If not specified, all runs launch in parallel. If specified, must be an integer between 1 and 100. NOTE: The number of concurrent runs is gated on the resources available in the specified compute target. Ensure that the compute target has the available resources for the desired concurrency.

You can configure all the sweep related parameters as shown in the example below.

CLI v2

[!INCLUDE cli v2]

sweep:
  limits:
    max_trials: 10
    max_concurrent_trials: 2
  sampling_algorithm: random
  early_termination:
    type: bandit
    evaluation_interval: 2
    slack_factor: 0.2
    delay_evaluation: 6

Python SDK v2 (preview)

[!Notebook-python[] (~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=sweep-settings)]

Fixed settings

You can pass fixed settings or parameters that don't change during the parameter space sweep as shown below.

CLI v2

[!INCLUDE cli v2]

image_model:
  early_stopping: True
  evaluation_frequency: 1

Python SDK v2 (preview)

[!Notebook-python[] (~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=pass-arguments)]

Incremental training (optional)

Once the training run is done, you have the option to further train the model by loading the trained model checkpoint. You can either use the same dataset or a different one for incremental training.

Pass the checkpoint via run ID

You can pass the run ID that you want to load the checkpoint from.

CLI v2

[!INCLUDE cli v2]

image_model:
  checkpoint_run_id : "target_checkpoint_run_id"

Python SDK v2 (preview)

To find the run ID from the desired model, you can use the following code.

# find a run id to get a model checkpoint from
import mlflow

# Obtain the tracking URL from MLClient
MLFLOW_TRACKING_URI = ml_client.workspaces.get(
    name=ml_client.workspace_name
).mlflow_tracking_uri
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)

from mlflow.tracking.client import MlflowClient

mlflow_client = MlflowClient()
mlflow_parent_run = mlflow_client.get_run(automl_job.name)

# Fetch the id of the best automl child run.
target_checkpoint_run_id = mlflow_parent_run.data.tags["automl_best_child_run_id"]

To pass a checkpoint via the run ID, you need to use the checkpoint_run_id parameter in set_image_model function.

image_object_detection_job = automl.image_object_detection(
    compute=compute_name,
    experiment_name=exp_name,
    training_data=my_training_data_input,
    validation_data=my_validation_data_input,
    target_column_name="label",
    primary_metric="mean_average_precision",
    tags={"my_custom_tag": "My custom value"},
)

image_object_detection_job.set_image_model(checkpoint_run_id=target_checkpoint_run_id)

automl_image_job_incremental = ml_client.jobs.create_or_update(
    image_object_detection_job
)

Submit the AutoML job

CLI v2

[!INCLUDE cli v2]

To submit your AutoML job, you run the following CLI v2 command with the path to your .yml file, workspace name, resource group and subscription ID.

az ml job create --file ./hello-automl-job-basic.yml --workspace-name [YOUR_AZURE_WORKSPACE] --resource-group [YOUR_AZURE_RESOURCE_GROUP] --subscription [YOUR_AZURE_SUBSCRIPTION]

Python SDK v2 (preview)

When you've configured your AutoML Job to the desired settings, you can submit the job.

[!Notebook-python[] (~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=submit-run)]

Outputs and evaluation metrics

The automated ML training runs generates output model files, evaluation metrics, logs and deployment artifacts like the scoring file and the environment file which can be viewed from the outputs and logs and metrics tab of the child runs.

Tip

Check how to navigate to the run results from the View run results section.

For definitions and examples of the performance charts and metrics provided for each run, see Evaluate automated machine learning experiment results

Register and deploy model

Once the run completes, you can register the model that was created from the best run (configuration that resulted in the best primary metric).

You can deploy the model from the Azure Machine Learning studio UI. Navigate to the model you wish to deploy in the Models tab of the automated ML run and select the Deploy.

You can configure the model deployment endpoint name and the inferencing cluster to use for your model deployment in the Deploy a model pane.

Code examples

CLI v2

Review detailed code examples and use cases in the azureml-examples repository for automated machine learning samples.

Python SDK v2 (preview)

Review detailed code examples and use cases in the GitHub notebook repository for automated machine learning samples.

Next steps

Tutorial: Train an object detection model (preview) with AutoML and Python.
Troubleshoot automated ML experiments.

Files

how-to-auto-train-image-models.md

Latest commit

History

how-to-auto-train-image-models.md

File metadata and controls

Set up AutoML to train computer vision models

Prerequisites

Select your task type

Training and validation data

JSONL schema samples

Consume data

Compute to run experiment

Configure model algorithms and hyperparameters

Supported model algorithms

Data augmentation

Configure your experiment settings

Primary metric

Experiment budget

Sweeping hyperparameters for your model

Define the parameter search space

Sampling methods for the sweep

Early termination policies

Resources for the sweep

Fixed settings

Incremental training (optional)

Pass the checkpoint via run ID

Submit the AutoML job

[!Notebook-python[] (~/azureml-examples-main/sdk/jobs/automl-standalone-jobs/automl-image-object-detection-task-fridge-items/automl-image-object-detection-task-fridge-items.ipynb?name=submit-run)]

Outputs and evaluation metrics

Register and deploy model

Code examples

Next steps