title: Export or delete workspace data
titleSuffix: Azure Machine Learning
description: Learn how to export or delete your workspace with the Azure Machine Learning studio, CLI, SDK, and authenticated REST APIs.
services: machine-learning
ms.service: machine-learning
ms.subservice: mldata
author: lgayhardt
ms.author: lagayhar
ms.date: 10/21/2021
ms.topic: how-to

Export or delete your Machine Learning service workspace data

In Azure Machine Learning, you can export or delete your workspace data using either the portal's graphical interface or the Python SDK. This article describes both options.

[!INCLUDE GDPR-related guidance]

Control your workspace data

In-product data stored by Azure Machine Learning is available for export and deletion. You can export and delete it using Azure Machine Learning studio, the CLI, and the SDK. Telemetry data can be accessed through the Azure Privacy portal.

In Azure Machine Learning, personal data consists of user information in run history documents.

Delete high-level resources using the portal

When you create a workspace, Azure creates several resources within the resource group:

  • The workspace itself
  • A storage account
  • A container registry
  • An Application Insights instance
  • A key vault

These resources can be deleted by selecting them from the list and choosing Delete.

:::image type="content" source="media/how-to-export-delete-data/delete-resource-group-resources.png" alt-text="Screenshot of portal, with delete icon highlighted":::

Run history documents, which may contain personal user information, are stored in the storage account in blob storage, in subfolders of /azureml. You can download and delete the data from the portal.

:::image type="content" source="media/how-to-export-delete-data/storage-account-folders.png" alt-text="Screenshot of azureml directory in storage account, within the portal":::
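The same download-then-delete step can also be scripted instead of done by hand in the portal. The following is a minimal sketch, not an official procedure: it assumes you pass in an object that behaves like `azure.storage.blob.ContainerClient`, and the function name `export_blobs`, the prefix, and the destination folder are all hypothetical choices made for illustration.

```python
from pathlib import Path


def export_blobs(container_client, prefix, dest_dir, delete_after=False):
    """Download every blob under `prefix`, optionally deleting each afterwards.

    `container_client` is assumed to behave like
    azure.storage.blob.ContainerClient (list_blobs, download_blob, delete_blob).
    Returns the names of the blobs that were exported.
    """
    dest_root = Path(dest_dir)
    exported = []
    # Materialize the listing first so deletions don't interfere with paging.
    blobs = list(container_client.list_blobs(name_starts_with=prefix))
    for blob in blobs:
        # Mirror the blob's virtual folder structure on local disk.
        local_path = dest_root / blob.name.lstrip("/")
        local_path.parent.mkdir(parents=True, exist_ok=True)
        local_path.write_bytes(container_client.download_blob(blob.name).readall())
        exported.append(blob.name)
        if delete_after:
            # Delete only after the local copy has been written.
            container_client.delete_blob(blob.name)
    return exported


# Hypothetical usage; the connection string, container name, and prefix are
# placeholders to replace with values from your own storage account:
#
#   from azure.storage.blob import ContainerClient
#   client = ContainerClient.from_connection_string(conn_str, container_name="azureml")
#   export_blobs(client, "<run-folder-prefix>/", "./export", delete_after=False)
```

Running once with `delete_after=False` and checking the local copy before deleting is the more cautious order, because blob deletion may not be recoverable depending on the storage account's soft-delete settings.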

Export and delete machine learning resources using Azure Machine Learning studio

Azure Machine Learning studio provides a unified view of your machine learning resources, such as notebooks, datasets, models, and experiments. Azure Machine Learning studio emphasizes preserving a record of your data and experiments. Pipelines and compute resources can be deleted using the browser: navigate to the resource in question and choose Delete.

Datasets can be unregistered and Experiments can be archived, but these operations don't delete the data. To entirely remove the data, datasets and experiment data must be deleted at the storage level. Deleting at the storage level is done using the portal, as described previously. An individual Run can be deleted directly in studio. Deleting a Run deletes the Run's data.

Note

Prior to unregistering a Dataset, use its Data source link to find the specific Data URL to delete.

You can download training artifacts from experiment runs using studio. Choose the Experiment and Run in which you're interested. Choose Output + logs and navigate to the specific artifacts you wish to download. Choose ... and Download.

You can download a registered model by navigating to the Model and choosing Download.

:::image type="content" source="media/how-to-export-delete-data/model-download.png" alt-text="Screenshot of studio model page with download option highlighted":::

Export and delete resources using the Python SDK

You can download the outputs of a particular run using:

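from azureml.core import Workspace

# Assumes a workspace config.json file is available locally
ws = Workspace.from_config()

# Retrieved from Azure Machine Learning web UI
run_id = 'aaaaaaaa-bbbb-cccc-dddd-0123456789AB'
experiment = ws.experiments['my-experiment']
run = next(run for run in experiment.get_runs() if run.id == run_id)
# Look up the named outputs of the pipeline run
metrics_output_port = run.get_pipeline_output('metrics_output')
model_output_port = run.get_pipeline_output('model_output')

# Download each output to the current directory
metrics_output_port.download('.', show_progress=True)
model_output_port.download('.', show_progress=True)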
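The snippet above relies on named pipeline outputs. For a run that is not a pipeline run, one possible approach is to list the run's files and download those under a chosen folder. This is a sketch under assumptions: the helper name `download_run_files` is hypothetical, the `outputs/` prefix is assumed to be where the run wrote its artifacts, and `run` is expected to behave like `azureml.core.Run`.

```python
import os


def download_run_files(run, dest_dir, prefix="outputs/"):
    """Download the run's files whose names start with `prefix`.

    `run` is assumed to behave like azureml.core.Run
    (get_file_names, download_file). Returns the downloaded file names.
    """
    downloaded = []
    for name in run.get_file_names():
        if not name.startswith(prefix):
            continue
        target = os.path.join(dest_dir, name)
        # Create the local folder that mirrors the run's folder layout.
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        run.download_file(name, output_file_path=target)
        downloaded.append(name)
    return downloaded


# Hypothetical usage, assuming `ws` is an authenticated Workspace and
# `run_id` was copied from the web UI:
#
#   from azureml.core import Run
#   run = Run.get(ws, run_id)   # look the run up directly by its ID
#   download_run_files(run, "./export")
```

Passing `prefix=""` would export every file recorded for the run, including logs, which may be what you want when the goal is a complete export of a user's run data.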

The following machine learning resources can be deleted using the Python SDK:

| Type | Function Call | Notes |
| --- | --- | --- |
| Workspace | delete | Use delete_dependent_resources to cascade the delete |
| Model | delete | |
| ComputeTarget | delete | |
| Webservice | delete | |
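As an illustration of the delete calls listed above, the sketch below wraps them in a small helper. The helper name `delete_resources` and the resource names in the usage comment are hypothetical; the only SDK behavior it relies on is that each resource object exposes `delete()` and `name`, and that `Workspace.delete` accepts `delete_dependent_resources`.

```python
def delete_resources(resources, workspace=None, cascade=False):
    """Call delete() on each resource, then optionally on the workspace.

    `resources` are assumed to be SDK objects such as Model, ComputeTarget,
    or Webservice instances. When `workspace` is given, it is deleted last;
    `cascade=True` asks the SDK to delete its dependent resources as well.
    Returns the names of everything that was deleted, in order.
    """
    deleted = []
    for resource in resources:
        resource.delete()
        deleted.append(resource.name)
    if workspace is not None:
        workspace.delete(delete_dependent_resources=cascade)
        deleted.append(workspace.name)
    return deleted


# Hypothetical usage, assuming `ws` is an authenticated Workspace and the
# names below exist in it:
#
#   from azureml.core.model import Model
#   from azureml.core.webservice import Webservice
#   from azureml.core.compute import ComputeTarget
#
#   delete_resources([
#       Webservice(ws, "my-service"),
#       Model(ws, "my-model"),
#       ComputeTarget(ws, "my-cluster"),
#   ])
```

Listing web services before the models they serve is a deliberate choice in this sketch, on the assumption that a model still in use by a deployed service may fail to delete.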