---
title: Debug and troubleshoot Machine Learning Pipelines in Application Insights
titleSuffix: Azure Machine Learning
description: Add logging to your training and batch scoring pipelines and view the logged results in Application Insights.
services: machine-learning
author: anrode
ms.author: anrode
ms.reviewer: anrode
ms.service: machine-learning
ms.subservice: core
ms.workload: data-services
ms.topic: conceptual
ms.date: 01/15/2020
ms.custom: seodec18
---
# Debug and troubleshoot Machine Learning Pipelines in Application Insights

[!INCLUDE [applies-to-skus](../../includes/aml-applies-to-basic-enterprise-sku.md)]

The [OpenCensus](https://opencensus.io/quickstart/python/) Python library can be used to route logs to Application Insights from your scripts. Collecting the logs from multiple pipeline runs in Application Insights lets you track trends over time across similar pipeline runs, or compare pipeline runs with different parameters and data.

It also provides a history of exceptions and error messages. Because Application Insights integrates with Azure Alerts, you can create alerts based on Application Insights queries.
## Prerequisites

* Follow the steps to create an [Azure Machine Learning](./how-to-manage-workspace.md) workspace and [create your first pipeline](./how-to-create-your-first-pipeline.md)
* [Configure your development environment](./how-to-configure-environment.md) to install the Azure Machine Learning SDK. Visual Studio Code was used to write the Python scripts in this example
* Follow this [guide](https://code.visualstudio.com/docs/python/python-tutorial) to set up your Visual Studio Code Python environment
* We recommend creating a virtual environment and [installing new packages](https://code.visualstudio.com/docs/python/python-tutorial#_install-and-use-packages) there
* Install the [Python OpenCensus package](https://pypi.org/project/opencensus/)
* Create an [Application Insights instance](../azure-monitor/app/opencensus-python.md) (this doc also contains information on getting the connection string for the resource)
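The OpenCensus pieces above can be installed with pip. A minimal sketch, assuming you want both the core library and the Azure exporter that provides `AzureLogHandler` (the `opencensus-ext-azure` distribution on PyPI):

```shell
# Install the OpenCensus core package and the Azure Monitor exporter,
# ideally inside the virtual environment created above.
pip install opencensus opencensus-ext-azure
```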
## Getting Started

The following is a quickstart for using OpenCensus for this use case. For a detailed tutorial, see [OpenCensus Azure Monitor Exporters](https://github.com/census-instrumentation/opencensus-python/tree/master/contrib/opencensus-ext-azure).

After you install the OpenCensus Python library, import the AzureLogHandler class, which routes logs to Application Insights. You will also need the Python logging library.
```python
from opencensus.ext.azure.log_exporter import AzureLogHandler
import logging
```
Then, create a Python logger and add an AzureLogHandler to it. You will also need to either set the `APPLICATIONINSIGHTS_CONNECTION_STRING` environment variable or provide the connection string inline.
```python
# Use OpenCensus logging
logger = logging.getLogger(__name__)

# If you do not want to use the env variable, instantiate the handler this way:
handler = AzureLogHandler(connection_string='<connection string>')
logger.addHandler(handler)

# Otherwise, you must set the env variable APPLICATIONINSIGHTS_CONNECTION_STRING
try:
    logger.addHandler(AzureLogHandler())
except ValueError:
    logger.warning("Could not find Application Insights key. Either set the APPLICATIONINSIGHTS_CONNECTION_STRING "
                   "environment variable or pass a connection_string to AzureLogHandler.")
```
## Logging with Custom Dimensions

Plaintext log strings are helpful when an engineer or data scientist is diagnosing one specific pipeline step and already has context on the experiment, parent pipeline, and step being evaluated.
In other cases, such as when someone is managing several models, tracking a model's performance over time, or doesn't have time to dive into each individual step and download the logs to view progress, Custom Dimensions can provide helpful context to a log message.

Custom Dimensions are a dictionary of key/value pairs (stored as string, string) that is sent to Application Insights and displayed as a column in the query results. Individual dimensions can be queried on.
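Because the dimensions are stored as string/string pairs, non-string values such as run IDs or counters are easiest to handle by coercing them up front. A minimal sketch; the helper name `as_custom_dimensions` is hypothetical, not part of any SDK:

```python
def as_custom_dimensions(values):
    """Coerce all keys and values to strings, matching how
    Application Insights stores custom dimensions."""
    return {str(k): str(v) for k, v in values.items()}

dims = as_custom_dimensions({"step_id": 42, "run_type": "training"})
# dims == {"step_id": "42", "run_type": "training"}
```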
### Helpful dimensions to include

| Field | Reasoning/Example |
|--------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| parent_run_id | Can query logs with the same parent_run_id to see logs over time for all steps, instead of having to dive into each individual step |
| step_id | Can query logs with the same step_id to see where an issue occurred, with a narrow scope of just the individual step |
| step_name | Can query across logs to find how a specific step has performed over time, or to find a step_id for recent runs without diving into the portal UI |
| experiment_name | Can query across logs to find how a specific experiment has performed over time, or to find a parent_run_id or step_id for recent runs without diving into the portal UI |
| experiment_url | Can provide a link directly back to the experiment run for further investigation with fewer clicks, or to drill into from a dashboard |
| build_url and/or build_version | Can correlate logs to the code version that provided the step and pipeline logic. This can further help to diagnose issues, or identify models with specific traits (log/metric values) |
| run_type | Can differentiate between different model types, or training vs. scoring runs |
### Creating the custom dimensions dictionary

```python
import os

from azureml.core import Run

run = Run.get_context(allow_offline=False)

# Get the build ID from an environment variable
build_id = os.environ["BUILD_ID"]

custom_dimensions = {
    "parent_run_id": run.parent.id,
    "step_id": run.id,
    "step_name": run.name,
    "experiment_name": run.experiment.name,
    "run_url": run.parent.get_portal_url(),
    "build_id": build_id,
    # Construct the Azure DevOps URL from the build ID
    "build_url": f"https://dev.azure.com/<your org here>/<your project here>/_build/results?buildId={build_id}&view=results",
    "run_type": "training"
}

# logger has the AzureLogHandler registered previously
logger.info("Info for Application Insights", extra={"custom_dimensions": custom_dimensions})
```
## OpenCensus Python logging considerations

The OpenCensus AzureLogHandler is used to route traditional Python logs to Application Insights. As a result, normal Python logging nuances apply. For example, when a logger is created, it has a default log level and will show logs greater than or equal to that level. A good reference for understanding and effectively using the Python logging features is the [Logging Cookbook](https://docs.python.org/3/howto/logging-cookbook.html).
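To illustrate the level-filtering behavior, the sketch below uses a plain StreamHandler in place of the AzureLogHandler (the logger name is arbitrary): records below the logger's level never reach any attached handler.

```python
import io
import logging

logger = logging.getLogger("aml_pipeline_example")
logger.setLevel(logging.WARNING)

# Capture handler output in memory so we can see what got through.
buffer = io.StringIO()
logger.addHandler(logging.StreamHandler(buffer))

logger.info("dropped: below the WARNING threshold")
logger.warning("kept: at or above the threshold")

print(buffer.getvalue())  # only the warning line appears
```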
The `APPLICATIONINSIGHTS_CONNECTION_STRING` environment variable is needed by the OpenCensus library. Consider setting this environment variable instead of passing it in as a pipeline parameter to reduce the number of parameters needed and to avoid passing around plaintext connection strings.
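A minimal sketch of that pattern: the variable is set outside the script (the value below is a placeholder), and the script only reads it, so no connection string appears in pipeline parameters or source code.

```python
import os

# Placeholder value; in practice this is set in the pipeline step's
# environment definition, not in the script itself.
os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"] = "InstrumentationKey=<your key here>"

# AzureLogHandler() with no arguments reads this variable at construction time.
connection_string = os.environ.get("APPLICATIONINSIGHTS_CONNECTION_STRING")
print(connection_string is not None)
```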
## Querying logs in Application Insights

The logs routed to Application Insights will show up under 'traces'. Be sure to adjust your time window to include your pipeline run.

![Application Insights Query result](./media/how-to-debug-pipelines-application-insights/traces-application-insights-query.png)

The result in Application Insights shows the log message and level, the file path and code line number the log came from, and any custom dimensions included. In this image, the customDimensions dictionary shows the key/value pairs from the previous [code sample](#creating-the-custom-dimensions-dictionary).
## Additional helpful queries

This section contains helpful queries beyond the basic 'traces' query used earlier to verify that logs are being piped to your Application Insights instance.

Some of the queries below use 'severityLevel'. For more information on Application Insights severity levels, see this [reference](https://docs.microsoft.com/en-us/dotnet/api/microsoft.applicationinsights.datacontracts.severitylevel?view=azure-dotnet). These severity levels correspond to the level the Python log was originally sent with. For additional query information, see [Azure Monitor Log Queries](https://docs.microsoft.com/en-us/azure/azure-monitor/log-query/query-language).
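As an illustrative sketch, the Application Insights severityLevel values (Verbose=0 through Critical=4) line up with the Python logging levels as follows; this mapping is drawn from the documented severity levels, not from any SDK constant:

```python
import logging

# Illustrative mapping from Python logging levels to Application Insights
# severityLevel values.
SEVERITY_LEVEL = {
    logging.DEBUG: 0,     # Verbose
    logging.INFO: 1,      # Information
    logging.WARNING: 2,   # Warning
    logging.ERROR: 3,     # Error
    logging.CRITICAL: 4,  # Critical
}

print(SEVERITY_LEVEL[logging.ERROR])  # 3
```

So a query filtering on `severityLevel == 3` returns records logged with `logger.error`.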
| Use case | Query |
|------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------|
| Log results for a specific custom dimension, for example 'parent_run_id' | `traces`<br>`\| where customDimensions['parent_run_id'] == '931024c2-3720-11ea-b247-c49deda841c1'` |
| Log results for training runs over the last 7 days | `traces`<br>`\| where timestamp > ago(7d) and customDimensions['run_type'] == 'training'` |
| Log results with severityLevel Error from the last 7 days | `traces`<br>`\| where timestamp > ago(7d) and severityLevel == 3` |
| Count of log results with severityLevel Error over the last 7 days | `traces`<br>`\| where timestamp > ago(7d) and severityLevel == 3`<br>`\| summarize count()` |
