title: Azure Data Explorer output from Azure Stream Analytics (Preview)
description: This article describes using Azure Data Explorer as an output for Azure Stream Analytics.
author: enkrumah
ms.author: ebnkruma
ms.service: stream-analytics
ms.topic: conceptual
ms.date: 04/27/2022

Azure Data Explorer output from Azure Stream Analytics (Preview)

You can use Azure Data Explorer as an output for analyzing large volumes of diverse data from any data source, such as websites, applications, IoT devices, and more. Azure Data Explorer is a fast and highly scalable data exploration service for log and telemetry data. It helps you handle the many data streams emitted by modern software, so you can collect, store, and analyze data. This data is used for diagnostics, monitoring, reporting, machine learning, and additional analytics capabilities.

Azure Data Explorer supports several ingestion methods, including connectors to common services like Event Hubs, programmatic ingestion using SDKs, such as .NET and Python, and direct access to the engine for exploration purposes. Azure Data Explorer integrates with analytics and modeling services for additional analysis and visualization of data.

For more information about Azure Data Explorer, visit the What is Azure Data Explorer documentation.

To learn how to create an Azure Data Explorer cluster and database by using the Azure portal, see Quickstart: Create an Azure Data Explorer cluster and database.

Output configuration

The following table lists the property names and their descriptions for creating an Azure Data Explorer output:

| Property name | Description |
| --- | --- |
| Output alias | A friendly name used in queries to direct the query output to this database. |
| Subscription | Select the Azure subscription that you want to use for your cluster. |
| Cluster | Choose a unique name that identifies your cluster. The domain name [region].kusto.windows.net is appended to the cluster name you provide. The name can contain only lowercase letters and numbers. It must contain from 4 to 22 characters. |
| Database | The name of the database where you are sending your output. The database name must be unique within the cluster. |
| Authentication | A managed identity from Azure Active Directory allows your cluster to easily access other Azure AD-protected resources such as Azure Key Vault. The identity is managed by the Azure platform and doesn't require you to provision or rotate any secrets. Managed identity configuration is currently supported only to enable customer-managed keys for your cluster. |
| Table | The table name where the output is written. The table name is case-sensitive. The schema of this table should exactly match the number of fields and their types that your job output generates. |
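The schema requirement on the Table property can be checked before a job is started. The following is a minimal Python sketch, not part of any Azure SDK: `schema_mismatches` and the sample column lists are hypothetical, and the type labels are assumed to be already normalized to one shared vocabulary (mapping Stream Analytics type names to Azure Data Explorer type names is outside this sketch).

```python
from typing import List, Tuple

# A column is described as (name, type label). Both lists are ordered.
Column = Tuple[str, str]


def schema_mismatches(job_output: List[Column], table: List[Column]) -> List[str]:
    """Return human-readable differences between two ordered column lists.

    An empty list means the column count, names, types, and order all agree.
    """
    problems = []
    if len(job_output) != len(table):
        problems.append(
            f"column count differs: job has {len(job_output)}, table has {len(table)}"
        )
    # Compare position by position, because column order matters.
    for position, (out_col, tbl_col) in enumerate(zip(job_output, table)):
        if out_col[0] != tbl_col[0]:
            problems.append(f"position {position}: name {out_col[0]!r} != {tbl_col[0]!r}")
        if out_col[1] != tbl_col[1]:
            problems.append(f"position {position}: type {out_col[1]!r} != {tbl_col[1]!r}")
    return problems


# Hypothetical example data: the second list has the same columns in a different order.
job_columns = [("deviceId", "string"), ("temperature", "real")]
table_columns = [("temperature", "real"), ("deviceId", "string")]
for problem in schema_mismatches(job_columns, table_columns):
    print(problem)
```

Names are compared exactly, including case, which is an assumption of this sketch rather than a documented rule.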

Partitioning

Partitioning needs to be enabled and is based on the PARTITION BY clause in the query. When the Inherit Partitioning option is enabled, the output follows the input partitioning for fully parallelizable queries.
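As a conceptual illustration only, the Python sketch below shows what partitioning by a key means for the data: events that share a key value stay together, so each group can be processed and written independently of the others. It does not reproduce how Stream Analytics implements partitioning; `group_by_partition` and the sample events are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_partition(events: Iterable[dict], key: str) -> Dict[object, List[dict]]:
    """Group events by the value of `key`, preserving arrival order within each group."""
    groups = defaultdict(list)
    for event in events:
        groups[event[key]].append(event)
    return dict(groups)


# Hypothetical sample events carrying a partition identifier.
sample_events = [
    {"PartitionId": 0, "value": 1},
    {"PartitionId": 1, "value": 2},
    {"PartitionId": 0, "value": 3},
]
for partition, members in group_by_partition(sample_events, "PartitionId").items():
    print(partition, [m["value"] for m in members])
```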

When to use Azure Data Explorer and/or Azure Stream Analytics

Azure Stream Analytics:

  • Stream processing engine: continuous, streaming real-time analytics
  • Job-based
  • Lookback window of 1 ms to 7 days for in-memory temporal analytics and stream processing
  • Ingests from Event Hubs and IoT Hub with sub-second latency

Azure Data Explorer:

  • Analytical engine: on-demand, interactive real-time analytics
  • Streaming data ingestion into a persistent data store, along with querying capabilities
  • Ingests data from Event Hubs, IoT Hub, Blob, Data Lake, Kafka, Logstash, Spark, and Azure Data Factory (ADF)
  • Latency of 10 seconds to 5 minutes for high-throughput workloads
  • Simple data transformations can be done with an update policy during ingestion

You can significantly broaden the scope of real-time analytics by using Azure Stream Analytics (ASA) and Azure Data Explorer (ADX) together. A few example scenarios:

  • Stream Analytics identifies anomalies in real time, and Data Explorer helps determine how and why they occurred through interactive exploration
  • Stream Analytics deserializes incoming data streams for use in Data Explorer (for example, ingesting Protobuf or custom binary formats by using a custom deserializer)
  • Stream Analytics can aggregate, filter, enrich, and transform incoming data streams for use in Data Explorer

Limitations

  • The number of columns in the Azure Stream Analytics job query should match the Azure Data Explorer table, and the columns should be in the same order.
  • The column names and data types should match between the Azure Stream Analytics SQL query and the Azure Data Explorer table.
  • Azure Data Explorer has an aggregation (batching) policy for data ingestion, designed to optimize the ingestion process. By default, the policy is set to 5 minutes, 1,000 items, or 1 GB of data, so you may experience latency. See batching policy for aggregation options.
  • Test connection to Azure Data Explorer isn't supported for jobs running in a shared multi-tenant environment.
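To make the latency effect of the batching policy concrete, here is a minimal Python sketch of the default limits listed above. It assumes a batch is closed as soon as any one of the three limits is reached and that 1 GB means 1024^3 bytes; `BatchingLimits` and `batch_is_sealed` are hypothetical names for illustration, not an Azure Data Explorer API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchingLimits:
    """Default limits taken from the text: 5 minutes, 1,000 items, 1 GB."""
    max_seconds: float = 5 * 60
    max_items: int = 1000
    max_bytes: int = 1024 ** 3  # assumption: 1 GB interpreted as 1024^3 bytes


def batch_is_sealed(
    elapsed_seconds: float,
    item_count: int,
    size_bytes: int,
    limits: BatchingLimits = BatchingLimits(),
) -> bool:
    """Return True once any single limit has been reached (assumed behavior)."""
    return (
        elapsed_seconds >= limits.max_seconds
        or item_count >= limits.max_items
        or size_bytes >= limits.max_bytes
    )


# A slow trickle of small events reaches none of the limits after one minute...
print(batch_is_sealed(elapsed_seconds=60, item_count=12, size_bytes=4096))
# ...and is only closed once the time limit is reached.
print(batch_is_sealed(elapsed_seconds=300, item_count=60, size_bytes=20480))
```

Under these assumptions, a low-volume output stream waits for the time limit, which is where a delay of up to five minutes can come from, while a high-volume stream hits the item or size limit sooner.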

Next steps