
Commit 19f12bb

Add Ingest Event Grid

Committed Oct 31, 2021 (1 parent: 5efdf04)

29 files changed: +397 -19 lines
 
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
---
ms.topic: include
ms.date: 11/02/2021
author: shsagir
ms.author: shsagir
ms.service: synapse-analytics
ms.subservice: data-explorer
---
|**Property** | **Property description**|
|---|---|
| `rawSizeBytes` | Size of the raw (uncompressed) data. For Avro/ORC/Parquet, this is the size before format-specific compression is applied. Provide the original data size by setting this property to the uncompressed data size in bytes.|
| `kustoTable` | Name of the existing target table. Overrides the `Table` set on the `Data Connection` blade. |
| `kustoDataFormat` | Data format. Overrides the `Data format` set on the `Data Connection` blade. |
| `kustoIngestionMappingReference` | Name of the existing ingestion mapping to be used. Overrides the `Column mapping` set on the `Data Connection` blade.|
| `kustoIgnoreFirstRecord` | If set to `true`, Kusto ignores the first row of the blob. Use in tabular format data (CSV, TSV, or similar) to ignore headers. |
| `kustoExtentTags` | String representing [tags](/azure/data-explorer/kusto/management/extents-overview?context=/azure/synapse-analytics/context/context) that will be attached to the resulting extent. |
| `kustoCreationTime` | Overrides [$IngestionTime](/azure/data-explorer/kusto/query/ingestiontimefunction?context=/azure/synapse-analytics/context/context&pivots=azuredataexplorer) for the blob, formatted as an ISO 8601 string. Use for backfilling. |
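These properties map onto plain blob-metadata key/value pairs. A minimal Python sketch of building such a metadata set (the values are illustrative, and the storage upload call itself is not shown):

```python
from datetime import datetime, timezone

# Illustrative values only; the metadata keys come from the table above.
creation_time = datetime(2021, 10, 1, tzinfo=timezone.utc).isoformat()

metadata = {
    "rawSizeBytes": "4096",              # uncompressed size in bytes
    "kustoTable": "Events",              # overrides the Data Connection table
    "kustoDataFormat": "json",           # overrides the Data Connection format
    "kustoIgnoreFirstRecord": "true",    # skip a header row in tabular data
    "kustoCreationTime": creation_time,  # ISO 8601 string, used for backfill
}

print(metadata["kustoCreationTime"])  # 2021-10-01T00:00:00+00:00
```

Note that `kustoCreationTime` must be an ISO 8601 string; a timezone-aware `isoformat()` produces one directly.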

articles/synapse-analytics/data-explorer/ingest-data/data-explorer-ingest-data-streaming.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -11,7 +11,7 @@ ms.service: synapse-analytics
 ms.subservice: data-explorer
 ---
 
-# Configure streaming ingestion on your Azure Synapse Data Explorer pool
+# Configure streaming ingestion on your Azure Synapse Data Explorer pool (Preview)
 
 Streaming ingestion is useful for loading data when you need low latency between ingestion and query. Consider using streaming ingestion in the following scenarios:
```
articles/synapse-analytics/data-explorer/ingest-data/data-explorer-ingest-data-supported-formats.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -11,7 +11,7 @@ ms.service: synapse-analytics
 ms.subservice: data-explorer
 ---
 
-# Data formats supported by Azure Synapse Data Explorer for ingestion
+# Data formats supported by Azure Synapse Data Explorer for ingestion (Preview)
 
 Data ingestion is the process by which data is added to a table and is made available for query in Data Explorer. For all ingestion methods, other than ingest-from-query, the data must be in one of the supported formats. The following table lists and describes the formats that Data Explorer supports for data ingestion.
```

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
---
title: Event Grid data connection for Azure Synapse Data Explorer (Preview)
description: This article provides an overview of how to ingest (load) data into Azure Synapse Data Explorer from Event Grid.
ms.topic: how-to
ms.date: 11/02/2021
author: shsagir
ms.author: shsagir
ms.reviewer: tzgitlin
services: synapse-analytics
ms.service: synapse-analytics
ms.subservice: data-explorer
---

# Event Grid data connection (Preview)

Event Grid ingestion is a pipeline that listens to Azure Storage and updates Data Explorer to pull information when subscribed events occur. Data Explorer offers continuous ingestion from Azure Storage (Blob storage and ADLSv2) with an [Azure Event Grid](/azure/event-grid/overview) subscription for blob created or blob renamed notifications, streaming these notifications to Data Explorer via an Event Hub.

The Event Grid ingestion pipeline goes through several steps. You create a target table in Data Explorer into which the [data in a particular format](#data-format) will be ingested. Then you create an Event Grid data connection in Data Explorer. The Event Grid data connection needs to know [events routing](#events-routing) information, such as the table to send the data to and the table mapping. You also specify [ingestion properties](#ingestion-properties), which describe the data to be ingested, the target table, and the mapping. You can generate sample data and [upload blobs](#upload-blobs) or [rename blobs](#rename-blobs) to test your connection. [Delete blobs](#delete-blobs-using-storage-lifecycle) after ingestion. This process can be managed through the [Azure portal](data-explorer-ingest-event-grid-portal.md). <!-- , using [one-click ingestion](one-click-ingestion-new-table.md), programmatically with [C#](data-connection-event-grid-csharp.md) or [Python](data-connection-event-grid-python.md), or with the [Azure Resource Manager template](data-connection-event-grid-resource-manager.md). -->

<!-- For general information about data ingestion in Data Explorer, see [Data Explorer data ingestion overview](ingest-data-overview.md). -->

## Data format

- See [supported formats](data-explorer-ingest-data-supported-formats.md).
- See [supported compressions](data-explorer-ingest-data-supported-formats.md#supported-data-compression-formats).
- The original uncompressed data size should be part of the blob metadata, or else Data Explorer will estimate it. The ingestion uncompressed size limit per file is 4 GB.
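Because the uncompressed size matters both for the `rawSizeBytes` metadata property and for the 4 GB limit, it can help to measure it locally before upload. A minimal Python sketch using illustrative sample data:

```python
import gzip

# Sample payload standing in for real event data (illustrative).
payload = b'{"event": "sample"}\n' * 100

compressed = gzip.compress(payload)

# Measure the uncompressed size so it can be supplied as `rawSizeBytes`.
# For large files, stream-decompress in chunks rather than in one call.
raw_size = len(gzip.decompress(compressed))

print(raw_size)                    # 2000
assert raw_size <= 4 * 1024 ** 3   # stay under the 4 GB per-file limit
```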

> [!NOTE]
> Event Grid notification subscription can be set on Azure Storage accounts for `BlobStorage`, `StorageV2`, or [Data Lake Storage Gen2](/azure/storage/blobs/data-lake-storage-introduction).

## Ingestion properties

You can specify [ingestion properties](data-explorer-ingest-data-properties.md) of the blob ingestion via the blob metadata.
You can set the following properties:

[!INCLUDE [ingestion-properties-event-grid](../includes/data-explorer-event-grid-ingestion-properties.md)]

## Events routing

When setting up a blob storage connection to a Data Explorer cluster, specify the target table properties:

- table name
- data format
- mapping

This setup is the default routing for your data, sometimes referred to as *static routing*.
You can also specify target table properties for each blob, using blob metadata. The data will dynamically route, as specified by the [ingestion properties](#ingestion-properties).

The following example shows how to set ingestion properties on the blob metadata before uploading it. Blobs are routed to different tables.

For more information, see [upload blobs](#upload-blobs).

```csharp
// Blob is dynamically routed to table `Events`, ingested using `EventsMapping` data mapping
blob = container.GetBlockBlobReference(blobName2);
blob.Metadata.Add("rawSizeBytes", "4096"); // the uncompressed size is 4096 bytes
blob.Metadata.Add("kustoTable", "Events");
blob.Metadata.Add("kustoDataFormat", "json");
blob.Metadata.Add("kustoIngestionMappingReference", "EventsMapping");
blob.UploadFromFile(jsonCompressedLocalFileName);
```

## Upload blobs

You can create a blob from a local file, set ingestion properties to the blob metadata, and upload it. For examples, see [Ingest blobs into Data Explorer by subscribing to Event Grid notifications](data-explorer-ingest-event-grid-portal.md#generate-sample-data).

> [!NOTE]
> - Use `BlockBlob` to generate data. `AppendBlob` is not supported.
> - Using the Azure Data Lake Gen2 storage SDK requires using `CreateFile` for uploading files and `Flush` at the end with the close parameter set to `true`.
<!-- > For a detailed example of Data Lake Gen2 SDK correct usage, see [upload file using Azure Data Lake SDK](data-connection-event-grid-csharp.md#upload-file-using-azure-data-lake-sdk). -->
> - When the Event Hub endpoint doesn't acknowledge receipt of an event, Azure Event Grid activates a retry mechanism. If this retry delivery fails, Event Grid can deliver the undelivered events to a storage account using a process of *dead-lettering*. For more information, see [Event Grid message delivery and retry](/azure/event-grid/delivery-and-retry#retry-schedule-and-duration).

## Rename blobs

When using ADLSv2, you can rename a blob to trigger blob ingestion to Data Explorer. For an example, see [Ingest blobs into Data Explorer by subscribing to Event Grid notifications](data-explorer-ingest-event-grid-portal.md#generate-sample-data).

> [!NOTE]
> - Directory renaming is possible in ADLSv2, but it doesn't trigger *blob renamed* events or ingestion of blobs inside the directory. To ingest blobs following renaming, directly rename the desired blobs.
> - If you defined filters to track specific subjects while [creating the data connection](data-explorer-ingest-event-grid-portal.md#create-an-event-grid-data-connection)<!-- or while creating [Event Grid resources manually](ingest-data-event-grid-manual.md#create-an-event-grid-subscription) -->, these filters are applied on the destination file path.

## Delete blobs using storage lifecycle

Data Explorer won't delete the blobs after ingestion. Use [Azure Blob storage lifecycle](/azure/storage/blobs/storage-lifecycle-management-concepts?tabs=azure-portal) to manage blob deletion. It's recommended to keep the blobs for three to five days.
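Blob deletion of this kind can be automated with a lifecycle management rule on the storage account. The following is a minimal policy sketch assuming the standard Azure Storage lifecycle management schema (the rule name and the five-day window are illustrative):

```json
{
  "rules": [
    {
      "name": "delete-ingested-blobs",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "delete": { "daysAfterModificationGreaterThan": 5 }
          }
        }
      }
    }
  ]
}
```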

## Known Event Grid issues

- When using Data Explorer to [export](/azure/data-explorer/kusto/management/data-export/export-data-to-storage?context=/azure/synapse-analytics/context/context) the files used for Event Grid ingestion, note:
  - Event Grid notifications aren't triggered if the connection string provided to the export command or to an [external table](/azure/data-explorer/kusto/management/data-export/export-data-to-an-external-table?context=/azure/synapse-analytics/context/context) is a connection string in [ADLS Gen2 format](/azure/data-explorer/kusto/api/connection-strings/storage?context=/azure/synapse-analytics/context/context#azure-data-lake-storage-gen2) (for example, `abfss://filesystem@accountname.dfs.core.windows.net`) but the storage account isn't enabled for hierarchical namespace.
  - If the account isn't enabled for hierarchical namespace, the connection string must use the [Blob Storage](/azure/data-explorer/kusto/api/connection-strings/storage?context=/azure/synapse-analytics/context/context#azure-blob-storage) format (for example, `https://accountname.blob.core.windows.net`). The export works as expected even when using the ADLS Gen2 connection string, but notifications won't be triggered and Event Grid ingestion won't work.
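For accounts without hierarchical namespace, one workaround is to rewrite an ADLS Gen2-style URI into the Blob Storage format before handing it to the export command. The helper below is hypothetical (not part of any Azure SDK) and is shown in Python for brevity:

```python
# Hypothetical helper (not part of any Azure SDK): rewrite an ADLS Gen2-style
# URI into the Blob Storage URL format, which does trigger Event Grid
# notifications on accounts without hierarchical namespace enabled.
def to_blob_storage_url(abfss_url: str) -> str:
    prefix = "abfss://"
    if not abfss_url.startswith(prefix):
        raise ValueError("expected an abfss:// connection string")
    # Layout: abfss://<filesystem>@<account>.dfs.core.windows.net
    filesystem, host = abfss_url[len(prefix):].split("@", 1)
    account = host.split(".", 1)[0]
    return f"https://{account}.blob.core.windows.net/{filesystem}"

print(to_blob_storage_url("abfss://filesystem@accountname.dfs.core.windows.net"))
# https://accountname.blob.core.windows.net/filesystem
```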

## Next steps

- [Ingest blobs into Data Explorer by subscribing to Event Grid notifications](data-explorer-ingest-event-grid-portal.md)
<!-- - [Create an Event Grid data connection for Data Explorer by using C#](data-connection-event-grid-csharp.md)
- [Create an Event Grid data connection for Data Explorer by using Python](data-connection-event-grid-python.md)
- [Create an Event Grid data connection for Data Explorer by using an Azure Resource Manager template](data-connection-event-grid-resource-manager.md)
- [Use one-click ingestion to ingest CSV data from a container to a new table in Data Explorer](one-click-ingestion-new-table.md) -->
