---
title: Common mapping data flow errors and messages
description: Learn how to troubleshoot common error codes and messages for mapping data flows in Azure Data Factory.
ms.author: makromer
author: kromerm
ms.reviewer: daperlov
ms.service: data-factory
ms.subservice: data-flows
ms.topic: troubleshooting
ms.date: 05/12/2022
---
This article lists common error codes and messages reported by mapping data flows in Azure Data Factory, along with their associated causes and recommendations.
- Message: Data preview, debug, and pipeline data flow execution failed because container does not exist
- Cause: A dataset contains a container that doesn't exist in storage.
- Recommendation: Make sure that the container referenced in your dataset exists and can be accessed.
- Message: JSON parsing error, unsupported encoding or multiline
- Cause: Possible problems with the JSON file: unsupported encoding, corrupt bytes, or using JSON source as a single document on many nested lines.
- Recommendation: Verify that the JSON file's encoding is supported. On the source transformation that's using a JSON dataset, expand JSON Settings and turn on Single Document.
- Message: Broadcast join timeout error, make sure broadcast stream produces data within 60 secs in debug runs and 300 secs in job runs
- Cause: Broadcast has a default timeout of 60 seconds on debug runs and 300 seconds on job runs. The stream chosen for broadcast is too large to produce data within this limit.
- Recommendation: Check the Optimize tab on your data flow transformations for join, exists, and lookup. The default option for broadcast is Auto. If Auto is set, or if you're manually setting the left or right side to broadcast under Fixed, you can either set a larger Azure integration runtime (IR) configuration or turn off broadcast. For the best performance in data flows, we recommend that you allow Spark to broadcast by using Auto and use a memory-optimized Azure IR.

  If you're running the data flow in a debug test execution from a debug pipeline run, you might run into this condition more frequently. That's because Azure Data Factory throttles the broadcast timeout to 60 seconds to maintain a faster debugging experience. You can extend the timeout to the 300-second timeout of a triggered run. To do so, you can use the Debug > Use Activity Runtime option to use the Azure IR defined in your Execute Data Flow pipeline activity.
- Message: Broadcast join timeout error, you can choose 'Off' of broadcast option in join/exists/lookup transformation to avoid this issue. If you intend to broadcast join option to improve performance, then make sure broadcast stream can produce data within 60 secs in debug runs and 300 secs in job runs.
- Cause: Broadcast has a default timeout of 60 seconds in debug runs and 300 seconds in job runs. On the broadcast join, the stream chosen for broadcast is too large to produce data within this limit. If a broadcast join isn't used, the default broadcast by dataflow can reach the same limit.
- Recommendation: Turn off the broadcast option or avoid broadcasting large data streams for which the processing can take more than 60 seconds. Choose a smaller stream to broadcast. Large Azure SQL Data Warehouse tables and source files aren't typically good choices. In the absence of a broadcast join, use a larger cluster if this error occurs.
- Message: Converting to a date or time failed due to an invalid character
- Cause: Data isn't in the expected format.
- Recommendation: Use the correct data type, for example by converting the value with an explicit format in a derived column transformation, as sketched below.
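  For example, if a source column stores dates as text, a derived column transformation can perform the conversion with an explicit format instead of relying on an implicit cast. The following data flow expressions are a minimal sketch; the column names `orderDateText` and `orderTimeText` and the format strings are assumptions, so adjust them to match your data.

  ```
  orderDate = toDate(orderDateText, 'yyyy-MM-dd')
  orderTime = toTimestamp(orderTimeText, 'yyyy-MM-dd HH:mm:ss')
  ```

  Values that don't match the format typically come through as NULL rather than failing the conversion, so you can filter or route them before they reach a date-typed column.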
- Message: Column name needs to be specified in the query, set an alias if using a SQL function
- Cause: No column name is specified.
- Recommendation: Set an alias if you're using a SQL function like min() or max().
- Message: INT96 is legacy timestamp type, which is not supported by ADF Dataflow. Please consider upgrading the column type to the latest types.
- Cause: Driver error.
- Recommendation: INT96 is a legacy timestamp type that's not supported by Azure Data Factory data flow. Consider upgrading the column type to the latest type.
- Message: The uncommitted block count cannot exceed the maximum limit of 100,000 blocks. Check blob configuration.
- Cause: The maximum number of uncommitted blocks in a blob is 100,000.
- Recommendation: Contact the Microsoft product team for more details about this problem.
- Message: The specified source path has either multiple partitioned directories (for example, <Source Path>/<Partition Root Directory 1>/a=10/b=20, <Source Path>/<Partition Root Directory 2>/c=10/d=30) or partitioned directory with other file or non-partitioned directory (for example <Source Path>/<Partition Root Directory 1>/a=10/b=20, <Source Path>/Directory 2/file1), remove partition root directory from source path and read it through separate source transformation.
- Cause: The source path has either multiple partitioned directories or a partitioned directory that has another file or non-partitioned directory.
- Recommendation: Remove the partitioned root directory from the source path and read it through a separate source transformation.
- Message: Please make sure that the type of parameter matches with type of value passed in. Passing float parameters from pipelines isn't currently supported.
- Cause: Data types are incompatible between the declared type and the actual parameter value.
- Recommendation: Check that the parameter values passed into the data flow match the declared type.
- Message: Expression cannot be parsed.
- Cause: An expression generated parsing errors because of incorrect formatting.
- Recommendation: Check the formatting in the expression.
- Message: Implicit cartesian product for INNER join is not supported, use CROSS JOIN instead. Columns used in join should create a unique key for rows.
- Cause: Implicit cartesian products for INNER joins between logical plans aren't supported. The columns used in the join must form a unique key for rows.
- Recommendation: For non-equality based joins, use CROSS JOIN, as sketched below.
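  In the join transformation, the Custom (cross) join option lets you supply a boolean data flow expression as the join condition instead of an equality. A minimal sketch, assuming hypothetical columns `orderTotal` on the left stream and `creditLimit` on the right stream:

  ```
  orderTotal > creditLimit
  ```

  Because every left row is compared against every right row, keep at least one stream small or filter it first to avoid an expensive cross product.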
- Message: During Data Flow debug and data preview: GetCommand OutputAsync failed with ...
- Cause: This error is a back-end service error.
- Recommendation: Retry the operation and restart your debugging session. If retrying and restarting doesn't resolve the problem, contact customer support.
- Message: Cluster ran into out of memory issue during execution, please retry using an integration runtime with bigger core count and/or memory optimized compute type
- Cause: The cluster is running out of memory.
- Recommendation: Debug clusters are meant for development. Use data sampling and an appropriate compute type and size to run the payload. For performance tips, see Mapping data flow performance guide.
- Message: Please make sure that the access key in your Linked Service is correct
- Cause: The account name or access key is incorrect.
- Recommendation: Ensure that the account name or access key specified in your linked service is correct.
- Message: Column name used in expression is unavailable or invalid.
- Cause: An invalid or unavailable column name is used in an expression.
- Recommendation: Check the column names used in expressions.
- Message: Internal server error
- Cause: The cluster is running out of disk space.
- Recommendation: Retry the pipeline. If doing so doesn't resolve the problem, contact customer support.
- Message: The store configuration is not defined. This error is potentially caused by invalid parameter assignment in the pipeline.
- Cause: Invalid store configuration is provided.
- Recommendation: Check the parameter value assignment in the pipeline. A parameter expression may contain invalid characters.
- Message: The pipeline expression cannot be evaluated.
- Cause: The pipeline expression passed in the Data Flow activity isn't being processed correctly because of a syntax error.
- Recommendation: Check the data flow activity name and the expressions in activity monitoring. For example, a data flow activity name can't contain a space or a hyphen.
- Message: The activity was running on Azure Integration Runtime and failed to decrypt the credential of data store or compute connected via a Self-hosted Integration Runtime. Please check the configuration of linked services associated with this activity, and make sure to use the proper integration runtime type.
- Cause: Data flow doesn't support linked services on self-hosted integration runtimes.
- Recommendation: Configure data flow to run on a Managed Virtual Network integration runtime.
- Message: Invalid xml validation mode is provided.
- Cause: An invalid XML validation mode is provided.
- Recommendation: Check the parameter value and specify the right validation mode.
- Message: The field for corrupt records must be string type and nullable.
- Cause: An invalid data type is provided for the `_corrupt_record` column in the XML source.
- Recommendation: Make sure that the `_corrupt_record` column in the XML source has a string data type and is nullable.
- Message: Malformed xml with path in FAILFAST mode.
- Cause: Malformed XML exists at the specified path while the validation mode is FAILFAST.
- Recommendation: Update the content of the XML file to the right format.
- Message: Reference resource in xml data file cannot be resolved.
- Cause: The reference resource in the XML data file can't be resolved.
- Recommendation: Check the reference resource in the XML data file.
- Message: Schema validation failed.
- Cause: An invalid schema is provided for the XML source.
- Recommendation: Check the schema settings on the XML source to make sure that it's a subset of the source data's schema.
- Message: External reference resource in xml data file is not supported.
- Cause: The external reference resource in the XML data file is not supported.
- Recommendation: Update the XML file content to remove the external reference resource, which isn't currently supported.
- Message: Either one of account key or tenant/spnId/spnCredential/spnCredentialType or miServiceUri/miServiceToken should be specified.
- Cause: An invalid credential is provided in the ADLS Gen2 linked service.
- Recommendation: Update the ADLS Gen2 linked service to have the right credential configuration.
- Message: Only one of the three auth methods (Key, ServicePrincipal and MI) can be specified.
- Cause: An invalid authentication method is provided in the ADLS Gen2 linked service.
- Recommendation: Update the ADLS Gen2 linked service to use one of the three authentication methods: Key, ServicePrincipal, or MI.
- Message: Service principal credential type is invalid.
- Cause: The service principal credential type is invalid.
- Recommendation: Please update the ADLS Gen2 linked service to set the right service principal credential type.
- Message: Either one of account key or sas token should be specified.
- Cause: An invalid credential is provided in the Azure Blob linked service.
- Recommendation: Use either account key or SAS token for the Azure Blob linked service.
- Message: Only one of the two auth methods (Key, SAS) can be specified.
- Cause: An invalid authentication method is provided in the linked service.
- Recommendation: Use key or SAS authentication for the Azure Blob linked service.
- Message: Partition key path should be specified for update and delete operations.
- Cause: The partition key path is missing in the Azure Cosmos DB sink.
- Recommendation: Provide the partition key in the Azure Cosmos DB sink settings.
- Message: Partition key path cannot be empty for update and delete operations.
- Cause: The partition key path is empty for update and delete operations.
- Recommendation: Provide the partition key in the Azure Cosmos DB sink settings.
- Message: Partition key is not mapped in sink for delete and update operations.
- Cause: An invalid partition key is provided.
- Recommendation: In the Azure Cosmos DB sink settings, use a partition key that is the same as your container's partition key.
- Message: 'id' property should be mapped for delete and update operations.
- Cause: The `id` property is missing for update and delete operations.
- Recommendation: Make sure that the input data has an `id` column in the Azure Cosmos DB sink transformation settings. If it doesn't, use a select or derived column transformation to generate this column before the sink transformation, as sketched below.
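  If the incoming data has no `id` column, a derived column transformation placed before the sink can generate one. This is a minimal sketch in the data flow expression language; deriving `id` from a hypothetical business key `orderId` is an assumption, and you could equally generate a surrogate value with `uuid()`.

  ```
  id = toString(orderId)
  ```

  Azure Cosmos DB expects `id` to be a string, so cast numeric keys with `toString()` before the sink.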
- Message: partition key should start with /.
- Cause: An invalid partition key is provided.
- Recommendation: Ensure that the partition key starts with `/` in the Azure Cosmos DB sink settings, for example: `/movieId`.
- Message: Invalid connection mode.
- Cause: An invalid connection mode is provided.
- Recommendation: Confirm that the connection mode in the Azure Cosmos DB settings is one of the supported modes: Gateway or DirectHttps.
- Message: Either accountName or accountEndpoint should be specified.
- Cause: Invalid account information is provided.
- Recommendation: In the Cosmos DB linked service, specify the account name or account endpoint.
- Message: GitHub store does not allow writes.
- Cause: The GitHub store is read only.
- Recommendation: The store entity definition is located elsewhere; the GitHub store can't be used as a write target.
- Message: User/password should be specified.
- Cause: The user name or password is missing.
- Recommendation: Make sure that you have the right credential settings in the related PostgreSQL linked service.
- Message: Only blob storage type can be used as stage in snowflake read/write operation.
- Cause: An invalid staging configuration is provided for Snowflake.
- Recommendation: Update the Snowflake staging settings to ensure that only an Azure Blob Storage linked service is used.
- Message: Snowflake stage properties should be specified with Azure Blob + SAS authentication.
- Cause: An invalid staging configuration is provided for Snowflake.
- Recommendation: Ensure that only Azure Blob + SAS authentication is specified in the Snowflake staging settings.
- Message: The spark type is not supported in snowflake.
- Cause: An unsupported data type is provided for Snowflake.
- Recommendation: Use a derived column transformation before the Snowflake sink to convert the related input columns to the string type, as sketched below.
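  A minimal sketch of such a derived column expression, assuming a hypothetical column `payload` whose Spark type isn't supported by Snowflake:

  ```
  payload = toString(payload)
  ```

  The derived column overwrites the existing column with its string representation so that the Snowflake sink receives a supported type.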
- Message: Blob storage staging properties should be specified.
- Cause: An invalid staging configuration is provided for Hive.
- Recommendation: Check that the account name, account key, and container are set properly in the related Azure Blob Storage linked service that's used as staging.
- Message: ADLS Gen2 storage staging only support service principal key credential.
- Cause: An invalid staging configuration is provided for Hive.
- Recommendation: Update the related ADLS Gen2 linked service that's used as staging. Currently, only the service principal key credential is supported.
- Message: ADLS Gen2 storage staging properties should be specified. Either one of key or tenant/spnId/spnKey or miServiceUri/miServiceToken is required.
- Cause: An invalid staging configuration is provided for Hive.
- Recommendation: Update the related ADLS Gen2 linked service that's used as staging in Hive with the right credentials.
- Message: Unsupported Column(s).
- Cause: Unsupported Column(s) are provided.
- Recommendation: Update the input data columns to match the data types supported by Hive.
- Message: Storage type can either be blob or gen2.
- Cause: Only Azure Blob or ADLS Gen2 storage type is supported.
- Recommendation: Choose the right storage type from Azure Blob or ADLS Gen2.
- Message: Either one of empty lines or custom header should be specified.
- Cause: An invalid delimited configuration is provided.
- Recommendation: Please update the CSV settings to specify one of empty lines or the custom header.
- Message: Column delimiter is required for parse.
- Cause: The column delimiter is missing.
- Recommendation: In your CSV settings, confirm that the column delimiter, which is required for parsing, is specified.
- Message: Either one of user/pwd or tenant/spnId/spnKey or miServiceUri/miServiceToken should be specified.
- Cause: An invalid credential is provided in the MSSQL linked service.
- Recommendation: Update the related MSSQL linked service with the right credentials, specifying one of user/pwd, tenant/spnId/spnKey, or miServiceUri/miServiceToken.
- Message: Unsupported field(s).
- Cause: Unsupported field(s) are provided.
- Recommendation: Modify the input data column to match the data type supported by MSSQL.
- Message: Only one of the three auth methods (Key, ServicePrincipal and MI) can be specified.
- Cause: An invalid authentication method is provided in the MSSQL linked service.
- Recommendation: You can only specify one of the three authentication methods (Key, ServicePrincipal and MI) in the related MSSQL linked service.
- Message: Cloud type is invalid.
- Cause: An invalid cloud type is provided.
- Recommendation: Check your cloud type in the related MSSQL linked service.
- Message: Blob storage staging properties should be specified.
- Cause: Invalid blob storage staging settings are provided
- Recommendation: Please check if the Blob linked service used for staging has correct properties.
- Message: Storage type can either be blob or gen2.
- Cause: An invalid storage type is provided for staging.
- Recommendation: Check the storage type of the linked service used for staging and make sure that it's Blob or Gen2.
- Message: ADLS Gen2 storage staging only support service principal key credential.
- Cause: An invalid credential is provided for the ADLS gen2 storage staging.
- Recommendation: Use the service principal key credential of the Gen2 linked service used for staging.
- Message: ADLS Gen2 storage staging properties should be specified. Either one of key or tenant/spnId/spnCredential/spnCredentialType or miServiceUri/miServiceToken is required.
- Cause: Invalid ADLS Gen2 staging properties are provided.
- Recommendation: Please update ADLS Gen2 storage staging settings to have one of key or tenant/spnId/spnCredential/spnCredentialType or miServiceUri/miServiceToken.
- Message: Timestamp and version can't be set at the same time.
- Cause: The timestamp and version can't be set at the same time.
- Recommendation: Set the timestamp or version in the delta settings.
- Message: Key column(s) should be specified for non-insertable operations.
- Cause: Key column(s) are missing for non-insertable operations.
- Recommendation: Specify key column(s) on the delta sink for non-insertable operations.
- Message: Recreate and truncate options can't be both specified.
- Cause: Recreate and truncate options can't be specified simultaneously.
- Recommendation: Update delta settings to have either recreate or truncate operation.
- Message: Excel sheet name or index is required.
- Cause: An invalid Excel worksheet configuration is provided.
- Recommendation: Check the parameter value and specify the sheet name or index to read the Excel data.
- Message: Excel sheet name and index cannot exist at the same time.
- Cause: The Excel sheet name and index are provided at the same time.
- Recommendation: Check the parameter value and specify the sheet name or index to read the Excel data.
- Message: Invalid range is provided.
- Cause: An invalid range is provided.
- Recommendation: Check the parameter value and specify the valid range by the following reference: Excel format in Azure Data Factory-Dataset properties.
- Message: Excel worksheet does not exist.
- Cause: An invalid worksheet name or index is provided.
- Recommendation: Check the parameter value and specify a valid sheet name or index to read the Excel data.
- Message: Read excel files with different schema is not supported now.
- Cause: Reading Excel files with different schemas isn't currently supported.
- Recommendation: Apply one of the following options to solve this problem:
  - Use a ForEach + data flow activity to read Excel worksheets one by one.
  - Update each worksheet schema manually to have the same columns before reading data.
- Message: Data type is not supported.
- Cause: The data type is not supported.
- Recommendation: Please change the data type to 'string' for related input data columns.
- Message: Invalid excel file is provided while only .xlsx and .xls are supported.
- Cause: Invalid Excel files are provided.
- Recommendation: Use a wildcard to filter and select only `.xls` and `.xlsx` Excel files before reading the data.
- Message: Explicitly broadcasted dataset using left/right option should be small enough to fit in node's memory. You can choose broadcast option 'Off' in join/exists/lookup transformation to avoid this issue or use an integration runtime with higher memory.
- Cause: The size of the broadcasted table far exceeds the limits of the node memory.
- Recommendation: The broadcast left/right option should be used only for a dataset that's small enough to fit into the node's memory. Make sure to configure the node size appropriately or turn off the broadcast option.
- Message: The TCP/IP connection to the host has failed. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.
- Cause: The SQL database's firewall setting blocks the data flow from accessing it.
- Recommendation: Please check the firewall setting for your SQL database, and allow Azure services and resources to access this server.
- Message: Transferring unroll memory to storage memory failed. Cluster ran out of memory during execution. Please retry using an integration runtime with more cores and/or memory optimized compute type.
- Cause: The cluster has insufficient memory.
- Recommendation: Please use an integration runtime with more cores and/or the memory optimized compute type.
- Message: Failed to delete data from cosmos after 3 times retry.
- Cause: The throughput on the Azure Cosmos DB collection is low and leads to throttling, or the row data doesn't exist in Azure Cosmos DB.
- Recommendation: Take the following actions to solve this problem:
  - If the error is 404, make sure that the related row data exists in the Azure Cosmos DB collection.
  - If the error is throttling, increase the Azure Cosmos DB collection throughput or set it to automatic scale.
  - If the error is a request timeout, set 'Batch size' in the Azure Cosmos DB sink to a smaller value, for example 1000.
- Cause: Error/invalid rows are found when writing to the Azure Synapse Analytics sink.
- Recommendation: Please find the error rows in the rejected data storage location if it is configured.
- Message: Exception is happened while writing error rows to storage.
- Cause: An exception happened while writing error rows to the storage.
- Recommendation: Please check your rejected data linked service configuration.
- Message: Field in struct does not exist.
- Cause: Invalid or unavailable field names are used in expressions.
- Recommendation: Check field names used in expressions.
- Message: XML Element has sub elements or attributes which can't be converted.
- Cause: The XML element has sub elements or attributes which can't be converted.
- Recommendation: Update the XML file so that the XML element has the right sub-elements or attributes.
- Message: Cloud type is invalid.
- Cause: An invalid cloud type is provided.
- Recommendation: Check the cloud type in your related ADLS Gen2 linked service.
- Message: Cloud type is invalid.
- Cause: An invalid cloud type is provided.
- Recommendation: Please check the cloud type in your related Azure Blob linked service.
- Message: Cosmos DB throughput scale operation cannot be performed because another scale operation is in progress, please retry after sometime.
- Cause: The throughput scale operation of the Azure Cosmos DB can't be performed because another scale operation is in progress.
- Recommendation: Sign in to your Azure Cosmos DB account and manually change the container throughput to autoscale, or add a custom activity after the mapping data flow to reset the throughput.
- Message: Path does not resolve to any file(s). Please make sure the file/folder exists and is not hidden.
- Cause: An invalid file/folder path is provided, which can't be found or accessed.
- Recommendation: Check the file/folder path, and make sure it exists and can be accessed in your storage.
- Message: File names cannot have empty value(s) while file name option is set as per partition.
- Cause: Invalid partition file names are provided.
- Recommendation: Check your sink settings to make sure that the file name values aren't empty.
- Message: The result has 0 output columns. Please ensure at least one column is mapped.
- Cause: No column is mapped.
- Recommendation: Please check the sink schema to ensure that at least one column is mapped.
- Message: The column in source configuration cannot be found in source data's schema.
- Cause: Invalid columns are provided on the source.
- Recommendation: Check the columns in the source configuration and make sure that they're a subset of the source data's schema.
- Message: Custom resource can only have one Key/Id mapped to filter.
- Cause: Invalid configurations are provided.
- Recommendation: In your AdobeIntegration settings, make sure that the custom resource can only have one Key/Id mapped to filter.
- Message: Only single partition is supported. Partition schema may be RoundRobin or Hash.
- Cause: Invalid partition configurations are provided.
- Recommendation: In AdobeIntegration settings, confirm that only the single partition is set and partition schemas may be RoundRobin or Hash.
- Message: Key must be specified for non-insertable operations.
- Cause: Key columns are missing.
- Recommendation: Update AdobeIntegration settings to ensure key columns are specified for non-insertable operations.
- Message: Partition type has to be roundRobin.
- Cause: Invalid partition types are provided.
- Recommendation: Update AdobeIntegration settings to make your partition type RoundRobin.
- Message: Only privacy regulation that's currently supported is 'GDPR'.
- Cause: Invalid privacy configurations are provided.
- Recommendation: Update AdobeIntegration settings; only the 'GDPR' privacy regulation is supported.
- Message: Job aborted due to stage failure. Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues.
- Cause: The data flow activity run failed because of transient network issues or because one node in the Spark cluster ran out of memory.
- Recommendation: Use the following options to solve this problem:
  - Option 1: Use a powerful cluster (both driver and executor nodes have enough memory to handle big data) to run data flow pipelines, with the "Compute type" setting set to "Memory optimized". The settings are shown in the image below.

    :::image type="content" source="media/data-flow-troubleshoot-guide/configure-compute-type.png" alt-text="Screenshot that shows the configuration of Compute type.":::

  - Option 2: Use a larger cluster size (for example, 48 cores) to run your data flow pipelines. You can learn more about cluster size in this document: Cluster size.

  - Option 3: Repartition your input data. For the task running on the data flow Spark cluster, one partition is one task and runs on one node. If data in one partition is too large, the related task running on the node needs to consume more memory than the node itself has, which causes failure. Use repartitioning to avoid data skew and to ensure that the data size in each partition is even while the memory consumption isn't too heavy.

    :::image type="content" source="media/data-flow-troubleshoot-guide/configure-partition.png" alt-text="Screenshot that shows the configuration of partitions.":::

    > [!NOTE]
    > You need to evaluate the data size or the partition number of the input data, and then set a reasonable partition number under "Optimize". For example, suppose the cluster that you use in the data flow pipeline execution has 8 cores and the memory of each core is 20 GB, but the input data is 1000 GB with 10 partitions. If you run the data flow directly, it hits an out-of-memory issue because 1000 GB / 10 > 20 GB, so it's better to set the repartition number to 100 (1000 GB / 100 < 20 GB).

  - Option 4: Tune and optimize source/sink/transformation settings. For example, try to copy all files in one container, and don't use the wildcard pattern. For more detailed information, see Mapping data flows performance and tuning guide.
- Cause: Error/Invalid rows were found while writing to Azure SQL Database sink.
- Recommendation: Please find the error rows in the rejected data storage location if configured.
- Message: Exception is happened while writing error rows to storage.
- Cause: An exception happened while writing error rows to the storage.
- Recommendation: Check your rejected data linked service configuration.
- Message: Database type is not supported.
- Cause: The database type is not supported.
- Recommendation: Check the database type and change it to the proper one.
- Message: Format is not supported.
- Cause: The format is not supported.
- Recommendation: Check the format and change it to the proper one.
- Message: The table/database name is not a valid name for tables/databases. Valid names only contain alphabet characters, numbers and _.
- Cause: The table/database name is not valid.
- Recommendation: Choose a valid name for the table/database. Valid names contain only alphabet characters, numbers, and `_`.
- Cause: The operation is not supported.
- Recommendation: Change the update method configuration, because delete, update, and upsert aren't supported in Workspace DB.
- Cause: The database does not exist.
- Recommendation: Check if the database exists.
- Message: Use 'Stored procedure' as Source is not supported for serverless (on-demand) pool.
- Cause: The serverless pool has limitations.
- Recommendation: Retry using 'query' as the source, or save the stored procedure as a view and then use 'table' as the source to read from the view directly.
- Message: Dataflow execution failed during broadcast exchange. Potential causes include misconfigured connections at sources or a broadcast join timeout error. To ensure the sources are configured correctly, please test the connection or run a source data preview in a Dataflow debug session. To avoid the broadcast join timeout, you can choose the 'Off' broadcast option in the Join/Exists/Lookup transformations. If you intend to use the broadcast option to improve performance then make sure broadcast streams can produce data within 60 secs for debug runs and within 300 secs for job runs. If problem persists, contact customer support.
- Cause:
  - A source connection/configuration error could lead to a broadcast failure in join/exists/lookup transformations.
  - Broadcast has a default timeout of 60 seconds in debug runs and 300 seconds in job runs. On the broadcast join, the stream chosen for the broadcast seems too large to produce data within this limit. If a broadcast join is not used, the default broadcast done by a data flow can reach the same limit.
- Recommendation:
  - Do a data preview at sources to confirm that the sources are well configured.
  - Turn off the broadcast option or avoid broadcasting large data streams where the processing can take more than 60 seconds. Instead, choose a smaller stream to broadcast.
  - Large SQL/Data Warehouse tables and source files are typically bad candidates.
  - In the absence of a broadcast join, use a larger cluster if the error occurs.
  - If the problem persists, contact customer support.
- Message: Short data type is not supported in Cosmos DB.
- Cause: The short data type is not supported in the Azure Cosmos DB.
- Recommendation: Add a derived column transformation to convert related columns from short to integer before using them in the Azure Cosmos DB sink transformation.
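  A minimal sketch of the derived column expression, assuming a hypothetical short-typed column `quantity`:

  ```
  quantity = toInteger(quantity)
  ```

  The derived column overwrites the existing column with an integer value that the Azure Cosmos DB sink can accept.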
- Message: This endpoint does not support BlobStorageEvents, SoftDelete or AutomaticSnapshot. Please disable these account features if you would like to use this endpoint.
- Cause: Azure Blob Storage events, soft delete or automatic snapshot is not supported in data flows if the Azure Blob Storage linked service is created with service principal or managed identity authentication.
- Recommendation: Disable Azure Blob Storage events, soft delete or automatic snapshot feature on the Azure Blob account, or use key authentication to create the linked service.
- Message: The input authorization token can't serve the request. Please check that the expected payload is built as per the protocol, and check the key being used.
- Cause: There isn't enough permission to read/write Azure Cosmos DB data.
- Recommendation: Please use the read-write key to access Azure Cosmos DB.
- Message: Resource not found.
- Cause: Invalid configuration is provided (for example, the partition key with invalid characters) or the resource doesn't exist.
- Recommendation: To solve this issue, refer to Diagnose and troubleshoot Azure Cosmos DB not found exceptions.
- Message: Expression type does not match column data type, expecting VARIANT but got VARCHAR.
- Cause: The type of the input data column(s) is string, which is different from the VARIANT type of the related column(s) in the Snowflake sink transformation.
- Recommendation: Snowflake VARIANT can only accept data flow values of struct, map, or array type. If the value of your input data column(s) is JSON, XML, or another string, use a parse transformation before the Snowflake sink transformation to convert the value into a struct, map, or array type.
- Message: Malformed records are detected in schema inference. Parse Mode: FAILFAST.
- Cause: Wrong document form is selected to parse JSON file(s).
- Recommendation: Try different Document form (Single document/Document per line/Array of documents) in JSON settings. Most cases of parsing errors are caused by wrong configuration.
- Message: Failed to read footer for file
- Cause: Folder _spark_metadata is created by the structured streaming job.
- Recommendation: Delete _spark_metadata folder if it exists. For more information, refer to this article.
- Message: Failed to execute dataflow with internal server error, please retry later. If issue persists, please contact Microsoft support for further assistance
- Cause: The data flow execution failed because of a system error.
- Recommendation: To solve this issue, refer to Internal server errors.
- Message: Storage with user assigned managed identity authentication in staging is not supported
- Cause: An exception happened because of an invalid staging configuration.
- Recommendation: The user-assigned managed identity authentication is not supported in staging. Use a different authentication to create an Azure Data Lake Storage Gen2 or Azure Blob Storage linked service, then use it as staging in mapping data flows.
- Message: Blob operation is not supported on older storage accounts. Creating a new storage account may fix the issue.
- Cause: The storage account is too old.
- Recommendation: Create a new storage account.
- Message: Blob operation is not supported on older storage accounts. Creating a new storage account may fix the issue.
- Cause: Operation is not supported.
- Recommendation: Change the update method configuration, because delete, update, and upsert aren't supported in Azure Data Explorer.
- Message: Operation timeout while writing data.
- Cause: Operation times out while writing data.
- Recommendation: Increase the value in Timeout option in sink transformation settings.
- Message: Operation timeout while reading data.
- Cause: Operation times out while reading data.
- Recommendation: Increase the value in Timeout option in source transformation settings.
- Message: There are substantial concurrent MappingDataflow executions that are causing failures due to throttling under Integration Runtime.
- Cause: A large number of Data Flow activity runs are occurring concurrently on the integration runtime. For more information, see Azure Data Factory limits.
- Recommendation: If you want to run more Data Flow activities in parallel, distribute them across multiple integration runtimes.
- Message: There are substantial concurrent MappingDataflow executions which is causing failures due to throttling under subscription '%subscriptionId;', ActivityId: '%activityId;'.
- Cause: Throttling threshold was reached.
- Recommendation: Retry the request after a wait period.
- Message: Failed to provision cluster for '%activityId;' because the request computer exceeds the maximum concurrent count of 200. Integration Runtime '%IRName;'
- Cause: Transient error
- Recommendation: Retry the request after a wait period.
- Message: Unsupported compute type and/or core count value.
- Cause: Unsupported compute type and/or core count value was provided.
- Recommendation: Use one of the supported compute type and/or core count values given in this document.
- Message: Spark cluster not found.
- Recommendation: Restart the debug session.
- Message: Hit unexpected failure while allocating compute resources, please retry. If the problem persists, please contact Azure Support
- Cause: Transient error
- Recommendation: Retry the request after a wait period.
- Message: Unexpected failure during execution.
- Cause: Because debug clusters work differently from job clusters, excessive debug runs could wear out the cluster over time, which could cause memory issues and abrupt restarts.
- Recommendation: Restart the debug cluster. If you're running multiple data flows during a debug session, use activity runs instead, because an activity-level run creates a separate session without taxing the main debug cluster.
- Message: java.sql.SQLTransactionRollbackException. Deadlock found when trying to get lock; try restarting transaction. If the problem persists, please contact Azure Support
- Cause: Transient error
- Recommendation: Retry the request after a wait period.
For more help with troubleshooting, see these resources: