Azure Data Warehouse Ingestion Process: Solution Analysis

Implementing Azure Data Solution Exam: DP-200

Question

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution. Determine whether the solution meets the stated goal.

You develop a data ingestion process that will import data to an enterprise data warehouse in Azure Synapse Analytics. The data to be ingested resides in Parquet files stored in an Azure Data Lake Storage Gen2 account.

You need to load the data from the Azure Data Lake Storage Gen2 account into the data warehouse.

Solution:

1. Create an external data source pointing to the Azure storage account

2. Create a workload group using the Azure storage account name as the pool name

3. Load the data using the INSERT...SELECT statement

Does the solution meet the goal?

Answer

A. Yes
B. No

Correct Answer: B

Explanation

You need to create an external file format and external table using the external data source.

You then load the data using the CREATE TABLE AS SELECT statement.
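The documented load path can be sketched in T-SQL as follows. The object names (`ParquetFormat`, `ext.Sales`, `dbo.FactSales`), the columns, and the folder path are illustrative placeholders, and an external data source named `AzureDataLakeStore` is assumed to exist already:

```sql
-- Describe how the Parquet files are encoded.
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (
    FORMAT_TYPE = PARQUET,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);

-- Expose the files in the lake as a queryable external table.
CREATE EXTERNAL TABLE ext.Sales (
    SaleId   INT,
    SaleDate DATE,
    Amount   DECIMAL(18, 2)
)
WITH (
    LOCATION    = '/sales/',            -- folder within the data source
    DATA_SOURCE = AzureDataLakeStore,
    FILE_FORMAT = ParquetFormat
);

-- CTAS creates and loads the warehouse table in one parallel operation.
CREATE TABLE dbo.FactSales
WITH (
    DISTRIBUTION = HASH(SaleId),
    CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT SaleId, SaleDate, Amount
FROM ext.Sales;
```

CTAS is preferred for loading because it defines the target table's distribution and index and populates it in a single, fully parallelized statement.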

https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-load-from-azure-data-lake-store

The proposed solution does not meet the goal, so the answer is B. No. Step 1 is a legitimate part of the load process, but steps 2 and 3 do not match the documented way to load Parquet files from Azure Data Lake Storage Gen2 into a dedicated SQL pool.

Here is a closer look at each step of the proposed solution:

  1. Create an external data source pointing to the Azure storage account: This step is valid. An external data source defines the location of the data in the Azure Data Lake Storage Gen2 account and the credential used to access it. (The file format is defined separately with CREATE EXTERNAL FILE FORMAT, not on the data source itself.)
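Step 1 might look like the following sketch; the credential name, container, and account are placeholders:

```sql
-- Authenticate to the storage account (here via the pool's managed identity).
CREATE DATABASE SCOPED CREDENTIAL ADLSCredential
WITH IDENTITY = 'Managed Service Identity';

-- Point an external data source at the Data Lake Storage Gen2 account.
CREATE EXTERNAL DATA SOURCE AzureDataLakeStore
WITH (
    TYPE       = HADOOP,
    LOCATION   = 'abfss://mycontainer@myaccount.dfs.core.windows.net',
    CREDENTIAL = ADLSCredential
);
```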

  2. Create a workload group using the Azure storage account name as the pool name: This step is incorrect. Workload groups are a workload-management feature that reserves and caps compute resources for requests running on the dedicated SQL pool; they have no connection to storage accounts, and naming one after the storage account does nothing to make the data accessible.
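For contrast, this is roughly what a real workload group declaration looks like: it carves out pool compute for requests and mentions no storage account. The group name and percentages are illustrative:

```sql
-- Reserve and cap compute for load requests on the dedicated SQL pool.
CREATE WORKLOAD GROUP DataLoads
WITH (
    MIN_PERCENTAGE_RESOURCE            = 25,  -- guaranteed share of the pool
    CAP_PERCENTAGE_RESOURCE            = 50,  -- upper bound for the group
    REQUEST_MIN_RESOURCE_GRANT_PERCENT = 25   -- per-request grant
);
```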

  3. Load the data using the INSERT...SELECT statement: This step is also not the documented approach. INSERT...SELECT copies rows between tables that already exist in the pool, but at this point no table exposes the Parquet files. You first need an external file format and an external table over the files, and the documented pattern then loads them with CREATE TABLE AS SELECT (CTAS), which creates and populates the target table in a single, fully parallel operation.
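To make the contrast concrete: once an external table over the Parquet files exists, an INSERT...SELECT into a pre-created warehouse table is technically possible, but it requires that target table to exist first. The table names below are illustrative placeholders:

```sql
-- Works only if dbo.FactSales was already created with the desired
-- distribution and index; CTAS creates and loads it in one step instead.
INSERT INTO dbo.FactSales (SaleId, SaleDate, Amount)
SELECT SaleId, SaleDate, Amount
FROM ext.Sales;
```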

In summary, the proposed solution is not a valid approach to loading Parquet files from an Azure Data Lake Storage Gen2 account into Azure Synapse Analytics. The correct pattern is: create an external data source, create an external file format and an external table, then load with CREATE TABLE AS SELECT.