Azure Data Factory: Incremental Data Loading from On-Premises SQL Server to Azure SQL Database

Performing Incremental Data Loading from On-Premises SQL Server to Azure SQL Database using Azure Data Factory

Question

George is a Cloud Data Engineer for Adatum Corporation.

He's working on migrating on-premises SQL Server tables to Azure SQL Database using Azure Data Factory.

He's required to perform the incremental data loading activity.

The steps for incremental data loading from on-premises SQL Server tables to Azure SQL Database through Azure Data Factory, using the Azure portal, are as follows:

A. Select the watermark column.

B. Prepare the SQL data store to store the watermark value.

C. Create a For-each activity that iterates through a list of source table names passed as a parameter to the pipeline.

D. Create two lookup activities, one for getting the old watermark value from last time and another for getting the new watermark value from the source table.

E. Create the Stored procedure activity in ADF to update the watermark for the next delta load.

F. Create a Copy activity in ADF to load the delta data between the two watermarks from the source table and merge it into the destination.

In which of the following sequences does he need to perform these activities?

Answers

Explanations


A. B. C. D.

Correct Answer: B

Here is the right sequence of steps for incrementally loading on-premises SQL Server tables into Azure SQL Database using Azure Data Factory activities and pipelines.

a. Select the watermark column. For each table in the source data store, select one column as the watermark column; it is used to identify the new or updated records on every run.

b. Prepare the SQL data store to store the watermark value.

c. Create a For-each activity that iterates through a list of source table names passed as a parameter to the pipeline.

d. Create two lookup activities. The first lookup activity retrieves the last watermark value, while the second lookup activity retrieves the new watermark value. These watermark values are passed to the Copy activity in the next step.

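f. Create a Copy activity that copies rows from the source data store whose watermark column value is greater than the old watermark value and less than or equal to the new watermark value. The upper bound has to be inclusive: with a strict "<", the row holding the new watermark value would never be copied, because the stored watermark advances to that value after the run.

e. Create a Stored Procedure activity in ADF that updates the watermark value for the next pipeline run.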


Here are the steps in the correct sequence:

a. Select the watermark column.

b. Prepare the SQL data store to store the watermark value.

c. Create a For-each activity that iterates through a list of source table names passed as a parameter to the pipeline.

d. Create two lookup activities, one for getting the old watermark value from last time and another for getting the new watermark value from the source table.

f. Create a Copy activity in ADF to load the delta data between the two watermarks from the source table and merge it into the destination.

e. Create the Stored procedure activity in ADF to update the watermark for the next delta load.

Each step in more detail:

a. Select the watermark column: In incremental data loading, the watermark column is used to track the last loaded record. It is a column whose value keeps increasing as rows are inserted or updated (for example, a last-modified timestamp or an auto-incrementing key), so it can be used to identify new or updated records. The first step is to select this column for each source table.

b. Prepare the SQL data store to store the watermark value: A SQL data store needs to be prepared to hold the watermark. This store keeps track of the last loaded watermark value for each table.

c. Create a For-each activity: The For-each activity iterates through the list of source table names passed as a parameter to the pipeline, which enables loading data from multiple tables. The lookup, copy, and stored procedure activities run inside it, once per table, which is why it comes before them.

d. Create two lookup activities: Inside the For-each activity, two lookup activities are created, one to get the old watermark value from the last load and another to get the new watermark value from the source table. Together they identify the range of data that needs to be loaded incrementally.

f. Create a Copy activity: The Copy activity loads the delta data between the two watermarks from the source table and merges it into the destination.

e. Create the Stored procedure activity: Finally, a stored procedure activity is created in ADF to update the watermark for the next delta load. It writes the new watermark value to the watermark store.

In summary, the correct sequence of activities for incremental data loading from on-premises SQL Server tables to Azure SQL Database using Azure Data Factory is a -> b -> c -> d -> f -> e.
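The ordering of the activities that run for each table follows from their dependencies: the Copy activity needs both watermark values, and the stored procedure should only run once the copy has finished. The sketch below models this with Python's standard `graphlib` module rather than with ADF itself. The activity names are hypothetical, and the outer loop plays the role of the For-each activity, processing tables one at a time for simplicity.

```python
from graphlib import TopologicalSorter

# Hypothetical activity names; each maps to the activities it depends on.
INNER_ACTIVITIES = {
    "LookupOldWatermark": set(),
    "LookupNewWatermark": set(),
    "IncrementalCopy": {"LookupOldWatermark", "LookupNewWatermark"},
    "UpdateWatermarkStoredProc": {"IncrementalCopy"},
}


def execution_plan(table_names):
    """Return (table, activity) pairs in an order that respects dependencies."""
    plan = []
    for table in table_names:  # stands in for the For-each activity
        # static_order() yields each activity only after its dependencies.
        order = TopologicalSorter(INNER_ACTIVITIES).static_order()
        plan.extend((table, activity) for activity in order)
    return plan


if __name__ == "__main__":
    for table, activity in execution_plan(["customer_table", "project_table"]):
        print(f"{table}: {activity}")
```

The two lookups have no dependency on each other, so either may come first; only their position relative to the copy and the stored procedure is fixed.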