Azure DevOps Pipelines for Data Ingestion and Validation with Data Factory and Databricks on Azure

Data Pipeline Steps and Task Names for Data Factory, Databricks, and Azure DevOps

Question

You want to take advantage of the DevOps pipelines provided by Azure.

You need to use Data Factory to ingest data and then run a notebook on a Databricks cluster that checks whether the data has been ingested correctly and validates the resulting data file.

The steps of your pipeline look like this:

# run pipeline
- job: "test job"
  displayName: "Test job"
  dependsOn: [Deploy_to_Databricks, Deploy_to_ADF]
  pool:
    vmImage: 'ubuntu-latest'
  timeoutInMinutes: 0
  steps:
  - task: <.......task1........>@4
    displayName: 'DF Pipeline'
    inputs:
      azureSubscription: $(AZURE_RM_CONNECTION)
      ScriptPath: '$(Build.SourcesDirectory)/adf/temp/My_DFPipeline.ps1'
      ScriptArguments: '-ResourceGroupName $(RESOURCE_GROUP) -DataFactoryName $(DATA_FACTORY_NAME) -PipelineName $(PIPELINE_NAME)'
      azurePowerShellVersion: LatestVersion
  - task: <.......task2........>@0
    inputs:
      versionSpec: '3.x'
      addToPath: true
      architecture: 'x64'
    displayName: 'Python3.x'
  - task: <.......task3........>@0
    inputs:
      url: '$(DATABRICKS_URL)'
      token: '$(DATABRICKS_TOKEN)'
    displayName: 'Databricks config'
  - task: <.......task4........>@0
    inputs:
      notebookPath: '/Shared/devops-ds/test-data-ingestion'
      existingClusterId: '$(DATABRICKS_CLUSTER_ID)'
      executionParams: '{"bin_file_name":"$(bin_FILE_NAME)"}'
    displayName: 'Ingest data'
  - task: waitexecution@0
    displayName: 'Wait until the testing is done'
Match the names of the pipeline steps with the task names in the script above:

Answers

A. B. C. D.

Correct Answer: C.

Explanations

Option A is incorrect because running the notebook must be preceded by ingesting the data with Data Factory and setting up the environment.

Option B is incorrect because task2 sets the Python version, while task3 configures the Databricks environment.

Option C is CORRECT because the script first runs a Data Factory pipeline from PowerShell, then sets the Python version and configures Databricks, and finally executes a notebook on a Databricks cluster.

Option D is incorrect because executenotebook is the last of the blank tasks in the sequence (task4).

Reference:

The pipeline steps in the script are:

  1. A PowerShell task to run the Azure Data Factory (ADF) pipeline using a PowerShell script (a sketch of the underlying call follows this list).
  2. A task to set up the Python environment in the pipeline.
  3. A task to configure the Databricks cluster.
  4. A task to execute the notebook on the Databricks cluster.
  5. A task to wait for the execution of the notebook to finish.
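For step 1, the referenced My_DFPipeline.ps1 is not shown, but triggering an ADF pipeline from Azure PowerShell typically comes down to a single Az.DataFactory cmdlet. Below is a minimal sketch written as an inline variant of the same AzurePowerShell@4 task; the inline form, the Write-Host logging, and the exact parameter plumbing are illustrative assumptions, not the actual contents of the script:

- task: AzurePowerShell@4
  displayName: 'DF Pipeline (inline sketch)'
  inputs:
    azureSubscription: $(AZURE_RM_CONNECTION)
    ScriptType: InlineScript
    Inline: |
      # Trigger the ADF pipeline and capture the run ID it returns
      $runId = Invoke-AzDataFactoryV2Pipeline `
        -ResourceGroupName "$(RESOURCE_GROUP)" `
        -DataFactoryName "$(DATA_FACTORY_NAME)" `
        -PipelineName "$(PIPELINE_NAME)"
      Write-Host "Started ADF pipeline run: $runId"
    azurePowerShellVersion: LatestVersion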

The names of the tasks in the script are not explicitly given, but we can deduce them from the task types and inputs; the completed script after the list below puts the deduced names back in place.

  • Task 1: The first task is a PowerShell task that runs the Azure Data Factory (ADF) pipeline. Its displayName is "DF Pipeline", but the azurePowerShellVersion, ScriptPath, and ScriptArguments inputs identify the task type as AzurePowerShell.

  • Task 2: The second task sets up the Python environment in the pipeline. The versionSpec, addToPath, and architecture inputs identify it as the UsePythonVersion task.

  • Task 3: The third task configures the connection to the Databricks workspace. The url and token inputs identify it as the configuredatabricks task.

  • Task 4: The fourth task executes the notebook on the Databricks cluster. The notebookPath, existingClusterId, and executionParams inputs identify it as the executenotebook task.

  • Task 5: The fifth task waits for the execution of the notebook to finish. Its name is already given explicitly in the script: waitexecution.
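Putting the deduced names back into the blanks gives the completed job below. Note that the three Databricks tasks are not part of the built-in Azure DevOps task catalog; they come from a marketplace extension (commonly the Databricks Script Deployment Task extension), so the exact identifiers and version numbers shown should be treated as assumptions tied to that extension:

# run pipeline
- job: "test job"
  displayName: "Test job"
  dependsOn: [Deploy_to_Databricks, Deploy_to_ADF]
  pool:
    vmImage: 'ubuntu-latest'
  timeoutInMinutes: 0
  steps:
  - task: AzurePowerShell@4          # task1: runs the ADF pipeline via PowerShell
    displayName: 'DF Pipeline'
    inputs:
      azureSubscription: $(AZURE_RM_CONNECTION)
      ScriptPath: '$(Build.SourcesDirectory)/adf/temp/My_DFPipeline.ps1'
      ScriptArguments: '-ResourceGroupName $(RESOURCE_GROUP) -DataFactoryName $(DATA_FACTORY_NAME) -PipelineName $(PIPELINE_NAME)'
      azurePowerShellVersion: LatestVersion
  - task: UsePythonVersion@0         # task2: pins the Python version on the agent
    inputs:
      versionSpec: '3.x'
      addToPath: true
      architecture: 'x64'
    displayName: 'Python3.x'
  - task: configuredatabricks@0      # task3: points the agent at the Databricks workspace
    inputs:
      url: '$(DATABRICKS_URL)'
      token: '$(DATABRICKS_TOKEN)'
    displayName: 'Databricks config'
  - task: executenotebook@0          # task4: runs the validation notebook on the cluster
    inputs:
      notebookPath: '/Shared/devops-ds/test-data-ingestion'
      existingClusterId: '$(DATABRICKS_CLUSTER_ID)'
      executionParams: '{"bin_file_name":"$(bin_FILE_NAME)"}'
    displayName: 'Ingest data'
  - task: waitexecution@0            # named in the script: blocks until the notebook finishes
    displayName: 'Wait until the testing is done'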

Therefore, the correct answer is option C: AzurePowerShell; UsePythonVersion; ConfigureDatabricks; ExecuteNotebook.