Solving ML Pipeline Output Problem

Question

While running your ML experiments, you make some changes to the underlying data in one of the datasets used during the preparation step in your pipeline.

After running your pipeline, you notice that the output doesn't change, regardless of the changes in the data.

What should you do in order to solve the problem? Select two:

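Answers

A. In your PythonScriptStep, set allow_reuse=False.

B. In your PythonScriptStep, set allow_reuse=True.

C. In your PythonScriptStep, set regenerate_outputs=True.

D. In experiment.submit, set regenerate_outputs=True.

E. In experiment.submit, set regenerate_outputs=False.

Explanations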

Answers: A and D.

Option A is CORRECT because reusing the output of previous steps in the pipeline is enabled by default.

You have to disable it if you want to prevent steps from using previous outputs.

Option B is incorrect because this is the default setting, which reuses outputs from the previous run.

Leaving it “True” will not solve the problem.

Option C is incorrect because there is no “regenerate_outputs” parameter in PythonScriptStep.

Option D is CORRECT because setting regenerate_outputs=True at the experiment level forces all the steps in the pipeline NOT to use results from previous runs.

Option E is incorrect because setting regenerate_outputs=False at the experiment level lets all the steps in the pipeline use results from previous runs, i.e., it won't solve the problem.
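The behavior described in the question can be illustrated with a small toy model. This is not Azure Machine Learning code: ToyStep, ToyPipelineRunner, and the cache key are hypothetical names invented for this sketch, and it assumes, as the question implies, that the reuse check looks at the step definition rather than at the content of the data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class ToyStep:
    """Hypothetical stand-in for a pipeline step (not an Azure ML class)."""
    name: str
    dataset_path: str              # part of the step definition
    script: Callable[[str], str]   # the "preparation" logic
    allow_reuse: bool = True       # mirrors the default described above


@dataclass
class ToyPipelineRunner:
    """Hypothetical runner that caches step outputs between runs."""
    storage: Dict[str, str]        # dataset path -> current data content
    cache: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def run(self, step: ToyStep, regenerate_outputs: bool = False) -> str:
        # Assumption: the cache key is built from the step definition only,
        # so edits to the data behind dataset_path do not change it.
        key = (step.name, step.dataset_path)
        can_reuse = step.allow_reuse and not regenerate_outputs
        if can_reuse and key in self.cache:
            return self.cache[key]
        output = step.script(self.storage[step.dataset_path])
        self.cache[key] = output
        return output


storage = {"data/raw.csv": "v1"}
runner = ToyPipelineRunner(storage)
prep = ToyStep("prep", "data/raw.csv", str.upper)

first = runner.run(prep)                            # step actually runs
storage["data/raw.csv"] = "v2"                      # underlying data changes
stale = runner.run(prep)                            # cached output is reused
fresh = runner.run(prep, regenerate_outputs=True)   # like option D

storage["data/raw.csv"] = "v3"
no_reuse = ToyStep("prep", "data/raw.csv", str.upper, allow_reuse=False)
rerun = runner.run(no_reuse)                        # like option A
```

In this model the second run returns the stale result because nothing in the step definition changed, while either disabling reuse on the step or forcing regeneration for the run makes the step read the data again.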

The issue is that changes in the underlying data of one of the datasets used during the preparation step are not reflected in the output of the pipeline. The five options can be assessed as follows:

  1. In your PythonScriptStep, set allow_reuse = False: This option tells Azure Machine Learning not to reuse the step's previous output, so the step is re-run on every pipeline run. Setting allow_reuse to False ensures that any changes in the data are taken into account.

  2. In your PythonScriptStep, set allow_reuse = True: This is the default. It tells Azure Machine Learning to reuse the output of the step if the step's script, inputs and parameters have not changed. Reuse can speed up pipeline execution, but it is exactly what keeps the changed data from being picked up, so this setting does not solve the problem.

  3. In your PythonScriptStep, set regenerate_outputs=True: This is not a valid option, because PythonScriptStep has no regenerate_outputs parameter. Regeneration is requested when the pipeline is submitted, not on an individual step.

  4. In the experiment.submit, set regenerate_outputs=True: This option tells Azure Machine Learning to regenerate the outputs of all the steps in the pipeline, regardless of whether the inputs and parameters have changed. Setting regenerate_outputs to True ensures that any changes in the data are taken into account.

  5. In the experiment.submit, set regenerate_outputs=False: This option lets the steps in the pipeline reuse results from previous runs when their inputs and parameters have not changed. It can speed up pipeline execution, but it leaves the current behavior unchanged and does not solve the problem.

Therefore, options A and D are the correct answers: they ensure that any changes in the data are taken into account, either by disabling reuse for the step or by regenerating the outputs of the entire pipeline.
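As a rough sketch, the two settings could be applied with the Azure Machine Learning Python SDK (v1) as shown below. The step name, script name, source directory, and experiment name are placeholders invented for this example, and the workspace and compute target are assumed to exist already. The SDK imports sit inside the function so the snippet can be loaded without the SDK installed.

```python
# Keyword arguments for the preparation step (placeholder names).
STEP_KWARGS = {
    "name": "prepare_data",            # hypothetical step name
    "script_name": "prep.py",          # hypothetical script
    "source_directory": "./scripts",   # hypothetical folder
    "allow_reuse": False,              # option A: always re-run this step
}

# Keyword arguments passed through to Experiment.submit.
SUBMIT_KWARGS = {
    "regenerate_outputs": True,        # option D: no reuse for this run
}


def submit_without_reuse(workspace, compute_target,
                         experiment_name="prep-experiment"):
    """Build a one-step pipeline and submit it with reuse disabled.

    Either setting alone addresses the problem; both are shown together
    only for illustration.
    """
    from azureml.core import Experiment
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import PythonScriptStep

    prep_step = PythonScriptStep(compute_target=compute_target, **STEP_KWARGS)
    pipeline = Pipeline(workspace=workspace, steps=[prep_step])
    experiment = Experiment(workspace=workspace, name=experiment_name)
    return experiment.submit(pipeline, **SUBMIT_KWARGS)
```

Setting allow_reuse on the step affects only that step on every run, whereas regenerate_outputs applies to all steps but only for the run being submitted.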