Customizing Featurization in Azure AutoML Experiments

Question

You are running AutoML experiments.

Although several featurization techniques are applied automatically during automated ML experiments, in this particular case you want to customize the featurization process and manually select the columns to be dropped.

Which of the following code segments will you need, and in what order?

1. automl_config = AutoMLConfig(name='Automated ML Experiment', … featurization='off' )
2. featurization_config.drop_columns = ['aspiration', 'stroke']
3. automl_config = AutoMLConfig(name='Automated ML Experiment', … featurization='FeaturizationConfig' )
4. featurization_config.enabled_transformers = ['DropColumns']
5. featurization_config = FeaturizationConfig()

Answers

Explanations

A. B. C. D.

Answer: B.

Option A is incorrect because setting featurization to 'off' simply disables automatic featurization and does not allow customization.

You have to set 'FeaturizationConfig'.

Option B is CORRECT because, to customize featurization, the AutoMLConfig object must be created with the featurization parameter set to 'FeaturizationConfig'; then a FeaturizationConfig object must be created and its drop_columns attribute set to the columns to be dropped.

Option C is incorrect because the AutoMLConfig object must be created with featurization='FeaturizationConfig'. In addition, you cannot "enable" transformers: they are executed by default and you can only disable them (by setting blocked_transformers), and there is no 'DropColumns' transformer.

Option D is incorrect because you cannot "enable" transformers: they are executed by default and you can only disable them (by setting blocked_transformers), and there is no 'DropColumns' transformer.

Therefore, the 'enabled_transformers' step is not needed.
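The steps in the correct answer can be sketched roughly as follows. This is a minimal sketch, not a full experiment setup: it assumes the Azure ML SDK v1 import path `azureml.automl.core.featurization`, and falls back to a hypothetical stand-in class when the SDK is not installed so that the snippet still runs. The question writes the featurization value as the quoted string 'FeaturizationConfig'; the commented call below passes the object itself, which is an assumption about how the call looks in practice, and in that case the object has to exist before it is passed.

```python
try:
    # Real class from the Azure ML SDK v1, if it is installed.
    from azureml.automl.core.featurization import FeaturizationConfig
except ImportError:
    class FeaturizationConfig:
        """Hypothetical stand-in exposing only the attributes used in this sketch."""

        def __init__(self):
            self.drop_columns = None
            self.blocked_transformers = None


# Segment 5: create the featurization configuration object.
featurization_config = FeaturizationConfig()

# Segment 2: list the columns that should be dropped.
featurization_config.drop_columns = ['aspiration', 'stroke']

# Segment 3 (left as a comment because the question elides the other arguments):
# automl_config = AutoMLConfig(name='Automated ML Experiment', ...,
#                              featurization=featurization_config)
```

Segment 4 does not appear anywhere in the sketch, which matches the explanation above: no transformer has to be enabled for columns to be dropped.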

The code segments needed to customize the featurization process and manually select columns to be dropped during automated ML experiments are 3, 5, and 2, in that order (option B). Segment 4 is not needed.

Here's a detailed explanation of each code segment and its role in the featurization process:

  1. automl_config = AutoMLConfig(name='Automated ML Experiment', ..., featurization='off'): This code segment is used to define the configuration for the automated ML experiment, including the name of the experiment, the type of task (classification or regression), the target column, the primary metric to optimize, and so on. The featurization='off' parameter is used to turn off automatic featurization, which means that no automatic feature engineering will be applied during the experiment.

  2. featurization_config.drop_columns = ['aspiration', 'stroke']: This code segment is used to specify which columns should be dropped from the dataset before training the model. The drop_columns attribute of the featurization_config object is set to a list of column names that should be excluded from the featurization process.

  3. automl_config = AutoMLConfig(name='Automated ML Experiment', ..., featurization='FeaturizationConfig'): This code segment sets the featurization parameter of the AutoMLConfig object to indicate that a customized featurization configuration is used. This tells the automated ML experiment to apply the specified featurization configuration instead of the default settings.

  4. featurization_config.enabled_transformers = ['DropColumns']: This code segment is a distractor. Transformers are executed by default and can only be disabled (by setting blocked_transformers), and there is no 'DropColumns' transformer, so this step is not needed to drop columns.

  5. featurization_config = FeaturizationConfig(): This code segment creates a new instance of the FeaturizationConfig class, which is used to customize the featurization process during automated ML experiments. The FeaturizationConfig object can be modified to specify which featurization transformers should be blocked, which columns should be dropped, and so on.
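To make the effect of segment 2 concrete, the following plain-Python sketch shows what excluding the two columns means for a small table. It is only a conceptual illustration and does not reflect how AutoML implements the step internally; the rows, their values, the extra column names ('horsepower', 'price'), and the helper `drop` are all made up for the example.

```python
# Hypothetical rows; the values are invented for illustration only.
rows = [
    {"aspiration": "std", "stroke": 2.68, "horsepower": 111, "price": 13495},
    {"aspiration": "turbo", "stroke": 3.40, "horsepower": 140, "price": 16500},
]

# Same column names as in segment 2 of the question.
drop_columns = ["aspiration", "stroke"]


def drop(rows, columns):
    """Return copies of the rows without the listed columns."""
    excluded = set(columns)
    return [
        {name: value for name, value in row.items() if name not in excluded}
        for row in rows
    ]


remaining = drop(rows, drop_columns)
```

After the call, each row in `remaining` keeps only the columns that were not listed, while the original `rows` are left unchanged.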

Therefore, the correct order of the code segments is: 3, 5, 2. First, we create a new AutoMLConfig object that uses a custom featurization configuration instead of the default settings (segment 3). Then, we create a new instance of the FeaturizationConfig class (segment 5) and specify which columns should be dropped from the dataset (segment 2). Segment 4 is left out because there is no 'DropColumns' transformer to enable.