Hyperparameter Tuning for Machine Learning Model in AWS

Scaling Type and Search Type for Hyperparameter Tuning Job

Question

You work as a machine learning specialist for a social media software company that produces games for mobile devices.

Your company has a new game that they believe will generate a large following very quickly.

You need to build a model to predict whether users will purchase additional game features via in-app purchases.

You have a large dataset to use for your training, and you need to find the best hyperparameters by using a hyperparameter tuning job.

You have configured the training jobs that the hyperparameter tuning job will run by defining an estimator and an objective metric.

You want to run your training jobs in a highly parallel manner because you want to complete your hyperparameter tuning quickly.

Also, you know that the order of magnitude of your hyperparameter values is more important than their absolute value.

For example, a change from 1 to 2 is expected to have a much bigger impact than a change from 100 to 101.

Which scaling type and search type combination should you use for your hyperparameter tuning job?

Answers

Explanations


A. Logarithmic scaling and Bayesian search
B. Logarithmic scaling and Random search
C. Linear scaling and Bayesian search
D. Linear scaling and Random search

Correct Answer: B.

Option A is incorrect.

When an order of magnitude is more important than the absolute value for your hyperparameters in your tuning, you should use logarithmic scaling.

However, when you wish to run many training jobs in parallel, you should use a Random search strategy, not a Bayesian search strategy.

Option B is correct.

When an order of magnitude is more important than the absolute value for your hyperparameters in your tuning, you should use logarithmic scaling.

When you wish to run many training jobs in parallel, you should use a Random search strategy.

Option C is incorrect.

When an order of magnitude is more important than the absolute value for your hyperparameters in your tuning, you should use logarithmic scaling, not linear scaling.

When you wish to run many training jobs in parallel, you should use a Random search strategy, not a Bayesian search strategy.

Option D is incorrect.

When an order of magnitude is more important than the absolute value for your hyperparameters in your tuning, you should use logarithmic scaling, not linear scaling.

References:

Please see the Amazon SageMaker developer guide titled How Hyperparameter Tuning Works (https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html),

The Amazon SageMaker Examples titled Random search and hyperparameter scaling with SageMaker XGBoost and Automatic Model Tuning (https://github.com/aws/amazon-sagemaker-examples/blob/master/hyperparameter_tuning/xgboost_random_log/hpo_xgboost_random_log.ipynb)

To find the best hyperparameters for a machine learning model, we often use hyperparameter tuning. Hyperparameter tuning is an optimization process that aims to find the best set of hyperparameters for a given machine learning model. There are different methods for hyperparameter tuning, including grid search, random search, and Bayesian search.

In this scenario, the goal is to build a model to predict whether users will purchase additional game features via in-app purchases. The training data is large, and the focus is on completing the hyperparameter tuning quickly. Additionally, the order of magnitude is more important than the absolute value for the hyperparameter values.

For this scenario, the best scaling type and search type combination to use for the hyperparameter tuning job is logarithmic scaling and random search (option B).
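As a sketch, this choice maps directly onto the request you would pass to the SageMaker `CreateHyperParameterTuningJob` API. The dictionary below follows the boto3 request shape; the metric name, job limits, and range values are illustrative assumptions, not values from the scenario:

```python
# Sketch of a boto3-style tuning job config (illustrative values).
# Strategy "Random" and ScalingType "Logarithmic" encode option B.
tuning_job_config = {
    "Strategy": "Random",
    "HyperParameterTuningJobObjective": {
        "Type": "Maximize",
        "MetricName": "validation:auc",  # assumed objective metric
    },
    "ResourceLimits": {
        "MaxNumberOfTrainingJobs": 60,
        "MaxParallelTrainingJobs": 20,  # random search allows high parallelism
    },
    "ParameterRanges": {
        "ContinuousParameterRanges": [
            {
                "Name": "eta",  # e.g., the XGBoost learning rate
                "MinValue": "0.001",
                "MaxValue": "1.0",
                "ScalingType": "Logarithmic",
            }
        ]
    },
}
```

This config would be passed as the `HyperParameterTuningJobConfig` argument of `create_hyper_parameter_tuning_job`, alongside the training job definition built from your estimator.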

Logarithmic scaling is useful when the range of a hyperparameter spans several orders of magnitude. Under logarithmic scaling, the tuner samples values uniformly over the exponent (log-uniformly) rather than uniformly over the raw values, so each order of magnitude is explored with equal probability. This is exactly what is needed in this scenario, where the order of magnitude is more important than the absolute value of the hyperparameters.
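The effect of logarithmic scaling can be illustrated in plain Python by drawing the exponent uniformly. In this toy sketch (the range [0.001, 1] is an assumed learning-rate range, not from the scenario), the decade [0.001, 0.01) receives about as many samples as [0.1, 1), whereas uniform sampling would concentrate ~90% of samples in the top decade:

```python
import random

random.seed(0)

def sample_log_uniform(low_exp, high_exp, n):
    # Draw the exponent uniformly, so each decade is equally likely.
    return [10 ** random.uniform(low_exp, high_exp) for _ in range(n)]

samples = sample_log_uniform(-3, 0, 10_000)  # values in [0.001, 1]
in_first_decade = sum(0.001 <= s < 0.01 for s in samples)
in_last_decade = sum(0.1 <= s < 1.0 for s in samples)
# Under log scaling, both decades receive roughly one third of the samples.
```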

Random search samples each training job's hyperparameter values independently at random from the predefined ranges. Because no job's values depend on the results of any other job, all jobs can run concurrently, which makes random search the right choice when the goal is to finish tuning quickly through a high degree of parallelism. It is also simple to configure and computationally cheap.

Bayesian search, by contrast, treats tuning as a regression problem: it builds a probabilistic model of the objective metric from the results of completed training jobs and uses that model to choose the next values to try. This often reaches a good result in fewer total jobs. However, because each new candidate depends on the results of previous jobs, Bayesian search is largely sequential and limits how many jobs can usefully run in parallel, which makes it a poor fit when fast, highly parallel tuning is the priority.
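The sequential bottleneck can be illustrated with a deliberately simplified loop. This is a toy "adaptive search" sketch, not actual Bayesian optimization (which fits a probabilistic surrogate model); the point is only that each iteration must wait for the previous result:

```python
import random

random.seed(1)

def objective(lr):
    # Hypothetical stand-in for a training job's validation score.
    return -(lr - 0.01) ** 2

history = []
for _ in range(10):
    if history:
        # Adaptive step: sample near the best point seen so far.
        # Each iteration depends on prior results, so jobs run one
        # after another instead of all at once.
        best = max(history, key=lambda h: h[1])[0]
        lr = min(1.0, max(1e-4, best * 10 ** random.uniform(-0.5, 0.5)))
    else:
        lr = 10 ** random.uniform(-4, 0)
    history.append((lr, objective(lr)))
```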

Linear scaling samples hyperparameter values uniformly across the predefined range, so equal absolute changes are treated as equally important. This is appropriate when the range is small and does not span several orders of magnitude. It is not suitable here, where the order of magnitude matters more than the absolute value of the hyperparameters.

In summary, the best scaling type and search type combination to use for the hyperparameter tuning job is logarithmic scaling and random search (option B). This combination will allow the search algorithm to explore the hyperparameter space efficiently while completing the hyperparameter tuning quickly.