Machine Learning Model Tuning for Shirt Style Classification | AWS Certified MLS-C01 Exam

XGBoost Model Tuning for Shirt Style Classification

Question

You work for a startup shirt manufacturer whose new manufacturing process produces very stylish shirts. The shirts have become very popular since your company ran an online Kickstarter campaign and shipped its first line.

You now want to use machine learning to classify your shirt styles as either conservative or not based on customer feedback on your website.

This classification information will help your designers target new designs based on the customer perception of your current offerings. You have gathered your data from your website comments and ratings.

You have also performed feature engineering of your data.

You are now ready to run as many model tuning jobs as needed, even hundreds of them, to find the best version of your XGBoost model.

You plan to do this by running many hyperparameter tuning jobs that test the range of hyperparameters you have available to you.

Since you are using a binary classification algorithm, and based on the business problem you are trying to solve, you need to measure the success of a hyperparameter tuning job in terms of precision and recall. Which XGBoost metric is the best objective on which to evaluate your model?

Answers

Explanations


A. accuracy
B. error
C. f1
D. mae
E. map
F. merror

Answer: C.

Option A is incorrect.

The accuracy metric only measures (right cases)/(all cases), which doesn't give you precision or recall, which are the two metrics you wish to use to evaluate your model.

Option B is incorrect.

The error metric only measures (wrong cases)/(all cases), which doesn't give you precision or recall, which are the two metrics you wish to use to evaluate your model.

Option C is correct.

The f1 metric combines precision and recall into one metric.

It represents the harmonic mean of precision and recall.

Its formula: 2*precision*recall/(precision+recall).
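To see why f1 is a better objective here than accuracy or error, consider an imbalanced validation set. The following pure-Python sketch uses invented confusion-matrix counts (not from the question) to show accuracy looking strong while F1 exposes poor precision and recall on the positive class:

```python
# Toy counts, assumed for illustration: 1000 shirts, only 20 truly positive.
tp, fp, fn, tn = 5, 20, 15, 960

accuracy = (tp + tn) / (tp + tn + fp + fn)          # (5 + 960) / 1000 = 0.965
precision = tp / (tp + fp)                          # 5 / 25 = 0.20
recall = tp / (tp + fn)                             # 5 / 20 = 0.25
f1 = 2 * precision * recall / (precision + recall)  # ~0.222

print(f"accuracy={accuracy:.3f}  precision={precision:.2f}  "
      f"recall={recall:.2f}  f1={f1:.3f}")
```

A tuner optimizing accuracy would rate this model highly even though it misses most positive shirts; optimizing f1 would not.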

Option D is incorrect.

The mae metric (mean absolute error) measures the average absolute difference between the predicted and target values; it is a regression metric.

This doesn't give you precision or recall, which are the two metrics you wish to use to evaluate your model.

Option E is incorrect.

The map metric finds the mean average precision.

This doesn't give you recall, which is one of the two metrics you wish to use to evaluate your model.

Option F is incorrect.

The merror metric is the multiclass classification error rate, computed as (wrong cases)/(all cases).

This doesn't give you recall, which is one of the two metrics you wish to use to evaluate your model.

Also, this metric is used for multiclass classification problems.

You are trying to solve a binary classification problem.

Reference:

Please see the Amazon SageMaker developer guide titled Tune an XGBoost Model, the XGBoost docs page titled XGBoost Parameters, and the article titled 20 Popular Machine Learning Metrics. Part 1: Classification & Regression Evaluation Metrics.

Based on the business problem you are trying to solve, you have decided that you need to measure the success of a hyperparameter tuning job based on precision and recall. Therefore, the best XGBoost metric to evaluate your model should also be based on precision and recall.

The most commonly used metric for evaluating binary classification models based on precision and recall is the F1 score, which is the harmonic mean of precision and recall. The F1 score is a balance between precision and recall and is useful when you want to optimize both precision and recall. It is defined as follows:

F1 Score = 2 * (Precision * Recall) / (Precision + Recall)

Here, Precision is the ratio of true positives to the sum of true positives and false positives. It measures the accuracy of positive predictions, i.e., how many of the predicted positive instances are actually positive. Recall, on the other hand, is the ratio of true positives to the sum of true positives and false negatives. It measures the completeness of positive predictions, i.e., how many of the actual positive instances are predicted as positive.
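The definitions above translate directly into code. A minimal pure-Python sketch (function names are my own, and the example counts are invented for illustration):

```python
def precision(tp, fp):
    """Accuracy of positive predictions: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Completeness of positive predictions: TP / (TP + FN)."""
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# e.g. 80 true positives, 20 false positives, 40 false negatives:
# precision = 0.8, recall ~0.667, so F1 ~0.727
print(f1_score(80, 20, 40))
```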

Therefore, option C (f1) is the best XGBoost metric on which to evaluate your model in this scenario. It balances precision and recall, both of which matter for the business problem you are trying to solve, and you can use it as the objective metric for hyperparameter tuning of your XGBoost model.
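In SageMaker, this means naming the validation F1 metric (validation:f1 for the built-in XGBoost algorithm) as the tuning objective. The selection logic a tuner performs can be sketched in pure Python; the hyperparameter settings and confusion counts below are invented stand-ins for real training-job results:

```python
def f1(tp, fp, fn):
    """F1 from confusion counts, guarding against division by zero."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Each trial: (hyperparameter setting, (TP, FP, FN) on validation data).
# All numbers are hypothetical, for illustration only.
trials = [
    ({"max_depth": 3, "eta": 0.3}, (70, 40, 30)),
    ({"max_depth": 5, "eta": 0.1}, (80, 20, 20)),
    ({"max_depth": 7, "eta": 0.05}, (85, 50, 15)),
]

# The tuner keeps the trial with the highest objective metric.
best = max(trials, key=lambda t: f1(*t[1]))
print("best hyperparameters:", best[0], "F1 =", round(f1(*best[1]), 3))
```

In a real tuning job, SageMaker would launch a training job per trial and read the metric from the job's output rather than from a hard-coded list.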