Forecasting Demand for a New Product Using Historical Time Series Data

Question

You are a machine learning specialist working for a retail clothing conglomerate.

Your company sells many lines of clothing, such as budget casual wear, office casual wear, and office formal wear.

For each existing product in these categories, you have been using autoregressive integrated moving average (ARIMA) models to forecast demand.

You now wish to forecast demand for a new product based on the collective historical time series data from your existing products.

Which approach should you take to forecast demand for your new product?

Answers

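A. Forecast demand for the new product using the XGBoost algorithm.

B. Forecast demand for the new product using the DeepAR algorithm.

C. Forecast demand for the new product using an ARIMA model.

D. Forecast demand for the new product using k-means clustering.

Explanations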

Answer: B.

Option A is incorrect.

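XGBoost is a gradient-boosted tree algorithm used to solve regression, classification, and ranking problems on tabular data.

It has no built-in notion of temporal order, so it is a poor fit for this forecasting problem without extensive feature engineering.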
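To see why a tabular learner is an awkward fit, note that a series must first be reshaped into rows of lagged values before an algorithm such as XGBoost can consume it. The following is a minimal sketch of that reshaping using NumPy only; the function name `make_lag_features` and the demand numbers are illustrative assumptions, not part of the question.

```python
import numpy as np

def make_lag_features(series, n_lags):
    """Turn a 1-D series into (X, y): each row of X holds the n_lags
    values that immediately precede the matching target in y."""
    series = np.asarray(series, dtype=float)
    if len(series) <= n_lags:
        raise ValueError("series must be longer than n_lags")
    # Column i is the series shifted so that it lines up i steps into the window.
    X = np.column_stack(
        [series[i : len(series) - n_lags + i] for i in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y

# Made-up weekly demand for one existing product (assumption).
weekly_demand = [120, 135, 128, 150, 162, 158]
X, y = make_lag_features(weekly_demand, n_lags=3)
print(X)
print(y)
```

Each row of `X` could then be fed to a regressor, but a brand-new product has no past values from which to build such rows.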

Option B is correct.

The DeepAR forecasting algorithm works well when you forecast from many similar time series across a set of cross-sectional units.

That is exactly the situation here: you have several similar time series from your existing product lines, and you want to train a single model on all of them.

Because that model learns patterns shared across the series, the collective history will help you predict sales for your new product.
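As a rough illustration of "one model, many series", the sketch below packs several product-line histories into the JSON Lines layout that the DeepAR developer guide cited in the references describes, with one object per series carrying `start`, `target`, and an optional integer `cat` field. The product names, sales figures, start date, and the helper `to_deepar_jsonl` are all made-up assumptions for the example.

```python
import json

# Hypothetical weekly unit sales for three existing product lines.
history = {
    "budget_casual": [120, 135, 128, 150],
    "office_casual": [80, 82, 91, 95],
    "office_formal": [40, 38, 45, 47],
}

def to_deepar_jsonl(series_by_line, start="2023-01-02 00:00:00"):
    """Serialize each series as one JSON object per line, tagging it
    with an integer category index in "cat" for its product line."""
    lines = []
    for cat_index, name in enumerate(sorted(series_by_line)):
        record = {
            "start": start,                          # timestamp of the first observation
            "target": list(series_by_line[name]),    # the observed demand values
            "cat": [cat_index],                      # categorical feature: product line
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

jsonl = to_deepar_jsonl(history)
print(jsonl)
```

All three series end up in a single training file, which is the key difference from fitting a separate model per product.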

Option C is incorrect.

An ARIMA model is trained on a single time series.

Here you need to train on several time series at once.

This is where DeepAR differs: it forecasts using many similar time series across a set of cross-sectional units.
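For reference, a common textbook way to write an ARIMA(p, d, q) model for one series makes the single-series limitation visible. The notation below is the standard one and is not taken from the question: B is the backshift operator (B y_t = y_{t-1}), the phi_i are autoregressive coefficients, the theta_j are moving-average coefficients, d is the differencing order, c is a constant, and epsilon_t is white noise.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^i\right) (1 - B)^d \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^j\right) \varepsilon_t
```

Every coefficient here is estimated from the history of the one series y_t, so nothing learned for one product carries over to another.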

Option D is incorrect.

K-means clustering is an unsupervised algorithm used to find discrete groupings in your data.

It produces cluster assignments rather than forecasts, so it does not fit your problem definition as well as the DeepAR algorithm.
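A small sketch shows what k-means actually returns. It is a plain NumPy version of Lloyd's algorithm, not the SageMaker implementation, and the feature rows (mean and standard deviation of weekly demand per product line) are invented for illustration.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

# Hypothetical per-product summary features: (mean demand, std of demand).
features = [[10, 2], [12, 3], [100, 20], [110, 25]]
centroids, labels = kmeans(features, k=2)
print(labels)
```

The output is a group label per product line, which can describe which products behave alike but says nothing about future demand.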

Reference:

Amazon SageMaker developer guide, K-Means Algorithm: https://docs.aws.amazon.com/sagemaker/latest/dg/k-means.html

Amazon SageMaker developer guide, XGBoost Algorithm: https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html

Amazon SageMaker developer guide, DeepAR Forecasting Algorithm: https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html

Medium article, Understanding Auto Regressive Moving Average Model - ARIMA: https://medium.com/fintechexplained/understanding-auto-regressive-model-arima-4bd463b7a1bb

The best approach to forecast demand for a new product based on historical time series data from existing products depends on the specific characteristics of the data, the nature of the new product, and the required forecasting accuracy.

In this scenario, the existing products belong to different categories and are each forecast with their own ARIMA model. ARIMA models are appropriate for time series data with autocorrelation, can capture trends and (with seasonal terms) seasonality, and are relatively easy to interpret and explain. However, each ARIMA model is fitted to a single series, so option C, forecasting demand for the new product using an ARIMA model, cannot draw on the collective history of the existing products, which is what the question asks for.

Option A, forecasting demand for the new product using the XGBoost algorithm, is not the best fit for this scenario since XGBoost is a machine learning algorithm designed for supervised learning problems such as classification and regression on tabular data. It is not designed specifically for time series forecasting problems like this one.

Option B, forecasting demand for the new product using the DeepAR algorithm, is the appropriate approach since DeepAR is a neural network-based algorithm specifically designed for time series forecasting that trains one model across many related series. It can capture complex temporal patterns and dependencies between the input features and the target variable. However, it may require a large amount of data to train and can be computationally expensive.

Option D, forecasting demand for the new product using k-means clustering, is not appropriate for this scenario since k-means clustering is an unsupervised learning algorithm designed for clustering problems. It does not provide a direct way to forecast the target variable.

Therefore, the most appropriate answer for this scenario is option B, forecasting demand for the new product using the DeepAR algorithm, since it learns from the collective history of the existing products and can apply what it learns to a product with little or no history of its own.