Competitive Product Data Ingestion and Comparison Solution for Retail Competitor Analysis | AWS Certified Machine Learning - Specialty Exam Preparation Guide

Data Ingestion and Product Comparison Solution for Retail Competitor Analysis

Question

You are a member of a machine learning team at a large online retailer.

Your team is responsible for retail competitor analysis.

You have a competitive product data streaming source that you need to ingest into your data lake.

You need to use that streaming competitor product data to match the corresponding product data in your catalog of products.

Using this matching, your data scientists can produce competitive analysis dashboards in a BI tool. Which of the following options gives you the best data ingestion and most efficient product comparison solution?

Answers

Explanations


A. B. C. D.

Answer: C.

Option A is incorrect.

Kinesis Data Streams cannot write directly to S3.

It needs a consumer, such as a Kinesis Client Library (KCL) application, to receive the data and then write it to S3.
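To make the extra work in Option A concrete, here is a minimal sketch of what such a consumer has to do. It uses the low-level boto3 calls get_shard_iterator, get_records, and put_object rather than the KCL itself, and it reads only one batch from one shard. The function name and all of its argument values are hypothetical; a production consumer would also need checkpointing, retries, and handling for every shard.

```python
def drain_shard_to_s3(kinesis, s3, stream_name, shard_id, bucket, key):
    """Read one batch of records from a shard and write it to S3 as one object.

    `kinesis` and `s3` are assumed to be boto3 clients, for example
    boto3.client("kinesis") and boto3.client("s3").
    """
    # Start reading at the oldest record still retained in the shard.
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]

    response = kinesis.get_records(ShardIterator=iterator, Limit=100)
    payloads = [record["Data"] for record in response["Records"]]
    if not payloads:
        return 0

    # Join the raw record payloads with newlines and store them as one object.
    body = b"\n".join(payloads) + b"\n"
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    return len(payloads)
```

All of this is code your team would have to write and operate, which is the overhead the explanation refers to.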

Option B is incorrect.

Kinesis Data Streams cannot write directly to Elasticsearch.

Also, Elasticsearch has no built-in data matching capability.

Option C is CORRECT.

Kinesis Data Firehose can stream directly to S3.
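By contrast, a producer only has to hand records to Firehose, which takes care of delivery to S3. The sketch below assumes a delivery stream that already exists and is configured with an S3 destination. The function name, the stream name, and the product fields are hypothetical; put_record is the boto3 Firehose client call.

```python
import json


def send_product_to_firehose(firehose, delivery_stream, product):
    """Send one competitor product record to a Firehose delivery stream.

    `firehose` is assumed to be a boto3 client: boto3.client("firehose").
    """
    # Newline-delimit each JSON record so the objects Firehose writes to S3
    # can be parsed one line at a time.
    data = (json.dumps(product, sort_keys=True) + "\n").encode("utf-8")
    response = firehose.put_record(
        DeliveryStreamName=delivery_stream,
        Record={"Data": data},
    )
    return response["RecordId"]
```

A call might look like send_product_to_firehose(boto3.client("firehose"), "competitor-products", {"title": "Acme Kettle 1.7L", "price": 24.99}), where the stream name and fields are illustrative only.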

Lake Formation provides the FindMatches ML transform (run through AWS Glue), which enables you to identify matching records in your dataset even when the records do not have a common unique identifier and no fields match exactly.

In this scenario, you will match products in your product catalog with your competitive product sources even though the product entries are structured differently.
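As a rough illustration, a FindMatches transform can be defined through the AWS Glue create_ml_transform API. The sketch below only builds the request. The function name, the tradeoff values, and every argument are assumptions for illustration, and it assumes the catalog and competitor records have already been combined into a single Glue Data Catalog table. The transform would also typically need to be taught with labeled examples before its matches are useful.

```python
def build_find_matches_request(name, database, table, primary_key, role_arn):
    """Build keyword arguments for the Glue create_ml_transform call."""
    return {
        "Name": name,
        # The Data Catalog table holding the records to be matched.
        "InputRecordTables": [{"DatabaseName": database, "TableName": table}],
        "Parameters": {
            "TransformType": "FIND_MATCHES",
            "FindMatchesParameters": {
                # Column that uniquely identifies each row in the table.
                "PrimaryKeyColumnName": primary_key,
                # Illustrative values: closer to 1.0 favors precision
                # (first) or accuracy (second) over recall or cost.
                "PrecisionRecallTradeoff": 0.9,
                "AccuracyCostTradeoff": 0.5,
                "EnforceProvidedLabels": False,
            },
        },
        # IAM role that Glue assumes when running the transform.
        "Role": role_arn,
    }
```

The resulting dictionary could then be passed as boto3.client("glue").create_ml_transform(**request).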

QuickSight allows your data scientists to produce their analysis dashboards.

Option D is incorrect.

Kinesis Data Firehose can write directly to Elasticsearch.

But Elasticsearch has no built-in data matching capability.

Reference:

Please see the AWS Lake Formation developer guide titled What Is AWS Lake Formation?

Please see the AWS Glue developer guide titled Matching Records with AWS Lake Formation FindMatches.

The best solution for ingesting the streaming competitor product data and matching it with corresponding product data in the catalog of products would be using Kinesis Data Firehose, S3, and Lake Formation to store and analyze the data. QuickSight can be used to create competitive analysis dashboards in a BI tool.

Option A uses Kinesis Data Streams, which is not the best option for this scenario: it cannot deliver data to S3 on its own, so a consumer application must read the stream and write the data to S3 before any analysis can happen. Elasticsearch and Kibana in option B are also not the best choice, as they are search and visualization tools, respectively, and have no built-in capability for matching records across datasets.

Option C, on the other hand, uses Kinesis Data Firehose, which is designed to deliver streaming data to destinations such as S3 without custom consumer code. Once the data is in S3, the FindMatches transform in Lake Formation can match the competitor products against the product catalog. QuickSight can then be used to create the competitive analysis dashboards.

Option D, similar to option B, uses Elasticsearch and Kibana, which are not the best choice for this scenario.

In conclusion, option C is the best solution: Kinesis Data Firehose ingests the streaming competitor product data into S3, Lake Formation matches it with the corresponding products in the catalog, and QuickSight presents the resulting analysis.