HikeHills.com - AWS Certified Big Data - Specialty Exam Answer


Question

HikeHills.com (HH) is an online specialty retailer that sells clothing and outdoor recreation gear for trekking, camping, road biking, mountain biking, rock climbing, ice climbing, skiing, avalanche protection, snowboarding, fly fishing, kayaking, rafting, road and trail running, and more. HH runs their entire online infrastructure on Java-based web applications running on AWS.

HH captures clickstream data and uses a custom-built recommendation engine to recommend products, which improves sales and helps the company understand customer preferences. HH already uses the AWS Kinesis Producer Library (KPL) to collect event and transaction logs and process the stream.

The event/log size is around 12 bytes. HH has the following requirements for processing the data being ingested:

- Load the streaming data into an Elasticsearch cluster.
- Capture transformation failures into the same S3 bucket to address audit requirements.
- Back up the syslog streaming data into an S3 bucket.

Select 3 options.

Answers

Explanations


A. Streaming data can directly be delivered into ElasticSearch Domain.

B. Streaming data is delivered to your S3 bucket first. Kinesis Data Firehose then issues an Amazon ElasticSearch COPY command to load data from your S3 bucket to your Amazon ElasticSearch cluster.

C. The transformation failures and delivery failures are loaded into processing-failed and errors folders in the same S3 bucket.

D. The transformation failures and delivery failures are loaded into transform-failed and delivery-failed folders in the same S3 bucket.

E. When ES is selected as the destination, Source record S3 backup is enabled, and a Backup S3 Bucket is defined, untransformed incoming data can be delivered to a separate S3 bucket.

F. S3 backups can be managed by bucket policies.

Answer: A, C, E.

Option A is correct.

For Amazon Elasticsearch Service destinations, streaming data is delivered directly to your Amazon ES cluster, and it can optionally be backed up to your S3 bucket concurrently.

https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html#data-flow-diagrams
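For illustration, a minimal boto3 sketch of a delivery stream configured this way is shown below. The stream name, ARNs, and buffering values are placeholders and not taken from the question:

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

# Sketch only: names, ARNs, and buffering values below are placeholders.
firehose.create_delivery_stream(
    DeliveryStreamName="hh-clickstream-to-es",
    DeliveryStreamType="DirectPut",
    ElasticsearchDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/hh-firehose-role",
        "DomainARN": "arn:aws:es:us-east-1:123456789012:domain/hh-clickstream",
        "IndexName": "clickstream",
        "TypeName": "event",
        "IndexRotationPeriod": "OneDay",
        "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 1},
        # Concurrent backup of the source records to S3.
        "S3BackupMode": "AllDocuments",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/hh-firehose-role",
            "BucketARN": "arn:aws:s3:::hh-firehose-backup",
            "CompressionFormat": "GZIP",
        },
    },
)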

Option B is incorrect.

Kinesis Data Firehose does not stage Amazon ES data in S3 and then issue a COPY command; that data flow applies to Amazon Redshift destinations. For Amazon Elasticsearch Service destinations, streaming data is delivered directly to your Amazon ES cluster, and it can optionally be backed up to your S3 bucket concurrently.

https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html#data-flow-diagrams

Option C is correct.

When data transformation is enabled on the delivery stream, records that fail transformation or delivery are delivered to the processing-failed and errors folders in the same S3 bucket, which addresses the audit requirement.

https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html

https://docs.aws.amazon.com/firehose/latest/dev/basic-deliver.html#retry
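As a quick way to audit these failures, the failed records can be listed straight from the bucket. The bucket name below is a placeholder, and the prefixes simply mirror the folder names described above:

import boto3

s3 = boto3.client("s3")
bucket = "hh-firehose-backup"  # placeholder bucket name

# Adjust the prefixes if the delivery stream is configured with a custom prefix.
for prefix in ("processing-failed/", "errors/"):
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for obj in resp.get("Contents", []):
        print(prefix, obj["Key"], obj["Size"])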

Option D is incorrect.

Failed records are delivered to the processing-failed and errors folders; Kinesis Data Firehose does not use transform-failed or delivery-failed folders.

https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html

https://docs.aws.amazon.com/firehose/latest/dev/basic-deliver.html#retry

Option E is correct.

When Amazon ES is selected as the destination, Source record S3 backup is enabled, and a backup S3 bucket is defined, untransformed incoming data can be delivered to a separate S3 bucket.

https://docs.aws.amazon.com/firehose/latest/dev/create-

Option F is incorrect.

Source record S3 backup is a setting on the Kinesis Data Firehose delivery stream itself; it is not managed through S3 bucket policies.

https://docs.aws.amazon.com/firehose/latest/dev/create-

HikeHills.com (HH) is an online retailer that runs its entire infrastructure on AWS, including a custom-built recommendation engine that uses clickstream data to improve sales and understand customer preferences. HH currently uses the Kinesis Producer Library (KPL) to collect event and transaction logs that are approximately 12 bytes in size. The company has three requirements for processing this data: load the streaming data into an Elasticsearch cluster, capture transformation failures in the same S3 bucket to address audit requirements, and back up the syslog streaming data into an S3 bucket.
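Because HH already produces the events with the KPL into a Kinesis data stream, the delivery stream can read that stream as its source and run an optional Lambda transformation before indexing into Elasticsearch. The boto3 sketch below shows the source and processing pieces only; the stream, role, and function ARNs are placeholders, and the destination and backup settings would match the earlier sketch:

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

firehose.create_delivery_stream(
    DeliveryStreamName="hh-clickstream-to-es",
    # Read from the existing KPL-fed Kinesis data stream instead of direct PUTs.
    DeliveryStreamType="KinesisStreamAsSource",
    KinesisStreamSourceConfiguration={
        "KinesisStreamARN": "arn:aws:kinesis:us-east-1:123456789012:stream/hh-clickstream",
        "RoleARN": "arn:aws:iam::123456789012:role/hh-firehose-role",
    },
    ElasticsearchDestinationConfiguration={
        # Destination, backup, and buffering settings as in the earlier sketch.
        "RoleARN": "arn:aws:iam::123456789012:role/hh-firehose-role",
        "DomainARN": "arn:aws:es:us-east-1:123456789012:domain/hh-clickstream",
        "IndexName": "clickstream",
        "TypeName": "event",
        "S3BackupMode": "AllDocuments",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/hh-firehose-role",
            "BucketARN": "arn:aws:s3:::hh-firehose-backup",
        },
        # Optional record transformation; failed records land in processing-failed/.
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [
                {
                    "Type": "Lambda",
                    "Parameters": [
                        {
                            "ParameterName": "LambdaArn",
                            "ParameterValue": "arn:aws:lambda:us-east-1:123456789012:function:hh-transform",
                        }
                    ],
                }
            ],
        },
    },
)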

Let's go through the given options and select the three that meet HH's requirements:

Option A - Streaming data can directly be delivered into ElasticSearch Domain. This option is correct. Kinesis Data Firehose can deliver the streaming data directly to an Amazon Elasticsearch Service domain, which satisfies the requirement to load the streaming data into an Elasticsearch cluster. The remaining requirements, capturing transformation failures and backing up the source data, are covered by options C and E.

Option B - Streaming data is delivered to your S3 bucket first. Kinesis Data Firehose then issues an Amazon ElasticSearch COPY command to load data from your S3 bucket to your Amazon ElasticSearch cluster. This option is incorrect. The stage-in-S3-then-COPY data flow applies to Amazon Redshift destinations; for Amazon Elasticsearch Service destinations, Kinesis Data Firehose delivers records directly to the cluster, and there is no Elasticsearch COPY command (see the sketch below).
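To see why option B describes the Redshift flow, compare the destination block Firehose uses for Amazon Redshift, where an intermediate S3 bucket and a COPY command are explicitly configured. All names, credentials, and ARNs below are placeholders:

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

# Redshift is the destination type that stages data in S3 and issues a COPY command.
firehose.create_delivery_stream(
    DeliveryStreamName="hh-clickstream-to-redshift",
    RedshiftDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/hh-firehose-role",
        "ClusterJDBCURL": "jdbc:redshift://hh-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/hh",
        "CopyCommand": {
            "DataTableName": "clickstream_events",
            "CopyOptions": "json 'auto' gzip",
        },
        "Username": "hh_firehose",
        "Password": "placeholder-password",
        # Intermediate S3 bucket the COPY command loads from.
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/hh-firehose-role",
            "BucketARN": "arn:aws:s3:::hh-firehose-staging",
            "CompressionFormat": "GZIP",
        },
    },
)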

Option C - The transformation failures and delivery failures are loaded into processing-failed and errors folders in the same S3 bucket. This option is correct. It meets the requirement to capture transformation failures in the same S3 bucket for audit purposes; the other requirements are addressed by options A and E.

Option D - The transformation failures and delivery failures are loaded into transform-failed and delivery-failed folders in the same S3 bucket. This option is incorrect. Kinesis Data Firehose delivers failed records to the processing-failed and errors folders, not to transform-failed and delivery-failed folders, so the folder names in this option are wrong.

Option E - When ES is selected as the destination, and Source record S3 backup is enabled, and Backup S3 Bucket is defined, untransformed incoming data can be delivered to a separate S3 bucket. This option is correct. With Amazon Elasticsearch Service as the destination, enabling Source record S3 backup and defining a backup S3 bucket delivers the untransformed incoming data to a separate S3 bucket, which satisfies the requirement to back up the syslog streaming data.

Option F - S3 backups can be managed by bucket policies. This option is incorrect. Source record S3 backup is configured on the delivery stream itself; bucket policies control access to the bucket and do not address any of HH's requirements.

Therefore, the three options that meet HH's requirements are A, C, and E.