HikeHills.com - AWS Certified Big Data Specialty Exam - Solutions


Question

HikeHills.com (HH) is an online specialty retailer that sells clothing and outdoor recreation gear for hiking, camping, road biking, mountain biking, rock climbing, ice climbing, skiing, avalanche safety, snowboarding, fly fishing, kayaking, rafting, road and trail running, and much more. HH runs its entire online infrastructure on Java-based web applications running on AWS.

HH captures clickstream data and uses a custom-built recommendation engine to recommend products, which ultimately improves sales and helps the company understand customer preferences. HH already uses the AWS Kinesis Producer Library (KPL) to collect events and transaction logs and process the stream.

The event/log size is around 12 KB. HH has observed a couple of issues in the implementation and wants to fix the code quickly: some of the events and transaction logs are missing, which is adversely impacting the recommendations, and failures need to be accounted for. How can these issues be solved? Select 3 options.

Answers

Explanations



Answer : B, E, F.

Option A is incorrect - Writing to more than one stream is only needed if the requirement genuinely calls for two or more different streams.

There is no guarantee that a record that failed on the first stream would be written successfully to the second stream.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-writing.html

Option B is correct - Validate each transaction by confirming the successful insert into the stream, using the KPL's automatic and configurable retry mechanism.

https://docs.aws.amazon.com/streams/latest/dev/developing-producers-with-kpl.html

Option C is incorrect - Kinesis Streams provides the ability to use Future objects to validate UserRecords.

There is no need to complicate the code by storing records in memory or transient storage.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-writing.html

Option D is incorrect - Kinesis Streams provides the ability to use Future objects to validate UserRecords.

There is no need to complicate the code by storing records in memory or transient storage.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-writing.html

Option E is correct - Kinesis Streams provides the ability to examine the Future objects returned from the addUserRecord method, so failed UserRecords can be detected and accounted for.

https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-writing.html

Option F is correct - the time-to-live on records needs to be increased if the UserRecords cannot be inserted into the stream in time.

https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/default_config.properties
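The linked default_config.properties illustrates the settings involved. A minimal fragment might look like the following (parameter names are taken from that file; the values shown are my understanding of the defaults, and raising RecordTtl keeps buffered records alive longer before they are dropped):

```properties
# How long a record may sit in the KPL buffer before a flush is forced (ms).
RecordMaxBufferedTime = 100

# How long a record may remain buffered before it is dropped as expired (ms).
# Increase this if records are being dropped because they cannot be
# inserted into the stream in time.
RecordTtl = 30000
```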

The given scenario describes that HikeHills.com is an online specialty retailer that runs its entire online infrastructure on Java-based web applications running on AWS. They are capturing clickstream data and using a custom-built recommendation engine to recommend products, which improves sales and helps them understand customer preferences. They are using the AWS Kinesis KPL to collect events and transaction logs and process the stream, but they have observed some issues in the implementation that are adversely impacting the recommendations: some of the events and transaction logs are missing, and failures need to be accounted for.

To solve these issues, we need to select three options out of the given six options, which are:

A. Write to a minimum of 2 streams to ensure replicas of event/log data are always available even if a record could not be inserted into a stream.

B. Write to the Kinesis data stream with an automatic and configurable retry mechanism.

C. Maintain event/log information in temporary memory/storage of the application/session.

D. Compare the stream records using GetRecords method with temporary memory/storage of the application/session.

E. Examine failures using the Future objects that are returned from addUserRecord method.

F. Increase the time-to-live on records if records could not be inserted after RecordMaxBufferedTime.

Now, let's discuss each option in detail:

Option A: Write to a minimum of 2 streams to ensure replicas of event/log data are always available even if a record could not be inserted into a stream.

This option suggests writing to at least two streams so that replicas of the event/log data are always available, even if a record could not be inserted into one stream. However, duplicating writes does not solve the problem: there is no guarantee that a record that failed on the first stream will be written successfully to the second, and the approach doubles the cost and complexity of the producer. Writing to multiple streams is only justified when the requirement itself calls for two or more separate streams, so this option is incorrect.

Option B: Write to the Kinesis data stream with an automatic and configurable retry mechanism.

This option suggests writing to the Kinesis data stream with an automatic and configurable retry mechanism. The KPL retries failed puts automatically, and the retry behavior is configurable, which handles the transient failures and network issues that would otherwise cause records to be lost or delayed. This minimizes data loss during write failures without complicating the application code, so this option is correct.
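The retry-with-backoff pattern described above can be sketched in plain Java. This is an illustrative, self-contained analogue of what the KPL does internally, not the KPL API itself; the method and class names here are hypothetical:

```java
import java.util.concurrent.Callable;

// Minimal sketch of a retry-with-exponential-backoff wrapper, illustrating
// the pattern the KPL applies internally when a put fails. Names are
// illustrative; the real KPL handles this via its configuration, not
// application code.
public class RetrySketch {
    // Runs the task up to maxAttempts times, doubling the delay after each
    // failure; rethrows the last exception if every attempt fails.
    static <T> T withRetries(Callable<T> task, int maxAttempts, long initialDelayMs)
            throws Exception {
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // exponential backoff
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated put that fails twice before succeeding.
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient put failure");
            return "OK after " + calls[0] + " attempts";
        }, 5, 10);
        System.out.println(result);
    }
}
```

Running this prints `OK after 3 attempts`: the first two simulated puts fail, the wrapper backs off and retries, and the third succeeds.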

Option C: Maintain event/log information in temporary memory/storage of the application/session.

This option suggests maintaining event/log information in temporary memory/storage of the application/session. This duplicates state that the KPL already tracks, and the temporary copy itself can be lost in a server failure or outage. Since the Future objects returned by the KPL already report per-record success or failure, there is no need to complicate the code this way, so this option is incorrect.

Option D: Compare the stream records using GetRecords method with temporary memory/storage of the application/session.

This option suggests comparing the stream records retrieved via the GetRecords method against temporary memory/storage of the application/session. Re-reading the stream to reconcile it against a local copy increases processing time and cost, especially at high data volumes, and again duplicates information that the KPL's Future objects already provide, so this option is incorrect.

Option E: Examine failures using the Future objects that are returned from addUserRecord method.

This option suggests examining failures using the Future objects returned from the addUserRecord method. Each call to addUserRecord returns a Future that resolves to the result of the put, so the application can detect write failures and take appropriate action, such as retrying or logging the failed record. This directly satisfies the requirement to account for failures, so this option is correct.
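The per-record failure check can be sketched with the JDK's own future type. Note the real KPL's addUserRecord returns a Guava ListenableFuture of UserRecordResult; the addUserRecord method below is a hypothetical stand-in using CompletableFuture so the example stays self-contained:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

// Sketch of examining per-record futures for failures. The real KPL's
// addUserRecord returns a ListenableFuture<UserRecordResult>; here a
// stdlib CompletableFuture stands in so the pattern is runnable as-is.
public class FutureCheckSketch {
    // Hypothetical stand-in for kinesisProducer.addUserRecord(...).
    static CompletableFuture<String> addUserRecord(String data, boolean simulateFailure) {
        CompletableFuture<String> f = new CompletableFuture<>();
        if (simulateFailure) {
            f.completeExceptionally(new RuntimeException("record expired: " + data));
        } else {
            f.complete("seq-" + data.hashCode());
        }
        return f;
    }

    public static void main(String[] args) {
        int failures = 0;
        for (String record : new String[]{"click-1", "click-2", "click-3"}) {
            CompletableFuture<String> f = addUserRecord(record, record.equals("click-2"));
            try {
                f.get(); // blocks until the put completes; throws on failure
            } catch (InterruptedException | ExecutionException e) {
                failures++; // account for the failure: re-queue, log, or alert
            }
        }
        System.out.println("failed records: " + failures);
    }
}
```

The loop blocks on each future and counts the puts that completed exceptionally; here the second simulated record fails, so the program prints `failed records: 1`.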