AWS EC2 Auto Scaling Group - Retrieving Logs for Troubleshooting

Reviewing Logs for Troubleshooting Bug in AWS EC2 Auto Scaling Group


Question

You currently run your infrastructure on Amazon EC2 instances behind an Auto Scaling group.

All logs for your application are currently written to ephemeral storage.

Recently your company experienced a major bug in the code that made it through testing and was ultimately deployed to your fleet.

This bug triggered your Auto Scaling group to scale up and back down before you could successfully retrieve the logs off your server to better assist you in troubleshooting the bug.

Which technique should you use to make sure that you can review your logs?

Answers

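A. Configure the ephemeral policies on your Auto Scaling group to back up on terminate.

B. Configure your Auto Scaling policies to create a snapshot of all ephemeral storage on terminate.

C. Install the CloudWatch Logs Agent on your AMI, and configure CloudWatch Logs Agent to stream your logs.

D. Install the CloudWatch monitoring agent on your AMI, and set up a new SNS alert for CloudWatch metrics that triggers the CloudWatch monitoring agent to backup all logs on the ephemeral drive.

Explanations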

Answer - C.

You can use CloudWatch Logs to monitor applications and systems using log data.

For example, CloudWatch Logs can track the number of errors that occur in your application logs and send you a notification whenever the rate of errors exceeds a threshold you specify.

CloudWatch Logs works from the log data your application already writes, so no code changes are required.
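
The error-rate notification described above can be sketched with a metric filter and an alarm. This is a minimal illustration, not a definitive setup: the log group, namespace, metric, filter, and alarm names are hypothetical, and the threshold and period are arbitrary. The two builder functions only assemble the arguments for the boto3 calls `put_metric_filter` and `put_metric_alarm`; `apply` sends them and assumes boto3 is installed and AWS credentials are configured.

```python
"""Sketch: count ERROR lines in a log group and alarm when they spike."""

# Hypothetical names, chosen for illustration only.
LOG_GROUP = "my-app/application.log"
NAMESPACE = "MyApp"
METRIC = "ErrorCount"


def metric_filter_params(log_group=LOG_GROUP):
    """Arguments for logs.put_metric_filter: emit 1 per matching log event."""
    return {
        "logGroupName": log_group,
        "filterName": "app-errors",
        "filterPattern": "ERROR",  # matches events containing the term ERROR
        "metricTransformations": [
            {
                "metricName": METRIC,
                "metricNamespace": NAMESPACE,
                "metricValue": "1",
            }
        ],
    }


def alarm_params(sns_topic_arn, threshold=10.0, period_seconds=300):
    """Arguments for cloudwatch.put_metric_alarm: notify above the threshold."""
    return {
        "AlarmName": "app-error-rate",
        "Namespace": NAMESPACE,
        "MetricName": METRIC,
        "Statistic": "Sum",  # total errors within each period
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no matching events means no errors
        "AlarmActions": [sns_topic_arn],
    }


def apply(sns_topic_arn):
    """Create the filter and the alarm. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above work without it

    boto3.client("logs").put_metric_filter(**metric_filter_params())
    boto3.client("cloudwatch").put_metric_alarm(**alarm_params(sns_topic_arn))
```

Keeping the argument builders separate from the API calls makes the configuration easy to review or unit test before anything is created in an account.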

Options A and B are invalid because Auto Scaling policies only control when instances are added or removed; they cannot back up or snapshot instance storage.

Option D is invalid because streaming log files is the job of the CloudWatch Logs Agent, not the monitoring agent.

For more information on CloudWatch Logs, refer to the link below:

http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html

The scenario described in the question highlights a challenge that many DevOps engineers face: retrieving logs for troubleshooting before the resources that hold them disappear. In this case, the logs are written to ephemeral storage, which is volatile and is lost when the instance is stopped or terminated.

To ensure that the logs are retained, one option is to use a technique that allows them to be stored in a more persistent location. Let's review each of the answers to see which technique is most appropriate in this scenario:

A. Configure the ephemeral policies on your Auto Scaling group to back up on terminate. This answer assumes the Auto Scaling group can back up the logs when an instance is terminated. Auto Scaling has no "ephemeral policy" setting of this kind, so there is nothing to configure. Even if such a backup existed, it would run only at termination, which gives you no way to review the logs while the bug is still occurring.

B. Configure your Auto Scaling policies to create a snapshot of all ephemeral storage on terminate. This answer suggests creating a snapshot of the ephemeral storage on termination. Snapshots are an Amazon EBS feature; instance store (ephemeral) volumes cannot be snapshotted, and scaling policies only adjust the capacity of the group. The logs would therefore still be lost with the instance.

C. Install the CloudWatch Logs Agent on your AMI, and configure CloudWatch Logs Agent to stream your logs. This answer suggests using the CloudWatch Logs Agent to stream the logs to CloudWatch Logs. This is the right option: the logs are stored in a persistent location and can be monitored and analyzed in near real time. Because the agent ships log entries off the instance as they are written, they remain available in CloudWatch Logs even after the instance is terminated.

D. Install the CloudWatch monitoring agent on your AMI, and set up a new SNS alert for CloudWatch metrics that triggers the CloudWatch monitoring agent to backup all logs on the ephemeral drive. This answer suggests using the CloudWatch monitoring agent to back up the logs when an SNS alert is triggered. The monitoring agent reports metrics rather than log files, and an SNS notification does not make it copy anything off the instance, so the logs would not be retained. The SNS alert also adds complexity without solving the problem; the CloudWatch Logs Agent is the tool intended for shipping logs.
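
To make option C more concrete, the sketch below renders a configuration file in the INI format read by the CloudWatch Logs Agent (`awslogs.conf`), which could be baked into the AMI. It is an illustration under assumptions: the helper `awslogs_config`, the log file path, and the log group name are hypothetical, and the newer unified CloudWatch agent uses a different JSON configuration that is not shown here.

```python
import configparser
import io


def awslogs_config(log_file, log_group):
    """Render an awslogs.conf body that streams one file to one log group."""
    # interpolation=None keeps literal characters such as '%' untouched.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["general"] = {"state_file": "/var/awslogs/state/agent-state"}
    cfg[log_file] = {
        "file": log_file,
        "log_group_name": log_group,
        # {instance_id} is expanded by the agent: one log stream per instance.
        "log_stream_name": "{instance_id}",
        "initial_position": "start_of_file",
    }
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical path and group name.
    print(awslogs_config("/var/log/my-app/application.log",
                         "my-app/application.log"))
```

Naming each stream after the instance ID keeps the logs of a terminated instance distinguishable from those of the rest of the fleet.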

In conclusion, the most appropriate technique to address the challenge described in the question is to install the CloudWatch Logs Agent on the AMI and configure it to stream the logs to CloudWatch Logs. This will allow the logs to be stored in a more persistent location and retrieved quickly and easily, even if the instance is terminated unexpectedly.
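
Once the logs are in CloudWatch Logs, they can be read back after the instance is gone. The following sketch assumes each log stream is named after its instance ID and that boto3 and AWS credentials are available for the final step; the function names, log group, and instance ID are hypothetical. `query_params` builds the arguments for the boto3 call `filter_log_events`, and `fetch_messages` pages through the results.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_ms(moment):
    """CloudWatch Logs expresses timestamps as milliseconds since the epoch."""
    return int(moment.timestamp() * 1000)


def query_params(log_group, instance_id, end, window=timedelta(hours=1)):
    """Arguments for logs.filter_log_events covering `window` up to `end`."""
    return {
        "logGroupName": log_group,
        "logStreamNames": [instance_id],  # assumes streams named by instance ID
        "startTime": to_epoch_ms(end - window),
        "endTime": to_epoch_ms(end),
    }


def fetch_messages(params):
    """Yield log messages. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so query_params works without it

    paginator = boto3.client("logs").get_paginator("filter_log_events")
    for page in paginator.paginate(**params):
        for event in page["events"]:
            yield event["message"]


if __name__ == "__main__":
    # Hypothetical log group and instance ID; look at the last hour.
    params = query_params("my-app/application.log", "i-0123456789abcdef0",
                          end=datetime.now(timezone.utc))
    for message in fetch_messages(params):
        print(message)
```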