Increasing Coverage in Case of Potential Load Balancer Misconfiguration, CDN Failure, or Global Networking Catastrophe | Where to Measure New SLI

Question

You support a multi-region web service running on Google Kubernetes Engine (GKE) behind a Global HTTP/S Cloud Load Balancer (CLB).

For legacy reasons, user requests first go through a third-party Content Delivery Network (CDN), which then routes traffic to the CLB.

You have already implemented an availability Service Level Indicator (SLI) at the CLB level.

However, you want to increase coverage in case of a potential load balancer misconfiguration, CDN failure, or other global networking catastrophe.

Where should you measure this new SLI? (Choose two.)

Answers

Explanations

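A. Your application servers' logs

B. Instrumentation coded directly in the client

C. Metrics exported from the application servers

D. GKE health checks for your application servers

E. A synthetic client that periodically sends simulated user requests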

CE.

To increase coverage in case of a load balancer misconfiguration, CDN failure, or other global networking catastrophe, measure the Service Level Indicator (SLI) at more than one point in the request path, so that a failure at the CLB does not also blind the measurement. The following two options are the most suitable:

  1. Metrics exported from the application servers: Server-side metrics such as request counts, error rates, and response latency show whether the requests that reach the backends are served successfully. Because they are collected behind the CLB, they do not depend on the load balancer's own reporting, and a sudden drop in request volume can indicate that traffic is being lost upstream. You can export these metrics to a monitoring system such as Google Cloud Monitoring or Prometheus.

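  2. A synthetic client that periodically sends simulated user requests: A synthetic client (a prober) sends requests to the public endpoint at regular intervals, so each probe follows the same path as real user traffic, through the CDN and the CLB. It can therefore detect DNS failures, connection timeouts, CDN outages, and load balancer misconfigurations, including cases where requests never arrive at the CLB and the existing SLI records nothing. You can run such probes with Cloud Monitoring uptime checks or with a scripted client such as Apache JMeter.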
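As a rough illustration of the synthetic client described above, the following Python sketch probes an endpoint at a fixed interval and reports the fraction of successful probes. It is a minimal sketch, not a production prober: the URL, the function names, and the rule that any non-5xx response counts as a success are all assumptions made for this example.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List

# Hypothetical public URL (assumption). A real probe should target the same
# hostname users hit, so each request traverses the CDN and the CLB.
PROBE_URL = "https://www.example.com/"


def http_probe(url: str, timeout_s: float = 5.0) -> bool:
    """Return True if the URL answers with a non-5xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx and 5xx; only 5xx counts as a failure here.
        return err.code < 500
    except OSError:
        # Covers URLError, DNS failures, refused connections, TLS errors, timeouts.
        return False


def run_probes(probe: Callable[[], bool], count: int, interval_s: float,
               sleep: Callable[[float], None] = time.sleep) -> List[bool]:
    """Call probe() count times, waiting interval_s seconds between calls."""
    results = []
    for i in range(count):
        results.append(probe())
        if i < count - 1:
            sleep(interval_s)
    return results


def availability(results: List[bool]) -> float:
    """Availability SLI: successful probes divided by all probes."""
    if not results:
        raise ValueError("no probe results")
    return sum(results) / len(results)


# Example use (performs real network requests, so it is left commented out):
# results = run_probes(lambda: http_probe(PROBE_URL), count=10, interval_s=60)
# print(f"probe availability: {availability(results):.3f}")
```

Running the prober from outside the serving infrastructure, ideally from several locations, keeps it from sharing a failure domain with the service it measures.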

The following options are not the best choices to measure the SLI for this scenario:

A. Your application servers' logs: While application logs can provide insights into the behavior of the application, they may not provide a complete picture of the user experience. They are not an ideal choice for measuring the SLI for this scenario.

B. Instrumentation coded directly in the client: While it is possible to measure the SLI by instrumenting the client, this approach is not recommended as it requires deploying new code to the client.

D. GKE health checks for your application servers: While GKE health checks can provide insight into the health of your application, they are not the best choice for measuring the SLI for this scenario as they only monitor the health of the individual application instances.

In summary, the best options to measure the SLI for this scenario are metrics exported from the application servers and a synthetic client that periodically sends simulated user requests.
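One way to use the two measurements together is to compare them over the same time window. The sketch below is a heuristic under stated assumptions, not an established method: the class name, the 99.9% target, and the diagnosis labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SliWindow:
    good: int   # successful requests (or probes) in the window
    total: int  # all requests (or probes) in the window

    def availability(self) -> float:
        if self.total == 0:
            raise ValueError("empty window")
        return self.good / self.total


def diagnose(server: SliWindow, probe: SliWindow, target: float = 0.999) -> str:
    """Compare the server-side SLI with the synthetic-probe SLI (heuristic)."""
    probe_ok = probe.availability() >= target
    # A window with no server traffic has no server-side errors to report.
    server_ok = server.total == 0 or server.availability() >= target
    if probe_ok and server_ok:
        return "healthy"
    if not probe_ok and server_ok:
        # Probes fail while the servers report no problem: the fault is
        # probably in front of them (CDN, CLB, or network path).
        return "upstream"
    if not probe_ok:
        return "application"
    # Servers report errors that the probes did not observe.
    return "partial"
```

For example, `diagnose(SliWindow(0, 0), SliWindow(0, 10))` returns `"upstream"`: every probe failed while no traffic reached the servers, which is the pattern a CDN outage or CLB misconfiguration would be expected to produce.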