Scaling an NGINX-Based Application on GKE with GCLB - A Service Level Indicator (SLI) Approach

Scale Your Application's Frontend with the Right Service Level Indicator

Question

Your team has recently deployed an NGINX-based application into Google Kubernetes Engine (GKE) and has exposed it to the public via an HTTP Google Cloud Load Balancer (GCLB) ingress.

You want to scale the deployment of the application's frontend using an appropriate Service Level Indicator (SLI).

What should you do?

Correct Answer: B

Explanations

To scale the deployment of the NGINX-based application's frontend in Google Kubernetes Engine (GKE) using an appropriate Service Level Indicator (SLI), consider how each of the proposed options measures load and acts on it:

Option A: Configure the horizontal pod autoscaler to use the average response time from the Liveness and Readiness probes. Liveness and Readiness probes are health checks: the kubelet periodically sends requests to the container and records whether they succeed. They signal whether a pod is healthy, not how loaded it is, and Kubernetes does not expose probe response times as a metric the horizontal pod autoscaler (HPA) can consume natively. Average response time is also a lagging, noisy indicator when traffic fluctuates, which makes it a poor scaling SLI. A sketch of how probes are declared appears below.
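For reference, here is a minimal sketch of how such probes are declared on a Deployment. The Deployment name, image tag, ports, and health paths are illustrative assumptions, not details from the question:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-frontend            # hypothetical name, for illustration only
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-frontend
  template:
    metadata:
      labels:
        app: nginx-frontend
    spec:
      containers:
      - name: nginx
        image: nginx:1.25         # assumed image/tag
        ports:
        - containerPort: 80
        # Probes report pass/fail health only; their latency is not
        # surfaced to the HPA as a scaling metric.
        livenessProbe:
          httpGet:
            path: /healthz        # assumed health path
            port: 80
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready          # assumed readiness path
            port: 80
          periodSeconds: 5
```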

Option B: Configure the vertical pod autoscaler in GKE and enable the cluster autoscaler to scale the cluster as pods expand. This option involves using the vertical pod autoscaler (VPA) in GKE to adjust the resource allocation of the application's pods based on their resource usage patterns. The VPA monitors the resource usage of the pods and adjusts their resource requests and limits accordingly. This helps to optimize the utilization of resources and prevent resource starvation. Enabling the cluster autoscaler also ensures that the cluster scales up or down to accommodate the increased or decreased demand.
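As a sketch of what Option B looks like in practice, assuming a Deployment named nginx-frontend and a node pool named default-pool (both hypothetical), and assuming vertical pod autoscaling has been enabled on the cluster:

```yaml
# Vertical Pod Autoscaler targeting the frontend Deployment.
# VPA must be enabled on the cluster first, e.g.:
#   gcloud container clusters update CLUSTER_NAME --enable-vertical-pod-autoscaling
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-frontend-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-frontend          # hypothetical Deployment name
  updatePolicy:
    updateMode: "Auto"            # VPA evicts and recreates pods with new requests
---
# The cluster autoscaler is enabled per node pool, e.g.:
#   gcloud container clusters update CLUSTER_NAME \
#       --enable-autoscaling --min-nodes 1 --max-nodes 5 \
#       --node-pool default-pool
```

With updateMode set to "Auto", the VPA can grow each pod's resource requests as load increases, and the cluster autoscaler adds nodes when the larger pods no longer fit on the existing ones.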

Option C: Install the Stackdriver custom metrics adapter and configure a horizontal pod autoscaler to use the number of requests provided by the GCLB. This option installs the Stackdriver (now Cloud Monitoring) custom metrics adapter so the HPA can scale on the request rate the GCLB reports, a genuine load-based SLI measured at the edge where traffic actually enters. However, this option requires additional configuration and may incur extra monitoring costs.
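As a sketch of that wiring, assuming the adapter is already installed: the forwarding-rule name and the target request rate below are illustrative placeholders, and the pipe-separated metric name follows the convention the Stackdriver adapter uses to map Cloud Monitoring metric paths:

```yaml
# Install the adapter first, e.g.:
#   kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-frontend                  # hypothetical Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        # Cloud Monitoring metric for GCLB request count; "/" in the
        # metric path becomes "|" in the adapter's naming convention.
        name: loadbalancing.googleapis.com|https|request_count
        selector:
          matchLabels:
            resource.labels.forwarding_rule_name: frontend-fw-rule  # assumed
      target:
        type: AverageValue
        averageValue: "100"               # assumed target requests/s per replica
```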

Option D: The text of this option was not included in the source material, so it cannot be evaluated here.

Option E: Expose the NGINX stats endpoint and configure the horizontal pod autoscaler to use the request metrics exposed by the NGINX deployment. This option exposes NGINX's status endpoint (the stub_status module in open-source NGINX) and feeds its request metrics to the HPA, typically via an exporter and a custom metrics pipeline. Scaling on a traffic SLI measured at the application itself gives near-real-time visibility into load and can help identify bottlenecks. However, the stats endpoint must be locked down, since exposing it publicly poses a security risk. A sketch follows below.
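Here is a minimal sketch of exposing the endpoint via a ConfigMap mounted into the NGINX pods. The server block, port, and allow-list are assumptions, and a metrics exporter (for example, a Prometheus-style sidecar scraping /stub_status) plus a custom metrics adapter would still be needed before the HPA can consume the numbers:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-status-conf         # hypothetical name
data:
  status.conf: |
    # Served on a separate port so it is never exposed through the GCLB.
    server {
        listen 8081;
        location /stub_status {
            stub_status;          # built-in request/connection counters
            allow 127.0.0.1;      # restrict to the pod's own sidecar
            deny all;
        }
    }
```

The exporter sidecar would then publish per-pod request rates that an HPA using a Pods-type custom metric can target.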

In conclusion, the best way to scale the NGINX-based application's frontend in GKE with an appropriate SLI depends on the application's specific requirements and the desired level of automation. Option B and Option E are both viable, depending on the use case: Option B pairs pod right-sizing with cluster-level scaling, while Option E scales on a traffic SLI measured at the application itself.