Maintaining Instance Group Sizes for Autoscaling in Google Cloud

Properly Maintaining Instance Group Sizes

Question

You are running an application on multiple virtual machines within a managed instance group and have autoscaling enabled.

The autoscaling policy is configured so that additional instances are added to the group if the CPU utilization of instances goes above 80%.

VMs are added until the instance group reaches its maximum limit of five VMs or until CPU utilization of instances drops to 80%.

The initial delay for HTTP health checks against the instances is set to 30 seconds.

The virtual machine instances take around three minutes to become available for users.

You observe that when the instance group autoscales, it adds more instances than necessary to support the levels of end-user traffic.

You want to properly maintain instance group sizes when autoscaling.

What should you do?

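Answers

A. Set the maximum number of instances to 1.
B. Decrease the maximum number of instances to 3.
C. Use a TCP health check instead of an HTTP health check.
D. Increase the initial delay of the HTTP health check to 200 seconds.

Correct answer: D.

Explanations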

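The issue with the current setup is a timing mismatch: a new VM needs around three minutes to become available, but the initial delay for the HTTP health check is only 30 seconds. Each new instance is therefore evaluated long before it can serve traffic. It is reported as unhealthy and takes no load, CPU utilization on the existing instances stays above 80%, and the autoscaler keeps adding VMs. The result is more instances than the traffic requires, with the additional cost and resource wastage that implies.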
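The overshoot described in the question can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual Compute Engine autoscaling algorithm: the function name `peak_group_size`, its parameters, and the rule that a booting VM is only "waited for" during the initial delay are all hypothetical simplifications. Scale-in is not modeled.

```python
import math

def peak_group_size(initial_delay_s, startup_s=180, load=1.5, target=0.80,
                    max_vms=5, step_s=10, horizon_s=600):
    """Return the largest group size reached in a toy autoscaling model.

    load is the total demand, expressed in units of one VM's full CPU capacity.
    """
    ages = [startup_s]  # one VM is already serving at t=0 (ages in seconds)
    for _ in range(0, horizon_s, step_s):
        serving = sum(1 for a in ages if a >= startup_s)
        # Assumption: a booting VM counts as capacity "on its way" only while
        # it is still inside the initial delay window.
        awaited = sum(1 for a in ages if a < min(initial_delay_s, startup_s))
        desired = math.ceil(load / target)       # VMs needed to reach the target
        shortfall = desired - (serving + awaited)
        room = max_vms - len(ages)               # never exceed the group maximum
        ages.extend([0] * max(0, min(shortfall, room)))
        ages = [a + step_s for a in ages]
    return len(ages)

print(peak_group_size(initial_delay_s=30))   # delay shorter than the startup time
print(peak_group_size(initial_delay_s=200))  # delay longer than the startup time
```

In this model the demand needs only two VMs. With a 30-second delay the group climbs to its five-VM maximum before the first new VM is ready, while a 200-second delay leaves it at two.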

To properly maintain instance group sizes when autoscaling, each new instance must be given enough time to start before its health is evaluated. Because the instances take around three minutes (180 seconds) to become available, the initial delay should be somewhat longer than that. Increasing the initial delay of the HTTP health check to 200 seconds, as suggested in option D, achieves this.

Switching to a TCP health check, as suggested in option C, does not help. A TCP health check only verifies that a connection can be opened on a specific port, without sending any application-level data. An HTTP health check sends an HTTP request and expects a successful response, so it is the stricter test of whether the application is ready to serve traffic. With either protocol, an instance checked 30 seconds after creation is still about two and a half minutes away from being ready.

Adjusting the autoscaling policy itself is a separate matter. It is possible to scale on a combination of signals, such as CPU utilization together with a traffic-based metric, or to use predictive autoscaling, but none of these changes accounts for the three-minute startup time that causes the overshoot here.
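As a rough illustration of combining signals: when a Compute Engine autoscaler is given several signals, it follows the one that recommends the most VMs. The sketch below mimics that rule. The helper names and the requests-per-second figures are hypothetical and chosen only for the example.

```python
import math

def size_for_signal(total_value, target_per_vm):
    """VMs needed so that each VM carries at most target_per_vm of the signal."""
    return max(1, math.ceil(total_value / target_per_vm))

def recommended_size(signals, max_vms=5):
    """signals is a list of (total_value, target_per_vm) pairs.

    The largest per-signal recommendation wins, capped at the group maximum.
    """
    return min(max_vms, max(size_for_signal(v, t) for v, t in signals))

cpu = (1.5, 0.80)       # 1.5 VMs' worth of CPU demand, 80% target per VM
traffic = (350, 100)    # hypothetical: 350 requests/s, 100 requests/s per VM
print(recommended_size([cpu, traffic]))
```

Here CPU alone would ask for two VMs and the traffic signal for four, so the group is sized to four.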

Option A (setting the maximum number of instances to 1) and option B (decreasing the maximum number of instances to 3) are not ideal solutions, as they restrict the ability of the instance group to scale up to meet demand. Option C (using a TCP health check instead of an HTTP health check) changes how health is probed but not when, so new instances would still be evaluated before they are ready. Therefore, option D (increasing the initial delay of the HTTP health check to 200 seconds) is the best solution.