Monitoring and Troubleshooting API Performance

Identifying the Slowest Service in Your Microservices-based Application

Question

A small number of API requests to your microservices-based application take a very long time.

You know that each request to the API can traverse many services.

You want to know which service takes the longest in those cases.

What should you do?

Answers

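A. Set timeouts on your application so that requests fail faster.

B. Send custom metrics for each request to Stackdriver Monitoring.

C. Use Stackdriver Monitoring to look for insights when API latencies are high.

D. Instrument your application with Stackdriver Trace to break down request latencies at each microservice.

Explanations

Correct answer: D.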

https://cloud.google.com/trace/docs/quickstart#find_a_trace

In a microservices-based application, a single API request can traverse multiple services, so when a small number of requests are very slow, end-to-end latency alone does not show which service is responsible. To find the bottleneck, instrument the application with a distributed tracing tool that records how long each service spends on a request.

Option A suggests setting timeouts on the application so that requests fail faster. This bounds how long a user waits, but a timed-out request still does not reveal which service caused the delay.

Option B suggests sending custom metrics for each request to Stackdriver Monitoring (now Cloud Monitoring). This can reveal patterns in request latency and show which requests take the longest, but a per-request metric does not break that latency down by service.
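To make the limitation of a per-request metric concrete, here is a minimal sketch using made-up end-to-end latencies. The request IDs, the numbers, and the `slow_requests` helper are all hypothetical, and nothing here calls the Monitoring API. It finds the slow request, but there is no per-service information in the data to explain why it was slow.

```python
from statistics import median

# Hypothetical end-to-end latencies in milliseconds, one value per API request.
request_latency_ms = {
    "req-1": 120,
    "req-2": 135,
    "req-3": 110,
    "req-4": 4200,
    "req-5": 125,
}


def slow_requests(latencies, factor=10):
    """Return the IDs of requests slower than `factor` times the median latency."""
    threshold = factor * median(latencies.values())
    return sorted(rid for rid, ms in latencies.items() if ms > threshold)


outliers = slow_requests(request_latency_ms)
print(outliers)  # identifies the slow request, not the slow service
```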

Option C suggests using Stackdriver Monitoring to look for insights when API latencies are high. This shows when the API is slow, but not which service is responsible.

Option D is the correct answer. It suggests instrumenting the application with Stackdriver Trace (now Cloud Trace) to break down request latencies at each microservice. A trace records the time a request spends in each service, so for a slow request you can see directly which service accounts for most of the latency and focus optimization there.
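The idea behind a per-service latency breakdown can be illustrated with a toy, standard-library-only sketch. This is not the Stackdriver Trace API: `ToyTrace`, `handle_request`, the service names, and the simulated durations are all assumptions made for illustration, and the services are called one after another rather than nested. A real deployment would use a tracing client library that propagates trace context across service boundaries.

```python
import time
from contextlib import contextmanager


class ToyTrace:
    """Records one (service, seconds) span per call made while serving a request."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, service):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the span even if the wrapped call raises.
            self.spans.append((service, time.perf_counter() - start))

    def slowest(self):
        """Return the (service, seconds) pair with the largest duration."""
        return max(self.spans, key=lambda span: span[1])


# Hypothetical services; time.sleep stands in for real work.
SIMULATED_WORK_SECONDS = {"auth": 0.01, "inventory": 0.08, "pricing": 0.02}


def handle_request(trace):
    """Call each simulated service in turn, wrapping every call in a span."""
    for service, seconds in SIMULATED_WORK_SECONDS.items():
        with trace.span(service):
            time.sleep(seconds)


trace = ToyTrace()
handle_request(trace)
service, seconds = trace.slowest()
print(f"slowest service: {service} ({seconds * 1000:.0f} ms)")
```

Because each span carries the service name alongside its duration, the slow request can be attributed to a specific service, which is exactly the information the end-to-end metrics in options B and C lack.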