Reducing Mean Time to Recovery for Deployment Failures | Improve Release Process

Effective Strategies to Reduce Downtime and Improve Mean Time to Recovery

Question

You deploy a new release of an internal application during a weekend maintenance window when there is minimal user traffic.

After the window ends, you learn that one of the new features isn't working as expected in the production environment.

After an extended outage, you roll back the new release and deploy a fix.

You want to modify your release process to reduce the mean time to recovery so you can avoid extended outages in the future.

What should you do? (Choose two.)

Answers

AC.

Mean time to recovery (MTTR) is the average time it takes to restore service after a failure. In this scenario, a new feature did not work as expected in production, and service was restored only after an extended outage, a rollback, and a fix. Each option is assessed below for how much it helps to reduce MTTR:

A. Before merging new code, require 2 different peers to review the code changes. Code reviews help to identify defects before the code is deployed to production. Requiring two different peers to review each change before merging makes it more likely that a defect is caught before release. This measure can help to reduce the mean time to recovery by catching potential issues early and avoiding extended outages caused by code defects.

B. Adopt the blue/green deployment strategy when releasing new code via a CD server. Blue/green deployment uses two identical environments: one (blue) runs the current production code, and the other (green) runs the new code. The new code is tested in the green environment, and once it is deemed stable, traffic is switched from blue to green. If a problem appears after the switch, traffic can be routed back to the blue environment, which is still running the previous release. Rollback is therefore a routing change rather than a redeployment, which directly reduces the mean time to recovery.

C. Integrate a code linting tool to validate coding standards before any code is accepted into the repository. A linting tool enforces coding standards and flags style violations and some common mistakes before code is merged. This improves code quality and catches certain problems early, but it does not directly reduce the mean time to recovery once a failure has reached production.

D. Require developers to run automated integration tests on their local development environments before release. Automated integration tests catch problems in how the components of an application interact. Running them locally before release lets developers find and fix such problems early, which lowers the risk of a failure in production. This improves the quality of the code being deployed, but it does not directly reduce the mean time to recovery once a failure has reached production.

E. Configure a CI server. Add a suite of unit tests to your code and have your CI server run them on commit and verify any changes. Continuous integration (CI) means integrating code changes into a shared repository regularly, often several times a day. With a CI server running a suite of unit tests on every commit, each change is tested automatically and regressions are identified early. This improves the quality of the code being deployed, but it may not directly reduce the mean time to recovery once a failure has reached production.
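The blue/green switch described under option B can be sketched in a few lines. This is only an illustration under stated assumptions: `Environment`, `Router`, and `release_with_rollback` are hypothetical names, and the in-memory `Router` stands in for whatever load balancer or DNS change a real CD server would drive.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Environment:
    """One of the two identical production environments."""
    name: str
    release: str


@dataclass
class Router:
    """Hypothetical traffic router; stands in for a load balancer or DNS switch."""
    live: Environment
    idle: Environment

    def deploy_to_idle(self, release: str) -> None:
        # The new release goes to the idle environment; live traffic is untouched.
        self.idle.release = release

    def switch(self) -> None:
        # Cut over (or roll back) by swapping which environment receives traffic.
        self.live, self.idle = self.idle, self.live


def release_with_rollback(
    router: Router,
    release: str,
    health_check: Callable[[Environment], bool],
) -> bool:
    """Deploy to idle, cut over, and switch back if the health check fails."""
    router.deploy_to_idle(release)
    router.switch()
    if health_check(router.live):
        return True
    # The previous release is still running, so recovery is a single swap.
    router.switch()
    return False


# Simulate a release "v2" whose health check fails after the cutover.
router = Router(live=Environment("blue", "v1"), idle=Environment("green", "v1"))
ok = release_with_rollback(router, "v2", lambda env: env.release != "v2")
print(ok, router.live.name, router.live.release)  # False blue v1
```

The point of the sketch is that the rollback path never rebuilds or redeploys anything: the old environment keeps running, so recovery time is bounded by how quickly traffic can be switched back.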

In summary, the measures identified above as reducing the mean time to recovery are requiring peer code reviews before merging new code and adopting the blue/green deployment strategy. Code linting, local integration tests, and CI unit tests improve the quality of the code being deployed, but they do not directly shorten recovery once a failure has reached production.
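To make the metric concrete, mean time to recovery can be computed as the total recovery time divided by the number of incidents. The sketch below uses invented incident durations, not figures from the scenario, to compare recovering by rebuilding and redeploying with recovering by switching traffic back to a still-running environment. The function name `mean_time_to_recovery` is an assumption made for this example.

```python
from datetime import timedelta


def mean_time_to_recovery(recovery_times):
    """Average the time from failure to restored service across incidents."""
    if not recovery_times:
        raise ValueError("need at least one incident")
    return sum(recovery_times, timedelta()) / len(recovery_times)


# Hypothetical incident durations, chosen only to illustrate the arithmetic.
rebuild_and_redeploy = [timedelta(hours=4), timedelta(hours=6)]
traffic_switch_back = [timedelta(minutes=5), timedelta(minutes=15)]

print(mean_time_to_recovery(rebuild_and_redeploy))  # 5:00:00
print(mean_time_to_recovery(traffic_switch_back))   # 0:10:00
```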