
SRE Maturity Assessment

Background of the initiative

Because it is not practical to embed SREs in every product, we were looking for a way to promote SRE practices across all products. In addition, without data or indicators that gave a bird's-eye view of the whole organization, we could not allocate resources efficiently and often fell behind in risk management. We developed the SRE maturity assessment to address both problems.

What is the SRE Maturity Assessment?

The assessment is modeled on Capability Maturity Model Integration (CMMI) and is designed to give an overview of the entire business division in the form of data.
We built the list of assessment items from the fault lines of service reliability and kept it as simple as possible so that it is easy to evaluate.
Figure 1. Maturity Overview

What can an SRE maturity assessment do?

The SRE maturity assessment lets us promote SRE (including enablement) across all products rather than one product at a time.
Knowing where a product stands today also makes it easier to create improvement plans and move closer to that product's ideal state.

SRE maturity assessment process

The SRE maturity assessment is carried out in four main steps:
  1. Preparation
  2. Evaluation and Planning
  3. Implement improvements
  4. Review

1. Preparation

Before conducting an SRE maturity assessment, we explain the concept of the assessment, the application flow, and the Level 3 guidelines.
  • Level 3 Guidelines
    • Questions that give a perspective for considering best practices for each item
    • Level 3 = the ideal state of each product
    • Because the ideal state differs from product to product, a product does not need to satisfy every guideline.

2. Evaluation and Planning

Using Level 3 of each item as a reference, we reach a shared understanding of each item's current maturity level and how it compares with the ideal state. (Note: Level 3 of each item will be shared around June 2023.)
Once that understanding is aligned, the last part of this step is to create an improvement plan. Start with a quarterly improvement plan, then break it down into action items and owners. If monitoring, incident response, and postmortems are at Level 1, we recommend prioritizing an improvement plan for them.
Figure 3. SRE Maturity Assessment Sheet
Figure 4. SRE Maturity Improvement Plan
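
The prioritization rule in this step can be written down as a small check. The following is only a minimal sketch, not the actual assessment sheet: the item keys, the `Assessment` class, and both function names are hypothetical, and it assumes that any one of the three core items being at Level 1 is enough to prioritize the plan (the guideline could also be read as requiring all three).

```python
from dataclasses import dataclass

# Hypothetical keys for the three items named in the guideline.
CORE_ITEMS = ("monitoring", "incident_response", "postmortem")


@dataclass
class Assessment:
    product: str
    levels: dict[str, int]  # item name -> current maturity level


def core_items_at_level_1(assessment: Assessment) -> list[str]:
    """Return the core items whose current level is exactly 1."""
    return [item for item in CORE_ITEMS if assessment.levels.get(item) == 1]


def needs_priority_plan(assessment: Assessment) -> bool:
    """Assumption: one core item at Level 1 is enough to prioritize."""
    return len(core_items_at_level_1(assessment)) > 0


if __name__ == "__main__":
    # Illustrative data only; not taken from a real product.
    example = Assessment(
        product="example-product",
        levels={"monitoring": 1, "incident_response": 2, "postmortem": 3},
    )
    print(core_items_at_level_1(example), needs_priority_plan(example))
```

Items outside the three core ones are ignored by this check on purpose, so a Level 1 in another area does not by itself trigger prioritization.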

3. Implement improvements

We improve the maturity level of each item by drawing on knowledge from other services. We also provide templates for postmortems and incident response that can be used immediately.
Figure 5. Knowledge Database

4. Review

After implementing improvements, hold a review every quarter or every six months and revise the improvement plan. Start with quarterly reviews; if the operational load is too high, switch to semi-annual reviews.

What we gained from the SRE Maturity Assessment

  • We can now get a bird's-eye view of the entire business division as data.
  • It is easier to determine which products and improvements should be prioritized.
  • We learned about internal practices that we had not been aware of.
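
To illustrate what a "bird's-eye view as data" could look like, here is a minimal sketch that aggregates per-product levels. The data layout and function names are hypothetical, and the gap score is a deliberate simplification: it treats Level 3 as the target for every item, although the ideal state differs per product and not every item has to be satisfied.

```python
from collections import defaultdict

TARGET_LEVEL = 3  # Level 3 is described above as the ideal state


def gap_by_product(assessments: dict[str, dict[str, int]]) -> dict[str, int]:
    """Sum of the remaining levels to the target across all items of a product."""
    return {
        product: sum(max(TARGET_LEVEL - level, 0) for level in levels.values())
        for product, levels in assessments.items()
    }


def products_below_target_by_item(
    assessments: dict[str, dict[str, int]],
) -> dict[str, list[str]]:
    """For each item, list the products that have not reached the target."""
    below: dict[str, list[str]] = defaultdict(list)
    for product, levels in assessments.items():
        for item, level in levels.items():
            if level < TARGET_LEVEL:
                below[item].append(product)
    return dict(below)


if __name__ == "__main__":
    # Illustrative data only; not taken from real products.
    data = {
        "product-a": {"monitoring": 1, "postmortem": 3},
        "product-b": {"monitoring": 2, "postmortem": 2},
    }
    print(gap_by_product(data))
    print(products_below_target_by_item(data))
```

The first function supports choosing which products to prioritize, and the second shows which items are weak across the division; both are starting points for discussion rather than a scoring method taken from the assessment itself.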