Starting Out as a Self-Proclaimed DBRE

This article is the entry for day 24 of the DBRE Advent Calendar 2024.
 
This is Onkai Yuta (@fat47) from the Service Reliability Group (SRG) of the Media Headquarters.
#SRG (Service Reliability Group) is a group that mainly provides cross-functional support for the infrastructure of our media services, improving existing services, launching new ones, and contributing to OSS.
 
This article describes my first attempts at working as a DBRE.

What is DBRE?


DBRE (Database Reliability Engineering) refers to the engineering that improves the reliability of databases and increases the availability of the entire system. Compared to SRE (Site Reliability Engineering), DBRE is not yet as well known, but recognition is gradually spreading.

There is no clear role for DBRE in our company.


As far as I know, within the CyberAgent Group, there is no independent role or organization called DBRE. I don't think there is any dedicated DBA position either.
*I apologize if this actually exists and I just don't know about it.
 
I belong to the cross-functional SRE team for the media business, and I am mainly responsible for MySQL-related tasks. Although I am not particularly struggling with the current situation, I thought that there might be something to be gained by identifying myself as a DBRE, so starting this year I have been focusing on DBRE-like activities.

Trying out activities as a self-proclaimed DBRE


Aurora MySQL Upgrade Knowledge Sharing Activity

This year, I was in charge of upgrading Ameba's Aurora MySQL. The details are summarized in the following blog post, so please take a look.
 
In order to share the knowledge gained from this upgrade with the organization, we created a Slack channel within the CyberAgent Group where people can ask questions about Aurora.
Each business division and subsidiary had an SRE or similar organization, and each was dealing with the Aurora upgrade, so this channel was used to share knowledge and provide advice.
 
I also posted articles about my newly gained knowledge on the SRG Portal Blog, and have written over 15 blog posts over the past year. Some of the articles have even been included in Hatena Bookmark's Hot Entries, drawing the attention of many people both inside and outside the company. I hope that I have been able to contribute, even if only a little, to improving the reliability of the database.
 

Improved development productivity with automated documentation of MySQL environments

Ameba uses a microservices architecture and has a large number of microservices.
As a result, many MySQL clusters are in operation, and information about them was not comprehensively documented.
Even where documentation existed, it was often left to go stale, which made it take longer to onboard new developers and to get up to speed when developing new features.
Therefore, we have created an environment that automatically updates table information and ER diagrams for running MySQL clusters.
We used tbls to generate the ER diagrams.
The Markdown documents generated by tbls are converted to HTML using mkdocs.
The tbls config (YAML) file is managed on GitHub, and the Aurora MySQL password is obtained from AWS Secrets Manager.
The generated HTML files are uploaded to S3, and users access them via CloudFront.
A rough diagram of the configuration looks like this. Although it is not shown in the diagram, OIDC authentication is performed at CloudFront, which restricts access to authorized developers.
Because this runs daily, the latest schema information is now always available as documentation without any manual updates.
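The flow described above could be sketched roughly as follows. This is only a minimal illustration, not our actual job: the function names, the secret ID, the bucket URL, and the assumption that the secret is a JSON document with "username", "password", "host", and "port" keys are all hypothetical. It also assumes that tbls, mkdocs, and the AWS CLI are installed on the machine that runs it, that .tbls.yml and mkdocs.yml are in the working directory, and that the connection string is passed to tbls through the TBLS_DSN and TBLS_DOC_PATH environment variables.

```python
import json
import os
import subprocess
from urllib.parse import quote


def build_dsn(secret_string, dbname):
    """Build a MySQL DSN for tbls from a Secrets Manager secret.

    Assumption: the secret is JSON with "username", "password",
    "host" and "port" keys. Adjust this if your secret is laid out differently.
    """
    secret = json.loads(secret_string)
    user = quote(secret["username"], safe="")
    # Percent-encode the password so characters like "@" or "/" do not break the URL.
    password = quote(secret["password"], safe="")
    return f"mysql://{user}:{password}@{secret['host']}:{secret['port']}/{dbname}"


def build_steps(site_dir, bucket_url):
    """Return the commands for one documentation run, in execution order."""
    return [
        # Regenerate the Markdown docs and ER diagrams, clearing old output first.
        ["tbls", "doc", "--rm-dist"],
        # Convert the Markdown to static HTML.
        ["mkdocs", "build", "--site-dir", site_dir],
        # Publish to S3, deleting files for tables that no longer exist.
        ["aws", "s3", "sync", site_dir, bucket_url, "--delete"],
    ]


def fetch_secret_string(secret_id):
    """Read the secret from AWS Secrets Manager (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3 installed

    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=secret_id)["SecretString"]


def run_daily(secret_id, dbname, doc_dir, site_dir, bucket_url):
    """Run the whole pipeline once; a scheduler would call this every day."""
    dsn = build_dsn(fetch_secret_string(secret_id), dbname)
    # Passing the DSN via the environment keeps the password out of the command line.
    env = dict(os.environ, TBLS_DSN=dsn, TBLS_DOC_PATH=doc_dir)
    for command in build_steps(site_dir, bucket_url):
        subprocess.run(command, check=True, env=env)
```

A scheduled job could then call something like run_daily("example/aurora-docs", "exampledb", "docs/schema", "site", "s3://example-bucket/db-docs") once per cluster, where every argument is a placeholder. Using check=True stops the run at the first failing step, so a failed tbls run does not publish partial documents.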
Quoted from the tbls sample

Conclusion


It's been a year since I started working as a self-proclaimed DBRE, and I think I've started to see results, albeit slowly.
There are many things I want to work on next year, so I would like to continue working on improving database reliability and development productivity step by step!
 
SRG is looking for people to work with us. If you're interested, please contact us here.