Organizing the various replication methods in MySQL 8.0

This is Onikai (@fat47) from the Service Reliability Group (SRG) of the Technology Division.
#SRGThe Service Reliability Group primarily provides comprehensive support for the infrastructure surrounding our media services, focusing on improving existing services, launching new ones, and contributing to open-source software (OSS).
This article is located within SRG.DBWGThe DB Working Group is releasing database-related materials that it provides to the entire company.
I hope this is of some help.
 

What is replication?


Replication is a mechanism that allows you to have copies of your data on another server.
This feature has been implemented since the early versions of MySQL.
The master server accepts update queries and forwards the changes to the slaves.
The slave accepts changes from the master, but is only accessible to the client as read-only.
💡
The terms "master" and "slave" have been changed to "source" and "replica."
マスター/スレーブの構成
Master/slave configuration
 

Types of replication configurations and common uses


Replication can be configured in various ways depending on the purpose.

Multi-stage replication

This configuration is one where the slave unit in a standard master-slave setup also takes on the role of the master.
They are also called grandchild slaves.
孫スレーブの構成
Configuration of grandchild slaves
One example of using this configuration is when performing a rolling upgrade of the MySQL version.
孫スレーブ構成を利用したアップグレード
Upgrade using a grandchild slave configuration
By building such a multi-stage slave configuration and promoting the slaves to masters, it becomes possible to upgrade to MySQL 5.7.

1:N configuration

This configuration involves one master unit and multiple slave units.
This is the most standard configuration.
It is primarily used to distribute the load on the system.
1:Nの構成
1:N configuration
It's also common to group slaves by the load balancer's virtual VIP and create user-inaccessible slaves for backup purposes.
Additionally, slaves become candidates for promotion when a master unit fails.
1:N構成での利用例
Example of use in a 1:N configuration

N:1 configuration

This configuration allows one slave unit to receive data from multiple masters.
We don't have many use cases at our company.
For example, if you use a slave machine exclusively for backups, you can reduce the number of backup servers because one slave machine can hold multiple copies of data.
Alternatively, it is possible to reconsolidate databases that are horizontally partitioned using sharding into a single slave database with the same schema.
 

About asynchronous and semi-synchronous replication


There are two types of replication: asynchronous replication and semi-synchronous replication.
Each system differs in the extent to which it guarantees update synchronization, resulting in differences in update performance and data persistence in the event of a master crash.
By default, asynchronous replication is enabled, and for most web services, asynchronous replication should be fine.

Asynchronous replication

The synchronization of updates between the master data and the binary log is guaranteed.
This setting does not guarantee the subsequent propagation process to slaves.
In other words, if the master database crashes, there may be data that has not been propagated to the slave database.

Semi-synchronous replication

Synchronization is guaranteed for updates to data and binary logs on the master, and to the relay logs of at least one slave. This guarantee is not guaranteed for slaves other than the first one.
This behavior was implemented starting with MySQL 5.5.
Since the update is considered complete only after the relay log update on one slave device is finished, the update throughput will be reduced.

About group replication


This feature was implemented in MySQL 5.7.
A group of three or more machines will be formed, and all machines will have the same data.
There are "primary" accounts that accept updates and "secondary" accounts that do not.
In group replication, there is only one primary server.Single Primary Mode"and
All servers become primaryMulti-primary mode" exists.
When the primary server receives an update SQL, the group communication engines of each server perform a certification process and then write the data to the relay log.
Certification is a process for checking write conflicts.
In conflicting transactions, only the transaction that commits first is marked as successful, and all others are marked as failures to maintain consistency.
The certification process is performed synchronously, while the data is updated on each node asynchronously.
グループレプリケーション
Group Replication
 

About InnoDB Cluster


This feature was implemented in MySQL 5.7.
The configuration will consist of a MySQL Router, MySQL Shell, and group replication.
MySQL Router is a product that transparently routes connections to MySQL servers.
MySQL Shell is a client with a variety of functions.
InnoDB Cluster
InnoDB Cluster
This configuration allows for automatic database promotion through group replication and automatic changes to the application's database target through MySQL Router functionality.
This is recommended when high availability is a priority. However, it comes with overhead and limitations in update processing, so prior testing is essential.
 

About InnoDB ReplicaSet


This is a new feature implemented in MySQL 8.0.19.
The configuration will consist of MySQL Router, MySQL Shell, and asynchronous replication.
GTID is required for asynchronous replication in this case.
InnoDB ReplicaSet
InnoDB ReplicaSet
This configuration is suitable for load balancing based on reference traffic.
There is no automatic failover function.
 

Comparison of replication types and their functionalities


The table above compares the features of each replication method described above.
With previous asynchronous and semi-synchronous replication methods, automatic failover and application redirection required the use of separate solutions (such as HA Proxy, LB, or MHA), but InnoDB Cluster can now handle all of these using only official MySQL features.
However, it's important to note that there are significant overhead costs during updates and many other limitations.
 

summary


Many people probably only use asynchronous replication, but there are actually many different types of replication available to suit various purposes.
Let's at least be aware of its existence so we can choose the optimal replication method.
 

In conclusion


SRG is looking for new team members. If you are interested, please contact us here.