MySQL online DDL and pt-online-schema-change
This is Onikai (@fat47) from the Service Reliability Group (SRG) of the Technology Division.
#SRGThe Service Reliability Group primarily provides comprehensive support for the infrastructure surrounding our media services, focusing on improving existing services, launching new ones, and contributing to open-source software (OSS).
This article is located within SRG.DBWGThe DB Working Group is releasing database-related materials that it provides to the entire company.
I hope this is of some help.
overviewconclusionmethodUsing official MySQL features (MySQL 5.6 and later)Use pt-online-schema-changeIn conclusion
overview
Adding new service specifications or developing new features may require DDL (Data Definition Language) operations, such as adding columns or indexes to existing tables.
If you perform a normal DDL operation while the service is running at that time, a table lock may occur, potentially leading to a serious service failure.
By using the method described below, you can prevent table locking and perform DDL operations online.
conclusion
AWS AuroraIn this case:
- Use MySQL's official online DDL function.
- If this is not possible for the reasons described below, the next best option is pt-online-schema-change.
RDS(MySQL) ,orPlain MySQL (5.5~5.7)In this case:
- Use pt-online-schema-change
method
Using official MySQL features (MySQL 5.6 and later)
With the online DDL feature implemented in MySQL 5.6, some operations can now be performed without locking.
In short, the following operations can be performed online:
- OPTIMIZE TABLE
- ADD INDEX
- CREATE INDEX
- DROP INDEX
- Adding a column
- Deleting columns (beware of high costs)
- Adding/removing foreign key constraints
Furthermore, MySQL 8.0.12 introduced a feature called INSTANT DDL, which allows for faster DDL operations by updating only the metadata.
However, the following constraints apply when executing with ALGORITHM=INSTANT.
- Adding columns cannot be combined with other ALTER TABLE actions that do not support ALGORITHM=INSTANT in the same statement.
- Columns can only be added as the last column in a table. Adding columns to any other position between other columns is not supported.
- You cannot add columns to a table that uses ROW_FORMAT=COMPRESSED.
- You cannot add columns to a table that contains a FULLTEXT index.
- Columns cannot be added to temporary tables. Only ALGORITHM=COPY is supported for temporary tables.
- You cannot add columns to a table that exists in a data dictionary tablespace.
- Row size limits are not evaluated when adding columns. However, row size limits are checked during DML operations that insert and update rows in a table.
In version 8.0.29 and later, when executing an ALTER statement, if the operation supports INSTANT, INSTANT will be used by default even if ALGORITHM is not specified.
Additionally, the following features have been added:
- It's possible to add columns not only at the end but also at other locations.
- Compatible with DROP COLOMN
For more details, please refer to the online DDL section in the official documentation.
Points to note when using this online DDL function:
Executing ALGORITHM=INSTANT on a large table when it cannot be executed will result in replication delays.
That's the point.
If a DDL operation on the master DB takes 20 minutes, it will be executed on the slave DB after it has finished on the master.
That directly translates to a 20-minute replication delay.
This does not apply if your execution environment is Amazon RDS Aurora, as the architecture is different.
To prevent this, first execute the DDL on the slave database, then temporarily disable binlog output on the master database and execute the DDL there as well.
This is generally called a Rolling Schema Upgrade. However, this method also has its drawbacks.
Since the slave has already performed the column type change or column addition,
It's necessary to check beforehand whether the application's structure is still acceptable under those circumstances.
If these points are unacceptable, you can use the tools made by Percona, which will be discussed later.
Use pt-online-schema-change
pt-online-schema-change (hereinafter referred to as pt-osc) is a tool included in the Percona Toolkit that executes ALTER statements, which normally incur table locks, without locking.
The architecture and details of this tool are explained on the following page, so please refer to it.
Please refer to the official documentation for instructions on how to install the Percona Toolkit.
Using pt-osc allows for online DDL execution in MySQL 5.5, which lacks online DDL functionality, and
Even DDL statements with long execution times can be executed in a way that minimizes replication delays.
Execution example when adding a column
This tool also has limitations and cannot be used in all situations.
Please test it in a verification environment beforehand to confirm.
In conclusion
SRG is looking for new team members.
If you are interested, please contact us here.
