Recurrence of tables becoming invisible to readers during DDL execution, an issue that was supposed to have been resolved in Aurora MySQL 3.04.2

This is Onkai Yuta (@fat47) from the Service Reliability Group (SRG) of the Media Headquarters.
#SRG (Service Reliability Group) mainly provides cross-functional support for the infrastructure of our media services: improving existing services, launching new ones, contributing to OSS, and so on.
This article describes a recurrence of the behavior where "tables become invisible to readers while DDL is being executed," which was supposed to have been resolved in Aurora MySQL 3.04.2.
I hope this helps in some way.
 

Table becomes invisible to readers during online DDL


As I wrote in an earlier blog post, while an online DDL with ALGORITHM=INPLACE was running, the table could become invisible to readers.
 
This issue was resolved in Aurora MySQL 3.04.2, 3.05 and above.

Recurrent cases


The following article was posted on freee's blog on July 11, 2024.
 
I will quote some of the reproduction conditions below.
1a: The table has not been accessed via a reader since the problematic DDL was executed on the table.
1b: The table has not been accessed via a reader since it was created.
2: If either 1a or 1b is true, run the DDL in question on the table.
3: Access the table from the reader during DDL execution → Table cannot be opened
 
It seems that if you run an online DDL again on a table that has already had an online DDL executed on it, and no reader has accessed the table in between, the table becomes invisible to readers while that DDL is running.

Reproduction test in our own environment


Creating a table for reproduction

Insert approximately 1 million records.
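
The table definition and load script are not shown here. As a minimal sketch, assuming a hypothetical table named `ddl_test` with an auto-increment key and a single `VARCHAR` column, the following Python builds the `CREATE TABLE` statement and batched `INSERT` statements for a given row count. The table name, columns, and batch size are all assumptions for illustration, not the schema actually used.

```python
from typing import Iterator

# Hypothetical table name and schema; the original definition is not shown.
TABLE = "ddl_test"

CREATE_TABLE_SQL = (
    f"CREATE TABLE {TABLE} ("
    "id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY, "
    "payload VARCHAR(64) NOT NULL"
    ") ENGINE=InnoDB"
)


def insert_batches(total_rows: int, batch_size: int = 10_000) -> Iterator[str]:
    """Yield multi-row INSERT statements that together add total_rows rows."""
    for start in range(0, total_rows, batch_size):
        end = min(start + batch_size, total_rows)
        # One "('row-N')" tuple per row in this batch.
        values = ", ".join(f"('row-{i}')" for i in range(start, end))
        yield f"INSERT INTO {TABLE} (payload) VALUES {values}"
```

Calling `insert_batches(1_000_000)` yields 100 statements of 10,000 rows each, which you would execute against the writer endpoint with your own MySQL client.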

Execute online DDL on the writer endpoint
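
The exact statement is not shown here. As a sketch, assuming the hypothetical `ddl_test` table and an `ADD COLUMN` change, an online DDL that explicitly requests the in-place algorithm could be built as follows. `ALGORITHM=INPLACE` and `LOCK=NONE` are standard MySQL `ALTER TABLE` clauses; the table name, column name, and column type are assumptions.

```python
def build_online_ddl(table: str, column: str) -> str:
    """Return an ALTER TABLE statement that requests the INPLACE algorithm.

    table and column are hypothetical names and must be trusted identifiers,
    because they are interpolated into the statement without escaping.
    """
    return (
        f"ALTER TABLE {table} "
        f"ADD COLUMN {column} INT NULL, "
        "ALGORITHM=INPLACE, LOCK=NONE"
    )
```

For example, `build_online_ddl("ddl_test", "c1")` produces a statement to run on the writer endpoint; using a different column name for each run lets the same helper serve the second and third executions.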

 

Executing SELECT on reader endpoint while online DDL is running

At this point no problem occurs, which matches the earlier verification.
 

Execute online DDL for the second time on the writer endpoint

Once it completes, run the third one immediately, without running SELECT from the reader.
 

Execute the third online DDL on the writer endpoint

 

Executing SELECT on the reader endpoint during the third online DDL

I can't see the table anymore!
Of course, it will become visible again once the online DDL is complete.
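
To observe this window without typing SELECT by hand, a small probe can poll the reader endpoint while the DDL runs. The sketch below is driver-agnostic: `run_query` is a hypothetical callable that you would wire to your own MySQL client connected to the reader endpoint, and any exception it raises is recorded as "table not visible". It only illustrates the polling logic and is not the procedure actually used in this verification.

```python
import time
from typing import Callable, List, Tuple


def probe_reader(
    run_query: Callable[[str], object],
    table: str,
    attempts: int = 10,
    interval_sec: float = 1.0,
) -> List[Tuple[int, bool, str]]:
    """Run a cheap SELECT repeatedly and record whether the table was visible.

    run_query is a hypothetical hook: it should execute SQL against the reader
    endpoint and raise an exception when the query fails.
    Returns a list of (attempt number, visible, error message).
    """
    results: List[Tuple[int, bool, str]] = []
    for attempt in range(1, attempts + 1):
        try:
            run_query(f"SELECT 1 FROM {table} LIMIT 1")
            results.append((attempt, True, ""))
        except Exception as exc:  # driver-specific errors vary, so catch broadly
            results.append((attempt, False, str(exc)))
        if attempt < attempts:
            time.sleep(interval_sec)
    return results
```

Started just before the third DDL, a probe like this would be expected to record failures while the DDL runs and successes again once it completes.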
 
In other words, to run online DDL safely in the current environment, keep the following in mind:
  • Do not execute consecutive online DDLs on the same table
    • Leave a gap between them, or intentionally read the table from the reader in between
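
The guideline above can be sketched as a small planning step. Assuming DDLs are queued as (table, statement) pairs, the hypothetical helper below inserts a read against the reader endpoint before any DDL on a table that was already altered earlier in the same plan. The `writer`/`reader` labels and the warm-up query are assumptions of this sketch, and whether a single read is always sufficient has not been confirmed.

```python
from typing import List, Set, Tuple

# (endpoint label, sql); "writer" and "reader" are labels used only by this sketch.
Step = Tuple[str, str]


def plan_with_reader_reads(ddls: List[Tuple[str, str]]) -> List[Step]:
    """Insert a reader-side SELECT before any repeated DDL on the same table.

    ddls is a list of (table, ddl_statement) pairs in execution order.
    """
    plan: List[Step] = []
    already_altered: Set[str] = set()  # tables that had a DDL earlier in this plan
    for table, ddl in ddls:
        if table in already_altered:
            # Touch the table from the reader before altering it again.
            plan.append(("reader", f"SELECT 1 FROM {table} LIMIT 1"))
        plan.append(("writer", ddl))
        already_altered.add(table)
    return plan
```

Tracking tables per name rather than only adjacent statements means a sequence such as DDL on `t1`, DDL on `t2`, DDL on `t1` still gets a reader read before the second `t1` change.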

What about other versions of Aurora MySQL?


[Conclusion] Reproduced in all versions

3.04.3 (compatible with MySQL 8.0.28) → Reproduced

The Japanese release notes have not been published yet, but 3.04.3, the latest minor version of the LTS release, was released on June 26, 2024.
The issue was reproduced in this version as well.
 

3.05.2 (compatible with MySQL 8.0.32) → Reproduced

3.06.1 (compatible with MySQL 8.0.34) → Reproduced

 

3.07.0 (compatible with MySQL 8.0.36) → Reproduced
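
To check whether a cluster runs one of the versions tested above, the engine version string can be compared against the list. This is a minimal sketch: it assumes the version was obtained separately (for example from `SELECT AURORA_VERSION()`) and arrives as a plain string such as `3.05.2`. The helper name is hypothetical, and a version missing from the set only means it was not tested here, not that it is unaffected.

```python
# Versions on which the issue was reproduced in this article.
REPRODUCED_VERSIONS = {"3.04.3", "3.05.2", "3.06.1", "3.07.0"}


def is_reproduced_version(aurora_version: str) -> bool:
    """Return True if the given Aurora MySQL version is one tested in this article."""
    return aurora_version.strip() in REPRODUCED_VERSIONS
```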

Conclusion


This was a sad story: an issue we thought had been resolved turned out not to be fully resolved.
Thank you to freee for covering this incident on your blog!
 
We have reported this issue to AWS. We believe they are currently investigating it, and we hope it will be fixed in a future release.
 
SRG is looking for people to work with us. If you are interested, please contact us here.