Considering Rollback After Upgrading an Aurora Cluster with RDS Blue/Green Deployments

This is Yuta Onkai (@fat47) from the Service Reliability Group (SRG) of the Media Headquarters.
The Service Reliability Group (SRG) mainly provides cross-functional support for the infrastructure of our media services: improving existing services, launching new ones, and contributing to OSS.
This article examines whether an Aurora upgrade performed with RDS Blue/Green Deployments can be rolled back.
I hope it is of some help.
 
 

Summary first


  • Is there a way to roll back an upgrade performed with Blue/Green Deployments?
    • In this test, two of the three patterns were rolled back successfully.
  • Regardless of the rollback method, the service must go into maintenance so that writes to the database stop.
    • [Updated September 2024] Following a functional update, maintenance is no longer required for some methods.
  • In practice, fixing the issue on the new version is likely more realistic than rolling the version back.
 

[Updated September 2024]


Our article was featured on Timee's tech blog.
 
Through that article, we learned of an August 2024 AWS blog post on RDS Blue/Green Deployments, which reports that the binlog position at the time of the B/G switchover is now output.

Version upgrades via Blue/Green Deployments


Blue/Green Deployments is a managed feature that creates a copy of a cluster, keeps that copy in sync through replication, and switches clusters at the push of a button.
For more information, please refer to the blog below, which was published when Blue/Green Deployments was released.
 
Using this feature, you can create a Green cluster (Aurora MySQL 3 series) from an active cluster (Aurora MySQL 2 series) and upgrade by switching over to it.
 
However, if you upgrade this way and later discover a critical issue, reverting to the original Aurora Version 2 is difficult.
 
Even so, we tested whether there is some way to roll back.

Three rollback patterns were tested


  • How to replicate in the reverse direction from Green (MySQL 8.0) to Blue (MySQL 5.7)
  • How to take a logical backup from Green (MySQL 8.0) and apply it to Blue (MySQL 5.7)
  • How to generate differentials from the binlog at the quiescent point of Green (MySQL 8.0) and apply them to Blue (MySQL 5.7)
 

Creating a verification environment


  • Create an Aurora Version 2 cluster
  • Create a table for verification and add records
    • Create the database and table
    • Add records
  • Create a Green (MySQL 8.0) cluster using the Blue/Green Deployments feature
  • Extend the binlog retention period to 7 days on both the Green (MySQL 8.0) and Blue (MySQL 5.7) clusters
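The retention setting in the last step is specified in hours. As a minimal sketch, assuming the `mysql.rds_set_configuration` stored procedure that RDS and Aurora MySQL provide for binlog retention, the following Python helper (the function name is hypothetical, not from the original article) builds the statement to run on each cluster:

```python
def binlog_retention_sql(days: int) -> str:
    """Build the CALL statement that sets binlog retention for a number of days."""
    if days <= 0:
        raise ValueError("days must be positive")
    hours = days * 24  # the procedure takes the retention period in hours
    return f"CALL mysql.rds_set_configuration('binlog retention hours', {hours});"


if __name__ == "__main__":
    # 7 days, the retention period used in this verification
    print(binlog_retention_sql(7))
```

The generated statement would be executed once on the Blue cluster and once on the Green cluster with a MySQL client.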

How to replicate in the reverse direction from Green (MySQL 8.0) to Blue (MySQL 5.7)


Overview

The old Blue (MySQL 5.7) cluster is not deleted automatically after a B/G switchover.
We therefore use the old Blue (MySQL 5.7) as the rollback environment.
In this method, replication is configured manually with the switched-over Green (MySQL 8.0) as the source.

Verification steps

  • Stop writes to the database, for example by putting the service into maintenance mode.
    • [Additional note] Due to a functional update, it is no longer necessary to stop writes to the database.
  • Execute the B/G switchover.
  • [Additional note] Check the binlog position at the time of the switchover.
    • Select the B/G deployment.
    • Sort the Recent Events column by time to find the binlog position.
  • Create a replication user on Green (MySQL 8.0).
  • On the old Blue (MySQL 5.7), start replication with Green (MySQL 8.0) as the source.
  • Check the replication status on the old Blue (MySQL 5.7).
  • Add a record to Green (MySQL 8.0) to check operation.
  • Check the replication status on the old Blue (MySQL 5.7) again.
  • Check the records on the old Blue (MySQL 5.7) and confirm that replication is working properly.
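The steps above can be sketched as a small Python helper that turns the binlog coordinates from the switchover event into the SQL to run on each side. This is an illustration under stated assumptions, not the exact commands used in the verification: the wording of the event message, the host name, and the user name are assumptions, and the stored procedures `mysql.rds_set_external_master` and `mysql.rds_start_replication` are the ones Aurora MySQL Version 2 documents for external replication.

```python
import re
from typing import List, Tuple

# Assumed wording of the switchover event shown in Recent Events.
# Adjust the pattern if the actual message differs.
_COORD_RE = re.compile(r"file\s+(?P<file>[\w.\-]+)\s+and\s+position\s+(?P<pos>\d+)")


def parse_binlog_coordinates(event_message: str) -> Tuple[str, int]:
    """Extract (binlog file name, position) from a switchover event message."""
    match = _COORD_RE.search(event_message)
    if match is None:
        raise ValueError("no binlog coordinates found in event message")
    return match.group("file"), int(match.group("pos"))


def sql_quote(value: str) -> str:
    """Quote a MySQL string literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def green_side_sql(user: str, password: str) -> List[str]:
    """Statements to run on Green (MySQL 8.0): create the replication user."""
    account = f"{sql_quote(user)}@'%'"
    return [
        f"CREATE USER {account} IDENTIFIED BY {sql_quote(password)};",
        f"GRANT REPLICATION SLAVE ON *.* TO {account};",
    ]


def old_blue_side_sql(green_host: str, user: str, password: str,
                      binlog_file: str, position: int, port: int = 3306) -> List[str]:
    """Statements to run on the old Blue (MySQL 5.7): point it at Green and start."""
    return [
        "CALL mysql.rds_set_external_master("
        f"{sql_quote(green_host)}, {port}, {sql_quote(user)}, {sql_quote(password)}, "
        f"{sql_quote(binlog_file)}, {position}, 0);",  # final 0 = no SSL encryption
        "CALL mysql.rds_start_replication;",
        "SHOW SLAVE STATUS;",  # check Slave_IO_Running / Slave_SQL_Running
    ]


if __name__ == "__main__":
    # Hypothetical sample message and endpoint, for illustration only.
    message = ("Binary log coordinates in green environment after switchover: "
               "file mysql-bin-changelog.000003 and position 804")
    binlog_file, position = parse_binlog_coordinates(message)
    for stmt in green_side_sql("repl_user", "change-me"):
        print(stmt)
    for stmt in old_blue_side_sql("green.example.internal", "repl_user",
                                  "change-me", binlog_file, position):
        print(stmt)
```

Generating the statements from the recorded coordinates, rather than typing them by hand, reduces the risk of starting replication from the wrong position.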

How to take a logical backup from Green (MySQL 8.0) and apply it to Blue (MySQL 5.7)


Overview

This method takes a logical backup on Green (MySQL 8.0) and restores it into the old Blue (MySQL 5.7) cluster. It is the most time-consuming method, but a reliable one.

Verification steps

  • Execute the B/G switchover.
  • Delete the B/G deployment.
  • After the switchover, add test records to Green (MySQL 8.0).
  • Stop writes to the database, for example by putting the service into maintenance mode.
  • Restore the logical backup on the old Blue (MySQL 5.7).
  • Check the records on the old Blue (MySQL 5.7) and confirm that the records added to Green (MySQL 8.0) are included.
  • Delete the Green (MySQL 8.0) instance or cluster.
  • Rename the old Blue (MySQL 5.7) to remove the -old1 suffix.
  • Release service maintenance mode and resume the service.
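The backup and restore steps above could look roughly like the following. This is a sketch only: the article does not say which tool or options were used, so `mysqldump` and the `mysql` client are assumptions, as are the host names, user name, schema name, and file path.

```python
import shlex
from typing import List


def dump_command(green_host: str, user: str, databases: List[str],
                 out_path: str) -> List[str]:
    """argv for a logical backup of the given schemas from Green (MySQL 8.0)."""
    return [
        "mysqldump",
        f"--host={green_host}",
        f"--user={user}",
        "--password",            # prompt for the password instead of embedding it
        "--single-transaction",  # consistent InnoDB snapshot without locking tables
        f"--result-file={out_path}",
        "--databases",
        *databases,
    ]


def restore_command(blue_host: str, user: str) -> List[str]:
    """argv for the mysql client on the old Blue; the dump file is fed via stdin."""
    return ["mysql", f"--host={blue_host}", f"--user={user}", "--password"]


if __name__ == "__main__":
    # Hypothetical endpoints and schema name, for illustration only.
    dump = dump_command("green.example.internal", "admin", ["testdb"],
                        "/tmp/green_dump.sql")
    restore = restore_command("old-blue.example.internal", "admin")
    print(shlex.join(dump))
    print(shlex.join(restore) + " < /tmp/green_dump.sql")
```

Building the commands as argument lists keeps them usable with `subprocess.run` without shell quoting problems; the sketch only prints them so that nothing is executed by accident.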

How to generate differentials from the binlog at the quiescent point of Green (MySQL 8.0) and apply them to Blue (MySQL 5.7)


⚠️ Please note that this method is unlikely to be feasible.

Overview

During maintenance, with writes to the database stopped, perform the B/G switchover and then record the binlog position on the Green (MySQL 8.0) side.
To switch back, generate a SQL file of the changes from that position to the latest position and apply it to the old Blue (MySQL 5.7) cluster. In practice, this method appears to be difficult.

Verification steps

  • Stop writes to the database, for example by putting the service into maintenance mode.
  • Execute the B/G switchover.
  • Delete the B/G deployment.
  • Check the current binlog position on Green (MySQL 8.0) and note it down.
  • Assuming service maintenance has been lifted, run some update queries.
  • To switch back, stop writes to the database again, for example by putting the service into maintenance mode.
  • Check the current binlog position on Green again now that writes have stopped.
  • Fetch the Green binlog (run this on an operations server with the MySQL client installed).
    • The binlog file is written to /tmp on the operations server.
  • Generate a recovery SQL file with the mysqlbinlog command.
  • Check the state of the tables on the old Blue (MySQL 5.7).
    • As expected, the three INSERTs were not yet reflected.
  • Applying recovery.sql to the old Blue (MySQL 5.7) failed with an error saying that the SUPER privilege is required.
    • Line 9 of recovery.sql, where the error occurred, is the ROW-format update section.
  • Regenerate the file with the --base64-output=DECODE-ROWS option.
    • If there are only a few changes, you can edit or search-and-replace the differential SQL to uncomment the INSERT, UPDATE, and similar statements. With a large number of changes, however, this seems too difficult to use for a rollback in a production environment.
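The binlog fetch and SQL generation steps above can be sketched as follows. The original commands are not preserved in this text, so this is an assumption-laden illustration using standard `mysqlbinlog` options; the host name, user name, binlog file name, and positions are hypothetical.

```python
from typing import List, Optional


def fetch_binlog_command(host: str, user: str, binlog_file: str,
                         out_dir: str = "/tmp/") -> List[str]:
    """argv that copies a binlog file from the remote server to out_dir as-is."""
    return [
        "mysqlbinlog",
        "--read-from-remote-server",
        f"--host={host}",
        "--port=3306",
        f"--user={user}",
        "--password",                 # prompt for the password
        "--raw",                      # write the binary log unmodified
        f"--result-file={out_dir}",   # with --raw this is a file name prefix
        binlog_file,
    ]


def recovery_sql_command(local_binlog: str, start_position: int,
                         stop_position: Optional[int] = None,
                         decode_rows: bool = False) -> List[str]:
    """argv that prints SQL for the events between two positions to stdout."""
    cmd = ["mysqlbinlog", f"--start-position={start_position}"]
    if stop_position is not None:
        cmd.append(f"--stop-position={stop_position}")
    if decode_rows:
        # Show row events as commented pseudo-SQL instead of BINLOG statements.
        cmd += ["--base64-output=DECODE-ROWS", "--verbose"]
    cmd.append(local_binlog)
    return cmd


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(" ".join(fetch_binlog_command("green.example.internal", "admin",
                                        "mysql-bin-changelog.000004")))
    print(" ".join(recovery_sql_command("/tmp/mysql-bin-changelog.000004",
                                        start_position=804, stop_position=2048,
                                        decode_rows=True)) + " > recovery.sql")
```

Note that with `--base64-output=DECODE-ROWS` the row changes appear only as comments, which is why the resulting file has to be edited before it can be applied.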
                                       

Digression

  • When trying to fetch the binlog of an Aurora MySQL 3 series (MySQL 8.0) cluster with a MySQL 5.7 client, an error occurs.
 
  • Running the MySQL 5.7 client's mysqlbinlog against a binlog that was fetched from an Aurora MySQL 3 series (MySQL 8.0) cluster with a MySQL 8.0 client also results in an error.
 
  • There are three places in the update differential SQL file where the SUPER privilege is required when restoring to Aurora Version 2 (MySQL 5.7).
 
  • In Aurora Version 3 (MySQL 8.0), the SUPER privilege has been deprecated and the required privileges have changed.

    Command                                 Required privilege
    BINLOG 'xxxxxxxxxx';                    SUPER, BINLOG_ADMIN or REPLICATION_APPLIER
    SET @@SESSION.GTID_NEXT='AUTOMATIC'     SUPER, SYSTEM_VARIABLES_ADMIN, SESSION_VARIABLES_ADMIN or REPLICATION_APPLIER
    SET @@session.pseudo_thread_id=xxxxx    SUPER, SYSTEM_VARIABLES_ADMIN or SESSION_VARIABLES_ADMIN

    • SESSION_VARIABLES_ADMIN
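To see where a generated SQL file would hit these privilege checks before applying it, a small scanner can help. The following is a rough sketch, not part of the original verification: the function name and the sample fragment are hypothetical, and the patterns cover only the three statement types listed above.

```python
import re
from typing import List, Tuple

# The three statement types that required the SUPER privilege in this test.
PRIVILEGED_PATTERNS = [
    ("BINLOG", re.compile(r"^\s*BINLOG\s+'", re.IGNORECASE)),
    ("GTID_NEXT", re.compile(r"^\s*SET\s+@@SESSION\.GTID_NEXT\s*=", re.IGNORECASE)),
    ("pseudo_thread_id",
     re.compile(r"^\s*SET\s+@@session\.pseudo_thread_id\s*=", re.IGNORECASE)),
]


def find_privileged_statements(sql_text: str) -> List[Tuple[int, str]]:
    """Return (line number, statement type) for each line that starts such a statement."""
    hits = []
    for lineno, line in enumerate(sql_text.splitlines(), start=1):
        for label, pattern in PRIVILEGED_PATTERNS:
            if pattern.match(line):
                hits.append((lineno, label))
                break  # one label per line is enough
    return hits


if __name__ == "__main__":
    # Hypothetical fragment shaped like mysqlbinlog output, for illustration only.
    sample = "\n".join([
        "SET @@session.pseudo_thread_id=12345/*!*/;",
        "SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /*!*/;",
        "BEGIN",
        "BINLOG '",
        "xxxxxxxxxx",
        "'/*!*/;",
    ])
    for lineno, label in find_privileged_statements(sample):
        print(f"line {lineno}: {label}")
```

Line numbers are 1-based so that they can be compared directly with the line reported in the MySQL client's error message.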
                                       


Conclusion


We found that, as long as the service can be put into maintenance and database writes stopped, there are ways to roll back.
We would like to propose appropriate upgrade plans based on each service's requirements.
 
SRG is looking for people to work with us. If you are interested, please contact us here.