How to recover accidentally deleted database data to the state it was in just before it was deleted.

This is Onikai (@fat47) from the Service Reliability Group (SRG) of the Technology Division.
#SRGThe Service Reliability Group primarily provides comprehensive support for the infrastructure surrounding our media services, focusing on improving existing services, launching new ones, and contributing to open-source software (OSS).
This article is located within SRG.DBWGThe DB Working Group is releasing database-related materials that it provides to the entire company.
I hope this is of some help.
 

overview


This is a summary of how to recover a MySQL database that has been subjected to operations such as a table drop due to an operational error, or a complete deletion of records due to a failed WHERE clause in a DELETE statement caused by an application bug, back to the state immediately before the operation occurred.
In addition to daily full backups using Percona XtraBackup, the MySQL database for the service we manage also takes backups of the binlog file every 5 minutes.
Here's how to recover using those backup files.
 
This content primarily concerns operating MySQL in an on-premises environment.
The situation is slightly different for Amazon Aurora, so please refer to the CyberAgent Developers Blog article below.
Is backing up Aurora MySQL really okay? | CyberAgent Developers Blog
This is Hasegawa @rarirureluis from the Service Reliability Group (SRG) in the Technology Division. The #SRG (Service Reliability Group) is a group that provides cross-functional support for the infrastructure of our media services, and is involved in improving existing services, launching new ones, and contributing to open source. This time, I'll be talking about Amazon Aurora MySQL (hereinafter: Aurora MySQL). I don't even know why I'm writing so many articles about Aurora MySQL. The previous best practices for upgrading Aurora MySQL can be found here: Considering how to switch to RDS Graviton2 with minimal risk [Upgrade Edition] | CyberAgent Developers Blog This time, I'll be talking about backups. For example, if you accidentally drop a table, can you restore it to the state it was in just before you dropped it? Or is there documentation on how to restore it to that point? This isn't limited to public clouds like AWS, but even if you have a full backup, you can't restore it to a specific query. That's where binary logs come in. With the latest full backup and the binary logs taken since that backup, you can restore it to a specific query. Aurora MySQL has two recovery features: Backtrack and Point-in-Time Recovery. These two features don't restore to a specific query point in time, but rather to a specified date and time. Therefore, you cannot revert to a point immediately before an incorrect query occurred. Backtrack restores the cluster to a specified time.
 

Overall flow


  1. Check the location of the full backup data and backup binlog.
  1. Check the queries that were executed incorrectly and the time they were executed.
  1. Check the binlog position during a full backup taken prior to the execution of that query.
  1. Check the binlog position immediately before the incorrectly executed query.
  1. Generate recovery queries from the binlog file.
  1. Deploy full backup data to all databases that are in service.
  1. Apply the recovery query to all databases after setting sql_log_bin = 0.
  1. Finally, rebuild replication.
 

procedure


Check the location of the full backup data and backup binlog.

Check the location where the backup data is stored.
You should always keep track of whether you're uploading to S3 or to a physical server somewhere.

Check the query that was executed by mistake and the time it was executed.

This determines which query's data should be returned to the point immediately prior to its execution.
Please check with the service providers or developers.

Check the binlog position during a full backup taken prior to the execution of that query.

xtrabackup_binlog_info

Check the binlog position immediately before the incorrectly executed query.

If you want to specify a query that occurred immediately before a particular query, you will need to identify that query from the backed-up binlog file.
The binlog cannot be read directly, so you need to use the mysqlbinlog command to convert it back to SQL format before searching for the relevant query.
end_log_pos 286332780

Generate recovery queries from the binlog file.

Now that we've confirmed the "backup position" and the "position at the point we want to revert to," we generate a query to apply it to the restored DB.
This generates the SQL statement that applies up to the point immediately before the DELETE statement.
 
If you want to revert to a state up to a certain point in time, rather than returning to a specific query as described above, you can also write the code as follows.
In this case, the example specifies a full backup data set up to 1:05 PM on March 4th.

Deploy full backup data to all databases that are in service.

We will deploy the full backup data from XtraBackup to the DB server.
If you have backups on an external network such as S3, the following method is recommended.
  1. Download full backup data from S3 to one of the slave servers.
  1. The xtrabackup data is deployed on that slave server.
    1. Once the deployment is complete, copy the DB data from /var/lib/mysql on that slave server to all databases.

    Execute the recovery query with SET sql_log_bin = 0 and then apply it to all databases.

    Once the full backup data has been deployed to all databases, the SQL extracted from the binary logs will be applied to all databases.
    If we simply run the recovery query only to the DB master, it will take even longer for the replication delay to converge, so we set sql_log_bin = 0 and then run the query simultaneously to all databases.

    Finally, rebuild replication.

    Once all database data has been copied, configure replication between the master and slave systems.
    This restores the database to the state it was in immediately before the specified query was executed.
     

    In conclusion


    Restoring from backup data is an operation that will likely only occur once every few years, if at all.
    To ensure you can operate calmly in the event of an emergency, we recommend documenting recovery procedures and conducting training sessions in advance.
     
    SRG is looking for new team members. If you are interested, please contact us here.