Building a Secure Load Testing Environment with the New Data Masking Feature of AWS Database Migration Service (DMS)

I'm Yuta Kikai (@fat47) of the Service Reliability Group (SRG) in the Media Headquarters.
The SRG (Service Reliability Group) mainly provides cross-functional support for the infrastructure of our media services: improving existing services, launching new ones, contributing to OSS, and so on.
This article summarizes how to build a MySQL load testing environment in which personal information is masked using the new data masking feature of AWS DMS.
I hope this helps in some way.
 

AWS Database Migration Service Adds New Data Masking Feature


The data masking feature is a new feature of Database Migration Service (DMS) released on November 25, 2024.
 
The "Conversion Rules" of a DMS "Database Migration Task" have long allowed you to convert data in flight so that it can be migrated to other databases.
 
With this release, the following actions have been added:
  • Number Masking (Digits Mask)
  • Number Randomization (Digits Randomize)
  • Hash Masking (Hashing Mask)
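
To make the three actions concrete, here is a small Python sketch of what each one does to a single value. It only illustrates the idea and is not how DMS implements it; the function names are hypothetical, and the use of SHA-256 for the hash is my assumption, so check the DMS documentation for the exact behavior.

```python
import hashlib
import random

ASCII_DIGITS = set("0123456789")


def digits_mask(value: str, mask_char: str = "x") -> str:
    """Number Masking: replace every digit with mask_char, keep everything else."""
    return "".join(mask_char if ch in ASCII_DIGITS else ch for ch in value)


def digits_randomize(value: str, rng: random.Random) -> str:
    """Number Randomization: replace every digit with a random digit."""
    return "".join(
        str(rng.randrange(10)) if ch in ASCII_DIGITS else ch for ch in value
    )


def hash_mask(value: str) -> str:
    """Hash Masking: replace the whole value with a hex digest (SHA-256 assumed)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    tel = "090-1234-5678"  # made-up sample value
    print(digits_mask(tel))  # xxx-xxxx-xxxx
    print(digits_randomize(tel, random.Random()))  # same shape, different digits
    print(hash_mask("Taro Yamada"))  # 64 hexadecimal characters
```

Number Masking and Number Randomization keep the length and punctuation of the value, whereas Hash Masking replaces the value entirely, so the target column has to be wide enough to hold the hash.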
 

Try masking in practice


Preparing the databases

First, from the RDS screen, create Aurora cluster A as the source and cluster B as the target.
 
Connect to cluster A and prepare a table for verification: create the person table in the test schema, then add a record to it.
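
A minimal sketch of this step follows. Only the column names (name, address, tel) come from this article; the id column, the column types, and the sample record are assumptions made up for illustration. The script runs the statements against an in-memory SQLite database as a local stand-in. On Aurora you would run equivalent statements in the test schema through a MySQL client instead.

```python
import sqlite3

# Hypothetical DDL: name, address, and tel are the columns used in this article;
# the id column and all types are assumptions. The text columns are kept wide
# so that a hashed value would still fit.
CREATE_PERSON = """
CREATE TABLE person (
    id      INTEGER PRIMARY KEY,
    name    VARCHAR(255) NOT NULL,
    address VARCHAR(255) NOT NULL,
    tel     VARCHAR(20)  NOT NULL
)
"""

# Made-up sample record, not real personal information.
INSERT_PERSON = """
INSERT INTO person (id, name, address, tel)
VALUES (1, 'Taro Yamada', '1-2-3 Shibuya, Tokyo', '090-1234-5678')
"""


def build_sample_db() -> sqlite3.Connection:
    """Create the person table and one record in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(CREATE_PERSON)
    conn.execute(INSERT_PERSON)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_sample_db()
    for row in conn.execute("SELECT id, name, address, tel FROM person"):
        print(row)
```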
 

Creating the DMS resources

Next, from the DMS screen, create a replication instance to relay the data.
"Replication Instances" → "Create a Replication Instance"
 
For the engine version, select 3.5.4.
Please note that as of January 2025, the default selection is 3.5.3, so you need to change it explicitly.
 
Next, select "Database Migration Tasks" → "Create Data Migration Task".
 
Give the task an appropriate name.
For the replication instance, select the relay instance you created earlier.
For the source database, select the writer of cluster A; for the target database, select the writer of cluster B.
 
Scroll down and click "Add new selection rule".
Then click "Add conversion rule", which appears below it.
Enter the rule as follows:
  • Rule Target: Column
  • Source name: Enter a schema (test)
  • Source table name: person
  • Column name: tel
  • Action: Number Masking
 
In the same way, add conversion rules for the name and address columns with the action set to Hash Masking.
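
The rules entered in the console are stored as table-mapping JSON, where conversion rules appear with the rule type "transformation". The sketch below builds an equivalent mapping in Python. The helper names, rule IDs, and rule names are hypothetical, the selection rule fields are my assumption, and the rule-action strings and the "value" key for the mask character are written as I understand them from the DMS documentation, so compare the result with the JSON view of your own task before relying on it.

```python
import json
from typing import Optional

SCHEMA = "test"
TABLE = "person"


def selection_rule(rule_id: int) -> dict:
    """Selection rule that includes test.person in the migration (assumed shape)."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": f"include-{TABLE}",
        "object-locator": {"schema-name": SCHEMA, "table-name": TABLE},
        "rule-action": "include",
    }


def masking_rule(
    rule_id: int, column: str, action: str, value: Optional[str] = None
) -> dict:
    """Column-level conversion (transformation) rule for one masking action."""
    rule = {
        "rule-type": "transformation",
        "rule-id": str(rule_id),
        "rule-name": f"mask-{column}",
        "rule-target": "column",
        "object-locator": {
            "schema-name": SCHEMA,
            "table-name": TABLE,
            "column-name": column,
        },
        "rule-action": action,
    }
    if value is not None:
        rule["value"] = value  # mask character for Number Masking (assumed key)
    return rule


def build_table_mappings() -> dict:
    """tel gets Number Masking with 'x'; name and address get Hash Masking."""
    return {
        "rules": [
            selection_rule(1),
            masking_rule(2, "tel", "data-masking-digits-mask", value="x"),
            masking_rule(3, "name", "data-masking-hash-mask"),
            masking_rule(4, "address", "data-masking-hash-mask"),
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_table_mappings(), indent=2))
```

Keeping the mapping in code like this makes it easy to review the masked columns and to recreate the task for another environment.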
When you finish creating the task, the full load and conversion from cluster A to cluster B start automatically.
 

Check the data

First, connect to cluster A and list the records; the original data is shown as is.
 
Next, connect to cluster B and list the records.
As specified in the conversion rules, the name and address have been hashed and the digits of the phone number have been replaced with x.
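
This visual check can also be automated. The sketch below compares one row from the source with the matching row from the target and reports anything that still looks unmasked. The function name and the dict-based row format are hypothetical; fetching the rows from cluster A and cluster B is left to whichever MySQL client you use.

```python
from typing import Dict, List


def check_masked(source_row: Dict[str, str], target_row: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the row looks masked."""
    problems = []
    # Hashed columns must no longer equal the original value.
    for column in ("name", "address"):
        if target_row[column] == source_row[column]:
            problems.append(f"{column} is unchanged")
    # With Number Masking, no digit should remain in the phone number.
    if any(ch in "0123456789" for ch in target_row["tel"]):
        problems.append("tel still contains digits")
    return problems


if __name__ == "__main__":
    # Made-up rows for illustration only.
    source = {"name": "Taro Yamada", "address": "1-2-3 Shibuya, Tokyo", "tel": "090-1234-5678"}
    target = {"name": "ab" * 32, "address": "cd" * 32, "tel": "xxx-xxxx-xxxx"}
    print(check_masked(source, target))  # []
```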
 
By having the load testing application connect to the endpoint of Cluster B, it is possible to protect personal information while handling data volumes equivalent to those used in production.

Conclusion


Until now, data masking processes had to be implemented using Lambda or similar, but this new feature of DMS makes it much easier to perform the conversion process.
 
It's very helpful as it enables us to quickly create a safe load testing environment!
 
SRG is looking for people to work with us. If you are interested, please contact us here.