Be aware of network IP ranges before creating a MongoDB Atlas cluster

My name is Ohara and I work in the Service Reliability Group (SRG) of the Media Headquarters.
#SRG (Service Reliability Group) mainly provides cross-functional support for the infrastructure of our media services: improving existing services, launching new ones, and contributing to OSS.
This article explains an IP address range conflict we hit with MongoDB Atlas VPC peering and how we migrated to a non-conflicting configuration with minimal downtime.

 

Introduction


MongoDB Atlas is a fully managed cloud database service that automates database management, allowing developers to focus on building applications. With VPC peering from cloud environments such as GCP and AWS, you can connect to Atlas securely over private IP addresses.
In this article, we explain the network IP address range conflict that occurred when we set up VPC peering between MongoDB Atlas and GCP, and how we resolved it.
 

IP address conflict error in VPC peering configuration


stg
common vpc
 

Why did the IP address ranges overlap?


Our investigation traced the problem to how MongoDB Atlas behaves when a cluster is created.

Atlas default CIDR

192.168.0.0/16
shared
My first thought was to change the CIDR defined in this network container. However, Atlas does not allow the VPC CIDR to be changed once a cluster has been created, so changing it would mean recreating the cluster.
M10
 

VPC peering constraints

Of course, when peering VPCs, the IP ranges of the subnets must not overlap.
192.168.0.0/16
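
The overlap constraint can be checked before any peering request is made. The following is a minimal sketch using Python's standard `ipaddress` module. The inventory names (`atlas-project-a`, `atlas-project-b`, `app-subnet`) and the `10.128.0.0/20` range are hypothetical, and the example assumes two peers that were both left on the `192.168.0.0/16` range mentioned above.

```python
import ipaddress


def overlapping_pairs(named_cidrs):
    """Return every pair of names whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    names = sorted(nets)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical inventory of every range that would end up peered to one VPC.
ranges = {
    "atlas-project-a": "192.168.0.0/16",  # range taken from the article
    "atlas-project-b": "192.168.0.0/16",  # assumed: a second peer on the same range
    "app-subnet": "10.128.0.0/20",        # assumed application subnet
}

print(overlapping_pairs(ranges))
# [('atlas-project-a', 'atlas-project-b')]
```

Running a check like this against the ranges you plan to peer, before the first cluster is created, surfaces the conflict while the CIDR can still be chosen freely.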
 

Solution


Reconfiguring the Atlas network requires downtime, so we looked for the least disruptive approach possible.
We considered the following three options:
 

1. Delete and recreate the cluster

mongorestore
Furthermore, because of how Atlas works, existing snapshots are deleted along with the cluster when it is recreated, which limits the options for quick recovery.
 

2. Remove the conflicting VPC peering and compromise on public access

stg is accessed less often than shared, and little of that access is critical to the product, so this option was a candidate.
shared
 

3. Set an appropriate CIDR in a new project and migrate the data with Live Migration (adopted)

We used the Atlas Live Migration feature for the data migration. It keeps two databases synchronized and running in parallel while the application stays online, so the switchover involves almost no downtime. It also let us validate the infrastructure, confirming that the CIDR change and the peering settings worked as expected, without affecting the existing system.
We adopted this option because we judged it the safest and least disruptive.
 

Live Migration for data migration and VPC peering reconfiguration


The specific steps of the solution adopted are as follows:
  1. shared_v2
  1. 172.16.0.0/18
    1. Check the CIDR of the new network
      1. common vpc
      1. shared_v2
      shared_v2
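
The CIDR check in the steps above can be sketched as a small search for a free block. This is only an illustration: the `/18` size and `192.168.0.0/16` come from the article, while the `172.16.0.0/12` search pool (an RFC 1918 private range), the `10.128.0.0/20` subnet, and the helper name `first_free_block` are assumptions made for the example.

```python
import ipaddress


def first_free_block(pool, prefix, in_use):
    """Return the first /prefix block in pool that overlaps nothing in in_use, or None."""
    used = [ipaddress.ip_network(cidr) for cidr in in_use]
    for candidate in ipaddress.ip_network(pool).subnets(new_prefix=prefix):
        if not any(candidate.overlaps(u) for u in used):
            return candidate
    return None


# Ranges assumed to be already taken by existing peers and subnets.
in_use = ["192.168.0.0/16", "10.128.0.0/20"]

block = first_free_block("172.16.0.0/12", 18, in_use)
print(block, block.num_addresses)
# 172.16.0.0/18 16384
```

Under these assumptions the first free block is the same `172.16.0.0/18` range used in the steps above. Adding every newly allocated block to `in_use` keeps later projects from colliding with it.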
       

The importance of design


While managed services are very easy to build and manage, there are many things to consider when designing them, such as provider-specific specifications and limitations.
By carefully designing not only the network but also the entire system at the early stages of construction, you can prevent large-scale rework and complex data migration work that may occur in the future.

SRG is looking for people to work with us.
If you are interested, please contact us here.