Things to Keep in Mind to Improve MongoDB Availability: Configuration

This is Kobayashi (@berlinbytes).
SRG (Service Reliability Group) mainly provides cross-functional support for the infrastructure of our media services: improving existing services, launching new ones, and contributing to OSS.
This article is the Day 7 entry in the CyberAgent Group SRE Advent Calendar 2024.
 

Introduction


As you may know, MongoDB is a document-oriented distributed database that falls into the category of NoSQL databases.
At our company it has been used for a long time, mainly in the game business, because it works well with Node.js.
We began using MongoDB in a limited way around 2011, in the 1.x series, and by the time the 2.x series was released we were using it on a large scale, mainly for social games.
Today, thanks to its schema-less design and scalability, it is used not only for games but also widely as a database for master data in media-related services.
Searching the web for information about our company's use of MongoDB turns up some relatively old material, but little that is recent.
  • Example from Tapple
  • Example from Pigg Party
 
So in this article I would like to briefly introduce the current situation.

Our MongoDB Cluster Configuration Patterns


Standalone

It consists of a single machine, without replica sets or sharding.
It is mainly used in test environments with small data sets, as a destination for temporary data or logs collected for investigation, and in personal development environments.
It is operated either entirely by hand or under the management of MongoDB Cloud Manager.

Replica Set

A replica set is adopted for systems that do not need horizontal distribution through sharding but whose unavailability would disrupt the business.
It is used for everyday test environments and for some system functions.
These clusters are managed with MongoDB Cloud Manager or built on MongoDB Atlas.
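To make the availability benefit concrete, here is a minimal sketch of the arithmetic behind replica set elections: a primary can only be elected while a strict majority of the voting members is reachable. The function names are my own for illustration and are not part of any MongoDB tool.

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary: a strict majority of voting members."""
    if voting_members < 1:
        raise ValueError("a replica set needs at least one voting member")
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """Number of voting members that can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)


if __name__ == "__main__":
    # Compare common replica set sizes.
    for n in (1, 2, 3, 4, 5, 7):
        print(f"{n} members: majority={majority(n)}, "
              f"tolerates {fault_tolerance(n)} failure(s)")
```

Note that a fourth member does not tolerate any more failures than three members do, which is one reason odd member counts are the usual choice.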

Sharding

Sharding can be deployed to support throughput-hungry applications with large data sets. It distributes both reads and writes horizontally across the cluster and has the added benefit of scaling out easily as needed.
Configuring each shard as an independent replica set also improves availability.
On the other hand, the number of components to manage grows, so higher costs are unavoidable.
These clusters are also managed with MongoDB Cloud Manager or operated on MongoDB Atlas.
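As a rough illustration of how a hashed key spreads documents across shards, here is a toy sketch. It is deliberately simplified and is an assumption of mine, not how MongoDB routes documents internally: real hashed sharding splits the hashed key space into chunk ranges that are assigned to shards and moved by the balancer, rather than taking a modulo. The key format `user-N` and the shard count are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Iterable


def shard_for(key: str, shard_count: int) -> int:
    """Toy routing: hash the shard key and map it onto one of the shards.

    Simplification: MongoDB assigns ranges of hashed values (chunks) to
    shards; the modulo here only illustrates the even spread.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribution(keys: Iterable[str], shard_count: int) -> Counter:
    """Count how many keys land on each shard."""
    return Counter(shard_for(k, shard_count) for k in keys)


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(3000)]  # hypothetical shard key values
    for shard, count in sorted(distribution(keys, 3).items()):
        print(f"shard {shard}: {count} documents")
```

Even sequential keys end up spread across all shards, which is why a hashed key helps avoid concentrating writes on a single shard.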
 

For High Availability


Most of our production environments use a replica set or a sharded configuration. In the past we kept a large number of physical servers in our data centers and ran large clusters with more than 30 shards. Now that hardware has improved and cloud instances offer very high performance, we no longer operate physical servers.
Our clusters run on our own private cloud, "Cycloud", as well as on public clouds such as AWS and Google Cloud.
As the configuration patterns above show, MongoDB is the core database of these services, so replica sets are required to achieve a high level of availability. Sharding is added for services that need high read and write throughput. As a cluster grows there is more to manage, and procedures such as scaling and version upgrades become more complicated, so we use management tools to reduce the operational load.
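On the application side, getting the benefit of a replica set largely comes down to how the client connects. The sketch below assembles a standard `mongodb://` connection string that lists several seed hosts and names the replica set, so that a driver can discover the current primary and follow a failover. The host names and the set name `rs0` are placeholders I made up; `replicaSet`, `w`, `retryWrites`, and `readPreference` are standard connection string options, but the values shown are just one possible choice, not our production settings.

```python
from typing import Sequence
from urllib.parse import urlencode


def replica_set_uri(hosts: Sequence[str], replica_set: str, **options: str) -> str:
    """Build a mongodb:// URI that lists every seed host of a replica set."""
    if not hosts:
        raise ValueError("at least one host is required")
    # replicaSet comes first; further options keep the order they were passed in.
    params = {"replicaSet": replica_set, **options}
    return f"mongodb://{','.join(hosts)}/?{urlencode(params)}"


if __name__ == "__main__":
    uri = replica_set_uri(
        # Hypothetical hosts; replace with the members of your own replica set.
        ["db1.example.internal:27017",
         "db2.example.internal:27017",
         "db3.example.internal:27017"],
        "rs0",
        w="majority",          # acknowledge writes only after a majority has them
        retryWrites="true",    # let the driver retry a write once after a failover
        readPreference="primaryPreferred",
    )
    print(uri)
```

Listing more than one host matters: if the only host in the string happens to be down, the client has no other member to ask for the current topology.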

Management Tools


MongoDB Cloud Manager

MongoDB Cloud Manager can automate most of the management of your infrastructure, allowing you to perform upgrades, monitoring, backups, and more from a web-based UI.

MongoDB Atlas

MongoDB Atlas is a DBaaS (Database-as-a-Service) managed by MongoDB and hosted on the major public clouds.
Compared to MongoDB Cloud Manager, it can scale up and down in response to workload fluctuations, which helps balance performance and cost. It also provides tools such as index suggestions and a performance advisor, which help reduce the cost of day-to-day operations.
 
Because automated deployment and management services reduce management costs, we increasingly rely on these official management tools.
 
On its own, this article would read like nothing more than an introduction to MongoDB, so I would like to add the following.
There are also cases where services that used MongoDB have migrated to other databases.
In recent years, databases such as those below that have APIs compatible with MongoDB have also appeared.
 
  • Amazon DocumentDB
  • Azure Cosmos DB for MongoDB
  • Oracle Database API for MongoDB
 
I would like to write an article introducing these and providing examples of migration at some point.

Summary


In summary, the following two points of MongoDB configuration are important for ensuring availability, and they improve fault tolerance and load distribution:
  • Using replica sets
  • Introducing sharding
Introducing management tools also helps reduce the operational burden and operational mistakes. It is a matter of using the right tool for the right job, but adopting DBaaS has also been very effective for us.
Depending on your requirements, migrating to a database with a compatible API may also be worth considering.

Conclusion


In this article I introduced the MongoDB configurations used within our company from the perspective of availability.
The following topics were not covered this time:
  • Memory size considerations
  • Deciding instance specifications
  • Query optimization
  • Cost optimization
  • Indexes
  • Introducing a cache layer
  • Monitoring
I plan to cover these in future posts.
SRG is looking for people to work with us. If you are interested, please contact us here.