Recently I was asked to create a MongoDB roadmap for new joiners and other colleagues in my organization. I thought I’d put it together in a few minutes by copying one from the internet, but that remained just a thought, lol. I searched everywhere for this so-called “MongoDB Developer Roadmap”, but all I found were roadmaps for other technologies and nothing that really fit. So I took a deep breath and decided to create one from scratch, referring to existing roadmaps and doing a lot of research along the way.
A roadmap is basically a pathway to a destination. I know everyone knows that, but I thought we should start with the important keyword, lol. Okay, without wasting any time, let’s see what steps one should follow to become a MongoDB Developer in 2021.
Before getting started with the steps, let’s look at a representation of the MongoDB Developer Roadmap.
Let’s start with Basic Database Skills
This step includes the must-have and basic database skills as given below.
- Data models, Data Schemas and Data Independence
Data models describe the logical structure of a database: how data is organized, related, and constrained. A data schema is the skeleton that represents the logical view of the entire database. Data independence is the ability to modify the schema at one level without affecting the schema at the next higher level.
- Relational model and Entity-Relationship Model
The Relational Model represents how data is stored in Relational Databases and the Entity-Relationship Model defines the conceptual view of a database.
- Normalization, Joins, SQL & NoSQL
Normalization is the process of organizing data in a database to reduce data redundancy and eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies. Joins combine data from multiple tables in relational databases. SQL is a domain-specific language that lets you access and manipulate databases. A NoSQL database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.
- Indexing, Hashing, Transaction & Concurrency
Indexing optimizes database performance by minimizing the number of disk accesses required to process a query. Hashing in a database management system computes the location of the required data directly from its key using a hash function, without walking an index structure. A database transaction represents a unit of work performed within a database management system against a database. Concurrency control manages simultaneous operations so that they do not conflict with each other.
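To make the difference between scanning and a hash-style lookup a bit more concrete, here is a minimal TypeScript sketch. It is plain in-memory code, not MongoDB; the `User` type, the sample data, and the helpers `findByScan`, `buildHashIndex`, and `findByIndex` are all made up for illustration.

```typescript
// Illustrative record type (hypothetical, not from any library).
interface User {
  id: number;
  email: string;
}

const users: User[] = [
  { id: 1, email: "ana@example.com" },
  { id: 2, email: "raj@example.com" },
  { id: 3, email: "li@example.com" },
];

// Without an index: check every record until a match is found.
function findByScan(rows: User[], email: string): User | undefined {
  for (const row of rows) {
    if (row.email === email) return row;
  }
  return undefined;
}

// Hash-style index: map each key directly to the position of its record.
function buildHashIndex(rows: User[]): Record<string, number> {
  const index: Record<string, number> = {};
  rows.forEach((row, position) => {
    index[row.email] = position;
  });
  return index;
}

// With the index: one keyed lookup gives the position, no scan needed.
function findByIndex(
  rows: User[],
  index: Record<string, number>,
  email: string
): User | undefined {
  if (!Object.prototype.hasOwnProperty.call(index, email)) return undefined;
  return rows[index[email]];
}

const emailIndex = buildHashIndex(users);
const viaScan = findByScan(users, "raj@example.com");
const viaIndex = findByIndex(users, emailIndex, "raj@example.com");
```

Both lookups return the same record. The scan does work proportional to the number of records, while the indexed lookup costs roughly the same no matter how many records there are.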
Database Design and its Tools
Designing a good database with appropriate schemas is one of the most crucial steps in building an application, so it helps to know the best tools available. See below.
Most of these tools are open-source desktop applications. Moon Modeler, DBeaver and Adminer are my personal recommendations.
Database drivers allow applications to connect to MongoDB and work with data. The Node.js driver, for example, features an asynchronous API that lets you access method return values through Promises or specify callbacks to access them when communicating with MongoDB. In my case, I had to connect Node.js to MongoDB, so let’s see the best MongoDB drivers for Node.js below.
All of these drivers are open source and popular. Mongoose and MongoDB Native are my personal recommendations.
GUI Client Tools
UI plays an important role in development. MongoDB has a shell that works well for administrative tasks, but when working with larger data sets it becomes important to use GUI tools. With that need in mind, I have shortlisted some of the best GUI client tools, as given below.
- NoSQL Booster
- Studio 3T
- Robo 3T
- MongoDB Compass
- Mongo Management Studio
- MongoDB Monitoring Tool
- NoSQL Manager
Most of these tools are paid and come as desktop apps. NoSQL Booster, Studio 3T and Robo 3T are my personal recommendations. Even though they are paid, they still provide good features in the free tier. MongoDB Compass, however, is maintained by MongoDB and, most importantly, it’s free.
This step covers the basic and advanced concepts of MongoDB. By learning these concepts, a developer will gain a good command of MongoDB. Let’s see them below.
In MongoDB, databases hold one or more collections of documents. There are various mongo shell commands available to manage databases.
MongoDB stores documents in collections. Collections are analogous to tables in relational databases.
MongoDB CRUD Operations
CRUD operations create, read, update, and delete documents.
Delete operations remove documents from a collection.
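The four operations can be sketched with a toy in-memory collection. This is only an illustration and not the MongoDB driver: `ToyCollection`, the `Book` type, and the `matches` helper are hypothetical, and the method names merely echo the `insertOne`, `find`, `updateOne`, and `deleteOne` methods you would call on a real collection.

```typescript
// Hypothetical document shape used only for this sketch.
interface Book {
  _id: number;
  title: string;
  stock: number;
}

// True when every field in the filter equals the same field in the document.
function matches(doc: Book, filter: Partial<Book>): boolean {
  return (Object.keys(filter) as (keyof Book)[]).every(
    (key) => doc[key] === filter[key]
  );
}

// Toy in-memory stand-in for a collection (not the real driver API).
class ToyCollection {
  private docs: Book[] = [];

  // Create: add a new document.
  insertOne(doc: Book): void {
    this.docs.push({ ...doc });
  }

  // Read: return copies of all documents that match the filter.
  find(filter: Partial<Book>): Book[] {
    return this.docs
      .filter((doc) => matches(doc, filter))
      .map((doc) => ({ ...doc }));
  }

  // Update: apply the changes to the first matching document.
  updateOne(filter: Partial<Book>, changes: Partial<Book>): number {
    for (let i = 0; i < this.docs.length; i++) {
      if (matches(this.docs[i], filter)) {
        this.docs[i] = { ...this.docs[i], ...changes };
        return 1;
      }
    }
    return 0;
  }

  // Delete: remove the first matching document.
  deleteOne(filter: Partial<Book>): number {
    for (let i = 0; i < this.docs.length; i++) {
      if (matches(this.docs[i], filter)) {
        this.docs.splice(i, 1);
        return 1;
      }
    }
    return 0;
  }
}

const books = new ToyCollection();
books.insertOne({ _id: 1, title: "MongoDB Basics", stock: 5 });
books.insertOne({ _id: 2, title: "Data Modeling", stock: 0 });
const updatedCount = books.updateOne({ _id: 1 }, { stock: 4 });
const deletedCount = books.deleteOne({ _id: 2 });
const remaining = books.find({}); // an empty filter matches every document
```

After these calls, one document is left in the toy collection, with its `stock` changed from 5 to 4.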
Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single-purpose aggregation methods.
1.Aggregation Pipeline
The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into aggregated results.
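As a rough sketch of what "multi-stage" means, the snippet below writes a three-stage pipeline as a plain array (the shape you would hand to a collection's `aggregate` method) and then reproduces each stage by hand on an in-memory array. The `Order` type, the sample orders, and the field names are assumptions made up for the example.

```typescript
// Hypothetical order documents used only for this sketch.
interface Order {
  customer: string;
  status: string;
  amount: number;
}

const orders: Order[] = [
  { customer: "ana", status: "paid", amount: 30 },
  { customer: "raj", status: "paid", amount: 50 },
  { customer: "ana", status: "pending", amount: 99 },
  { customer: "ana", status: "paid", amount: 40 },
];

// The pipeline as it would be written for MongoDB: match, group, sort.
const pipeline = [
  { $match: { status: "paid" } },
  { $group: { _id: "$customer", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
];

// Stage 1 ($match): keep only the paid orders.
const paidOrders = orders.filter((order) => order.status === "paid");

// Stage 2 ($group): sum the amounts per customer.
const totals: Record<string, number> = {};
for (const order of paidOrders) {
  totals[order.customer] = (totals[order.customer] || 0) + order.amount;
}

// Stage 3 ($sort): order the groups by total, largest first.
const totalsByCustomer = Object.keys(totals)
  .map((customer) => ({ _id: customer, total: totals[customer] }))
  .sort((a, b) => b.total - a.total);
// totalsByCustomer is [{ _id: "ana", total: 70 }, { _id: "raj", total: 50 }]
```

Each stage takes the output of the previous one as its input, which is the whole idea of the pipeline.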
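2.Map-Reduce
Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. To perform map-reduce operations, MongoDB provides the mapReduce database command. Note that map-reduce is deprecated starting in MongoDB 5.0 in favor of the aggregation pipeline.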
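The paradigm itself is easy to sketch in plain TypeScript. The code below is not the `mapReduce` command; it is an in-memory illustration where `mapSale`, `reduceQuantities`, `runMapReduce`, and the `Sale` type are hypothetical names.

```typescript
// Hypothetical input documents.
interface Sale {
  item: string;
  qty: number;
}

const sales: Sale[] = [
  { item: "pen", qty: 2 },
  { item: "book", qty: 1 },
  { item: "pen", qty: 3 },
];

// Map phase: turn each document into a (key, value) pair.
function mapSale(sale: Sale): { key: string; value: number } {
  return { key: sale.item, value: sale.qty };
}

// Reduce phase: condense all values emitted for one key into a single value.
function reduceQuantities(_key: string, values: number[]): number {
  return values.reduce((sum, value) => sum + value, 0);
}

function runMapReduce(input: Sale[]): Record<string, number> {
  // Group the emitted values by key.
  const grouped: Record<string, number[]> = {};
  for (const sale of input) {
    const emitted = mapSale(sale);
    if (!Object.prototype.hasOwnProperty.call(grouped, emitted.key)) {
      grouped[emitted.key] = [];
    }
    grouped[emitted.key].push(emitted.value);
  }
  // Reduce each group to one result.
  const output: Record<string, number> = {};
  for (const key of Object.keys(grouped)) {
    output[key] = reduceQuantities(key, grouped[key]);
  }
  return output;
}

const quantityByItem = runMapReduce(sales); // { pen: 5, book: 1 }
```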
3.Single Purpose Aggregation Operations
Single-purpose aggregation operations aggregate documents from a single collection. While these operations provide simple access to common aggregation processes, they lack the flexibility and capabilities of the aggregation pipeline and map-reduce.
The key challenge in data modeling is balancing the needs of the application, the performance characteristics of the database engine, and the data retrieval patterns.
1.Data Model Design
When designing data models, always consider the application usage of the data (i.e. queries, updates, and processing of the data) as well as the inherent structure of the data itself.
2.Schema Validation
MongoDB provides the capability to perform schema validation during updates and insertions.
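Validation rules are typically written with the `$jsonSchema` operator and passed as the `validator` option when a collection is created. The sketch below only builds such a rule as a plain object; the field names and the `missingRequiredFields` helper are assumptions for illustration, and the helper mimics just the `required` check, not the full validation MongoDB performs.

```typescript
// A validator document in $jsonSchema form (field names are made up).
const userValidator = {
  $jsonSchema: {
    bsonType: "object",
    required: ["name", "email"],
    properties: {
      name: { bsonType: "string", description: "must be a string" },
      email: { bsonType: "string", pattern: "^.+@.+$" },
      age: { bsonType: "int", minimum: 0 },
    },
  },
};

// Hypothetical helper: list the required fields a document is missing.
// The server would check types, patterns, and ranges as well.
function missingRequiredFields(doc: Record<string, unknown>): string[] {
  return userValidator.$jsonSchema.required.filter((field) => !(field in doc));
}

const missing = missingRequiredFields({ name: "Ana" }); // ["email"]
```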
3.Model Relationship between Documents
This data model uses embedded documents to describe relationships between connected data. Two types of relationship can be modeled this way, as given below.
- One to One
- One to Many
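Both relationship types can be pictured as document shapes. In this sketch the `Patron` and `Address` types and all sample values are hypothetical; the point is only where the related data lives.

```typescript
// Hypothetical types for the sketch.
interface Address {
  street: string;
  city: string;
}

// One to One: the single related address is embedded as a sub-document.
interface PatronWithAddress {
  _id: string;
  name: string;
  address: Address;
}

// One to Many: the related addresses are embedded as an array.
interface PatronWithAddresses {
  _id: string;
  name: string;
  addresses: Address[];
}

const oneToOne: PatronWithAddress = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake Street", city: "Faketon" },
};

const oneToMany: PatronWithAddresses = {
  _id: "joe",
  name: "Joe Bookreader",
  addresses: [
    { street: "123 Fake Street", city: "Faketon" },
    { street: "1 Some Other Street", city: "Boston" },
  ],
};

// Reading the patron also returns the related data, with no second query.
const firstCity = oneToOne.address.city;
const addressCount = oneToMany.addresses.length;
```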
4.Data Model References
For many use cases in MongoDB, the denormalized data model where related data is stored within a single document will be optimal. However, in some cases, it makes sense to store related information in separate documents, typically in different collections or databases.
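A referenced (normalized) model might look like the sketch below, where each book stores only the `_id` of its publisher. The types, the data, and the `publisherOf` helper are assumptions; in a real application the lookup would be a second query or a `$lookup` stage rather than an array filter.

```typescript
// Hypothetical types: publishers and books live in separate collections.
interface Publisher {
  _id: string;
  name: string;
}

interface BookWithRef {
  _id: number;
  title: string;
  publisher_id: string; // reference to Publisher._id
}

const publishers: Publisher[] = [
  { _id: "acme", name: "Acme Press" },
  { _id: "nova", name: "Nova Books" },
];

const booksWithRefs: BookWithRef[] = [
  { _id: 1, title: "Learning Documents", publisher_id: "acme" },
  { _id: 2, title: "Scaling Out", publisher_id: "acme" },
];

// Resolve the reference by hand (stand-in for a second query).
function publisherOf(book: BookWithRef): Publisher | undefined {
  const found = publishers.filter((p) => p._id === book.publisher_id);
  return found.length > 0 ? found[0] : undefined;
}

const resolved = publisherOf(booksWithRefs[0]);
```

The publisher's details are stored once and shared by both books, at the cost of an extra lookup when reading.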
Transaction and Atomicity
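In MongoDB, an operation on a single document is atomic. For situations that require atomicity of reads and writes across multiple documents, MongoDB supports multi-document transactions on replica sets and sharded clusters. Multi-document transactions (whether on sharded clusters or replica sets) are also known as distributed transactions.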
Indexes are special data structures that store a small portion of the collection’s data set in an easy to traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field. The ordering of the index entries supports efficient equality matches and range-based query operations.
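Why ordering helps with equality and range queries can be shown with a small in-memory sketch. This is not how MongoDB implements its indexes internally; `priceIndex`, `lowerBound`, and `findByPriceRange` are hypothetical names, and a sorted array with binary search stands in for the real index structure.

```typescript
// Hypothetical documents, stored in no particular order.
interface Product {
  sku: string;
  price: number;
}

const products: Product[] = [
  { sku: "A", price: 30 },
  { sku: "B", price: 10 },
  { sku: "C", price: 50 },
  { sku: "D", price: 20 },
];

// An index entry holds only the indexed value and where the document lives.
interface IndexEntry {
  key: number;
  position: number;
}

// Build the index: one small entry per document, ordered by the field value.
const priceIndex: IndexEntry[] = products
  .map((product, position) => ({ key: product.price, position }))
  .sort((a, b) => a.key - b.key);

// Binary search for the first entry whose key is >= target.
function lowerBound(entries: IndexEntry[], target: number): number {
  let lo = 0;
  let hi = entries.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (entries[mid].key < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range query: jump to the start of the range, then walk while still inside it.
function findByPriceRange(min: number, max: number): Product[] {
  const result: Product[] = [];
  for (
    let i = lowerBound(priceIndex, min);
    i < priceIndex.length && priceIndex[i].key <= max;
    i++
  ) {
    result.push(products[priceIndex[i].position]);
  }
  return result;
}

const midRange = findByPriceRange(15, 35); // products D (20) and A (30)
const exactly50 = findByPriceRange(50, 50); // product C
```

An equality match is just a range whose minimum and maximum are the same value, which is why one ordered structure serves both kinds of query.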
Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.
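The idea of spreading documents across machines by a key can be sketched as follows. This is a deliberate simplification: MongoDB partitions data into ranges of shard key values (or of hashed values), not by a modulo, and `toyHash`, `shardFor`, and `distribute` are hypothetical helpers written for this example.

```typescript
// Toy string hash (illustration only; not the hash MongoDB uses).
function toyHash(key: string): number {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Pick a shard for a key: the same key always lands on the same shard.
function shardFor(key: string, shardCount: number): number {
  return toyHash(key) % shardCount;
}

// Place every key on its shard.
function distribute(keys: string[], shardCount: number): string[][] {
  const shards: string[][] = [];
  for (let i = 0; i < shardCount; i++) shards.push([]);
  for (const key of keys) {
    shards[shardFor(key, shardCount)].push(key);
  }
  return shards;
}

const userIds = ["user-1", "user-2", "user-3", "user-4", "user-5", "user-6"];
const placement = distribute(userIds, 3);
```

Because the placement depends only on the key, a query that includes the key can be routed to a single shard instead of being sent to all of them.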
The storage engine is the primary component of MongoDB responsible for managing data. MongoDB provides a variety of storage engines, allowing you to choose one most suited to your application.
The journal is a log that helps the database recover in the event of a hard shutdown.
Operators are used to perform various operations on data. The various types of operators are given below.
- Query and Projection Operators
- Update Operators
- Aggregation Pipeline Stages
- Aggregation Pipeline Operators
- Query Modifiers
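A few of these operator families are shown below as plain objects, in the shape they take when passed to MongoDB. The collection fields (`price`, `tags`, `views`, and so on) are made up for the example; the operators themselves (`$gte`, `$lt`, `$in`, `$set`, `$inc`, `$match`, `$project`, `$multiply`) are standard MongoDB operators.

```typescript
// Query operators: price in [10, 50) and at least one of the given tags.
const queryFilter = {
  price: { $gte: 10, $lt: 50 },
  tags: { $in: ["sale", "new"] },
};

// Projection: return only name and price, and hide _id.
const projection = { name: 1, price: 1, _id: 0 };

// Update operators: set one field and increment another.
const updateDoc = {
  $set: { status: "active" },
  $inc: { views: 1 },
};

// Aggregation pipeline stages ($match, $project) and a pipeline
// operator ($multiply) that computes a value inside a stage.
const operatorPipeline = [
  { $match: queryFilter },
  { $project: { name: 1, total: { $multiply: ["$price", "$qty"] } } },
];
```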
MongoDB provides various features, such as authentication, access control, and encryption, to secure your MongoDB deployments.
MongoDB also provides a Security Checklist of recommended actions to protect a MongoDB deployment. See the checklist below.
1.Enable Access Control & Enforce Authentication
Enable access control and specify the authentication mechanism. You can use MongoDB’s SCRAM or x.509 authentication mechanism or integrate with your existing Kerberos/LDAP infrastructure. Authentication requires that all clients and servers provide valid credentials before they can connect to the system.
2.Configure Role-Based Access Control
Create a user administrator first, then create additional users. Create a unique MongoDB user for each person/application that accesses the system.
3.Limit Network Exposure
Ensure that MongoDB runs in a trusted network environment and configure firewalls or security groups to control inbound and outbound traffic for your MongoDB instances.
Allow only trusted clients to access the network interfaces and ports on which MongoDB instances are available. For instance, use IP whitelisting to allow access from trusted IP addresses.
4.Run MongoDB with a Dedicated User
Run MongoDB processes with a dedicated operating system user account. Ensure that the account has permissions to access data but no unnecessary permissions.
5.Encrypt Communication (TLS/SSL)
Configure MongoDB to use TLS/SSL for all incoming and outgoing connections. Use TLS/SSL to encrypt communication between mongod and mongos components of a MongoDB deployment as well as between all applications and MongoDB.
6.Audit System Activity
Track access and changes to database configurations and data. MongoDB Enterprise includes a system auditing facility that can record system events (e.g. user operations, connection events) on a MongoDB instance. These audit records permit forensic analysis and allow administrators to verify proper controls. You can set up filters to record specific events, such as authentication events.
7.Encrypt and Protect Data
Starting with MongoDB Enterprise 3.2, you can encrypt data in the storage layer with the WiredTiger storage engine’s native Encryption at Rest.
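As one way to picture item 2 of the checklist (role-based access control), the sketch below writes two `createUser` command documents as plain objects: a user administrator first, then an application user limited to one database. The user names, the `shop` database, and the password placeholders are assumptions; real passwords should come from a secret store, never from source code.

```typescript
// Step 1: a user administrator on the admin database.
const createUserAdmin = {
  createUser: "siteUserAdmin",
  pwd: "<supplied-at-runtime>", // placeholder, not a real password
  roles: [{ role: "userAdminAnyDatabase", db: "admin" }],
};

// Step 2: a dedicated user for one application, with only the access it needs.
const createAppUser = {
  createUser: "orders_app",
  pwd: "<supplied-at-runtime>", // placeholder, not a real password
  roles: [{ role: "readWrite", db: "shop" }],
};

// Every user gets an explicit, minimal list of roles.
const securityCommands = [createUserAdmin, createAppUser];
```

Giving each person or application its own user keeps permissions narrow and makes audit records (item 6) easier to read.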
Once you are done with database design, modeling, and security, all you need is to host your database on a reliable platform. Several factors help in choosing a suitable hosting platform; performance, support, and backups are some of them.
There are two ways to host your database as given below.
You can get a cloud virtual machine and do the installation yourself. This can be a good solution if you have the technical knowledge and the time to manage it when problems arise. Self-hosting can also save some money and increase your knowledge. After a lot of research, I have noted down a few of the most popular cloud VMs, as given below.
In managed hosting, you delegate the responsibility of managing the database to the hosting company. In short, the hosting company takes care of everything, from installation, administration, and security to backups, and so on. Hosting companies charge for what you use across various tiers. I did some research and have listed some of the most popular hosting companies that provide managed hosting, see below.
All of these hosting companies are paid, and some of them have free tiers as well. MongoDB Atlas is my personal recommendation, as it is the most popular of them and I have used it to host my organization’s database. MongoDB Atlas offers a free tier on which you can get started with basic specifications.
Get PDF Version of MongoDB Developer Roadmap on GitHub: https://github.com/navanathjadhav/mongodb-developer-roadmap
We have seen a MongoDB Developer Roadmap for 2021. It’s always important to choose the right path to the destination, so this roadmap should help us learn MongoDB in a better and more efficient way and become good MongoDB Developers in 2021.