MongoDB
Li Wei
Introduction to MongoDB
Basic Overview
MongoDB is an open‑source, high‑performance, schema‑less document database. It was originally designed to simplify development and make scaling easy, and it is one of the NoSQL database products. Among non‑relational databases, it is the one that most closely resembles a relational database such as MySQL.
Its data model is very loose: documents are stored in BSON, a JSON‑like binary format, so MongoDB can hold fairly complex data types and is very flexible.
In MongoDB a record is a document: a data structure composed of field‑value pairs (**field:value**). A MongoDB document is similar to a JSON object, i.e., a document is essentially an object. Field names are strings, and field values can be basic types as well as other documents, plain arrays, or arrays of documents.
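As a rough illustration of this structure, the sketch below writes a document as a plain JavaScript object literal. Every field name and value in it (articleid, author, tags, replies, and so on) is made up for the example and does not come from a real collection.

```javascript
// A hypothetical comment document written as a plain JavaScript object.
// Every field name and value below is illustrative only.
const commentDoc = {
  _id: "1",                                        // basic type: string
  articleid: "100000",                             // basic type: string
  likenum: 10,                                     // basic type: number
  published: true,                                 // basic type: boolean
  author: { userid: "1003", nickname: "Caesar" },  // embedded document
  tags: ["mongodb", "nosql"],                      // plain array
  replies: [                                       // array of documents
    { userid: "1002", content: "Agreed" },
    { userid: "1004", content: "Thanks" },
  ],
};

// Field names are always strings, whatever the type of the value.
const fieldNames = Object.keys(commentDoc);
console.log(fieldNames.every((name) => typeof name === "string")); // true
console.log(commentDoc.replies[1].content);                        // Thanks
```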
MongoDB’s main features include:
High performance: MongoDB provides high‑performance data persistence. Support for embedded data models reduces I/O activity on the database system. Indexes enable faster queries and can include keys from embedded documents and arrays (text indexes address search needs, TTL indexes handle automatic expiration of historical data, geospatial indexes are useful for building various O2O applications).
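Multiple storage engines such as MMAPv1, WiredTiger, MongoRocks (RocksDB), and in‑memory support a wide range of scenarios. Availability depends on the version: WiredTiger has been the default since 3.2, and MMAPv1 was removed in 4.2. GridFS solves file‑storage requirements.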
High availability: MongoDB’s replication mechanism is called a replica set, which provides automatic failover and data redundancy.
High scalability: Horizontal scalability is a core feature of MongoDB. Sharding distributes data across a cluster of machines (massive data storage with horizontally scalable service capacity).
Since version 3.4, MongoDB supports creating data zones based on shard keys. In a balanced cluster, MongoDB directs reads and writes for a zone only to the shards that cover that zone.
Rich query support: MongoDB offers a rich query language for CRUD operations, including aggregation, text search, and geospatial queries.
Other advantages: schema‑less (dynamic schema), flexible document model, etc.
Comparison with MySQL
MySQL vs. MongoDB
| SQL term / concept | MongoDB term / concept | Explanation |
|---|---|---|
| database | database | database |
| table | collection | collection (equivalent to a table) |
| row | document | data record / document |
| column | field | data field / attribute |
| index | index | index |
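| table joins | — | No direct equivalent; MongoDB does not join collections the way SQL joins tables (the $lookup aggregation stage, available since 3.2, provides a left outer join) |
| — | embedded documents | MongoDB uses embedded documents instead of multi‑table joins |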
| primary key | primary key | MongoDB automatically sets the _id field as the primary key |
Data Model
The smallest storage unit in MongoDB is the document object. A document object corresponds to a row in a relational database. Data in MongoDB is stored on disk as BSON (Binary JSON) documents.
BSON (Binary Serialized Document Format) is a binary representation of JSON, often called Binary JSON. Like JSON, BSON supports embedded document objects and arrays, but it also includes data types that JSON lacks, such as Date and BinData.
BSON represents data as named fields, much like a C struct, supports embedded documents and arrays, and is lightweight, traversable, and efficient. It can describe both unstructured and structured data. The format’s flexibility is a strength, but its space utilization is not optimal.
In addition to the basic JSON types (string, integer, boolean, double, null, array, object), MongoDB adds special types: date, object id, binary data, regular expression, and code. Each driver implements these types in its language; consult your driver’s documentation for details.
BSON data type reference list
| Data Type | Description | Example |
|---|---|---|
| String | UTF‑8 string | {"x":"foobar"} |
| ObjectId | 12‑byte unique identifier for a document | {"x":ObjectId()} |
| Boolean | true or false | {"x":true} |
| Array | Ordered list of values | {"x":["a", "b", "c"]} |
| 32‑bit integer | Not directly supported in the shell: JavaScript numbers are 64‑bit floating point, so the shell treats a 32‑bit integer as a 64‑bit float. | |
| 64‑bit integer | Not directly supported; the shell displays it as a special embedded document. | |
| 64‑bit floating point | The default numeric type in the shell | {"x":3.14159,"y":3} |
| Null | Represents a null value or a nonexistent field | {"x":null} |
| Undefined | Can also be used in documents | {"x":undefined} |
| Symbol | Not supported in the shell; the shell converts symbols to strings automatically | |
| Regular expression | JavaScript regex syntax | {"x":/foobar/i} |
| Code | JavaScript code stored in a document | {"x":function() { /* …… */ }} |
| Binary data | Arbitrary byte strings (cannot be used directly in the shell) | |
| Max/Min value | Special BSON type representing the maximum possible value (not available in the shell) | |
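The 12 bytes of an ObjectId begin with a 4‑byte creation timestamp, stored as seconds since the Unix epoch. The sketch below reads that timestamp from the 24‑character hexadecimal form of an ObjectId using plain JavaScript. The helper name objectIdTimestamp is hypothetical, and the sample value is only an illustration.

```javascript
// Hypothetical helper: extract the creation time from an ObjectId's hex string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  // The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

console.log(objectIdTimestamp("507f1f77bcf86cd799439011").toISOString());
// 2012-10-17T21:13:27.000Z
```

In the shell itself, an ObjectId value offers getTimestamp() for the same purpose, so a helper like this is only needed when all you have is the hex string.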
Use Cases
Traditional relational databases (e.g., MySQL) struggle to meet the “three‑high” demands of modern web‑2.0 applications:
- High performance – need for high‑concurrency reads/writes.
- Huge storage – efficient handling of massive data volumes.
- High scalability & high availability – ability to scale out and stay online.
MongoDB is designed to meet these three‑high requirements.
Typical scenarios:
- Social apps: Store user profiles and posts; use geospatial indexes for “people nearby” or “places near me” features.
- Gaming: Store player info, equipment, scores as embedded documents for fast queries and efficient storage.
- Logistics: Store order data; order status updates are kept in embedded arrays so a single query can retrieve the full change history.
- IoT: Store device metadata and logs, then perform multidimensional analytics.
- Live video: Store user info, likes, interaction data, etc.
Common characteristics of these workloads:
- Large data volume
- Frequent writes (and reads)
- Low‑value data with modest transactional requirements
Such data is well‑suited to MongoDB.
When to Choose MongoDB
Beyond the three‑high traits, check whether any of the following statements apply:
- The application does not need transactions or complex joins.
- It’s a new project with evolving requirements and an uncertain data model, requiring rapid iteration.
- Expected read/write QPS is 2,000–3,000 or higher.
- Data size will reach terabytes or even petabytes.
- Rapid horizontal scaling is required.
- Data loss is unacceptable.
- 99.999% availability is required.
- Heavy geospatial or text search workloads.
If any of the above applies, MongoDB is worth considering; if two or more apply, you won’t regret the choice.
What if you used MySQL instead?
Answer: MySQL could handle many of these cases, but MongoDB can usually solve the problem at a lower overall cost (including learning, development, and operations).
Mongo Commands
Common Commands
| Command | Meaning |
|---|---|
| use articledb | Select a database |
| db.comment.insert({BSON document}) | Insert data |
| db.comment.find() | Query all data |
| db.comment.find({condition}) | Conditional query |
| db.comment.findOne({condition}) | Find the first matching record |
| db.comment.find({condition}).limit(n) | Find the first n matching records |
| db.comment.find({condition}).skip(n) | Skip the first n matching records |
| db.comment.update({condition}, {replacement document}) or db.comment.update({condition}, {$set: {field to modify: value}}) | Update data |
| db.comment.update({condition}, {$inc: {field to increment: step}}) | Update by incrementing a field |
| db.comment.remove({condition}) | Delete data |
| db.comment.count({condition}) | Count matching records |
| db.comment.find({field name: /regular expression/}) | Fuzzy search |
| db.comment.find({field name: {$gt: value}}) | Comparison operators |
| db.comment.find({field name: {$in: [value 1, value 2]}}) or db.comment.find({field name: {$nin: [value 1, value 2]}}) | Inclusion or exclusion query |
| db.comment.find({$and: [{condition 1}, {condition 2}]}) or db.comment.find({$or: [{condition 1}, {condition 2}]}) | Combine conditions with AND or OR |
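To make the query conditions in the table above more concrete, here is a minimal plain‑JavaScript sketch that mimics how a few of them select documents from an in‑memory array. It does not talk to MongoDB and is not how the server evaluates queries; the matches and find helpers and the sample data are hypothetical, and only $gt, $in, $nin, $and, $or, regular expressions, and equality on plain scalar values are handled.

```javascript
// Hypothetical matcher covering only the operators listed in the table above.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$and") return cond.every((sub) => matches(doc, sub));
    if (key === "$or") return cond.some((sub) => matches(doc, sub));
    const value = doc[key];
    if (cond instanceof RegExp) {
      return typeof value === "string" && cond.test(value);
    }
    if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
      // Operator document such as { $gt: 700 }.
      return Object.entries(cond).every(([op, arg]) => {
        switch (op) {
          case "$gt": return value > arg;
          case "$in": return arg.includes(value);
          case "$nin": return !arg.includes(value);
          default: throw new Error(`unsupported operator: ${op}`);
        }
      });
    }
    return value === cond; // equality on plain scalar values only
  });
}

// An empty query matches every document, like find() with no arguments.
const find = (collection, query = {}) =>
  collection.filter((doc) => matches(doc, query));

// Illustrative sample data.
const comments = [
  { _id: 1, userid: "1002", likenum: 1000, content: "drink warm water" },
  { _id: 2, userid: "1003", likenum: 888, content: "cold water is fine" },
  { _id: 3, userid: "1004", likenum: 666, content: "coffee first" },
];

console.log(find(comments, { likenum: { $gt: 700 } }).map((d) => d._id)); // [ 1, 2 ]
console.log(find(comments, { content: /water/ }).slice(0, 1).map((d) => d._id)); // [ 1 ]
```

In this sketch, slice(0, n) plays the role of limit(n), and slice(n) would play the role of skip(n).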
Database Operations
Select or create a database:
Syntax: use <database name>
If the database does not exist, it is created automatically. For example, the following statement creates the articledb database:
(example omitted)
List all databases you have permission to view (show dbs or show databases):
- Note: In MongoDB, a collection is only created after you insert a document into it. That is, after creating a collection (table), you must insert at least one document for the collection to actually exist. For the same reason, a newly created database does not appear in this list until it holds data.
Show the currently selected database:
(example omitted)
MongoDB’s default database is test. If you do not explicitly select a database, collections will be stored in test.
Database names can be any UTF‑8 string that meets the following rules:
- Cannot be an empty string ("").
- Must not contain spaces, ., $, /, \, or the null character (\0).
- Must be all lowercase.
- Maximum length is 64 bytes.
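The naming rules above can be sketched as a small check in plain JavaScript. This is only an illustration of the rules as this article states them, not the validation the server performs; the function name isValidDatabaseName is hypothetical.

```javascript
// Hypothetical helper: checks a name against the rules listed above.
function isValidDatabaseName(name) {
  // Rule 1: must be a non-empty string.
  if (typeof name !== "string" || name === "") return false;
  // Rule 2: no space, ".", "$", "/", "\", or null character.
  if (/[ .$\/\\\x00]/.test(name)) return false;
  // Rule 3: all lowercase.
  if (name !== name.toLowerCase()) return false;
  // Rule 4: at most 64 bytes when encoded as UTF-8.
  return new TextEncoder().encode(name).length <= 64;
}

console.log(isValidDatabaseName("articledb"));  // true
console.log(isValidDatabaseName("Article DB")); // false
```

The length is measured in UTF‑8 bytes rather than characters, because a single non‑ASCII character can occupy several bytes.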
Some database names are reserved and have special purposes:
- admin: From a permissions standpoint, this is the “root” database. Adding a user to admin grants that user privileges on all databases. Certain server‑side commands (e.g., list databases, shutdown) can only be run from admin.
- local: Data in this database is never replicated; it can be used to store collections that are local to a single server.
- config: Used internally when MongoDB is configured for sharding; it stores metadata about the shards.
Delete a database:
Syntax:
(example omitted)
Collection Operations
A collection is analogous to a table in a relational database.
Collections can be created explicitly or implicitly.
Explicit Creation
Syntax: db.createCollection(name)
Parameters:
name: name of the collection to create.
Example: create a regular collection named mycollection.
(example omitted)
List collections in the current database:
(example omitted)
Naming rules for collections:
- The name cannot be an empty string "".
- The name cannot contain the null character \0, which marks the end of a collection name.
- The name cannot start with system., a prefix reserved for system collections.
- User‑created collection names must not contain reserved characters. Some drivers allow the reserved character in system‑generated collection names; unless you need to access such system collections, avoid using $ in your names.
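A minimal sketch of the collection naming rules above, again in plain JavaScript. The function name isValidCollectionName is hypothetical, and the check covers only the four rules listed in this article.

```javascript
// Hypothetical helper: checks a collection name against the rules listed above.
function isValidCollectionName(name) {
  return (
    typeof name === "string" &&
    name.length > 0 &&               // not an empty string
    !name.includes("\0") &&          // no null character
    !name.startsWith("system.") &&   // prefix reserved for system collections
    !name.includes("$")              // reserved character
  );
}

console.log(isValidCollectionName("mycollection"));   // true
console.log(isValidCollectionName("system.profile")); // false
```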
Implicit Creation
When you insert a document into a non‑existent collection, MongoDB automatically creates the collection.
Tip: In most cases you can rely on implicit creation.
Deleting a Collection
Syntax:
(example omitted)
Return value: drop() returns true if the collection was successfully dropped, otherwise false.
Example: drop the mycollection collection.
(example omitted)
Document Operations
A document’s structure is essentially the same as JSON.
All data stored in a collection is in BSON format.
Inserting
Single‑document insert:
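Use insert() or save() to insert a document into a collection. Current mongosh releases mark both methods as deprecated in favor of insertOne(), insertMany(), and replaceOne().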
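Syntax: db.collection.insert(<document or array of documents>, { writeConcern: <document>, ordered: <boolean> })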
Parameters:
| Parameter | Type | Description |
|---|---|---|
| document | document or array | The document or array of documents to insert (JSON format). |
| writeConcern | document | Optional. A document expressing the [write concern](https://www.mongodb.com/docs/manual/reference/write-concern/). Omit to use the default write concern; see [db.collection.insert()](https://www.mongodb.com/docs/manual/reference/method/db.collection.insert/). Do not set the write concern explicitly when running inside a transaction; see [Transactions and Write Concern](https://www.mongodb.com/docs/manual/core/transactions/). |
| ordered | boolean | Optional. If true, documents are inserted in order, and on the first error MongoDB stops processing the rest. If false, MongoDB performs unordered inserts and continues after errors. Default in 2.6+ is true. |
Example: insert a test record into the comment collection.
Tips:
- If comment does not exist, it will be created implicitly.
- Numbers in the Mongo shell default to double; to store an integer you must use NumberInt(<int>), otherwise you may encounter issues when reading it back.
- Use new Date() to insert the current timestamp.
- If _id is omitted, MongoDB generates a primary key automatically.
- Fields without a value can be set to null or simply omitted.
After execution you will see a success message indicating one document was inserted.
Important notes:
- Key/value pairs in a document are ordered.
- Values can be strings, numbers, other data types, or even whole embedded documents.
- MongoDB is case‑sensitive and type‑sensitive.
- Duplicate keys are not allowed within a document.
- Keys are strings; except for a few special cases, any UTF‑8 character may be used.
Key naming rules:
- Keys cannot contain the null character \0, which marks the end of a key.
- . and $ have special meanings and may only be used in specific contexts.
- Keys that start with an underscore (_) are reserved (though not strictly enforced).
Bulk Insert
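Syntax: db.collection.insertMany([ <document 1>, <document 2>, ... ], { writeConcern: <document>, ordered: <boolean> })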
Parameters:
| Parameter | Type | Description |
|---|---|---|
| document | document | Document or array of documents to insert (JSON format). |
| writeConcern | document | Optional. A document expressing the [write concern](https://www.mongodb.com/docs/manual/reference/write-concern/). Omit to use the default. Do not set explicitly inside a transaction; see [Transactions and Write Concern](https://www.mongodb.com/docs/manual/core/transactions/). |
| ordered | boolean | Optional. Determines whether inserts are ordered (true) or unordered (false). Default is true. |
Example:
Tips:
- If you specify _id, that value becomes the _id.
- If a single document fails, an ordered bulk operation (the default) stops at that document, but already‑inserted documents are not rolled back.
- Because bulk inserts involve many records and failures are more likely, you may want to wrap the operation in a try…catch for error handling; during testing you can ignore it.
(example omitted)
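The difference between ordered and unordered inserts, and the reason for the try…catch advice, can be sketched with a plain‑JavaScript simulation. Nothing here calls MongoDB: simulateInsertMany is a hypothetical function, a Map stands in for the collection, and a duplicate _id is the only failure it models.

```javascript
// Hypothetical in-memory simulation of ordered vs. unordered bulk inserts.
function simulateInsertMany(collection, docs, { ordered = true } = {}) {
  const result = { insertedIds: [], writeErrors: [] };
  for (const doc of docs) {
    if (collection.has(doc._id)) {
      result.writeErrors.push({ _id: doc._id, message: "duplicate _id" });
      if (ordered) break; // ordered: stop at the first error
      continue;           // unordered: skip the failed document and go on
    }
    collection.set(doc._id, doc);
    result.insertedIds.push(doc._id);
  }
  // Documents inserted before the error stay in place (no rollback).
  if (result.writeErrors.length > 0) {
    throw Object.assign(new Error("bulk insert failed"), { result });
  }
  return result;
}

const docs = [{ _id: "1" }, { _id: "1" }, { _id: "2" }];

const orderedTarget = new Map();
try {
  simulateInsertMany(orderedTarget, docs);
} catch (err) {
  console.log(err.result.insertedIds); // [ '1' ]
}

const unorderedTarget = new Map();
try {
  simulateInsertMany(unorderedTarget, docs, { ordered: false });
} catch (err) {
  console.log(err.result.insertedIds); // [ '1', '2' ]
}
```

Catching the error lets the caller inspect which documents were written and which failed instead of aborting the whole script.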
Deleting Documents
Syntax:
(example omitted)
Example:
(example omitted)
Caution: Passing an empty condition, as in db.comment.remove({}), deletes all documents in a collection; use it carefully.
Updating Documents
Syntax:
- Parameters: focus on the first four arguments.
Example:
Full replacement:
To change the document with _id = 1 and set its likenum to 1001:
db.collection.update( { _id: 1 }, { likenum: 1001 } )
After execution you will see that all fields other than _id and likenum have disappeared (the document was replaced).
Partial update:
To modify only certain fields, use the update operator $set. For example, to update the document with _id = 2, setting viewnum to 889:
db.collection.update( { _id: 2 }, { $set: { viewnum: 889 } } )
Bulk update:
Update the nickname of all users whose userid is 1003 to “Caesar the Great”:
db.users.update(
  { userid: 1003 },
  { $set: { nickname: "Caesar the Great" } },
  { multi: true } // without this, only the first matching document is updated
)
Incrementing a field:
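The update styles in this section (full replacement, $set, and the multi option), together with the $inc form from the command table, can be mimicked on an in‑memory array. The sketch below is a plain‑JavaScript simulation, not MongoDB's implementation; applyUpdate, simulateUpdate, and the sample data are hypothetical, and only equality conditions, $set, and $inc are handled.

```javascript
// Hypothetical: apply one update document to one stored document.
function applyUpdate(doc, update) {
  const usesOperators = Object.keys(update).some((key) => key.startsWith("$"));
  if (!usesOperators) {
    // Full replacement: only _id survives from the old document.
    return { ...update, _id: doc._id };
  }
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;                        // $set overwrites one field
  }
  for (const [field, step] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + step;  // $inc adds the step value
  }
  return result;
}

// Hypothetical: returns a new array; without multi, only the first match changes.
function simulateUpdate(collection, query, update, { multi = false } = {}) {
  let updated = 0;
  return collection.map((doc) => {
    const isMatch = Object.entries(query).every(
      ([field, value]) => doc[field] === value
    );
    if (!isMatch || (!multi && updated > 0)) return doc;
    updated += 1;
    return applyUpdate(doc, update);
  });
}

// Illustrative sample data.
const comments = [
  { _id: 1, userid: 1003, likenum: 10, viewnum: 5 },
  { _id: 2, userid: 1003, likenum: 20, viewnum: 7 },
];

console.log(simulateUpdate(comments, { _id: 1 }, { likenum: 1001 })[0]);
// { likenum: 1001, _id: 1 }
console.log(simulateUpdate(comments, { _id: 2 }, { $set: { viewnum: 889 } })[1].viewnum); // 889
console.log(simulateUpdate(comments, { _id: 1 }, { $inc: { likenum: 1 } })[0].likenum);   // 11
```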
(content truncated)
Originally written by Li Wei (李唯_) and published in Chinese on 后端技术栈全书 (Full-Stack Backend Engineering). Translated and adapted for DriftSeas with permission.