Why MongoDB uses B-tree



MongoDB is a general-purpose, document-oriented, distributed database; that is how the official introduction describes it. Unlike traditional relational databases such as MySQL, Oracle, and SQL Server, one of MongoDB's most important characteristics is that it is "document oriented". Because it stores data differently, the interface it exposes is no longer the familiar SQL, so it is classified as NoSQL. NoSQL is defined relative to SQL, and many storage systems we are familiar with fall into this category, such as Redis, DynamoDB, and Elasticsearch.

NoSQL is often read as "non-SQL" or "non-relational", but it is also read as "not only SQL". Digging into the meaning and origin of the word is not very useful, since this second reading is mostly used for marketing. We only need to know that MongoDB stores data in a completely different way from a traditional relational database.

The architecture of MongoDB is very similar to that of MySQL. Both use pluggable storage engines at the bottom layer to meet different needs, so users can choose a storage engine according to the characteristics of their data. The latest version of MongoDB uses WiredTiger as the default storage engine.

As MongoDB's default storage engine, WiredTiger uses a B-tree as the underlying data structure of its indexes. In addition to the B-tree, it also supports the LSM tree (log-structured merge tree) as an optional underlying storage structure. In MongoDB, a collection based on an LSM tree can be created with the following storage engine configuration:


      // "comments" is a placeholder collection name
      db.createCollection("comments", { storageEngine: { wiredTiger: { configString: "type=lsm" } } })

This article will not only explain why MongoDB's default storage engine, WiredTiger, chose the B-tree over the B+ tree, but also compare the performance and application scenarios of the B-tree and the LSM tree, to help readers understand today's question more comprehensively.


Since we want to compare the B-tree with two other data structures, we will explain in two sections why the B+ tree and the LSM tree did not become WiredTiger's default data structure:

As a non-relational database, MongoDB has less need to traverse data than a relational database does; what it pursues is the performance of reading and writing a single record.

Most OLTP databases face read-heavy, write-light workloads, and in that scenario the B-tree has a greater advantage than the LSM tree.

MongoDB has to face and handle both of these scenarios, so we will compare the data structures in these two common scenarios.

Non-relational

As mentioned several times above, MongoDB is a non-relational document database. Having completely abandoned the relational model, it has a lot of freedom in design and implementation: it no longer needs to follow the conventions of SQL and relational databases, and it can optimize for specific scenarios more freely. In the scenarios MongoDB assumes, traversing data is not a common requirement.

MySQL uses a B+ tree because only the leaf nodes of a B+ tree store data, and connecting the leaf nodes with pointers makes sequential traversal possible. Traversing data is very common in relational databases, so this choice is a sound one.

The ultimate goal of both MongoDB and MySQL in choosing among data structures is to reduce the number of random I/O operations a query requires. MySQL considers queries that traverse data to be common, so it chose the B+ tree as the underlying data structure and gave up the ability to store data in non-leaf nodes. MongoDB, however, faces a different problem:

Although queries that traverse data are relatively common, MongoDB considers queries for a single record to be far more common. Because the non-leaf nodes of a B-tree can also store data, the average number of random I/O operations needed to find one record is lower than with a B+ tree, so MongoDB with a B-tree will be faster than MySQL in similar scenarios. This does not mean that MongoDB cannot traverse data. We can still use a range query in MongoDB to fetch a batch of records that match a condition, but it will take longer than in MySQL.
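
The single-record argument can be made concrete with a little arithmetic. The sketch below is an idealized model and not WiredTiger's actual page layout: it assumes a complete tree in which every node has the same number of children, and it counts one node read per level visited. The function name `averageNodeVisits` and its parameters are made up for this illustration.

```javascript
// Idealized model (an assumption, not a real storage engine layout):
// every node has `fanout` children and `fanout - 1` keys, the tree has
// `height` levels, and the root is level 1.
function averageNodeVisits(fanout, height) {
  let totalKeys = 0;
  let weightedVisits = 0;
  for (let level = 1; level <= height; level++) {
    // Number of keys a B-tree stores at this level.
    const keysAtLevel = (fanout - 1) * Math.pow(fanout, level - 1);
    totalKeys += keysAtLevel;
    // Finding a key stored at `level` costs `level` node reads.
    weightedVisits += keysAtLevel * level;
  }
  return {
    bTree: weightedVisits / totalKeys, // a lookup may stop at an inner node
    bPlusTree: height,                 // a lookup always descends to a leaf
  };
}

const visits = averageNodeVisits(4, 3);
console.log(visits.bTree.toFixed(2), visits.bPlusTree); // prints "2.71 3"
```

In this model the B-tree average is always below the tree height, which is what the paragraph above claims. The gap narrows as the fanout grows, because most keys then sit in the leaf level.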

SELECT * FROM comments WHERE created_at > '2019-01-01'

When people think of queries that traverse data, they often think of a range query like the one shown above. In relational databases, however, the more common case is SQL like the following, which queries by a foreign key or fetches all records where a field equals a given value:

SELECT * FROM comments WHERE post_id = 1

The query above is not a range query, because it does not use operators such as > or <, but it still reads a series of records from the comments table. If the comments table has an index on post_id, the query traverses the matching entries in that index to find the comments that satisfy the condition. This query also benefits from the linked leaf nodes of MySQL's B+ tree, because they reduce the number of random disk I/O operations.
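
To see why linked leaves help an equality query on post_id, here is a minimal sketch of the leaf level of a B+ tree index. It is an in-memory illustration and not MySQL's implementation: the helpers `linkLeaves` and `scanEqual`, the leaf contents, and the assumption that the root-to-leaf descent has already found the starting leaf are all hypothetical.

```javascript
// Hypothetical leaf level of a B+ tree index on comments.post_id.
// Each leaf holds sorted [postId, commentId] entries and a `next` pointer.
function linkLeaves(leafEntries) {
  const leaves = leafEntries.map((entries) => ({ entries, next: null }));
  for (let i = 0; i + 1 < leaves.length; i++) {
    leaves[i].next = leaves[i + 1];
  }
  return leaves;
}

// Collect all comment ids for `postId`, starting at the leaf that the
// root-to-leaf descent would have located (passed in as `startLeaf`).
function scanEqual(startLeaf, postId) {
  const commentIds = [];
  let leavesRead = 0;
  for (let leaf = startLeaf; leaf !== null; leaf = leaf.next) {
    leavesRead++;
    for (const [key, commentId] of leaf.entries) {
      if (key === postId) commentIds.push(commentId);
      // Entries are sorted, so the first larger key ends the scan.
      if (key > postId) return { commentIds, leavesRead };
    }
  }
  return { commentIds, leavesRead };
}

const leaves = linkLeaves([
  [[1, 101], [1, 102]],
  [[1, 103], [2, 201]],
  [[3, 301], [3, 302]],
]);
const result = scanEqual(leaves[0], 1);
console.log(result.commentIds, result.leavesRead); // [ 101, 102, 103 ] 2
```

After the initial descent, the scan only follows `next` pointers and never goes back up through inner nodes, so matching entries that span several leaves are read in one left-to-right pass.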

As a non-relational database, MongoDB takes a completely different approach to designing collections. If we still apply the table design thinking of traditional relational databases to MongoDB collections, a query like the one above performs relatively poorly:

db.comments.find( { post_id: 1 } )

Because every node of a B-tree can store data, and there is no good way to link successive nodes with pointers, the query above performs much worse on a B-tree than on a B+ tree. However, this is not the design that MongoDB recommends. The more appropriate approach is to use embedded documents and store a post together with all of its comments:

     {
         "_id": "...",
         "title": "why mongodb uses B-tree",
         "author": "draven",
         "comments": [
             {
                 "_id": "...",
                 "content": "you can't write this well"
             },
             {
                 "_id": "...",
                 "content": "the first floor is right"
             }
         ]
     }

With this design, a query like db.comments.find({ post_id: 1 }) is no longer needed: we only have to fetch the post to get all of its comments. This design approach differs from that of traditional relational databases and requires every developer who uses MongoDB to rethink their data model. It is also the biggest reason why many people use MongoDB and find that it performs worse than MySQL: they are using it the wrong way.
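
The difference between the two layouts can be sketched without a database. The code below uses plain in-memory structures as stand-ins for collections; no MongoDB server or driver is involved. The sample data, the function names, and the `recordsFetched` counter are assumptions made for this illustration.

```javascript
// In-memory stand-ins for two hypothetical collections.
const comments = [
  { _id: "c1", post_id: 1, content: "you can't write this well" },
  { _id: "c2", post_id: 1, content: "the first floor is right" },
  { _id: "c3", post_id: 2, content: "unrelated" },
];

const posts = new Map([
  [1, {
    _id: 1,
    title: "why mongodb uses B-tree",
    comments: [
      { _id: "c1", content: "you can't write this well" },
      { _id: "c2", content: "the first floor is right" },
    ],
  }],
]);

// Relational-style layout: one record is fetched per matching comment.
function commentsByReference(postId) {
  const matches = comments.filter((c) => c.post_id === postId);
  return {
    contents: matches.map((c) => c.content),
    recordsFetched: matches.length,
  };
}

// Embedded layout: a single post document carries all of its comments.
function commentsByEmbedding(postId) {
  const post = posts.get(postId);
  const contents = post ? post.comments.map((c) => c.content) : [];
  return { contents, recordsFetched: post ? 1 : 0 };
}

console.log(commentsByReference(1).recordsFetched); // 2
console.log(commentsByEmbedding(1).recordsFetched); // 1
```

Both functions return the same comment contents; the embedded layout gets them from a single document lookup, which is the access pattern a B-tree handles well.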

Some readers may wonder at this point: since MongoDB considers single-record queries far more common than data traversal, why not use a hash as the underlying data structure?

With a hash, a single-record query has O(1) complexity, but traversing data has O(n) complexity. With a B+ tree, a single-record query has O(log n) complexity, and traversing data has O(log n) + X complexity, where X is the number of records traversed. One of these two data structures offers the best single-record query performance and the other the best traversal performance, but neither fits the scenario MongoDB faces: single-record queries are very common, yet data traversal also needs reasonably good performance. A hash, with its extreme performance profile, can only be used in simple and extreme scenarios.
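
The complexity comparison above can be illustrated by counting how many entries each kind of index has to examine for a range query. This is a sketch under simplifying assumptions: a JavaScript `Map` stands in for the hash index, and a sorted array with binary search stands in for an ordered tree index. The function names and the `examined` counter are hypothetical.

```javascript
// Hash index stand-in: point lookups are O(1), but a range query has to
// examine every entry because the keys are not ordered.
function hashRange(map, low, high) {
  let examined = 0;
  const keys = [];
  for (const key of map.keys()) {
    examined++;
    if (key >= low && key <= high) keys.push(key);
  }
  return { keys: keys.sort((a, b) => a - b), examined };
}

// Ordered index stand-in (a sorted array instead of a real tree):
// binary search finds the start in O(log n) steps, then the scan reads
// only the matching entries plus the one that ends the range.
function sortedRange(sortedKeys, low, high) {
  let lo = 0;
  let hi = sortedKeys.length;
  let examined = 0;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    if (sortedKeys[mid] < low) lo = mid + 1;
    else hi = mid;
  }
  const keys = [];
  for (let i = lo; i < sortedKeys.length; i++) {
    examined++;
    if (sortedKeys[i] > high) break;
    keys.push(sortedKeys[i]);
  }
  return { keys, examined };
}

const allKeys = Array.from({ length: 1000 }, (_, i) => i);
const hashIndex = new Map(allKeys.map((k) => [k, `record-${k}`]));

console.log(hashIndex.get(12));                      // single-record lookup
console.log(hashRange(hashIndex, 10, 14).examined);  // examines all 1000 entries
console.log(sortedRange(allKeys, 10, 14).examined);  // examines far fewer
```

Both functions return the same five keys, but the hash stand-in touches every entry while the ordered stand-in touches only a logarithmic number of entries plus the matching range.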

More reads, fewer writes

The LSM tree is a disk-based data structure. Its main purpose is to provide low-cost indexing for files that sustain a high rate of writes over a long period. With either a B-tree or a B+ tree, writing a record into the index file requires random disk writes. The LSM tree's optimization is to sacrifice some read performance and turn random writes into sequential writes.
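
The trade-off described above can be sketched with a toy LSM tree. This is a deliberately minimal in-memory model and not WiredTiger's implementation: the class name `TinyLsm`, the `memtableLimit` parameter, and the `placesChecked` counter are made up for illustration, and real LSM trees add pieces such as compaction that are omitted here.

```javascript
// Toy LSM tree: writes go to an in-memory table and are flushed as whole
// sorted runs (appended, never updated in place). Reads may have to check
// the memtable and then every run, newest first.
class TinyLsm {
  constructor(memtableLimit) {
    this.memtableLimit = memtableLimit;
    this.memtable = new Map();
    this.runs = []; // each run: array of [key, value] sorted by key
  }

  put(key, value) {
    this.memtable.set(key, value);
    if (this.memtable.size >= this.memtableLimit) this.flush();
  }

  flush() {
    const run = [...this.memtable.entries()].sort(([a], [b]) =>
      a < b ? -1 : a > b ? 1 : 0
    );
    this.runs.push(run); // stands in for one sequential write of the run
    this.memtable = new Map();
  }

  get(key) {
    let placesChecked = 1; // the memtable
    if (this.memtable.has(key)) {
      return { value: this.memtable.get(key), placesChecked };
    }
    for (let i = this.runs.length - 1; i >= 0; i--) {
      placesChecked++;
      // Linear search for brevity; a sorted run could be binary searched.
      const hit = this.runs[i].find(([k]) => k === key);
      if (hit) return { value: hit[1], placesChecked };
    }
    return { value: undefined, placesChecked };
  }
}

const lsm = new TinyLsm(2);
lsm.put("a", 1);
lsm.put("b", 2); // memtable is full: flushed as run 0
lsm.put("a", 3);
lsm.put("c", 4); // flushed as run 1
lsm.put("d", 5); // stays in the memtable

console.log(lsm.get("d")); // { value: 5, placesChecked: 1 }
console.log(lsm.get("a")); // { value: 3, placesChecked: 2 }
console.log(lsm.get("b")); // { value: 2, placesChecked: 3 }
```

Writes never modify an existing run, which is where the sequential-write advantage comes from. The cost shows up in `get`: an older key such as "b" is found only after the memtable and every newer run have been checked.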

This article will not go into detail about why the LSM tree has better write performance; we only analyze why WiredTiger uses the B-tree as its default data structure. WiredTiger benchmarked the read and write throughput of the LSM tree and the B-tree, and the results show the following:

Without a limit on writes:

The write throughput of the LSM tree is 1.5 to 2 times that of the B-tree;

The read throughput of the LSM tree is 1/6 to 1/3 that of the B-tree.

With a limit on writes:

The write throughput of the LSM tree is roughly the same as that of the B-tree;

The read throughput of the LSM tree is 1/4 to 1/2 that of the B-tree.

In the write-limited case, 30,000 records are written per second. These results show that in both cases the read performance of the B-tree is much better than that of the LSM tree. For most OLTP systems, queries outnumber writes many times over, so the LSM tree's excellent write performance is not enough to make it MongoDB's default data structure.
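
A rough back-of-the-envelope model shows how the read-to-write mix decides the outcome. The sketch below is an assumption-laden simplification and not part of the benchmark: it takes the fraction of a B-tree's running time that is spent on reads, assumes the throughput ratios quoted above apply independently to the read part and the write part, and returns the LSM tree's running time relative to the B-tree. The function name and the example fractions are hypothetical.

```javascript
// r: fraction of the B-tree's total running time spent on reads (0..1).
// readRatio / writeRatio: LSM throughput relative to the B-tree.
// Returns the LSM running time relative to the B-tree (B-tree = 1).
function lsmRelativeTime(r, readRatio, writeRatio) {
  return r / readRatio + (1 - r) / writeRatio;
}

// Most favorable figures for the LSM tree from the unrestricted-write case:
// reads at 1/3 of the B-tree's throughput, writes at 2 times.
const readRatio = 1 / 3;
const writeRatio = 2;

// Read-heavy mix: the LSM tree takes about 2.5 times as long.
console.log(lsmRelativeTime(0.8, readRatio, writeRatio));
// Write-heavy mix: the LSM tree takes about 0.75 times as long.
console.log(lsmRelativeTime(0.1, readRatio, writeRatio));
```

Under these assumptions the break-even point is r = 0.2: the LSM tree only comes out ahead when reads account for less than a fifth of the B-tree's running time, which is not the typical OLTP mix described above.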


Application scenarios are always the first thing to consider in system design. As a NoSQL database, MongoDB targets scenarios quite different from those of earlier databases. Let's briefly summarize the two reasons why MongoDB ultimately chose the B-tree:

MySQL uses a B+ tree because data traversal is very common in relational databases, which often need to handle relationships between tables and query data by range. As a document-oriented database, however, MongoDB cares more about document-centered organization than about relationships between data, so it chose the B-tree, which performs better when querying a single document while still keeping latency acceptable for queries that traverse data.

The LSM tree is a data structure designed specifically to optimize writes. It turns random writes into sequential writes, which significantly improves write performance at the cost of read efficiency. This does not match what most scenarios require, so MongoDB ultimately chose the B-tree, with its better read performance, as the default data structure.