Various types of indexes in MongoDB

Time:2021-11-29

In the previous article, we introduced the basic index operations in MongoDB, such as creating, viewing, and deleting indexes. However, that article covered only one type of index. In this article, let's take a look at the other types.

This is the tenth article in the MongoDB series. Reading the previous articles will help you better understand this one:


1. Installing MongoDB on Linux
2. MongoDB basic operations
3. MongoDB data types
4. MongoDB document update operations
5. MongoDB document query operations (I)
6. MongoDB document query operations (II)
7. MongoDB document query operations (III)
8. Viewing the execution plan in MongoDB
9. Getting to know indexes in MongoDB


_id index

As described earlier, when we add a document to a collection, MongoDB creates a field named _id for it, and this field is indexed. Ordinary collections get this _id index by default, but some collections do not index _id by default, such as capped collections in MongoDB versions before 2.2. Capped collections will be discussed in detail in a later article.

Compound index

If a query has multiple conditions, we can build a single index over all the fields involved. For example, we can index both the x and y fields of a document, as follows:

db.sang_collect.createIndex({x:1,y:-1})

This index will be used when executing a query such as the following:

db.sang_collect.find({x:1,y:999})

You can also check the query plan to confirm that the index created above is indeed used.
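To make the key directions concrete, here is a minimal JavaScript sketch of the order in which the entries of an index on {x:1,y:-1} are arranged: ascending by x, and descending by y among equal x values. The function name compareXY and the sample values are made up for illustration. The sketch models only the ordering, not how MongoDB actually stores the index.

```javascript
// Hypothetical comparator mirroring the key pattern {x: 1, y: -1}.
function compareXY(a, b) {
  if (a.x !== b.x) return a.x - b.x; // x: 1  -> ascending
  return b.y - a.y;                  // y: -1 -> descending
}

// Made-up sample values, not taken from the article's collection.
const entries = [
  { x: 2, y: 5 },
  { x: 1, y: 3 },
  { x: 1, y: 999 },
  { x: 2, y: 7 },
];

// Copy before sorting so the original array is left untouched.
const ordered = entries.slice().sort(compareXY);
console.log(ordered);
```

A query on both fields, such as {x:1,y:999}, can then be located directly in this ordering instead of scanning every document.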

Expiring index (TTL index)

As the name suggests, an expiring index (MongoDB calls it a TTL index) is an index whose entries expire. When an entry expires, the corresponding document is deleted from the collection. It is created as follows:

db.sang_collect.createIndex({time:1},{expireAfterSeconds:30})

expireAfterSeconds is the lifetime of a document in seconds, counted from the date stored in the indexed field. time is the indexed field. Its value must be a date (ISODate) or an array of dates; otherwise the document never expires and will not be deleted.
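The expiry rule can be sketched in plain JavaScript. This is a simplified model, not MongoDB's implementation: the helper name isExpired and the sample dates are assumptions made for illustration. For an array of dates, MongoDB uses the earliest one, which the sketch mirrors.

```javascript
// Hypothetical helper: true when a document would be eligible for removal
// by a TTL index on `field` with the given expireAfterSeconds.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  // Accept a single Date or an array of Dates; ignore anything else.
  const dates = (Array.isArray(value) ? value : [value]).filter(v => v instanceof Date);
  if (dates.length === 0) return false; // no date value: the document never expires
  const earliest = Math.min(...dates.map(d => d.getTime()));
  return now.getTime() >= earliest + expireAfterSeconds * 1000;
}

// Made-up sample data: "now" is one minute past midnight.
const now = new Date("2021-11-29T00:01:00Z");
const oldDoc = { time: new Date("2021-11-29T00:00:00Z") };   // 60 s old
const freshDoc = { time: new Date("2021-11-29T00:00:45Z") }; // 15 s old
const stringDoc = { time: "2021-11-29T00:00:00Z" };          // a string, not a date

console.log(isExpired(oldDoc, "time", 30, now));
console.log(isExpired(freshDoc, "time", 30, now));
console.log(isExpired(stringDoc, "time", 30, now));
```

In a real deployment the deletion is carried out by a background task that runs about once a minute, so an expired document may stay visible for a short while.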

Full-text index

Although the full-text index is easy to use, it unfortunately does not support Chinese. Here is a brief introduction.

For example, my dataset is as follows:

{
    "_id" : ObjectId("59f5a3da1f9e8e181ffc3189"),
    "x" : "Java C# Python PHP"
}
{
    "_id" : ObjectId("59f5a3da1f9e8e181ffc318a"),
    "x" : "Java C#"
}
{
    "_id" : ObjectId("59f5a3da1f9e8e181ffc318b"),
    "x" : "Java Python"
}
{
    "_id" : ObjectId("59f5a3da1f9e8e181ffc318c"),
    "x" : "PHP Python"
}
{
    "_id" : ObjectId("59f5a4541f9e8e181ffc318d"),
    "x" : "C C++"
}

We can create a full-text index on the x field as follows:

db.sang_collect.createIndex({x:"text"})

MongoDB automatically splits the text in the x field into words, and then we can query it with the following statement:

db.sang_collect.find({$text:{$search:"Java"}})

This returns all documents whose x field contains Java. To query documents that contain the exact phrase Java C#, do the following:

db.sang_collect.find({$text:{$search:"\"Java C#\""}})

Enclosing the search text in a pair of escaped double quotation marks turns it into a phrase search. To query documents that contain PHP or Python, do the following:

db.sang_collect.find({$text:{$search:"PHP Python"}})  

To query documents that contain PHP or Python but do not contain Java, prefix the excluded word with a minus sign:

db.sang_collect.find({$text:{$search:"PHP Python -Java"}})
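The three forms of search string used above (plain terms, a quoted phrase, and a term negated with -) can be illustrated with a toy matcher in JavaScript. This is only a rough model under stated assumptions: it splits on whitespace and ignores case, while the real $text operator also applies stemming and stop words. The function names and the numeric _id values are made up for illustration.

```javascript
// The article's dataset, with _id simplified to small numbers.
const docs = [
  { _id: 1, x: "Java C# Python PHP" },
  { _id: 2, x: "Java C#" },
  { _id: 3, x: "Java Python" },
  { _id: 4, x: "PHP Python" },
  { _id: 5, x: "C C++" },
];

// Split a search string into plain terms, quoted phrases and negated terms.
function parseSearch(search) {
  const phrases = [];
  const rest = search.replace(/"([^"]+)"/g, (_, phrase) => {
    phrases.push(phrase.toLowerCase());
    return " ";
  });
  const terms = [];
  const negated = [];
  for (const token of rest.split(/\s+/).filter(Boolean)) {
    if (token.length > 1 && token.startsWith("-")) negated.push(token.slice(1).toLowerCase());
    else terms.push(token.toLowerCase());
  }
  return { terms, phrases, negated };
}

// Toy rule: a negated term excludes the text, a phrase must appear verbatim,
// and otherwise any one of the plain terms is enough (logical OR).
function textMatch(text, search) {
  const { terms, phrases, negated } = parseSearch(search);
  const lower = text.toLowerCase();
  const words = lower.split(/\s+/);
  if (negated.some(t => words.includes(t))) return false;
  if (phrases.length > 0) return phrases.every(p => lower.includes(p));
  return terms.some(t => words.includes(t));
}

// Return the _id values of the matching documents.
function search(query) {
  return docs.filter(d => textMatch(d.x, query)).map(d => d._id);
}

console.log(search("Java"));
console.log(search('"Java C#"'));
console.log(search("PHP Python"));
console.log(search("PHP Python -Java"));
```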

Once the full-text index is in place, we can also see how relevant each result is to the search by using $meta, as follows:

db.sang_collect.find({$text:{$search:"PHP Python"}},{score:{$meta:"textScore"}})

The query results now contain an additional score field. The larger its value, the more relevant the document. We can also sort the results by this score with sort, as follows:

db.sang_collect.find({$text:{$search:"PHP Python"}},{score:{$meta:"textScore"}}).sort({score:{$meta:"textScore"}})

As you can see, the full-text index is quite powerful. Unfortunately, it does not support Chinese for the time being, but there are many workarounds on the Internet that you can search for yourself.

Geospatial index

Geospatial index types

Geospatial indexes can be divided into two categories:

1. 2d index, which can be used to store and find points on a plane.
2. 2dsphere index, which can be used to store and find points on a sphere.

2d index

A 2d index is typically used for things like game maps.
Insert a document that records a point into the collection:

db.sang_collect.insert({x:[90,0]})

The inserted data has the format [longitude, latitude]. The value range is [-180, 180] for longitude and [-90, 90] for latitude. After the data is successfully inserted, we first create the index with the following command:

db.sang_collect.createIndex({x:"2d"})

Then, through $near, we can query the points near a point, as follows:

db.sang_collect.find({x:{$near:[90,0]}})

By default, the 100 points nearest to this point are returned, sorted from nearest to farthest. We can set the maximum distance of the returned points through $maxDistance:

db.sang_collect.find({x:{$near:[90,0],$maxDistance:99}})
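What $near with $maxDistance does on a flat plane can be sketched in a few lines of JavaScript: compute the straight-line distance to every point, drop the ones that are too far away, and sort the rest from nearest to farthest. The function name near and the sample points are made up for illustration. A real 2d index avoids computing the distance for every document.

```javascript
// Hypothetical planar model of $near: nearest points first, optional distance cap.
function near(points, center, maxDistance = Infinity) {
  return points
    .map(p => ({ point: p, distance: Math.hypot(p[0] - center[0], p[1] - center[1]) }))
    .filter(entry => entry.distance <= maxDistance)
    .sort((a, b) => a.distance - b.distance)
    .map(entry => entry.point);
}

// Made-up sample points in [longitude, latitude] order.
const points = [[90, 0], [0, 0], [91, 1], [-100, 50]];

console.log(near(points, [90, 0], 99));
```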

We can also query the points inside a shape through $geoWithin, for example the points inside a rectangle:

db.sang_collect.find({x:{$geoWithin:{$box:[[0,0],[91,1]]}}})

The two coordinate points are the bottom-left and top-right corners of the rectangle.

Query points in a circle:

db.sang_collect.find({x:{$geoWithin:{$center:[[0,0],90]}}})

Parameters represent the center and radius of the circle, respectively.
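The rectangle and circle tests above boil down to simple planar geometry. The following JavaScript sketch shows the idea. The function names inBox and inCenter are made up, and counting points that lie exactly on the border as inside is an assumption of this sketch.

```javascript
// Hypothetical model of $box: corners are [bottomLeft, topRight].
function inBox(point, [bottomLeft, topRight]) {
  return point[0] >= bottomLeft[0] && point[0] <= topRight[0] &&
         point[1] >= bottomLeft[1] && point[1] <= topRight[1];
}

// Hypothetical model of $center: arguments are [center, radius], flat distance.
function inCenter(point, [center, radius]) {
  return Math.hypot(point[0] - center[0], point[1] - center[1]) <= radius;
}

console.log(inBox([90, 0.5], [[0, 0], [91, 1]]));
console.log(inBox([92, 0.5], [[0, 0], [91, 1]]));
console.log(inCenter([50, 50], [[0, 0], 90]));
console.log(inCenter([80, 80], [[0, 0], 90]));
```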

Query points in polygons:

db.sang_collect.find({x:{$geoWithin:{$polygon:[[0,0],[100,0],[100,1],[0,1]]}}})

Any number of points can be listed here; they are the vertices of the polygon.
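A common way to decide whether a point lies inside a polygon is ray casting: count how many polygon edges a horizontal ray starting at the point crosses, and treat an odd count as inside. The JavaScript sketch below uses that technique on the polygon from the query above. It illustrates the idea and is not a claim about how MongoDB implements $polygon. The function name inPolygon is made up.

```javascript
// Ray casting on a flat plane: vertices is an array of [x, y] pairs.
function inPolygon(point, vertices) {
  const [px, py] = point;
  let inside = false;
  // Walk every edge (vertices[j] -> vertices[i]), including the closing edge.
  for (let i = 0, j = vertices.length - 1; i < vertices.length; j = i++) {
    const [xi, yi] = vertices[i];
    const [xj, yj] = vertices[j];
    // The edge straddles the ray's height and is crossed to the right of the point.
    const crosses = (yi > py) !== (yj > py) &&
      px < ((xj - xi) * (py - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

const polygon = [[0, 0], [100, 0], [100, 1], [0, 1]];

console.log(inPolygon([50, 0.5], polygon));
console.log(inPolygon([50, 2], polygon));
console.log(inPolygon([150, 0.5], polygon));
```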

2dsphere index

The 2dsphere index is suitable for maps on a sphere. Its data is stored in GeoJSON format, which is documented at http://geojson.org/. For example, a point is described in GeoJSON as follows:

{
    "_id" : ObjectId("59f5e0571f9e8e181ffc3196"),
    "name" : "shenzhen",
    "location" : {
        "type" : "Point",
        "coordinates" : [ 
            90.0, 
            0.0
        ]
    }
}

A line is described in GeoJSON as follows:

{
    "_id" : ObjectId("59f5e0d01f9e8e181ffc3199"),
    "name" : "shenzhen",
    "location" : {
        "type" : "LineString",
        "coordinates" : [ 
            [ 
                90.0, 
                0.0
            ], 
            [ 
                90.0, 
                1.0
            ], 
            [ 
                90.0, 
                2.0
            ]
        ]
    }
}

A polygon is described in GeoJSON as follows:

{
    "_id" : ObjectId("59f5e3f91f9e8e181ffc31d0"),
    "name" : "beijing",
    "location" : {
        "type" : "Polygon",
        "coordinates" : [ 
            [ 
                [ 
                    0.0, 
                    1.0
                ], 
                [ 
                    0.0, 
                    2.0
                ], 
                [ 
                    1.0, 
                    2.0
                ], 
                [ 
                    0.0, 
                    1.0
                ]
            ]
        ]
    }
}
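Note the shape of the polygon's coordinates: an array of rings, where each ring has at least four positions and ends on the position it started from. A small JavaScript sketch of that shape check follows. The function names are made up, and the check covers only this structural rule, not other requirements such as rings that do not cross themselves.

```javascript
// A ring is closed when it has at least 4 positions and first equals last.
function isClosedRing(ring) {
  if (!Array.isArray(ring) || ring.length < 4) return false;
  const first = ring[0];
  const last = ring[ring.length - 1];
  return first[0] === last[0] && first[1] === last[1];
}

// Hypothetical structural check for a GeoJSON Polygon geometry.
function hasPolygonShape(geometry) {
  return geometry.type === "Polygon" &&
    Array.isArray(geometry.coordinates) &&
    geometry.coordinates.length > 0 &&
    geometry.coordinates.every(isClosedRing);
}

// The location of the "beijing" document above.
const beijing = {
  type: "Polygon",
  coordinates: [[[0.0, 1.0], [0.0, 2.0], [1.0, 2.0], [0.0, 1.0]]],
};

// The same ring with the closing position removed.
const openRing = {
  type: "Polygon",
  coordinates: [[[0.0, 1.0], [0.0, 2.0], [1.0, 2.0]]],
};

console.log(hasPolygonShape(beijing));
console.log(hasPolygonShape(openRing));
```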

There are other types as well; for details, you can refer to http://geojson.org/. With the data in place, we can create a geospatial index through the following operation:

db.sang_collect.createIndex({location:"2dsphere"})

For example, to query the documents whose location intersects Shenzhen:

var shenzhen = db.sang_collect.findOne({name:"shenzhen"})
db.sang_collect.find({location:{$geoIntersects:{$geometry:shenzhen.location}}})

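This query returns every document that intersects Shenzhen (such as highways and railways passing through Shenzhen). We can also query only the documents that lie entirely inside Shenzhen (such as all schools in Shenzhen). For this containment query the region must be a GeoJSON Polygon or MultiPolygon, so the shenzhen document used here has to store a polygon: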

var shenzhen = db.sang_collect.findOne({name:"shenzhen"})
db.sang_collect.find({location:{$geoWithin:{$geometry:shenzhen.location}}})

You can also query other locations near Tencent, as follows:

var QQ = db.sang_collect.findOne({name:"QQ"})
db.sang_collect.find({location:{$near:{$geometry:QQ.location}}})

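Compound geospatial index

Location is often only one of the conditions in a query. For example, to find all schools in Shenzhen I need one more condition besides the location. The query below adds a condition on name (the value QQ simply stands in for whatever is being searched for):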

var shenzhen = db.sang_collect.findOne({name:"shenzhen"})
db.sang_collect.find({location:{$geoWithin:{$geometry:shenzhen.location}},name:"QQ"})

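Other query criteria follow the location condition in the same way. To serve such queries, the 2dsphere key can be combined with other keys in one compound index, for example {location:"2dsphere",name:1}.

Well, that's all for indexes in MongoDB. If you have any questions, please leave a message for discussion.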

Reference material:

1. MongoDB: The Definitive Guide, 2nd Edition
