Detailed explanation of MongoDB’s compact operation

Time:2021-6-25

Abstract: The compact operation involves several steps, but it can effectively reduce disk usage.

MongoDB and disk

As Fundebug processes more and more data, MongoDB's disk usage keeps growing, and growing faster. I therefore began deleting expired data regularly and optimizing our algorithms to reduce redundant data. But I found that simply deleting documents does not reduce MongoDB's disk usage. Why? The official documentation explains.

For the WiredTiger storage engine (the default since MongoDB 3.2), see "How do I reclaim disk space in WiredTiger?":

The WiredTiger storage engine maintains lists of empty records in data files as it deletes documents. This space can be reused by WiredTiger, but will not be returned to the operating system unless under very specific circumstances.

In other words, the disk space occupied by deleted documents stays reserved by MongoDB and is not released. The same is true for the MMAPv1 storage engine used by older versions of MongoDB. There is nothing wrong with this: the database keeps storing new documents, and they can reuse the space that was reserved earlier.

However, if you delete many documents and need MongoDB to free disk space, what should you do? As described in the documentation, for the WiredTiger storage engine we can use the compact operation.

To allow the WiredTiger storage engine to release this empty space to the operating system, you can de-fragment your data file. This can be achieved using the compact command.
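
Before compacting, it can help to check how much reusable space a collection is actually holding. The following is a minimal sketch, assuming the collection runs on WiredTiger and that its stats document exposes the "file bytes available for reuse" counter under wiredTiger["block-manager"]. The helper name reclaimableSpace is hypothetical and not part of MongoDB.

```javascript
// Sketch: estimate how much space compact could hand back to the OS.
// `stats` is expected to look like the output of db.<collection>.stats()
// on a WiredTiger collection. `reclaimableSpace` is a hypothetical helper.
function reclaimableSpace(stats) {
  var blockManager = (stats.wiredTiger && stats.wiredTiger["block-manager"]) || {};
  var reusable = blockManager["file bytes available for reuse"] || 0;
  var storage = stats.storageSize || 0;
  return {
    reusableBytes: reusable,   // free space inside the data file
    storageBytes: storage,     // size of the collection's data file
    reusableFraction: storage > 0 ? reusable / storage : 0
  };
}

// In the mongo shell (the `events` collection name comes from the article):
//   printjson(reclaimableSpace(db.events.stats()));
```

If the reusable fraction is small, compacting is unlikely to free much space and may not be worth the blocked reads and writes.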

About the compact operation

The compact operation defragments the data files and frees the unneeded space.

Rewrites and defragments all data and indexes in a collection. On WiredTiger databases, this command will release unneeded disk space to the operating system.

For the compact operation, I’ve listed a few simple Q&As.

  • Does compact block database reads and writes? Yes! So do not run the compact operation during peak hours; for a replica set, run it on each node in turn.
  • Can compact free disk space? For WiredTiger, yes; for the MMAPv1 storage engine, however, the extra disk space stays reserved for MongoDB.
  • Does the compact operation take up extra disk space? From what I observed, essentially no.
  • What should paddingFactor be set to? I set it to 1.1, which leaves some extra space for each document and improves update performance. Adjust the value to your needs. Note that the MongoDB documentation describes paddingFactor as applying to the MMAPv1 storage engine only.
  • How long does the compact operation take? Compacting a 400 GB replica set node took me less than an hour, so the time should depend on the amount of data.
  • What is the effect of the compact operation? Disk usage dropped by nearly 50%; the gain should depend on how many documents were deleted.

Compact operation steps

Because the compact operation blocks MongoDB reads and writes, each node should be handled in turn. In addition, the standard maintenance procedure for a MongoDB replica set is to temporarily restart the secondary node on a different port as a standalone instance, so that it is completely isolated from the replica set while you work on it.

Our MongoDB cluster at Fundebug runs in Docker, so the steps are a little simpler; they may serve as a reference for you.
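
The node-by-node order can be derived from the replica set status. The following is a minimal sketch, assuming the members array has the shape returned by rs.status(), with name and stateStr fields. The helper name maintenanceOrder is hypothetical.

```javascript
// Sketch: given the `members` array from rs.status(), return host names in
// maintenance order: every secondary first, the primary last.
// `maintenanceOrder` is a hypothetical helper, not a MongoDB API.
function maintenanceOrder(members) {
  var secondaries = [];
  var primaries = [];
  members.forEach(function (m) {
    if (m.stateStr === "PRIMARY") {
      primaries.push(m.name);
    } else if (m.stateStr === "SECONDARY") {
      secondaries.push(m.name);
    }
    // Members in any other state (for example arbiters) are skipped here.
  });
  return secondaries.concat(primaries);
}

// In the mongo shell:
//   maintenanceOrder(rs.status().members);
```

Putting the primary last means it is stepped down only after all the other nodes have been compacted.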

Secondary node

  • Stop the MongoDB container
sudo docker stop mongo
  • Start a separate temporary MongoDB container
sudo docker run -it -d -p 37017:27017 -v /data/db:/data/db --name mongo_tmp mongo:3.2
  • Execute the compact command
mongo 127.0.0.1:37017
db.runCommand( { compact: 'events', paddingFactor: 1.1 } )
  • Remove the temporary container and restart the MongoDB node
sudo docker rm -f mongo_tmp
sudo docker start mongo
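
The compact command works on one collection at a time, so a node with several large collections needs one call per collection. The following is a minimal sketch of such a loop, assuming db is the mongo shell database handle for the database that holds the collections. The helper name compactCollections is hypothetical, and any collection name other than events would be your own.

```javascript
// Sketch: run compact on each named collection and record whether it succeeded.
// `db` is the mongo shell database object (anything with a runCommand method);
// `compactCollections` is a hypothetical helper, not a MongoDB API.
function compactCollections(db, names) {
  return names.map(function (name) {
    var res = db.runCommand({ compact: name });
    return { collection: name, ok: res.ok === 1 };
  });
}

// In the mongo shell, connected to the temporary standalone instance:
//   printjson(compactCollections(db, ["events"]));
```

Each call blocks until the collection has been compacted, so the collections are processed strictly one after another.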

Primary node

  • Step the primary node down to a secondary
rs.stepDown()
  • Then follow the secondary node steps above
