MongoDB oplog

Time: 2021-10-8

At first I assumed the oplog would be something like the MySQL binlog, and it essentially is: within a replica set, the primary records its write operations in the oplog and the secondaries replay it to keep their data consistent.

Recently I ran into an accidental database drop ("drop the database and run", as the joke goes), so I experimented quite a few times with using the oplog to recover data.

I'm writing it down here for future reference.

# —————————— oplog ———————————
##1. The oplog lives in the replica set. You can view its status with the following command:
rpset1:PRIMARY> rs.printReplicationInfo()
configured oplog size: 10240MB
log length start to end: 149092secs (41.41hrs)
oplog first event time: Sun Apr 26 2020 20:25:46 GMT+0800 (CST)
oplog last event time: Tue Apr 28 2020 13:50:38 GMT+0800 (CST)
now: Tue Apr 28 2020 13:50:38 GMT+0800 (CST)

rpset1:SECONDARY> rs.printReplicationInfo()
configured oplog size: 10240MB
log length start to end: 149937secs (41.65hrs)
oplog first event time: Sun Apr 26 2020 20:10:59 GMT+0800 (CST)
oplog last event time: Tue Apr 28 2020 13:49:56 GMT+0800 (CST)
now: Tue Apr 28 2020 13:49:56 GMT+0800 (CST)

rpset1:SECONDARY> rs.printReplicationInfo()
configured oplog size: 10240MB
log length start to end: 148635secs (41.29hrs)
oplog first event time: Sun Apr 26 2020 20:32:00 GMT+0800 (CST)
oplog last event time: Tue Apr 28 2020 13:49:15 GMT+0800 (CST)
now: Tue Apr 28 2020 13:49:16 GMT+0800 (CST)

#The oplog size set in the configuration file conf/slave.conf:
replication:
  oplogSizeMB: 10240
  replSetName: rpset1

As the output above shows, the oplog of this replica set holds about 41 hours of operations, and this MongoDB instance is fully backed up every day, so that capacity is plenty.

To recover data with --oplogReplay, the official documentation says a special privilege is required.

##2. Create a dedicated role for oplog replay; it must have the anyResource/anyAction privilege
#This privilege is not needed for the backup, but it must be present during recovery; otherwise the restore fails without any error message.
use admin
db.createRole(
   {
    "role" : "sysadmin",
    "privileges" : [{ "resource" : {"anyResource" : true}, "actions" : ["anyAction"] }],
    "roles" : []
   }
)

#Create a dedicated user to use this role
db.createUser({user:"admin", pwd:"admin", roles:[{role:"sysadmin", db:"admin"}]})
#Or grant the role to an existing user: db.grantRolesToUser("root", [{role: "sysadmin", db: "admin"}])
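To double-check that the role and user were created correctly, you can verify them in the mongo shell (standard helpers, shown here as a quick check):

use admin
db.getRole("sysadmin", {showPrivileges: true})   #should list anyResource/anyAction
db.getUser("admin")                              #should show the sysadmin role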


Checking the scheduled backup command for this database, I found the following:

##3. Daily full backup
./mongodump -h 10.170.6.116:27017 -u admin -p admin --authenticationDatabase admin --gzip -o /data/tmp/rs0

#If the --oplog option is used during backup, there will be an oplog.bson file in the output directory
# ./mongodump -h 10.170.6.116:27000 -u rsroot -p abcd1234 --authenticationDatabase admin --oplog -o /data/tmp/rs0
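For reference, the daily job would normally live in crontab; the schedule and binary path below are my assumptions, not the actual production entry:

#Hypothetical crontab entry: full backup at 03:00 every day
0 3 * * * /usr/local/mongodb/bin/mongodump -h 10.170.6.116:27017 -u admin -p admin --authenticationDatabase admin --gzip -o /data/tmp/rs0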


Because the --oplog parameter was not used during the backup, the restore must be done in two steps: restore the backup first, then replay the oplog, i.e. the method in step 9 below.

Steps 4 to 8 cover the approach of replaying the oplog while restoring the backup.

##4. Suppose an accidental deletion happened at some point after the last daily backup; use oplog replay to recover the data written in that window
#First find the time point of the last daily backup (if the --oplog parameter was used during the dump there will be an oplog.bson file; if not, see step 9):
./bsondump /data/tmp/rs0/oplog.bson > /data/tmp/0
cat /data/tmp/0  
#The first line looks like {"ts": {"$timestamp": {"t": 1588138496, "i": 1}}}
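If you don't want to eyeball the cat output, the first timestamp can be pulled out with a one-liner (a sketch assuming jq is installed):

head -n 1 /data/tmp/0 | jq '.ts'   #prints the {"$timestamp": {"t": ..., "i": ...}} of the earliest entry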
#Meaning of the fields:
ts: the time the operation occurred; t is the Unix timestamp, i is the ordinal among operations in the same second
h: unique ID of the record
v: version information
op: type of write operation
   n: no-op
   c: db command
   i: insert
   u: update
   d: delete
ns: the namespace of the operation, i.e. database.collection
o: the document corresponding to the operation
o2: the where condition of an update, present only for update operations
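You can also inspect live entries directly in the mongo shell instead of dumping first; a minimal sketch (reading local.oplog.rs needs sufficient privileges):

use local
db.oplog.rs.find().sort({$natural: -1}).limit(1).pretty()   #the most recent oplog entry
db.oplog.rs.find({op: "d"}).sort({$natural: -1}).limit(5)   #the most recent delete entries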
#The start timestamp can be chosen freely; there is no need to find an exact record in the oplog. Just pick a time slightly earlier than the point you need.
./mongodump -h 192.168.6.116:27017 -u admin -p admin --authenticationDatabase admin -d local -c oplog.rs -q '{"ts":{"$gt": {"$timestamp":{"t":1588138300,"i":1}}}}' -o /data/tmp/rs1
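To convert between wall-clock time and the Unix timestamp used in "t", GNU date works in both directions (the time below is only an example):

date -d "2020-04-29 13:30:00" +%s   #wall-clock time -> Unix timestamp
date -d @1588138300                 #Unix timestamp -> wall-clock time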
##5. Export the current local/oplog.rs; note the JSON format of the -q option
#Because dumping the whole local/oplog.rs would be too large and replaying it would take too long, a start time is specified:
./mongodump -h 192.168.6.116:27017 -u admin -p admin --authenticationDatabase admin -d local -c oplog.rs -q '{"ts":{"$gt": {"$timestamp":{"t":1588138393,"i":1}}}}' -o /data/tmp/rs1
#You can also specify the end time as follows:
./mongodump -h 192.168.6.116:27017 -u rsroot -p abcd1234 --authenticationDatabase admin -d local -c oplog.rs -q '{"ts":{"$lte": {"$timestamp":{"t":1588142111,"i":1}}, "$gte": {"$timestamp":{"t":1588138393,"i":1}}}}' -o /data/tmp/rs2
#You can also use --queryFile=./n.json to put the query in a file (versions below 4.0.7 may report an error)
{"ts":{"$gte": {"$timestamp":{"t":1589042338,"i":1}}}, "ns":{"$not": {"$regex": "test.names"}}}
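With that query saved in n.json, the dump command would look like this (a sketch reusing the connection options from above; the rs3 output directory is arbitrary):

./mongodump -h 192.168.6.116:27017 -u admin -p admin --authenticationDatabase admin -d local -c oplog.rs --queryFile=./n.json -o /data/tmp/rs3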

# -q parameter examples (the first excludes specific namespaces; the second skips entries that have an lsid, i.e. operations belonging to sessions/transactions):

 -q '{"ts":{"$gte": {"$timestamp":{"t":1589342458,"i":1}}}, "ns":{"$nin":["test.tlog","config.system.sessions"]}}'

 -q '{"ts":{"$gte": {"$timestamp":{"t":1589342458,"i":1}}}, "lsid":{"$exists": false }}'


##6. Check oplog.rs.bson and manually find the timestamp of the deletion:
./bsondump /data/tmp/rs1/local/oplog.rs.bson > /data/tmp/1
#Open /data/tmp/1 and search manually. If a collection or database was dropped there will be a drop entry; if documents were deleted, look for "op": "d" entries
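grep can speed up the manual search; a small sketch (the patterns assume bsondump's compact JSON output with no spaces around colons):

grep -n '"op":"d"' /data/tmp/1 | head   #first few document deletions
grep -n 'drop' /data/tmp/1              #collection/database drop commands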


##7. Replace oplog.bson in daily full backup
rm -rf /data/tmp/rs0/oplog.bson
mv /data/tmp/rs1/local/oplog.rs.bson /data/tmp/rs0/oplog.bson


##8. Execute the recovery command (pay attention to user permissions)
./mongorestore -h 192.168.6.116:27017 -u admin -p admin --authenticationDatabase admin --oplogReplay --oplogLimit "1588232764:1" --dir /data/tmp/rs0/ 
#Here 1588232764 is the "t" in $timestamp and 1 is the "i". With this limit, the oplog is replayed only up to
#that time point, so the first delete statement and everything after it are skipped and the database stays in its pre-disaster state
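The "t:i" pair can be pulled straight from the first delete entry found in step 6 (a sketch assuming jq; adjust the grep pattern to whatever the bad operation actually was):

grep '"op":"d"' /data/tmp/1 | head -n 1 | jq -r '"\(.ts["$timestamp"].t):\(.ts["$timestamp"].i)"'   #prints something like 1588232764:1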


##9. If the daily backup was taken without --oplog and with --gzip, restore the backup first,
#then run a separate oplog replay against the standalone oplog.rs.bson file
./mongorestore -h 192.168.6.116:27017 -u admin -p admin --authenticationDatabase admin /data/tmp/rs0/ --gzip
./mongorestore -h 192.168.6.116:27017 -u admin -p admin --authenticationDatabase admin --oplogReplay --oplogLimit "1588232764:1" /data/tmp/rs1/local/oplog.rs.bson
#If the recovery fails with "applyOps field: no such field", you can only try the method in step 8 above

Don't worry about data getting mixed up: the oplog is idempotent, so even multiple replays will not produce duplicate data. If a document with the same _id already exists it will not be restored, even if its other fields differ, while documents whose _id does not exist will be restored.
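The duplicate-_id behavior is easy to see for yourself in the mongo shell (a toy sketch with a hypothetical collection):

use test
db.demo.insert({_id: 1, name: "a"})
db.demo.insert({_id: 1, name: "b"})   #duplicate key error; the existing document is kept
db.demo.insert({_id: 2, name: "b"})   #a new _id inserts normally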

Of course, you can also restore the backup and the oplog to a standalone machine first, and then move the data into production with export and import.
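A sketch of that export/import route (the hosts and the test.names namespace are placeholders):

./mongoexport -h <standalone>:27017 -d test -c names -o /data/tmp/names.json
./mongoimport -h <production>:27017 -u admin -p admin --authenticationDatabase admin -d test -c names --file /data/tmp/names.json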


When testing recovery on a standalone machine, the same command run repeatedly sometimes failed and sometimes succeeded. I don't know why; you may just have to retry a few times.

