In a concurrent environment, do you operate the database first or the cache first?

Time:2020-5-26

Preface

In a distributed system where a cache and a database coexist, should a write operation touch the database first or the cache first? Think about what could go wrong before reading on. I'll walk through several schemes.

Cache maintenance scheme I

Suppose there is one write (thread A) and one read (thread B), and the writer operates on the cache first and then on the database. In the first sequence the two threads do not overlap:

1) Thread A initiates a write operation. Step 1: delete the cache entry

2) Thread A, step 2: write the new data to the DB

3) Thread B initiates a read operation and gets a cache miss

4) Thread B reads the latest data from the DB

5) Thread B writes that data into the cache

In this order everything is fine. Now consider a second sequence, in which the read lands between the writer's two steps:


1) Thread A initiates a write operation. Step 1: delete the cache entry

2) At this point thread B initiates a read operation and gets a cache miss

3) Thread B goes on to read the DB and gets the old data

4) Thread B puts the old data into the cache

5) Thread A writes the latest data to the DB

Now there is a problem. The old data has been put into the cache, so every subsequent read returns old data, and the cache is inconsistent with the database.
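The bad interleaving above can be replayed deterministically. This is a minimal sketch, assuming plain dicts named `db` and `cache` stand in for the real database and cache, and a single key `"k"`; none of these names come from the original article.

```python
# Deterministic replay of the second sequence for scheme I.
# `db` and `cache` are hypothetical in-memory stand-ins for the real stores.
db = {"k": "old"}
cache = {"k": "old"}

# 1) Thread A: delete the cache entry before writing.
cache.pop("k", None)

# 2) Thread B: read, which is a cache miss.
value_seen_by_b = cache.get("k")

# 3) Thread B: fall back to the database, which still holds the old value.
if value_seen_by_b is None:
    value_seen_by_b = db["k"]

# 4) Thread B: populate the cache with what it read.
cache["k"] = value_seen_by_b

# 5) Thread A: finally write the new value to the database.
db["k"] = "new"

print(db["k"], cache["k"])  # new old
```

The replay ends with the database holding the new value and the cache holding the old one, which is exactly the inconsistency described.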

Cache maintenance scheme II

Two concurrent writes (thread A and thread B), each updating the cache first and then the database.


1) Thread A initiates a write operation. Step 1: set the cache

2) Thread A, step 2: write the new data to the DB

3) Thread B initiates a write operation. Step 1: set the cache

4) Thread B, step 2: write the new data to the DB

In this order there is no problem, but things do not always go as planned. Consider a second sequence:


1) Thread A initiates a write operation. Step 1: set the cache

2) Thread B initiates a write operation. Step 1: set the cache

3) Thread B writes its data to the DB

4) Thread A writes its data to the DB

After execution, the cache holds the data from write B while the database holds the data from write A, so the cache and the database are inconsistent.
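The same kind of replay shows the double-write race. This is a sketch under the assumption that `set_cache` and `write_db` are hypothetical helpers over in-memory dicts, and that each thread writes its own name as the value.

```python
# Deterministic replay of the second sequence for scheme II.
# `db`, `cache`, `set_cache` and `write_db` are illustrative stand-ins.
db = {}
cache = {}

def set_cache(value):
    cache["k"] = value

def write_db(value):
    db["k"] = value

set_cache("A")  # 1) thread A sets the cache
set_cache("B")  # 2) thread B sets the cache
write_db("B")   # 3) thread B writes the DB
write_db("A")   # 4) thread A writes the DB

print(cache["k"], db["k"])  # B A
```

Each store keeps whichever write reached it last, and nothing forces the two stores to see the writes in the same order.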

Cache maintenance scheme III

One write (thread A) and one read (thread B); the writer operates on the database first and then on the cache.


1) Thread A initiates a write operation. Step 1: write the DB

2) Thread A, step 2: delete the cache entry

3) Thread B initiates a read operation and gets a cache miss

4) Thread B reads the latest data from the DB

5) Thread B writes that data into the cache

This scheme has no obvious concurrency problem, but step 2, deleting the cache entry, may fail. The probability of that is relatively small, so the scheme is still superior to schemes I and II, and it is the one commonly used in everyday work.
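To make the failure mode concrete, here is a minimal sketch of the read and write paths of scheme III, with a cache whose delete can be forced to fail. `FlakyCache`, `read`, `write` and the `fail_next_delete` flag are all hypothetical names introduced for illustration, and the failure is simulated with a `ConnectionError`.

```python
class FlakyCache:
    """In-memory cache whose delete can be made to fail once (illustration only)."""

    def __init__(self):
        self.data = {}
        self.fail_next_delete = False

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        if self.fail_next_delete:
            self.fail_next_delete = False
            raise ConnectionError("simulated cache outage")
        self.data.pop(key, None)


db = {"k": "v1"}
cache = FlakyCache()

def read(key):
    # Read path: try the cache, fall back to the DB and populate the cache.
    value = cache.get(key)
    if value is None:
        value = db[key]
        cache.set(key, value)
    return value

def write(key, value):
    # Write path for scheme III: database first, then delete the cache entry.
    db[key] = value
    cache.delete(key)

first = read("k")             # loads v1 into the cache
write("k", "v2")              # normal case: DB updated, cache entry deleted
after_good_write = read("k")  # cache miss, so the fresh value is read

cache.fail_next_delete = True
try:
    write("k", "v3")          # DB updated, but the delete fails
except ConnectionError:
    pass
after_failed_delete = read("k")  # the stale cached value is served

print(after_good_write, after_failed_delete, db["k"])  # v2 v2 v3
```

After the failed delete the database holds `v3` while readers keep getting `v2` from the cache, which is the weakness the next scheme addresses.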

To sum up, we generally adopt scheme III. But is there a way to fix its weakness?

Cache maintenance scheme IV

This is an improvement on scheme III. It still operates on the database first and then on the cache, and adds a safeguard for the case where the cache delete fails.

Cache keys are evicted asynchronously by way of the database binlog. Taking MySQL as an example, you can use Alibaba's canal to collect the binlog and send it to an MQ queue. A consumer then processes each update message, deleting the cache entry and confirming the message through the ACK mechanism, which keeps the cache consistent with the data.
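The retry behaviour that the ACK mechanism provides can be sketched without any real middleware. In the sketch below a `queue.Queue` stands in for the MQ, a dict stands in for the cache, and putting a message back on the queue models "not acknowledged, so redelivered". The event shape, `delete_from_cache` and the two simulated failures are assumptions for illustration; this is not canal's or any MQ client's actual API.

```python
import queue

cache = {"user:1": "stale"}
state = {"failures_left": 2}  # simulate two failed delete attempts

def delete_from_cache(key):
    # Hypothetical cache delete that fails while failures_left > 0.
    if state["failures_left"] > 0:
        state["failures_left"] -= 1
        raise ConnectionError("simulated cache outage")
    cache.pop(key, None)

# Stand-in for the MQ that receives binlog change events.
mq = queue.Queue()
mq.put({"table": "user", "key": "user:1"})  # hypothetical event shape

attempts = 0
while not mq.empty():
    event = mq.get()
    attempts += 1
    try:
        delete_from_cache(event["key"])
    except ConnectionError:
        # No ACK: model redelivery by putting the message back on the queue.
        mq.put(event)
    # Reaching this point without an exception models the ACK:
    # the message is consumed and not redelivered.

print(attempts, "user:1" in cache)  # 3 False
```

Because a message only leaves the queue once the delete has succeeded, a transient cache failure delays the invalidation instead of losing it.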

But there is another question: what about a master-slave database?

Cache maintenance scheme V

The master-slave problem: synchronization from the master to the slave is delayed. If a request arrives after the cache entry has been deleted but before the data has been synchronized to the slave, dirty data is read from the slave and written back to the cache. How do we solve this? As points (6) and (7) of the summary below state, the binlog that drives the cache deletion is collected from the slave database.
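The effect of replication lag, and why deleting only after the slave has caught up helps, can be replayed with three dicts. This is a sketch assuming reads go to the cache and then to the slave; `master`, `slave`, `cache` and `read` are illustrative names, and replication is simulated by a plain assignment.

```python
master = {"k": "old"}
slave = {"k": "old"}
cache = {"k": "old"}

def read(key):
    # Reads hit the cache first and fall back to the slave, then repopulate.
    value = cache.get(key)
    if value is None:
        value = slave[key]
        cache[key] = value
    return value

# The write reaches the master and the cache entry is deleted right away.
master["k"] = "new"
cache.pop("k", None)

# A read arrives before replication: the slave still has the old value,
# and that old value is written back into the cache.
dirty = read("k")

# Replication catches up (simulated).
slave["k"] = master["k"]

# The delete is now driven by the slave's own binlog, so it happens
# only after the slave holds the new value.
cache.pop("k", None)
clean = read("k")

print(dirty, clean)  # old new
```

The second delete is safe because any read that misses the cache after it can only see data the slave already has.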

Cache maintenance summary

To sum up, in a distributed system where a cache and a database coexist, a write should operate on the database first and then on the cache. The full procedure is as follows:

(1) On a read, check whether the cache holds the relevant data

(2) If the cache holds the value, return it

(3) If the cache does not hold it, read the data from the database, put it into the cache as key -> value, then return it

(4) On an update, update the database first, and then delete the cache entry

(5) To make sure step (4) succeeds, use the binlog to delete the cache entry asynchronously

(6) With a master-slave database, collect the binlog from the slave database

(7) With one master and many slaves, collect the binlog from every slave; the consumer deletes the cache entry only after it has received the binlog data from the last slave
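Point (7) amounts to a small piece of bookkeeping in the consumer. The sketch below is one possible way to do it, not a description of any particular tool: `InvalidationTracker`, the replica ids and the `version` argument (which could be a binlog position or transaction id) are all assumptions introduced for illustration.

```python
class InvalidationTracker:
    """Deletes a cache entry only after every slave has reported the change."""

    def __init__(self, replica_ids, cache):
        self.replica_ids = set(replica_ids)
        self.cache = cache
        self.seen = {}  # (key, version) -> set of replica ids that reported it

    def on_binlog_event(self, replica_id, key, version):
        """Record one slave's binlog event; return True if the entry was deleted."""
        reported = self.seen.setdefault((key, version), set())
        reported.add(replica_id)
        if reported >= self.replica_ids:
            # The last slave has applied the change: now it is safe to delete.
            self.cache.pop(key, None)
            del self.seen[(key, version)]
            return True
        return False


cache = {"k": "old"}
tracker = InvalidationTracker(["slave-1", "slave-2", "slave-3"], cache)

results = [
    tracker.on_binlog_event("slave-2", "k", 7),
    tracker.on_binlog_event("slave-1", "k", 7),
]
still_cached = "k" in cache  # two of three slaves reported: not deleted yet
results.append(tracker.on_binlog_event("slave-3", "k", 7))

print(results, still_cached, "k" in cache)  # [False, False, True] True False
```

Waiting for the slowest slave delays invalidation slightly, but it means no slave can serve a value older than the one that was evicted.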

  • GitHub address: https://github.com/whx123/Jav…