Read the optimistic lock, pessimistic lock and mvcc in the database

Time:2021-10-23

preface

In the actual use of the database, we often encounter situations where we don’t want the data to be written or read at the same time. For example, in the second kill scenario, two requests read that there is still one inventory in the system at the same time, and then update the inventory to 0 successively. At this time, oversold will occur, and the actual inventory of goods will not correspond to our records.

In order to solve the data inconsistency caused by this resource competition, we need a mechanism to ensure the correct access and modification of data. In the database, this mechanism is the concurrency control of the database. Optimistic concurrency control, pessimistic concurrency control and multi version concurrency control are the main technical means of database concurrency control.

Pessimistic Concurrency

essence

Wikipedia: in relational database management system, pessimistic concurrency control (also known as “pessimistic lock”, abbreviated as “PCC”) is a method of concurrency control. It can prevent a transaction from modifying data in a way that affects other users. If the operation performed by a transaction reads a row of data and applies a lock, other transactions can perform operations that conflict with the lock only when the transaction releases the lock.

In fact, the pessimistic lock we often say is not an actual lock, but an idea of concurrency control. Pessimistic concurrency control holds a pessimistic attitude towards the modification of data and believes that there will be conflict when the data is accessed by the outside world. Therefore, in the process of data processing, locking is adopted to ensure the monopoly of resources.

In fact, the database lock mechanism is implemented based on the pessimistic concurrency control point of view, and according to the actual use, the database lock can be divided into many types. For details, see my later articles.

Implementation mode

The locking process of database pessimistic lock is roughly as follows:

  • After starting a transaction, add a certain type of lock to the data application that needs to be locked according to the operation type: for example, shared row lock, etc
  • If the lock is successfully added, the subsequent operation will continue. If the data has been added with other locks and conflicts with the lock to be added now, the lock will fail (for example, an exclusive lock has been added). At this time, you need to wait for other locks to be released (deadlock may occur)
  • Release the lock when the transaction is completed

Advantages and disadvantages

advantage:
Pessimistic concurrency control adopts a conservative strategy: “take the lock first and access the data only after success”, which ensures that data acquisition and modification are carried out in an orderly manner. Therefore, it is suitable for use in an environment with more write and less read. Of course, using pessimistic locks cannot maintain very high performance, but on the premise that optimistic locks cannot provide better performance, pessimistic locks can ensure the security of data.

Disadvantages:
Due to the need for locking, and may face the problem of lock conflict or even deadlock, pessimistic concurrency control increases the additional overhead of the system, reduces the efficiency of the system, and reduces the parallelism of the system.

Optimistic concurrency control

essence

Wikipedia: in relational database management system, optimistic concurrency control (also known as “optimistic lock”, abbreviated as “OCC”) is a method of concurrency control. It assumes that multi-user concurrent transactions will not affect each other during processing, and each transaction can process the affected part of data without locks.

Optimistic concurrency control is optimistic about data modification. It believes that even in a concurrent environment, external operations on data generally do not cause conflicts, so it does not lock. Before submitting data updates, each transaction will first check whether other transactions have modified the data after the transaction reads the data. If other transactions are updated, the conflict information will be returned to let the user decide how to do the next step, such as retry or rollback.

It can be seen that optimistic locks are not actual locks, and they are not even used to realize concurrency control. Instead, other methods are used to judge whether data can be modified. Optimistic lock is generally a lock mechanism implemented by the user. Although the actual lock is not used, it can produce the effect of locking.

Implementation mode

CAS (compare and swap) is a well-known lock free algorithm. Lock free programming is to realize variable synchronization between multiple threads without locks, that is, to realize variable synchronization without threads being blocked, so it is also called non blocking synchronization. The scheme to realize non blocking synchronization is called “non blocking algorithm”.

Optimistic locks are basically implemented based on CAS (compare and swap) algorithm. Let’s first look at the CAS process. A CAS operation process can be represented by the following C code:

int cas(long *addr, long old, long new)
{
    /* Executes atomically. */
    if(*addr != old)
        return 0;
    *addr = new;
    return 1;
}

CAS has three operands, memory value V, old expected value a, and new value B to be modified. If and only if the expected value a is the same as the memory value V, modify the memory value V to B, otherwise do nothing. The whole CAS operation is an atomic operation and is inseparable.

The implementation of optimistic lock is similar to the above process, mainly in the following ways:

  • Version number tag: add a new field in the table: version, which is used to save the version number. When obtaining data, obtain the version number at the same time, and then use the following command when updating data:update xxx set version=version+1,… where … version="old version" and ....。 At this time, judge whether the update is successful by judging whether the number of affected rows of the returned result is 0. If the update fails, it indicates that other requests have updated the data.
  • Timestamp mark: just like the version number, it is judged by timestamp. Generally speaking, many data tables have an update time field. Judging from this field, there is no need to add another field.
  • Fields to be updated: if there is no timestamp field and you do not want to add new fields, you can consider using the fields to be updated to judge, because the updated data will generally change. Before updating, you can compare the old value of the field to be updated with the present value of the database. If there is no change, you can update.
  • All field marks: all fields in the data table are used for judgment. This is equivalent to locking not only certain fields, but also the whole data row. As long as the data of this row changes, it will not be updated.

Advantages and disadvantages

advantage:
Optimistic concurrency control does not actually lock, so there is no additional overhead. It is also good to have deadlock problems. It is suitable for concurrent scenarios with more reads and less writes. Because there is no additional overhead, it can greatly improve the performance of the database.

Disadvantages:
Optimistic concurrency control is not suitable for concurrency scenarios with more writes and less reads, because there will be many write conflicts, resulting in multiple data writes waiting for retries. In this case, the overhead is actually higher than pessimistic locks. Moreover, the business logic of optimistic lock is more complex than pessimistic lock. In business logic, failure and waiting for retry should be considered, and direct modification of the database by other third-party systems cannot be avoided.

Multi version concurrency control

essence

Wikipedia: multi version concurrency control (MCC or mvcc) is a kind of concurrency control commonly used in database management system. It is also used to realize transaction memory in programming language.

Optimistic concurrency control and pessimistic concurrency control both solve the competition conditions between transactions by delaying or terminating the corresponding transactions to ensure the serialization of transactions; Although the previous two concurrency control mechanisms can fundamentally solve the problem of serialization of concurrent transactions, they are actually solving the problem of write conflict. The difference between them is that they are optimistic about write conflict (pessimistic lock can also solve the problem of read-write conflict, but the performance is average). In the actual use process, the database read request is many times that of write request. If we can solve the problem of read-write concurrency, we can greatly improve the read performance of the database, which is what multi version concurrency control can do.

Different from pessimistic concurrency control and optimistic concurrency control, mvcc is to solve the problem of multiple and long-time read operations starving write operations caused by read-write locks, that is, to solve the problem of read-write conflicts. Mvcc can be combined with either of the first two mechanisms to improve the read performance of the database.

Pessimistic locking of database is based on the consideration of improving concurrency performance. Generally, multi version concurrency control is realized at the same time. Not only mysql, but also other database systems such as Oracle and PostgreSQL have implemented mvcc, but their implementation mechanisms are different, because mvcc does not have a unified implementation standard.

In general, the emergence of mvcc is a solution proposed by the database dissatisfied with using pessimistic locks to solve the read-write conflict problem due to low performance.

Implementation mode

Mvcc is realized by saving a snapshot of data at a certain point in time. The data item read by each transaction is a historical snapshot, which is called snapshot reading. Unlike the current reading, the data read by the snapshot may not be the latest, but snapshot isolation can make the data seen in the whole transaction the data state when it was started. The write operation does not overwrite the existing data item, but creates a new version, which will not become visible until the transaction is committed.

Current read and snapshot read

What are current reads and snapshot reads under MySQL InnoDB?

Current read
Like select lock in share mode, select for update; Update, insert, delete (exclusive lock) these operations are all current reads. Why are they called current reads? It reads the latest version of the record. When reading, it also ensures that other concurrent transactions cannot modify the current record, and locks the read record.

Snapshot read
For example, an unlocked select operation is a snapshot read, that is, an unlocked non blocking read; The premise of snapshot read is that the isolation level is not uncommitted read and serialization level, because uncommitted read always reads the latest data row, not the data row conforming to the current transaction version. Serialization locks all read rows

Advantages and disadvantages

Mvcc enables most read operations without locking. This design makes the read data operation very simple and has good performance, and can also ensure that only rows that meet the standards will be read. The disadvantage is that each row of records requires additional storage space, more row checking and some additional maintenance.

Applicable scenario

  • Pessimistic lock

    • Lock concurrency control for solving read-write conflict and write write conflict
    • It is applicable to situations where there are many writes, few reads and serious write conflicts, because pessimistic locks are locked when reading data. Scenarios where there are many reads will require frequent locking and a lot of waiting time. In cases where there are serious write conflicts, using pessimistic locks can ensure data consistency
    • High data consistency requirements
    • It can solve the problems of dirty reading, phantom reading, non repeatable reading, loss of the first type of update and loss of the second type of update
  • Optimistic lock

    • Lock free concurrency control for solving write write conflict
    • It is applicable to read more and write less, because if a large number of write operations occur, the possibility of write conflict will increase, and the business layer needs to retry continuously, which will greatly reduce the system performance
    • Data consistency requirements are not high, but very high response speed is required
    • Dirty reading, unreal reading and non repeatable reading cannot be solved, but the problem of update loss can be solved
  • MVCC

    • Lock free concurrency control for solving read-write conflict
    • Combine with the above two to improve their read performance

reference material

Wikipedia
https://www.cnblogs.com/rinack/p/10032207.html
https://draveness.me/database-concurrency-control/

Copyright notice

Please indicate the author and the source of the article
Author: Mr. X
https://segmentfault.com/a/1190000023332101

Recommended Today

Swift advanced (XV) extension

The extension in swift is somewhat similar to the category in OC Extension can beenumeration、structural morphology、class、agreementAdd new features□ you can add methods, calculation attributes, subscripts, (convenient) initializers, nested types, protocols, etc What extensions can’t do:□ original functions cannot be overwritten□ you cannot add storage attributes or add attribute observers to existing attributes□ cannot add parent […]