Inventory deduction and locking

Time: 2022-01-15

Scenario

There is 1 unit of item W left in stock, and users P1 and P2 purchase it at the same time. Only one of them can buy it successfully (assuming overselling is not allowed).
A flash sale (seckill) is a similar situation: there is only one unit of a commodity, N users rush to buy it at the same time, and only one person can get it.
We will not discuss flash-sale design or using queues to serialize requests here; instead, let's talk about how to use locks to keep the data correct.

Common implementation schemes are as follows:

  • Code synchronization, such as synchronized, Lock and other synchronization mechanisms
  • Update directly without querying first: update table set surplus = surplus - buyQuantity where id = XX and surplus - buyQuantity >= 0
  • Use CAS: update table set surplus = AA where id = XX and version = Y
  • Use a database lock: select ... for update
  • Use a distributed lock (ZooKeeper, Redis, etc.)

Scheme 1: code synchronization, such as synchronized, Lock and other synchronization mechanisms

This question comes up often in interviews, and many people answer with this scheme.
The pseudo code is as follows:

public synchronized void buy(String productName, Integer buyQuantity) {
    //More checks
    //Check remaining quantity
    Product product = getByDB(productName);
    if (product.getSurplus() < buyQuantity) {
        throw new RuntimeException("insufficient inventory");
    }

    //Set new remaining quantity
    product.setSurplus(product.getSurplus() - buyQuantity);
    //Update database
    update(product);
    //Log
    //Other business
}

Adding the synchronized keyword to the method declaration makes the method synchronized. When two users purchase at the same time, the buy method executes serially, and when the second user runs it, the check reports insufficient inventory.

Hmm, it seems reasonable. I used to do the same. So now, when someone answers like this, I silently think to myself: kid, you haven't stepped into this pit yet.
First, let's state the premises and configuration behind this scheme:

  • Spring declarative transaction management is used

  • The transaction propagation behavior is the default (PROPAGATION_REQUIRED)

  • The project is divided into three layers: Controller, Service, Dao, and transaction management sits in the service layer

This scheme is not feasible, mainly for the following reasons:

  • The scope of synchronized is a single JVM instance, so it is useless in a clustered or distributed deployment

  • synchronized here locks on the object instance; if the bean is not a singleton, multiple instances will not synchronize with each other (Spring-managed beans are singletons by default)

  • Even within a single JVM, synchronized cannot guarantee the isolation of multiple database transactions. This depends on the transaction propagation behavior in the code, the database transaction isolation level, the timing of locking, and so on

    • Let's talk about the isolation level first. Read Committed and Repeatable Read are the commonly used levels; the other two are rarely used

      • RR (Repeatable Read): MySQL's default. After a transaction starts, it will not read data committed by other transactions
        From the premises above, we know a transaction is started when the buy method is entered
        Suppose threads T1 and T2 execute the buy method at the same time, T1 runs first and T2 waits
        Spring's transaction begin and commit are implemented through AOP (a proxy), so the transaction is started before the buy method body runs
        At this point T1 and T2 are two separate transactions. After T1 finishes, T2 runs but cannot read the data committed by T1, so the stale quantity causes a problem

      • RC (Read Committed): after a transaction starts, it can read data committed by other transactions
        It looks like this level solves the problem above: when T2 runs, it can read the result committed by T1
        But the question is: when T2 runs, has T1's transaction actually been committed?
        The flow of transactions and locks is as follows

        1. Open transaction (AOP)
        2. Lock (enter synchronized method)
        3. Release lock (exit synchronized method)
        4. Commit transaction (AOP)
    • As you can see, the lock is released before the transaction is committed, so when T2 runs its query it may still not see the data committed by T1, and the problem remains
      From the problems at both isolation levels, the main contradiction is that the transaction begin/commit times do not match the lock/unlock times. Some readers may have come up with solutions:

      • Acquire the lock before the transaction is opened and release it after the transaction is committed
        Indeed, this is equivalent to serializing the transactions. Performance aside, let's talk about how to implement it
        With the default transaction propagation behavior, to make sure locking happens before the transaction opens and unlocking happens after it commits, the lock/unlock calls have to live in the controller layer (see the sketch after this list). The hidden problem is that every method that touches the inventory has to be guarded by the same lock, which quickly becomes tedious to write
        And it still can't cross JVMs
      • Extract the query-inventory and deduct-inventory steps into separate methods with their own transactions, and set the transaction isolation level to RC
        In fact, this is similar to the previous option: in the end, lock and unlock still wrap the transaction begin and commit
        Comparatively speaking, the advantage is that there are fewer entry points and the controller doesn't have to get involved
        The disadvantage is that, like the option above, it can't cross JVMs, and the separate methods have to live in another service class,
        because with Spring, method calls within the same bean do not go through the proxy, so the separately configured transactions must be placed in another service bean
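
To make the first option above concrete, here is a minimal sketch (assuming a Spring MVC setup) of taking the lock in the controller layer, so the lock is acquired before the @Transactional proxy opens the transaction and released only after the proxied call returns, i.e. after commit. The InventoryService bean and its buy method are named here just for illustration:

@RestController
public class InventoryController {

    //A single lock object shared by every entry point that touches inventory
    private static final Object STOCK_LOCK = new Object();

    @Autowired
    private InventoryService inventoryService;

    @PostMapping("/buy")
    public String buy(@RequestParam String productName,
                      @RequestParam Integer buyQuantity) {
        //The lock is taken before the @Transactional proxy begins the transaction,
        //and released only after the proxied call returns, i.e. after the commit
        synchronized (STOCK_LOCK) {
            return inventoryService.buy(productName, buyQuantity);
        }
    }
}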

Scheme 2: update directly without query

After reading the first scheme, a colleague said: what you described is so complicated and has so many problems, isn't it all because the queried data is not the latest?
Then don't query at all; just update directly.
The pseudo code is as follows:

public synchronized void buy(String productName, Integer buyQuantity) {
    //More checks
    //Update directly, no prior query (pseudo SQL inline)
    int affectedRows = update table set surplus = surplus - buyQuantity where id = XX;
    if (affectedRows == 0) {
        throw new RuntimeException("insufficient inventory");
    }
    //Log
    //Other business
}

After testing, it turns out the inventory can become -1. Continue to improve:

public synchronized void buy(String productName, Integer buyQuantity) {
    //More checks
    //Only deduct when enough stock remains (pseudo SQL inline)
    int affectedRows = update table set surplus = surplus - buyQuantity where id = XX and surplus - buyQuantity >= 0;
    if (affectedRows == 0) {
        throw new RuntimeException("insufficient inventory");
    }
    //Log
    //Other business
}

After testing, the function works.
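
As a runnable illustration of the same conditional update, here is a minimal sketch using Spring's JdbcTemplate (assumed to be injected); the table name t_product and the productId variable are assumptions made for the example:

int affectedRows = jdbcTemplate.update(
        //Deduct only if enough stock remains; the database applies the check
        //and the write atomically under its row lock
        "update t_product set surplus = surplus - ? where id = ? and surplus - ? >= 0",
        buyQuantity, productId, buyQuantity);
if (affectedRows == 0) {
    throw new RuntimeException("insufficient inventory");
}
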
This works, but it has some other problems:

  • It is not general, e.g. for an add (restock) operation
  • Inventory operations usually need to record the quantity before and after the operation, which this approach cannot do
  • Others…
However, this scheme leads us to scheme 3.

Scheme 3: use CAS, update table set surplus = AA where id = XX and version = Y

CAS means compare-and-swap (or compare-and-set / check-and-set; the meanings are similar, so don't worry about the exact expansion).

Let’s modify the above SQL:

int affectedRows = update table set surplus = newQuantity where id = XX and surplus = oldQuantity;

This way, after thread T1 executes, thread T2's update finds affectedRows = 0, which indicates the data has already been changed by someone else; T2 then re-queries, re-checks, and retries. The pseudo code is as follows:

public void buy(String productName, Integer buyQuantity) {
    //More checks
    Product product = getByDB(productName);
    if (product.getSurplus() < buyQuantity) {
        throw new RuntimeException("insufficient inventory");
    }
    int affectedRows = update table set surplus = surplus - buyQuantity where id = XX and surplus = product.getSurplus();
    while (affectedRows == 0) {
        //Another transaction changed the row; re-query and retry
        product = getByDB(productName);
        if (product.getSurplus() >= buyQuantity) {
            affectedRows = update table set surplus = surplus - buyQuantity where id = XX and surplus = product.getSurplus();
        } else {
            throw new RuntimeException("insufficient inventory");
        }
    }

    //Log
    //Other business
}

As soon as you see the words "re-query", you should think of the transaction isolation level problem again.

Yes. The getByDB method in the code above must run in its own transaction (note that a separate transaction configured on a method of the same bean does not take effect when called internally), and the database transaction isolation level must be RC.

Otherwise, the code above becomes an endless loop.
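
A minimal sketch of that separately-transacted re-query, assuming a dedicated ProductQueryService bean and a hypothetical productDao.findByName method, so the @Transactional proxy is actually applied (a self-call inside the same bean would bypass it):

@Service
public class ProductQueryService {

    @Autowired
    private ProductDao productDao;

    //A new, independent transaction at READ COMMITTED, so the re-query can
    //see rows committed by other transactions and the retry loop can finish
    @Transactional(propagation = Propagation.REQUIRES_NEW,
                   isolation = Isolation.READ_COMMITTED)
    public Product getByDB(String productName) {
        return productDao.findByName(productName);
    }
}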

The scheme above can run into the classic CAS problem: the ABA problem.

ABA means:
Thread T1 queries; inventory remaining 100
Thread T2 queries; inventory remaining 100
Thread T1 executes the deduction: update t set surplus = 90 where id = x and surplus = 100;
Thread T3 queries; inventory remaining 90
Thread T3 executes a restock: update t set surplus = 100 where id = x and surplus = 90;
Thread T2 executes the deduction: update t set surplus = 90 where id = x and surplus = 100;

When thread T2's update runs, the 100 it matches is no longer the 100 it queried; for this particular business that happens not to cause a problem, but it is still an ABA situation.

In general, CAS uses a version column for control:

update t set surplus = 90, version = version + 1 where id = x and version = oldVersion;

This way, every update increments the version by 1 on top of the original, so a stale version can never match again.

There are a couple of points to pay attention to when using CAS (see the sketch after this list):

1. The number of failed retries: does it need to be limited?
2. Failure and retry should be transparent to the user
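
Putting the version column and the retry cap together, here is a minimal sketch using Spring's JdbcTemplate (assumed injected); the table t_product, its version column and MAX_RETRIES are assumptions for the example:

private static final int MAX_RETRIES = 3;

public void buy(long productId, int buyQuantity) {
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
        //Read the current surplus and version (in practice via a separate RC transaction)
        Map<String, Object> row = jdbcTemplate.queryForMap(
                "select surplus, version from t_product where id = ?", productId);
        int surplus = ((Number) row.get("surplus")).intValue();
        long version = ((Number) row.get("version")).longValue();
        if (surplus < buyQuantity) {
            throw new RuntimeException("insufficient inventory");
        }
        //CAS: only apply the update if the version has not changed in the meantime
        int affectedRows = jdbcTemplate.update(
                "update t_product set surplus = ?, version = version + 1 "
                        + "where id = ? and version = ?",
                surplus - buyQuantity, productId, version);
        if (affectedRows == 1) {
            return; //success; the retries stay invisible to the caller
        }
    }
    throw new RuntimeException("inventory busy, please retry");
}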

Scheme 4: select XX for update

The CAS in scheme 3 is an optimistic lock, while select ... for update is a pessimistic lock: the rows are locked at the moment they are queried.
The pseudo code is as follows:

public void buy(String productName, Integer buyQuantity) {
    //More checks
    Product product = select * from table where name = productName for update;
    if (product.getSurplus() >= buyQuantity) {
        int affectedRows = update table set surplus = surplus - buyQuantity where name = productName;
    } else {
        throw new RuntimeException("insufficient inventory");
    }

    //Log
    //Other business
}

Thread T1 runs the deduction and queries: inventory remaining 100.

Thread T2 runs the deduction. At this point thread T1's transaction has not yet been committed, so thread T2's query blocks and returns no result until T1's transaction commits or rolls back.

Therefore, thread T2 is guaranteed to read the latest data. This is equivalent to serializing the transactions, which solves the data consistency problem.

For select ... for update, there are two points to note (a sketch follows the list):

  1. Unified entry: all inventory operations must use select ... for update consistently, so that they block each other; if another code path still uses a plain select, it will not be blocked
  2. Locking order: if multiple locks are involved, the locking order must be consistent, otherwise deadlocks can occur
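
A minimal sketch of scheme 4 with Spring and JdbcTemplate; the table t_product is an assumption. The row lock taken by select ... for update is held until the surrounding transaction ends, which is why the method must run inside @Transactional:

@Service
public class InventoryService {

    @Autowired
    private JdbcTemplate jdbcTemplate;

    @Transactional
    public void buy(String productName, int buyQuantity) {
        //Lock the row; other "for update" readers block until this transaction ends
        Integer surplus = jdbcTemplate.queryForObject(
                "select surplus from t_product where name = ? for update",
                Integer.class, productName);
        if (surplus == null || surplus < buyQuantity) {
            throw new RuntimeException("insufficient inventory");
        }
        jdbcTemplate.update(
                "update t_product set surplus = surplus - ? where name = ?",
                buyQuantity, productName);
        //Log, other business; the row lock is released when the transaction commits
    }
}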

Scheme 5: use distributed locks (zookeeper, redis, etc.)

With a distributed lock, the principle is the same as synchronized in scheme 1, except that the synchronized flag is only visible within one JVM process, while a distributed lock's flag is globally visible. (The lock taken by select ... for update in scheme 4 is also globally visible.)

There are many ways to implement a distributed lock: based on Redis, based on ZooKeeper, based on a database, and so on. A previous blog post described a simple implementation based on Redis:

Simple distributed lock based on redis setnx

Note that a distributed lock has the same issue as synchronized: the ordering of the lock and the transaction, which was covered in scheme 1 and will not be repeated here.
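
For completeness, here is a minimal sketch of a Redis-based lock around the purchase, using Spring's StringRedisTemplate (assumed injected); the key name, the 10-second expiry and the inventoryService call are assumptions, and the unlock is a simplified check-then-delete rather than the atomic Lua version a production lock would use:

String lockKey = "lock:stock:" + productName;
String token = UUID.randomUUID().toString();
//SET key value NX PX: only one caller across all JVMs acquires the lock
Boolean locked = stringRedisTemplate.opsForValue()
        .setIfAbsent(lockKey, token, Duration.ofSeconds(10));
if (Boolean.TRUE.equals(locked)) {
    try {
        inventoryService.buy(productName, buyQuantity);
    } finally {
        //Only delete our own lock; a Lua script would make this check-and-delete atomic
        if (token.equals(stringRedisTemplate.opsForValue().get(lockKey))) {
            stringRedisTemplate.delete(lockKey);
        }
    }
} else {
    throw new RuntimeException("system busy, please retry");
}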

Make a simple summary:

Scheme 1: synchronized and other in-JVM locks are not suitable for guaranteeing database data consistency and cannot cross JVMs
Scheme 2: not general; cannot record the before/after quantities
Scheme 3: recommended, but if data contention is fierce, the number of automatic retries rises sharply
Scheme 4: recommended; the simplest solution, but if the transaction is too large there will be performance problems, and careless use can cause deadlocks
Scheme 5: similar to scheme 1, but it can cross JVMs

————————————————
Copyright notice: this is an original article by CSDN blogger "Beijing Xiaobei", licensed under the CC 4.0 BY-SA agreement. Please include the original source link and this notice when reprinting.