10. Part V Efficient Concurrency – Chapter 13: Thread Safety and Lock Optimization

Date: 2022-01-04

Summary

Thread safety

An object is thread safe if, when multiple threads access it at the same time, it behaves correctly regardless of how those threads are scheduled or interleaved by the runtime environment, and without any additional synchronization or other coordination on the part of the caller.

Thread safety in Java language

By degree of thread safety, the data shared between operations in the Java language can be divided into five categories: immutable, absolutely thread-safe, relatively thread-safe, thread-compatible, and thread-hostile.

  1. Immutable
    In the Java language, an immutable object must be thread safe. Neither the object's method implementations nor the callers of those methods need any thread safety measures.
    There are many ways to ensure that an object's behavior does not affect its state. The simplest is to declare all of the object's state-carrying fields final, so that the object is immutable once its constructor finishes.
  2. Absolutely thread-safe
  3. Relatively thread-safe
    Relative thread safety is what we usually mean by thread safety. It guarantees that each individual operation on the object is thread safe, so no extra safeguards are needed for a single call; however, for certain sequences of consecutive calls, additional synchronization at the call site may be needed to ensure correctness.
  4. Thread-compatible
  5. Thread-hostile
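A minimal sketch of the final-field approach to immutability (the Point class and its fields are illustrative, not from the book):

```java
// A minimal immutable value type: all fields are final and assigned once in
// the constructor, so instances can be shared freely between threads.
public final class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }
    public int getY() { return y; }

    // "Mutation" returns a new object instead of changing this one,
    // the same style used by java.lang.String and the java.time types.
    public Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = p.translate(3, 4);
        System.out.println(q.getX() + "," + q.getY());
    }
}
```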

Implementation method of thread safety

  1. Mutually exclusive synchronization
    Mutual exclusion and synchronization are among the most common and most important means of guaranteeing concurrent correctness. Synchronization means that when multiple threads access shared data concurrently, the shared data is used by only one thread (or by a limited number of threads, when semaphores are used) at any given moment. Mutual exclusion is a means of achieving synchronization; critical sections, mutexes, and semaphores are common ways to implement it. So in the phrase “mutually exclusive synchronization”, mutual exclusion is the cause and synchronization is the effect; mutual exclusion is the method and synchronization is the goal.

    In Java, the most basic means of mutually exclusive synchronization is the synchronized keyword, a block-structured synchronization construct. When javac compiles a synchronized block, it emits two bytecode instructions, monitorenter and monitorexit, at the beginning and end of the block. Both instructions take a reference-type operand indicating the object to lock or unlock. If the synchronized in the Java source code explicitly specifies an object argument, the reference to that object is used; if not, then depending on whether synchronized modifies an instance method or a class (static) method, either the object instance the code belongs to or the Class object of the enclosing type is taken as the lock the thread must hold.
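The three possible lock objects can be sketched as follows (class and method names are illustrative):

```java
// A sketch of the three lock objects synchronized can take.
public class LockTargets {
    private final Object guard = new Object();
    private int n = 0;

    // Locks on 'this' (the object instance).
    public synchronized void instanceLocked() { }

    // Locks on LockTargets.class (the Class object).
    public static synchronized void classLocked() { }

    // Locks on the explicitly named object; javac compiles this block into
    // monitorenter/monitorexit instructions taking 'guard' as the operand.
    public void blockLocked() {
        synchronized (guard) {
            n++;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockTargets t = new LockTargets();
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) t.blockLocked();
            });
            ts[i].start();
        }
        for (Thread th : ts) th.join();
        System.out.println(t.n);  // mutual exclusion makes all increments count
    }
}
```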

    When the monitorenter instruction executes, the thread first tries to acquire the object's lock. If the object is not locked, or the current thread already holds its lock, the lock counter is incremented by one; executing monitorexit decrements the counter by one, and once the counter reaches zero the lock is released. If acquiring the object's lock fails, the current thread blocks and waits until the lock is released by the thread holding it.

    Special attention should be paid when using synchronized:

    • Synchronized blocks are reentrant for the same thread; that is, a thread cannot be locked out by a resource it has itself already locked.
    • A synchronized block unconditionally blocks other threads from entering until the thread holding the lock has finished executing and released it. This means it is impossible to force a thread that holds the lock to release it (as some database locks allow), and equally impossible for a thread waiting on the lock to interrupt the wait or time out.
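Reentrancy can be sketched as follows (class and method names are illustrative): outer() holds the lock on this and still enters inner(), which acquires the same lock, without blocking itself.

```java
// A sketch of reentrancy: the same thread re-acquires a lock it already holds.
public class ReentrantDemo {
    public synchronized void outer() {
        System.out.println("in outer");
        inner();  // re-acquires the lock on 'this' without deadlocking
    }

    public synchronized void inner() {
        System.out.println("in inner");
    }

    public static void main(String[] args) {
        new ReentrantDemo().outer();
    }
}
```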

    Comparing ReentrantLock with synchronized: ReentrantLock is, as its name says, reentrant just like synchronized, and its basic usage is similar; however, ReentrantLock adds advanced features such as interruptible waiting, fair locking, and binding a lock to multiple conditions.

  • Interruptible waiting: when the thread holding the lock does not release it for a long time, a waiting thread can choose to give up waiting and do something else instead. This is helpful when synchronized regions take a very long time to execute.
  • Fair lock: when multiple threads wait for the same lock, a fair lock grants it in the order in which the threads requested it; an unfair lock makes no such guarantee, and when the lock is released any waiting thread may obtain it. The lock in synchronized is unfair, and ReentrantLock is unfair by default, but a fair lock can be requested through the constructor that takes a boolean. Note, however, that once fairness is enabled the performance of ReentrantLock drops sharply, significantly hurting throughput.
  • Binding multiple conditions: a single ReentrantLock object can be bound to several Condition objects at the same time. With synchronized, the lock object's wait() cooperating with its notify() or notifyAll() methods implements one implicit condition; to associate more than one condition, an extra lock must be added. ReentrantLock needs no such workaround: simply call newCondition() multiple times.
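A sketch of one lock bound to two conditions (the BoundedBuffer class and its names are illustrative): producers wait on notFull, consumers on notEmpty, so signal() wakes only the relevant side, where synchronized would need notifyAll() on a single implicit condition.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// One ReentrantLock bound to two Condition objects.
public class BoundedBuffer {
    private final ReentrantLock lock = new ReentrantLock(); // new ReentrantLock(true) would be fair
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();
    private final Deque<Integer> items = new ArrayDeque<>();
    private final int capacity = 2;

    public void put(int x) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) notFull.await();
            items.addLast(x);
            notEmpty.signal();  // wake only a consumer
        } finally {
            lock.unlock();
        }
    }

    public int take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) notEmpty.await();
            int x = items.removeFirst();
            notFull.signal();   // wake only a producer
            return x;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedBuffer b = new BoundedBuffer();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) b.put(i);
            } catch (InterruptedException ignored) { }
        });
        producer.start();
        for (int i = 0; i < 5; i++) System.out.print(b.take() + " ");
        producer.join();
        System.out.println();
    }
}
```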

When both synchronized and ReentrantLock can meet the requirements, synchronized is recommended:

  • synchronized is synchronization at the level of the Java language syntax itself, clear and simple. Every Java programmer is familiar with synchronized, but not necessarily with the Lock interface in j.u.c. When only basic synchronization is needed, synchronized is the recommended choice.
  • With Lock, one must make sure the lock is released in a finally block; otherwise, if an exception is thrown inside the protected code, the held lock may never be released. With synchronized, the Java virtual machine guarantees the lock is released even when an exception occurs.
  • In the long run, the Java virtual machine finds it easier to optimize synchronized, because it can record lock-related information in the metadata of threads and objects; with Lock from j.u.c, it is hard for the virtual machine to know which lock objects a particular thread holds.
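The finally idiom in the second point can be sketched as follows (class and names are illustrative):

```java
import java.util.concurrent.locks.ReentrantLock;

// The mandatory Lock idiom: unlock() lives in finally so the lock is
// released even when the protected code throws.
public class LockFinallyDemo {
    private static final ReentrantLock lock = new ReentrantLock();
    private static int balance = 100;

    static void withdraw(int amount) {
        lock.lock();
        try {
            if (amount > balance) {
                throw new IllegalArgumentException("insufficient funds");
            }
            balance -= amount;
        } finally {
            lock.unlock();  // always runs, exception or not
        }
    }

    public static void main(String[] args) {
        withdraw(30);
        try {
            withdraw(1000);   // throws, but the lock is still released
        } catch (IllegalArgumentException e) {
            System.out.println("caught: " + e.getMessage());
        }
        withdraw(70);         // would hang here had unlock been skipped above
        System.out.println("balance=" + balance);
    }
}
```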
  2. Non-blocking synchronization
    The main problem with mutually exclusive synchronization is the performance overhead of blocking and waking threads, so it is also called blocking synchronization. Mutually exclusive synchronization is a pessimistic concurrency strategy: it assumes problems will occur unless correct synchronization measures (such as locking) are taken, so it locks regardless of whether the shared data is actually contended. This brings the overhead of user-mode to kernel-mode transitions, of maintaining lock counters, and of checking whether blocked threads need to be woken.

The optimistic concurrency strategy based on conflict detection is to perform the operation first, regardless of the risk. If no other thread contends for the shared data, the operation simply succeeds; if the shared data is indeed contended and a conflict occurs, a compensating measure is taken, most commonly retrying until there is no contention. This optimistic strategy no longer needs to suspend and block threads, so it is called non-blocking synchronization, and code written this way is often called lock-free programming.
Commonly used atomic instructions that combine the operation with conflict detection include:

  • Test-and-Set
  • Fetch-and-Increment
  • Swap
  • Compare-and-Swap (CAS)
  • Load-Linked / Store-Conditional (LL/SC)

The CAS instruction takes three operands: the memory location V, the old expected value A, and the new value B. When CAS executes, the processor updates the value at V to B if and only if the current value at V equals A; otherwise it performs no update. In either case, the old value at V is returned. The whole process is a single atomic operation and cannot be interrupted by other threads while executing.
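These semantics are exposed in Java through j.u.c.atomic. A sketch of the typical CAS retry loop built on AtomicInteger.compareAndSet() (the method addAndGetDoubled is illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

// The CAS retry pattern: read the old value, compute the new one, and
// attempt compareAndSet(expected, new); if another thread changed the
// value in between, the CAS fails and the loop retries.
public class CasLoopDemo {
    static int addAndGetDoubled(AtomicInteger v) {
        while (true) {
            int oldValue = v.get();          // expected value A
            int newValue = oldValue * 2;     // new value B
            if (v.compareAndSet(oldValue, newValue)) {
                return newValue;             // CAS succeeded: V still held A
            }
            // CAS failed: another thread updated V; retry with a fresh value
        }
    }

    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(21);
        System.out.println(addAndGetDoubled(v));
    }
}
```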

import java.util.concurrent.atomic.AtomicInteger;

/**
 * Increment test for an atomic variable.
 */
public class AtomicTest {
    public static AtomicInteger race = new AtomicInteger(0);

    public static void increase() {
        // incrementAndGet() keeps trying, in a loop, to CAS in a value one
        // greater than the current one. A failure means the old value was
        // changed by another thread during the CAS, so the operation is
        // repeated until it succeeds.
        race.incrementAndGet();
    }

    private static final int THREADS_COUNT = 20;

    public static void main(String[] args) {
        Thread[] threads = new Thread[THREADS_COUNT];
        for (int i = 0; i < THREADS_COUNT; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < 10000; i++) {
                        increase();
                    }
                }
            });
            threads[i].start();
        }

        // Wait for the worker threads to finish (the book's idiom; the
        // threshold of 2 allows for the main thread plus an IDE monitor thread).
        while (Thread.activeCount() > 2) Thread.yield();
        System.out.println(race);
    }

}

Note that the old value returned by a CAS does not guarantee that no other thread touched the variable in the meantime, because of the ABA problem: the value may have been changed from A to B and back to A between the read and the CAS.
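A sketch of how the ABA problem is commonly avoided with AtomicStampedReference, which pairs every value with an integer stamp (the values and stamps here are illustrative):

```java
import java.util.concurrent.atomic.AtomicStampedReference;

// Every update also bumps an integer stamp, so the history A -> B -> A is
// distinguishable from "unchanged".
public class AbaDemo {
    public static void main(String[] args) {
        AtomicStampedReference<String> ref =
                new AtomicStampedReference<>("A", 0);

        int stamp = ref.getStamp();           // observe value "A", stamp 0

        // Another thread changes A -> B -> A, bumping the stamp each time.
        ref.compareAndSet("A", "B", stamp, stamp + 1);
        ref.compareAndSet("B", "A", stamp + 1, stamp + 2);

        // A plain value comparison would succeed here; the stamped CAS fails
        // because the stamp no longer matches the one observed earlier.
        boolean swapped = ref.compareAndSet("A", "C", stamp, stamp + 1);
        System.out.println(swapped + " " + ref.getReference());
    }
}
```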

Lock optimization

Adaptive spinning, lock elimination, lock coarsening, lightweight locking, and biased locking

Spin lock and adaptive spin

If the physical machine has more than one processor core, so that two or more threads can execute in parallel, we can ask the thread requesting the lock to “wait a moment” without giving up its processor time, to see whether the thread holding the lock releases it soon. To make the thread wait, we simply have it execute a busy loop (spin); this technique is called the spin lock.

Spin waiting cannot replace blocking, quite apart from its requirement on the number of processors. Although spinning avoids the overhead of thread switching, it occupies processor time, so it works well when locks are held only briefly; when a lock is held for a long time, a spinning thread just wastes processor resources. The number of spins is therefore limited; the default is ten, and users can change it with the -XX:PreBlockSpin parameter.

JDK 6 introduced adaptive spinning. If, for a given lock object, a spin wait has just succeeded in obtaining the lock and the thread holding the lock is running, the virtual machine assumes the spin is likely to succeed again and allows it to last relatively longer, for example 100 busy cycles. Conversely, if spinning rarely succeeds for a particular lock, the spin phase may simply be omitted in future acquisitions, to avoid wasting processor resources.

Lock elimination

Lock elimination means that the just-in-time compiler, at run time, removes locks on code that requests synchronization but where contention on the shared data is detected to be impossible. The main basis for lock elimination is escape analysis: if analysis shows that none of the heap data touched by a piece of code can escape and be accessed by other threads, that data can be treated as stack-local and thread-private, and locking it is naturally unnecessary.
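A sketch of code that escape analysis makes eligible for lock elimination (concatString is modeled on this style of example; whether the JIT actually removes the locks depends on the virtual machine):

```java
// The StringBuffer sb never escapes concatString(), so escape analysis can
// prove its internal lock (append() is a synchronized method) can never be
// contended, and the JIT may eliminate the locking entirely.
public class LockEliminationDemo {
    public static String concatString(String s1, String s2, String s3) {
        StringBuffer sb = new StringBuffer();  // thread-local: never escapes
        sb.append(s1);   // each append() is synchronized...
        sb.append(s2);   // ...but all three locks are candidates for removal
        sb.append(s3);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatString("a", "b", "c"));
    }
}
```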

Lock coarsening

In principle, when writing code it is always recommended to keep the scope of a synchronized block as small as possible, synchronizing only over the actual operations on shared data, so that the number of operations performed while holding the lock is minimized and, if there is contention, waiting threads can obtain the lock as quickly as possible.

This principle is correct most of the time. However, if a series of consecutive operations repeatedly locks and unlocks the same object, or the lock operations even appear inside a loop body, the frequent mutually exclusive synchronization causes unnecessary performance loss even without any thread contention. In that case the virtual machine coarsens the lock, extending its scope to cover the whole sequence of operations.
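A sketch of code eligible for lock coarsening (names are illustrative; the actual coarsening decision belongs to the JIT):

```java
// Each append() acquires and releases sb's lock, so the JIT may extend
// (coarsen) the lock to wrap the whole loop, locking once instead of 2*n times.
public class LockCoarseningDemo {
    public static String repeat(String piece, int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(piece);   // lock/unlock per call unless coarsened
            sb.append('-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeat("ab", 3));
    }
}
```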

Lightweight Locking

Lightweight locking is a lock mechanism added in JDK 6; its name is relative to the traditional locking mechanism, which it calls the “heavyweight” lock. The design intent is to reduce the performance cost that traditional heavyweight locks incur by using operating-system mutexes, in the absence of multi-threaded contention.

The object header of the HotSpot virtual machine (32 or 64 bits) is divided into two parts. The first part, the Mark Word, stores the object's own runtime data, such as its hash code and GC generational age; it is the key to implementing lightweight and biased locks. The other part stores a pointer to the object's type data in the method area, and if the object is an array there is an additional part storing the array length.

The Mark Word is designed as a dynamic, non-fixed data structure so as to store as much information as possible in a very small space; it reuses its own storage depending on the object's state.


Working process of lightweight lock:

When code is about to enter a synchronization block and the synchronization object is not locked (lock flag bits in the “01” state), the virtual machine first creates a space called the lock record in the current thread's stack frame, to store a copy of the lock object's current Mark Word (officially prefixed Displaced, i.e. the Displaced Mark Word).


The virtual machine then uses a CAS operation to attempt to update the object's Mark Word to a pointer to the lock record. If the update succeeds, the thread owns the object's lock, and the lock flag bits (the last two bits of the Mark Word) change to “00”, indicating the object is in the lightweight-locked state.


If the update fails, at least one other thread is competing with the current thread for the object's lock. The virtual machine first checks whether the object's Mark Word points into the current thread's stack frame: if so, the current thread already owns the lock and can simply proceed into the synchronization block; otherwise the lock object has been taken by another thread. Once two or more threads compete for the same lock, the lightweight lock is no longer valid and must be inflated into a heavyweight lock. The lock flag changes to “10”, the Mark Word stores a pointer to the heavyweight lock (mutex), and threads waiting for the lock thereafter must enter the blocked state.

Unlocking process:
Unlocking likewise uses a CAS operation: if the object's Mark Word still points to the thread's lock record, the object's current Mark Word is replaced with the Displaced Mark Word copied into the thread. If the replacement succeeds, the whole synchronization completes smoothly; if it fails, another thread has attempted to acquire the lock, and the suspended threads must be woken while the lock is released.

The premise of lightweight locking is that there is no contention for the entire synchronization period. Without contention, the lightweight lock successfully avoids the cost of a mutex through CAS operations; but if lock contention does occur, the CAS operations become an extra cost on top of the mutex itself, so under contention a lightweight lock is slower than the traditional heavyweight lock.

Biased locking

The purpose of biased locking is to eliminate synchronization primitives entirely when data is uncontended, further improving performance. If lightweight locking uses CAS to eliminate the mutex used for uncontended synchronization, biased locking eliminates the whole of the uncontended synchronization, even the CAS operations.

The lock is biased toward the first thread that acquires it. If the lock is never acquired by another thread during subsequent execution, the thread holding the biased lock never needs to synchronize on it again.

Assuming biased locking is enabled (with the parameter -XX:+UseBiasedLocking, the default in HotSpot since JDK 6), when a lock object is acquired by a thread for the first time, the virtual machine sets the flag bits in the object header to “01” and the bias bit to “1”, entering biased mode. At the same time, a CAS operation records the ID of the acquiring thread in the object's Mark Word. If the CAS succeeds, then every time the thread owning the biased lock enters a synchronization block on this lock, the virtual machine performs no synchronization operations at all (no locking, unlocking, or Mark Word updates).

As soon as another thread attempts to acquire the lock, the biased mode ends immediately. Depending on whether the lock object is currently locked, the bias is revoked (the bias bit reset to “0”) and the Mark Word reverts to the unlocked state (flag bits “01”) or to the lightweight-locked state (flag bits “00”); subsequent synchronization proceeds as described for lightweight locks above.


Once an object has had its identity hash code computed, it can no longer enter the biased-lock state; and when an object currently in the biased state receives a request to compute its identity hash code, its bias is revoked immediately and the lock inflates to a heavyweight lock. In the heavyweight implementation, the object header points to the heavyweight lock, and the ObjectMonitor class representing it has fields that can record the Mark Word of the unlocked state (flag bits “01”), where the original hash code can naturally be stored.

From the book: Understanding the Java Virtual Machine: JVM Advanced Features and Best Practices (3rd Edition), by Zhou Zhiming.