Concurrent programming: synchronized

Time:2022-5-7

Hello, I’m Xiao Hei, a migrant worker who lives on the Internet.

In the previous article, I shared some concepts and basic usage of threads in Java, such as how to start a thread and the producer-consumer pattern. To keep access to shared data safe and operations atomic under concurrency, you use the synchronized keyword. Today I'd like to talk about how to use synchronized and how it works under the hood.

Why use synchronized

I believe everyone must have their own answers to this question. I’d like to elaborate here. Let’s look at the following code of station ticket sales:

/**
 *The station opens two windows to sell tickets at the same time
 */
public class TicketDemo {

    public static void main(String[] args) {
        TrainStation station = new TrainStation();
        //Start two threads to sell tickets at the same time
        new Thread(station, "A").start();
        new Thread(station, "B").start();
    }
}

class TrainStation implements Runnable {
    private volatile int ticket = 10;
    @Override
    public void run() {
        while (ticket > 0) {
            System.out.println("Thread " + Thread.currentThread().getName() + " sold ticket No. " + ticket);
            ticket = ticket - 1;
        }
    }
}

The above code does not consider thread safety. Executing this code may produce the following results:

image

As you can see, both threads sold ticket No. 10, which must never happen in a real business scenario. (Imagine boarding the train and a big guy says you took his seat, tells you to get out, and calls you a ticket scalper. Wouldn't you be angry?)

So how do we solve this problem? synchronized exists to solve exactly this: keeping data shared between threads safe.

How to use it

Synchronized is mainly used in the following three ways.

Synchronized code block

public static void main(String[] args) {
    String str = "hello world";
    synchronized (str) {
        System.out.println(str);
    }
}
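The snippet above runs on a single thread, so it shows only the syntax. As a minimal sketch of a synchronized block actually guarding shared data, the following hypothetical SharedCounter class (the class, its lock field and the runDemo helper are illustrative names, not part of the original example) lets two threads increment one counter through the same lock object:

```java
// Hypothetical example class; all names here are illustrative assumptions.
class SharedCounter {
    private final Object lock = new Object(); // one lock object shared by every caller
    private int count = 0;

    void increment() {
        synchronized (lock) {      // only one thread at a time may run this block
            count = count + 1;     // the read-modify-write can no longer interleave
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Starts two threads that each call increment() perThread times.
    static int runDemo(int perThread) {
        SharedCounter counter = new SharedCounter();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(task, "A");
        Thread b = new Thread(task, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10_000)); // prints 20000
    }
}
```

Without the synchronized block, some increments could be lost and the total could come out below 20000.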

Synchronized instance method

class TrainStation implements Runnable {
    private volatile int ticket = 100;

    //The keyword is written directly on the instance method signature
    public synchronized void sale() { 
        while (ticket > 0) {
            System.out.println("Thread " + Thread.currentThread().getName() + " sold ticket No. " + ticket);
            ticket = ticket - 1;
        }
    }

    @Override
    public void run() {
        sale();
    }
}

Synchronized static method

class TrainStation implements Runnable {
    //Note that the ticket variable is declared static here, because a static method cannot access instance fields directly
    private volatile static int ticket = 100;

    //It can also be placed directly on the signature of static methods
    public static synchronized void sale() {
        while (ticket > 0) {
            System.out.println("Thread " + Thread.currentThread().getName() + " sold ticket No. " + ticket);
            ticket = ticket - 1;
        }
    }
    @Override
    public void run() {
        sale();
    }
}
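In the two method-level examples above, the whole while loop runs inside the lock, so whichever thread gets in first sells every ticket. As a sketch of a finer-grained variant, the hypothetical FairTrainStation below (the class name, sellOne, soldTickets and runDemo are my own illustrative names) takes the lock once per ticket, so both windows can take turns while each ticket is still sold exactly once:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of the article's TrainStation; names are illustrative assumptions.
class FairTrainStation implements Runnable {
    private int ticket = 10;
    private final List<Integer> sold = new ArrayList<>(); // guarded by the monitor of "this"

    // Sells at most one ticket per call; returns false once everything is sold.
    private synchronized boolean sellOne() {
        if (ticket <= 0) {
            return false;
        }
        System.out.println("Thread " + Thread.currentThread().getName() + " sold ticket No. " + ticket);
        sold.add(ticket);
        ticket = ticket - 1;
        return true;
    }

    @Override
    public void run() {
        while (sellOne()) {
            Thread.yield(); // hint to the scheduler so the other window gets a chance
        }
    }

    synchronized List<Integer> soldTickets() {
        return new ArrayList<>(sold);
    }

    // Opens two windows on the same station and returns the ticket numbers sold.
    static List<Integer> runDemo() {
        FairTrainStation station = new FairTrainStation();
        Thread a = new Thread(station, "A");
        Thread b = new Thread(station, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return station.soldTickets();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Which thread sells which ticket still depends on scheduling, but the check and the decrement now happen under one lock, so no ticket number can appear twice.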

Bytecode semantics

Running the programs shows that the synchronized keyword does ensure thread safety. How does the JVM guarantee this? What is behind the keyword? We can look at the class file compiled from the Java code. First, let's look at the compiled class for the synchronized code block, which can be viewed with javap -v followed by the class file name:

public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=4, args_size=1
         0: ldc           #2                  // String hello world
         2: astore_1
         3: aload_1
         4: dup
         5: astore_2
         6: monitorenter 			//  Monitor entry
         7: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
        10: aload_1
        11: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        14: aload_2
        15: monitorexit 				//  Monitor exit
        16: goto          24
        19: astore_3
        20: aload_2
        21: monitorexit
        22: aload_3
        23: athrow
        24: return

Look at the instructions at offsets 6 and 15. These two instructions appear only once the synchronized code block is added. A monitor is the monitor of an object. monitorenter means that execution can continue past this instruction only after the thread has acquired the object's monitor. monitorexit means that the thread exits the object's monitor after executing the synchronized code block, that is, releases it. So this object monitor is what we call a lock, and acquiring a lock means acquiring ownership of the object's monitor. The second monitorexit at offset 21 sits on the exception path, so the monitor is also released if the block throws.
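The bytecode above contains two monitorexit instructions, one for normal completion and one for the exception path. A small sketch that observes this from Java code, using the standard Thread.holdsLock method (the class and helper names are hypothetical):

```java
// Hypothetical demo class; only Thread.holdsLock is a real JDK API here.
class MonitorExitDemo {

    // Helper that always throws, simulating a failure inside the synchronized block.
    private static void fail() {
        throw new IllegalStateException("simulated failure inside the block");
    }

    // Returns true if the monitor was held inside the block and released
    // again after an exception escaped from it.
    static boolean releasedAfterException() {
        Object lock = new Object();
        boolean heldInside = false;
        try {
            synchronized (lock) {
                heldInside = Thread.holdsLock(lock); // monitorenter has already run
                fail();                              // leaves the block via the exception path
            }
        } catch (IllegalStateException e) {
            // The exception handler in the bytecode has executed monitorexit by now.
        }
        boolean heldAfter = Thread.holdsLock(lock);
        return heldInside && !heldAfter;
    }

    public static void main(String[] args) {
        System.out.println(releasedAfterException()); // prints true
    }
}
```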

Next, let’s look at the bytecode file when synchronized modifies the instance method.

public synchronized void sale();
    descriptor: ()V
	//Method flags: ACC_PUBLIC means the method is public, ACC_SYNCHRONIZED means the method is synchronized
    flags: ACC_PUBLIC, ACC_SYNCHRONIZED 
    Code:
      stack=3, locals=1, args_size=1
         0: aload_0
         1: getfield      #2                  // Field ticket:I
	//Omit other irrelevant bytecodes

You can see that when synchronized modifies an instance method, there are no monitorenter and monitorexit instructions any more. Instead, the ACC_SYNCHRONIZED flag is added directly to the method. When sale() is called at run time, the JVM checks whether the method has the ACC_SYNCHRONIZED access flag. If it does, the method is a synchronized method, and the executing thread first tries to acquire the monitor object corresponding to the method. If that succeeds, it goes on to execute sale(). While it is executing, no other thread can acquire that monitor. The monitor is released only when the method finishes or throws an exception, and only then can other threads acquire it.
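The same flag can be read at run time through reflection. Here is a minimal sketch using java.lang.reflect.Modifier.isSynchronized; the nested Station class and the isSynchronized helper are hypothetical stand-ins for the article's TrainStation:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical demo class; Station only mimics the article's method signatures.
class SyncFlagDemo {

    static class Station {
        public synchronized void sale() { }
        public static synchronized void staticSale() { }
        public void plain() { }
    }

    // Reports whether the named public no-argument method of Station carries the synchronized flag.
    static boolean isSynchronized(String methodName) {
        try {
            Method m = Station.class.getMethod(methodName);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isSynchronized("sale"));       // prints true
        System.out.println(isSynchronized("staticSale")); // prints true
        System.out.println(isSynchronized("plain"));      // prints false
    }
}
```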

So what is the bytecode file of synchronized modified static methods like?

public static synchronized void sale();
    descriptor: ()V
    flags: ACC_PUBLIC, ACC_STATIC, ACC_SYNCHRONIZED
    Code:
      stack=3, locals=0, args_size=0
         0: getstatic     #2                  // Field ticket:I
      //Omit other irrelevant bytecodes

As you can see, a synchronized static method is no different from a synchronized instance method. Both get the ACC_SYNCHRONIZED flag. The static method just has one more flag, ACC_STATIC, which indicates that the method is static.

The synchronized code block and the synchronized methods above all involve an object monitor. Which object's monitor is used in each of the three cases?

For a synchronized code block, the monitor belongs to the object specified in parentheses, that is, str in synchronized(str). The point of a synchronized block is that only one of several threads can hold the monitor at a time, so the object must be one that is shared by those threads. We can't create a new object directly in the parentheses, because then each thread would lock a different object, there would be no mutual exclusion, and safety would not be guaranteed.

For a synchronized instance method, the monitor belongs to the current instance, that is, this.

For a synchronized static method, the monitor belongs to the Class object of the class that declares the method. Every class in Java is itself represented by an object, the Class object of that class, and each class has exactly one.
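The three answers above can be checked with the standard Thread.holdsLock method, which reports whether the current thread owns a given object's monitor. A small sketch (the class and method names are hypothetical):

```java
// Hypothetical demo class; only Thread.holdsLock is a real JDK API here.
class MonitorOwnerDemo {
    private final Object lock = new Object();

    // Synchronized block: the monitor belongs to the object in parentheses.
    boolean blockHoldsLockObject() {
        synchronized (lock) {
            return Thread.holdsLock(lock) && !Thread.holdsLock(this);
        }
    }

    // Synchronized instance method: the monitor belongs to "this".
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this) && !Thread.holdsLock(MonitorOwnerDemo.class);
    }

    // Synchronized static method: the monitor belongs to the Class object.
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(MonitorOwnerDemo.class);
    }

    public static void main(String[] args) {
        MonitorOwnerDemo demo = new MonitorOwnerDemo();
        System.out.println(demo.blockHoldsLockObject());    // prints true
        System.out.println(demo.instanceMethodHoldsThis()); // prints true
        System.out.println(staticMethodHoldsClass());       // prints true
    }
}
```

One consequence is that a synchronized instance method and a synchronized static method of the same class do not exclude each other, because they lock different objects.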

Object lock (monitor)

As mentioned above, a thread needs to acquire the object monitor, that is, the object lock, before entering the synchronized code block. Before we go further, let's understand what an object in Java is made of.

Let me ask you a question first: what does the memory layout of Object obj = new Object() look like in the JVM?

Anyone who knows a little about the JVM will know that new Object() creates an object in heap memory, and Object obj is a reference in stack memory that points to the object in the heap. So how do we find out what the object in the heap is made of? Here we introduce a tool called JOL (Java Object Layout). It can be added to the project directly through Maven.

<dependency>
    <groupId>org.openjdk.jol</groupId>
    <artifactId>jol-core</artifactId>
    <version>0.9</version>
</dependency>

After the introduction, the memory distribution of the object can be printed in the code.

public static void main(String[] args) {
    Object obj = new Object();
    //parseInstance inspects the object's layout, and toPrintable turns the result into printable text
    System.out.println(ClassLayout.parseInstance(obj).toPrintable());
}

The output results are as follows:

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

The results show that the obj object occupies four rows of 4 bytes each (SIZE = 4). The first three rows are the object header (object header): 8 bytes of mark word followed by a 4-byte class pointer. The 4 bytes in the last row are padding that makes the size of the object an integer multiple of 8.
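The 16 bytes in the output can be reproduced with a little arithmetic. The sketch below assumes the header layout seen in this particular output (an 8-byte mark word plus a 4-byte compressed class pointer on a 64-bit HotSpot JVM) and 8-byte object alignment. It ignores field ordering and internal gaps, so treat it as an approximation rather than a replacement for JOL. All names are hypothetical:

```java
// Hypothetical helper; the constants are assumptions matching the JOL output above.
class ObjectSizeSketch {
    static final int MARK_WORD_BYTES = 8;
    static final int CLASS_POINTER_BYTES = 4; // compressed class pointer, as in the output above
    static final int ALIGNMENT = 8;           // objects are padded to a multiple of 8 bytes

    // Rounds size up to the next multiple of ALIGNMENT.
    static int alignUp(int size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Approximate instance size for an object whose fields occupy fieldBytes in total.
    static int instanceSize(int fieldBytes) {
        return alignUp(MARK_WORD_BYTES + CLASS_POINTER_BYTES + fieldBytes);
    }

    public static void main(String[] args) {
        System.out.println(instanceSize(0)); // prints 16: 12 header bytes + 4 padding bytes
        System.out.println(instanceSize(4)); // prints 16: one int fits where the padding was
        System.out.println(instanceSize(8)); // prints 24
    }
}
```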

image

Now let's see what changes when we print an object that is locked.

public static void main(String[] args) {
    Object obj = new Object();
    synchronized (obj){
        System.out.println(ClassLayout.parseInstance(obj).toPrintable());
    }
}

The output is as follows:

java.lang.Object object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           58 f7 19 01 (01011000 11110111 00011001 00000001) (18478936)
      4     4        (object header)                           00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4        (object header)                           e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Clearly the first eight bytes, that is, the mark word, have changed. So locking an object actually changes the object's mark word.

The eight bytes of the mark word mean different things in different states. To make the 64 bits carry more information, the JVM uses the last two bits as tag bits. The meaning of the mark word under each tag value is as follows:

|------------------------------------------------------------------------------|--------------------|
|                                  Mark Word (64 bits)                         |       State        |
|------------------------------------------------------------------------------|--------------------|
| unused:25 | identity_hashcode:31 | unused:1 | age:4 | biased_lock:1 | lock:2 |      Unlocked      |
|------------------------------------------------------------------------------|--------------------|
| thread:54 |       epoch:2        | unused:1 | age:4 | biased_lock:1 | lock:2 |    Biased lock     |
|------------------------------------------------------------------------------|--------------------|
|                        ptr_to_lock_record:62                        | lock:2 |  Lightweight lock  |
|------------------------------------------------------------------------------|--------------------|
|                    ptr_to_heavyweight_monitor:62                    | lock:2 |  Heavyweight lock  |
|------------------------------------------------------------------------------|--------------------|
|                                                                     | lock:2 |      GC mark       |
|------------------------------------------------------------------------------|--------------------|

The last two bits are the lock mark bits, and different values represent different meanings.

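|-------------|------|------------------|
| biased_lock | lock |      State       |
|-------------|------|------------------|
|      0      |  01  | Unlocked (new)   |
|      1      |  01  | Biased lock      |
|      0      |  00  | Lightweight lock |
|      0      |  10  | Heavyweight lock |
|      0      |  11  | GC mark          |
|-------------|------|------------------|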

biased_lock: indicates whether biased locking is enabled for the object. 1 means biased locking is enabled, and 0 means it is not.

age: the 4-bit age of the Java object. During GC, each time the object is copied in the survivor area, its age increases by 1. When the age reaches the configured threshold, the object is promoted to the old generation. By default, the age threshold is 15 for parallel GC and 6 for concurrent GC. Since age has only 4 bits, its maximum value is 15, which is why the maximum value of the -XX:MaxTenuringThreshold option is 15.

identity_hashcode: the 31-bit identity hash code of the object, computed lazily. Calling System.identityHashCode() computes it and writes the result into the object header. When the object is locked, the value is moved to the monitor.

thread: the ID of the thread holding the biased lock.

epoch: the bias timestamp.

ptr_to_lock_record: pointer to the lock record in the stack.

ptr_to_heavyweight_monitor: pointer to the monitor.
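As a sketch of how these bits can be read, the hypothetical decoder below takes the header bytes exactly as JOL printed them earlier and extracts the lock state and age. It assumes a little-endian machine (so the first printed byte is the least significant one) and the HotSpot encoding in which lock bits 01 with biased_lock 0 mean unlocked, 01 with biased_lock 1 mean biased, 00 lightweight, 10 heavyweight and 11 GC mark. The class and method names are my own:

```java
// Hypothetical decoder; the bit layout is the 64-bit mark word layout described above.
class MarkWordDecoder {

    // Builds a long from bytes listed in memory order on a little-endian machine.
    static long fromLittleEndian(int... bytes) {
        long value = 0;
        for (int i = bytes.length - 1; i >= 0; i--) {
            value = (value << 8) | (bytes[i] & 0xFF);
        }
        return value;
    }

    // Interprets the lowest three bits: biased_lock (1 bit) followed by lock (2 bits).
    static String lockState(long markWord) {
        int lockBits = (int) (markWord & 0b11);
        int biasedBit = (int) ((markWord >>> 2) & 0b1);
        switch (lockBits) {
            case 0b01: return biasedBit == 1 ? "biased" : "unlocked";
            case 0b00: return "lightweight";
            case 0b10: return "heavyweight";
            default:   return "gc-mark";
        }
    }

    // The age field is the 4 bits above biased_lock (meaningful in the unlocked and biased layouts).
    static int age(long markWord) {
        return (int) ((markWord >>> 3) & 0xF);
    }

    public static void main(String[] args) {
        long fresh = fromLittleEndian(0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00);
        long locked = fromLittleEndian(0x58, 0xf7, 0x19, 0x01, 0x00, 0x00, 0x00, 0x00);
        System.out.println(lockState(fresh));  // prints unlocked
        System.out.println(lockState(locked)); // prints lightweight
        System.out.println(age(fresh));        // prints 0
    }
}
```

The second value is the header printed inside the synchronized block earlier: its low two bits are 00, and the rest is the pointer to the lock record.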

Lock upgrade process

Since there are unlocked, biased, lightweight and heavyweight states, how does a lock get upgraded from one to the next? Let's take a look.

New object

From the structure of the object header and the memory layout printed above, we can see that the lock flag bits of a newly created object are 01 and biased_lock is 0, indicating that the object is unlocked.

Biased lock

Biased lock means that when a piece of synchronized code is always accessed by the same thread and there is no competition from other threads, that thread automatically gets the lock on later accesses, which reduces the cost of acquiring the lock and improves performance.

When a thread accesses the synchronized code block and acquires the lock, its thread ID is stored in the mark word. From then on, when the thread enters and exits the synchronized block, it no longer locks and unlocks through CAS operations. It only checks whether the mark word holds a biased lock pointing to the current thread. Acquiring and releasing a lightweight lock depends on several CAS atomic instructions, while a biased lock needs a CAS atomic instruction only once, when the ThreadID is replaced.

Lightweight lock

A lightweight lock comes into play when a lock is biased and another thread competes for it: the biased lock is upgraded to a lightweight lock. The other case is when the JVM's biased locking switch is turned off, in which case the lock object starts out as a lightweight lock.

Lightweight locks target the situation where few threads compete for the lock object and no thread holds the lock for long. Blocking a thread requires the CPU to switch from user mode to kernel mode, which is costly. If the lock is released shortly after the thread blocks, the cost outweighs the gain. So in this case the thread is not blocked at all. It spins and waits for the lock to be released.

When a thread is about to enter the synchronized block and the lock object is in the unlocked state, the virtual machine first creates a space called a lock record in the current thread's stack frame and copies the object's current mark word into it.

Then the virtual machine uses a CAS operation to try to update the object's mark word to a pointer to the lock record, and points the owner pointer in the lock record at the object's mark word.

If the operation succeeds, the current thread has acquired the lock. If it fails, another thread holds the lock, and the current thread tries to acquire it again by spinning.

When a lightweight lock is released, a CAS operation is used to copy the mark word saved in the lock record back into the object header. If it succeeds, no competition occurred. If it fails, the lock is contended, and it inflates into a heavyweight lock.
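The "CAS, then spin on failure" idea can be illustrated with a toy lock. This is only a simplified model written with java.util.concurrent.atomic.AtomicReference. It is not how HotSpot implements lightweight locks: there is no lock record, no mark word and no inflation, and the lock is not reentrant. ToySpinLock and runDemo are hypothetical names:

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of CAS-based acquisition with spinning; NOT the JVM's real implementation.
class ToySpinLock {
    // null means unlocked; otherwise the thread that currently owns the lock.
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    void lock() {
        Thread me = Thread.currentThread();
        // Try to swing owner from null to the current thread; on failure, spin and retry.
        while (!owner.compareAndSet(null, me)) {
            Thread.yield();
        }
    }

    void unlock() {
        // Only the owner may release; a failed CAS means the caller did not hold the lock.
        if (!owner.compareAndSet(Thread.currentThread(), null)) {
            throw new IllegalMonitorStateException("current thread does not own the lock");
        }
    }

    // Two threads increment a plain int under the toy lock, perThread times each.
    static int runDemo(int perThread) {
        ToySpinLock lock = new ToySpinLock();
        int[] counter = new int[1]; // non-atomic state guarded by the toy lock
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                lock.lock();
                try {
                    counter[0]++;
                } finally {
                    lock.unlock();
                }
            }
        };
        Thread a = new Thread(task, "A");
        Thread b = new Thread(task, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10_000)); // prints 20000
    }
}
```

Spinning like this only pays off when the lock is held briefly. If the holder keeps the lock for a long time, the waiting thread burns CPU for nothing, which is why the JVM falls back to blocking.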

Heavyweight lock

Heavyweight lock means that once one thread has acquired the lock, all other threads waiting to acquire it are blocked. It is implemented through the object's monitor, which depends on the mutex (mutual exclusion lock) of the underlying operating system. Blocking and waking threads requires switching between user mode and kernel mode, so the performance cost is high.
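The blocking described above is visible from Java through Thread.getState(): a thread waiting to enter a synchronized block that another thread holds reports the BLOCKED state. A minimal sketch with hypothetical class and method names, and a five-second timeout that is my own choice:

```java
// Hypothetical demo class; Thread.getState and Thread.State.BLOCKED are real JDK APIs.
class BlockedStateDemo {

    // Holds a lock on the calling thread, starts a second thread that wants the
    // same lock, and returns the state observed for that second thread.
    static Thread.State observeWaiterState() {
        final Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // Nothing to do; the thread only needs to compete for the monitor.
            }
        }, "waiter");

        Thread.State seen;
        synchronized (lock) {
            waiter.start();
            long deadline = System.nanoTime() + 5_000_000_000L; // give up after about 5 seconds
            while (waiter.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.yield();
            }
            seen = waiter.getState();
        } // releasing the monitor here lets the waiter proceed

        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observeWaiterState());
    }
}
```

On a typical run the observed state is BLOCKED, since the waiter cannot get the monitor while the main thread is still inside the synchronized block.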

The whole process of lock upgrade can be more fully shown in the following figure.

image

If you need the original picture, follow my official account [Xiaohei says Java] and reply "Lock upgrade" in the background to get it.


OK, that’s all for today. I’ll see you next time.