Don’t panic when you’re asked about ThreadLocal during the interview. All the answers you want are here

Time:2020-12-2

When developers run into thread safety problems, most of them reach for the synchronized keyword. Only one thread at a time is allowed to enter the locked method or code block, which keeps the operation atomic and protects shared resources from inconsistent updates. This locking mechanism works well when concurrency is low. When concurrency is high, a large number of threads end up waiting for the same object lock, and system throughput drops sharply.

The JDK developers were presumably aware of the drawbacks of synchronized, so other tools such as volatile and ThreadLocal exist to deal with thread safety problems. A volatile variable is read from and written to main memory directly instead of a thread-private cached copy, so it guarantees visibility (but not the atomicity of compound operations such as count++); it is mainly used in write-once, read-many scenarios. ThreadLocal creates a copy of a variable for each thread, so each thread only ever accesses its own copy. Because the copies are isolated from each other, there is no thread safety problem. This approach trades space for time. The other mechanisms will be discussed later. Today we will focus on ThreadLocal.

Next, we will introduce ThreadLocal from the following aspects:

  • How to use ThreadLocal?
  • How ThreadLocal works
  • Source code analysis of ThreadLocal
  • What are the pitfalls of ThreadLocal?

1. How to use ThreadLocal?

Before we use ThreadLocal, let’s take a look at an example
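The original listing for this example is not available here. As a stand-in, the following is a minimal sketch of what such a program could look like: 20 threads increment one shared field without any synchronization. The class name SharedCounterDemo and the helper run are assumptions for illustration, and join() is used to wait for the workers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction: class and method names are assumptions.
class SharedCounterDemo {
    // One field shared by every thread; count++ on it is not atomic.
    private int count = 0;

    void calc() {
        count++;
    }

    int getCount() {
        return count;
    }

    // Starts `threads` workers that each call calc() once, waits for all
    // of them, and returns the final value of the shared counter.
    static int run(int threads) {
        SharedCounterDemo demo = new SharedCounterDemo();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                demo.calc();
                System.out.println("i:" + id + ",count:" + demo.getCount());
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo.getCount();
    }

    public static void main(String[] args) {
        System.out.println("realCount:" + run(20));
    }
}
```

Because the race depends on thread scheduling, a given run of this sketch may print 20 or a smaller number; the lost update is possible on every run but not guaranteed.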


Output:

i:8,count:8
i:7,count:8
i:11,count:10
i:4,count:11
i:13,count:12
i:2,count:8
i:0,count:8
i:9,count:8
i:3,count:8
i:1,count:8
i:5,count:8
i:6,count:8
i:12,count:11
i:10,count:9
i:14,count:15
i:18,count:17
i:15,count:18
i:17,count:16
i:16,count:15
i:19,count:18
realCount:18

We can see that realCount ends up wrong. The expected result is 20, but the actual value is 18. This is a thread safety problem.

Next, change the program to use ThreadLocal. What is the result?
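The original listing is not available here either. The sketch below is consistent with the description given later in this article (calc calls threadLocal.set(getCount() + 1), and getCount returns 0 when threadLocal.get() returns null), but the class name ThreadLocalCounterDemo and the helper run are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reconstruction: class and helper names are assumptions.
class ThreadLocalCounterDemo {
    private final ThreadLocal<Integer> threadLocal = new ThreadLocal<>();

    void calc() {
        threadLocal.set(getCount() + 1);
    }

    int getCount() {
        Integer value = threadLocal.get();
        // get() returns null for a thread that has never set a value.
        return value == null ? 0 : value;
    }

    // Returns { number of workers that saw count == 1,
    //           count seen by the calling thread }.
    static int[] run(int threads) {
        ThreadLocalCounterDemo demo = new ThreadLocalCounterDemo();
        AtomicInteger workersSeeingOne = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                demo.calc();
                int count = demo.getCount();
                System.out.println("i:" + id + ",count:" + count);
                if (count == 1) {
                    workersSeeingOne.incrementAndGet();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // The calling thread never called calc(), so it still sees 0.
        return new int[] { workersSeeingOne.get(), demo.getCount() };
    }

    public static void main(String[] args) {
        System.out.println("realCount:" + run(20)[1]);
    }
}
```

In this sketch every worker prints count:1 and the calling thread prints realCount:0; only the order of the i lines varies between runs.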


Output:

i:6,count:1
i:10,count:1
i:3,count:1
i:0,count:1
i:7,count:1
i:11,count:1
i:9,count:1
i:5,count:1
i:8,count:1
i:1,count:1
i:4,count:1
i:2,count:1
i:13,count:1
i:15,count:1
i:14,count:1
i:19,count:1
i:18,count:1
i:17,count:1
i:12,count:1
i:16,count:1
realCount:0

We can see that the results are quite different from the previous example. First, every count is 1, whereas before count took many values such as 8, 10, 11 and 12. Second, realCount used to be 18, but now it is 0. Why the difference?

2. Working principle of ThreadLocal

Let’s take a look at example 1

We can see that multiple threads can access the shared resource count at the same time. While one thread is executing count++, other threads may be executing count++ too. However, count++ is not atomic (it is a read, an increment and a write), and one thread’s update may not yet be visible to the others, so another thread can read the old value of count, add 1 to it, and overwrite the first update. This is how realCount, expected to be 20, ends up as 18.

Let’s look at the situation in example 2

As shown in the figure, at a high level, ThreadLocal creates a copy of the variable for each thread, so each thread accesses only its own copy and the copies are isolated from each other.

At a finer level, each thread holds a ThreadLocalMap. Each ThreadLocalMap contains an Entry array, and each Entry is composed of a ThreadLocal (the key) and the data (here, count). In this way, each thread has its own count. In example 2, when thread 1 calls the calc method, getCount is called first. Because threadLocal.get() returns null on the first call, getCount returns 0, so threadLocal.set(getCount() + 1) becomes threadLocal.set(0 + 1), which sets the ThreadLocal value in thread 1 to 1. When thread 2 calls calc, it also calls getCount first; threadLocal.get() again returns null because this is thread 2’s first call, so getCount returns 0 and threadLocal.set(getCount() + 1) sets the ThreadLocal value in thread 2 to 1 as well. In the end, the ThreadLocal value in every thread is 1.

Also, why is the realCount printed in example 2 equal to 0?

Because testThreadLocal.getCount() is invoked in the main thread. Changes made by the other threads only affect their own copies, not the main thread’s copy. The main thread never set a value, so getCount returns the initial value 0, and the final result is 0.

3. ThreadLocal source code analysis

Before introducing ThreadLocal, let’s take a look at the Thread class:

ThreadLocal.ThreadLocalMap threadLocals = null;

You can see that a member variable called threadLocals is defined in the Thread class. Its type is ThreadLocal.ThreadLocalMap. Evidently, ThreadLocalMap is a nested class of ThreadLocal, which confirms what I drew in the figure: each thread has a ThreadLocalMap object.

Let’s focus on ThreadLocalMap:

static class ThreadLocalMap {

Because the class is too long, I’ve omitted part of it here. From the above code, we can see that ThreadLocalMap contains an array called table whose element type is Entry. Entry is a subclass of WeakReference (weak reference). An Entry holds the ThreadLocal variable and an Object value. The ThreadLocal variable is the referent of the WeakReference.
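To make the shape of Entry concrete, here is a small stand-alone model of it. The nested Entry class follows the structure described above (weak key, strong value); the wrapper class EntryModelDemo and the helper becomesStaleAfterClear are assumptions for illustration. Calling clear() stands in for what the garbage collector would do once the key has no strong references left.

```java
import java.lang.ref.WeakReference;

// Illustrative model only; EntryModelDemo is a hypothetical name.
class EntryModelDemo {
    // The key is held weakly (it is the referent); the value is held strongly.
    static class Entry extends WeakReference<ThreadLocal<?>> {
        Object value;

        Entry(ThreadLocal<?> key, Object value) {
            super(key);
            this.value = value;
        }
    }

    // Returns true if the entry first resolves to its key and, after the
    // referent is cleared, has a null key but still holds its value.
    static boolean becomesStaleAfterClear() {
        ThreadLocal<String> key = new ThreadLocal<>();
        Entry e = new Entry(key, "payload");
        boolean resolvesBefore = (e.get() == key);
        e.clear(); // simulates the GC clearing the weak referent
        return resolvesBefore && e.get() == null && "payload".equals(e.value);
    }

    public static void main(String[] args) {
        System.out.println(becomesStaleAfterClear());
    }
}
```

An entry in this state (key gone, value still present) is what the rest of this article calls a dirty or stale entry.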

Next, let’s go back to the ThreadLocal class. Four methods are commonly used: get(), initialValue(), set(T value) and remove(). We will introduce them one by one.
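Before reading the source, it may help to see these methods from the caller’s side. The following is a minimal usage sketch on a single thread; the class name ThreadLocalLifecycleDemo and the helper lifecycle are assumptions for illustration.

```java
// Usage sketch; ThreadLocalLifecycleDemo is a hypothetical name.
class ThreadLocalLifecycleDemo {
    // Records what get() returns before set(), after set(), and after remove().
    static String lifecycle() {
        ThreadLocal<String> local = new ThreadLocal<>();
        StringBuilder trace = new StringBuilder();

        trace.append(local.get()).append(','); // no value yet: default initialValue() is null
        local.set("hello");
        trace.append(local.get()).append(','); // the value set by this thread
        local.remove();
        trace.append(local.get());             // removed: back to the initial value (null)

        local.remove(); // the get() above re-created an entry, so clean up again
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(lifecycle()); // prints "null,hello,null"
    }
}
```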

First, let’s look at the get() method:

public T get() {

The getMap method:

ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

It can’t be simpler. It directly returns the member variable threadLocals of the given thread.

Let’s look at the getEntry method next:

private Entry getEntry(ThreadLocal<?> key) {

As I said before, Entry is a subclass of WeakReference, so the e.get() call resolves to:

public T get() {
    return this.referent;
}

The referent is returned, which is the ThreadLocal object passed in to the constructor.

What logic does getEntryAfterMiss contain?

private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;

    while (e != null) {
        ThreadLocal<?> k = e.get();
        if (k == key)
            return e;
        if (k == null)
            expungeStaleEntry(i);
        else
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null;
}

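Starting from slot i, this method probes the table linearly until it finds the key or reaches an empty slot. Whenever it meets an entry whose key is null, it calls the expungeStaleEntry method, which we will focus on later.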

Let’s look at the setInitialValue method next:

protected T initialValue() {
    return null;
}

private T setInitialValue() {

The initialValue() method is the second method we will introduce:

protected T initialValue() {
    return null;
}

We can see that this method has only a default implementation that returns null. It is meant to be overridden in a subclass when the user wants a different initial value.
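As an illustration of overriding initialValue(), the sketch below supplies an initial value of 0 in two ways: an anonymous subclass, and the ThreadLocal.withInitial factory available since Java 8. The class name InitialValueDemo and the helper incrementTwice are assumptions.

```java
// Illustrative sketch; InitialValueDemo is a hypothetical name.
class InitialValueDemo {
    // Anonymous subclass that overrides initialValue().
    static final ThreadLocal<Integer> OVERRIDDEN = new ThreadLocal<Integer>() {
        @Override
        protected Integer initialValue() {
            return 0;
        }
    };

    // Java 8+ shorthand with the same effect.
    static final ThreadLocal<Integer> WITH_INITIAL = ThreadLocal.withInitial(() -> 0);

    // get() never returns null here, so no null check is needed.
    static int incrementTwice(ThreadLocal<Integer> counter) {
        counter.set(counter.get() + 1);
        counter.set(counter.get() + 1);
        int result = counter.get();
        counter.remove(); // the next get() starts again from initialValue()
        return result;
    }

    public static void main(String[] args) {
        System.out.println(incrementTwice(OVERRIDDEN));   // prints 2
        System.out.println(incrementTwice(WITH_INITIAL)); // prints 2
    }
}
```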

Next, focus on the set method of ThreadLocalMap:

private void set(ThreadLocal<?> key, Object value) {

The replaceStaleEntry method, which set calls when it meets an entry whose key is null, also calls the expungeStaleEntry method.

Let’s look at the createMap method used by setInitialValue:

void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

The code is very simple: it just creates a new ThreadLocalMap object and assigns it to the thread.

OK, that covers the get() and initialValue() methods.

The set(T value) method is described below:

public void set(T value) {

so easy

Finally, take a look at the remove() method:

public void remove() {

The key to this method is the remove method of the ThreadLocalMap class:

private void remove(ThreadLocal<?> key) {

The clear method it calls is also very simple. It just sets the referent to null, that is, it clears the reference:

public void clear() {
    this.referent = null;
}

We can see that the get(), set(T value) and remove() methods can all end up calling the expungeStaleEntry method. Let’s focus on it:

private int expungeStaleEntry(int staleSlot) {

This method first clears the dirty entry at the given slot and then walks forward through the table until it reaches a slot where table[i] == null. During the traversal, any further dirty entry it meets is cleared as well. A live entry is rehashed instead: if the index h computed from its hash differs from its current index i, a hash conflict occurred when the entry was inserted and its position was shifted backward by linear probing. Now that a dirty entry in front of it has been cleared, the entry is moved forward to the first free slot starting at h. Otherwise, the next call to set() or get() would hit the null slot in front of the entry, stop probing there, and fail to find it.
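The clean-up and rehash steps described above can be sketched with a toy open-addressing table. This is a simplified model, not the JDK source: the class ExpungeSketch, its Entry with an explicit hash field, and the helpers put, slotOf and demo are all assumptions, chosen so that a collision chain can be staged by hand.

```java
// Toy model only: names mirror the description above, not the JDK source.
class ExpungeSketch {
    static final class Entry {
        Object key;      // null models a key the GC has already cleared
        Object value;
        final int hash;  // supplied by the caller so collisions can be staged

        Entry(Object key, Object value, int hash) {
            this.key = key;
            this.value = value;
            this.hash = hash;
        }
    }

    final Entry[] table = new Entry[8];
    int size;

    static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    // Inserts with linear probing (this toy table is never filled up).
    void put(Object key, Object value, int hash) {
        int i = hash & (table.length - 1);
        while (table[i] != null) {
            i = nextIndex(i, table.length);
        }
        table[i] = new Entry(key, value, hash);
        size++;
    }

    // Returns the slot currently holding the key, or -1.
    int slotOf(Object key) {
        for (int i = 0; i < table.length; i++) {
            if (table[i] != null && table[i].key == key) {
                return i;
            }
        }
        return -1;
    }

    // Clears the stale slot, then scans forward to the next empty slot,
    // dropping other stale entries and rehashing displaced live ones.
    int expungeStaleEntry(int staleSlot) {
        int len = table.length;
        table[staleSlot].value = null;
        table[staleSlot] = null;
        size--;

        int i = nextIndex(staleSlot, len);
        for (Entry e; (e = table[i]) != null; i = nextIndex(i, len)) {
            if (e.key == null) {
                e.value = null;
                table[i] = null;
                size--;
            } else {
                int h = e.hash & (len - 1);
                if (h != i) {
                    table[i] = null;
                    while (table[h] != null) {
                        h = nextIndex(h, len);
                    }
                    table[h] = e;
                }
            }
        }
        return i;
    }

    // A, B and C all hash to slot 1, so they occupy slots 1, 2 and 3.
    static int[] demo() {
        ExpungeSketch map = new ExpungeSketch();
        Object a = "A", b = "B", c = "C";
        map.put(a, 1, 1);
        map.put(b, 2, 1);
        map.put(c, 3, 1);
        map.table[1].key = null; // pretend A's key was collected
        int next = map.expungeStaleEntry(1);
        return new int[] { map.slotOf(b), map.slotOf(c), map.size, next };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo())); // prints [1, 2, 2, 4]
    }
}
```

After the stale entry in slot 1 is removed, B moves forward from slot 2 to slot 1 and C from slot 3 to slot 2, so a lookup that starts probing at slot 1 still reaches both of them.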

Why is this done?

We know that the Entry object contains a ThreadLocal and a value, and that the ThreadLocal is the referent of a weak reference. Once no strong reference to the ThreadLocal remains, the next garbage collection reclaims it and sets the referent to null. The table array then accumulates entries whose ThreadLocal is null but whose value is not. Such entries have no practical use: they can never be retrieved through getEntry, because it checks if (e != null && e.get() == key).

Why use weak reference?

If a strong reference were used, then even after the user code no longer references the ThreadLocal, the ThreadLocalMap would still hold a reference to it for as long as the thread is alive. The GC could not reclaim it, which would lead to a memory leak. This is especially true when the thread runs for a long time.

In addition, with a thread pool, threads are not destroyed after a task finishes; they are returned to the pool and reused next time. The ThreadLocal could then never be released, which would eventually lead to a memory leak.

4. What are the pitfalls of ThreadLocal

Memory leak problem:

Even though ThreadLocal uses WeakReference, a memory leak is still possible, because only the key of the Entry (i.e. the ThreadLocal object) is a weak reference; the value is not. The following chain of strong references still exists:

Thread -> ThreadLocalMap -> Entry -> value

To solve this problem, you need to call the get(), set(T value) or remove() methods. However, the clean-up performed by get() and set(T value) only happens after the garbage collector has collected the key. If the garbage collector has not collected it in time, the value is still not released.

Therefore, the safest way is to call the remove() method manually after you finish using the ThreadLocal. As can be seen from the source code, this method clears both the key (i.e. the ThreadLocal object) and the value in the entry.
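A common way to apply this advice is to call remove() in a finally block. The sketch below uses a single-thread pool so that two tasks are guaranteed to run on the same reused thread; the class RemoveInFinallyDemo, the CONTEXT variable and the value "request-1" are assumptions for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch; all names here are hypothetical.
class RemoveInFinallyDemo {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks on the same pooled thread and returns what the
    // second task finds in CONTEXT.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                try {
                    CONTEXT.set("request-1");
                    // ... work that reads CONTEXT would go here ...
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove(); // release the value before the thread is reused
                    }
                }
            };
            Callable<String> second = () -> CONTEXT.get();

            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // prints request-1 (stale value)
        System.out.println(secondTaskSees(true));  // prints null
    }
}
```

Without the remove() call, the second task inherits the value left behind by the first one, which is both a leak and a source of incorrect data.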

Thread safety issues:

Some readers may think that using ThreadLocal rules out thread safety problems altogether. That is wrong. Suppose we define a static variable count and, in a multithreaded program, each thread modifies its ThreadLocal value and also updates count. The update to count is still unsafe, because a static variable is shared by all threads; no per-thread copy of it is kept.
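The difference between the two kinds of state can be seen in a small sketch. The class SharedStaticDemo and its members are assumptions for illustration. The threads are deliberately run one after another so that the result is reproducible; if they ran concurrently, the unsynchronized update of count would be exposed to the same lost-update problem as example 1.

```java
// Illustrative sketch; all names here are hypothetical.
class SharedStaticDemo {
    static int count = 0; // a single variable shared by all threads
    static final ThreadLocal<Integer> LOCAL = ThreadLocal.withInitial(() -> 0);

    // Returns { thread-local value seen by the last worker, shared count }.
    static int[] run(int threads) {
        count = 0;
        int[] lastLocal = new int[1];
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                LOCAL.set(LOCAL.get() + 1); // per-thread copy: always goes 0 -> 1
                count = count + 1;          // shared: every thread touches the same field
                lastLocal[0] = LOCAL.get();
            });
            t.start();
            try {
                t.join(); // sequential on purpose, to keep the result deterministic
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { lastLocal[0], count };
    }

    public static void main(String[] args) {
        int[] r = run(3);
        System.out.println("local:" + r[0] + ",count:" + r[1]); // prints local:1,count:3
    }
}
```

The thread-local value never exceeds 1 because each thread has its own copy, while count accumulates across threads and therefore still needs its own protection, such as synchronization or an atomic type.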

5. Summary

1. Each thread has a ThreadLocalMap object. Each ThreadLocalMap contains an Entry array, and each Entry is composed of a key (ThreadLocal) and a value (data).

2. The key of entry is a weak reference and can be recycled by the garbage collector.

3. The four most commonly used methods of ThreadLocal are get(), initialValue(), set(T value) and remove(). Except for initialValue, these methods can call the expungeStaleEntry method to clean up entries whose key == null.

4. ThreadLocal can still suffer from memory leaks and thread safety problems. After using it, you need to call the remove() method manually.

If you like this article, please follow my public account, Su San talks about technology, where I share plenty of practical content. Thank you.
