What is thread safety? An in-depth explanation

Time:2021-6-10

Preface

Welcome to the operating system series, which uses diagrams and plain language so that even beginners can follow along and get started quickly.

The previous article introduced the basics of processes and threads. A process can contain multiple threads. Although communication between threads is very convenient (they live in the same process), it brings thread safety problems. This article introduces the methods the operating system uses to solve multithreaded safety problems. Let's get started.

I hope readers will form the habit of thinking and summarizing after reading an article. Only then does knowledge become your own rather than something merely memorized.


A little story

I believe everyone enjoys a paid toilet break at work, and A Xing is no exception. However, the floor where our company is located has few stalls, which is a constant worry.

Every time A Xing (thread A) goes to the toilet (the shared resource) and finds the door locked, it means a colleague is in the stall (thread B holds the lock), so he can only wait outside helplessly. Soon the toilet flushes and the colleague comes out (thread B releases the lock). A Xing enters the stall and locks the door (thread A holds the lock) to enjoy his own space. Colleagues who arrive later have to wait in line. Everything is in order.

If the door lock is broken, order no longer exists. Going to the toilet is no longer a pleasure but a nerve-racking affair, since the door might be opened at any moment. What's worse, the person who opens the door might be a girl. That is not only a thread safety problem, but also an array out of bounds.

The point of the story is that when shared resources are operated on in a multithreaded environment without proper cooperation (mutual exclusion and synchronization) between the threads, things will go wrong sooner or later.

Race conditions

Because threads share the resources of their process, when the operating system schedules the threads in a process, they inevitably compete for those shared resources. If no effective measures are taken, the shared resources end up in an inconsistent state!


As a small example, create two threads that each increment the shared variable i by 1, 1000 times.

Normally, the final value of i should be 2000, but that is not always the case. Running the code twice gives the following results

  • result:2000
  • result:1855

The two runs gave 2000 and 1855 respectively. The result can differ from run to run, which is unacceptable in a program. Even though the error shows up with low probability, it will show up eventually.
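The experiment above can be sketched in Java roughly as follows. This is a minimal sketch, not the article's original code: the class name RaceDemo and its helper methods are hypothetical. With only 1000 iterations per thread the result is often exactly 2000, and a lost update such as 1855 only shows up on some runs.

```java
// Hypothetical demo class; the name RaceDemo is not from the original article.
class RaceDemo {
    static int i = 0; // shared variable, deliberately unprotected

    // Starts two threads that each run i++ 1000 times, then returns the final i.
    static int run() {
        i = 0;
        Runnable task = () -> {
            for (int n = 0; n < 1000; n++) {
                i++; // not atomic: load, add, store
            }
        };
        Thread a = new Thread(task, "thread-A");
        Thread b = new Thread(task, "thread-B");
        a.start();
        b.start();
        join(a);
        join(b);
        return i;
    }

    // Waits for a thread to finish without leaking the checked exception.
    static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("result:" + run());
    }
}
```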

Assembly instructions

To understand what is going on, we have to look at how the CPU executes the code at the instruction level, taking adding 1 to i as an example.

As it turns out, a single addition actually executes as 3 instructions on the CPU.

Now simulate thread A and thread B running, assuming the memory variable i is 0 at this point. Thread A loads the value of i from memory into a register and adds 1 to the register value, so the value of i in the register is 1. Just as it is about to write i back to memory in the next step, its time slice is used up, a thread context switch occurs, and the thread's private state is saved to the thread control block (TCB).

The operating system schedules thread B to execute, and the memory variable i is still 0. Thread B performs the same steps as thread A. Fortunately, before its time slice is used up, thread B finishes adding 1 and writes the result back to memory, so the memory variable i is 1.

After thread B's time slice is used up, a thread context switch occurs and thread A resumes from the state it was in when it was suspended. It writes its register value of i back to memory, and the memory variable is set to 1 again.

Logically, the final value of i should be 2. However, because scheduling is outside our control, the final value of i is 1. Each thread performs the following three steps

  • Step 1: load the value of i from memory into a register
  • Step 2: add 1 to the value of i in the register
  • Step 3: store the value of i in the register back to memory
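The interleaving described above can be replayed deterministically in a single thread. The following is an illustrative sketch only: InterleavingDemo, memoryI, registerA and registerB are hypothetical names, with local variables standing in for memory and per-thread registers.

```java
// Hypothetical single-threaded replay of the unlucky schedule described above.
class InterleavingDemo {
    static int simulate() {
        int memoryI = 0; // the shared variable i in memory
        int registerA;   // thread A's private register
        int registerB;   // thread B's private register

        // Thread A runs step 1 and step 2, then its time slice ends.
        registerA = memoryI;       // step 1: load i (0)
        registerA = registerA + 1; // step 2: register holds 1, not written back yet

        // Context switch: thread B runs all three steps.
        registerB = memoryI;       // step 1: load i (still 0)
        registerB = registerB + 1; // step 2: register holds 1
        memoryI = registerB;       // step 3: i in memory becomes 1

        // Context switch back: thread A resumes at step 3.
        memoryI = registerA;       // step 3: overwrites i with 1, B's update is lost

        return memoryI;
    }

    public static void main(String[] args) {
        System.out.println("final i = " + simulate());
    }
}
```

Two increments ran, yet the value in memory is 1, which is the lost update the text describes.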

Summary

This situation is called a race condition. When multiple threads compete to operate on shared resources and a thread context switch happens at an unlucky moment during execution, the final result is wrong. In fact, each run may produce a different result, so the output is nondeterministic.

Mutual exclusion and synchronization

To solve the thread safety problems caused by race conditions, the operating system relies on mutual exclusion and synchronization.

The concept of mutual exclusion

Code in which multiple threads operate on shared variables may lead to a race condition, so we call such code a critical section. It is a code fragment that accesses shared resources and must not be executed by multiple threads at the same time.

Therefore, we want this code to be mutually exclusive (mutual exclusion): only one thread at a time can execute the critical section, while the other threads block and wait, which produces a queuing effect.

Mutual exclusion is not limited to competition between threads. It can also be used between multiple processes to keep shared resources from being corrupted.

The concept of synchronization

Mutual exclusion solves the problem of multiple processes/threads using a critical section, but it does not solve the problem of multiple processes/threads working together.

We all know that in multithreading, each thread executes its own instructions in sequence, while the threads themselves are independent of each other and advance at unpredictable speeds. But sometimes we want multiple threads to cooperate closely to accomplish a common task.

Synchronization means that multiple processes/threads may need to wait for each other and exchange messages at certain key points. This mutually constrained waiting and exchange of information is called process/thread synchronization.

For example, there are two roles: R&D and quality control. To test a feature, quality control has to wait for R&D to finish development, and to fix bugs, R&D has to wait for quality control to submit bugs after testing. The normal process is that R&D finishes development and notifies quality control to test, and after testing, quality control notifies R&D to fix the bugs.


The difference between mutual exclusion and synchronization

  • Mutual exclusion: a resource can be accessed by only one visitor at a time, so access is unique and exclusive. However, mutual exclusion cannot constrain the order in which visitors access the resource, that is, access is unordered (operation A and operation B cannot execute at the same time)
  • Synchronization: on the basis of mutual exclusion, other mechanisms make visitors access the resource in order. In most cases, synchronization already implies mutual exclusion (operation A must execute before operation B, and operation C must execute after both operation A and operation B have completed)

Put differently, synchronization is a more complex form of mutual exclusion, and mutual exclusion is a special case of synchronization. Mutual exclusion means two threads cannot run at the same time: they exclude each other, and one must finish before the other can run. Synchronization also means they cannot run at the same time, but in addition the threads must run in a certain order (which is itself a kind of mutual exclusion)!

Implementation of mutual exclusion and synchronization

Mutual exclusion and synchronization ensure correct cooperation among multiple processes/threads, but they are only concepts. The operating system must provide concrete implementations. There are two of them

  • Lock: lock and unlock operations (mutual exclusion)
  • Semaphore: P and V operations (synchronization)

Both can implement mutual exclusion between processes/threads. The semaphore is more powerful than the lock, because it can also easily implement synchronization between processes/threads.

Lock

As the name suggests, a lock locks the critical section. Any thread that wants to enter the critical section must first perform the lock operation. Only when locking succeeds can it enter the critical section, and it releases the lock when it leaves, which achieves mutual exclusion.
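As a minimal Java sketch of the lock-then-unlock pattern around a critical section, assuming java.util.concurrent.locks.ReentrantLock as the lock (the article does not name a specific lock) and using the hypothetical class name LockedCounterDemo:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: two threads increment a shared counter under a lock.
class LockedCounterDemo {
    static final ReentrantLock lock = new ReentrantLock();
    static int i = 0; // shared variable, only touched while holding the lock

    static int run() {
        i = 0;
        Runnable task = () -> {
            for (int n = 0; n < 1000; n++) {
                lock.lock();       // lock before entering the critical section
                try {
                    i++;           // critical section
                } finally {
                    lock.unlock(); // always release when leaving
                }
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        join(a);
        join(b);
        return i;
    }

    static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("result:" + run());
    }
}
```

Releasing in a finally block is a common Java habit so that an exception inside the critical section does not leave the lock held.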


Lock implementations are divided into busy-waiting locks and non-busy-waiting locks

Busy-waiting lock

Test-and-set-lock (TSL) is an uninterruptible atomic instruction that can be used to implement a busy-waiting lock (spin lock).

The test-and-set-lock instruction performs the following steps

  • Check whether the current value equals the expected old value
  • If they are equal, set the new value and return the old value (success)
  • If they are not equal, do nothing and return the current value directly (failure)

These steps are treated as a single step and are atomic. Atomicity means that either all of them execute or none of them do, and there is no half-executed intermediate state
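The steps above can be modeled in Java with an atomic integer. This is a sketch under assumptions, not the article's pseudo code: it maps testAndSetLock onto AtomicInteger.compareAndExchange (available since Java 9), which atomically compares, conditionally stores, and returns the value it observed. The class name TestAndSetDemo and the parameter names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the check-and-set steps using an atomic integer.
class TestAndSetDemo {
    // Atomically: if the cell holds oldValue, store newValue.
    // Returns the value observed before the operation in both cases,
    // so a return value equal to oldValue means success.
    static int testAndSetLock(AtomicInteger cell, int oldValue, int newValue) {
        return cell.compareAndExchange(oldValue, newValue); // Java 9+
    }

    public static void main(String[] args) {
        AtomicInteger cell = new AtomicInteger(0);
        System.out.println(testAndSetLock(cell, 0, 1)); // success: cell was 0
        System.out.println(testAndSetLock(cell, 0, 1)); // failure: cell is already 1
    }
}
```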

A busy-waiting lock (spin lock) can be implemented with testAndSetLock. There are two scenarios

  • Single thread: suppose a thread accesses the critical section and executes the getLock method. The check of the old value 0 passes, the value is updated from 0 to the new value 1, and the original value 0 is returned, so the lock is acquired. When leaving the critical section, it executes the unLock method. The check of the old value 1 passes, the value is updated from 1 to the new value 0, and the lock is released.
  • Multithreading: suppose there are two threads. Thread A accesses the critical section and executes the getLock method. The check of the old value 0 passes, the value is updated from 0 to the new value 1, and the original value 0 is returned, so the lock is acquired. Thread B then executes the getLock method, the check of the old value fails, acquiring the lock fails, and it keeps looping until the update succeeds. When thread A leaves the critical section, it executes the unLock method. The check of the old value 1 passes, the value is updated from 1 to the new value 0, and the lock is released, after which thread B acquires the lock.

When the lock cannot be acquired, the thread stays in the while loop and does nothing else, which is why it is called a busy-waiting lock, also known as a spin lock.

This is the simplest kind of lock: it spins, burning CPU cycles, until the lock becomes available. On a single processor, a preemptive scheduler is required (one that uses the clock to interrupt a running thread periodically so that other threads can run). Otherwise, a spin lock cannot be used on a single CPU, because a spinning thread never gives up the CPU.
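A spin lock along these lines might look as follows in Java. This is a hedged sketch rather than the article's original pseudo code: getLock and unLock follow the method names used in the text, while SpinLock, SpinLockDemo, and the use of AtomicInteger.compareAndExchange and Thread.onSpinWait (both Java 9+) are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical spin lock: 0 means unlocked, 1 means locked.
class SpinLock {
    private final AtomicInteger state = new AtomicInteger(0);

    void getLock() {
        // Keep trying to change 0 -> 1; a returned 0 means we took the lock.
        while (state.compareAndExchange(0, 1) != 0) {
            Thread.onSpinWait(); // hint to the runtime that we are busy-waiting
        }
    }

    void unLock() {
        state.compareAndExchange(1, 0); // change 1 -> 0 to release
    }
}

// Hypothetical demo: two threads increment a counter guarded by the spin lock.
class SpinLockDemo {
    static int i = 0;

    static int run() {
        i = 0;
        SpinLock lock = new SpinLock();
        Runnable task = () -> {
            for (int n = 0; n < 1000; n++) {
                lock.getLock();
                i++; // critical section
                lock.unLock();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println("result:" + run());
    }
}
```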

Non-busy-waiting lock

As the name suggests, a non-busy-waiting lock does not spin actively but waits passively to be woken up. When a thread fails to get the lock, it is added to a waiting queue and the CPU is given to other threads. When another thread releases the lock, it wakes up a thread from the waiting queue.


Both kinds of lock are implemented on top of test-and-set-lock (TSL). The description above is only a simplified sketch. Real operating system implementations are more complex, but the basic idea and general flow are the same.
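One way to sketch a non-busy-waiting lock in Java is to combine an atomic state flag with a waiting queue and LockSupport.park/unpark. This is an illustrative user-level sketch, not how an operating system kernel does it, and it ignores thread interruption. BlockingLock and BlockingLockDemo are hypothetical names, and getLock/unLock simply reuse the method names from the spin lock discussion.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

// Hypothetical lock that parks waiting threads instead of spinning.
class BlockingLock {
    private final AtomicInteger state = new AtomicInteger(0); // 0 = unlocked, 1 = locked
    private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

    void getLock() {
        Thread me = Thread.currentThread();
        waiters.add(me); // join the waiting queue
        // Only the thread at the head of the queue may try to take the lock.
        while (waiters.peek() != me || !state.compareAndSet(0, 1)) {
            LockSupport.park(this); // give up the CPU until unparked
        }
        waiters.remove(); // we hold the lock now, so leave the queue
    }

    void unLock() {
        state.set(0);
        LockSupport.unpark(waiters.peek()); // wake the next waiter (no effect if null)
    }
}

// Hypothetical demo: four threads increment a counter guarded by the lock.
class BlockingLockDemo {
    static int i = 0;

    static int run() {
        i = 0;
        BlockingLock lock = new BlockingLock();
        Thread[] threads = new Thread[4];
        for (int k = 0; k < threads.length; k++) {
            threads[k] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    lock.getLock();
                    i++; // critical section
                    lock.unLock();
                }
            });
            threads[k].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println("result:" + run());
    }
}
```

The park call sits inside a loop because a parked thread may wake up without the lock being free, so the condition has to be rechecked each time.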

Semaphore

In an operating system, coordination between threads/processes is achieved with semaphores. A semaphore usually represents the number of available resources. It corresponds to an integer variable (sen) and two system calls with atomic semantics that control the resource count.

  • P operation: decrements sen by 1. If sen < 0 after the subtraction, the process/thread blocks and waits, otherwise it continues. The P operation may block
  • V operation: increments sen by 1. If sen <= 0 after the addition, a waiting process/thread is woken up. The V operation never blocks

P and V operations must occur in pairs, but there is no ordering requirement, that is, P may come before V or V before P.

For example, with the recent COVID-19 outbreak, everyone is getting vaccinated for their own safety. Because the doctor has only two stations (equivalent to a semaphore with 2 resources), only two people can be vaccinated at the same time.


  • Semaphore equal to 0: no resources are available
  • Semaphore less than 0: some threads are blocked and waiting
  • Semaphore greater than 0: resources are available

The P and V operations are managed and implemented by the operating system, so they are atomic.
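A user-level sketch of such a semaphore in Java, following the P and V rules given earlier, could look like this. It is an assumption-laden illustration rather than an operating system implementation: PvSemaphore, VaccinationDemo, and the wakeups counter are hypothetical, and atomicity is modeled with synchronized methods. The demo mirrors the vaccination example with 2 stations.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical semaphore whose counter may go negative, as in the P/V description.
class PvSemaphore {
    private int sen;         // resource count; a negative value counts blocked threads
    private int wakeups = 0; // pending wake-ups, guards against spurious wake-ups

    PvSemaphore(int initial) {
        sen = initial;
    }

    synchronized void p() {
        sen--;
        if (sen < 0) { // no resource left: block until a V operation wakes us
            boolean interrupted = false;
            while (wakeups == 0) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting, restore the flag afterwards
                }
            }
            wakeups--;
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }

    synchronized void v() {
        sen++;
        if (sen <= 0) { // someone is blocked: wake one waiter
            wakeups++;
            notify();
        }
    }
}

// Hypothetical demo: several people share 2 vaccination stations.
class VaccinationDemo {
    // Returns the largest number of people vaccinated at the same time.
    static int run(int people) {
        PvSemaphore stations = new PvSemaphore(2);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxInside = new AtomicInteger();
        Thread[] threads = new Thread[people];
        for (int k = 0; k < people; k++) {
            threads[k] = new Thread(() -> {
                stations.p(); // take a station, or wait for one
                maxInside.accumulateAndGet(inside.incrementAndGet(), Math::max);
                inside.decrementAndGet();
                stations.v(); // free the station
            });
            threads[k].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return maxInside.get();
    }

    public static void main(String[] args) {
        System.out.println("max at once: " + run(6));
    }
}
```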

Practice

Semaphores are quite interesting. Here are some exercises to deepen our understanding of them

  • Mutual exclusion with a semaphore
  • Event synchronization with a semaphore
  • Producer and consumer with semaphores

Mutual exclusion

Mutual exclusion with a semaphore is very simple. Initialize the semaphore to 1. A thread performs the P operation when entering the critical section and the V operation when leaving it.
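A minimal Java sketch of this idea, assuming java.util.concurrent.Semaphore as the semaphore: its acquire plays the role of P and its release the role of V, although unlike the sen counter described earlier its permit count never goes negative. The class name SemaphoreMutexDemo is hypothetical, and mutexSemaphore borrows a name used later in the article.

```java
import java.util.concurrent.Semaphore;

// Hypothetical demo: a semaphore initialized to 1 used as a mutex.
class SemaphoreMutexDemo {
    static int i = 0;

    static int run() {
        i = 0;
        Semaphore mutexSemaphore = new Semaphore(1); // one resource
        Runnable task = () -> {
            for (int n = 0; n < 1000; n++) {
                mutexSemaphore.acquireUninterruptibly(); // P: enter the critical section
                i++;                                     // critical section
                mutexSemaphore.release();                // V: leave the critical section
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println("result:" + run());
    }
}
```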


Event synchronization

Take the R&D and quality control threads mentioned above as an example of event synchronization.

First, abstract two semaphores: "can test" (this.rDSemaphore) and "can fix bugs" (this.qualitySemaphore). By default both are "no", that is, 0. The key point is the P and V operations on the two semaphores

  • The quality control thread asks whether the R&D thread has finished development by executing the P operation p(this.rDSemaphore)

    • If development is not finished, this.rDSemaphore is decremented by 1 and becomes -1. The quality control thread blocks and waits to be woken up (by a later V operation from the R&D thread)
    • If development is finished, the R&D thread has already executed the V operation v(this.rDSemaphore), so this.rDSemaphore was incremented by 1 to 1. The quality control thread's P operation then decrements this.rDSemaphore by 1 to 0, and it proceeds with testing
  • The R&D thread asks whether the quality control thread has bugs ready to fix by executing the P operation p(this.qualitySemaphore)

    • If the bugs cannot be fixed yet, this.qualitySemaphore is decremented by 1 and becomes -1. The R&D thread blocks and waits to be woken up (by a later V operation from the quality control thread)
    • If the bugs can be fixed, the quality control thread has already executed the V operation v(this.qualitySemaphore) to submit the bugs, so this.qualitySemaphore was incremented by 1 to 1. The R&D thread's P operation then decrements this.qualitySemaphore by 1 to 0, and it proceeds to fix the bugs
  • Workflow

    • The quality control thread executes the P operation p(this.rDSemaphore) to ask whether it can test. this.rDSemaphore is decremented by 1 to -1, so it cannot test yet. The quality control thread blocks and waits to be woken up
    • The R&D thread runs and, having finished the feature, executes the V operation v(this.rDSemaphore). this.rDSemaphore is incremented by 1 to 0, which notifies the quality control thread to test
    • The R&D thread goes on to execute the P operation p(this.qualitySemaphore) to ask whether it can fix bugs. this.qualitySemaphore is decremented by 1 to -1, so it cannot fix bugs yet. The R&D thread blocks and waits to be woken up
    • After the quality control thread wakes up, it runs the tests. When testing is done, it executes the V operation v(this.qualitySemaphore) to submit the bugs. this.qualitySemaphore is incremented by 1 to 0, which notifies the R&D thread to fix them
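The workflow above can be sketched in Java as follows. This is an illustration under assumptions, not the article's original pseudo code: it uses java.util.concurrent.Semaphore (acquire for P, release for V, with a permit count that never goes negative), keeps the names rDSemaphore and qualitySemaphore from the text, and introduces the hypothetical class EventSyncDemo and a log list to record the order of events.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Semaphore;

// Hypothetical demo of event synchronization between R&D and quality control.
class EventSyncDemo {
    static List<String> run() {
        Semaphore rDSemaphore = new Semaphore(0);      // "can test", initially no
        Semaphore qualitySemaphore = new Semaphore(0); // "can fix bugs", initially no
        List<String> log = new CopyOnWriteArrayList<>();

        Thread quality = new Thread(() -> {
            rDSemaphore.acquireUninterruptibly();      // P: wait until development is done
            log.add("quality control: test and submit bugs");
            qualitySemaphore.release();                // V: notify R&D to fix bugs
        });
        Thread rd = new Thread(() -> {
            log.add("R&D: finish development");
            rDSemaphore.release();                     // V: notify quality control to test
            qualitySemaphore.acquireUninterruptibly(); // P: wait until bugs are submitted
            log.add("R&D: fix bugs");
        });

        quality.start();
        rd.start();
        try {
            quality.join();
            rd.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Whichever thread the scheduler starts first, the two semaphores force the order develop, test, fix.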

Producer and consumer

Producer and consumer is a classic thread synchronization problem. Let's first analyze the roles

  • Producer: produces events and puts them into the buffer
  • Consumer: consumes events from the buffer
  • Buffer: the container that holds events


Analyzing the problem leads to the following conclusions

  • Only one thread can operate on the buffer at any time, which means that operating on the buffer is critical-section code and needs mutual exclusion
  • When the buffer is empty, the consumer must wait for the producer to produce data
  • When the buffer is full, the producer must wait for the consumer to take data out

From this analysis we can abstract three semaphores

  • Mutex semaphore: mutually exclusive access to the buffer, initialized to 1
  • Consumer semaphore: the number of events in the buffer, initialized to 0 (no events)
  • Producer semaphore: the number of empty slots in the buffer, initialized to N (the buffer size)

The key PV operations are as follows

  • The production thread executes the P operation p(this.produceSemaphore), and the number of empty slots in the buffer decreases by 1. A result < 0 means there is no empty slot, so the thread blocks and waits for the consumption thread to wake it up. Otherwise it continues with the subsequent logic
  • Both the production thread and the consumption thread wrap their buffer operations in the critical-section P and V operations p(this.mutexSemaphore) and v(this.mutexSemaphore), which need no further explanation here
  • Before consuming an event from the buffer, the consumption thread executes the P operation p(this.consumeSemaphore), and the number of events in the buffer decreases by 1. A result < 0 means there is no event to consume, so the thread blocks and waits for the production thread to wake it up. Otherwise it continues with the subsequent logic
  • After loading or consuming, each thread executes the V operation to wake up its counterpart: the production thread adds 1 to the number of buffer events, and the consumption thread adds 1 to the number of empty buffer slots
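Put together, the three semaphores might be used as in the following Java sketch. It is an illustration under assumptions rather than the article's original pseudo code: it relies on java.util.concurrent.Semaphore (acquire for P, release for V, with a permit count that never goes negative), keeps the names mutexSemaphore, consumeSemaphore and produceSemaphore from the text, and picks a hypothetical class name ProducerConsumerDemo and buffer size N = 4.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

// Hypothetical demo: one producer and one consumer sharing a bounded buffer.
class ProducerConsumerDemo {
    static final int N = 4; // buffer size, chosen arbitrarily for the sketch

    // Produces the events 1..events and returns the sum seen by the consumer.
    static long run(int events) {
        Semaphore mutexSemaphore = new Semaphore(1);   // mutual exclusion on the buffer
        Semaphore consumeSemaphore = new Semaphore(0); // number of events in the buffer
        Semaphore produceSemaphore = new Semaphore(N); // number of empty slots
        Queue<Integer> buffer = new ArrayDeque<>();
        long[] sum = {0};

        Thread producer = new Thread(() -> {
            for (int e = 1; e <= events; e++) {
                produceSemaphore.acquireUninterruptibly(); // P: wait for an empty slot
                mutexSemaphore.acquireUninterruptibly();   // P: enter the critical section
                buffer.add(e);
                mutexSemaphore.release();                  // V: leave the critical section
                consumeSemaphore.release();                // V: one more event available
            }
        });
        Thread consumer = new Thread(() -> {
            for (int k = 0; k < events; k++) {
                consumeSemaphore.acquireUninterruptibly(); // P: wait for an event
                mutexSemaphore.acquireUninterruptibly();   // P: enter the critical section
                sum[0] += buffer.remove();
                mutexSemaphore.release();                  // V: leave the critical section
                produceSemaphore.release();                // V: one more empty slot
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println("sum of consumed events: " + run(100));
    }
}
```

The order of the two P operations matters: taking the mutex before waiting for a slot or an event could leave a thread blocked while holding the mutex, which would stall its counterpart.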

About me

Official account: Program ape star. It focuses on technical principles and source code, explained through diagrams, and shares original articles on operating systems, computer networks, Java, distributed systems, databases, and more. Looking forward to your follow.