Original address: time.geekbang.org/column/article/1…

Thread implementation model

Before we look at the differences between coroutines and threads, it helps to understand the main ways threads are implemented at the OS level; this lays the groundwork for what follows.

There are three thread implementation models: the 1:1 model, which maps each lightweight process one-to-one onto a kernel thread; the N:1 model, which maps many user threads onto a single kernel thread; and the N:M model, a hybrid that maps N user threads onto M lightweight processes.

1:1 thread model

A kernel-level thread (KLT) is a thread supported directly by the operating system kernel: the kernel's scheduler schedules threads and is responsible for thread switching.

In Linux systems programming, we traditionally create a child process with the fork() function to get a concurrent unit of execution in the kernel. After a process calls fork(), the system first allocates resources for the new process, such as space for data and code; it then copies all of the parent's values into the new process, with only a few values (such as the PID) differing from the parent. This is effectively a full copy of the parent process.

Creating child processes with fork() to run work in parallel therefore produces a lot of redundant data: it occupies extra memory and burns CPU time initializing that memory and copying the data.

If the data is the same anyway, why not share the parent process's data? This is where the lightweight process (Light Weight Process, LWP) comes in.

Unlike a thread created via the fork() system call, an LWP is created with the clone() system call. clone() copies only selected parts of the parent's resource data structures; which parts to copy is configurable, and resources that are not copied are shared with the child through pointers. As a result, a lightweight process is a smaller unit of execution and is faster to create. LWPs map one-to-one onto kernel threads: each LWP is backed by a kernel thread.

N:1 thread model

Because the 1:1 thread model maps one-to-one onto the kernel, creating and switching threads involves switches between user mode and kernel mode, which carries significant performance overhead. It also has a scalability limit: system resources are finite, so a process cannot create a very large number of LWPs.

The N:1 thread model solves both of these problems of the 1:1 model.

In this model, thread creation, synchronization, destruction, and scheduling are all done in user space; the kernel is not involved at all. Because creating, synchronizing, and destroying threads requires no switch between user mode and kernel mode, thread operations are very fast and cheap.

N:M thread model

The disadvantage of the N:1 thread model is that the operating system cannot see user-mode threads, so when one thread blocks in a kernel call, the entire process blocks with it.

The N:M thread model is a hybrid of the two models above: user threads are connected to kernel threads through LWPs, with an N:M mapping between user-mode threads and kernel-mode LWPs.

Go coroutines and Java threads

In JDK 1.8's Thread.java, Thread#start is implemented by calling the native method start0. On Linux, JVM threads are built on pthread_create, and pthread_create in turn calls clone() to make the system call that creates the thread.

Therefore, Java on Linux currently uses a user thread plus a lightweight process, with one user thread mapped to one kernel thread — the 1:1 thread model. Because threads are scheduled by the kernel, switching from one thread to another involves a kernel context switch.

The Go language uses the N:M thread model to implement its own scheduler: it multiplexes (schedules) M coroutines onto N kernel threads. Coroutine context switches are performed by the coroutine scheduler in user mode, so they do not trap into the kernel, and by comparison the cost is very small.

How coroutines are implemented

Coroutines are not unique to Go; most languages implement them, including C#, Erlang, Python, Lua, JavaScript, and Ruby.

You are probably more familiar with processes and threads than with coroutines. A process generally represents an application service, and multiple threads can be created within that service. A coroutine, however, is a different kind of concept from a process or a thread: we can think of a coroutine as a function-like block of code, and we can easily create many coroutines within a single main thread.
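In Go this is quite literal: a goroutine is an ordinary function whose call is prefixed with the go keyword. A minimal sketch (the name sumSquares is illustrative) creating several coroutines from one main goroutine:

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares launches one goroutine per input value; each goroutine is
// just a function call prefixed with `go`, running concurrently.
func sumSquares(nums []int) int {
	results := make(chan int, len(nums))
	var wg sync.WaitGroup
	for _, n := range nums {
		wg.Add(1)
		go func(v int) { // a coroutine: function-like code scheduled in user mode
			defer wg.Done()
			results <- v * v
		}(n)
	}
	wg.Wait()
	close(results)
	sum := 0
	for r := range results {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println("sum of squares:", sumSquares([]int{1, 2, 3, 4})) // prints 30
}
```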

The difference between calling a coroutine and calling a function is that a coroutine can suspend its own execution (by yielding or blocking) while other coroutines continue to run. The suspension happens purely within the program (in user mode) and transfers execution to another coroutine; once the coroutine holding execution finishes, the suspended coroutine is awakened and resumes from its suspension point. Suspending and waking coroutines is the job of a scheduler.
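In Go, this suspend-and-hand-off behavior can be observed with an unbuffered channel: a goroutine receiving on an empty channel is suspended in user mode, other goroutines run meanwhile, and a send wakes it at exactly its suspension point. A small sketch (the name relay is illustrative):

```go
package main

import "fmt"

// relay suspends on `in` until another goroutine wakes it with a value,
// then passes the incremented value on. The suspension parks only this
// goroutine, not the OS thread underneath it.
func relay(in <-chan int, out chan<- int) {
	v := <-in // suspend here; execution transfers to other goroutines
	out <- v + 1
}

func main() {
	a := make(chan int)
	b := make(chan int)
	go relay(a, b)   // starts, then immediately suspends on <-a
	a <- 41          // wakes the suspended goroutine at its suspension point
	fmt.Println(<-b) // prints 42
}
```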

The figure below shows more clearly how coroutines based on the N:M thread model work.

Suppose the program creates two threads by default for coroutines to use, and coroutines A, B, C, and D are created in the main thread and placed in the ready queue. The scheduler first assigns one worker thread to run coroutine A and the other worker thread to run coroutine B; the remaining coroutines wait in the queue.


When coroutine A calls a suspend method or blocks, it enters the suspend queue, and the scheduler picks other coroutines from the ready queue to take over its thread. When coroutine A is later awakened, it re-enters the ready queue and competes for a thread through the scheduler: if it wins a thread, it resumes execution; if not, it keeps waiting.
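A rough Go analogue of this scenario, under stated assumptions: GOMAXPROCS(2) stands in for the two worker threads, four goroutines stand in for coroutines A–D, and the channel receive plays the role of the suspend queue. The function name runScenario is illustrative.

```go
package main

import (
	"fmt"
	"runtime"
)

// runScenario returns the completion order of four "coroutines" running
// on two kernel threads. "A" suspends first and is woken afterwards, so
// B, C, and D can run on its thread while it waits.
func runScenario() []string {
	runtime.GOMAXPROCS(2) // two worker threads, as in the scenario above

	wake := make(chan struct{})
	done := make(chan string, 4)

	go func() { // coroutine A: suspends, yielding its thread
		<-wake // enters the suspend state until woken
		done <- "A"
	}()
	for _, name := range []string{"B", "C", "D"} {
		go func(n string) { done <- n }(name) // run to completion
	}
	close(wake) // wake A; it re-enters the ready queue and resumes

	order := make([]string, 0, 4)
	for i := 0; i < 4; i++ {
		order = append(order, <-done)
	}
	return order
}

func main() {
	fmt.Println("completion order:", runScenario())
}
```

The exact completion order varies from run to run, which is the point: the scheduler, not the program, decides which ready coroutine gets a thread next.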


Compared with threads, coroutines reduce the CPU context switching caused by contention for shared resources. They are especially well suited to I/O-intensive applications, particularly network services that spend much of their time waiting for back-end responses: coroutines keep threads from blocking while waiting on the network, making full use of multi-core, multi-threaded capacity. For CPU-intensive applications, where the CPU is busy most of the time, the advantage of coroutines is less pronounced.


A coroutine is closely tied to a thread: it can be regarded as a block of code running on a thread. The suspend operation a coroutine provides pauses the coroutine's execution without blocking the underlying thread.

A coroutine is also a lightweight resource. Even creating thousands of coroutines places little burden on the system, whereas creating thousands of threads would put it under serious pressure. In this sense, the coroutine design greatly improves thread utilization.
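This claim is easy to check in Go, where each goroutine starts with a small growable stack (a few kilobytes): spawning tens of thousands is routine. A minimal sketch (the name spawn is illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// spawn creates n goroutines and waits for all of them, returning the
// number that ran. n in the tens of thousands is unremarkable in Go;
// the equivalent number of OS threads would exhaust most systems.
func spawn(n int) int {
	var wg sync.WaitGroup
	count := make(chan struct{}, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			count <- struct{}{}
		}()
	}
	wg.Wait()
	return len(count)
}

func main() {
	fmt.Println("goroutines completed:", spawn(100000))
}
```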

This work adopts the CC license. Reprints must credit the author and link to this article.