Song Baohua: Letting real-time and high-performance tasks monopolize a CPU on Linux

Date: 2021-4-30

This article discusses how to give a thread exclusive use of a CPU in scenarios with strict real-time requirements, high-performance computing and DPDK; the principles of thread and interrupt isolation that exclusive use involves; and how to keep the system timer tick from interrupting the exclusive task, so as to achieve the lowest latency jitter.

It takes about 20 minutes to read this article.

Contents of this article:

  1. Engineering requirements
  2. User mode isolation
  3. Kernel-mode isolation
    3.1 Interrupts
    3.2 Kernel threads
  4. Best practice guide

1. Engineering requirements

An SMP or NUMA system has more than one CPU. In engineering practice we sometimes need a task to monopolize a CPU, so that the CPU runs only the specified task and nothing else, in order to obtain low latency and strong real-time behavior.

For example, DPDK sets

GRUB_CMDLINE_LINUX_DEFAULT="isolcpus=0-3,5,7"

to isolate CPUs 0 to 3, 5 and 7, so that while DPDK tasks are running, no other task forces a context switch on them, which ensures the best network performance [1]. In a real-time application scenario, CPU2 is isolated with isolcpus=2, and the real-time application is then bound to the isolated core with taskset

taskset -c 2 pn_dev

to meet its low-latency requirement [2].
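CPU lists such as the one passed to isolcpus use the kernel's range syntax, in which 0-3 denotes four CPUs. As a minimal sketch, the hypothetical helper expand_cpulist below (it is not part of any tool mentioned here) expands such a list into individual CPU numbers, which makes it easy to see exactly which cores a parameter covers.

```shell
#!/bin/sh
# Expand a kernel CPU list such as "0-3,5,7" into individual CPU numbers.
# expand_cpulist is a hypothetical helper name, not an existing tool.
expand_cpulist() (
    IFS=,                          # split on commas (body runs in a subshell)
    out=""
    for part in $1; do
        first=${part%-*}           # "0-3" -> 0, "5" -> 5
        last=${part#*-}            # "0-3" -> 3, "5" -> 5
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            out="$out $cpu"
            cpu=$((cpu + 1))
        done
    done
    echo "${out# }"
)

expand_cpulist "0-3,5,7"           # prints: 0 1 2 3 5 7
```

On a running system the kernel reports the isolated set in /sys/devices/system/cpu/isolated, so the same helper can be applied to the contents of that file, assuming it is present and non-empty.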

2. User mode isolation

Both examples above use isolcpus as a boot parameter.

Practice is the sole criterion for testing truth. Let’s boot an 8-core arm64 system running Ubuntu, with isolcpus=2 specified

image

After the system boots, we run the following simple program, which starts 8 processes that each spin in a while loop

image
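A rough stand-in for such a program, assuming a POSIX shell instead of the compiled a.out used in this article, is to start one busy loop per process. The names start_busy_loops, stop_busy_loops and BUSY_PIDS are hypothetical and chosen only for this sketch.

```shell
#!/bin/sh
# Stand-in load generator: start N CPU-bound background loops.
# BUSY_PIDS (hypothetical name) remembers the PIDs so they can be stopped.
BUSY_PIDS=""

start_busy_loops() {
    count=$1
    BUSY_PIDS=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # One spinning child per iteration; its output is discarded.
        ( while :; do :; done ) >/dev/null 2>&1 &
        BUSY_PIDS="$BUSY_PIDS $!"
        i=$((i + 1))
    done
}

stop_busy_loops() {
    # Word splitting of $BUSY_PIDS is intentional: one argument per PID.
    kill $BUSY_PIDS 2>/dev/null || true
    wait $BUSY_PIDS 2>/dev/null || true
    BUSY_PIDS=""
}

# Usage: start_busy_loops 8, watch htop, then stop_busy_loops
```

Because the loops are ordinary user-space processes with default affinity, the scheduler places them the same way it would place eight copies of a spinning C program.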

We have 8 cores and are now running 8 processes, so in theory load balancing should spread the 8 processes evenly over the 8 cores. But let’s look at the actual htop output

image

We find that the utilization of CPU2 is 0.0%. This shows that CPU2 has been isolated and that user-space processes are not scheduled onto it.

Of course, we can still forcibly bind one of the a.out processes to CPU2 with taskset

image

The output of the command above shows that the original affinity list of process 663 contained only 0,1,3-7 and not 2, but we forced it to 2. Looking at htop again, CPU2 is now 100% busy:

image

This experiment shows clearly that isolcpus=2 keeps user-space processes off CPU2 (unless their affinity is set manually).
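The affinity change in this experiment can be sketched as a small script. It assumes taskset from util-linux is installed and reads the allowed CPU list from the Cpus_allowed_list field of /proc/PID/status; allowed_cpus and pin_to_cpu are hypothetical helper names.

```shell
#!/bin/sh
# Print the CPU list a task is allowed to run on, as reported by the kernel.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "/proc/$1/status"
}

# Pin task $1 to CPU $2 and show its affinity before and after.
# Assumes taskset (util-linux) is available and the caller has permission.
pin_to_cpu() {
    echo "before: $(allowed_cpus "$1")"
    taskset -cp "$2" "$1" >/dev/null
    echo "after:  $(allowed_cpus "$1")"
}

# Usage with the PID from the experiment above: pin_to_cpu 663 2
```

With isolcpus=2 in effect, the "before" line would be expected to omit CPU 2 and the "after" line to contain only 2, matching the taskset output described above.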

3. Kernel-mode isolation

3.1 Interrupts

However, user-mode tasks are not the only things that can run on CPU2; kernel threads and interrupts can too. Does isolcpus= also isolate kernel threads and interrupts?

For interrupts this is particularly easy to check: just look at the smp_affinity of each IRQ

image

As the figure above shows, the Linux kernel has set the smp_affinity of peripheral interrupts such as IRQ 44 and IRQ 47 to fb (binary 11111011), which clearly avoids CPU2. Peripheral interrupts therefore do not occur on CPU2 unless we forcibly bind one to that core, for example by binding interrupt 44 to CPU2

echo 2 >/proc/irq/44/smp_affinity_list

After that, we find that interrupt 44 can occur on CPU2

image
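The smp_affinity value is a hexadecimal bitmask in which bit n stands for CPU n. As a small sketch, the hypothetical helper mask_to_cpus decodes such a mask; it assumes a single group of hex digits, so the comma-separated masks used on machines with more than 32 CPUs are not handled.

```shell
#!/bin/sh
# Decode a hexadecimal smp_affinity mask into the CPU numbers it allows.
# mask_to_cpus is a hypothetical helper; it accepts one hex group only.
mask_to_cpus() (
    mask=$((0x$1))
    cpu=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out $cpu"        # bit set: this CPU may take the IRQ
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "${out# }"
)

mask_to_cpus fb                    # prints: 0 1 3 4 5 6 7
```

Bit 2 is the only clear bit in fb, which is why CPU2 is missing from the decoded list.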

However, the system timer interrupt and IPIs are cornerstones of Linux, so they still run on CPU2. Of these, the timer tick is the most likely source of latency jitter.

Next, we focus on the tick. Linux kernels are generally configured with NO_HZ for the idle state (tickless idle), so when nothing is running on CPU2, timer interrupts hardly occur there.

Let’s run the earlier 8-process a.out again with isolcpus=2. By default no task occupies CPU2. Running cat /proc/interrupts | head -2 several times shows that timer interrupts on the other cores fire frequently while the count for CPU2 barely changes. This is clearly the idle-state NO_HZ at work, which also helps save power

image
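Reading one CPU's column out of /proc/interrupts by eye is error-prone, so a small filter can help. The sketch below assumes the usual layout (a header row of CPU names, then one row per interrupt with the label first and one count per CPU) and assumes the timer row can be recognized by the text arch_timer, which is typical on arm64 but may differ on other systems. irq_count_on_cpu is a hypothetical helper name.

```shell
#!/bin/sh
# Read /proc/interrupts-style text on stdin and print the count that the
# first row matching pattern $1 shows for CPU number $2.
# Column 1 is the IRQ label, so CPU n is found in column n + 2.
irq_count_on_cpu() (
    awk -v pat="$1" -v col="$(($2 + 2))" '$0 ~ pat { print $col; exit }'
)

# Usage on a live system (arch_timer is an assumed row name):
#   irq_count_on_cpu arch_timer 2 < /proc/interrupts
```

Running the usage line a few seconds apart and comparing the two numbers gives the number of timer interrupts CPU2 took in that interval.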

However, once we put a task on CPU2, even just one, the timer interrupt count on CPU2 starts to increase

image

This shows that even with a single thread running on the isolated CPU, the timer tick starts running, and the tick then interrupts this thread frequently, causing a lot of context switching. You may well think Linux is being foolish here: with only one task there is nothing to time-slice, since time slicing exists to schedule two or more tasks, so why run the tick at all? The reason is that our kernel enables only NO_HZ_IDLE by default:

image

Let’s recompile the kernel with NO_HZ_FULL enabled:

image

With NO_HZ_FULL enabled, Linux can stop the tick on a CPU when only one task is running on it. With two tasks it no longer can, so this “full” is not truly full [3]. This is understandable, because two tasks bring time-slice scheduling back into play. As for when NO_HZ_FULL should be enabled, the kernel’s Documentation/timers/no_hz.rst gives clear guidance: it is needed only for real-time and HPC workloads; otherwise the default NO_HZ_IDLE is the best choice:

image

We recompiled the kernel with NO_HZ_FULL selected. Now boot Linux, taking care to add the boot parameter nohz_full=2 so that CPU2 supports NO_HZ_FULL:

image
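Both conditions, the build option and the boot parameter, can be checked from a shell. The following sketch uses two hypothetical filters that read text on stdin, so they work on any source of the kernel config or command line; where the config file lives (for example /boot/config-$(uname -r) or /proc/config.gz) depends on the distribution and is an assumption here.

```shell
#!/bin/sh
# Succeed if the kernel config text on stdin enables CONFIG_NO_HZ_FULL.
has_nohz_full() {
    grep -q '^CONFIG_NO_HZ_FULL=y'
}

# Print the value of the nohz_full= parameter found in a kernel command
# line read from stdin; prints nothing if the parameter is absent.
nohz_full_param() {
    tr ' ' '\n' | sed -n 's/^nohz_full=//p'
}

# Usage on a live system (the config path is an assumption):
#   has_nohz_full < "/boot/config-$(uname -r)" && echo "NO_HZ_FULL built in"
#   nohz_full_param < /proc/cmdline
```

On kernels built with NO_HZ_FULL, the file /sys/devices/system/cpu/nohz_full also reports the CPUs for which the option is active.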

Run a single task on CPU2 again and look at its timer interrupt count

image

The tick count on CPU2 stays at 188. This should please you even more, because the task now monopolizes the CPU more thoroughly!

Next, let’s put a second task on CPU2. With two tasks, the timer tick count on CPU2 starts to increase again

image

However, this is probably not a problem, because we set out to “monopolize” the CPU. When a single task has the CPU to itself, not being disturbed by the timer tick is the ideal situation!

3.2 Kernel threads

In fact, kernel threads behave much like user-mode tasks: unless they are bound to the isolated CPU, they do not run on it. The following experiment uses dma_map_benchmark, which the author added to the kernel [4], and starts 16 kernel threads doing DMA map and unmap (note that we have only 8 cores)

./dma_map_benchmark -s 120 -t 16

We can see that CPU usage on CPU2 is also 0:

image

The dma_map_benchmark threads in the kernel occupy CPUs 0-1 and 3-7, but not CPU2

image

However, if a kernel thread is bound to the isolated CPU with an API such as kthread_bind_mask(), the situation is different, just as when taskset binds a user-mode task to that CPU.
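To see which kernel threads actually sit on the isolated CPU, the output of ps can be filtered. This sketch assumes the procps ps with the pid, ppid, psr and comm columns, and treats children of PID 2 (kthreadd) as kernel threads; kthreads_on_cpu is a hypothetical helper name, and only the first word of the command name is printed.

```shell
#!/bin/sh
# Filter "ps -e -o pid,ppid,psr,comm" output read from stdin: keep tasks
# whose parent is PID 2 (kthreadd) and whose PSR column equals CPU $1.
kthreads_on_cpu() (
    awk -v cpu="$1" 'NR > 1 && $2 == 2 && $3 == cpu { print $1, $4 }'
)

# Usage on a live system:
#   ps -e -o pid,ppid,psr,comm | kthreads_on_cpu 2
```

Per-CPU threads that the kernel itself binds, such as ksoftirqd/2, are expected to show up for CPU2 even with isolcpus=2; anything beyond those is worth investigating.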

4. Best practice guide

For hard real-time and high-performance computing scenarios, if you want a task to monopolize a CPU, the best approach is:

  1. Use isolcpus to isolate the CPU

  2. Bind the specified task to the isolated CPU

  3. Take care not to bind interrupts or kernel threads to the isolated CPU by accident, and check for such “unexpected” elements

  4. Enable NO_HZ_FULL for an even better result, because then not even the timer tick interrupt disturbs the task.
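The check in step 3 can be partly automated for interrupts. As a hedged sketch, the script below walks /proc/irq and reports every IRQ whose smp_affinity_list still contains the isolated CPU; cpu_in_list and irqs_touching_cpu are hypothetical helper names, and the script only reads state, it changes nothing.

```shell
#!/bin/sh
# Succeed if CPU number $1 appears in kernel CPU list $2 (e.g. "0-1,3-7").
cpu_in_list() (
    IFS=,                          # split on commas (body runs in a subshell)
    for part in $2; do
        first=${part%-*}           # "3-7" -> 3, "5" -> 5
        last=${part#*-}            # "3-7" -> 7, "5" -> 5
        if [ "$1" -ge "$first" ] && [ "$1" -le "$last" ]; then
            exit 0
        fi
    done
    exit 1
)

# Print every IRQ whose affinity list includes CPU $1.
irqs_touching_cpu() {
    for dir in /proc/irq/[0-9]*; do
        [ -r "$dir/smp_affinity_list" ] || continue
        list=$(cat "$dir/smp_affinity_list" 2>/dev/null) || continue
        if cpu_in_list "$1" "$list"; then
            echo "IRQ ${dir##*/}: $list"
        fi
    done
}

# Usage: irqs_touching_cpu 2
```

An empty result for the isolated CPU suggests that no peripheral interrupt has been bound to it by accident; the timer interrupt and IPIs are not covered by this check.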