Deep understanding of Linux CGroup series (3): memory

Date: 2020-02-26

Original link: Deep understanding of Linux CGroup series (3): memory

In the previous article we learned how to view current CGroup information, how to configure CGroups dynamically through the /sys/fs/cgroup directory, and how to use CPU shares and CPU quota to control CPU time both within a slice and between different slices. This article focuses on memory, and uses concrete examples to demonstrate how to limit memory usage through CGroups.

1. Hunting for the lost memory

The previous article showed that the CPU controller provides two ways to limit CPU time: CPUShares sets a relative weight, while CPUQuota limits the percentage of CPU time available to a user, service, or VM. If a user has both set, say with CPUQuota at 50%, the user can consume CPU time according to the CPUShares weight until its usage reaches 50%.

For memory, on CentOS 7 systemd has already mounted the memory controller at /sys/fs/cgroup/memory. systemd provides only one parameter, MemoryLimit, which specifies the total amount of physical memory a user or service may use. Take the user tom from the previous article, whose uid is 1000; the limit can be set with the following command:

$ systemctl set-property user-1000.slice MemoryLimit=200M
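As a side note, systemd interprets the M suffix as MiB and writes the limit into the cgroup's memory.limit_in_bytes file as a plain byte count. The conversion is simple enough to check by hand:

```shell
# MemoryLimit=200M (where M means MiB in systemd) as a raw byte count:
limit_bytes=$(( 200 * 1024 * 1024 ))
echo "$limit_bytes"   # 209715200
```

On a live system you could verify this by reading memory.limit_in_bytes in the user's slice directory once tom is logged in.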

Now log in as tom and use the stress command to spawn 8 worker processes, each allocating 256M of memory:

$ stress --vm 8 --vm-bytes 256M

The memory requested by the stress processes clearly exceeds the limit, which should trigger the oom-killer, but the processes keep running. Why? Let's look at the memory currently in use:

$ cd /sys/fs/cgroup/memory/user.slice/user-1000.slice

$ cat memory.usage_in_bytes
209661952
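A quick arithmetic check (a sketch using the numbers above) confirms that the reading sits just below the 200M ceiling:

```shell
usage=209661952                 # memory.usage_in_bytes read above
limit=$(( 200 * 1024 * 1024 ))  # MemoryLimit=200M as a byte count
echo "headroom: $(( limit - usage )) bytes"   # headroom: 53248 bytes
```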

Strangely, the memory in use stays just under 200M, so where did the rest of it go? Don't panic: memory usage on Linux includes not only physical memory but also the swap partition, i.e. swap. Let's check whether swap is responsible. Stop the stress process, wait 30 seconds, and observe swap usage:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        180M        3.2G        8.9M        318M        3.3G
Swap:          3.9G        512K        3.9G

Rerun the stress process:

$ stress --vm 8 --vm-bytes 256M

Check the memory usage again:

$ cat memory.usage_in_bytes
209637376

The memory usage is again just within the 200M limit. Now look at swap usage:

$ free
              total        used        free      shared  buff/cache   available
Mem:        3880876      407464     3145260        9164      328152     3220164
Swap:       4063228     2031360     2031868

Swap usage has grown by 2031360 − 512 = 2030848 KiB compared to just now. At this point it is fairly certain that when a process's usage reaches the limit, the kernel tries to move data from physical memory into swap so that the allocation can still succeed. We can calculate exactly how much physical memory plus swap the user tom is consuming. First, check the physical memory and swap used respectively:

$ egrep "swap|rss" memory.stat
rss 209637376
rss_huge 0
swap 1938804736
total_rss 209637376
total_rss_huge 0
total_swap 1938804736

You can see that physical memory usage is 209637376 bytes and swap usage is 1938804736 bytes, for a total of (209637376 + 1938804736)/1024/1024 ≈ 2048M. The total memory requested by the stress processes is 256 × 8 = 2048M. The two match.
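The bookkeeping above can be double-checked with a little shell arithmetic (all figures are taken from the outputs shown earlier):

```shell
rss=209637376          # total_rss from memory.stat
swap=1938804736        # total_swap from memory.stat
echo "total: $(( (rss + swap) / 1024 / 1024 ))M"   # total: 2048M

requested=$(( 256 * 8 ))                           # stress: 8 workers x 256M
echo "requested: ${requested}M"                    # requested: 2048M

swap_delta_kib=$(( 2031360 - 512 ))                # free output: after vs. before
echo "swap growth: ${swap_delta_kib} KiB"          # swap growth: 2030848 KiB
```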

If you check the memory.failcnt file every few seconds at this point, you will find that its value keeps growing:

$ cat memory.failcnt
59390293

As the results above show, every time a physical-memory allocation fails against the limit, the counter in memory.failcnt is incremented by 1, but the process is not necessarily killed; the kernel first tries to move data from physical memory into swap.

2. Disable swap

In order to better observe CGroup's memory control, we can prevent the user tom from using swap. There are two ways to do this:

  1. Change the value of the memory.swappiness file to 0:

    $ echo 0 > /sys/fs/cgroup/memory/user.slice/user-1000.slice/memory.swappiness

    After this setting, the current CGroup will not use swap even if swap is enabled system-wide.

  2. Turn off the system's swap entirely:

    $ swapoff -a

    To make this permanent, comment out the swap entry in the /etc/fstab file.
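For illustration, a disabled swap entry in /etc/fstab would look something like this (the device path here is hypothetical; yours may differ):

```
# /dev/mapper/centos-swap swap swap defaults 0 0
```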

If you don't want to turn off swap system-wide, but still want tom not to use swap, the first method above has problems:

  • You can only modify the memory.swappiness file while the user tom is logged in, because the cgroup disappears once tom has no active session.
  • Even if you modify memory.swappiness, the change is lost the next time tom logs in.

Solving this problem by conventional means would be difficult. Instead, we can take another route and start with PAM.

Linux PAM (Pluggable Authentication Modules) is a system-level user authentication framework. PAM decouples programs from authentication schemes: at runtime, a program calls pluggable "authentication" modules to do the work. The local system administrator chooses which modules to use via configuration; the /etc/pam.d/ directory stores PAM configuration, with an independent policy file for each application. For example, when a user logs in via SSH, the policy in /etc/pam.d/sshd applies.

Starting from /etc/pam.d/sshd, first create a shell script:

$ cat /usr/local/bin/tom-noswap.sh
#!/bin/bash

if [ "$PAM_USER" = "tom" ]
  then
    echo 0 > /sys/fs/cgroup/memory/user.slice/user-1000.slice/memory.swappiness
fi

Then call the script via pam_exec by adding the following line at the end of /etc/pam.d/sshd:

session optional pam_exec.so seteuid /usr/local/bin/tom-noswap.sh

Now log in via SSH as tom, and you will find that the value of memory.swappiness has become 0.

Note the prerequisite here: the /sys/fs/cgroup/memory/user.slice/user-1000.slice directory only exists when at least one tom session is active and the limit has been set via the systemctl set-property user-1000.slice MemoryLimit=200M command. Therefore, all of the operations above require keeping at least one tom login session open.

3. Control memory usage

With swap disabled, we can strictly control the memory usage of the process. Using the example from the beginning, log in as tom and first run the following command in a first shell window:

$ journalctl -f

Open a second shell window (also logged in as tom) and use the stress command to spawn 8 worker processes, each allocating 256M of memory:

$ stress --vm 8 --vm-bytes 256M
stress: info: [30150] dispatching hogs: 0 cpu, 0 io, 8 vm, 0 hdd
stress: FAIL: [30150] (415) <-- worker 30152 got signal 9
stress: WARN: [30150] (417) stress: FAIL: [30150] (415) <-- worker 30151 got signal 9
stress: WARN: [30150] (417) now reaping child worker processes
stress: FAIL: [30150] (415) <-- worker 30154 got signal 9
stress: WARN: [30150] (417) now reaping child worker processes
stress: FAIL: [30150] (415) <-- worker 30157 got signal 9
stress: WARN: [30150] (417) now reaping child worker processes
stress: FAIL: [30150] (415) <-- worker 30158 got signal 9
stress: WARN: [30150] (417) now reaping child worker processes
stress: FAIL: [30150] (451) failed run completed in 0s

This time the stress processes are killed almost immediately. Switching back to the first shell window, journalctl prints output like the following:

(Screenshot: journalctl output showing the oom-killer terminating the stress worker processes)

This shows that the CGroup memory limit works: the stress processes exceeded their memory limit, which triggered the oom-killer and killed them.

4. More documentation

A quick aside: if you want more documentation on CGroups, you can install the kernel-doc package via yum. After installation, you can browse the detailed documentation for each CGroup controller under /usr/share/doc.

$ cd /usr/share/doc/kernel-doc-3.10.0/Documentation/cgroups
$ ll
total 172
 4 -r--r--r-- 1 root root   918 Jun 14 02:29 00-INDEX
16 -r--r--r-- 1 root root 16355 Jun 14 02:29 blkio-controller.txt
28 -r--r--r-- 1 root root 27027 Jun 14 02:29 cgroups.txt
 4 -r--r--r-- 1 root root  1972 Jun 14 02:29 cpuacct.txt
40 -r--r--r-- 1 root root 37225 Jun 14 02:29 cpusets.txt
 8 -r--r--r-- 1 root root  4370 Jun 14 02:29 devices.txt
 8 -r--r--r-- 1 root root  4908 Jun 14 02:29 freezer-subsystem.txt
 4 -r--r--r-- 1 root root  1714 Jun 14 02:29 hugetlb.txt
16 -r--r--r-- 1 root root 14124 Jun 14 02:29 memcg_test.txt
36 -r--r--r-- 1 root root 36415 Jun 14 02:29 memory.txt
 4 -r--r--r-- 1 root root
 4 -r--r--r-- 1 root root  2513 Jun 14 02:29 net_prio.txt

The next article will discuss how to use CGroups to limit I/O. Stay tuned!

