Interviewer: Talk about your understanding of physical memory and virtual memory

Time: 2021-6-10

The article is updated every week. Original writing is not easy; helping more people see it, plus a "San Lian" (like, favorite, share), is the biggest encouragement for me. You can search WeChat for the official account "backend Technology School" to read articles first (usually one to two updates ahead of the blog).

Today let's continue learning about Linux memory management. What's that? You'd rather learn about "time management"? That's not something I can teach; grab a slice of watermelon and go catch up on the gossip on Weibo.


Let's get back to business. The previous article, "Don't say you don't understand Linux memory management anymore — 10 diagrams make it clear for you!", analyzed the Linux memory management mechanism. If you have forgotten it, I strongly suggest reading that one first and then coming back to this one. Due to space limits, that article did not cover physical memory management or virtual memory allocation, so let's learn about them today.

From the previous article we know that programs are not so easily fooled: no matter how memory management conjures up a virtual address space, in the end the program still has to be given real physical memory, otherwise it will go on strike. Physical memory (the actual RAM modules in your machine) is such an important resource that it has to be managed and used carefully. So how does the kernel manage physical memory?

Physical memory management

In Linux, physical memory is divided by the paging mechanism into 4 KB memory pages, called pages (also page frames, Page Frame). Physical memory is allocated and reclaimed in units of memory pages, and this paged management of physical memory brings great benefits.

When the system needs a small amount of memory, a whole page can be handed out up front, avoiding the frequent system overhead of repeatedly requesting and releasing tiny pieces of memory.

When the system needs a large block of memory, it can be pieced together from multiple pages instead of demanding a huge block of contiguous memory. Large or small, memory can be allocated and released flexibly; at first glance the paging mechanism seems to solve everything perfectly!


But the ideal is plump while reality is skinny. Paging alone, without extra management, still runs into problems. Let's look at the issues the system encounters after allocating and releasing physical pages many times.

Problems in physical page management

Allocating physical memory pages produces both external and internal fragmentation. "Internal" and "external" here are relative to a page frame: fragments inside a page frame are internal fragmentation, and fragments between page frames are external fragmentation.

External fragmentation

Allocating a large block of memory requires combining several pages, and the system tries to hand out physically contiguous pages. Frequent allocation and reclamation of physical pages, however, leaves many small free blocks scattered between the allocated pages, and these form external fragmentation.


Internal fragmentation

Physical memory is allocated page by page, so even when only a tiny amount of memory is needed, at least a 4 KB page is handed out. Yet in the kernel there are many scenarios where memory is needed in units of bytes: only a few bytes are wanted, but a whole page has to be allocated, and whatever remains after the used bytes becomes internal fragmentation. For example, allocating 100 bytes still consumes a 4096-byte page, leaving 3996 bytes wasted inside that page.


Page management algorithm

Because of these problems, clever programmers came up with ideas and introduced page management algorithms to tackle the fragmentation described above.

Buddy allocation algorithm

The Linux kernel introduces the buddy system algorithm. What does that mean? Free page-frame blocks of the same size are linked together in a list; the blocks pair up hand in hand like good buddies, which is where the algorithm gets its name.


Specifically, all free page frames are grouped into 11 block linked lists, whose blocks contain 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024 contiguous page frames respectively. At most 1024 contiguous page frames can be requested at once, corresponding to 4 MB of contiguous memory.


Since any positive integer can be written as a sum of powers of 2, a block of suitable size can always be found for a request, which reduces external fragmentation.

Allocation example

For example, suppose I need 4 page frames but the list of 4-contiguous-page-frame blocks has no free block. The buddy system then takes a block from the list of 8-contiguous-page-frame blocks and splits it into two blocks of 4 contiguous page frames: one is handed out, and the other goes onto the free list of 4-contiguous-page-frame blocks. When a block is released, the system checks whether the page frames before and after it are free and whether they can be merged into a block of the next larger size.
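To make this concrete, here is a tiny user-space sketch (my own illustration, not kernel code): it computes the order, i.e. which free list a request of n pages falls into, and shows the bit-flip trick by which the kernel locates a block's buddy from its page frame number.

/* Toy illustration of the buddy idea (user-space sketch, not kernel code). */
#include <stdio.h>

/* Smallest order whose block (2^order pages) can satisfy a request of n pages. */
static unsigned int order_for(unsigned long n)
{
    unsigned int order = 0;
    while ((1UL << order) < n)
        order++;
    return order;
}

int main(void)
{
    unsigned long pfn = 8;               /* hypothetical page frame number */
    unsigned int order = order_for(4);   /* request: 4 contiguous pages */

    /* In the kernel, a block's buddy is found by flipping one bit of its PFN. */
    unsigned long buddy = pfn ^ (1UL << order);

    printf("4 pages -> order %u block (%lu pages)\n", order, 1UL << order);
    printf("block at pfn %lu has its buddy at pfn %lu\n", pfn, buddy);
    return 0;
}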

Command view

You can check the buddy system's free lists with cat /proc/buddyinfo; for each zone, the 11 columns show how many free blocks of 1, 2, 4, ..., 1024 contiguous page frames are available:

[lemon]# cat /proc/buddyinfo 
Node 0, zone      DMA      1      0      0      0      2      1      1      0      1      1      3 
Node 0, zone    DMA32   3198   4108   4940   4773   4030   2184    891    180     67     32    330 
Node 0, zone   Normal  42438  37404  16035   4386    610    121     22      3      0      0      1 

Slab allocator

Seeing this, you might think that with the buddy system, physical memory is managed well enough, right? Not quite, otherwise the slab allocator would have no reason to exist.


What is a slab allocator?

Generally speaking, the life cycle of a kernel object is: allocate memory, initialize, release memory. The kernel contains a huge number of small objects, such as file description structures and task description structures. If memory for these small objects were allocated and released page by page through the buddy system, repeatedly running "allocate memory – initialize – release memory" would waste a great deal of performance.

The memory handed out by the buddy system is still based on page frames, yet many kernel scenarios allocate small pieces of memory far smaller than a page. The slab allocator caches kernel objects by carving memory into chunks of different sizes for different object types.

The buddy system and slab are not an either/or choice: the slab allocator is a complement to the buddy allocation algorithm.

The principle in plain words

For each type of kernel object, such as task_struct or file_struct, small kernel data objects that need to be reused frequently, there is a slab cache pool holding a large number of commonly used, already initialized objects. Whenever an object of that type is requested, one is taken from the slab list; when it is released, it is put back on the list rather than returned directly to the buddy system. This avoids internal fragmentation and greatly improves the performance of memory allocation.

Main advantages
  • Slab memory management is tailored to small kernel objects: it no longer hands out a whole page for every allocation, which makes full use of memory and avoids internal fragmentation.
  • Slab caches the small objects that the kernel creates and releases frequently, reusing objects of the same type and reducing the number of memory allocations.
Data structures


kmem_cache is a node in the cache_chain linked list; each node represents a cache of one type of kernel object. Each kmem_cache contains three slab linked lists:

  • slabs_full: slabs that are fully allocated
  • slabs_partial: slabs that are partially allocated
  • slabs_empty: slabs with no objects allocated

Inside kmem_cache there is an important structure, kmem_list3, which holds the declarations of these three lists.
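For reference, a trimmed sketch of what kmem_list3 looked like in older (2.6-era) kernels; only the list-related fields are shown, the in-tree name of the empty list is slabs_free (this article calls it slabs_empty), and newer kernels have since reorganized this structure into kmem_cache_node:

struct kmem_list3 {
    struct list_head slabs_partial;  /* partially allocated slabs */
    struct list_head slabs_full;     /* fully allocated slabs */
    struct list_head slabs_free;     /* empty slabs (slabs_empty in this article) */
    unsigned long free_objects;      /* number of free objects across the free slabs */
    /* ... other bookkeeping fields omitted ... */
};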


A slab is the smallest unit the slab allocator works with. In the implementation, a slab consists of one or more contiguous physical pages (usually just one page). A single slab moves between the slab lists: for example, when a half-full slab on slabs_partial becomes full after an object is allocated from it, it is removed from slabs_partial and inserted into slabs_full. The kernel's slab object allocation process is as follows (a small toy simulation follows the list):

  1. If slabs_partial still has unallocated space, allocate the object from there; if the slab becomes full after the allocation, move it to the slabs_full list.
  2. If slabs_partial has no unallocated space, go to the next step.
  3. If slabs_empty has unallocated space, allocate the object from there and move that slab into the slabs_partial list.
  4. If slabs_empty is empty, request pages from the buddy system, create a new free slab, and then allocate the object as in step 3.
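To make the list movements concrete, here is a tiny user-space toy simulation (purely illustrative; names such as toy_cache are made up, and the real kernel code is far more involved):

/* Toy user-space simulation of the slab allocation flow described above. */
#include <stdio.h>
#include <stdlib.h>

#define OBJS_PER_SLAB 4

struct toy_slab {
    int inuse;                        /* objects currently allocated from this slab */
    struct toy_slab *next;
};

struct toy_cache {
    struct toy_slab *partial, *full, *empty;   /* the three slab lists */
};

static void push(struct toy_slab **list, struct toy_slab *s) { s->next = *list; *list = s; }
static struct toy_slab *pop(struct toy_slab **list)
{
    struct toy_slab *s = *list;
    if (s) *list = s->next;
    return s;
}

/* Step 4: "ask the buddy system" for pages; here we simply malloc a new slab. */
static struct toy_slab *grow(void)
{
    printf("  new empty slab created (buddy system in the real kernel)\n");
    return calloc(1, sizeof(struct toy_slab));
}

static void cache_alloc(struct toy_cache *c)
{
    struct toy_slab *s = pop(&c->partial);      /* step 1 */
    if (!s) s = pop(&c->empty);                 /* step 3 */
    if (!s) s = grow();                         /* step 4 */
    s->inuse++;                                 /* hand out one object */
    if (s->inuse == OBJS_PER_SLAB) {
        push(&c->full, s);
        printf("  slab moved to slabs_full\n");
    } else {
        push(&c->partial, s);
        printf("  slab on slabs_partial (%d/%d objects used)\n", s->inuse, OBJS_PER_SLAB);
    }
}

int main(void)
{
    struct toy_cache c = {0};
    for (int i = 0; i < 6; i++) {
        printf("alloc #%d\n", i + 1);
        cache_alloc(&c);
    }
    return 0;
}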


Command view

All of the above is theory and fairly abstract, so let's look at the slabs in a real system. You can use the cat /proc/slabinfo command to view the system's actual slab information.


The slabtop command displays kernel slab cache information in real time.


Classification of slab cache

Slab caches fall into two categories: general caches and dedicated caches.

General cache

The slab allocator uses the kmem_cache structure to describe a cache, and these structures themselves also need to be cached by the slab allocator. cache_cache holds the cache of cache descriptors; it is a general cache and is stored as the first element of the cache_chain linked list.

In addition, the small-block contiguous memory allocation provided by the slab allocator is also implemented with general caches. The objects provided by the general caches come in geometrically distributed sizes from 32 up to 131072 bytes. The kernel provides the two interfaces kmalloc() and kfree() to request and release this memory respectively.

Dedicated cache

The kernel provides a complete set of interfaces for requesting and releasing dedicated caches, allocating slab caches for the specified object type according to the parameters passed in.

Requesting and releasing a dedicated cache

kmem_cache_create() creates a cache for a specified object type: it allocates a cache descriptor for the new dedicated cache from cache_cache (the general cache) and inserts it into the cache_chain linked list formed by cache descriptors. kmem_cache_destroy() destroys a cache and removes it from the cache_chain list.

Requesting and releasing slab objects

The slab data structure is defined in the kernel as follows:

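A simplified sketch of that structure as it appeared in older (2.6-era) kernels, for reference; newer kernels using the SLUB allocator no longer have a separate slab descriptor and keep this bookkeeping in struct page instead:

struct slab {
    struct list_head list;      /* which list the slab is on: full, partial, or empty */
    unsigned long colouroff;    /* colouring offset of the first object */
    void *s_mem;                /* address of the first object in the slab */
    unsigned int inuse;         /* number of objects currently allocated */
    kmem_bufctl_t free;         /* index of the next free object */
    unsigned short nodeid;      /* NUMA node the slab's pages belong to */
};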

kmem_cache_alloc() allocates an object from the cache specified by its parameter; the corresponding kmem_cache_free() releases an object back to that cache.
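As a rough sketch of how these interfaces are used together (my own illustrative example, assuming a recent kernel's five-argument kmem_cache_create(); struct foo and the cache name are made up for the illustration):

#include <linux/slab.h>
#include <linux/errno.h>

struct foo {                      /* hypothetical object type, for illustration only */
    int id;
    char name[32];
};

static struct kmem_cache *foo_cachep;

static int foo_cache_init(void)
{
    /* Create a dedicated cache for struct foo objects; the descriptor is
     * itself allocated from cache_cache and linked into cache_chain. */
    foo_cachep = kmem_cache_create("foo_cache", sizeof(struct foo),
                                   0, SLAB_HWCACHE_ALIGN, NULL);
    return foo_cachep ? 0 : -ENOMEM;
}

static struct foo *foo_alloc(void)
{
    /* Take one object from the cache; new slabs are requested from the
     * buddy system only when the cache runs out of free objects. */
    return kmem_cache_alloc(foo_cachep, GFP_KERNEL);
}

static void foo_free(struct foo *f)
{
    kmem_cache_free(foo_cachep, f);   /* object goes back to the cache, not the buddy system */
}

static void foo_cache_exit(void)
{
    kmem_cache_destroy(foo_cachep);   /* remove the dedicated cache from cache_chain */
}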

Virtual memory allocation

The discussion so far has been about managing physical memory. Linux "deceives" user programs through virtual memory management, pretending that each program has 4 GB of virtual address space (if that sentence doesn't ring a bell, I again suggest going back to "Don't say you don't understand Linux memory management anymore — 10 diagrams make it clear for you!").

So let's look at how virtual memory is allocated, covering both user-space and kernel-space virtual memory.

Note that freshly allocated virtual memory is not yet mapped to physical memory. Only when the requested virtual memory is accessed does a page fault occur; physical memory is then obtained through the buddy system and slab allocator described above.

User space memory allocation

malloc

malloc requests virtual memory in user space. When the request is smaller than 128 KB, malloc allocates memory with brk/sbrk; when the request is larger than 128 KB, it uses mmap to request memory.
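A small user-space sketch of this threshold behaviour (assuming glibc malloc; the 128 KB cutoff is glibc's default M_MMAP_THRESHOLD and can be tuned with mallopt, so the exact boundary may differ on your system):

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main(void)
{
    /* Below ~128 KB glibc typically grows the heap with brk/sbrk. */
    void *small = malloc(64 * 1024);

    /* Above the threshold glibc typically uses an anonymous mmap instead. */
    void *big = malloc(256 * 1024);

    /* The default threshold can be changed, e.g. lowered to 64 KB
     * (affects subsequent allocations only). */
    mallopt(M_MMAP_THRESHOLD, 64 * 1024);

    printf("small=%p big=%p\n", small, big);
    /* Running this under strace shows brk() for the small request and
     * mmap() for the large one (behaviour may vary by glibc version). */
    free(big);
    free(small);
    return 0;
}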

Existing problems

Because brk/sbrk/mmap are system calls, incurring a system call for every memory request would force the CPU to switch frequently between user mode and kernel mode, which hurts performance badly.

Moreover, the heap grows from low addresses toward high addresses; if memory at a low address is not freed, memory at higher addresses cannot be reclaimed, which easily produces memory fragmentation.

Solution

So malloc is implemented with a memory pool: it first requests a large block of memory and then carves it into blocks of different sizes. When the user requests memory, a block of a suitable size is picked directly from the pool.

Kernel space memory allocation

Before discussing kernel-space memory allocation, recall the kernel address space layout: kmalloc and vmalloc allocate virtual memory in different mapping areas.


kmalloc

The virtual addresses allocated by kmalloc() lie in the directly mapped area of the kernel address space.

It allocates virtual memory in units of bytes and is generally used for small blocks of memory; the matching release function is kfree. The physical memory it returns is contiguous. The function prototype is declared in <linux/slab.h>, and drivers commonly call kmalloc() to allocate memory for data structures.

Remember the slab allocator? kmalloc is built on top of it. You can also use the cat /proc/slabinfo command to view the slab objects behind kmalloc: the kmalloc-8, kmalloc-16, and similar entries are the kmalloc caches backed by slab.
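A hedged sketch of what this looks like in kernel code (the requested size is rounded up to the nearest kmalloc-N slab cache; ksize() reports the size actually reserved):

#include <linux/slab.h>
#include <linux/printk.h>
#include <linux/errno.h>

static int demo_kmalloc(void)
{
    /* Ask for 100 bytes; the object typically comes from the kmalloc-128
     * general cache, so ksize() will usually report 128 here. */
    char *buf = kmalloc(100, GFP_KERNEL);
    if (!buf)
        return -ENOMEM;

    pr_info("requested 100 bytes, got %zu from the slab cache\n", ksize(buf));

    kfree(buf);   /* release with kfree, the counterpart of kmalloc */
    return 0;
}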


vmalloc

The virtual addresses allocated by vmalloc lie in the dynamic memory mapping area between vmalloc_start and vmalloc_end.

It is generally used to allocate large blocks of memory; the matching release function is vfree. The allocated virtual addresses are contiguous, but the physical addresses are not necessarily contiguous. The function prototype is declared in <linux/vmalloc.h>. It is typically used to allocate data structures for the active swap area, to allocate buffers for certain I/O drivers, or to allocate space for kernel modules.
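And a corresponding sketch for vmalloc (virtually contiguous, possibly physically scattered memory, suited to larger buffers):

#include <linux/vmalloc.h>
#include <linux/printk.h>
#include <linux/errno.h>

static int demo_vmalloc(void)
{
    /* Allocate 4 MB of virtually contiguous memory; the underlying
     * physical pages are not necessarily contiguous. */
    void *buf = vmalloc(4 * 1024 * 1024);
    if (!buf)
        return -ENOMEM;

    pr_info("vmalloc buffer at %p (in the vmalloc_start..vmalloc_end area)\n", buf);

    vfree(buf);   /* release with vfree, the counterpart of vmalloc */
    return 0;
}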

To summarize the two kernel-space allocation methods: kmalloc allocates from the directly mapped area and returns physically contiguous memory, suited to small objects; vmalloc allocates from the dynamic mapping area between vmalloc_start and vmalloc_end and returns memory that is only virtually contiguous, suited to large buffers.

To sum up

This is the second article in the Linux memory management series. If anything was unclear while reading, I strongly recommend going back to the earlier piece, "Don't say you don't understand Linux memory management anymore — 10 diagrams make it clear for you!". With this, the topic of Linux memory management comes to an end for now. The knowledge shared here is very basic and you will rarely use it directly in day-to-day development, but I think every developer working on Linux should understand it.

I know some interviewers like to probe a candidate's fundamentals, and the content of these two articles is enough to get you through such questions. That said, Linux memory management is far too complex to explain thoroughly in one or two articles; at the very least we should have a macro-level picture of it rather than being stumped by the first few questions. If you want a deep understanding of the principles, I strongly recommend studying books together with the kernel source code, making a little progress every day. Our goal is the sea of stars.

While writing, I also drew a lot of illustrations that can serve as a knowledge index; pictures often make things clearer than words alone. You can reply "memory management" in the background of my official account to get the original high-resolution images.

Thanks for reading. As usual, the purpose of these articles is to share my understanding of the topic. I verify technical content repeatedly to keep it as accurate as possible, but if you spot an obvious mistake, please point it out and let's discuss it together. That's all for today's sharing; see you next time.


Original writing is not easy; if this helped, your "share and like" is the biggest support for my continued writing. See you in the next article.

You can search WeChat for the official account "backend Technology School" and reply "materials" or "1024" to get the various programming study materials I have prepared for you. Articles are updated every week; see you next time!