Do you understand Linux memory management? 10 diagrams to make it clear!

Time:2020-7-4

This blog is updated every week. Your likes, shares, and follows are my greatest encouragement. Search WeChat for the official account “backend Technology School” to read new articles first (the account is usually one or two updates ahead of the blog).

Today let’s look at Linux memory management. For those of us who spend our days writing CRUD code, memory management seems far away. This knowledge is not in high demand (many people will probably never get the chance to use it directly after learning it), but it is the foundation beneath the foundation. It is like internal training in martial arts: you see no immediate effect, but it pays off greatly in your later development work, because you are standing on higher ground.

All the diagrams in this article were drawn by me. Drawing takes more time than writing, but a diagram is more intuitive than text. Readers who want the high-resolution originals can find out how to get them at the end of the article.

To be more utilitarian: casually revealing in an interview that you know this material may make the interviewer more interested in you, which puts you one step closer to the peak of your life.

Conventions: the technical content of this article assumes Linux running on the 32-bit x86 architecture.

Virtual address

Even on modern operating systems, memory is still a precious resource. Compare the several terabytes of solid-state storage in your computer with the size of its RAM. To make full use of and manage system memory, Linux uses virtual memory management, which gives every process its own 4GB virtual address space that does not interfere with any other process.

A process is initialized, allocated memory, and run entirely in terms of these “virtual addresses”. Only when the process actually accesses memory is the mapping from virtual address to physical address established and a physical memory page brought in.

Here is a not entirely apt analogy. The principle is the same as with cloud storage services today. If your cloud drive is advertised as 1TB, do you really think the provider sets aside that much space for you up front? That would be naive. Space is allocated only when you actually store something, and you consume only as much real space as you store, yet to you and your friends it looks as though each of you has a full 1TB.
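
The idea can be sketched with a toy model. This is not kernel code: the class `LazyAddressSpace`, its methods, and the frame counter are hypothetical names made up for illustration, and a 4KB page size is assumed. It only shows that reserving virtual addresses costs nothing until a page is first touched.

```python
PAGE_SIZE = 4096  # assumed 4KB pages


class LazyAddressSpace:
    """Toy model: reserving virtual addresses uses no physical memory."""

    def __init__(self):
        self.reserved = []    # (start, end) virtual ranges handed out so far
        self.page_table = {}  # virtual page number -> physical frame number
        self.next_frame = 0   # hypothetical frame allocator: just a counter

    def reserve(self, start, size):
        """Hand out a virtual range; nothing is mapped yet."""
        self.reserved.append((start, start + size))

    def touch(self, addr):
        """Access an address; map its page on first use (simulated page fault)."""
        if not any(lo <= addr < hi for lo, hi in self.reserved):
            raise MemoryError(f"segmentation fault at {addr:#x}")
        vpn = addr // PAGE_SIZE
        if vpn not in self.page_table:
            self.page_table[vpn] = self.next_frame
            self.next_frame += 1
        return self.page_table[vpn]

    def resident_bytes(self):
        """Physical memory actually in use."""
        return len(self.page_table) * PAGE_SIZE


space = LazyAddressSpace()
space.reserve(0x10000000, 1 << 30)  # reserve 1GB of virtual addresses
space.touch(0x10000000)             # first page gets a frame
space.touch(0x10000008)             # same page, no new frame
space.touch(0x10002000)             # a different page
print(space.resident_bytes())       # 8192
```

Although 1GB was reserved, only the two touched pages consume physical memory in the model.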

Benefits of virtual address

  • Avoid users directly accessing the physical memory address, prevent some destructive operations, and protect the operating system
  • Each process is allocated 4GB of virtual memory, and the user program can use a larger address space than the actual physical memory

The 4GB virtual address space of a process is divided into two parts: “user space” and “kernel space”.

Physical address

As the previous section showed, the addresses used in both user space and kernel space are virtual addresses. When a process actually accesses memory, the kernel’s “demand paging” mechanism raises a “page fault” and brings in a physical memory page.

Translating a virtual address into a physical address involves the MMU (memory management unit), which applies segmentation and paging to the virtual address. The details of segmentation and paging are not repeated here; any textbook on computer organization covers them.
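
As a small taste of the paging step, here is a sketch of how a 32-bit linear address is cut into fields. It assumes the classic non-PAE x86 layout with 4KB pages (10-bit page directory index, 10-bit page table index, 12-bit offset). The function name `split_linear_address` is made up for this illustration.

```python
def split_linear_address(addr):
    """Split a 32-bit linear address into (directory index, table index, offset).

    Assumes classic two-level x86 paging with 4KB pages (no PAE).
    """
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    directory = addr >> 22        # top 10 bits select the page directory entry
    table = (addr >> 12) & 0x3FF  # middle 10 bits select the page table entry
    offset = addr & 0xFFF         # low 12 bits are the offset inside the page
    return directory, table, offset


print(split_linear_address(0xC0001234))  # (768, 1, 564)
```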

The Linux kernel divides physical memory into three management zones:

ZONE_DMA

DMA memory zone. It contains the page frames between 0MB and 16MB, which ISA-based devices can use for DMA, and it is directly mapped into the kernel’s address space.

ZONE_NORMAL

Normal memory zone. It contains the page frames between 16MB and 896MB. These are regular page frames, directly mapped into the kernel’s address space.

ZONE_HIGHMEM

High memory zone. It contains the page frames above 896MB. They are not directly mapped; they are accessed through permanent mappings and temporary mappings.
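
The three boundaries can be written down as a small classifier. This is only a sketch of the rule described above for 32-bit x86; the function name `zone_of` is hypothetical and not a kernel API.

```python
MB = 1 << 20

DMA_LIMIT = 16 * MB      # ZONE_DMA covers physical addresses below 16MB
NORMAL_LIMIT = 896 * MB  # ZONE_NORMAL covers 16MB up to 896MB


def zone_of(phys_addr):
    """Return the zone name for a physical address (32-bit x86 layout)."""
    if phys_addr < 0:
        raise ValueError("physical addresses are non-negative")
    if phys_addr < DMA_LIMIT:
        return "ZONE_DMA"
    if phys_addr < NORMAL_LIMIT:
        return "ZONE_NORMAL"
    return "ZONE_HIGHMEM"


print(zone_of(8 * MB))     # ZONE_DMA
print(zone_of(512 * MB))   # ZONE_NORMAL
print(zone_of(2048 * MB))  # ZONE_HIGHMEM
```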

User space

User processes can access “user space”. Each process has its own independent user space, with virtual addresses ranging from 0x00000000 to 0xBFFFFFFF, 3GB in total.

A user process can normally access only user-space virtual addresses. Kernel space becomes accessible only when the process traps into the kernel, for example through a system call.
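
The 3GB/1GB split comes down to a single boundary. The sketch below uses made-up names (`KERNEL_START`, `address_space_of`) and simply restates the ranges given in this article for 32-bit x86.

```python
GB = 1 << 30

KERNEL_START = 0xC0000000  # first kernel-space address in the 3GB/1GB split
ADDRESS_SPACE_END = 1 << 32

USER_SPACE_SIZE = KERNEL_START                        # 0x00000000..0xBFFFFFFF
KERNEL_SPACE_SIZE = ADDRESS_SPACE_END - KERNEL_START  # 0xC0000000..0xFFFFFFFF


def address_space_of(vaddr):
    """Say whether a 32-bit virtual address lies in user or kernel space."""
    if not 0 <= vaddr < ADDRESS_SPACE_END:
        raise ValueError("not a 32-bit address")
    return "user" if vaddr < KERNEL_START else "kernel"


print(USER_SPACE_SIZE // GB, KERNEL_SPACE_SIZE // GB)  # 3 1
print(address_space_of(0xBFFFFFFF))                    # user
print(address_space_of(0xC0000000))                    # kernel
```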

Process and memory

The user space occupied by a process (a program in execution) is divided into 5 different memory areas on the principle that “address ranges with the same access attributes are stored together”. Access attributes means “readable, writable, executable”, and so on.

  • Code segment

    The code segment stores the machine instructions of the executable file; it is the image of the executable program in memory. It must be protected from illegal modification at run time, so it can only be read, not written.

  • Data segment

    The data segment stores the initialized global variables of the executable file, in other words the program’s statically allocated variables and global variables.

  • BSS segment

    The BSS segment holds the program’s uninitialized global variables. In memory, the whole BSS segment is zero-filled.

  • Heap

    The heap holds memory that is allocated dynamically while the process runs. Its size is not fixed, and it can grow or shrink dynamically. When a process calls malloc or a similar function, the newly allocated memory is added to the heap (the heap grows); when free or a similar function releases memory, the released memory is removed from the heap (the heap shrinks).

  • Stack

    The stack stores the local variables that the program creates temporarily, that is, the variables defined inside functions (excluding variables declared static, which are stored in the data segment). In addition, when a function is called its arguments are pushed onto the calling process’s stack, and when the call finishes the function’s return value is also stored back on the stack. Because the stack is last in, first out (LIFO), it is particularly convenient for saving and restoring the call context. In this sense, the stack can be seen as a memory area for holding and exchanging temporary data.

Of the memory areas above, the data segment, BSS segment, and heap are usually stored contiguously in memory, while the code segment and the stack are usually stored separately. On the i386 architecture, the stack grows downward and the heap grows upward.

On Linux you can also use the size command to view the size of each memory area of a compiled program:

[lemon ~]# size /usr/local/sbin/sshd
   text       data        bss        dec        hex    filename
1924532      12412     426896    2363840     2411c0    /usr/local/sbin/sshd
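
To read this output: dec is the sum of text, data, and bss, and hex is the same total in hexadecimal. The short sketch below checks that against the sample above. The helper name `parse_size_output` is made up, and the parsing assumes the two-line layout shown here.

```python
SIZE_OUTPUT = """\
   text       data        bss        dec        hex    filename
1924532      12412     426896    2363840     2411c0    /usr/local/sbin/sshd
"""


def parse_size_output(output):
    """Turn the header line and the value line into a dict of strings."""
    header, row = output.strip().splitlines()
    return dict(zip(header.split(), row.split()))


fields = parse_size_output(SIZE_OUTPUT)
total = int(fields["text"]) + int(fields["data"]) + int(fields["bss"])

# dec is decimal, hex is base 16; both should equal text + data + bss.
print(total == int(fields["dec"]) == int(fields["hex"], 16))  # True
```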

Kernel space

On 32-bit x86, the Linux kernel address space is the high part of the virtual address range, from 0xC0000000 to 0xFFFFFFFF, 1GB in total. It holds the kernel image, the physical page tables, drivers, and everything else that runs in kernel space.

Direct mapping area

Direct mapping region (Direct Memory Region): the range of at most 896MB starting at the kernel space start address is the directly mapped region.

The 896MB of “linear addresses” in the direct mapping region map directly onto the first 896MB of “physical addresses”, so consecutive linear addresses correspond to consecutive physical addresses. The kernel linear address 0xC0000001 corresponds to physical address 0x00000001; the two differ by the offset PAGE_OFFSET = 0xC0000000.

Linear and physical addresses in this region are related by a simple linear conversion, “linear address = PAGE_OFFSET + physical address”. You can also use the virt_to_phys() function to convert a linear address in the kernel virtual space to a physical address.
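
The conversion is plain arithmetic, which the following Python model illustrates. These functions only imitate the arithmetic of the kernel helpers of the same name inside the direct mapping region; they are not the kernel implementation, and the range check is an addition for this illustration.

```python
PAGE_OFFSET = 0xC0000000           # start of kernel space on 32-bit x86
DIRECT_MAP_SIZE = 896 * (1 << 20)  # at most 896MB is directly mapped


def virt_to_phys(vaddr):
    """Model: linear address in the direct mapping region -> physical address."""
    if not PAGE_OFFSET <= vaddr < PAGE_OFFSET + DIRECT_MAP_SIZE:
        raise ValueError(f"{vaddr:#x} is outside the direct mapping region")
    return vaddr - PAGE_OFFSET


def phys_to_virt(paddr):
    """Model: physical address in the first 896MB -> kernel linear address."""
    if not 0 <= paddr < DIRECT_MAP_SIZE:
        raise ValueError(f"{paddr:#x} is not directly mapped")
    return paddr + PAGE_OFFSET


print(hex(virt_to_phys(0xC0000001)))  # 0x1
print(hex(phys_to_virt(0x00000001)))  # 0xc0000001
```

Addresses from 0xF8000000 upward are rejected by the model because they lie past the 896MB directly mapped range.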

High memory linear address space

The kernel-space linear addresses from 896MB up to 1GB, a 128MB range, form the high memory linear address space. Why is it called that? Here is the explanation:

As mentioned earlier, kernel space is 1GB in total. The 896MB of linear addresses starting at the beginning of kernel space are directly mapped to 896MB of physical addresses. Even if the entire 1GB of kernel linear addresses were directly mapped, the kernel could address at most 1GB of physical memory.

How much memory does your machine have? Wake up, it is 2020, and an ordinary PC has far more than 1GB of RAM!

So kernel space sets aside its last 128MB of addresses and divides them into the following three high memory mapping regions, through which the whole physical address range can be reached. The problem does not exist on 64-bit systems, because the available linear address space is far larger than the memory that can be installed.
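
The arithmetic behind the 128MB window can be sketched as follows. The names (`LOWMEM_LIMIT`, `HIGHMEM_WINDOW_START`, `split_physical_memory`) are made up for illustration; the numbers are the ones used in this article for 32-bit x86.

```python
MB = 1 << 20

KERNEL_START = 0xC0000000
KERNEL_SPACE_SIZE = 1024 * MB
LOWMEM_LIMIT = 896 * MB  # size of the direct mapping region

# The last part of kernel space, left over for the three mapping regions.
HIGHMEM_WINDOW_START = KERNEL_START + LOWMEM_LIMIT
HIGHMEM_WINDOW_SIZE = KERNEL_SPACE_SIZE - LOWMEM_LIMIT


def split_physical_memory(ram_bytes):
    """Return (directly mapped bytes, high memory bytes) for a given RAM size."""
    lowmem = min(ram_bytes, LOWMEM_LIMIT)
    return lowmem, ram_bytes - lowmem


print(hex(HIGHMEM_WINDOW_START), HIGHMEM_WINDOW_SIZE // MB)  # 0xf8000000 128
low, high = split_physical_memory(4096 * MB)
print(low // MB, high // MB)                                 # 896 3200
```

With 4GB of RAM, 3200MB can only be reached through the 128MB window, which is why the mappings in it have to be created and torn down on demand.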

Dynamic memory mapping area

vmalloc Region: this region is allocated by the kernel function vmalloc. Its defining feature is that the linear address range is contiguous while the corresponding physical pages need not be. The physical pages behind a vmalloc allocation may lie in low memory or in high memory.

Permanent memory mapped area

Persistent Kernel Mapping Region: this region is used to access high memory. The way to do so is to allocate a high memory page with alloc_page(__GFP_HIGHMEM), or to use the kmap function to map an allocated high memory page into this region.

Fixed mapping area

Fixing Kernel Mapping Region: only a 4KB guard gap separates this region from the top of the 4GB address space. Each address entry in it serves a specific purpose, such as ACPI_BASE.

Review

That was a lot of material. Do not rush into the next section; first let’s review what we have covered. If you read the sections above carefully, the extra diagram I drew here should match the overall picture of memory management that you now have in mind.

Memory data structure

To manage virtual memory, the kernel needs data structures that abstract it. Memory management operations such as allocation and release all operate on these data structures. Two data structures that manage virtual memory areas are introduced here.

User space memory data structure

In the earlier “Process and memory” section we saw that a Linux process is divided into 5 different memory areas: the code segment, data segment, BSS segment, heap, and stack. The kernel manages these areas by abstracting each one into a vm_area_struct memory management object.

vm_area_struct is the basic management unit that describes a process’s address space. A process usually needs several vm_area_struct objects to describe its user-space virtual addresses, so a “linked list” and a “red-black tree” are used to organize them.

The linked list is used when every node has to be traversed, while the red-black tree is suited to locating a specific memory area in the address space. The kernel uses both data structures so that all kinds of operations on memory areas perform well.
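
The two access patterns can be imitated in a few lines. This is a model, not the kernel’s code: a sorted Python list with binary search (`bisect`) stands in for the red-black tree, the class and field names merely echo `vm_area_struct`, and the addresses are invented. The lookup follows the rule “return the first area whose end lies above the address”, so the caller still has to check that the address is not in a gap before the area.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Vma:
    vm_start: int  # first address of the area (inclusive)
    vm_end: int    # first address after the area (exclusive)
    name: str


class AddressSpace:
    def __init__(self, vmas):
        # Sorted by start address; the areas are assumed not to overlap.
        self.vmas = sorted(vmas, key=lambda v: v.vm_start)
        self._ends = [v.vm_end for v in self.vmas]

    def walk(self):
        """Traverse every area in address order (the linked-list use case)."""
        return [v.name for v in self.vmas]

    def find_vma(self, addr):
        """First area with vm_end > addr, or None (the tree use case)."""
        i = bisect.bisect_right(self._ends, addr)
        return self.vmas[i] if i < len(self.vmas) else None


mm = AddressSpace([
    Vma(0xBFFDF000, 0xC0000000, "stack"),
    Vma(0x08048000, 0x08050000, "text"),
    Vma(0x09000000, 0x09021000, "heap"),
    Vma(0x08050000, 0x08052000, "data"),
])

print(mm.walk())                     # ['text', 'data', 'heap', 'stack']
print(mm.find_vma(0x08049000).name)  # text
hit = mm.find_vma(0x08060000)        # address lies in the gap before the heap
print(hit.name, hit.vm_start <= 0x08060000)  # heap False
```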

Address management model of user space process

Kernel space dynamic allocation memory data structure

In the kernel space section we mentioned the dynamic memory mapping region, which is allocated by the kernel function vmalloc. Its defining feature is that the linear address range is contiguous while the corresponding physical pages need not be. The physical pages behind a vmalloc allocation may lie in low memory or in high memory.

Addresses allocated by vmalloc are confined to the range between VMALLOC_START and VMALLOC_END. Each block of kernel virtual memory allocated by vmalloc corresponds to one vm_struct structure, and neighboring blocks are separated by a 4KB guard gap that catches out-of-bounds access. As with user-space virtual addresses, there is no simple linear relationship between these addresses and physical memory; they can be converted to a physical address or physical page only through the kernel page tables. A given address may not have been mapped yet, in which case the mapping is established when a page fault occurs.
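
The layout of vmalloc blocks and their guard gaps can be pictured with a toy allocator. Everything here is hypothetical: the class `VmallocModel`, the bump-pointer strategy, and the numeric bounds are invented for illustration, and the real kernel allocator is considerably more elaborate. Only the 4KB gap between neighboring blocks and the confinement to a fixed range come from the text above.

```python
PAGE_SIZE = 4096
GUARD_SIZE = PAGE_SIZE  # 4KB gap left unmapped after every block


class VmallocModel:
    """Toy allocator: page-aligned blocks separated by guard gaps."""

    def __init__(self, start, end):
        self.start = start  # plays the role of VMALLOC_START
        self.end = end      # plays the role of VMALLOC_END
        self.next_free = start
        self.areas = []     # (addr, size) pairs, one per block

    def alloc(self, size):
        """Reserve a block; return its address, or None if the range is full."""
        size = -(-size // PAGE_SIZE) * PAGE_SIZE  # round up to whole pages
        addr = self.next_free
        if addr + size > self.end:
            return None
        self.areas.append((addr, size))
        self.next_free = addr + size + GUARD_SIZE
        return addr


# Illustrative bounds only; real values depend on the kernel configuration.
region = VmallocModel(start=0xF8800000, end=0xF8810000)
a = region.alloc(5000)  # rounded up to 2 pages
b = region.alloc(100)   # rounded up to 1 page
print(hex(a), hex(b))   # 0xf8800000 0xf8803000
print(b - (a + 8192))   # 4096
```

An access that runs off the end of the first block lands in the unmapped 4KB gap instead of silently corrupting the second block.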

To sum up

Linux memory management is a very complex system, and this article covers only the tip of the iceberg. It shows you the overall picture from a macro perspective, which is generally enough for a conversation with an interviewer. Of course, I hope you will go on to understand the deeper principles through further reading.

I hope this article can serve as an index-style learning guide. When you want to study a particular point in depth, you can find the entry point in these sections and see where that point sits in the overall picture of memory management.

While writing I also drew many diagrams, which can serve as a knowledge index. I find diagrams much clearer than text. Reply “memory management” in my official account to get the HD originals.

As usual, thank you for reading. The purpose of this article is to share my understanding of the topic. I verify my technical articles repeatedly to make them as accurate as possible. If you find obvious mistakes, please point them out so we can learn together through discussion. That’s all for today’s technical sharing. See you next time.

Original writing is not easy. If you got something out of this article, please “like” and “follow”.
