On Linux virtual memory

Time:2022-1-9

Origin

Virtual memory

Virtual memory is without doubt one of the most important concepts in an operating system, mainly because of the “strategic position” that memory occupies. The CPU is extremely fast, but it has very little storage capacity and does only one job. I/O devices support all kinds of fancy functions, but they are far too slow compared with the CPU. A buffer is therefore needed between the two as a lubricant, and this is the role memory plays.

The figure above is the simplest and most intuitive explanation of virtual memory.

The operating system has a piece of physical memory (the middle part), and there are two processes, P1 and P2 (in reality there are many more). The operating system quietly tells each of P1 and P2: “All of my memory is yours. Use it freely, there is plenty.” In fact, this is an empty promise. The memory is said to be given to P1 and P2, but all they really receive is a range of addresses. Only when P1 and P2 actually start to use the memory does the system begin to shuffle things around and piece together physical blocks for each process. P2 thinks it is using memory A, but the system has quietly redirected it to the real block B. P1 and P2 may even be sharing block C without knowing it.

This trick that the operating system plays on its processes is virtual memory. Processes such as P1 and P2 believe they own the whole memory, and they neither know nor care which physical addresses they are actually using.

Paging and page tables

Virtual memory is an operating system concept. To the operating system, virtual memory is a mapping table: when P1 reads the data at address A, the access should go to address A of physical memory, and when it reads the data at address B, the access should go to address C of physical memory.

The basic unit of memory is the byte. If every byte of virtual memory were mapped individually to a physical address, each entry would need at least 8 bytes (a 32-bit virtual address -> a 32-bit physical address). With 4 GB of memory, the mapping table would need 32 GB, which is larger than the physical memory it describes. This is why the operating system introduces the concept of a page.

When the system starts, the operating system divides the entire physical memory into pages of 4 KB each. Memory is then allocated in units of pages, so the table that maps virtual pages to physical pages becomes much smaller: for 4 GB of memory, only an 8 MB mapping table is required. Virtual memory that a process does not use needs no mapping at all. Linux also uses multi-level page tables for large memory, which reduces memory consumption further. The table that maps the operating system's virtual memory to physical memory is called the page table.
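The two table sizes quoted above can be checked with a few lines of arithmetic. This is only a sketch of the calculation, using the article's figure of 8 bytes per entry; the names `table_bytes`, `per_byte` and `per_page` are made up for the illustration.

```python
# Rough size of a flat mapping table, using the article's 8 bytes per entry.
ENTRY_BYTES = 8            # 4-byte virtual address + 4-byte physical address
MEMORY_BYTES = 4 * 2**30   # 4 GB of memory
PAGE_BYTES = 4 * 2**10     # 4 KB pages


def table_bytes(memory_bytes: int, unit_bytes: int, entry_bytes: int = ENTRY_BYTES) -> int:
    """Size of a table holding one entry per `unit_bytes` of memory."""
    return (memory_bytes // unit_bytes) * entry_bytes


per_byte = table_bytes(MEMORY_BYTES, 1)           # one entry per byte
per_page = table_bytes(MEMORY_BYTES, PAGE_BYTES)  # one entry per 4 KB page

print(per_byte // 2**30, "GB with one entry per byte")
print(per_page // 2**20, "MB with one entry per 4 KB page")
```

Mapping whole pages instead of single bytes shrinks the table by a factor equal to the page size, here 4096.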

Memory addressing and allocation

Through the virtual memory mechanism, each process believes that it owns all of memory. When a process accesses memory, the virtual address it supplies is converted into a physical address by means of the page table the operating system maintains, and the data is then fetched from that physical address. The CPU contains dedicated hardware for this translation, the memory management unit (MMU). The CPU also caches recent page table lookups in the translation lookaside buffer (TLB). Thanks to the locality of programs, the hit rate of this cache can reach about 98%.

The description above assumes that the page table already contains a mapping from the virtual address to a physical address. If the virtual address a process accesses has no physical memory allocated yet, the system raises a page fault. While handling the fault, the system switches to kernel mode and allocates a physical page for the process's virtual address.
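The translation and allocate-on-first-access behaviour described here can be modelled in a few lines. This is a toy sketch, not how the kernel or the MMU is implemented: the class `ToyMMU` and its fields are invented for the example, and a flat dictionary stands in for the page table.

```python
PAGE_SIZE = 4096  # 4 KB pages, as in the article


class ToyMMU:
    """Toy model of address translation with on-demand frame allocation."""

    def __init__(self, total_frames: int):
        self.page_table = {}                          # virtual page number -> physical frame number
        self.free_frames = list(range(total_frames))  # frames not yet handed out
        self.page_faults = 0

    def translate(self, vaddr: int) -> int:
        vpn, offset = divmod(vaddr, PAGE_SIZE)        # split into page number and offset
        if vpn not in self.page_table:                # no mapping yet: simulate a page fault
            self.page_faults += 1
            if not self.free_frames:
                raise MemoryError("out of physical frames")
            self.page_table[vpn] = self.free_frames.pop(0)
        return self.page_table[vpn] * PAGE_SIZE + offset


mmu = ToyMMU(total_frames=4)
first = mmu.translate(0x5010)   # first touch of virtual page 5: fault, frame 0 is assigned
second = mmu.translate(0x5020)  # same page again: mapping exists, no new fault
print(hex(first), hex(second), mmu.page_faults)
```

Only the page number is translated; the offset within the page is carried over unchanged, which is why a per-page table is enough.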

Benefits

Virtual memory not only resolves memory access conflicts between processes through address translation, but also brings further benefits.

Process memory management

Virtual memory helps with process memory management, mainly in two ways:

  • Memory integrity: because virtual memory “deceives” the process, each process sees the memory it obtains as one contiguous range of addresses. When writing applications, we do not have to worry about how large blocks of addresses are allocated, and we can always assume that the system has enough memory.
  • Security: every memory access by a process must go through the page table. The operating system can therefore implement memory access control by adding permission flag bits to each page table entry.

Data sharing

Virtual memory makes it easier to share memory and data.

When a process loads a system library, it first allocates a block of memory and then loads the library file from disk into that block. When physical memory is used directly, physical addresses are unique, so even if the system notices that the same library has been loaded twice, each process has specified a different load address and the system can do nothing about it.

With virtual memory, the system only needs to point the process's virtual address at the physical address where the library file already resides. As shown in the figure above, the B addresses of processes P1 and P2 both point to physical address C.

Shared memory is also very simple to use with virtual memory. The system only needs to point the virtual address of each process at the shared memory address that the system has allocated.
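The effect of pointing two virtual addresses at the same physical memory can be observed from user space. The sketch below is only an illustration under assumptions of my own: it maps one temporary file twice with Python's `mmap` module, so the two views stand in for the two processes P1 and P2, and the names `view_a` and `view_b` are hypothetical.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"\0" * mmap.PAGESIZE)   # back the mappings with one page of zeros
    f.flush()

    # Two independent mappings of the same file (shared mappings by default).
    view_a = mmap.mmap(f.fileno(), mmap.PAGESIZE)
    view_b = mmap.mmap(f.fileno(), mmap.PAGESIZE)

    view_a[:5] = b"hello"            # write through the first mapping
    seen = view_b[:5]                # read through the second mapping
    print(seen)

    view_a.close()
    view_b.close()
```

The two views are separate ranges of virtual addresses, yet a write through one is visible through the other because both are backed by the same pages.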

SWAP

Virtual memory allows processes to “expand” memory.

As mentioned earlier, virtual memory allocates physical memory to processes through page faults. But memory is always limited. What happens when all physical memory is occupied?

Linux provides the concept of swap, and swap partitions can be used for it. When physical memory has been handed out and the available memory runs short, data that is temporarily unused is moved to disk so that the processes that need memory can use it first. When a process needs that data again, it is loaded back into memory. Through this “exchange” technique, Linux allows processes to use more memory than is physically present.
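The exchange between memory and disk can be sketched with a small model. This is a deliberately simplified illustration: the class `ToySwap` is invented here, a dictionary plays the role of the swap partition, and the victim is chosen by plain least-recently-used order, which is an assumption of the sketch rather than a description of Linux's actual page reclaim.

```python
from collections import OrderedDict


class ToySwap:
    """Toy model: a few RAM page slots backed by a 'disk' dictionary."""

    def __init__(self, ram_pages: int):
        self.ram_pages = ram_pages
        self.ram = OrderedDict()   # page -> data, least recently used first
        self.disk = {}             # pages that have been swapped out
        self.swap_ins = 0
        self.swap_outs = 0

    def _bring_in(self, page):
        if page in self.ram:                   # already resident: just mark as recently used
            self.ram.move_to_end(page)
            return
        data = None
        if page in self.disk:                  # swapped out earlier: load it back
            data = self.disk.pop(page)
            self.swap_ins += 1
        if len(self.ram) >= self.ram_pages:    # RAM is full: push the oldest page to disk
            victim, victim_data = self.ram.popitem(last=False)
            self.disk[victim] = victim_data
            self.swap_outs += 1
        self.ram[page] = data

    def write(self, page, data):
        self._bring_in(page)
        self.ram[page] = data

    def read(self, page):
        self._bring_in(page)
        return self.ram[page]


mem = ToySwap(ram_pages=2)
mem.write(0, "a")
mem.write(1, "b")
mem.write(2, "c")      # RAM holds only two pages, so page 0 goes to disk
value = mem.read(0)    # page 0 comes back, and page 1 is pushed out instead
print(value, mem.swap_outs, mem.swap_ins)
```

The process using `mem` sees three pages of storage although only two fit in RAM at any time, at the cost of extra traffic to the disk.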

Common questions

I also had a lot of questions while learning about virtual memory.

32-bit and 64-bit

The most common questions concern 32-bit versus 64-bit.

If the CPU accesses memory over the physical bus, the range of addresses it can reach is limited by the number of address lines. On a 32-bit machine there are 32 address lines, and each line carries a high or low level that represents bit 1 or 0. The largest addressable range is therefore 2^32 bytes = 4 GB, so installing more than 4 GB of memory on a 32-bit machine is pointless (leaving aside extensions such as PAE): the CPU cannot address the memory beyond 4 GB.

However, 64-bit machines do not have 64 address lines, and their maximum memory is also limited by the operating system. On x86-64 with four-level page tables, for example, Linux uses 48-bit virtual addresses (a 256 TB address space) and supports up to 64 TB of physical memory.

In principle, the concept of virtual memory does not rule out running 64-bit software on a 32-bit system. In practice, however, because of how virtual addresses are structured, 64-bit virtual addresses cannot be used on a 32-bit system.
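The 4 GB limit is simply the number of distinct values that 32 address bits can express. A minimal sketch of the arithmetic follows; the helper name `addressable_bytes` is made up, and the 48-bit case is included only for comparison as an assumption that is not taken from the text above.

```python
GIB = 2**30
TIB = 2**40


def addressable_bytes(address_bits: int) -> int:
    """Number of byte addresses that `address_bits` address lines can select."""
    return 2**address_bits


print(addressable_bytes(32) // GIB, "GiB with 32 address bits")
# Assumption for comparison only: 48 address bits.
print(addressable_bytes(48) // TIB, "TiB with 48 address bits")
```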

Direct operation of physical memory

The operating system uses virtual memory. What should we do if we want to operate on physical memory directly?

Linux maps devices to files under the /dev/ directory, and we can operate hardware directly through these device files. Memory is no exception: in Linux, the memory device is mapped to /dev/mem, and the root user can access memory directly by reading and writing this file.

The JVM process consumes too much virtual memory

When using top to check system performance, we find that Java processes occupy a lot of virtual memory in the VIRT column.

The reason is that Java allocates a large amount of virtual memory through glibc’s arena memory pools without actually using it. Files read by Java are also mapped into virtual memory. In addition, under the default JVM configuration, each Java thread stack takes up 1 MB of virtual memory. For details, see “Why multithreaded programs under Linux consume so much virtual memory”.

The physical memory actually occupied is shown in the RES (resident) column, whose value is the amount of memory that is mapped to physical memory.
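The same two figures can be read for any process from /proc/PID/status, where the VmSize and VmRSS fields hold the virtual and resident sizes in kB. The sketch below is one possible way to extract them; the function name `parse_vm_fields` is invented and the sample numbers are made up for illustration.

```python
import os


def parse_vm_fields(status_text: str) -> dict:
    """Return the VmSize and VmRSS values (in kB) found in /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])   # the first token is the number, followed by "kB"
    return fields


# Made-up sample in the format of /proc/<pid>/status.
sample = "Name:\tjava\nVmSize:\t 8123456 kB\nVmRSS:\t  512000 kB\n"
print(parse_vm_fields(sample))

# On Linux, the same function works on the current process.
status_path = "/proc/self/status"
if os.path.exists(status_path):
    with open(status_path) as f:
        print(parse_vm_fields(f.read()))
```

A large gap between the two numbers, as in the sample, means that most of the address space is reserved but not backed by physical memory.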

Common management commands

We can also manage the virtual memory of Linux by ourselves.

View system memory status

There are many ways to view system memory. Commands such as free and vmstat print the memory status of the current system. Note that the available memory is more than just the free column. Because the operating system is lazy, a large amount of buffers/cache is not cleaned up immediately after the processes that used it are done. If those processes run again, the cached data can be reused, and the memory can also be reclaimed for other uses when necessary.

In addition, cat /proc/meminfo shows the details of system memory usage, including the state of dirty pages. For details, see: /proc/meminfo puzzle.
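The point that available memory is more than the free column can be made concrete by parsing /proc/meminfo. This is a sketch under stated assumptions: the function names are invented, the sample values are made up, and adding MemFree, Buffers and Cached is only a rough fallback estimate for kernels that do not report a MemAvailable field.

```python
def parse_meminfo(text: str) -> dict:
    """Map each field name in /proc/meminfo text to its numeric value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            info[key.strip()] = int(parts[0])
    return info


def available_kb(info: dict) -> int:
    """Prefer the kernel's own MemAvailable; otherwise fall back to a rough estimate."""
    if "MemAvailable" in info:
        return info["MemAvailable"]
    return info["MemFree"] + info["Buffers"] + info["Cached"]


# Made-up sample values, for illustration only.
sample = """MemTotal:       16000000 kB
MemFree:         1000000 kB
Buffers:          200000 kB
Cached:          3000000 kB
Dirty:               120 kB
"""
info = parse_meminfo(sample)
print("free:", info["MemFree"], "kB, usable estimate:", available_kb(info), "kB")
```

In the sample, only about 1 GB is free, but roughly 4.2 GB could be made available because buffers and cache can be reclaimed.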

pmap

If you want to view the virtual memory layout of a single process, you can use the pmap PID command, which lists each segment of virtual memory from the lowest address to the highest.

You can add the -XX option to output more detailed information.
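The same per-segment information is available in /proc/PID/maps, one line per mapping. As a sketch of how such a line breaks down, the parser below splits it into start address, size, permissions and backing path; the function name and the two sample lines are made up for illustration.

```python
def parse_maps_line(line: str) -> dict:
    """Split one /proc/<pid>/maps line into start address, size, permissions and path."""
    parts = line.strip().split(None, 5)       # address perms offset dev inode [path]
    start_hex, end_hex = parts[0].split("-")
    start, end = int(start_hex, 16), int(end_hex, 16)
    return {
        "start": start,
        "size": end - start,                  # length of the segment in bytes
        "perms": parts[1],
        "path": parts[5] if len(parts) > 5 else "",
    }


# Made-up sample lines in the /proc/<pid>/maps format.
sample_lines = [
    "00400000-0040c000 r-xp 00000000 08:01 1234                   /usr/bin/cat",
    "7ffc0a1f2000-7ffc0a213000 rw-p 00000000 00:00 0              [stack]",
]
segments = [parse_maps_line(line) for line in sample_lines]
for seg in segments:
    print(hex(seg["start"]), seg["size"] // 1024, "KiB", seg["perms"], seg["path"])
```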

Modify memory configuration

We can also view and modify the Linux system configuration, either with sysctl vm [-options] CONFIG or by directly reading and writing the files under the /proc/sys/vm/ directory.
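The two approaches address the same settings: a dotted sysctl name corresponds to a file path under /proc/sys/. A small sketch of that correspondence follows, using vm.swappiness as the example key; the helper names are invented, and the read simply returns None where the file is missing or unreadable.

```python
def sysctl_path(name: str) -> str:
    """Translate a dotted sysctl name into its file path under /proc/sys/."""
    return "/proc/sys/" + name.replace(".", "/")


def read_sysctl(name: str):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        with open(sysctl_path(name)) as f:
            return f.read().strip()
    except OSError:
        return None


path = sysctl_path("vm.swappiness")
print(path, "=", read_sysctl("vm.swappiness"))
```

Writing a new value to the same file changes the setting, which requires root privileges.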

Swap operation

The swap feature of virtual memory is not always beneficial. Letting processes constantly exchange large amounts of data between memory and disk ties up the CPU and reduces the efficiency of the system, so sometimes we do not want to use swap.

We can set vm.swappiness=0 so that the system uses swap as little as possible, or simply disable swap with the swapoff command.

Summary

The concept of virtual memory is easy to understand, but it leads to a whole series of quite complex topics. This article covers only some basic principles and leaves out many details, such as the registers used in virtual address translation and how the operating system uses virtual memory to improve caches and buffers. I will cover these separately when I get the chance.
