The significance of MMU
[Guide] This article introduces the background and working mechanism of the MMU from the perspective of memory management. Processor-specific implementation details are deliberately omitted; instead, the working principle of the MMU is laid out clearly at the conceptual level.
Before the birth of the MMU
In traditional batch-processing systems such as DOS, the layout of applications and the operating system in memory was roughly as follows:
- Applications accessed physical memory directly, with the operating system occupying a portion of the memory.
- The operating system's responsibility was simply to load, run, and unload applications.
If the system only ever ran a single task, or if applications always needed very little memory, this architecture caused no problems. But as computer science and technology advanced, the problems to be solved grew more complex, applications demanded ever more memory, and single-task batch processing could no longer keep up. With the added requirement of multitasking, the architecture became untenable. So how did early multitasking systems work?
Programmers loaded and executed applications in segments by hand, but manual segmentation is hard, tedious work. Computer scientists then came up with an elegant idea: virtual memory. The memory a program uses can be much larger than physical memory; only the parts currently executing need to reside in memory, while the rest stays on disk. At the same time, this allows multiple applications to reside in memory simultaneously and execute concurrently.
In general, what major strategies need to be implemented?
- All applications can reside in memory at the same time and be scheduled by the operating system for concurrent execution. A mechanism is needed to manage I/O overlap and competing access to CPU resources.
- Virtual-to-physical mapping and swap management: a mapping must be established between real physical memory and virtual memory, using variable or fixed partitions, paging, or segmentation, and this mapping must be managed efficiently to implement swapping.
From this, some more specific implementation requirements follow:
- Protected, arbitrated access: strict access protection is required, dynamically tracking which memory pages/segments or regions are used by which applications. This is a requirement for arbitrating competing access to a shared resource.
- Efficient address translation: the mapping translation must be fast and efficient, or overall system performance suffers.
- Efficient virtual-physical swapping: moving pages/segments between virtual memory (disk) and physical memory must be fast and efficient.
In short, it was against this background that the MMU came into being. As with any technology, its development was demand-driven; that is the objective law of technological progress.
Benefits of memory management
- It provides a convenient, unified memory-space abstraction for programming. From the application developer's point of view, each program appears to have its own independent user memory space; the underlying implementation details are hidden behind a uniform, portable abstraction.
- It maximizes utility at minimal cost. Memory management through an MMU is not as fast as accessing physical memory directly, so why accept the overhead? Because it gives every concurrent process a complete, independent memory space. Physical memory is expensive; even though it is far cheaper than it used to be, an application cannot assume arbitrarily large amounts of it. And even if hardware could be built with enough memory for direct access, the CPU's directly addressable space, limited by the width of its address bus, is finite.
Overall strategy of memory management
From the operating system's perspective, the basic abstraction of virtual memory is realized by the operating system:
- The processor's memory space need not be consistent with the physical memory actually attached.
- When an application accesses memory, the operating system (via the MMU) translates the virtual memory address into a physical memory address and then completes the access.
From the application's perspective, the address an application (typically a process) uses is a virtual memory address. Conceptually, as the schematic diagram below shows, the MMU, under the control of the operating system, is responsible for translating virtual addresses into physical addresses.
In this way, virtual memory frees an application from having to reside entirely in memory at once:
- It saves memory: most applications do not need all of their contents loaded and resident at once, so the advantage is obvious. Even on a system configured with a large amount of RAM, memory remains one of the most precious resources.
- It makes applications and the operating system more flexible.
- The operating system flexibly allocates memory to applications according to their dynamic runtime behavior.
- Applications can use more (or less) memory than the machine's actual physical memory.
MMU and TLB
MMU (Memory Management Unit):
- A hardware circuit unit that converts virtual memory addresses to physical memory addresses
- All memory accesses go through the MMU's translation, unless the MMU is disabled.
TLB (Translation Lookaside Buffer): a hardware cache of recent address translations.
The ultimate purpose of this architecture is to satisfy the following operational requirement: multiple processes run simultaneously in the real physical memory space, with the MMU acting as the vital bridge from virtual memory to physical memory.
So, conceptually, how does such an architecture work at a high level? Physical memory is managed with a partitioning strategy, and from an implementation standpoint there are two options:
- Fixed-size partitioning
- Variable-size partitioning
Fixed-size partitioning
With this strategy, physical memory is divided into fixed, equal-size slices:
- Each slice is addressed through a base address
- Physical address = slice base address + virtual address
- The slice base address is loaded dynamically by the operating system when the process is scheduled to run
The advantage of this strategy is that switching between processes is easy. But it also has obvious disadvantages:
- Internal fragmentation: memory within a partition that is not used by its process cannot be used by any other process
- A single fixed partition size cannot suit every application process.
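The fixed-partition translation rule above can be sketched in a few lines of code. This is an illustrative model, not any particular processor's implementation; the partition size and addresses are made up for the example.

```python
# Illustrative model of fixed-size partition translation: the hardware adds
# the slice's base address (loaded by the OS at process switch) to every
# virtual address. PARTITION_SIZE is an assumed example value.

PARTITION_SIZE = 64 * 1024  # assume 64 KiB slices

def translate_fixed(base: int, vaddr: int) -> int:
    """physical address = slice base address + virtual address"""
    if vaddr >= PARTITION_SIZE:
        raise MemoryError("virtual address exceeds the partition size")
    return base + vaddr

# A process loaded into the slice starting at physical 0x20000:
print(hex(translate_fixed(0x20000, 0x1234)))  # 0x21234
```

Switching processes only requires reloading the base register, which is why context switches are cheap under this scheme.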
Variable-size partitioning
Memory is divided into variable-size blocks for mapping and swap management:
- A base address and a variable-size bound are required; the bound provides cross-boundary protection.
- Physical address = block base address + virtual address
The advantage of this strategy is that there is no internal fragmentation: each process is allocated just enough memory. The disadvantage is external fragmentation, which accumulates as blocks are dynamically loaded and unloaded.
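The base-plus-bound scheme can be sketched as follows. Again this is a conceptual model with made-up numbers, not a real MMU's circuit; the bound check is what provides the cross-boundary protection mentioned above.

```python
# Conceptual base + bound translation for variable-size partitions.
# The bound register holds the partition's size and is checked on every
# access before the base is added. Values below are illustrative.

def translate_base_bound(base: int, bound: int, vaddr: int) -> int:
    if vaddr >= bound:  # cross-boundary protection check
        raise MemoryError("access out of bounds: protection violation")
    return base + vaddr  # physical address = base + virtual address

# Process granted a 0x3000-byte block starting at physical 0x50000:
print(hex(translate_base_bound(base=0x50000, bound=0x3000, vaddr=0x2FFF)))  # 0x52fff
```

Any address at or beyond the bound raises an error, which models the hardware trap a real MMU would deliver to the operating system.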
Paging
The paging mechanism uses fixed-size partitions for mapping management in both the virtual memory space and the physical memory space.
- From the application's (process's) point of view, memory is a contiguous virtual address space of pages 0 through n.
- From the physical memory's point of view, those pages are scattered throughout physical storage.
- This mapping is invisible to the application and hides the implementation details.
How is a paged address resolved? What follows is the design concept; specific processor implementations differ in the details:
- The virtual address consists of two parts: the virtual page number (VPN) and the offset.
- The VPN is an index into the page table.
- The page table maintains the page frame number (PFN).
- The physical address is formed as PFN::offset (the PFN concatenated with the page offset).
For example, as shown in the following figure:
This does not yet yield the final physical address; let us look at a complete translation example.
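The single-level translation described above can be worked through in code. The page size (4 KiB) and the page-table contents below are assumed example values, not taken from any real system.

```python
# Worked example of single-level paging translation.
# With 4 KiB pages, the low 12 bits of a virtual address are the offset
# and the remaining high bits are the virtual page number (VPN).

PAGE_SHIFT = 12                    # 4 KiB = 2**12 bytes per page
PAGE_MASK = (1 << PAGE_SHIFT) - 1  # 0xFFF

# A tiny page table mapping VPN -> PFN; contents are hypothetical.
page_table = {0: 7, 1: 3, 2: 42}

def translate(vaddr: int) -> int:
    vpn = vaddr >> PAGE_SHIFT            # index into the page table
    offset = vaddr & PAGE_MASK           # byte offset within the page
    pfn = page_table[vpn]                # page table maintains the PFN
    return (pfn << PAGE_SHIFT) | offset  # physical address = PFN::offset

# Virtual address 0x1ABC: VPN = 1, offset = 0xABC, PFN = 3 -> 0x3ABC
print(hex(translate(0x1ABC)))  # 0x3abc
```

Note how the offset passes through unchanged; only the page number is remapped, which is exactly why the pages can be scattered anywhere in physical storage.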
How are page tables managed?
For a 32-bit address space with 4 KiB pages, a flat single-level page table needs 2^20 entries; at 4 bytes per entry, that is 4 MiB of page table per process, a significant overhead for storage and lookup. How can this cost be reduced? In practice only a small fraction of the address space is actually mapped, so the single-level scheme is extended into a multi-level page-table mechanism.
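The arithmetic behind that overhead estimate is simple enough to check directly. The 4-byte entry size is an assumption (common on 32-bit systems, but not universal).

```python
# Page-table size for a flat 32-bit page table, assuming 4 KiB pages
# and 4-byte page table entries (an assumed, typical entry size).

address_bits = 32
page_size = 4 * 1024  # 4 KiB
pte_size = 4          # bytes per page table entry (assumption)

num_entries = 2**address_bits // page_size  # 2**20 entries
table_bytes = num_entries * pte_size
print(table_bytes // (1024 * 1024), "MiB per process")  # 4 MiB per process
```

Multiply by the number of resident processes and the waste becomes obvious, since most of those entries map nothing at all.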
Take the two-level paging mechanism as an example:
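A two-level walk can be sketched as follows. The 10/10/12 split of the 32-bit address is an assumed layout (the classic one for 32-bit paging); the table contents are hypothetical.

```python
# Conceptual two-level page walk, assuming a 32-bit virtual address split
# into 10-bit directory index | 10-bit table index | 12-bit offset.
# Only regions actually in use need a second-level table at all,
# which is where the memory saving comes from.

PAGE_SHIFT = 12
LEVEL_BITS = 10
LEVEL_MASK = (1 << LEVEL_BITS) - 1

second_level = {5: 0x1234}          # table index -> PFN (hypothetical)
page_directory = {3: second_level}  # directory index -> second-level table

def walk(vaddr: int) -> int:
    dir_idx = (vaddr >> (PAGE_SHIFT + LEVEL_BITS)) & LEVEL_MASK
    tbl_idx = (vaddr >> PAGE_SHIFT) & LEVEL_MASK
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    table = page_directory[dir_idx]  # first memory access
    pfn = table[tbl_idx]             # second memory access
    return (pfn << PAGE_SHIFT) | offset

# dir_idx = 3, tbl_idx = 5, offset = 0x1A:
vaddr = (3 << 22) | (5 << 12) | 0x1A
print(hex(walk(vaddr)))  # 0x123401a
```

The two dictionary lookups model the two extra memory accesses a hardware walk performs, which is exactly the cost the next paragraph discusses.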
A single-level page table already adds cost: every access requires a page-table lookup before the data fetch. A two-level mechanism, needing two page-table lookups, doubles that cost. So how is efficiency recovered? This is where the TLB, mentioned above but not yet described in depth, comes in: translations are cached in a hardware cache, and that cache is the TLB.
- The TLB translates a virtual page number into its PTE; on a hit this can complete within a single cycle.
- The TLB is implemented in hardware:
- It is a fully associative cache (all entries are searched in parallel).
- The cache index is the virtual page number.
- The cached content is the PTE.
- From the PTE plus the offset, the physical address can be computed directly.
Who is responsible for loading the TLB on a miss? There are two strategies to choose from:
- The operating system loads it (a software-managed TLB): the OS finds the corresponding PTE and loads it into the TLB. The page-table format can be flexible.
- The MMU hardware loads it (a hardware-managed TLB): the operating system maintains the page table, the MMU walks it directly, and the page-table format must strictly follow the hardware-defined layout.
To sum up
This article traced the general development of memory management along with the evolution of computing: how the MMU arose, the differences between fixed-partition and variable-partition management, and how single-level paging, multi-level paging, and the TLB were derived from them. It described the birth and mechanism of the MMU in relatively accessible conceptual terms while deliberately omitting processor implementation details; as a conceptual introduction to how the MMU works, it aims to stay simple and easy to understand.
This article is from the WeChat official account "Embedded Inn". Commercial use is strictly prohibited.