Paging principles and algorithms of storage management
Storage manager: records which memory is in use and which is free; allocates memory when a process needs it and reclaims it after use.
1. No memory abstraction
This approach exposes physical addresses directly to the process, which operates on memory without any abstraction. An attempt to run multiple programs without memory abstraction was the IBM 360: memory was divided into 2 KB blocks, each assigned a 4-bit protection key, and memory was protected via the PSW (program status word), which also held a 4-bit key. A program could only access memory whose protection key matched the 4-bit key in the PSW; otherwise the operating system trapped the illegal access. Only the operating system could modify the protection keys. The remaining problem is relocation: all programs refer to absolute physical addresses.
The problem with running multiple programs together is that they all refer to absolute addresses. The IBM 360 solution was "static relocation": when a program is loaded at address 16384, for example, the constant 16384 is added to every address in the program. This method has two disadvantages: 1) relocating at load time slows down loading; 2) the loader needs some way to distinguish addresses from ordinary constants.
2. Memory abstraction address space
2.1 address space concept
The address space solves the protection problem by preventing one process from accessing the addresses of another process.
2.2 base register and limit register
Dynamic relocation maps the address space of each process onto a different part of physical memory. With base and limit registers, the program needs no relocation when it is loaded into physical memory; when a process runs, the starting physical address of its program is loaded into the base register and the program length into the limit register.
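As a minimal sketch (the function name and interface are illustrative, not from any particular machine), translation with base and limit registers amounts to a bounds check followed by an addition:

```python
# Sketch of dynamic relocation with base and limit registers
# (illustrative names, not from any particular machine).

def translate(virtual_addr: int, base: int, limit: int) -> int:
    """Bounds-check a virtual address, then offset it by the base register."""
    if not 0 <= virtual_addr < limit:
        raise MemoryError(f"address {virtual_addr} outside limit {limit}")
    return base + virtual_addr

# A process loaded at physical address 16384 with program length 4096:
print(translate(100, base=16384, limit=4096))   # 16484
```

Every memory reference pays for the comparison and the addition, which is why both registers must be hardware registers rather than memory locations.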
2.3 swapping
Swapping brings a process into memory in its entirety, runs it for a while, then writes it back to disk. Idle processes are stored mainly on disk, so they occupy no memory when they are not running.
2.3.1 free space management
2.3.1.1 bitmap management
With the bitmap method, memory is divided into allocation units, which may be as small as a few words or as large as several kilobytes. Each allocation unit corresponds to one bit in the bitmap: 0 means free and 1 means occupied (or vice versa).
The main drawback of bitmaps: searching the bitmap for a run of consecutive 0 bits of a given length is a slow operation.
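The slow search can be made concrete with a short sketch (the function name is illustrative): finding the first run of `length` free bits requires, in the worst case, a linear scan of the whole map.

```python
def find_free_run(bitmap: list[int], length: int) -> int:
    """Return the start of the first run of `length` zero bits, or -1.

    This linear scan is exactly the time-consuming operation described
    above: in the worst case every bit of the map must be examined."""
    run_start = run_len = 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == length:
                return run_start
        else:
            run_len = 0
    return -1

print(find_free_run([1, 1, 0, 1, 0, 0, 0, 1], 3))   # 4
```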
2.3.1.2 linked-list management
Free-list allocation algorithms include first fit, next fit, best fit, and worst fit.
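As an illustration, here is a minimal first-fit allocator over a free list of `(start, length)` holes (a simplified model; freeing and coalescing of adjacent holes are omitted). The other algorithms differ only in which suitable hole they pick: next fit resumes the scan where the last one stopped, best fit takes the smallest adequate hole, worst fit the largest.

```python
# Minimal first-fit allocator over a free list of (start, length) holes,
# kept sorted by address. Illustrative only.

def first_fit(holes: list[tuple[int, int]], size: int):
    """Carve `size` units out of the first hole large enough; return the
    start address, or None if no hole fits."""
    for i, (start, length) in enumerate(holes):
        if length >= size:
            if length == size:
                del holes[i]                              # hole used up exactly
            else:
                holes[i] = (start + size, length - size)  # shrink the hole
            return start
    return None

holes = [(0, 5), (10, 8), (30, 2)]
print(first_fit(holes, 6))   # 10 -- the first hole with length >= 6
print(holes)                 # [(0, 5), (16, 2), (30, 2)]
```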
2.4 virtual memory
Swapping faces this problem: software keeps growing, and there is not enough memory to hold whole programs.
The basic idea of virtual memory: each program has its own address space, which is divided into blocks called pages. Each page is a contiguous range of addresses. Pages are mapped onto physical memory, but not all pages need to be in physical memory at the same time for the program to run.
Virtual memory suits multiprogramming well: pieces of many programs reside in memory at the same time, and while one program is waiting for part of itself to be read in, the CPU can be given to another process.
The management tool for virtual memory is the memory management unit (MMU). When a virtual address is sent to the MMU, the MMU maps it to a physical memory address.
The MMU is usually part of the CPU, although logically it could be a separate chip, and for a long time it was.
The virtual address space is divided into fixed-size units called "pages"; the corresponding units in physical memory are called "page frames". Pages and page frames are usually the same size.
Transfers between RAM and disk are always in units of whole pages.
Principle demonstration: take virtual address 8192 (binary 0010000000000000). The incoming 16-bit virtual address is split into a 4-bit page number and a 12-bit offset (with a 4 KB page size, 12 bits are needed for the offset within a page, leaving 4 bits for the page number).
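The split can be checked with a few lines of Python (a sketch of the arithmetic, assuming the 16-bit addresses and 4 KB pages from the example):

```python
# Checking the example's arithmetic: 16-bit virtual addresses, 4 KB pages,
# so the low 12 bits are the offset and the high 4 bits the page number.

PAGE_SIZE = 4096
OFFSET_BITS = 12

def split(vaddr: int) -> tuple[int, int]:
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)

print(split(8192))   # (2, 0): address 8192 is the start of virtual page 2
```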
2.4.2 page table
The virtual address is divided into a virtual page number (high-order bits) and an offset (low-order bits). The function of the page table is to map virtual pages onto page frames.
Present/absent bit: when this bit is 1, the entry is valid and can be used; when it is 0, the virtual page the entry belongs to is not currently in memory, and accessing the page causes a page fault.
Protection bits: indicate what kinds of access the page permits. The simplest scheme uses one bit: 0 for read/write and 1 for read-only. A richer scheme uses three bits, one each for reading, writing, and executing the page.
Modified and referenced bits: when a page is written, the hardware automatically sets the modified bit. When the operating system reclaims a page frame, if the modified bit is set the page is dirty and must be written back to disk; if it is not set, the page is clean and can simply be discarded.
The cache-disable bit ensures that the hardware reads fresh data from the device rather than an old cached copy; machines with a separate I/O space and no memory-mapped I/O do not need this bit.
Virtual memory essentially creates a new abstraction, the address space. It is implemented by breaking the address space into pages and mapping each page onto a page frame of physical memory, or leaving it (temporarily) unmapped.
2.4.3 accelerate paging process
Two problems must be considered in the design of any paging system: 1) the mapping from virtual address to physical address must be very fast; 2) if the virtual address space is large, the page table will also be large.
2.4.3.1 translation lookaside buffer
The page table normally resides in memory. A small hardware device can be added that maps virtual addresses directly to physical addresses without consulting the page table: the translation lookaside buffer (TLB). It exploits the principle of locality.
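A toy software model of this lookup order, with a Python dict standing in for the hardware TLB (all names and structures here are illustrative):

```python
# Toy model of translation with a TLB in front of the page table. A dict
# stands in for the hardware TLB; `stats` just counts hits and misses.

PAGE_SIZE = 4096

def translate_with_tlb(vaddr, tlb, page_table, stats):
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page in tlb:
        stats["hits"] += 1
        frame = tlb[page]
    else:
        stats["misses"] += 1
        frame = page_table[page]   # slow path: consult the page table
        tlb[page] = frame          # cache the mapping for next time
    return frame * PAGE_SIZE + offset

page_table = {0: 2, 1: 5, 2: 7}
tlb, stats = {}, {"hits": 0, "misses": 0}
translate_with_tlb(4100, tlb, page_table, stats)   # miss: fills the TLB
translate_with_tlb(4104, tlb, page_table, stats)   # hit: locality pays off
print(stats)   # {'hits': 1, 'misses': 1}
```

The second access hits because it falls on the same page as the first, which is exactly the locality the TLB relies on.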
2.4.3.2 software TLB management
On a TLB miss, the system must first locate the page table entry, evict one entry from the TLB, load the new entry, and finally restart the instruction that faulted.
When a page is in memory but not in the TLB, a soft miss occurs: only the TLB needs updating, and no disk I/O is required. When the page itself is not in memory, a hard miss occurs, and a disk access is required to load the page.
2.4.4 page table for large memory
TLB is introduced to speed up the conversion from virtual address to physical address. Another problem is how to deal with huge virtual address space.
2.4.4.1 multi-level page tables
The 32-bit address is divided into a 10-bit PT1 field, a 10-bit PT2 field, and a 12-bit offset field.
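The field extraction is plain bit arithmetic; a small sketch assuming the 10/10/12 split above:

```python
# Extracting the PT1, PT2, and offset fields from a 32-bit virtual
# address, assuming the 10/10/12 split described above.

def split_two_level(vaddr: int) -> tuple[int, int, int]:
    offset = vaddr & 0xFFF          # low 12 bits
    pt2 = (vaddr >> 12) & 0x3FF     # next 10 bits: index into 2nd level
    pt1 = (vaddr >> 22) & 0x3FF     # top 10 bits: index into top level
    return pt1, pt2, offset

print(split_two_level(0x00403004))   # (1, 3, 4)
```

PT1 selects an entry in the top-level page table, which points to a second-level page table; PT2 selects the entry there that holds the page frame number.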
The purpose of multi-level page tables is to avoid keeping all the page tables in memory all the time; in particular, page tables that are never needed should not be kept around.
This virtual address space contains over a million pages, yet only four page tables are actually needed: the top-level page table and the second-level page tables for 0-4 MB (text segment), 4-8 MB (data segment), and the top 4 MB (stack segment). The present/absent bits of the remaining 1021 entries in the top-level page table are set to 0, so accessing them causes a page fault.
2.4.4.2 inverted page table
In an inverted page table there is one entry per page frame of real memory, rather than one per virtual page. Its disadvantage: translating a virtual address to a physical address becomes hard. When process n references virtual page p, the entire inverted table must be searched for the entry (n, p), and this happens on every memory reference.
Solution: use the TLB to hold the heavily used pages. On a TLB miss, search a hash table keyed on the virtual address, with collisions handled by chaining.
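A sketch of this arrangement (the class and method names are made up for illustration): one entry per physical frame, plus a hash table from (process, virtual page) to frame so a lookup need not scan every frame. Python's dict is itself a hash table with collision handling, standing in for the chained hash table described above.

```python
# Sketch of an inverted page table: frames[] holds one (pid, vpage) entry
# per physical frame; index is the hash table that makes lookups fast.

class InvertedPageTable:
    def __init__(self, num_frames: int):
        self.frames = [None] * num_frames   # frame -> (pid, vpage)
        self.index = {}                     # (pid, vpage) -> frame

    def map(self, frame: int, pid: int, vpage: int) -> None:
        self.frames[frame] = (pid, vpage)
        self.index[(pid, vpage)] = frame

    def lookup(self, pid: int, vpage: int):
        return self.index.get((pid, vpage))   # None if not mapped

ipt = InvertedPageTable(num_frames=4)
ipt.map(frame=2, pid=7, vpage=100)
print(ipt.lookup(7, 100))   # 2
print(ipt.lookup(7, 101))   # None
```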
2.5 page replacement algorithm
The question: when a page must be swapped out of memory, must it be a page of the faulting process itself, or can the evicted page belong to another process?
2.5.1 optimal page replacement algorithm
Disadvantage: it cannot be implemented, because it is impossible to predict when each page will next be accessed. Page accesses can be traced with a simulator, but the trace is valid only for one specific program and one set of input data.
2.5.2 page replacement algorithm not used recently
The system sets two status bits for each page. R bit: set whenever the page is accessed (read or written). M bit: set whenever the page is written (i.e., modified).
According to the combination of the R and M bits, pages fall into four classes:
Class 0: not referenced, not modified.
Class 1: not referenced, modified.
Class 2: referenced, not modified.
Class 3: referenced, modified.
The NRU (not recently used) algorithm removes a page at random from the lowest-numbered non-empty class.
Advantages: easy to understand and efficient to implement; its performance, while not optimal, is adequate.
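A minimal NRU sketch under these assumptions (each page carries an (R, M) pair; its class number is 2R + M):

```python
# NRU sketch: each page carries an (R, M) pair, its class is 2*R + M, and
# a random page from the lowest non-empty class is evicted. Illustrative.
import random

def nru_victim(pages):
    """pages: dict page_id -> (R, M). Return the page to evict."""
    best = min(2 * r + m for r, m in pages.values())
    candidates = [p for p, (r, m) in pages.items() if 2 * r + m == best]
    return random.choice(candidates)

pages = {"A": (1, 1), "B": (0, 1), "C": (1, 0), "D": (0, 1)}
print(nru_victim(pages) in {"B", "D"})   # True: class 1 beats classes 2 and 3
```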
2.5.3 first in first out page replacement algorithm
FIFO (first in, first out) algorithm: the operating system maintains a linked list of the pages in memory, with the newest page at the tail and the oldest at the head. On a page fault, the page at the head is evicted and the new page is appended at the tail.
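A short simulation of FIFO on a reference string, counting page faults (illustrative code):

```python
# Simulating FIFO replacement on a reference string and counting faults.
from collections import deque

def fifo_faults(frames: int, refs: list[str]) -> int:
    queue, resident, faults = deque(), set(), 0
    for page in refs:
        if page not in resident:
            faults += 1
            if len(queue) == frames:
                resident.remove(queue.popleft())   # evict the oldest page
            queue.append(page)                     # newest page at the tail
            resident.add(page)
    return faults

print(fifo_faults(3, list("ABCABDAC")))   # 5
```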
2.5.4 second chance page replacement algorithm
The second chance algorithm improves on FIFO. It maintains a linked list and inspects the R bit of the oldest page: if the bit is 0, the page is old and unused and is replaced immediately; if it is 1, the bit is cleared, the page is moved to the tail of the list as if it had just been loaded, and the search continues.
2.5.5 clock page replacement algorithm
Maintaining the linked list in the second chance algorithm costs too much, so the clock algorithm keeps the pages in a circular list instead. On a page fault, the page the hand points to is inspected: if its R bit is 0, the page is evicted, the new page is inserted in its place, and the hand advances one position; if the R bit is 1, it is cleared and the hand advances, repeating until a page with R = 0 is found.
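A compact simulation of the clock algorithm (the class name and `access` interface are made up for illustration):

```python
# Clock algorithm sketch: a fixed circular buffer of [page, R] slots and
# a hand that sweeps on each page fault until it finds R == 0.

class Clock:
    def __init__(self, size: int):
        self.slots = [None] * size   # each slot: [page_id, R] or None
        self.hand = 0

    def access(self, page_id) -> str:
        for slot in self.slots:
            if slot and slot[0] == page_id:
                slot[1] = 1                       # referencing sets R
                return "hit"
        while True:                               # page fault: find a victim
            slot = self.slots[self.hand]
            if slot is None or slot[1] == 0:
                self.slots[self.hand] = [page_id, 1]
                self.hand = (self.hand + 1) % len(self.slots)
                return "fault"
            slot[1] = 0                           # second chance: clear R
            self.hand = (self.hand + 1) % len(self.slots)

clk = Clock(2)
print(clk.access("A"), clk.access("B"), clk.access("A"), clk.access("C"))
# fault fault hit fault
```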
2.5.6 least recently used page replacement algorithm
LRU (least recently used) algorithm is an approximation of the optimal algorithm. There are two ways to implement LRU with special hardware:
1) The hardware has a 64-bit counter C that is incremented after every instruction. On each memory access, the current value of C is stored in the page table entry of the page just referenced. On a page fault, the page with the lowest counter value is found and evicted.
2) On a machine with n page frames, the hardware maintains an n x n matrix of bits, initially all 0. When page frame k is referenced, the hardware first sets all bits of row k to 1 and then sets all bits of column k to 0. At any instant, the row whose binary value is lowest belongs to the least recently used frame, and that page is evicted on a page fault.
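The matrix method is easy to simulate in software; a sketch for n = 3 frames:

```python
# Software simulation of the n x n matrix method: referencing frame k sets
# row k to all 1s, then clears column k; the row with the smallest binary
# value always belongs to the least recently used frame.

def reference(matrix, k):
    n = len(matrix)
    matrix[k] = [1] * n          # step 1: set row k
    for row in matrix:
        row[k] = 0               # step 2: clear column k

def lru_frame(matrix):
    values = [int("".join(map(str, row)), 2) for row in matrix]
    return values.index(min(values))

n = 3
matrix = [[0] * n for _ in range(n)]
for k in [0, 1, 2, 1]:           # reference frames 0, 1, 2, then 1 again
    reference(matrix, k)
print(lru_frame(matrix))         # 0 -- referenced longest ago
```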
2.5.7 simulate LRU with software
Principle (NFU, not frequently used): each page has a software counter, initially 0. At every clock interrupt, the operating system scans all pages and adds each page's R bit (which is cleared at every clock interrupt) to that page's counter. On a page fault, the page with the smallest counter is evicted.
Defect: NFU never forgets anything, so it is strongly influenced by earlier phases of the program's execution. The aging algorithm fixes this with a small change: at each clock interrupt, the counter is first shifted right one bit, and the R bit is added into the leftmost bit.
Two differences between LRU algorithm and aging algorithm:
1) Aging records only one bit per clock tick, so it cannot distinguish which of two pages referenced within the same tick was accessed earlier; true LRU preserves the exact order.
2) The aging counters have only a finite number of bits, which limits how far back they can remember. This is acceptable in practice: with a 20 ms clock tick and 8-bit counters, a page that has not been referenced for 160 ms is probably not important.
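A sketch of the aging step described above, assuming 8-bit counters (the data structures are illustrative):

```python
# Aging sketch with 8-bit counters: at each clock tick every counter is
# shifted right one bit and the page's R bit is added into the leftmost
# bit, after which R is cleared.

COUNTER_BITS = 8

def tick(counters, r_bits):
    for page in counters:
        counters[page] >>= 1
        if r_bits[page]:
            counters[page] |= 1 << (COUNTER_BITS - 1)
        r_bits[page] = 0

counters = {"A": 0, "B": 0}
r_bits = {"A": 0, "B": 0}

r_bits["A"] = 1; tick(counters, r_bits)   # tick 1: only A referenced
r_bits["B"] = 1; tick(counters, r_bits)   # tick 2: only B referenced
print(counters)   # {'A': 64, 'B': 128}: B looks more recently used,
                  # so A (the smallest counter) would be evicted first
```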
2.5.8 page replacement algorithm of worksets
Working set: a collection of pages currently in use by a process.
Thrashing: a page fault occurs every few instructions.
Working set model: the paging system tries to keep track of each process's working set and to ensure that it is in memory before letting the process run.
There are two ways to calculate worksets:
1) At any time t, there is a set containing the pages referenced by the last k memory accesses. This set w(k, t) is the working set.
2) A lower-overhead alternative looks not at the last k memory references but at execution time: the working set of a process is the set of pages it has referenced during the past T seconds of its execution time.
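Definition 1 is straightforward to express in code; a sketch where a reference string stands in for the memory access stream (illustrative only):

```python
# Definition 1 in code: the working set w(k, t) is the set of distinct
# pages among the last k references at the current time t.

def working_set(refs: list[str], k: int) -> set[str]:
    return set(refs[-k:]) if k > 0 else set()

refs = list("AABCCABBD")
print(working_set(refs, 4))   # the set {'A', 'B', 'D'}
```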
Definition of "current virtual time": if a process starts running at time T and has used 40 ms of CPU time by real time T + 100 ms, then for working-set purposes its time is 40 ms. The total CPU time a process has actually used since it started is called its current virtual time.
2.5.9 clock page replacement algorithm of working set
Initially the list is empty. As pages are loaded they are added, forming a ring. On a page fault, the page pointed to by the hand is examined first. If its R bit is 1, the bit is cleared and the hand advances to the next page. If the R bit is 0, and the page's age is greater than T and the page is clean, it is no longer in the working set, and the new page is put in its place. If the page has been modified, a write to disk is scheduled for it, but to avoid a process switch while waiting for the disk, the hand simply advances and the next page is examined.
2.5.10 summary of page replacement algorithm
The two best algorithms are the aging algorithm and the WSClock algorithm, based on LRU and the working set respectively. Both give good paging performance and can be implemented efficiently.