So far we have mainly studied the machine-level representation of programs and the representation and processing of information, including how assembly instructions and data are organized. That material leans toward the software side. From here on we study processor architecture, the memory hierarchy, linking, exceptions, virtual memory, program optimization, and related topics, which lean toward the hardware side.
Processor architecture is mainly about the basic ideas and methods of processor design: it studies how a processor works from an abstract point of view.
Instruction set architecture (ISA)
- Instruction set architecture: the instructions a processor supports, together with their byte-level encoding (that is, their encoding at the binary level). It can be understood as a man-made specification that states which instructions the processor can run and what they do, without stating how those functions are implemented.
- An instruction set architecture is only a specification, not a hardware design method. Microarchitecture is the hardware design side, which has to consider how to build an efficient processor. The instruction set architecture is not concerned with how the design is done, only with what functions the instructions provide.
- The instruction set architecture mainly describes four aspects:
  - The syntax of the instructions.
  - The semantics of the instructions.
  - The behavior during instruction execution and the result afterward. Results include the affected registers and flags.
  - The programmer-visible state, that is, what the programmer can use to control the computer. For example, assembly language can operate on memory, registers, the program counter, and the condition codes, so all of these are visible.
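The idea of programmer-visible state can be sketched as a small data structure. This is only an illustration: the class name `VisibleState`, the register names `r0` to `r7`, the three flags, and the 64-byte memory are all assumptions, not part of any real ISA.

```python
from dataclasses import dataclass, field


@dataclass
class VisibleState:
    """Hypothetical model of the state an ISA exposes to the programmer."""
    registers: dict = field(default_factory=lambda: {f"r{i}": 0 for i in range(8)})
    pc: int = 0  # program counter
    condition_codes: dict = field(default_factory=lambda: {"ZF": 0, "SF": 0, "OF": 0})
    memory: bytearray = field(default_factory=lambda: bytearray(64))


state = VisibleState()
state.registers["r1"] = 5   # an instruction may write a register
state.memory[0] = 0x2A      # or a byte of memory
state.pc += 2               # and it moves the program counter
print(state.registers["r1"], state.memory[0], state.pc)
```

Anything the hardware keeps outside such a structure (caches, internal buffers, and so on) belongs to the microarchitecture and is not part of the ISA.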
Two design philosophies for instruction set architectures
- In practice, manufacturers have produced many processor models (the common x86 processors, for example). Each has a corresponding architecture, and these architectures fall into two design philosophies, each with several families.
Complex instruction set architecture (CISC)
- In general-purpose computer systems, the x86 family is the representative CISC processor. In embedded systems there are many CISC processors, a typical one being the 8051 microcontroller.
- An embedded system is a special-purpose computer system that performs a dedicated task, as in a smart refrigerator or a smart TV. It has low performance and a single task to process, unlike the general-purpose computers used at home, which can do many tasks.
- It is worth noting that CISC is a relatively early design philosophy for instruction set architectures. In general-purpose computer systems the x86 processor dominates, and the other CISC processors have basically disappeared.
Reduced instruction set architecture (RISC)
- Power processors. IBM uses Power processors in its high-end server systems, which are built for high reliability and high availability. High reliability means the machine must run continuously for long periods, and high availability at large computing scale makes such machines suitable for banking and similar fields. Microsoft's Xbox 360 game console also used a PowerPC-based processor.
- The Arm instruction set architecture. Both Qualcomm and Apple processors use this instruction set architecture.
- The MIPS instruction set architecture. It is mainly used in low-end routers. The LoongISA architecture used by China's domestic Godson (Loongson) processors is also derived from MIPS. Godson is a special-purpose processor applied in industrial control and national defense security, and it represents the highest level of processor design in China.
- The RISC-V (read "five") instruction set architecture. It is an open-source instruction set architecture, and processor designs based on it are available as open source. A company can obtain an open-source processor design for free and modify it to suit its own needs. RISC-V is mainly used in the Internet of Things, a field expected to be prominent in the future. In the second half of last year, Alibaba's Pingtouge company released an artificial intelligence processor for Internet terminals that is based on RISC-V.
CISC features and design concept
- The instruction set is stack oriented. For example, in a procedure call the return address is pushed onto the stack; on 32-bit x86 all arguments are passed on the stack, and on x86-64 any arguments beyond the first six are passed on the stack.
- There are explicit push and pop instructions. A push could be implemented as a subtraction on the stack pointer followed by a mov, and a pop as a mov followed by an addition, but CISC provides dedicated push and pop instructions.
- Four-part memory addressing is very flexible. A single instruction can read and write memory and perform arithmetic at the same time. It is highly expressive: one such instruction is equivalent to several simple instructions.
- Condition codes: the zero flag, sign flag, overflow flag, and so on.
- For common and typical tasks, a CISC processor still implements the operation as a single instruction (four-part memory addressing, for example), even though that instruction is relatively complex. In essence, one instruction is equivalent to many simple instructions. This turns a software problem into a hardware solution: because early software was slow and hardware implementations were fast, hardware was used to accelerate software.
- Because the CISC instruction set expresses each typical operation as one instruction, there are many instructions; the x86 manual, for example, runs to more than 2000 pages. Instruction length is also variable.
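Two of the points above can be sketched in a few lines of Python. The sketch assumes that the four address parts are the x86 displacement, base, index, and scale, and it models memory as a dictionary. `ToyMachine` and `effective_address` are illustrative names, not real APIs.

```python
WORD = 8  # bytes per stack slot, as on x86-64


def effective_address(base, index, scale, disp):
    """Address = disp + base + index * scale, with scale one of 1, 2, 4, 8."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return disp + base + index * scale


class ToyMachine:
    """Hypothetical machine with a downward-growing stack."""

    def __init__(self, stack_top=64):
        self.mem = {}          # address -> value
        self.rsp = stack_top   # stack pointer

    def push(self, value):
        self.rsp -= WORD            # like: sub $8, %rsp
        self.mem[self.rsp] = value  # like: mov value, (%rsp)

    def pop(self):
        value = self.mem[self.rsp]  # like: mov (%rsp), value
        self.rsp += WORD            # like: add $8, %rsp
        return value


m = ToyMachine()
m.push(11)
m.push(22)
print(m.pop(), m.pop(), m.rsp)  # 22 11 64
print(effective_address(base=0x1000, index=3, scale=8, disp=16))  # 4136
```

Each method body is two simple steps, which is exactly the work a dedicated push or pop instruction folds into one instruction.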
RISC features and design concept
- Programming is register oriented: instructions work on registers, not on the stack. In RISC, arguments and the return address are passed in registers instead of on the stack. Accordingly, a RISC processor has many registers, at least 32. Operating on registers instead of the stack also speeds up the processor.
- Memory access is simple. Only load and store instructions can access memory. Address calculation is also very simple, with just two parts: a base address and an offset.
- There are no condition codes.
- RISC is the opposite of CISC. The idea is that the fewer instructions the processor supports, the better. It therefore does not support highly abstract instructions, only the simplest ones, and relies on combinations of simple instructions to carry out complex tasks. Its manual lists only 56 instructions, taking fewer than 3 pages.
- The advantage of RISC is that it can speed up the processor. The working speed of the processor depends on the clock cycle: the processor executes instructions one by one in step with the clock, so the shorter the clock cycle, the faster it works. As with the shortest stave of a barrel, the clock cycle is determined by the slowest instruction. Because CISC has complex instructions, its clock cycle is long and the processor is slow.
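The "slowest instruction sets the clock" argument can be written as a short calculation. The latencies below are made-up numbers chosen only for illustration; they do not describe any real processor.

```python
# Hypothetical instruction latencies in picoseconds (illustrative only).
risc_like = {"add": 200, "load": 300, "store": 300, "branch": 220}
cisc_like = dict(risc_like, add_mem_to_mem=700)  # one extra complex instruction


def cycle_time_ps(latencies):
    """If every instruction must fit in one cycle, the slowest one sets the cycle."""
    return max(latencies.values())


def run_time_ps(latencies, n_instructions):
    """Total time when each instruction takes exactly one clock cycle."""
    return n_instructions * cycle_time_ps(latencies)


print(cycle_time_ps(risc_like))      # 300
print(cycle_time_ps(cisc_like))      # 700
print(run_time_ps(risc_like, 1000))  # 300000
print(run_time_ps(cisc_like, 1000))  # 700000
```

In this model a single slow instruction penalizes every other instruction. A complex instruction can replace several simple ones, though, so a fair comparison also has to account for the number of instructions executed.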
Comparison of RISC and CISC instruction sets
- A CISC instruction set is hard to design and optimize (complicated instructions and a long clock cycle). A RISC instruction set is simple to design and optimize, but compiler design is more complex. Because RISC programs are composed of simple instructions, the compiler has more room to optimize the program.
- Although RISC processors are faster, desktop and server systems are still mainly x86. One reason is that Intel has made many modifications to its CISC design and achieved high performance. Another is that early processors were mainly CISC, so a large amount of software already runs on this platform; for the sake of the software ecosystem, x86 remains dominant. It is worth mentioning that the internal design of x86 borrows heavily from RISC, so that the core of the 64-bit system has become RISC inside a CISC shell. For example, the 64-bit system adds many registers, and the first six arguments are passed in registers.
- In the embedded market, RISC holds the leading position. RISC processors have simple instructions, small size, and simple designs, which lowers their cost and energy consumption. Embedded devices have modest performance requirements, but because they often run on batteries their energy requirements are strict, and RISC meets these requirements well.
What's inside an x86 processor today
- At the macro level it is a CISC processor, but at the micro level the CISC part is only a shell around a RISC core. The processor accepts complex CISC instructions from the outside and translates each of them internally into multiple microinstructions (microcode). In effect, a complex CISC instruction is converted into a microprogram made up of several microinstructions.
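The translation step can be illustrated with a toy decoder. This is a hypothetical sketch: the tuple format, the bracket syntax for memory operands, and the temporary names `tmp_src` and `tmp_dst` are invented for the example and do not reflect how any real x86 processor encodes its microinstructions.

```python
def is_memory_operand(operand):
    """In this toy syntax, "[rbx]" means the memory location addressed by rbx."""
    return operand.startswith("[") and operand.endswith("]")


def decode_to_micro_ops(instr):
    """Split a toy CISC-style (op, dst, src) instruction into simple micro-ops."""
    op, dst, src = instr
    micro_ops = []
    if is_memory_operand(src):
        micro_ops.append(("load", "tmp_src", src))  # bring the source into a temp
        src = "tmp_src"
    if is_memory_operand(dst):
        micro_ops.append(("load", "tmp_dst", dst))   # read the old destination value
        micro_ops.append((op, "tmp_dst", src))       # register-only arithmetic
        micro_ops.append(("store", dst, "tmp_dst"))  # write the result back
    else:
        micro_ops.append((op, dst, src))             # already a simple instruction
    return micro_ops


print(decode_to_micro_ops(("add", "[rbx]", "rax")))
print(decode_to_micro_ops(("add", "rcx", "rax")))
```

The first call yields a load, an add, and a store, while the register-only add passes through unchanged. Each resulting micro-op either touches memory or does arithmetic, never both, which is the RISC-like property described above.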
Implementation of a single-cycle processor
Single-cycle processor model
- On the left of the model is the circuit the CPU uses for computation. Its inputs are instructions and data.
- On the right is the storage. Computed results are stored here, and instructions and data are fetched from it and passed to the computing circuit on the left.
- The clock cycle is the shortest time an instruction needs from start to finish. In this model, the clock cycle cannot be shorter than 320 ps.
- Single-cycle processor: executing one instruction takes exactly one clock cycle.
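As a quick worked calculation, the 320 ps minimum clock cycle quoted above fixes the best possible throughput of a single-cycle design. The function names here are illustrative.

```python
CYCLE_PS = 320  # minimum clock cycle from the text, in picoseconds


def throughput_gips(cycle_ps):
    """Billions of instructions per second at one instruction per cycle."""
    return 1000 / cycle_ps  # 1 / (cycle_ps * 1e-12 s), expressed in units of 1e9


def total_time_ns(n_instructions, cycle_ps):
    """Time to run n instructions back to back, in nanoseconds."""
    return n_instructions * cycle_ps / 1000


print(throughput_gips(CYCLE_PS))      # 3.125
print(total_time_ns(1000, CYCLE_PS))  # 320.0
```

So a single-cycle processor with a 320 ps cycle completes at most about 3.1 billion instructions per second.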
The execution of instructions is divided into six steps
- Fetch: read the instruction from storage into the CPU, using the address held in the processor's program counter.
- Decode: prepare the operands and generate the control signals for the operation.
- Execute: perform the basic arithmetic and logic operations to compute the result.
- Memory access: operate on memory. Reads and writes are completed in this stage.
- Write back: if the result affects a register, update the register in this stage.
- PC update: set the program counter to the next instruction, or to the target instruction if there is a jump.
- Not every instruction needs all six steps. Execute, memory access, and write back (steps 3, 4, and 5) are optional, while fetch, decode, and PC update (steps 1, 2, and 6) are generally required.
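The six steps can be sketched as a tiny interpreter. This is a minimal sketch under stated assumptions: the tuple instruction format `(op, ra, rb, disp, length)`, the dictionary-based memory, and the names `step`, `program`, and `state` are all hypothetical. The 3-byte and 4-byte instruction lengths follow the two examples in this section.

```python
def step(state, program):
    """Run one instruction of a toy machine through the six steps."""
    regs, mem = state["regs"], state["mem"]

    # 1. Fetch: read the instruction the program counter points to.
    op, ra, rb, disp, length = program[state["pc"]]

    # 2. Decode: read the register operands.
    val_a, val_b = regs[ra], regs[rb]

    # 3. Execute: a sum for add, an address for load.
    if op == "add":
        val_e = val_a + val_b
    elif op == "load":
        val_e = val_b + disp
    else:
        raise ValueError(f"unsupported op: {op}")

    # 4. Memory access: only the load touches memory.
    val_m = mem[val_e] if op == "load" else None

    # 5. Write back: update the destination register.
    if op == "add":
        regs[rb] = val_e
    else:
        regs[ra] = val_m

    # 6. PC update: advance by the instruction length in bytes.
    state["pc"] += length


program = {
    0: ("load", "rA", "rB", 8, 4),  # rA = mem[rB + 8], 4-byte instruction
    4: ("add", "rA", "rB", 0, 3),   # rB = rA + rB, 3-byte instruction
}
state = {"pc": 0, "regs": {"rA": 0, "rB": 100}, "mem": {108: 7}}
step(state, program)
step(state, program)
print(state["regs"], state["pc"])  # {'rA': 7, 'rB': 107} 7
```

The add skips the memory step entirely, which matches the point that some of the six steps are optional for a given instruction.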
Understand the six steps of instruction execution with two examples
- The add instruction
  - Fetch: this instruction is three bytes long, so read three bytes.
  - Decode: read the values in registers rA and rB, generate the control signal for addition, and pass them on for execution.
  - Execute: add to get the result.
  - Write back: update register rB.
  - PC update: add 3 to the program counter (this instruction takes up three bytes).
- The mov instruction, loading data from memory into register rA
  - Fetch: this is a four-byte instruction, so read four bytes.
  - Decode: read the value in register rB.
  - Execute: add to compute the memory address.
  - Memory access: read memory.
  - Write back: write to register rA.
  - PC update: update the program counter to PC + 4.
- The process above is how a RISC instruction executes. Executing a CISC instruction on an x86 system is more involved: the instruction must first be decomposed into microinstructions, and each microinstruction then goes through the process above.
- From the perspective of resource utilization: as an instruction moves from start to completion, the fetch circuit does nothing while the instruction is in the decode stage, and in the same way the earlier parts of the circuit sit idle once the instruction has moved past them. At any moment only one of the six components is doing useful work, even though all six make up the processor. The resource utilization of a single-cycle processor is therefore very low.
- Think of the six parts of the processor as the stations of a production line and the instructions as the products. A pipeline is far more efficient than single-cycle execution, so the basic idea for fixing the low utilization of the single-cycle processor is pipelining.
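The utilization argument can be made concrete with an idealized count of time slots. The sketch assumes that all six stages take the same amount of time and that the pipeline never stalls, which real processors do not guarantee. The function names are illustrative.

```python
STAGES = 6  # the six steps of instruction execution


def single_cycle_slots(n):
    """Stage-time slots for n instructions when each must finish before the next starts."""
    return n * STAGES


def pipelined_slots(n):
    """Slots when a new instruction enters every slot: fill the pipe, then one per slot."""
    return STAGES + n - 1 if n > 0 else 0


def utilization(n, slots):
    """Fraction of stage-slots doing useful work."""
    busy = n * STAGES           # each instruction uses every stage once
    available = slots * STAGES  # six stages exist in every slot
    return busy / available


n = 100
print(round(utilization(n, single_cycle_slots(n)), 3))       # 0.167
print(round(utilization(n, pipelined_slots(n)), 3))          # 0.952
print(round(single_cycle_slots(n) / pipelined_slots(n), 2))  # 5.71
```

Under these assumptions the single-cycle design keeps only one stage in six busy, while the pipeline approaches full utilization as the number of instructions grows.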