Principles of Computer Organization (review)

Time:2022-1-10

#Zhuo Yijing

Chapter 1 Introduction to computer systems

\1. What are computer systems, computer hardware and computer software? Which is more important, hardware or software?

Solution: P3

Computer system: an integrated whole made up of a computer hardware system and a software system.

Computer hardware: the electronic circuits and physical devices in a computer.

Computer software: the programs and related data required for the computer to operate.

Hardware and software depend on each other and neither can be dispensed with in a computer system, so they are equally important.

\5. What are the characteristics of von Neumann computer?

Solution (P8): the characteristics of a von Neumann computer are:

- The computer is composed of five parts: arithmetic unit, controller, memory, input device and output device.

- Instructions and data are stored in memory with equal status and can be accessed by address.

- Instructions and data are both represented in binary.

- An instruction consists of two parts: an operation code and an address code. The operation code indicates the nature of the operation, and the address code indicates the location of the operand in memory.

- Instructions are stored in memory in sequence and are normally fetched and executed automatically in that order.

- The machine is centered on the arithmetic unit (the original von Neumann machine).

\7. Explain the following concepts:

Host, CPU, main memory, storage unit, storage element, storage primitive, storage cell, storage word, storage word length, storage capacity, machine word length and instruction word length.

Solution: p9-10

Host: the main part of the computer hardware, composed of the CPU and main memory (MM).

CPU: central processing unit, the core component of the computer hardware, composed of the arithmetic unit and the controller. (In early machines the arithmetic unit and controller were not on the same chip; a current CPU contains both and also integrates a cache.)

Main memory: the memory in the computer that holds running programs and their data. It is the main working memory of the computer and can be accessed randomly. It is composed of the storage array, various logic components and control circuits.

Storage unit: a unit of storage that can hold one machine word and has a specific storage address.

Storage element: the physical element that stores one bit of binary information. It is the smallest unit of storage in the memory, also known as a storage primitive or storage cell. It cannot be accessed on its own.

Storage word: the logical unit of binary code stored in a storage unit.

Storage word length: the number of bits of binary code stored in a storage unit.

Storage capacity: the total amount of binary codes that can be stored in the memory; (usually, the primary and secondary storage capacities are described separately).

Machine word length: the number of bits of binary data that the CPU can process at one time. It is usually related to the number of bits in the CPU registers.

Instruction word length: the number of bits in the binary code of an instruction.

\8. Explain the meaning of the following abbreviations:

CPU, PC, IR, CU, ALU, ACC, MQ, X, MAR, MDR, I/O, MIPS, CPI, FLOPS

Solution: a complete answer gives three things for each abbreviation: the full English name, the name of the component and its function.

CPU: central processing unit, the core component of the computer hardware, mainly composed of the arithmetic unit and the controller.

PC: program counter. It holds the address of the instruction to be executed and can count automatically to form the address of the next instruction.

IR: instruction register. It holds the instruction currently being executed.

CU: control unit, the core component of the controller. It generates the sequence of micro-operation commands.

ALU: arithmetic logic unit, the core component of the arithmetic unit. It performs arithmetic and logic operations.

ACC: accumulator, a register in the arithmetic unit that can hold an operand before an operation and the result after it.

MQ: multiplier-quotient register. It holds the multiplier during multiplication and the quotient during division.

X: this letter is not an abbreviation and could name any component. Here it denotes the operand register, one of the working registers of the arithmetic unit, which holds an operand.

MAR: memory address register. It holds the address of the main memory storage unit to be accessed.

MDR: memory data register. It holds the data read from, or to be written to, a main memory storage unit.

I/O: input/output devices, the general term for input devices and output devices, used to convert and transfer information between the inside and outside of the computer.

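MIPS: million instructions per second, a unit for measuring how fast a computer executes instructions.

CPI: cycles per instruction, the average number of clock cycles needed to execute one instruction.

FLOPS: floating-point operations per second, a unit for measuring floating-point performance.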

\9. Draw the block diagram of the host. Taking the store instruction "STA M" and the add instruction "ADD M" (M is a main memory address) as examples, mark on the diagram, in order, the information flow (e.g. → ①) needed to complete each instruction (including the instruction fetch stage). Assuming the main memory capacity is 256M × 32 bits, give the number of bits of each register in the diagram, given that the instruction word length, storage word length and machine word length are equal.

Solution: the host block diagram is shown in Figure 1.11 on P13.

(1) STA M instruction: PC → MAR, MAR → MM, MM → MDR, MDR → IR, OP(IR) → CU, Ad(IR) → MAR, ACC → MDR, MAR → MM, WR

(2) ADD M instruction: PC → MAR, MAR → MM, MM → MDR, MDR → IR, OP(IR) → CU, Ad(IR) → MAR, RD, MM → MDR, MDR → X, ADD, ALU → ACC, ACC → MDR, WR

With a main memory capacity of 256M × 32 bits (256M = 2^28 storage units) and equal instruction, storage and machine word lengths, the ACC, X, IR and MDR registers are 32 bits wide, and the PC and MAR registers are 28 bits wide.
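As a quick check of these register widths, the following minimal Python sketch derives the address width from the number of storage units. The function name `register_widths` is illustrative only and does not come from the textbook.

```python
def register_widths(units: int, word_bits: int) -> dict:
    """Register widths for a main memory of `units` cells, each `word_bits` wide."""
    # Number of bits needed to address `units` cells.
    addr_bits = (units - 1).bit_length()
    return {
        "PC": addr_bits, "MAR": addr_bits,   # registers that hold addresses
        "ACC": word_bits, "X": word_bits,    # registers that hold operands
        "IR": word_bits, "MDR": word_bits,   # hold one instruction / one storage word
    }

widths = register_widths(256 * 2**20, 32)    # 256M x 32 bits
print(widths["MAR"], widths["MDR"])          # 28 32
```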

\11. Instructions and data are stored in memory. How can the computer distinguish them?

Solution: the computer can distinguish instructions from data in the following two ways:

- By time period: what is fetched during the instruction fetch stage (or the instruction-fetch microprogram) is an instruction, and what is fetched during the execution stage (or the corresponding microprogram) is data.

- By address source: what is fetched from the storage unit whose address is supplied by the PC is an instruction, and what is fetched from the storage unit whose address is supplied by the address code of an instruction is an operand.

Chapter 3 System bus

\1. What is a bus? What are the characteristics of bus transmission? To reduce the bus load, what characteristics should the components on the bus have?

Answer (P41): a bus is a transmission component shared by multiple components.

The characteristic of bus transmission is that only one source of information can be transmitted on the bus at any given moment, that is, the bus is time-shared.

To reduce the bus load, the components on the bus should be connected to it through tri-state driver and buffer circuits.

\4. Why is bus arbitration (priority control) needed? What are the common centralized bus arbitration methods, and what are the characteristics of each? Which method has the fastest response time? Which is most sensitive to circuit failure?

Answer: bus arbitration decides which component gets the bus when several components request it at the same time.

There are three common centralized arbitration methods: chain (daisy-chain) query, counter-timed query and independent request.

Features: the chain query method needs few lines and is easy to expand, but is the most sensitive to circuit faults. The counter-timed query method has flexible priority settings and is not sensitive to faults, but its wiring and control are more complex. The independent request method responds fastest, but needs the most hardware and the most lines, so its cost is high.

\5. Explain the following concepts: bus width, bus bandwidth, bus multiplexing, bus master (or master module), bus slave (or slave module), bus transmission cycle and bus communication control.

Answer: P46.

Bus width: usually the number of lines in the data bus.

Bus bandwidth: the data transmission rate of the bus, i.e. the number of bits of data transmitted on the bus per unit time.

Bus multiplexing: the same signal lines carry different signals at different times (time division).

Bus master (master module): the device (module) that holds control of the bus during a bus transmission.

Bus slave (slave module): the device (module) that cooperates with the master to complete the data transmission during a bus transmission. It can only passively accept commands from the master.

Bus transmission cycle: the time the bus needs to complete one complete and reliable transmission.

Bus communication control: the way the two communicating parties coordinate their timing during a bus transmission.

\6. Compare synchronous communication with asynchronous communication.

A: Synchronous communication: communication controlled by a unified clock. The control is simple but inflexible. When the speeds of the components in the system differ greatly, bus efficiency drops markedly, so it suits cases where the speeds differ little.

Asynchronous communication: communication without a unified clock, in which the components coordinate by request and answer (handshake) signals. The control is more complex than in synchronous communication but more flexible, and it helps bus efficiency when the speeds of the components differ greatly.

\7. Draw a diagram to illustrate the interlocking relationships between request and answer signals in asynchronous communication.

[Figure: request and answer signals in the non-interlocked, half-interlocked and fully interlocked modes]

1) Non-interlocked mode:

After the master module sends the request signal, it does not wait for the slave module's answer signal; after a period of time, taking it that the slave has received the request, it withdraws the request signal. After the slave module receives the request signal, it sends an answer signal when conditions permit and, after a period of time (which differs from device to device), taking it that the master has received the answer, it withdraws the answer signal on its own.

2) Half-interlocked mode:

After the master module sends a request signal, it may withdraw the request only after receiving the answer signal from the slave module, so this side is interlocked. The slave module sends an answer signal after receiving the request signal, but does not wait for the master's request to be withdrawn; it withdraws its answer on its own after a period of time, so this side is not interlocked. Because only one side is interlocked, this is called the half-interlocked mode.

3) Fully interlocked mode:

After the master module sends the request signal, it may withdraw the request only after the slave module answers. After the slave module sends the answer signal, it may withdraw the answer only after it learns that the master's request has been withdrawn. Both sides are interlocked, so this is called the fully interlocked mode.

\8. Why does semi synchronous communication retain the characteristics of synchronous communication and asynchronous communication at the same time?

A: semi-synchronous communication is controlled by a unified clock, as synchronous communication is, but it also allows transfers of differing duration, as asynchronous communication does. Its efficiency therefore lies between the two.

3.14 Let the clock frequency of the bus be 8 MHz, with one bus cycle equal to one clock cycle. If 16 bits of data are transferred in parallel in one bus cycle, what is the bandwidth of the bus?
Solution: bus width = 16 bits / 8 = 2 B; bus bandwidth = 8 MHz × 2 B = 16 MB/s

3.15 In a 32-bit bus system, the clock frequency of the bus is 66 MHz. Assuming that the shortest bus transmission cycle is 4 clock cycles, calculate the maximum data transmission rate of the bus. What measures can be taken to raise the data transmission rate?
Solution 1: bus width = 32 bits / 8 = 4 B; clock cycle = 1 / 66 MHz ≈ 0.015 µs
Minimum bus transmission cycle = 0.015 µs × 4 = 0.06 µs
Maximum data transmission rate of the bus = 4 B / 0.06 µs ≈ 66.67 MB/s (the excess over 66 MB/s comes from rounding the clock cycle to 0.015 µs)

Solution 2: bus working frequency = 66 MHz / 4 = 16.5 MHz; maximum data transmission rate of the bus = 16.5 MHz × 4 B = 66 MB/s
To raise the data transmission rate of the bus, increase the bus clock frequency, reduce the number of clock cycles per bus cycle, or increase the bus width.
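The bandwidth arithmetic of exercises 3.14 and 3.15 can be sketched in a few lines of Python. The helper name `bus_bandwidth_mb_s` is an assumption made for illustration, and 1 MB is taken as 10^6 bytes, as in the solutions.

```python
def bus_bandwidth_mb_s(clock_mhz: float, width_bits: int,
                       clocks_per_transfer: int = 1) -> float:
    """Peak bus bandwidth in MB/s (1 MB = 10^6 bytes)."""
    # Millions of bus transfers completed per second.
    transfers_per_us = clock_mhz / clocks_per_transfer
    # Each transfer moves width_bits / 8 bytes.
    return transfers_per_us * width_bits / 8

print(bus_bandwidth_mb_s(8, 16))       # exercise 3.14 -> 16.0
print(bus_bandwidth_mb_s(66, 32, 4))   # exercise 3.15 -> 66.0
```

Doubling `width_bits`, raising `clock_mhz` or lowering `clocks_per_transfer` each raises the result, which mirrors the three measures listed in the answer.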

3.16 In an asynchronous serial transmission system, the character format is: 1 start bit, 8 data bits, 1 check bit and 2 stop bits. If 120 characters must be transmitted per second, find the baud rate and the bit rate.
Solution: one frame = 1 + 8 + 1 + 2 = 12 bits; baud rate = 120 frames/s × 12 bits = 1440 baud
Bit rate = 1440 baud × (8 / 12) = 960 bps, or equivalently: bit rate = 120 frames/s × 8 bits = 960 bps
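A small Python sketch of the same calculation, separating the line rate (all bits in the frame) from the useful rate (data bits only). The function and parameter names are illustrative, not taken from the textbook.

```python
def serial_rates(chars_per_second: int, start_bits: int, data_bits: int,
                 parity_bits: int, stop_bits: int) -> tuple:
    """Return (baud rate, effective bit rate) for an asynchronous serial frame."""
    frame_bits = start_bits + data_bits + parity_bits + stop_bits
    baud = chars_per_second * frame_bits     # every bit on the line counts
    bit_rate = chars_per_second * data_bits  # only the data bits are useful
    return baud, bit_rate

print(serial_rates(120, 1, 8, 1, 2))   # (1440, 960)
```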

Chapter 4 Memory

\1. Explain the concepts: main memory, auxiliary memory, cache, RAM, SRAM, DRAM, ROM, PROM, EPROM, EEPROM, CD-ROM, flash memory.

A: Main memory: stores the programs and data currently being executed. The CPU can read and write it directly and randomly, and its access speed is high.

Auxiliary memory: stores programs and data that are not currently being executed, as well as information that needs to be kept permanently.

Cache: a high-speed buffer memory placed between the CPU and main memory to resolve the speed mismatch between them.

RAM: semiconductor random access memory, mainly used as main memory in computers.

SRAM: static semiconductor random access memory.

DRAM: dynamic semiconductor random access memory.

ROM: mask-programmed semiconductor read-only memory. Its contents are written by the chip manufacturer at the time of manufacture and can afterwards only be read, not written.

PROM: programmable read-only memory. The user writes its contents as needed, and it can be written only once.

EPROM: ultraviolet-erasable programmable read-only memory. To modify the contents, all of them are first erased and the chip is then reprogrammed. Erasure relies on ultraviolet light to discharge the charge on the floating gates.

EEPROM: electrically erasable programmable read-only memory.

CD-ROM: read-only compact disc.

Flash memory: also called fast-erase memory.

\3. Where is the memory hierarchy mainly reflected? Why divide these levels? How do computers manage these levels?

A: the memory hierarchy is mainly reflected in two levels: cache-main memory and main memory-auxiliary memory.

The cache-main memory level mainly speeds up CPU memory access: viewed as a whole, the CPU's memory access speed approaches that of the cache, while the addressable space and price per bit approach those of main memory.

The main memory-auxiliary memory level mainly expands capacity: from the programmer's point of view, the capacity and price per bit of the memory in use approach those of auxiliary memory, while the speed approaches that of main memory.

Taken together, the two levels give the whole storage system high speed, large capacity and low price per bit.

Information transfer between main memory and cache is performed automatically by hardware. Transfer between main memory and auxiliary memory is now widely implemented with virtual memory: part of main memory and auxiliary memory are combined by software and hardware into a virtual memory. Programmers can program in this virtual address space (logical address space), which is much larger than the actual main memory space (physical address space). When the program runs, software and hardware automatically translate virtual addresses into physical main memory addresses. Scheduling and translation at both levels are therefore transparent to the programmer.

\4. Explain the difference between access cycle and access time.

Solution: the main difference between access cycle and access time is that the access time is only the time to complete one operation, and the access cycle includes not only the operation time, but also the recovery time of the line after operation. Namely:

Access cycle = access time + recovery time

\5. What is memory bandwidth? If the data bus width of the memory is 32 bits and the access cycle is 200ns, what is the memory bandwidth?

Solution: the bandwidth of memory refers to the maximum amount of information in and out of memory in unit time.

Memory bandwidth = 32 bits / 200 ns = 160 Mbit/s = 20 MB/s = 5M words/s

Note: the word length is 32 bits, not 16 bits. (1 ns = 10^-9 s)
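The unit conversions can be checked with integer arithmetic. This is only a sketch; `memory_bandwidth_bits_per_s` is a hypothetical helper name.

```python
def memory_bandwidth_bits_per_s(bus_bits: int, cycle_ns: int) -> int:
    """Bits moved per second when one bus-wide word is transferred per access cycle."""
    return bus_bits * 10**9 // cycle_ns   # 10^9 ns in one second

bw = memory_bandwidth_bits_per_s(32, 200)
print(bw, bw // 8, bw // 32)   # 160000000 20000000 5000000  (bits/s, bytes/s, words/s)
```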

\6. A machine has a word length of 32 bits and a storage capacity of 64 KB. What is the addressing range when addressing by word? If main memory is addressed by byte, draw the allocation of word addresses and byte addresses in main memory.

Solution: with a storage capacity of 64 KB, the addressing range by byte is 64K.

When addressing by word, the addressing range is 64K / (32 / 8) = 16K.

Allocation of main memory word addresses and byte addresses: as shown in the figure.

\7. For a memory with a capacity of 16K × 32 bits, what is the total number of address lines and data lines? How many chips are needed when each of the following memory chips is used?

1K × 4 bits, 2K × 8 bits, 4K × 4 bits, 16K × 1 bit, 4K × 8 bits, 8K × 8 bits

Solution: total number of address lines and data lines = 14 + 32 = 46 (16K = 2^14, so 14 address lines; 32 data lines).

The number of chips required for each chip type is:

1K × 4: (16K × 32) / (1K × 4) = 16 × 8 = 128 chips

2K × 8: (16K × 32) / (2K × 8) = 8 × 4 = 32 chips

4K × 4: (16K × 32) / (4K × 4) = 4 × 8 = 32 chips

16K × 1: (16K × 32) / (16K × 1) = 1 × 32 = 32 chips

4K × 8: (16K × 32) / (4K × 8) = 4 × 4 = 16 chips

8K × 8: (16K × 32) / (8K × 8) = 2 × 4 = 8 chips
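The chip counts follow one rule: the number of chip groups needed for word expansion times the number of chips per group needed for bit expansion. A minimal Python sketch, with the hypothetical helper `chips_needed`:

```python
def chips_needed(mem_words: int, mem_bits: int,
                 chip_words: int, chip_bits: int) -> int:
    """Chips required to build a mem_words x mem_bits memory from one chip type."""
    word_groups = -(-mem_words // chip_words)   # ceiling division: word expansion
    bit_slices = -(-mem_bits // chip_bits)      # ceiling division: bit expansion
    return word_groups * bit_slices

K = 1024
for chip in [(1 * K, 4), (2 * K, 8), (4 * K, 4), (16 * K, 1), (4 * K, 8), (8 * K, 8)]:
    print(chip, chips_needed(16 * K, 32, *chip))
```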

\9. What is refresh? Why is refresh needed? Describe the refresh methods.

Solution: refresh: the process of periodically rewriting all the storage cells of a DRAM.

Reason for refresh: capacitor leakage causes the information stored in a DRAM to decay, and the charge must be replenished in time, so refresh operations are scheduled periodically.

There are three common refresh methods: centralized, distributed (decentralized) and asynchronous.

Centralized: within the maximum refresh interval, a block of time is set aside to refresh all rows, during which the CPU cannot access the memory (dead time).

Distributed: a refresh cycle is inserted after each read/write cycle, so there is no CPU access dead time.

Asynchronous: a compromise between the centralized and distributed methods.

\10. How many decoding and driving modes are there for semiconductor memory chips?

Solution: semiconductor memory chips use two decoding-and-drive methods: the linear selection method and the coincidence method.

Linear selection method: each decoded address signal selects all the bits of one word. The structure is simple, but it uses a lot of hardware.

Coincidence method: the address is decoded separately as a row address and a column address, and the cell at the intersection of the selected row and column lines is the selected cell. Because it locates a cell by the coincidence of the row and column decode signals, it is also called matrix decoding. It greatly reduces hardware and is the most commonly used decoding-and-drive method.

\11. An 8K × 8-bit dynamic RAM chip is organized internally as a 256 × 256 array, and its access cycle is 0.1 μs. What is the refresh interval under centralized, distributed (decentralized) and asynchronous refresh?

Solution: with centralized refresh, the refresh interval is 2 ms, of which the refresh dead time is 256 × 0.1 μs = 25.6 μs.

With distributed refresh, the refresh interval is 256 × (0.1 μs + 0.1 μs) = 51.2 μs.

With asynchronous refresh, the refresh interval is 2 ms (one row is refreshed every 2 ms / 256 ≈ 7.8 μs).
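The three figures can be reproduced with a short Python sketch. It assumes, as the solution does, a maximum refresh interval of 2 ms; the function name and dictionary keys are illustrative only, and times are in nanoseconds to keep the arithmetic exact.

```python
def refresh_figures(rows: int, cycle_ns: int, max_interval_ns: int) -> dict:
    """Key timing figures for the three DRAM refresh schemes (times in ns)."""
    return {
        # Centralized: all rows refreshed back to back once per interval.
        "centralized_dead_time_ns": rows * cycle_ns,
        # Distributed: one refresh cycle follows every read/write cycle.
        "distributed_interval_ns": rows * 2 * cycle_ns,
        # Asynchronous: row refreshes spread evenly over the maximum interval.
        "async_row_spacing_ns": max_interval_ns / rows,
    }

print(refresh_figures(rows=256, cycle_ns=100, max_interval_ns=2_000_000))
```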

\15. A CPU has 16 address lines and 8 data lines. It uses MREQ (active low) as the memory access control signal and R/W as the read/write command signal (high for read, low for write). The following memory chips are available: ROM (2K × 8-bit, 4K × 4-bit, 8K × 8-bit), RAM (1K × 4-bit, 2K × 8-bit, 4K × 8-bit), plus a 74138 decoder and other gates (choose the gates as needed). Select suitable chips from these and draw the connection diagram between the CPU and the memory chips. Requirements:

(1) The lowest 4K addresses are the system program area, and addresses 4096 to 16383 are the user program area.

(2) State the type and number of memory chips selected.

(3) Draw the chip-select logic in detail.

Solution: (1) address space allocation:

System program area (ROM, 4 KB in total): 0000H-0FFFH

User program area (RAM, 12 KB in total): 1000H-3FFFH

(2) Chip selection: ROM: two 4K × 4-bit chips in parallel (bit expansion to 8 bits).

RAM: three 4K × 8-bit chips in series (word expansion). RAM1 address range: 1000H-1FFFH; RAM2 address range: 2000H-2FFFH; RAM3 address range: 3000H-3FFFH.

(3) The binary addresses of each chip are allocated as follows:

Chip    Row    A15 A14 A13 A12 A11 A10 A9  A8  A7  A6  A5  A4  A3  A2  A1  A0
ROM1,2  start  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
        end    0   0   0   0   1   1   1   1   1   1   1   1   1   1   1   1
RAM1    start  0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0
        end    0   0   0   1   1   1   1   1   1   1   1   1   1   1   1   1
RAM2    start  0   0   1   0   0   0   0   0   0   0   0   0   0   0   0   0
        end    0   0   1   0   1   1   1   1   1   1   1   1   1   1   1   1
RAM3    start  0   0   1   1   0   0   0   0   0   0   0   0   0   0   0   0
        end    0   0   1   1   1   1   1   1   1   1   1   1   1   1   1   1

The logic diagram of the CPU-memory connection and the chip-select logic are shown in Figure (3):

Figure (3)

\17. Write the Hamming codes corresponding to 1100, 1101, 1110 and 1111.

Solution: the effective information has n = 4 bits; let it be b4 b3 b2 b1.

Number of check bits: k = 3 (from 2^k ≥ n + k + 1).

Let the check bits be c1, c2 and c3. The Hamming code then has 4 + 3 = 7 bits: c1 c2 b4 c3 b3 b2 b1.

The check bits occupy bit positions 1, 2 and 4 of the Hamming code. Under even parity:

c1=b4⊕b3⊕b1

c2=b4⊕b2⊕b1

c3=b3⊕b2⊕b1

When the valid information is 1100, c3c2c1 = 110 and Hamming code is 0111100.

When the valid information is 1101, c3c2c1 = 001, and the Hamming code is 1010101.

When the valid information is 1110, c3c2c1 = 000 and the Hamming code is 0010110.

When the valid information is 1111, c3c2c1 = 111 and Hamming code is 1111111.
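The encoding rule can be written directly from the three check-bit equations. The sketch below uses the bit layout c1 c2 b4 c3 b3 b2 b1 given in the solution; the function name `hamming74_encode` is an illustrative choice.

```python
def hamming74_encode(data: str) -> str:
    """Encode 4 data bits 'b4b3b2b1' as the 7-bit code c1 c2 b4 c3 b3 b2 b1 (even parity)."""
    b4, b3, b2, b1 = (int(ch) for ch in data)
    c1 = b4 ^ b3 ^ b1   # covers positions 3, 5, 7
    c2 = b4 ^ b2 ^ b1   # covers positions 3, 6, 7
    c3 = b3 ^ b2 ^ b1   # covers positions 5, 6, 7
    return "".join(str(bit) for bit in (c1, c2, b4, c3, b3, b2, b1))

for word in ("1100", "1101", "1110", "1111"):
    print(word, hamming74_encode(word))
# 1100 0111100
# 1101 1010101
# 1110 0010110
# 1111 1111111
```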

\18. The received Hamming codes (constructed with even parity) are 1100100, 1100111, 1100000 and 1100001. Check whether each code contains an error. If so, which bit is wrong?

Solution: let the received Hamming code be c1' c2' b4' c3' b3' b2' b1'.

The error check is carried out as follows:

P1 = c1' ⊕ b4' ⊕ b3' ⊕ b1'

P2 = c2' ⊕ b4' ⊕ b2' ⊕ b1'

P3 = c3' ⊕ b3' ⊕ b2' ⊕ b1'

Read as a binary number, P3P2P1 gives the position of the erroneous bit (000 means no error).

If the received Hamming code is 1100100, then P3P2P1 = 110, so the code is in error: bit 6 (b2') is wrong, the corrected code is 1100110, and the valid information is 0110.

If the received Hamming code is 1100111, then P3P2P1 = 111, so the code is in error: bit 7 (b1') is wrong, the corrected code is 1100110, and the valid information is 0110.

If the received Hamming code is 1100000, then P3P2P1 = 011, so the code is in error: bit 3 (b4') is wrong, the corrected code is 1110000, and the valid information is 1000.

If the received Hamming code is 1100001, then P3P2P1 = 100, so the code is in error: bit 4 (c3') is wrong, the corrected code is 1101001, and the valid information is 0001.
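The check can be mechanized as follows. This minimal sketch evaluates P1, P2 and P3 exactly as defined above and treats P3P2P1 as the 1-based position of the erroneous bit (P1 has weight 1 because it covers positions 1, 3, 5 and 7). The function name `hamming74_check` is an assumption for illustration, and only single-bit errors are handled.

```python
def hamming74_check(code: str) -> tuple:
    """Return (error position, corrected data bits b4b3b2b1); position 0 means no error."""
    bits = [int(ch) for ch in code]        # bits[0] is bit position 1
    c1, c2, b4, c3, b3, b2, b1 = bits
    p1 = c1 ^ b4 ^ b3 ^ b1
    p2 = c2 ^ b4 ^ b2 ^ b1
    p3 = c3 ^ b3 ^ b2 ^ b1
    position = p3 * 4 + p2 * 2 + p1        # syndrome P3P2P1 as a number
    if position:
        bits[position - 1] ^= 1            # flip the erroneous bit
    data = "".join(str(bits[i]) for i in (2, 4, 5, 6))   # b4, b3, b2, b1
    return position, data

for received in ("1100100", "1100111", "1100000", "1100001"):
    print(received, hamming74_check(received))
```

A useful cross-check is that the syndrome equals the XOR of the positions of all 1 bits in the received word.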

\24. For a 4-bank low-order interleaved memory with storage cycle T, in which the CPU starts one bank every 1/4 of a storage cycle, how many storage cycles are needed to access 64 words in sequence?

Solution: let the bus transmission cycle of the 4-bank low-order interleaved memory be τ, with τ = T / 4. The time needed to access 64 words in sequence is:

t = T + (64 - 1)τ = T + 63T/4 = 16.75T
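The formula t = T + (n - 1)τ says that the first word costs a full storage cycle and each later word arrives one τ after the previous one. A small Python sketch, expressing time in units of T and using exact fractions (the function name is illustrative):

```python
from fractions import Fraction

def interleaved_access_time(words: int, banks: int) -> Fraction:
    """Time, in units of the storage cycle T, to read `words` consecutive words."""
    tau = Fraction(1, banks)        # one bank is started every T / banks
    return 1 + (words - 1) * tau    # first word takes T, each later word adds tau

print(float(interleaved_access_time(64, 4)))   # 16.75
```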

\25. What is “locality of program access”? Which level in the storage system uses the locality principle of program access?

A: locality of program access means that, in time, recently accessed instructions and data are likely to be accessed again within a short period; in space, the instructions and data accessed tend to be concentrated in a small storage region; and in access order, sequential instruction execution is much more likely than a branch (about 5:1). Both the cache-main memory level and the main memory-auxiliary memory level of the storage system rely on the locality of program access.

\28. Let the main memory capacity be 256K words, the cache capacity 2K words and the block length 4 words.

(1) Design the cache address format. How many blocks can the cache hold?

(2) Design the main memory address format under direct mapping.

(3) Design the main memory address format under four-way set-associative mapping.

(4) Design the main memory address format under fully associative mapping.

(5) If the storage word length is 32 bits and the memory is byte-addressed, give the main memory address format under each of the three mapping modes.

Solution: (1) the cache capacity is 2K words and the block length is 4 words, so there are 2K / 4 = 2^11 / 2^2 = 2^9 = 512 blocks in total.

The cache block address is 9 bits and the address within a block is 2 bits.

The cache address format is therefore:

Cache block address (9 bits) | Address in block (2 bits)

(2) The main memory capacity is 256K words = 2^18 words, so the main memory address has 18 bits, and main memory is divided into 256K / 4 = 2^16 blocks.

The main memory block tag is 18 - 9 - 2 = 7 bits.

The main memory address format under direct mapping is:

Main memory block tag (7 bits) | Cache block address (9 bits) | Address in block (2 bits)

(3) With four-way set-associative mapping there are four blocks per set, so the cache is divided into 512 / 4 = 128 = 2^7 sets.

The main memory block tag is 18 - 7 - 2 = 9 bits, and the main memory address format is:

Main memory block tag (9 bits) | Set address (7 bits) | Address in block (2 bits)

(4) Under fully associative mapping the main memory block tag is 18 - 2 = 16 bits, and the address format is:

Main memory block tag (16 bits) | Address in block (2 bits)

(5) If the storage word length is 32 bits (4 bytes) and the memory is byte-addressed, the main memory capacity is 256K × 32 / 8 = 2^20 B, so the main memory address has 20 bits.

The cache capacity is 2K × 32 / 8 = 2^13 B, the block length is 4 × 32 / 8 = 16 B = 2^4 B, and the address within a block is 4 bits.

Under direct mapping the main memory block tag is 20 - 9 - 4 = 7 bits, and the main memory address format is:

Main memory block tag (7 bits) | Cache block address (9 bits) | Address in block (4 bits)

Under four-way set-associative mapping the main memory block tag is 20 - 7 - 4 = 9 bits, and the main memory address format is:

Main memory block tag (9 bits) | Set address (7 bits) | Address in block (4 bits)

Under fully associative mapping the main memory block tag is 20 - 4 = 16 bits, and the main memory address format is:

Main memory block tag (16 bits) | Address in block (4 bits)
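All three mapping modes split the main memory address the same way; only the number of sets changes (one block per set for direct mapping, a single set for fully associative mapping). The sketch below computes the field widths for the word-addressed case of parts (2) to (4). The function name `address_fields` and its optional `bytes_per_word` argument, which switches to byte addressing, are assumptions made for illustration, and all sizes are assumed to be powers of two.

```python
def address_fields(main_words: int, cache_words: int, block_words: int,
                   ways: int, bytes_per_word: int = 1) -> tuple:
    """Return (tag bits, index bits, in-block offset bits) of a main memory address."""
    def log2(n: int) -> int:
        return n.bit_length() - 1               # exact for powers of two

    offset = log2(block_words * bytes_per_word)  # address within a block
    total = log2(main_words * bytes_per_word)    # full main memory address width
    blocks = cache_words // block_words          # number of cache blocks
    index = log2(blocks // ways)                 # block address or set address
    return total - index - offset, index, offset

K = 1024
print(address_fields(256 * K, 2 * K, 4, ways=1))     # direct mapping     -> (7, 9, 2)
print(address_fields(256 * K, 2 * K, 4, ways=4))     # four-way           -> (9, 7, 2)
print(address_fields(256 * K, 2 * K, 4, ways=512))   # fully associative  -> (16, 0, 2)
```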

\29. Suppose that while executing a program the CPU accesses the cache 4800 times and main memory 200 times. The cache access cycle is 30 ns and the main memory access cycle is 150 ns. Find the cache hit rate, the average access time and the efficiency of the cache-main memory system. By how many times does the cache improve the performance of the system?

Solution: cache hit rate: h = 4800 / (4800 + 200) = 24 / 25 = 96%

Average access time of the cache-main memory system: ta = 0.96 × 30 ns + (1 - 0.96) × 150 ns = 34.8 ns

Access efficiency of the cache-main memory system: e = tc / ta × 100% = 30 / 34.8 × 100% = 86.2%

Performance is 150 ns / 34.8 ns = 4.31 times the original, that is, an improvement of 3.31 times.
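The four quantities in this solution can be computed together. A minimal Python sketch; the function name and the order of the returned values are illustrative choices.

```python
def cache_performance(cache_hits: int, memory_accesses: int,
                      t_cache_ns: float, t_main_ns: float) -> tuple:
    """Return (hit rate, average access time in ns, efficiency, speedup over no cache)."""
    h = cache_hits / (cache_hits + memory_accesses)
    t_avg = h * t_cache_ns + (1 - h) * t_main_ns   # weighted by hit and miss probability
    efficiency = t_cache_ns / t_avg
    speedup = t_main_ns / t_avg
    return h, t_avg, efficiency, speedup

h, t_avg, e, s = cache_performance(4800, 200, 30, 150)
print(f"{h:.2%} {t_avg:.1f} ns {e:.1%} {s:.2f}x")   # 96.00% 34.8 ns 86.2% 4.31x
```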

\30. A set-associative cache consists of 64 blocks, with 4 blocks per set. Main memory contains 4096 blocks, each block consists of 128 words, and memory is addressed by word. How many bits are there in the main memory address and in the cache address? Draw the main memory address format.

Solution: number of cache sets: 64 / 4 = 16; cache capacity: 64 × 128 = 2^13 words, so the cache address is 13 bits.

Main memory is divided into 4096 / 16 = 256 regions of 16 blocks each.

Main memory capacity: 4096 × 128 = 2^19 words, so the main memory address is 19 bits, with the following format:

Main memory block tag (8 bits) | Set address (4 bits) | Address in block (7 bits)

\32. Suppose a machine has a main memory capacity of 4 MB and a cache capacity of 16 KB. Each block has 8 words and each word has 32 bits. Design a four-way set-associative cache organization (that is, each cache set holds 4 blocks).

(1) Draw the main memory address fields and give the number of bits in each.

(2) The cache is initially empty. The CPU reads 90 words in sequence from main memory units 0, 1, 2, …, 89 (one word at a time) and reads them in this order 8 times in total. What is the hit rate?

(3) If the cache is 6 times as fast as main memory, by how many times does the cache increase the speed compared with having no cache?

Solution: (1) each block has 8 words and each word has 32 bits (4 bytes), so the address within a block is 3 + 2 = 5 bits.

The cache capacity is 16 KB = 2^14 B and the block size is 8 × 32 / 8 = 32 B = 2^5 B, so the cache address has 14 bits and the cache holds 2^(14-5) = 2^9 blocks.

With four-way set-associative mapping, the cache is divided into 2^9 / 2^2 = 2^7 sets.

The main memory capacity is 4 MB = 2^22 B, so the main memory address has 22 bits and the main memory block tag is 22 - 7 - 5 = 10 bits. The main memory address format is therefore:

Main memory block tag (10 bits) | Set address (7 bits) | Address in block (5 bits)

(2) Since each block has 8 words and the cache is initially empty, the CPU misses when it reads unit 0 and must access main memory, which also loads the main memory block containing that word into one of the blocks of cache set 0. The CPU then hits when reading units 1 to 7. In the same way, the CPU misses when reading units 8, 16, …, 88. So the CPU misses 12 times during the first pass over the 90 words, and every access in the following 7 passes hits. The hit rate is:

h = (90 × 8 - 12) / (90 × 8) = 708 / 720 ≈ 98.3%

(3) Let the cache cycle be t, so the main memory cycle is 6t. Without a cache the access time is 6t × 90 × 8 = 4320t. With the cache the access time is t × (90 × 8 - 12) + 6t × 12 = 780t. The speed with the cache relative to no cache is:

4320t / 780t ≈ 5.54 times, that is, an improvement of about 4.54 times.
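The miss count in part (2) can be confirmed by simulating the cache. The sketch below models a four-way set-associative cache with 128 sets and 8-word blocks. The replacement policy is not stated in the exercise, so LRU is assumed here; it never triggers, because the 12 blocks touched fall into 12 different sets. The function name `simulate_reads` is illustrative.

```python
def simulate_reads(passes: int = 8, words: int = 90, words_per_block: int = 8,
                   sets: int = 128, ways: int = 4) -> tuple:
    """Count (hits, misses) for repeated sequential reads of word addresses 0..words-1."""
    cache = [[] for _ in range(sets)]    # per set: resident block numbers, LRU first
    hits = misses = 0
    for _ in range(passes):
        for word in range(words):
            block = word // words_per_block
            resident = cache[block % sets]
            if block in resident:
                hits += 1
                resident.remove(block)   # move to most-recently-used position
            else:
                misses += 1
                if len(resident) == ways:
                    resident.pop(0)      # evict least recently used (assumed policy)
            resident.append(block)
    return hits, misses

hits, misses = simulate_reads()
print(hits, misses)                                           # 708 12
print(round(hits / (hits + misses), 4))                       # 0.9833
print(round(6 * (hits + misses) / (hits + 6 * misses), 2))    # 5.54
```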

Chapter 5 Input and output system

\1. What are the I/O addressing methods? What are the characteristics of each?

Solution: there are two common I/O addressing methods: unified addressing of I/O and memory, and independent I/O addressing.

Features: with unified addressing, I/O addresses use the same format as main memory addresses, and I/O devices share one address space with main memory. The CPU can access an I/O device the same way it accesses main memory, so no special I/O instructions are needed.

With independent addressing, the machine assigns the I/O devices a set of addresses separate from the main memory addresses. The I/O address space and the main memory address space are then independent, and the CPU must use special I/O instructions to access the I/O address space.

3. What are the control modes when I/O devices and the host exchange information? Describe their characteristics.

Answer: (1) Program query mode. The host and the I/O device work serially. After the CPU starts the I/O device, it keeps querying whether the device is ready. If the device is ready, the CPU switches to the program that handles the transfer between the device and the host; if the device is not ready, the CPU repeats the query until it is. CPU efficiency in this mode is therefore very low.

(2) Program interrupt mode. The host and the I/O device work in parallel. After the CPU starts the I/O device, it does not need to keep querying whether the device is ready but continues to execute its program. When the device is ready, it sends an interrupt request signal to the CPU. The CPU responds to the request at an appropriate time and suspends the current program to serve the device. This mode eliminates the "marking time" (busy-waiting) phenomenon and improves CPU efficiency. (Characteristics: the CPU works in parallel with the device, and the transfer works serially with the main program.)

(3) DMA mode. The host and the I/O device work in parallel, and there is a direct data path between main memory and the I/O device. After the CPU starts the I/O device, it does not need to query whether the device is ready; when the device is ready, it issues a DMA request. The CPU does not take part in the data exchange between the device and main memory. It only hands the external bus (address lines, data lines and the relevant control lines) over to the DMA interface for a short time and can still carry out its own internal operations (such as addition and shifting). The current program therefore need not be interrupted; the CPU merely gives up memory access for one access cycle (cycle stealing), so CPU efficiency is high. (Characteristics: the CPU works in parallel with the device, and the transfer works in parallel with the main program.)

(4) Channel mode. A channel is a processor with special functions. The CPU delegates part of its authority to the channel, which manages the peripheral devices uniformly and handles the data exchange between the peripherals and main memory. This greatly improves CPU efficiency but costs more hardware.

(5) I/O processor mode. This is a further development of the channel mode. The CPU hands all management of I/O operations and peripheral devices over to the I/O processor. In essence it is a multiprocessor system, so efficiency is greatly improved.

\10. What is an I/O interface, and how does it differ from a port? Why are I/O interfaces needed? How are I/O interfaces classified?

Solution: an I/O interface generally refers to the connecting part between the CPU and an I/O device, while a port is a register in the I/O interface that the CPU can access. Ports plus the corresponding control logic make up the I/O interface.

I/O interfaces can be classified in many ways, mainly:

(1) By data transfer mode: parallel interfaces and serial interfaces;

(2) By the control mode of data transfer: program-controlled interfaces, program interrupt interfaces and DMA interfaces.

\12. Describe the working process of the program query mode with reference to its interface circuit.

Solution: the working process of the program query interface is as follows (taking input as an example):

1) The CPU sends the I/O address → address bus → interface → device selector decodes it → device selected, SEL signal sent → command-receiving gate opened;

2) The CPU sends the start command → D is set to 0, B is set to 1 → the interface sends the start command to the device → the device starts working;

3) The CPU waits while the input device reads out the data → DBR;

4) When the peripheral finishes its work, completion signal → interface → B is set to 0, D is set to 1;

5) Ready signal → control bus → CPU;

6) Input: the CPU fetches the data from the DBR through an input instruction (IN).

In the case of output, other operations are similar to input except that the data transmission direction is opposite. The working process is as follows:

1) The CPU sends the I/O address → address bus → interface → device selector decodes it → device selected, SEL signal sent → command-receiving gate opened;

2) Output: the CPU puts the data into the interface DBR through an output instruction (OUT);

3) The CPU sends the start command → D is set to 0, B is set to 1 → the interface sends the start command to the device → the device starts working;

4) The CPU waits while the output device takes the data from the DBR;

5) When the peripheral finishes its work, completion signal → interface → B is set to 0, D is set to 1;

6) Ready signal → control bus → CPU. The CPU can then output data to the interface DBR again through an instruction for the next transfer.
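The input handshake can be mimicked in a few lines of Python. This is a toy model under stated assumptions, not the textbook circuit: the class `QueryInterface`, the function `read_by_polling` and the `device_delay` parameter (how many polls pass before the device finishes) are all invented for illustration; only the flags B (busy), D (done) and the register DBR come from the text.

```python
class QueryInterface:
    """Toy model of a program-query interface with flags B, D and a data buffer DBR."""

    def __init__(self):
        self.B = 0        # busy flag
        self.D = 0        # done (ready) flag
        self.DBR = None   # data buffer register

    def start(self):
        # start command from the CPU: D is set to 0, B is set to 1
        self.D, self.B = 0, 1

    def device_finished(self, data):
        # the device delivers its data, then B is set to 0 and D to 1
        self.DBR = data
        self.B, self.D = 0, 1


def read_by_polling(interface, device_delay, data):
    """Start the device, poll D until it is ready, then take the data from DBR."""
    interface.start()
    polls = 0
    while interface.D == 0:            # the CPU does nothing useful while waiting
        polls += 1
        if polls == device_delay:      # the (simulated) device completes its work
            interface.device_finished(data)
    return interface.DBR, polls


iface = QueryInterface()
value, polls = read_by_polling(iface, device_delay=3, data=0x41)
print(value, polls, iface.B, iface.D)
```

The `polls` counter makes the cost of this mode visible: every iteration of the loop is CPU time spent only on checking the D flag.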

\13. Explain the difference and relationship between interrupt vector address and entry address.

Solution: the difference between the interrupt vector address and the entry address:

The vector address is the memory address number assigned to an interrupt source and generated by a hardware circuit (the vector encoder); the interrupt entry address is the first address of the interrupt service program.

The relationship between the interrupt vector address and the entry address:

The interrupt vector address can be understood as an indicator of the interrupt service program entry address (the address of the entry address); the entry address is obtained through it. (Two methods: place a JMP instruction in the unit indicated by the vector address, or set up a vector address table in main memory. Refer to 8.4.3.)

\16. Under what conditions and at what time can the CPU respond to I/O interrupt requests?

Solution: the condition and time for the CPU to respond to an I/O interrupt request are: when the interrupt enable state is 1 (EINT = 1) and at least one interrupt request has been detected, the CPU responds to the interrupt at the end of the execution of an instruction.

\28. Is the response time of CPU to DMA request and interrupt request the same? Why?

Solution: the CPU's response times to DMA requests and interrupt requests are different. Because the data exchange speeds of the two modes differ greatly, the CPU must query and respond to DMA requests at shorter intervals. An interrupt request is answered at the end of an instruction execution cycle, whereas a DMA request is answered at the end of a memory access cycle.

The interrupt mode is a program switch, and a program is made up of instructions, so an interrupt request can only be answered after an instruction has finished executing. Moreover, the CPU sends the interrupt query signal only at the end of each instruction execution cycle; if the conditions are met at that moment, it responds to the interrupt request.

A DMA request is issued by the DMA interface, according to the working state of the device, to ask the CPU for the bus. If the CPU is not using the bus at that moment, the DMA request is answered immediately; if the CPU is using the bus, it must wait until the end of the current access cycle before handing over the bus.

\31. Suppose that the highest frequency at which a device transmits information to the CPU is 40000 times/s, and the execution time of the corresponding interrupt handler is 40 µs. Can the peripheral exchange information with the host by program interrupt? Why?

Solution: the time interval between transfers from the device to the CPU = 1 / 40000 s = 0.025 × 10^-3 s = 25 µs < 40 µs

Therefore the peripheral cannot exchange information with the host by program interrupt, because the interrupt handler executes more slowly than the peripheral delivers data.

\32. Suppose the disk memory rotates at 3000 rpm and is divided into 8 sectors, each sector stores 1K bytes, and the data path between main memory and the disk memory is 16 bits wide (16 bits per transfer). Assuming that the maximum execution time of an instruction is 25 µs, can the scheme of responding to DMA requests at the end of instruction execution be adopted? Why? If not, what should be done?

Solution: first calculate the disk transfer rate, then compare it with the instruction execution speed to draw a conclusion.

Track capacity = 1K × 8 × 8 bits = 8 KB = 4K words

Data rate = 4K words × 3000 rpm = 4K words × 50 r/s = 200K words/s

Transfer time of one word = 1 / 200K s ≈ 5 µs (Note: 1K = 1024 here, following the convention for data block units.)

Because 5 µs << 25 µs, the scheme of responding to DMA requests at the end of an instruction cannot be used. DMA requests should instead be queried and answered at the end of each CPU machine cycle (usually the CPU machine cycle equals the main memory access cycle).
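The same arithmetic as a short Python sketch, assuming 1K = 1024 as in the note above and a maximum instruction time of 25 µs; the function name `word_transfer_time_us` is made up for illustration.

```python
def word_transfer_time_us(rpm, sectors, bytes_per_sector, word_bits):
    """Time in microseconds between two consecutive words arriving from the disk."""
    track_words = sectors * bytes_per_sector * 8 // word_bits   # words per track
    revs_per_second = rpm / 60
    words_per_second = track_words * revs_per_second
    return 1e6 / words_per_second


t_word = word_transfer_time_us(rpm=3000, sectors=8, bytes_per_sector=1024, word_bits=16)
max_instruction_us = 25   # longest instruction, in microseconds
print(round(t_word, 2), t_word < max_instruction_us)
```

A new word arrives roughly every 5 µs, so waiting up to 25 µs for an instruction to finish would lose data.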

Chapter 6 computer operation methods

\5. Given [x]complement, find [x]original and x.

[x1]complement = 1.1100; [x2]complement = 1.1001; [x3]complement = 0.1110; [x4]complement = 1.0000;

[x5]complement = 1,0101; [x6]complement = 1,1100; [x7]complement = 0,0111; [x8]complement = 1,0000;

The correspondence between [x]complement, [x]original and x is as follows:

[x]complement   1.1100    1.1001    0.1110    1.0000   1,0101   1,1100   0,0111   1,0000
[x]original     1.0100    1.0111    0.1110    none     1,1011   1,0100   0,0111   none
x               -0.0100   -0.0111   +0.1110   -1       -1011    -0100    +0111    -10000
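The conversion can be sketched in Python as a check on the table. The function name `complement_to_original` is invented for illustration; it takes the bits with the separator (point or comma) removed, so the same routine serves fractions and integers.

```python
def complement_to_original(bits):
    """Convert a complement-code bit string (sign bit first) to original code.

    Returns None when the value has no original-code form, e.g. 1.0000 or 1,0000.
    """
    sign, magnitude = bits[0], bits[1:]
    if sign == '0':
        return bits                              # positive: both codes are identical
    width = len(magnitude)
    value = (1 << width) - int(magnitude, 2)     # magnitude of the negative number
    if value == 1 << width:
        return None                              # most negative value: not representable
    return '1' + format(value, '0{}b'.format(width))


for code in ['11100', '11001', '01110', '10000', '10101']:
    print(code, '->', complement_to_original(code))
```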

\9. When the hexadecimal numbers 9BH and FFH are interpreted as original code (sign-magnitude), complement code, inverse code (ones' complement), shift code (offset code) and unsigned number, what decimal numbers do they represent (assume the machine number has one sign bit)?

Solution: the correspondence between the machine numbers and their true values is as follows:

                      Original code   Complement   Inverse code   Shift code   Unsigned number
9BH  decimal value    -27             -101         -100           +27          155
FFH  decimal value    -127            -1           -0             +127         255
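The table can be verified with a small Python sketch for 8-bit machine numbers. The function name `interpretations` and the dictionary keys are assumptions made for illustration. Python integers have no negative zero, so the inverse-code value of FFH appears as 0.

```python
def interpretations(byte):
    """Decimal value of an 8-bit pattern under five encodings (1 sign bit + 7 value bits)."""
    sign, magnitude = byte >> 7, byte & 0x7F
    return {
        "original": -magnitude if sign else magnitude,          # sign-magnitude
        "complement": byte - 256 if sign else byte,             # two's complement
        "inverse": -(magnitude ^ 0x7F) if sign else magnitude,  # ones' complement
        "shift": byte - 128,                                    # offset code, bias 2^7
        "unsigned": byte,
    }


print(interpretations(0x9B))
print(interpretations(0xFF))
```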

\14. Suppose a floating-point number is 32 bits long and must represent decimal numbers within ±60000. With the exponent sign (order sign) and the number sign taking 1 bit each, how many bits should the exponent (order code) and the mantissa take so that the precision is as high as possible? With this allocation, under what condition does the floating-point number overflow?

Solution: to give the number the highest precision, take the base of the exponent as 2.

To represent decimal numbers within ±60000, since 32768 (2^15) < 60000 < 65536 (2^16), the exponent needs 5 bits in addition to the exponent sign (rounding up to a power of 2).

Therefore: mantissa bits = 32 - 1 - 1 - 5 = 25 bits

With this allocation, the 32-bit floating-point format is as follows:

Exponent sign (1 bit) | Exponent (5 bits) | Number sign (1 bit) | Mantissa (25 bits)

In this format, the floating-point overflow condition is: exponent ≥ 2^5

\19. Suppose the machine number is 8 bits long (including 1 sign bit). Use the complement arithmetic rules to calculate the following.

(1) A = 9/64, B = -13/32, find A + B.

(2) A = 19/32, B = -17/128, find A - B.

(3) A = -3/16, B = 9/32, find A + B.

(4) A = -87, B = 53, find A - B.

(5) A = 115, B = -24, find A + B.

Solution: (1) A = 9/64 = 0.001 0010B, B = -13/32 = -0.011 0100B

[A]complement = 0.001 0010, [B]complement = 1.100 1100

[A+B]complement = 0.0010010 + 1.1001100 = 1.1011110 (no overflow)

A + B = -0.010 0010B = -17/64

(2) A = 19/32 = 0.100 1100B, B = -17/128 = -0.001 0001B

[A]complement = 0.100 1100, [B]complement = 1.110 1111, [-B]complement = 0.001 0001

[A-B]complement = 0.1001100 + 0.0010001 = 0.1011101 (no overflow)

A - B = 0.101 1101B = 93/128

(3) A = -3/16 = -0.001 1000B, B = 9/32 = 0.010 0100B

[A]complement = 1.110 1000, [B]complement = 0.010 0100

[A+B]complement = 1.1101000 + 0.0100100 = 0.0001100 (no overflow)

A + B = 0.000 1100B = 3/32

(4) A = -87 = -101 0111B, B = 53 = 011 0101B

[A]complement = 1 0101001, [B]complement = 0 0110101, [-B]complement = 1 1001011

[A-B]complement = 1 0101001 + 1 1001011 = 0 1110100 (overflow)

(5) A = 115 = 111 0011B, B = -24 = -001 1000B

[A]complement = 0 1110011, [B]complement = 1 1101000

[A+B]complement = 0 1110011 + 1 1101000 = 0 1011011 (no overflow)

A + B = 101 1011B = 91
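The additions and the overflow test can be sketched in Python. This is an illustration under stated assumptions: fractions are scaled by 2^7 so that every operand becomes an 8-bit integer pattern (9/64 becomes 18, -13/32 becomes -52), overflow is detected by comparing sign bits, and the function names `to_complement` and `add_complement` are invented here.

```python
def to_complement(value, bits=8):
    """Complement-code bit pattern of a signed integer, as an unsigned int."""
    return value & ((1 << bits) - 1)


def add_complement(a, b, bits=8):
    """Add two complement-code patterns; return (result pattern, overflow flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + b) & mask                       # the carry out of the sign bit is dropped
    # overflow: both operands have the same sign but the result has the other sign
    overflow = (a & sign) == (b & sign) and (result & sign) != (a & sign)
    return result, overflow


# (1) A = 9/64, B = -13/32, scaled by 128
r1 = add_complement(to_complement(18), to_complement(-52))
# (4) A - B with A = -87, B = 53, computed as [A]complement + [-B]complement
r4 = add_complement(to_complement(-87), to_complement(-53))
# (5) A = 115, B = -24
r5 = add_complement(to_complement(115), to_complement(-24))
for pattern, overflow in (r1, r4, r5):
    print(format(pattern, '08b'), overflow)
```

Subtraction is carried out as in the solution above: negate the subtrahend first, then add.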

26. Calculate [x ± y]complement following the steps of floating-point complement arithmetic.

(1) x = 2^-011 × 0.101 100, y = 2^-010 × (-0.011 100);

(2) x = 2^-011 × (-0.100 010), y = 2^-010 × (-0.011 111);

(3) x = 2^101 × (-0.100 101), y = 2^100 × (-0.001 111).

Solution: first convert x and y into machine number form:

(1) x = 2^-011 × 0.101 100, y = 2^-010 × (-0.011 100)

[x]complement = 1,101; 0.101 100, [y]complement = 1,110; 1.100 100

[Ex]complement = 1,101, [Ey]complement = 1,110, [Mx]complement = 0.101 100, [My]complement = 1.100 100

1) Exponent alignment:

[ΔE]complement = [Ex]complement + [-Ey]complement = 11,101 + 00,010 = 11,111 < 0,

so Ex must be aligned to Ey: [Ex]complement + 1 = 11,101 + 00,001 = 11,110 = [Ey]complement

[x]complement = 1,110; 0.010 110

2) Mantissa operation:

[Mx]complement + [My]complement = 00.010 110 + 11.100 100 = 11.111 010

[Mx]complement + [-My]complement = 00.010 110 + 00.011 100 = 00.110 010

3) Normalization of the result:

[x+y]complement = 11,110; 11.111 010 = 11,011; 11.010 000 (mantissa left-normalized 3 times, exponent minus 3)

[x-y]complement = 11,110; 00.110 010, which is already normalized.

4) Rounding: none

5) Overflow: none

Then: x + y = 2^-101 × (-0.110 000)

x - y = 2^-010 × 0.110 010

(2) x = 2^-011 × (-0.100 010), y = 2^-010 × (-0.011 111)

[x]complement = 1,101; 1.011 110, [y]complement = 1,110; 1.100 001

1) Exponent alignment: the process is the same as step 1) of (1), giving

[x]complement = 1,110; 1.101 111

2) Mantissa operation:

[Mx]complement + [My]complement = 11.101 111 + 11.100 001 = 11.010 000

[Mx]complement + [-My]complement = 11.101 111 + 00.011 111 = 00.001 110

3) Normalization of the result:

[x+y]complement = 11,110; 11.010 000, which is already normalized

[x-y]complement = 11,110; 00.001 110 = 11,100; 00.111 000 (mantissa left-normalized twice, exponent minus 2)

4) Rounding: none

5) Overflow: none

Then: x + y = 2^-010 × (-0.110 000)

x - y = 2^-100 × 0.111 000

(3) x = 2^101 × (-0.100 101), y = 2^100 × (-0.001 111)

[x]complement = 0,101; 1.011 011, [y]complement = 0,100; 1.110 001

1) Exponent alignment:

[ΔE]complement = 00,101 + 11,100 = 00,001 > 0, so Ey must be aligned to Ex:

[Ey]complement + 1 = 00,100 + 00,001 = 00,101 = [Ex]complement

[y]complement = 0,101; 1.111 000(1)

2) Mantissa operation:

[Mx]complement + [My]complement = 11.011 011 + 11.111 000(1) = 11.010 011(1)

[Mx]complement + [-My]complement = 11.011 011 + 00.000 111(1) = 11.100 010(1)

3) Normalization of the result:

[x+y]complement = 00,101; 11.010 011(1), which is already normalized

[x-y]complement = 00,101; 11.100 010(1) = 00,100; 11.000 101 (mantissa left-normalized once, exponent minus 1)

4) Rounding:

[x+y]complement = 00,101; 11.010 011 (the extra bit is discarded)

[x-y]complement is unchanged

5) Overflow: none

Then: x + y = 2^101 × (-0.101 101)

x - y = 2^100 × (-0.111 011)
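The results of (1) and (2) can be cross-checked with exact rational arithmetic. This sketch does not model the machine steps (alignment, double sign bits, rounding); it only confirms the final normalized values, so it applies to cases where no bit is lost, which excludes (3). The helper names `fp` and `normalize` are invented for illustration.

```python
from fractions import Fraction


def fp(exponent, mantissa_bits, negative=False):
    """Exact value of 2**exponent * 0.mantissa_bits (binary fraction), optionally negated."""
    m = Fraction(int(mantissa_bits, 2), 1 << len(mantissa_bits))
    return (-m if negative else m) * Fraction(2) ** exponent


def normalize(value):
    """Return (exponent, mantissa) with value = 2**exponent * mantissa, 1/2 <= |mantissa| < 1."""
    if value == 0:
        return 0, Fraction(0)
    exponent = 0
    while abs(value) >= 1:                 # right-normalize
        value /= 2
        exponent += 1
    while abs(value) < Fraction(1, 2):     # left-normalize
        value *= 2
        exponent -= 1
    return exponent, value


# case (1)
x1, y1 = fp(-3, "101100"), fp(-2, "011100", negative=True)
# case (2)
x2, y2 = fp(-3, "100010", negative=True), fp(-2, "011111", negative=True)
print(normalize(x1 + y1), normalize(x1 - y1))
print(normalize(x2 + y2), normalize(x2 - y2))
```

For example, `normalize(x1 + y1)` gives exponent -5 and mantissa -3/4, which is 2^-101 × (-0.110 000) in the notation of the solution.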

Chapter 7 instruction system

1. What is machine instruction? What is an instruction system? Why is there a close relationship between the instruction system and the main functions of the machine and the hardware structure?

Answer: refer to P300.

3. What are instruction word length, machine word length and storage word length?

Answer: Chapter 1 question 7

6. For two address instructions, where can the physical address of the operand be arranged? Give an example.

A: for two address instructions, the physical address of the operand can be arranged in registers, instructions or memory units.

\8. The instruction word of a machine is 16 bits long, and the address code of each operand is 6 bits. Assume the opcode length is fixed and the instructions are divided into three formats: zero-address, one-address and two-address. If there are M zero-address instructions and N one-address instructions, how many two-address instructions can there be at most? If the opcode length is variable, how many two-address instructions are allowed at most?

Solution: 1) with a fixed-length opcode, the two-address instruction format is as follows:

OP (4 bits) | A1 (6 bits) | A2 (6 bits)

If there are K two-address instructions, then K = 2^4 - M - N

When M = 1 (minimum) and N = 1 (minimum), the maximum number of two-address instructions is Kmax = 16 - 1 - 1 = 14

2) With a variable-length (expanding) opcode, the two-address instruction format is still as shown in 1), but the opcode length can vary with the number of address codes. In this case K = 2^4 - (N / 2^6 + M / 2^12);

When (N / 2^6 + M / 2^12) ≤ 1 (with N / 2^6 + M / 2^12 rounded up), K is largest, and the maximum number of two-address instructions is:

Kmax = 16 - 1 = 15 (only one code is left as the extension flag.)

\16. The main memory capacity of a machine is 4M × 16 bits, and the storage word length equals the instruction word length. The instruction system can perform 108 operations, the opcode length is fixed, and there are six addressing modes: direct, indirect, index, base, relative and immediate. Answer the following: (1) draw the one-address instruction format and explain the role of each field;

(2) The maximum range of direct addressing of the instruction;

(3) The addressing range of single-level indirect addressing and of multi-level indirect addressing;

(4) The range of the immediate operand (in decimal);

(5) The displacement of relative addressing (in decimal);

(6) Which of the six addressing modes has the shortest execution time? Which has the longest? Why? Which one facilitates program floating? Which one is best for handling arrays?

(7) How should the instruction format be modified so that the addressing range of the instruction can be expanded to 4M?

(8) What can be done so that a branch instruction can transfer to any location in main memory? Explain briefly.

Solution: (1) single-word one-address instruction format:

OP (7 bits) | M (3 bits) | A (6 bits)

OP is the operation code field, 7 bits in total, enough to encode 108 operations;

M is the addressing mode field, 3 bits in total, enough to encode 6 addressing modes;

A is the address code field, 16 - 7 - 3 = 6 bits in total.

(2) The maximum range of direct addressing is 2^6 = 64.

(3) Since the storage word is 16 bits long, the range of single-level indirect addressing is 2^16. With multi-level indirect addressing, the highest bit of the storage word is needed to indicate whether indirection continues, so the range is 2^15.

(4) The range of the immediate operand is -32 to +31 (signed) or 0 to 63 (unsigned).

(5) The displacement of relative addressing is -32 to +31.

(6) Among the six addressing modes, immediate addressing has the shortest execution time because the operand is given directly in the instruction. Indirect addressing has the longest execution time because it needs several memory accesses in the execution stage (single-level indirection needs two memory accesses, and multi-level indirection needs more). In index addressing the content of the index register is supplied by the user and may be modified during program execution while the formal address stays unchanged, so index addressing is the most convenient for handling arrays. In relative addressing the effective address of the operand differs from the current instruction address only by a fixed displacement, so it favors program floating more than direct addressing does.

(7) Scheme 1: to expand the addressing range of the instruction to 4M, a 22-bit effective address is required. The single-word one-address instruction can be changed to a double-word format, as shown below:

OP (7 bits) | M (3 bits) | A (high 6 bits)
A (low 16 bits)

Scheme 2: if the single-word (16-bit) instruction format is kept, the addressing range can be expanded to 4M through segment addressing. The arrangement is as follows:

The hardware provides a segment register DS (16 bits) that holds the segment address. After the addressing operation specified by the addressing mode has produced the effective address EA (6 bits), the hardware automatically performs segment addressing and obtains the 22-bit physical address. That is: physical address = (DS) × 2^6 + EA

Note: segment addressing is carried out implicitly by hardware. It is performed automatically after the programmed addressing process has produced EA, and it is transparent to the user.

Scheme 3: with the single-word (16-bit) instruction format, the addressing range can also be expanded to 4M through page addressing. The arrangement is as follows:

The hardware provides a page register PR (16 bits) that holds the page address, and page addressing is added to the addressing modes of the instruction. When the addressing range must be expanded to 4M, the program selects the page addressing mode, and then EA = (PR) ‖ A (effective address = page address "spliced" with the 6-bit formal address), which gives a 22-bit effective address.

(8) To let a branch instruction transfer to any location in main memory, the addressing range must reach 4M. Besides the double-word one-address instruction format of scheme 1 in (7), a 22-bit base register or a 22-bit index register can be provided, so that EA = (BR) + A (BR is the 22-bit base register) or EA = (IX) + A (IX is the 22-bit index register), which gives access to the 4M storage space. The same effect can be achieved by shifting a 16-bit base register left by 6 bits and adding it to the formal address A.

In short, whichever method is adopted, the final actual address must be 22 bits.
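The field widths of (1) and the segment addressing of scheme 2 can be sketched in Python. The function names `field_bits` and `segment_physical_address` are invented for illustration; the 16-bit DS, 6-bit EA and 22-bit physical address come from the solution.

```python
def field_bits(choices):
    """Smallest number of bits that can encode `choices` distinct values."""
    return max(1, (choices - 1).bit_length())


op_bits = field_bits(108)                     # operation code field
mode_bits = field_bits(6)                     # addressing mode field
addr_bits = 16 - op_bits - mode_bits          # address code field
print(op_bits, mode_bits, addr_bits, 2**addr_bits)   # 7 3 6 64


def segment_physical_address(ds, ea, ea_bits=6, phys_bits=22):
    """Scheme 2: physical address = (DS) x 2^6 + EA, kept within 22 bits."""
    return ((ds << ea_bits) + ea) & ((1 << phys_bits) - 1)


top = segment_physical_address(0xFFFF, 0x3F)
print(top == 2**22 - 1)   # the highest address of the 4M space is reachable
```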

Chapter 8 structure and function of CPU

\1. What are the functions of CPU? Draw its structure block diagram and briefly explain the function of each component.

A: refer to p328 and figure 8.2.

\2. What is an instruction cycle? Does the instruction cycle have a fixed value? Why?

Solution: instruction cycle refers to the time required to fetch and execute an instruction.

Because the time required for the execution of various instructions in the computer varies greatly, in order to improve the CPU operation efficiency, even in the synchronous control machine, the instruction cycle length of different instructions is inconsistent, that is, the instruction cycle is not a fixed value for different instructions.

\3. Draw the flow chart of instruction cycle, analyze and explain the function of each sub cycle in the diagram.

Answer: refer to p343 and figure 8.8.

\5. What stage comes before the interrupt cycle? What stage comes after it? What should the CPU do during the interrupt cycle?

Answer: the execution cycle comes before the interrupt cycle, and the fetch cycle comes after it. During the interrupt cycle the CPU saves the breakpoint, sends the interrupt vector to the PC, and disables interrupts.

\17. What are the functions of the INTR, INT and EINT flip-flops in the interrupt system?

Solution: INTR: the interrupt request flip-flop. It latches the randomly arriving interrupt request signal from an interrupt source, providing a stable request signal for the CPU's interrupt query and for the interrupt priority queuing circuit.

EINT: the interrupt enable flip-flop, the master interrupt switch in the CPU. EINT = 1 means interrupts are allowed (interrupts enabled), and EINT = 0 means interrupts are prohibited (interrupts disabled). Its state can be set by the enable-interrupt and disable-interrupt instructions.

INT: the interrupt flag flip-flop, part of the cycle-state allocation circuit in the controller's timing system; it marks the interrupt cycle. When INT = 1, the CPU enters the interrupt cycle and executes the interrupt implicit instruction.

\24. There are four interrupt sources A, B, C and D, whose priority from high to low is A, B, C, D. If the execution time of each interrupt service program is 20 µs, draw the track of CPU program execution according to the interrupt request times given on the time axis in the figure below.

Solution: the response priority of A, B, C and D is also their processing priority. The track of CPU program execution is as follows:

img

\25. A machine has five interrupt sources L0, L1, L2, L3 and L4, whose interrupt response priority from high to low is L0 → L1 → L2 → L3 → L4. The interrupt processing order is now to be changed to L1 → L4 → L2 → L0 → L3. Write the mask word of each interrupt source in the format shown below.

Solution: the mask word of each interrupt source is shown in the following table:

Interrupt source    Mask word
                    0  1  2  3  4
L0                  1  0  0  1  0
L1                  1  1  1  1  1
L2                  1  0  1  1  0
L3                  0  0  0  1  0
L4                  1  0  1  1  1

In the table: mask bit = 1 means the interrupt is masked; mask bit = 0 means the interrupt is open.
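The table follows one rule: a source masks itself and every source that comes later in the processing order. A minimal Python sketch of that rule, with the invented function name `mask_words`:

```python
def mask_words(processing_order, sources):
    """Mask word per source: bit = 1 for itself and for every source processed later."""
    rank = {src: i for i, src in enumerate(processing_order)}
    return {
        s: [1 if rank[t] >= rank[s] else 0 for t in sources]
        for s in sources
    }


sources = ["L0", "L1", "L2", "L3", "L4"]
table = mask_words(["L1", "L4", "L2", "L0", "L3"], sources)
for s in sources:
    print(s, table[s])
```

The columns of each mask word are listed in the fixed order L0 to L4, matching the column numbers 0 to 4 of the table.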

Chapter 9 functions of control unit

\3. What are instruction cycles, machine cycles, and clock cycles? What is the relationship between the three?

A: the total time required for the CPU to fetch and execute an instruction is called the instruction cycle;

Machine cycle refers to the time required to execute a relatively complete operation (instruction step) in the instruction cycle in a synchronous controlled machine. Generally, the length of the machine cycle is arranged to be equal to the main memory cycle;

Clock cycle refers to the cycle time of the computer’s main clock. It is the most basic timing unit when the computer is running. It corresponds to the time required to complete a micro operation. Usually, the clock cycle is equal to the reciprocal of the computer’s main frequency.

\4. Can it be said that the higher the main frequency of a machine, the faster the machine? Why?

Solution: it cannot be said that the higher the main frequency, the faster the machine. The speed of a machine depends not only on the main frequency but also on many other factors, such as the data path structure, the timing allocation scheme, the computing power of the ALU and the functional strength of the instructions. It is the combined effect that matters.

\5. Suppose the main frequency of machine A is 8 MHz, a machine cycle contains 4 clock cycles, and the average instruction execution speed of the machine is 0.4 MIPS. Find the average instruction cycle and the machine cycle of this machine. How many machine cycles does each instruction cycle contain? If the main frequency of machine B is 12 MHz and its machine cycle also contains 4 clock cycles, what is the average instruction execution speed of machine B in MIPS?

Solution: first obtain the average instruction cycle from the average instruction execution speed of machine A, then obtain the clock cycle from the main frequency, and from it the machine cycle. The parameters of machine B are calculated in the same way. The calculation is as follows:

Average instruction cycle of machine A = 1 / 0.4 MIPS = 2.5 µs

Clock cycle of machine A = 1 / 8 MHz = 125 ns

Machine cycle of machine A = 125 ns × 4 = 500 ns = 0.5 µs

Number of machine cycles in each instruction cycle of machine A = 2.5 µs ÷ 0.5 µs = 5

Clock cycle of machine B = 1 / 12 MHz ≈ 83 ns

Machine cycle of machine B = 83 ns × 4 = 332 ns

Assuming that each instruction cycle of machine B also contains 5 machine cycles, then:

Average instruction cycle of machine B = 332 ns × 5 = 1.66 µs

Average instruction execution speed of machine B = 1 / 1.66 µs ≈ 0.6 MIPS

Conclusion: raising the main frequency helps raise the execution speed of the machine.

\6. Suppose the main frequency of a machine is 8 MHz, each machine cycle contains 2 clock cycles on average, and each instruction takes 4 machine cycles on average. What is the average instruction execution speed of the machine in MIPS? If the main frequency stays the same but each machine cycle contains 4 clock cycles on average and each instruction takes 4 machine cycles on average, what is the average instruction execution speed in MIPS? What conclusion can be drawn?

Solution: first obtain the clock cycle from the main frequency, then the machine cycle and the average instruction cycle, and finally the average instruction execution speed as the reciprocal of the average instruction cycle. The calculation is as follows:

Clock cycle = 1 / 8 MHz = 0.125 × 10^-6 s

Machine cycle = 0.125 × 10^-6 s × 2 = 0.25 × 10^-6 s

Average instruction cycle = 0.25 × 10^-6 s × 4 = 10^-6 s

Average instruction execution speed = 1 / 10^-6 s = 1 MIPS

After the parameter change: machine cycle = 0.125 × 10^-6 s × 4 = 0.5 × 10^-6 s

Average instruction cycle = 0.5 × 10^-6 s × 4 = 2 × 10^-6 s

Average instruction execution speed = 1 / (2 × 10^-6 s) = 0.5 MIPS

Conclusion: two machines with the same main frequency do not necessarily have the same execution speed.
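The calculations in questions 5 and 6 reduce to one relation: MIPS = main frequency in MHz ÷ clock cycles per instruction. A minimal Python sketch, with the invented function name `average_mips`:

```python
def average_mips(clock_mhz, clocks_per_machine_cycle, machine_cycles_per_instruction):
    """Average instruction execution speed in MIPS."""
    clocks_per_instruction = clocks_per_machine_cycle * machine_cycles_per_instruction
    return clock_mhz / clocks_per_instruction


print(average_mips(8, 2, 4))    # question 6, first case
print(average_mips(8, 4, 4))    # question 6, after the parameter change
print(average_mips(8, 4, 5))    # question 5, machine A
print(average_mips(12, 4, 5))   # question 5, machine B (assuming 5 machine cycles per instruction)
```

Written this way, both conclusions are visible at once: speed grows with the main frequency, but at a fixed frequency it still depends on how many clock cycles an instruction needs.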

\11. Suppose the internal structure of the CPU is as shown in Figure 9.4, and in addition there are six registers B, C, D, E, H and L, whose inputs and outputs are connected to the internal bus and controlled by their own control signals (for example, Bi is the input control of register B and Bo is its output control). Write all the micro-operations and control signals needed to complete the following instructions, starting from instruction fetch.

(1) ADD B, C ((B) + (C) → B)

(2) SUB A, H ((AC) - (H) → AC)

Solution: first draw the flow chart of each instruction, then decompose each data path operation in the chart into the corresponding micro-operations, and write the micro-commands of the same name.

(1) The instruction flow and micro-command sequence of ADD B, C are as follows:

img

(2) The instruction flow and micro-command sequence of SUB A, H are as follows:

imgimg

Chapter 10 design of control unit

\2. Write the micro-operations and beat arrangement (including the instruction fetch operation) needed to complete the following instructions.

(1) The instruction ADD R1, X adds the contents of register R1 to the contents of main memory unit X and stores the result in R1.

(2) The instruction ISZ X increments the contents of main memory unit X by 1; if the result is 0, the next instruction is skipped.

Solution: assume a CPU data path with a single-bus structure as shown in the figure below, with two registers C and D at the ALU inputs (see Figure 17). Synchronous control is used, with 3 beats per machine cycle:

img

(1) The micro-operations and beat arrangement of the instruction ADD R1, X are as follows:

Fetch cycle:        T0  PC → MAR, 1 → R
                    T1  M(MAR) → MDR, PC + 1 → PC
                    T2  MDR → IR, OP(IR) → ID

Execution cycle 1:  T0  Ad(IR) → MAR, 1 → R
                    T1  M(MAR) → MDR
                    T2  MDR → D

Execution cycle 2:  T0  R1 → C
                    T1  +
                    T2  ALU → R1

(2) The micro-operations and beat arrangement of the instruction ISZ X:

The fetch cycle is the same as in (1) and is omitted.

Execution cycle 1:  T0  Ad(IR) → MAR, 1 → R
                    T1  M(MAR) → MDR
                    T2  MDR → C, +1 → ALU

Execution cycle 2:  T0  ALU → MDR, 1 → W
                    T1  (PC + 1)·Z + PC·Z' → PC (Z' is the complement of Z)

\15. Suppose the capacity of the control memory is 512 × 48 bits, the microprogram can branch anywhere in the control memory space, and there are 4 conditions that control microprogram branching (using direct control). The microinstruction format is as follows:

Solution: because the control memory is 512 × 48 = 2^9 × 48 bits,

the next-address field needs 9 bits and the microinstruction word is 48 bits long.

Because there are 4 conditions that control microprogram branching, 4 + 1 ≤ 2^3,

so the test field takes 3 bits.

Therefore the number of bits in the control field is: 48 - 9 - 3 = 36

The microinstruction format is:

Bits 48 to 13              Bits 12 to 10           Bits 9 to 1
Control field (36 bits)    Test field (3 bits)     Next-address field (9 bits)
