In UNIX, everything is a file, represented as a stream of bytes. I/O (input/output) operations are simply reads from and writes to these streams. For every open stream, the kernel records a file descriptor (FD), an integer the process uses to identify which stream each operation applies to.
The figure above shows what happens when the process calls the recvfrom function. The call traps from user mode into kernel mode. Because the data has not yet arrived, the kernel waits until it is ready and then copies all of it into the user process's buffer. The user process is blocked for the entire duration.
Blocking I/O is like going downstairs to pick up a package and finding the courier hasn't arrived yet. Nothing to be done: I stand there and wait until the package shows up, collect it, and only then go back to the dormitory.
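A minimal sketch of the blocking model in Python (a `socketpair` stands in for a real network connection; the names `a` and `b` are ours):

```python
import socket

# A pair of connected sockets stands in for a real network connection.
a, b = socket.socketpair()

b.sendall(b"hello")   # make data available so the recv() below returns
# recv() blocks: the calling thread sleeps inside the kernel until data
# arrives, then the kernel copies it into our buffer and recv() returns.
data = a.recv(1024)
a.close(); b.close()
```

Had nothing been sent first, the `recv()` call would have parked the whole thread until data arrived, which is exactly the "stuck the whole time" behavior described above.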
With non-blocking I/O, when the user process calls read, the kernel copies the data and returns immediately if it is ready; if not, it returns an error immediately, and the process keeps polling until the data is ready.
The figure above shows the process calling the recvfrom function and trapping into the kernel. Because the data has not yet arrived, the kernel does not wait; it returns an error to the user process right away. The process then polls in a loop, repeatedly checking whether the data is ready. The drawback is obvious: this constant checking wastes CPU time slices.
Non-blocking I/O is like going downstairs for the package, finding the courier hasn't arrived, and heading back to the dormitory. The moment I get back I think: what if it arrives right now? So I immediately go downstairs to check again, repeating the round trip until the courier finally shows up.
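The polling loop can be sketched as follows (the `poll_for_data` helper is our own illustration, not a system API):

```python
import socket
import time

a, b = socket.socketpair()
a.setblocking(False)          # put the fd into non-blocking mode (O_NONBLOCK)

def poll_for_data(sock, attempts=100):
    """Busy-poll: recv() returns immediately with EWOULDBLOCK until the
    kernel actually has data, burning CPU on every failed check."""
    for _ in range(attempts):
        try:
            return sock.recv(1024)    # returns at once, ready or not
        except BlockingIOError:       # kernel reported EWOULDBLOCK/EAGAIN
            time.sleep(0.01)          # back off briefly, then ask again
    return None

b.sendall(b"ready")
data = poll_for_data(a)
a.close(); b.close()
```

Each failed `recv()` is a full system call that accomplishes nothing, which is the CPU waste the text points out.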
I/O multiplexing monitors many file descriptors at once through a single select or poll call. As soon as the kernel finds that one of them has data ready, it notifies the process, which then reads from that descriptor.
The figure above shows the process blocked in a select or poll call, waiting until some socket becomes readable, at which point the process reads the data. This model has an obvious drawback: it requires two system calls (select/poll, then read). Worse, when a connection becomes ready, the process must scan all registered file descriptors to find the ones that need servicing; with tens of thousands of registered descriptors, that linear scan alone can saturate the CPU. Its advantage is that a single call can watch many sockets at once.
The advantage of I/O multiplexing is not that it handles any single connection faster, but that it handles more connections in a single thread or process. Compared with multi-process and multi-thread designs, its biggest win is low system overhead: there is no need to create or maintain extra processes or threads.
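The two-step pattern described above (wait on many fds, then scan for the ready ones) looks like this in a minimal sketch; the two socketpairs stand in for two client connections:

```python
import select
import socket

# Two independent "connections"; a single select() call watches both.
a1, b1 = socket.socketpair()
a2, b2 = socket.socketpair()
b2.sendall(b"from conn 2")    # only the second connection has data

# First system call: select() blocks until at least one fd is readable...
readable, _, _ = select.select([a1, a2], [], [], 1.0)
# ...then we scan the result and issue a second call (recv) per ready fd.
messages = [s.recv(1024) for s in readable]

for s in (a1, b1, a2, b2):
    s.close()
```

With two descriptors the scan is trivial; with tens of thousands, repeating it on every wakeup is the linear cost the text warns about.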
1. Maximum number of connections a single process can open
Select: the maximum number of descriptors a single process can monitor is FD_SETSIZE, defined as the size of 32 machine words: 32 × 32 = 1024 on a 32-bit machine and 32 × 64 = 2048 on a 64-bit machine. You can of course change FD_SETSIZE and recompile the kernel, but performance may be affected, which needs further testing.
Poll: essentially the same as select, but with no hard limit on the maximum number of connections, because the descriptors are stored in a list rather than a fixed-size set.
Epoll: there is still an upper limit on the number of connections, but it is very large: roughly 100,000 open connections on a machine with 1 GB of memory, and roughly 200,000 with 2 GB.
2. I/O efficiency as the number of FDs grows
Select: every call traverses all registered connections linearly, so as the number of FDs grows, the traversal slows down and performance degrades linearly.
Poll: same as above.
Epoll: the kernel implementation registers a callback on each FD, and only active sockets trigger their callbacks. When few sockets are active, epoll has no linear-degradation problem; but if all sockets are active at once, it can still run into performance problems.
3. Message delivery
Select: the descriptor set and the results must be copied between user space and the kernel on every call.
Poll: same as above.
Epoll: epoll avoids re-copying the full descriptor set on every call; epoll_wait hands back only the events that are actually ready. (It is often claimed that epoll shares a memory region between kernel and user space; in fact the ready events are still copied out by epoll_wait, but that copy is tiny compared with select's.)
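The contrast with select can be sketched with Python's `selectors` module, which uses epoll on Linux (kqueue on BSD/macOS), so this is the epoll-style API rather than a raw `epoll_ctl` call:

```python
import selectors
import socket

# DefaultSelector picks the best mechanism available: epoll on Linux.
sel = selectors.DefaultSelector()
a, b = socket.socketpair()
sel.register(a, selectors.EVENT_READ)   # one-time registration, not per call

b.sendall(b"ping")
# Unlike select(), an epoll-style wait returns only the fds that are
# actually ready, so nothing has to be re-scanned as fd counts grow.
events = sel.select(timeout=1.0)
data = events[0][0].fileobj.recv(1024)

sel.close(); a.close(); b.close()
```

The key difference is structural: registration happens once, and each wakeup reports only the active descriptors, which is why epoll stays fast with mostly idle connections.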
Signal-driven I/O: instead of the process checking for data, the kernel sends it a signal when socket I/O is ready, and the process is notified that way. In courier terms, the courier company phones me when the package arrives.
The figure above shows the user process first installing a signal handler, a call that returns immediately. After that the process does not block; it goes on doing other work. When the data is ready, the kernel generates a SIGIO signal (level-triggered) and delivers it to the signal handler, which can then call recvfrom to copy the data from kernel space to user space; that copy step is blocking. The advantage of this model is that the process does not block while waiting for data to arrive: the main loop keeps running, waiting for the handler to notify it either that the data is ready to process or that a datagram is ready to be read.
Signal-driven I/O: again picking up packages, with deliveries coming from Zhongtong, Shentong, Yunda and other couriers. I wait for a phone call (a dedicated courier phone, unlike the phone in the earlier analogy) and clean the dormitory in the meantime. The moment the phone rings, I go down and collect the packages.
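A minimal Linux-flavored sketch of the SIGIO mechanism (fcntl's `F_SETOWN` and `O_ASYNC` are the real knobs; the handler name and message are ours, and this depends on platform support for async fd notification):

```python
import fcntl
import os
import signal
import socket
import time

a, b = socket.socketpair()
received = []

def on_sigio(signum, frame):
    # The signal only says "the fd is ready"; we still call recv()
    # ourselves to copy the data out (that copy step is blocking).
    received.append(a.recv(1024))

signal.signal(signal.SIGIO, on_sigio)

# Ask the kernel to send SIGIO to this process when `a` becomes readable.
fcntl.fcntl(a, fcntl.F_SETOWN, os.getpid())
flags = fcntl.fcntl(a, fcntl.F_GETFL)
fcntl.fcntl(a, fcntl.F_SETFL, flags | os.O_ASYNC)

b.sendall(b"signal me")   # the arrival of data triggers SIGIO
time.sleep(0.2)           # the "main loop" keeps running meanwhile
a.close(); b.close()
```

Note that the process did nothing but sleep between setup and delivery: the kernel initiated the notification, matching the "phone rings" analogy.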
In the asynchronous I/O model, the process initiates an asynchronous I/O request and returns immediately. When the kernel has completed the entire I/O operation, it notifies the process with a signal.
The figure above shows the user process first calling the aio_read function as a system call, passing the descriptor, the buffer pointer, the buffer size (the same three arguments as read), the file offset (similar to lseek), and a description of how to notify us when the entire operation completes. The call returns immediately; the process does not block and continues doing other work until all the data has been written into the buffer.
Asynchronous I/O: picking up a package again, but this time I'm too busy, so I ask a classmate to fetch it for me, give him the tracking number and the courier company, and go off to eat or do other things. He lets me know once he has the package. Comfortable.
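Python has no binding for POSIX aio_read, but asyncio reproduces the same programming contract (initiate, return immediately, get resumed on completion), even though under the hood it is built on readiness-based multiplexing rather than kernel AIO; the throwaway local server here exists only to make the sketch self-contained:

```python
import asyncio

async def handler(reader, writer):
    writer.write(b"done")            # the "I/O" completed on our behalf
    await writer.drain()
    writer.close()

async def main():
    # A throwaway local server so the example is self-contained.
    server = await asyncio.start_server(handler, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    # read() hands control back immediately (the coroutine is suspended,
    # the thread stays free); we are resumed only once the data is in.
    data = await reader.read(1024)
    writer.close()
    server.close()
    await server.wait_closed()
    return data

data = asyncio.run(main())
```

The caller's thread is never parked inside a read: like the classmate in the analogy, the event loop does the waiting and hands the finished result back.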
To sum up: the first four I/O models are all synchronous, because they all block during the stage where data is copied from the kernel into the user buffer; a synchronous I/O operation blocks the requesting process until the I/O completes. Asynchronous I/O operations do not block the requesting process at all.
4. For BIO, NIO and AIO in Java, it is recommended to read the blog gitee.com/SnailClimb/JavaGuide/blo…
This work adopts the CC license; the author and a link to this article must be credited when reprinting.