Explore the zero copy technology in netty and Kafka!

Time:2020-9-10

Preface

Zero copy literally means that data does not need to be copied back and forth, which can greatly improve system performance. The term comes up often: Java NIO, Netty, Kafka, RocketMQ and other frameworks all cite it as one of their performance highlights. We will start with a few I/O concepts and then analyze zero copy.

I/O concepts

1. Buffer

Buffers are the basis of all I/O; I/O is nothing more than moving data into or out of a buffer. To perform an I/O operation, a process asks the operating system either to drain the data in a buffer (write) or to fill a buffer (read). The following flow chart shows a Java process issuing a read request to load data:

[Figure: a Java process issuing a read request]

When a process issues a read request, the kernel first checks whether the requested data is already present in kernel space. If it is, the kernel copies the data directly into the process buffer. If it is not, the kernel immediately instructs the disk controller to read the data from disk; the disk controller writes the data directly into the kernel read buffer via DMA, and the kernel then copies it into the process buffer.
When a process issues a write request, the data in the user buffer is likewise copied into the kernel's socket buffer, and then copied to the network card via DMA and sent out.
You may think this is wasteful: every transfer requires copying data between kernel space and user space. Zero copy emerged to solve this problem.
There are two ways to achieve zero copy: mmap + write and sendfile.
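
The traditional read-then-write path described above can be sketched in Java as follows. This is a minimal illustration and not code from the original article: the class name `TraditionalCopy`, the 8 KiB buffer size and the in-memory sink standing in for a socket are all assumptions made for the example. Every chunk passes through a byte array in user space, which is the copy that zero copy tries to avoid.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TraditionalCopy {

    // Copies everything from in to out through a user-space buffer.
    // Each read() moves data from the kernel into buf, and each write()
    // hands the same bytes back to the kernel.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192]; // user-space buffer (size is an arbitrary choice)
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("traditional-copy", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello zero copy".getBytes(StandardCharsets.UTF_8));

        // The in-memory sink stands in for a socket in this sketch.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = Files.newInputStream(tmp)) {
            long copied = copy(in, sink);
            System.out.println("copied " + copied + " bytes");
        }
    }
}
```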

2. Virtual memory

All modern operating systems use virtual memory, which means they work with virtual addresses instead of physical addresses. This has two benefits:
1. More than one virtual address can point to the same physical memory address.
2. The virtual address space can be larger than the physical memory actually available.
Using the first property, a kernel-space address and a user-space virtual address can be mapped to the same physical address, so that DMA can fill a buffer that is visible to both the kernel and the user-space process, as shown in the following figure:

[Figure: a kernel-space address and a user-space address mapped to the same physical memory]

This eliminates the copy between kernel space and user space. Java also uses this operating system feature to improve performance. Let's focus on what Java offers for zero copy.
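
The first property of virtual memory (several virtual addresses backed by the same physical memory) can be observed from Java by mapping the same file region twice. The sketch below is an illustration under assumptions: the class name `SharedMappingDemo` and the 16-byte file are made up for the example, and the Java API leaves many details of mapped files to the operating system, so the behaviour shown is what common platforms do for shared mappings rather than a guarantee of the specification.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SharedMappingDemo {

    // Maps the first 16 bytes of the file twice, writes one byte through
    // the first mapping and reads it back through the second mapping.
    static byte writeThenReadThroughOtherMapping(Path file, byte value) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer first = ch.map(FileChannel.MapMode.READ_WRITE, 0, 16);
            MappedByteBuffer second = ch.map(FileChannel.MapMode.READ_WRITE, 0, 16);
            first.put(0, value);   // write via the first virtual address range
            return second.get(0);  // read via the second virtual address range
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("shared-mapping", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[16]); // 16 zero bytes

        byte seen = writeThenReadThroughOtherMapping(tmp, (byte) 42);
        System.out.println("value seen through the second mapping: " + seen);
    }
}
```

No data is copied between the two buffers; both are views of the same pages, which is why the write is visible through the second mapping.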

3. mmap + write mode

mmap + write replaces the original read + write approach. mmap is a memory-mapped file mechanism: a file or other object is mapped into the address space of the process, establishing a one-to-one correspondence between the file's disk address and a range of virtual addresses in the process. This saves the copy from the kernel read buffer to the user buffer, but the kernel still has to copy the data from the kernel read buffer to the kernel socket buffer, as shown in the following figure:

[Figure: data flow in mmap + write mode]
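
In Java, the mmap + write mode described above roughly corresponds to mapping a file with `FileChannel.map()` and writing the mapped buffer to a channel. The following is a minimal sketch, not code from the original article: the class name `MmapWrite` and the method `send` are hypothetical, and the in-memory sink in `main` only stands in for a socket channel (it copies internally, so it demonstrates the API shape rather than the saved copy).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MmapWrite {

    // Maps the whole file and writes the mapped region to target.
    // The application never reads the file into a byte[] of its own.
    static long send(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long written = 0;
            // write() may consume only part of the buffer, so loop until it is drained.
            while (mapped.hasRemaining()) {
                written += target.write(mapped);
            }
            return written;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-write", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "mapped payload".getBytes(StandardCharsets.UTF_8));

        // With a real server the target would be a SocketChannel.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long n = send(tmp, Channels.newChannel(sink));
        System.out.println("wrote " + n + " bytes");
    }
}
```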

4. sendfile mode

The sendfile system call was introduced in kernel version 2.1 to simplify transferring data between two channels over the network. It reduces not only the number of data copies but also the number of context switches, as shown in the following figure:

[Figure: data flow in sendfile mode]

Because the data transfer takes place entirely in kernel space, context switches are reduced. However, one CPU copy still remains, from the kernel read buffer to the socket buffer. Can this copy be omitted as well? Linux kernel 2.4 made an improvement: only the descriptor of the data in the kernel buffer (memory address and offset) is recorded in the socket buffer, so the CPU copy inside kernel space is also eliminated.

Java zero copy

1. MappedByteBuffer

The FileChannel provided by Java NIO offers a map() method, which establishes a virtual memory mapping between an open file and a MappedByteBuffer. MappedByteBuffer extends ByteBuffer and behaves like a memory-based buffer, except that its data elements are stored in a file on disk. Calling get() fetches data from the disk file, reflecting the file's current content; calling put() updates the file on disk, and the changes are visible to other readers of the file. Let's analyze MappedByteBuffer with a simple read example:

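import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Scanner;

public class MappedByteBufferTest {

    public static void main(String[] args) throws Exception {
        File file = new File("D://db.txt");
        // try-with-resources closes the stream and its channel
        try (FileInputStream in = new FileInputStream(file);
             FileChannel channel = in.getChannel()) {
            long len = channel.size();
            // Map the whole file read-only; a single mapping holds at most Integer.MAX_VALUE bytes
            MappedByteBuffer mappedByteBuffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, len);
            // Copy the mapped bytes into a heap array with one bulk get
            byte[] ds = new byte[(int) len];
            mappedByteBuffer.get(ds);
            // Print the space-separated tokens
            Scanner scan = new Scanner(new ByteArrayInputStream(ds)).useDelimiter(" ");
            while (scan.hasNext()) {
                System.out.print(scan.next() + " ");
            }
        }
    }
}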

The mapping is created by the map() method of FileChannel, which is declared as follows:

public abstract MappedByteBuffer map(MapMode mode,
                                     long position, long size)
    throws IOException;
            

It takes three parameters: mode, position and size.
mode: the mapping mode, one of READ_ONLY, READ_WRITE or PRIVATE;
position: the byte offset in the file at which the mapping starts;
size: the number of bytes to map, starting from position;

Focus on MapMode. READ_ONLY and READ_WRITE mean read-only and read-write respectively. The requested mapping mode is constrained by the access rights of the FileChannel object: requesting READ_ONLY on a channel that was not opened for reading throws NonReadableChannelException. PRIVATE mode is a copy-on-write mapping: any modification made through put() results in a private copy of the data that is visible only to that MappedByteBuffer instance. The underlying file is never changed, and once the buffer is garbage collected those changes are lost. Here is a quick look at the source code of the map() method:

public MappedByteBuffer map(MapMode mode, long position, long size)
            throws IOException
        {
                // ... omitted
                int pagePosition = (int)(position % allocationGranularity);
                long mapPosition = position - pagePosition;
                long mapSize = size + pagePosition;
                try {
                    // If no exception was thrown from map0, the address is valid
                    addr = map0(imode, mapPosition, mapSize);
                } catch (OutOfMemoryError x) {
                    // An OutOfMemoryError may indicate that we've exhausted memory
                    // so force gc and re-attempt map
                    System.gc();
                    try {
                        Thread.sleep(100);
                    } catch (InterruptedException y) {
                        Thread.currentThread().interrupt();
                    }
                    try {
                        addr = map0(imode, mapPosition, mapSize);
                    } catch (OutOfMemoryError y) {
                        // After a second OOME, fail
                        throw new IOException("Map failed", y);
                    }
                }
    
                // On Windows, and potentially other platforms, we need an open
                // file descriptor for some mapping operations.
                FileDescriptor mfd;
                try {
                    mfd = nd.duplicateForMapping(fd);
                } catch (IOException ioe) {
                    unmap0(addr, mapSize);
                    throw ioe;
                }
    
                assert (IOStatus.checkAll(addr));
                assert (addr % allocationGranularity == 0);
                int isize = (int)size;
                Unmapper um = new Unmapper(addr, mapSize, isize, mfd);
                if ((!writable) || (imode == MAP_RO)) {
                    return Util.newMappedByteBufferR(isize,
                                                     addr + pagePosition,
                                                     mfd,
                                                     um);
                } else {
                    return Util.newMappedByteBuffer(isize,
                                                    addr + pagePosition,
                                                    mfd,
                                                    um);
                }
         }

Roughly speaking, the address of the memory mapping is obtained through the native method map0. If that fails with an OutOfMemoryError, a garbage collection is triggered and the mapping is attempted again. Finally, the MappedByteBuffer is instantiated from the mapped address. MappedByteBuffer itself is an abstract class; the instance actually created here is a DirectByteBuffer.
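
The difference between the PRIVATE and READ_WRITE mapping modes, and the fact that a mapped buffer is a direct buffer, can be checked with a small experiment. This is a sketch under assumptions: the class name `MapModeDemo` and the helper `firstByteOnDiskAfterPut` are hypothetical, and the three-byte temporary file exists only for the demonstration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapModeDemo {

    // Writes 'X' at offset 0 through a mapping created in the given mode,
    // then returns the first byte of the file as seen by an ordinary read.
    static byte firstByteOnDiskAfterPut(Path file, FileChannel.MapMode mode) throws IOException {
        // PRIVATE and READ_WRITE both require a channel opened for reading and writing.
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(mode, 0, ch.size());
            buf.put(0, (byte) 'X');
            if (mode == FileChannel.MapMode.READ_WRITE) {
                buf.force(); // flush the change to the storage device
            }
        }
        return Files.readAllBytes(file)[0];
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("map-mode", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "abc".getBytes(StandardCharsets.US_ASCII));

        // Copy-on-write: the put() only changes a private copy.
        System.out.println((char) firstByteOnDiskAfterPut(tmp, FileChannel.MapMode.PRIVATE));
        // Shared mapping: the put() reaches the file.
        System.out.println((char) firstByteOnDiskAfterPut(tmp, FileChannel.MapMode.READ_WRITE));

        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            MappedByteBuffer ro = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            System.out.println("isDirect: " + ro.isDirect());
        }
    }
}
```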

2. DirectByteBuffer

DirectByteBuffer extends MappedByteBuffer. As the name suggests, it allocates a region of direct memory that does not occupy JVM heap space. The MappedByteBuffer obtained from FileChannel in the previous section is also a DirectByteBuffer. Besides that approach, you can also allocate such a region manually:

ByteBuffer directByteBuffer = ByteBuffer.allocateDirect(100);

The line above allocates 100 bytes of direct memory.
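
To make the distinction between heap and direct buffers concrete, the following sketch compares the two allocation methods of ByteBuffer. The class name `DirectVsHeap` and the helper `describe` are assumptions made for this example; `allocate`, `allocateDirect`, `isDirect` and `capacity` are standard ByteBuffer methods.

```java
import java.nio.ByteBuffer;

class DirectVsHeap {

    // Summarizes whether a buffer lives in direct memory and how large it is.
    static String describe(ByteBuffer buffer) {
        return "direct=" + buffer.isDirect() + ", capacity=" + buffer.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(100);         // backed by a byte[] on the JVM heap
        ByteBuffer direct = ByteBuffer.allocateDirect(100); // allocated outside the JVM heap

        System.out.println("heap:   " + describe(heap));
        System.out.println("direct: " + describe(direct));
    }
}
```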

3. Channel to channel transmission

Files often need to be transferred from one location to another. FileChannel provides the transferTo() method to make such transfers more efficient. Let's start with a simple example:

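import java.io.FileInputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

public class ChannelTransfer {
    public static void main(String[] argv) throws Exception {
        String[] files = {"D://db.txt"};
        catFiles(Channels.newChannel(System.out), files);
    }

    private static void catFiles(WritableByteChannel target, String[] files)
            throws Exception {
        for (String file : files) {
            // try-with-resources closes the channel and the stream
            try (FileInputStream fis = new FileInputStream(file);
                 FileChannel channel = fis.getChannel()) {
                long position = 0;
                long size = channel.size();
                // transferTo() may transfer fewer bytes than requested, so loop
                while (position < size) {
                    position += channel.transferTo(position, size - position, target);
                }
            }
        }
    }
}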

This transfers the file data to the System.out channel. The method is declared as follows:


        public abstract long transferTo(long position, long count,
                                        WritableByteChannel target)
            throws IOException;
    

The parameters are easy to understand: the starting position of the transfer, the number of bytes to transfer, and the target channel. transferTo() allows one channel to be connected directly to another, without an intermediate buffer to carry the data.
Note: "without an intermediate buffer" has two layers of meaning. First, no user-space buffer is needed to hold a copy of the kernel buffer. Second, each of the two channels has its own kernel buffer, and the data may not need to be copied between those two kernel buffers either.
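
As a further illustration of channel-to-channel transfer, the sketch below copies one file to another with transferTo(). It is a minimal example under assumptions: the class name `FileTransfer` and the temporary files are made up, and whether the JVM actually uses sendfile or a similar mechanism underneath depends on the platform and on the type of the target channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FileTransfer {

    // Copies source to dest using transferTo(); no buffer is allocated by the caller.
    static long transfer(Path source, Path dest) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dest,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                // transferTo() returns the number of bytes actually transferred.
                long n = in.transferTo(position, size - position, out);
                if (n <= 0) {
                    break; // nothing more could be transferred (for example, the file shrank)
                }
                position += n;
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("transfer-src", ".txt");
        Path dst = Files.createTempFile("transfer-dst", ".txt");
        src.toFile().deleteOnExit();
        dst.toFile().deleteOnExit();
        Files.write(src, "sent without a user-space buffer".getBytes(StandardCharsets.UTF_8));

        System.out.println("transferred " + transfer(src, dst) + " bytes");
    }
}
```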

Netty zero copy

Netty provides zero-copy buffers. When transmitting data, the final processed message often has to be assembled from, or split into, several transmitted pieces. NIO's native ByteBuffer cannot do this. Netty achieves zero copy by providing composite and slice buffers, as the following figure shows:

[Figure: an HTTP message split across two ChannelBuffers and combined into one virtual buffer]

At the TCP layer, an HTTP message is split into two ChannelBuffers, which individually are meaningless to our upper-layer logic (HTTP processing).
Combined, however, the two ChannelBuffers form a meaningful HTTP message. The ChannelBuffer corresponding to this message is what can be called a "message"; the term used here is "Virtual Buffer".
Let's take a look at the CompositeChannelBuffer source code provided by Netty:

public class CompositeChannelBuffer extends AbstractChannelBuffer {
    
        private final ByteOrder order;
        private ChannelBuffer[] components;
        private int[] indices;
        private int lastAccessedComponentId;
        private final boolean gathering;
        
        public byte getByte(int index) {
            int componentId = componentId(index);
            return components[componentId].getByte(index - indices[componentId]);
        }
        // ... omitted
    }

components stores all received buffers, indices records the starting position of each buffer, and lastAccessedComponentId records the component id of the last access. CompositeChannelBuffer does not allocate new memory and copy the contents of every ChannelBuffer into it. Instead, it keeps references to all the ChannelBuffers and performs reads and writes on the component ChannelBuffers, which is how zero copy is achieved.
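
The idea behind the composite buffer can be reproduced with plain java.nio types. The class below is a hypothetical, heavily simplified stand-in and not Netty's implementation: `CompositeView` and its methods are names invented for this sketch, it supports only reads, and it uses a linear search where Netty caches the last accessed component. (In Netty 4 the comparable facilities are `CompositeByteBuf` and `Unpooled.wrappedBuffer`.)

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CompositeView {

    private final ByteBuffer[] components;
    // indices[i] is the logical start offset of components[i];
    // the last entry holds the total length.
    private final int[] indices;

    CompositeView(ByteBuffer... buffers) {
        components = new ByteBuffer[buffers.length];
        indices = new int[buffers.length + 1];
        for (int i = 0; i < buffers.length; i++) {
            // slice() shares the underlying memory; no bytes are copied.
            components[i] = buffers[i].slice();
            indices[i + 1] = indices[i] + components[i].remaining();
        }
    }

    int capacity() {
        return indices[components.length];
    }

    byte getByte(int index) {
        if (index < 0 || index >= capacity()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int id = componentId(index);
        return components[id].get(index - indices[id]);
    }

    // Finds the component that contains the logical index.
    private int componentId(int index) {
        int i = 0;
        while (index >= indices[i + 1]) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        byte[] head = "HTTP/1.1 200 OK\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] body = "hello".getBytes(StandardCharsets.US_ASCII);
        CompositeView message = new CompositeView(ByteBuffer.wrap(head), ByteBuffer.wrap(body));

        body[0] = 'J'; // change the original array after building the view
        // The view sees the change because it references the same memory.
        System.out.println((char) message.getByte(head.length));
    }
}
```

Because the view only stores references and offsets, combining the header and body costs no memory copy, which mirrors the role of components and indices in the Netty class.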

Other zero copies

RocketMQ writes messages sequentially to the commitlog file and uses the consume queue file as an index; RocketMQ uses mmap + write zero copy to respond to consumer requests.
Similarly, Kafka persists a large amount of network data to disk and sends disk files over the network; Kafka uses the sendfile zero-copy mode.

Summary

If we try to understand zero copy simply through the concept of objects in Java, it amounts to using object references: every holder of a reference that modifies the object changes the same object, and there is only ever one object.

Source: https://juejin.im/post/5cad6f…
Author: ksfzhouhui
