Deep Understanding of Zero Copy

Date: 2021/5/21 16:11

1. I/O Concepts

1.1 Buffers

Buffers are the basis of all I/O: I/O is nothing more than moving data into or out of a buffer. When a process performs an I/O operation, it sends a request to the operating system to either drain (write) the data in a buffer or fill (read) a buffer.

[Figure: general flow of a Java process initiating a read request to load data]

After a process initiates a read request, the kernel checks whether the data the process needs already exists in kernel space. If it does, the kernel copies the data directly into the process's buffer.

If the data is not in kernel space, the disk controller is instructed to read it from disk, and the controller writes the data directly into the kernel read buffer. This step is completed through DMA (direct memory access, which can be understood as a hardware unit that performs file I/O so as to free up the CPU).

Next, the kernel copies the data into the process buffer. If the process instead initiates a write request, the data in the user buffer likewise has to be copied into the kernel's socket buffer, and then copied to the network card via DMA and sent.

You may think this is wasteful: the data must be copied from kernel space to user space every time. Zero copy emerged to solve exactly this problem.

Briefly, there are two approaches to zero copy:

  • mmap + write
  • sendfile

1.2 Virtual memory & virtual address space

The CPU accesses memory through addressing. A 32-bit CPU has an addressing range of 0 ~ 0xFFFFFFFF, which works out to 4 GB; that is, the maximum physical memory it can address directly is 4 GB.

In practice, however, a problem arises: a program may need 4 GB of memory while less than 4 GB of physical memory is available, forcing the program to reduce its memory usage.

To solve this problem, modern CPUs introduced the MMU (memory management unit).

The core idea of the MMU is to use virtual addresses instead of physical addresses: the CPU addresses memory with virtual addresses, and the MMU is responsible for mapping each virtual address to a physical address.

The introduction of the MMU removes the physical-memory limitation: from a program's point of view, it is as if it had the full 4 GB of memory.

Virtual memory: virtual memory is a technique that logically expands physical memory. The basic idea is to use software and hardware together so that the two levels of storage, main memory and external storage, behave as a single level of memory. Virtual memory is implemented with automatic overlay and swapping techniques. Simply put, part of the hard disk is used as memory (for example, the swap area in Linux).

Virtual address space: the virtual address space is the range of virtual memory that the CPU can address. Linux provides each process with its own virtual address space (physical memory is allocated only when it is actually used). On a 32-bit CPU, the address range is 0x00000000 ~ 0xFFFFFFFF (about 4 GB), of which the high 1 GB is kernel space, used by the operating system, and the low 3 GB is user space, used by the process.

[Figure: division of the virtual address space]

When addressing, the CPU uses virtual addresses, and the MMU (memory management unit) converts each virtual address into a physical address.

[Figure: CPU addressing via virtual addresses]

Page fault: because only part of a program is loaded into memory, the address being accessed may not be in memory (the CPU raises a page fault exception). If memory is insufficient, a page replacement algorithm is used to evict a page from memory; the needed page is then loaded from external storage into memory so the program can keep running.

Common page replacement algorithms:

  • OPT (optimal page replacement): evicts the page that will not be used for the longest time in the future. It cannot be implemented in practice and is generally used as a yardstick for measuring other replacement algorithms.
  • FIFO (first-in, first-out page replacement): always evicts the page that entered memory first, i.e. the page that has stayed in memory the longest.
  • LRU (least recently used page replacement): gives each page an access field recording the time t that has elapsed since the page was last accessed. When a page must be evicted, the page with the largest t among those resident — the least recently used one — is chosen.
  • LFU (least frequently used page replacement): keeps a reference count for each page and, when a page must be evicted, chooses the page with the smallest count. In other words, the more frequently a page in memory is used, the longer it is retained.
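The LRU idea above can be sketched with a `LinkedHashMap` in access order (this is an illustration, not the kernel's actual implementation; the class name, frame count, and page numbers are invented for the example):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal LRU page-replacement sketch: a LinkedHashMap in access order
// evicts the least-recently-used page once the frame count is exceeded.
public class LruPages {
    static List<Integer> residentAfter(int frames, int[] accesses) {
        Map<Integer, Boolean> resident = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > frames; // evict the LRU page when frames are full
            }
        };
        for (int page : accesses) {
            resident.put(page, Boolean.TRUE); // each access "touches" the page
        }
        return new ArrayList<>(resident.keySet());
    }

    public static void main(String[] args) {
        // 3 frames, access sequence 1,2,3,1,4: page 2 is least recently used and gets evicted
        System.out.println(residentAfter(3, new int[]{1, 2, 3, 1, 4})); // [3, 1, 4]
    }
}
```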

Benefits of using virtual addresses instead of physical addresses

  1. More than one virtual address can point to the same physical memory address (many-to-one).

  2. The virtual address space can be larger than the actually available physical memory (expansion through virtual memory).

Taking advantage of the first property, a kernel-space address and a user-space virtual address can be mapped to the same physical address. In this way, DMA can fill a buffer that is visible to both the kernel and user-space processes.


This saves the copy between kernel and user space; java.nio.channels.FileChannel#map also uses this operating-system feature to improve performance.

Relationship between virtual address space and virtual memory

Thanks to virtual memory technology, the memory we use can be much larger than physical memory: with, say, 256 MB of physical memory, we can still give each process a 4 GB virtual address space. When the CPU executes code in the virtual address space and cannot find the required page in memory, it fetches it from external storage; this part of external storage that can be used as memory is the virtual memory. (Using virtual memory to explain virtual memory — a bit circular, admittedly.)

The virtual address space is not the same thing as virtual memory. The virtual address space does not really exist; it is a virtual range produced by CPU addressing. Virtual memory, by contrast, is real space on the hard disk.

To put it bluntly, virtual memory solves the problem of insufficient memory, while the virtual address space encapsulates the use of virtual memory so that each process believes it owns a large, contiguous memory space.

In my opinion, modern computers have enough memory (benefit 2 above), so we could in principle abandon the concepts of virtual memory and virtual address space; but they have another advantage: more than one virtual address can point to the same physical memory (or on-disk virtual memory) address (benefit 1 above), which is what makes Java's zero copy possible.

Is there a size limit for virtual memory?

In hardware terms, the memory space of a Linux system consists of two parts: physical memory and swap (on disk). Swap is the virtual space, called swap space in Linux. When installing Linux, one step asks you to allocate the size of the swap space, so in theory the size of the virtual space depends on the size of the disk.

What is swap?

Swap space is an area on disk. It can be a partition, a file, or a combination of the two. Simply put, when the system's physical memory is tight, Linux saves infrequently accessed in-memory data to swap (paging out), so that more physical memory is available to serve processes. When the system needs to access contents stored in swap, it loads that data back into memory. This is what we commonly call swapping out and swapping in.

Is there a size limit for the virtual address space?

The virtual address space is the range of virtual memory that the CPU can address. The system provides each process with a virtual memory space (physical memory is allocated only when the virtual memory is actually used). On a 32-bit CPU, the addressing range is 0x00000000 ~ 0xFFFFFFFF (about 4 GB), i.e. a 4 GB address space.

On 64-bit machines there is essentially no such restriction. Since 64-bit machines generally have 16 GB of memory or more, memory is rarely exhausted and paging is unlikely, so it is fine to make swap very small.

1.2.1 page & page frame & page table

MMU: when the CPU addresses memory, it uses virtual addresses instead of physical addresses; the MMU (memory management unit) then converts them into physical addresses.

Memory paging is a memory-management mechanism built on the MMU. It divides virtual addresses and physical addresses into pages and page frames of a fixed size, typically 4 KB (this size is usually configurable), and guarantees that pages and page frames are the same size.

A 4 KB range of virtual addresses is called a page.

A 4 KB range of physical addresses is called a page frame.

Page table: the page table works like a function whose input is a page number and whose output is a page frame. The operating system maintains a page table for each process, which is why different processes may use the same virtual addresses. The page table records, for each page of the process, the location of its corresponding page frame.
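The page-table-as-a-function idea can be sketched with toy numbers (the table contents and class name below are invented for illustration; real page tables are multi-level structures walked by hardware):

```java
// Toy address translation: split a virtual address into a page number and an
// offset, look the page up in a per-process page table, and combine the mapped
// frame number with the offset to get the physical address.
public class PageTableDemo {
    static final int PAGE_SIZE = 4096; // 4 KB pages, as in the article

    static long translate(int[] pageTable, long vaddr) {
        int page = (int) (vaddr / PAGE_SIZE); // which page the address falls in
        long offset = vaddr % PAGE_SIZE;      // position inside the page
        int frame = pageTable[page];          // page table: page -> frame
        return (long) frame * PAGE_SIZE + offset;
    }

    public static void main(String[] args) {
        int[] pageTable = {7, 3, 0}; // page 0 -> frame 7, page 1 -> frame 3, page 2 -> frame 0
        // virtual address 4100 = page 1, offset 4 -> frame 3 -> physical 3*4096 + 4
        System.out.println(translate(pageTable, 4100)); // 12292
    }
}
```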

[Figure: relationship among page, page frame, and page table]

Why paging?

Assume memory is allocated contiguously (i.e. each program occupies a contiguous range of physical memory):
1. Process A arrives and asks the OS for 200 units of memory, so the OS allocates 0 ~ 199 to A.
2. Process B arrives and asks for 5 units; the OS allocates 200 ~ 204 to it.
3. Process C arrives and asks for 100 units; the OS allocates 205 ~ 304 to it.
4. Process B then finishes and returns 200 ~ 204 to the OS.
From then on, as long as every incoming process needs more than 5 units, the space 200 ~ 204 can never be allocated (as long as A and C do not exit).
Over time, memory fills with many unusable fragments like 200 ~ 204.
The paging mechanism lets programs be logically contiguous but physically scattered. On one contiguous piece of physical memory, frames 0 ~ 4 (the exact range depends on the page size) may belong to A, 5 ~ 9 to B, and 10 ~ 14 to C, ensuring that any "memory fragment" can still be allocated.

Life cycle of page frame

Although each process has 4 GB of virtual address space, in any short period of its execution it only needs to access a small amount of that space, and those accesses are generally contiguous in address. If all the physical memory were really allocated to the process, most of it would be wasted. So the common approach today is paging: the virtual address space and physical memory are divided into fixed-size blocks, a page table is established, and the pages of the process's virtual address space are mapped to the page frames of physical memory, with the page table storing the mapping. Then, as the process runs, pages are allocated on demand, and page frames that have not been used for a long time are reclaimed by the operating system.

1.2.2 Page Cache & Swap Cache & Buffer Cache

Locality principle

As a program executes, it exhibits locality: over a period of time, execution is confined to a certain part of the program, and correspondingly the memory it accesses is confined to a certain region. Locality usually takes two forms: temporal locality and spatial locality.

  • Temporal locality: a memory location that has been referenced once is likely to be referenced again in the near future.
  • Spatial locality: if a memory location is referenced, locations near it are likely to be referenced in the near future as well.
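Spatial locality is easy to observe in Java: traversing a 2-D array row by row touches adjacent memory, while column-by-column traversal jumps across rows. This sketch (class name and sizes invented) only checks that both orders compute the same sum; the speed difference shows up in timing, which varies by machine:

```java
// Spatial-locality sketch: row-major traversal reads consecutive elements of
// each row (cache-friendly); column-major traversal jumps between rows.
public class Locality {
    static long sumRowMajor(int[][] a) {
        long s = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[i].length; j++)
                s += a[i][j]; // consecutive elements of one row: good locality
        return s;
    }

    static long sumColMajor(int[][] a) {
        long s = 0;
        for (int j = 0; j < a[0].length; j++)
            for (int i = 0; i < a.length; i++)
                s += a[i][j]; // jumps between rows: poor locality
        return s;
    }

    public static void main(String[] args) {
        int[][] a = new int[1024][1024];
        for (int i = 0; i < 1024; i++)
            for (int j = 0; j < 1024; j++)
                a[i][j] = 1;
        // Same result either way, but the row-major order is typically faster.
        System.out.println(sumRowMajor(a) == sumColMajor(a)); // true
    }
}
```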

Page Cache

In memory (including physical memory and virtual memory)

The page cache caches file contents in units of pages (4 KB), so that cached file data can be read faster. Meanwhile, for buffered writes, a write can return as soon as the data reaches the page cache, without waiting for it to actually be persisted to disk, which improves the overall read/write performance of upper-layer applications.

In fact, there are only two kinds of content in memory:

  1. Content loaded from files, e.g. .class files, FileChannel#map, FileChannel#transferTo, etc. (page cache)
  2. Memory used by code at run time, e.g. map.put() (user-space memory)

Swap Cache

On disk

Some processes require a lot of memory during initialization (mainly anonymous pages obtained through malloc). After initialization, this memory is rarely used by the process but is never released, which wastes memory. Linux's solution is to move the data in these pages out to disk and mark the memory as reclaimable; the page-frame reclamation mechanism in Linux (not covered here) then recycles these pages and gives them to processes that need them.

Buffer Cache

In memory

The smallest unit of data on a disk is the sector: every read or write operates on the disk sector by sector. The sector size depends on the disk type; some are 512 bytes, some are 4 KB. Whether the user wants to read 1 byte or 10 bytes, disk access ultimately happens in whole sectors, so reading the disk raw would make data access very inefficient. Similarly, if the user wants to write (update) 1 byte at some location on disk, a whole sector must be rewritten: before writing that byte, all the sector data containing it must be read from disk, the corresponding byte modified in memory, and the whole modified sector written back to disk in one go. To reduce such inefficient access and improve disk performance as much as possible, the kernel builds a layer of cache above disk sectors: it caches sector data in memory in units of integer multiples of the sector size (blocks). Read requests can then be served directly from memory, and writes can update the specified data in memory and write it back to the corresponding disk sectors asynchronously. This layer of cache is the block cache, i.e. the buffer cache.
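The read-modify-write pattern described above can be sketched in Java (the 512-byte block size, class name, and file are illustrative; a real kernel does this below the filesystem, not in application code):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// To change one byte, a whole block is read, patched in memory, and written
// back -- the pattern the buffer cache exists to avoid repeating on every access.
public class SectorPatch {
    static final int BLOCK = 512; // illustrative block size, not queried from a device

    static void patchByte(Path file, long offset, byte value) throws IOException {
        long blockStart = (offset / BLOCK) * BLOCK; // start of the containing block
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer block = ByteBuffer.allocate(BLOCK);
            ch.read(block, blockStart);                    // read the whole block
            block.put((int) (offset - blockStart), value); // patch one byte in memory
            block.flip();                                  // prepare the bytes read for writing
            ch.write(block, blockStart);                   // write the whole block back
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("sector", ".bin");
        Files.write(p, "hello".getBytes());
        patchByte(p, 1, (byte) 'a');
        System.out.println(new String(Files.readAllBytes(p))); // hallo
    }
}
```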

Logical relationship between two types of caches

The page cache and the buffer cache are two views of the same thing: for a page, viewed from above it is the page cache of a file, while viewed from below it is also a group of buffer caches on a device.

Before the virtual memory mechanism appeared, operating systems used the block-cache (buffer cache) scheme. After virtual memory appeared, operating systems managed I/O at a larger granularity and adopted the page cache, a page-based, file-oriented caching mechanism.

In the current Linux kernel code, the page cache and buffer cache are in fact unified: both the page cache of files and the buffer cache of blocks are ultimately unified on pages.

Java and the page cache

[Figure: Java and the page cache]

As the figure above shows, a file mapped via FileChannel#map does not use the swap space, so a 2 GB file cannot be held entirely in memory — does the file itself then serve as part of the virtual space? So I think the real meaning of virtual memory is: a technique that uses the disk as memory, making the application believe the data is in memory (through the virtual address space), and fetching it only when it is actually used.

2. Memory Relationship Between the JVM and Linux

2.1 Linux process memory model

The JVM runs on Linux as a process, so understanding the memory relationship between Linux and processes is the basis for understanding that between the JVM and Linux. The figure below shows the general relationship among memory at three levels: hardware, system, and process.

[Figure: memory relationships at the hardware, system, and process levels]

In hardware terms, the memory space of a Linux system consists of two parts: physical memory and swap (on disk).

Physical memory is the main memory area Linux uses when active. When physical memory runs short, Linux moves some temporarily unused memory data into swap on disk to free up memory; when the data in swap is needed again, it must first be brought back into memory.

From the system's perspective, apart from the boot area, the whole memory space is divided into two main parts: kernel space and user space.

Kernel memory is the memory space used by Linux itself, mainly for program scheduling, memory allocation, interfacing with hardware resources, and other kernel logic.

User memory is the main space provided to each process. Linux provides every process with the same layout of virtual memory space, which keeps processes independent of each other: each process has its own virtual address space, and physical memory is allocated only when it is actually used.

As shown in the figure below, on 32-bit Linux systems, the 0 ~ 3 GB range of virtual memory is generally allocated as user space and the 3 ~ 4 GB range as kernel space; the partitioning on 64-bit systems is similar.

[Figure: user/kernel split of the virtual address space]

From the perspective of the process, the user memory (virtual memory space) that the process can directly access is divided into five parts: code area, data area, heap area, stack area and unused area.

  • The machine code of the application program is stored in the code area. The code cannot be modified during operation. It has the characteristics of read-only and fixed size.

  • The data area stores the global data, static data and some constant strings in the application, and its size is also fixed.

  • Heap is the space dynamically applied by the program at run time. It belongs to the memory resources directly applied and released by the program at run time.

  • The stack area is used to store the incoming parameters, temporary variables, return addresses and other data of the function.

  • Unused area is a preparatory area for allocating new memory space.

2.2 process and JVM memory space

The JVM is essentially a process, so its memory space (also called the runtime data area — note the difference from the JMM) has the general characteristics of a process.

However, the JVM is not an ordinary process; its memory space has many new characteristics, for two main reasons:

  • The JVM moves many things that would normally be managed by the operating system into the JVM itself, to reduce the number of system calls;
  • Java NIO was introduced to reduce the system-call overhead of read/write I/O.
[Figure: comparison between the JVM process and an ordinary process memory model]

2.2.1 user memory

Permanent generation

The permanent generation is essentially the code area and data area of a Java program. Classes in a Java program are loaded into various data structures in this region, including the constant pool, fields, method data, method bodies, constructors, special methods of classes, instance initializers, interface initializers, and so on. To the operating system, this region is part of the heap; to the Java program, it is the space that holds the program itself and its static resources, enabling the JVM to interpret and execute the program.

New generation & old generation

The young generation and the old generation are the heap space a Java program really uses, mainly for storing objects; but the way they are managed is fundamentally different from an ordinary process.

When an ordinary process allocates space for an object at run time, e.g. a C++ new, it triggers a system call; the operating system allocates space according to the object's size and returns. Likewise, when the program frees an object, e.g. a C++ delete, another system call tells the operating system that the object's space can be reclaimed.

The JVM uses memory differently from an ordinary process. It requests one whole memory region from the operating system (the size is adjustable via JVM parameters) as the Java heap (divided into young and old generations). When a Java program requests memory, e.g. with new, the JVM allocates the required size out of this region; the Java program is not responsible for telling the JVM when an object's space can be freed — garbage objects' memory is reclaimed by the JVM itself.

The advantages of JVM memory management are obvious:

  1. Fewer system calls: the JVM does not need the operating system's involvement when allocating memory to Java programs; it only asks the operating system for memory (or notifies it of reclamation) when the Java heap size changes, whereas an ordinary program needs a system call for every allocation and release;
  2. Fewer memory leaks: failing to notify (or notifying too late) the operating system of memory release is an important cause of leaks in ordinary programs. Unified management by the JVM avoids leaks caused by the programmer.
Unused area

The unused area is a reserve for allocating new memory space. For an ordinary process, it is used when requesting and releasing heap and stack space; it is touched on every heap allocation, so its size changes frequently. For a JVM process, it is used when resizing the heap and thread stacks; since the heap is rarely resized, its size is relatively stable. The operating system adjusts this area dynamically, and it is usually not backed by actual physical memory; it simply allows the process to request heap or stack space there.

2.2.2 kernel memory

Applications usually do not deal with kernel memory directly; it is managed and used by the operating system. However, as Linux has focused on improving performance, some new features allow applications to use kernel memory, or to map it into user space.

Java NIO was born in this context: it makes full use of these new Linux features to improve the I/O performance of Java programs.

[Figure: distribution of kernel memory used by Java NIO on Linux]

The figure above shows the distribution of the kernel memory used by Java NIO on Linux. The NIO buffer portion mainly consists of the ByteBuffers NIO uses when working with the various channels and the buffers the Java program allocates directly via ByteBuffer.allocateDirect().

Within the page cache, NIO mainly uses the mapped memory occupied by opening files with FileChannel#map and the cache required by FileChannel#transferTo and FileChannel#transferFrom ("NIO file" in the figure).

The usage of the NIO buffer and mapped portions can be monitored through JMX, as shown in the figure below. However, FileChannel's implementation uses the native page cache through system calls; that process is transparent to Java, and this part of memory usage cannot be monitored.

[Figure: JMX monitoring of NIO buffer and mapped usage]

Linux and Java NIO open up space in kernel memory for programs mainly to reduce unnecessary copies, and thereby the overhead of I/O system calls. For example, when sending disk-file data to the network card, the data flows of the conventional method and of NIO compare as shown in the figure below:

[Figure: data-flow comparison between conventional I/O and NIO]

Copying data between kernel memory and user memory costs both resources and time. As the figure shows, NIO eliminates two data copies between kernel memory and user memory. This is one of the important reasons for Java NIO's high performance (the other is asynchronous non-blocking I/O).
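The reduced-copy path can be tried with FileChannel#transferTo, which asks the kernel to move the bytes directly without a user-space byte[]. A file-to-file copy is used in this sketch so it runs anywhere; pointing the target at a SocketChannel gives the file-to-network case discussed above. The class name and file names are invented:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// FileChannel#transferTo moves bytes between channels inside the kernel,
// avoiding the copy into a user-space byte[].
public class TransferToDemo {
    static long copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                                                StandardOpenOption.WRITE,
                                                StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0, remaining = in.size();
            while (remaining > 0) { // transferTo may move fewer bytes than requested
                long sent = in.transferTo(position, remaining, out);
                position += sent;
                remaining -= sent;
            }
            return position; // total bytes transferred
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("zerocopy", ".txt");
        Files.write(src, "hello zero copy".getBytes());
        Path dst = Files.createTempFile("zerocopy", ".out");
        System.out.println(copy(src, dst) + " bytes transferred");
    }
}
```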

As can be seen, kernel memory also matters greatly for Java program performance. When dividing up system memory, be sure to leave some available space for the kernel.

3. Zero Copy

Traditional I/O: writing a file out through a socket

File f = new File("helloword/data.txt");
RandomAccessFile file = new RandomAccessFile(f, "r"); // note: open f, not an undefined variable

byte[] buf = new byte[(int) f.length()];
file.readFully(buf);                 // read the file into the user buffer

Socket socket = ...;
socket.getOutputStream().write(buf); // write the user buffer out through the socket

The internal workflow is as follows:

[Figure: internal workflow of traditional I/O]

User/kernel mode switches occur three times (a relatively heavyweight operation), and the data is copied four times.

  1. Java itself has no ability to read or write I/O devices, so after the read method is called, execution switches from user mode to kernel mode, and the operating system's (kernel's) read capability reads the data into the kernel buffer. The user thread is blocked during this period, and the operating system uses DMA (direct memory access) to read the file, without using the CPU.

    DMA can be understood as a hardware unit that frees the CPU from performing file I/O.

  2. Execution switches from kernel mode back to user mode, and the data is copied from the kernel buffer into the user buffer (i.e. byte[] buf). The CPU participates in this copy; DMA cannot be used.

  3. The write method is called, and the data is copied from the user buffer (byte[] buf) into the socket buffer; the CPU participates in this copy.

  4. Finally the data must be written to the network card. Java has no such ability either, so execution switches from user mode to kernel mode again and calls the operating system's write capability; DMA writes the data from the socket buffer to the network card, without using the CPU.

As you can see, there are many intermediate steps. Java I/O is not actually reading and writing at the physical-device level; it is copying between caches. The real low-level reading and writing is done by the operating system.

3.1 mmap + write

mmap + write replaces read + write. mmap is a method of memory-mapping files: a file (or other object) is mapped into the process's address space, establishing a one-to-one correspondence between the file's disk address and a virtual address in the process's virtual address space.

This saves the original copy from the kernel read buffer to the user buffer, but the copy from the kernel read buffer to the kernel socket buffer is still needed.

[Figure: data flow of mmap + write]

This approach eliminates one data copy (the copy from kernel space to user space), but does not reduce the number of user/kernel mode switches (still three).

Code implementation:

File f = new File("helloword/data.txt");
RandomAccessFile file = new RandomAccessFile(f, "r");
// The file is not actually loaded into memory here; it is treated as part of
// virtual memory and paged in only when it is used.
MappedByteBuffer buf = file.getChannel().map(FileChannel.MapMode.READ_ONLY, 0, f.length());

Socket socket = ...;




The map method has three parameters: mapMode, position, and size:

  • mapMode: the mapping mode; the options are READ_ONLY, READ_WRITE, and PRIVATE.
  • position: the byte offset at which the mapping starts.
  • size: how many bytes to map, counting from position.

Let's focus on mapMode. The first two modes mean read-only and read-write respectively. Of course, the requested mapping mode is constrained by the access rights of the FileChannel object: if your FileChannel is read-only, requesting READ_WRITE raises an error, and opening READ_ONLY on a file without read permission throws a NonReadableChannelException.

PRIVATE mode is a copy-on-write mapping: any modification made through the put() method produces a private copy of the data that only this MappedByteBuffer instance can see. The process makes no changes to the underlying file, and once the buffer is garbage-collected, those modifications are lost.
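The copy-on-write behavior of PRIVATE mode can be demonstrated directly (the class name and temp-file contents below are invented for the sketch):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// A PRIVATE mapping accepts put(), but the change stays in the process's
// private copy; the file on disk is untouched.
public class PrivateMapDemo {
    static byte[] run() throws IOException {
        Path p = Files.createTempFile("cow", ".bin");
        Files.write(p, new byte[]{'A', 'B', 'C'});
        try (RandomAccessFile raf = new RandomAccessFile(p.toFile(), "rw")) {
            MappedByteBuffer buf =
                raf.getChannel().map(FileChannel.MapMode.PRIVATE, 0, 3);
            buf.put(0, (byte) 'Z');                // visible only through this buffer
            byte inBuffer = buf.get(0);            // 'Z'
            byte[] onDisk = Files.readAllBytes(p); // still starts with 'A'
            return new byte[]{inBuffer, onDisk[0]};
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] r = run();
        System.out.println((char) r[0] + " " + (char) r[1]); // Z A
    }
}
```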

Browse the source code of the map () method:

public MappedByteBuffer map(MapMode mode, long position, long size)
    throws IOException
{
    // ... argument checks omitted ...
    int pagePosition = (int) (position % allocationGranularity);
    long mapPosition = position - pagePosition;
    long mapSize = size + pagePosition;
    try {
        // If no exception was thrown from map0, the address is valid
        addr = map0(imode, mapPosition, mapSize);
    } catch (OutOfMemoryError x) {
        // An OutOfMemoryError may indicate that we've exhausted memory,
        // so force gc and re-attempt map
        System.gc();
        try {
            Thread.sleep(100);
        } catch (InterruptedException y) {
            Thread.currentThread().interrupt();
        }
        try {
            addr = map0(imode, mapPosition, mapSize);
        } catch (OutOfMemoryError y) {
            // After a second OOME, fail
            throw new IOException("Map failed", y);
        }
    }

    // On Windows, and potentially other platforms, we need an open
    // file descriptor for some mapping operations.
    FileDescriptor mfd;
    try {
        mfd = nd.duplicateForMapping(fd);
    } catch (IOException ioe) {
        unmap0(addr, mapSize);
        throw ioe;
    }

    assert (IOStatus.checkAll(addr));
    assert (addr % allocationGranularity == 0);
    int isize = (int) size;
    Unmapper um = new Unmapper(addr, mapSize, isize, mfd);
    if ((!writable) || (imode == MAP_RO)) {
        return Util.newMappedByteBufferR(isize, addr + pagePosition, mfd, um);
    } else {
        return Util.newMappedByteBuffer(isize, addr + pagePosition, mfd, um);
    }
}

Roughly, it obtains the memory-mapped address through a native method; if that fails with an OutOfMemoryError due to insufficient memory, it manually triggers a GC and attempts the mapping again.

Finally, a MappedByteBuffer is instantiated from the mapped address. MappedByteBuffer itself is an abstract class; what is actually instantiated here is a DirectByteBuffer.

Note 1: this memory (the off-heap memory, not the MappedByteBuffer object itself) is not managed by JVM garbage collection, so its address is fixed, which helps I/O reads and writes.

The DirectByteBuffer object in Java only holds a phantom reference to this memory; reclamation happens in two steps:

  1. The DirectByteBuffer object is garbage-collected, and its phantom reference is added to a reference queue.
  2. A dedicated thread polls the reference queue (sun.misc.Cleaner) and releases the off-heap memory according to the phantom reference.

Note 2: because MappedByteBuffer is implemented with virtual memory, when the data is flushed to disk is decided by the operating system. One advantage, though, is that even if your Java program crashes after writing to the mapped memory, as long as the operating system keeps running, the data will still be written to disk.

java.nio.MappedByteBuffer#force can force the operating system to write the contents of memory to disk, but it is better used sparingly.

The MappedByteBuffer#get procedure:

public byte get() {
    return ((unsafe.getByte(ix(nextGetIndex()))));
}

public byte get(int i) {
    return ((unsafe.getByte(ix(checkIndex(i)))));
}

private long ix(int i) {
    return address + (i << 0);
}

The map0() function returns an address, so the file can be operated on through that address without calling read or write methods. At the bottom, the Unsafe.getByte method fetches the data at the specified memory location via (address + offset).

  1. Accessing the memory region pointed to by address for the first time triggers a page fault. The fault handler looks for the corresponding page in the swap area; if it is not found (i.e. the file has never been read into memory), it reads the specified page of the file from disk into physical memory (not JVM heap memory).
  2. If physical memory runs short while copying data, the virtual memory mechanism swaps temporarily unused physical pages out to disk (swap space).
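The read path above can be demonstrated with a minimal sketch using the public FileChannel#map API (the temporary file and its contents are invented for the example):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapReadDemo {
    // Maps a file read-only and returns the byte at the given index.
    // The first access to a fresh mapping triggers the page fault described above.
    static byte mappedByteAt(Path file, int index) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return buf.get(index); // resolved as (address + index), no read() syscall
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("mmap", ".bin");
        Files.write(p, new byte[] {10, 20, 30, 40});
        System.out.println(mappedByteAt(p, 2)); // prints 30
        Files.delete(p);
    }
}
```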


  1. MappedByteBuffer uses virtual memory, so the size of a mapping is not limited by the JVM's -Xmx parameter, but it is limited by size (at most Integer.MAX_VALUE, about 2 GB, according to the source).

    For the source, see sun.nio.ch.FileChannelImpl#map.

    Essentially this is because java.nio.MappedByteBuffer inherits directly from java.nio.ByteBuffer, whose index is of type int, so a MappedByteBuffer can only be indexed up to Integer.MAX_VALUE; the map method of FileChannel therefore checks the validity of its parameters.

  2. The size of a single mapping is best limited to about 1.5 GB.

  3. When the file exceeds the limit, you can remap the rest of the file through the position parameter.

  4. Operating on large files with MappedByteBuffer is faster than with I/O streams (for small files, memory-mapped files waste fragmented space, because a memory mapping must always be aligned to page boundaries, the minimum unit being 4 KiB: mapping a 5 KiB file occupies 8 KiB of memory, wasting 3 KiB).

  5. The memory of the loaded file is outside the Java heap, allowing two different processes to access the same file.

  6. Do not call MappedByteBuffer#force() often. It forces the operating system to write the memory contents to disk, so if you call force() after every write to the memory-mapped file, you lose the real benefit of memory mapping and it behaves just like plain disk I/O.
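Point 3 above (remapping through the position parameter) can be sketched as a windowed scan over a file; the window size here is tiny purely for illustration:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkedMapDemo {
    // Sums all bytes of a file by mapping it window by window, so files
    // larger than Integer.MAX_VALUE could also be processed this way.
    static long sumBytes(Path file, long window) throws IOException {
        long total = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += window) {
                long len = Math.min(window, size - pos);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.hasRemaining()) total += buf.get();
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("chunks", ".bin");
        byte[] data = new byte[10];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        Files.write(p, data);
        System.out.println(sumBytes(p, 4)); // 0+1+...+9 = 45
        Files.delete(p);
    }
}
```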

2.0 sendfile mode

The sendfile system call was introduced in kernel version 2.1 to simplify the data transfer between two channels over the network.

The introduction of sendfile system call not only reduces data replication, but also reduces the number of context switches.

[figure: sendfile data flow]

Java calls the transferTo/transferFrom methods to copy data between two channels. Compared with the approach above, this cuts two switches between Java code (user mode) and the operating system (kernel mode); not even a ByteBuffer object is created.

[figure: transferTo data flow]
  1. After Java calls the transferTo method, user mode switches to kernel mode, and the data is read into the kernel buffer using DMA, without the CPU.
  2. The data is then transferred from the kernel buffer to the socket buffer; the CPU participates in this copy.
  3. Finally, DMA writes the data from the socket buffer to the network card, again without using the CPU.

Code implementation:

File f = new File("helloword/data.txt");
RandomAccessFile file = new RandomAccessFile(f, "r");

Socket socket = ...;
file.getChannel().transferTo(0, f.length(), socket.getChannel());
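The same transferTo API also works channel-to-channel between files. Below is a self-contained sketch (the temporary file paths and contents are invented for the demo); note that transferTo may transfer fewer bytes than requested, so it should be called in a loop:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TransferToDemo {
    // Copies src to dst via FileChannel#transferTo (backed by sendfile on
    // Linux where possible); returns the number of bytes copied.
    static long copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            long pos = 0, size = in.size();
            while (pos < size) {
                pos += in.transferTo(pos, size - pos, out); // may copy less than asked
            }
            return pos;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".txt");
        Path dst = Files.createTempFile("dst", ".txt");
        Files.write(src, "hello zero copy".getBytes());
        copy(src, dst);
        System.out.println(new String(Files.readAllBytes(dst))); // hello zero copy
        Files.delete(src);
        Files.delete(dst);
    }
}
```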

3.0 further optimization of Linux 2.4

We can see that the previous method still performs one CPU copy inside kernel space, so can this copy be saved as well?

Linux 2.4 improved the kernel: the description of the data in the kernel buffer (memory address and offset) is recorded into the corresponding socket buffer, so that even the one remaining CPU copy inside kernel space is eliminated.

[figure: transferTo data flow on Linux 2.4+]
  1. After Java calls the transferTo method, user mode switches to kernel mode, and the data is read into the kernel buffer using DMA, without the CPU.
  2. Only some offset and length information is copied into the socket buffer, at almost no cost.
  3. DMA writes the data from the kernel buffer to the network card, without using the CPU.

In the whole process there is only one switch between user mode and kernel mode, and the data is copied only twice.


The so-called "zero copy" does not mean there is no copying at all, but that the data is not copied redundantly into JVM memory. The advantages of zero copy are:

  • Fewer switches between user mode and kernel mode
  • The CPU is not used for the copies (the CPU is needed whenever memory-to-memory copying is involved), and pollution of the CPU cache is reduced (zero copy uses DMA for the data transfer, so the data does not pass through the CPU at all)
  • Zero copy is best suited to transferring small files (if the file is large, the kernel buffer fills up and the page cache brings little benefit)

4、 Other zero copies

4.1 Netty

Zero-copy in Netty is different from the OS-level zero-copy described above. Netty's zero-copy happens entirely in user mode (at the Java level), and is more a matter of optimizing data operations.

Netty provides zero-copy buffers. When transmitting data, the messages being processed often need to be combined or split, which NIO's native ByteBuffer cannot do. Netty implements zero copy by providing composite and slice buffers (reducing the copying needed to combine data).

[figure: an HTTP message split across two ChannelBuffers]

At the TCP layer an HTTP message may arrive split into two ChannelBuffers, which are meaningless to our upper-layer logic (HTTP processing).

However, when the two ChannelBuffers are combined they become a meaningful HTTP message; the ChannelBuffer corresponding to this message is what can be called a "message". The term used here is "virtual buffer".

Take a look at the CompositeChannelBuffer source code provided by Netty:

public class CompositeChannelBuffer extends AbstractChannelBuffer {

    private final ByteOrder order;
    private ChannelBuffer[] components;
    private int[] indices;
    private int lastAccessedComponentId;
    private final boolean gathering;

    public byte getByte(int index) {
        int componentId = componentId(index);
        return components[componentId].getByte(index - indices[componentId]);
    }
    // ... omitted
}

components holds all the received buffers, indices records the starting position of each buffer, and lastAccessedComponentId records the id of the last accessed component.

CompositeChannelBuffer does not allocate new memory and copy the contents of all the ChannelBuffers into it; instead it keeps references to the ChannelBuffers and delegates reads and writes to the sub-ChannelBuffers, achieving zero copy.
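The same idea can be imitated in plain Java without a Netty dependency (CompositeView is a made-up name for this sketch): keep references to the segments and translate a logical index into (component, local offset) on each access, instead of copying everything into one contiguous array:

```java
// Plain-Java sketch of the composite-buffer idea (no Netty dependency).
public class CompositeView {
    private final byte[][] components; // references to the original segments, no copying
    private final int[] indices;       // logical start position of each segment

    public CompositeView(byte[]... components) {
        this.components = components;
        this.indices = new int[components.length];
        for (int i = 1; i < components.length; i++) {
            indices[i] = indices[i - 1] + components[i - 1].length;
        }
    }

    // Translate the logical index into (component, local offset),
    // analogous to what CompositeChannelBuffer#getByte does.
    public byte getByte(int index) {
        int id = components.length - 1;
        while (indices[id] > index) {
            id--; // walk back to the segment containing this index
        }
        return components[id][index - indices[id]];
    }
}
```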


4.2 other zero copies

RocketMQ messages are written sequentially to the commitlog file, with the consume queue files serving as the index.

RocketMQ responds to consumers' requests using mmap + write zero copy.

Similarly, Kafka persists a large amount of network data to disk and sends disk files over the network; Kafka uses the sendfile zero-copy approach.

5、 Benchmarking the copy speed of several file APIs

contestant                             3.8M        394M        800M
FileInputStream                        16s 870ms   –           –
BufferedInputStream                    170~180ms   15s 989ms   32s 325ms
BufferedInputStream with byte[1024]    50~65ms     1s 243ms    3s 418ms
RandomAccessFile with byte[1024]       50~65ms     2s 663ms    5s 782ms
FileChannel#write()                    80~90ms     3s          5s 494ms
FileChannel#transferTo()               30ms        593ms       2s 404ms
MappedByteBuffer                       30~50ms     1s 286ms    4s 968ms

First place: FileChannel#transferTo()

Second place: BufferedInputStream with byte[1024], unexpectedly

Third place: MappedByteBuffer

Fourth place: RandomAccessFile with byte[1024]

Fifth place: FileChannel#write(), which I did not expect

Reference articles

Virtual address and virtual memory

Detailed explanation of memory paging -> contains some errors

Linux disk caching mechanism

Between page caching, memory, and files -> close to the low level; hard to follow

Relationship and evolution history of page cache and buffer cache in the Linux kernel

Linux memory management: page frame reclamation in plain language

Detailed explanation of the memory relationship between the JVM and Linux

Java memory mapping: easily processing multi-GB files

Why use memory-mapped files in Java

MappedByteBuffer: I can hold files of any size -> contains some errors

How good is the zero-copy technology in Netty and Kafka?