V8 memory management and garbage collection mechanism

Time:2020-9-29

Because it runs on the V8 engine, Node limits how much memory JavaScript objects can occupy. By default the limit is about 1.4 GB on 64-bit machines and about 0.7 GB on 32-bit machines.
If a Node program regularly works with large in-memory objects, these defaults can be changed when starting the process:

node --max-old-space-size=1700 index.js
node --max-new-space-size=1024 index.js

Here, max-old-space-size sets the maximum size of the old generation memory space (in megabytes), and max-new-space-size sets the maximum size of the new generation memory space. Neither value can be raised indefinitely: the largest old generation is about 1.7 GB, and the largest new generation is about 1.0 GB. Both the flag names and these limits depend on the Node and V8 version; newer releases, for example, replace max-new-space-size with max-semi-space-size.
The new generation and the old generation are the two categories into which V8 divides memory; they are introduced later.
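To check which limit is actually in effect for a running process, one option is the built-in v8 module. The following is a minimal sketch, assuming a Node version that provides v8.getHeapStatistics(); the variable names are only illustrative.

```javascript
const v8 = require('v8');

// heap_size_limit is reported in bytes; convert it to megabytes.
const stats = v8.getHeapStatistics();
const heapLimitMB = stats.heap_size_limit / 1024 / 1024;

console.log(`V8 heap size limit: ${heapLimitMB.toFixed(0)} MB`);
```

Starting the same script with a different max-old-space-size value should change the number that is printed.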
Back to the problem of large objects: what if 1.7 GB is still not enough? The limit is imposed at the V8 level, so one way around it is the Buffer object. Memory for a Buffer is allocated at the C++ level, outside the V8 heap, and is therefore not subject to the V8 limit.
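A rough way to observe this is to compare process.memoryUsage() before and after allocating a Buffer. This is only a sketch: the 64 MB size is arbitrary, and the exact deltas depend on the Node version and on when the garbage collector happens to run.

```javascript
const SIZE = 64 * 1024 * 1024; // 64 MB, an arbitrary size for illustration

const before = process.memoryUsage();
const buf = Buffer.alloc(SIZE); // backing memory is allocated outside the V8 heap
const after = process.memoryUsage();

// heapUsed should move far less than SIZE, because only a small
// wrapper object for the Buffer lives on the V8 heap.
const heapGrowth = after.heapUsed - before.heapUsed;
const externalGrowth = after.external - before.external;

console.log('buffer length :', buf.length);
console.log('heapUsed delta:', heapGrowth);
console.log('external delta:', externalGrowth);
```

The point of the comparison is that the heapUsed delta stays small relative to the Buffer size, while the memory shows up in the off-heap figures instead.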
Note: process.memoryUsage() reports the memory usage of the current process, including the V8 heap, while os.totalmem() and os.freemem() report the total memory and the free memory of the operating system respectively.

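const os = require('os');

// View V8 memory usage (example output shown in the comments)
console.log(process.memoryUsage());
// {
//   rss: 31469568,
//   heapTotal: 7708672,
//   heapUsed: 5152856,
//   external: 8609
// }

// View total operating system memory
console.log(os.totalmem()); // e.g. 8279511040
// View the free memory of the operating system
console.log(os.freemem()); // e.g. 1610977280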

The memory obtained by the above methods is in bytes.
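Since raw byte counts are hard to read, a small helper can convert them to megabytes. The toMB function below is a hypothetical helper written for this example, not part of Node.

```javascript
const os = require('os');

// Convert a byte count to a string in megabytes with two decimals.
function toMB(bytes) {
  return (bytes / 1024 / 1024).toFixed(2) + ' MB';
}

const usage = process.memoryUsage();
console.log('rss      :', toMB(usage.rss));
console.log('heapTotal:', toMB(usage.heapTotal));
console.log('heapUsed :', toMB(usage.heapUsed));
console.log('total RAM:', toMB(os.totalmem()));
console.log('free RAM :', toMB(os.freemem()));
```

For instance, toMB(31469568) gives "30.01 MB", which is the rss value from the sample output above.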

New generation and old generation

V8 divides heap memory into two areas: the new generation memory space and the old generation memory space. The new generation mainly holds short-lived objects, and the old generation mainly holds objects that have survived longer. The two generations use different garbage collection strategies, which are introduced in turn.
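The two areas can be inspected at runtime through the built-in v8 module. The sketch below assumes that v8.getHeapSpaceStatistics() is available and that the spaces are reported under the names new_space and old_space; V8 also reports other spaces, which are skipped here.

```javascript
const v8 = require('v8');

// Each entry describes one V8 heap space; sizes are in bytes.
const spaces = v8.getHeapSpaceStatistics();

for (const s of spaces) {
  if (s.space_name === 'new_space' || s.space_name === 'old_space') {
    console.log(s.space_name, 'size:', s.space_size, 'used:', s.space_used_size);
  }
}

// Names of all reported spaces, kept for inspection.
const spaceNames = spaces.map((s) => s.space_name);
```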

New generation garbage collection

Garbage collection in the new generation is implemented mainly by the Scavenge algorithm, whose concrete implementation is based on Cheney's algorithm. Cheney's algorithm divides the memory space into two halves, each called a semispace. At any time one semispace is in use and the other is idle. The semispace in use is called the From space, and the idle one is called the To space.
When garbage collection runs, the objects in the From space are checked. Objects that are still alive are copied to the To space, while objects that need to be recycled are left behind in the From space. The roles of the two spaces are then swapped: the old To space becomes the new From space, and the old From space, which now holds only garbage, becomes the new To space and is released as a whole, as shown in the following figure:

[Figure: New generation garbage collection]

In short, after the swap the surviving objects sit in the From space and the objects to be recycled sit in the To space, so the collector reclaims everything in the To space at once.
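The copy-and-swap idea can be sketched with a toy model. This is not how V8 lays out memory: objects are modelled as plain records with hypothetical id and refs fields, and each semispace is just an array.

```javascript
// Toy model of a semispace (Scavenge / Cheney-style) collection.
function scavenge(roots) {
  const toSpace = [];          // the idle semispace that receives survivors
  const forwarded = new Map(); // original object -> its copy in toSpace

  // Copy one object into toSpace unless it has been copied already.
  function copy(obj) {
    if (!forwarded.has(obj)) {
      const clone = { id: obj.id, refs: obj.refs.slice() };
      forwarded.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarded.get(obj);
  }

  const newRoots = roots.map(copy);

  // Cheney scan: walk toSpace breadth-first, copying whatever the
  // already-copied objects reference and updating their pointers.
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map(copy);
  }

  // The caller treats toSpace as the new From space; the old From space,
  // including every unreachable object, is dropped as a whole.
  return { fromSpace: toSpace, roots: newRoots };
}

const c = { id: 'c', refs: [] };
const b = { id: 'b', refs: [c] };
const a = { id: 'a', refs: [b] };
const garbage = { id: 'garbage', refs: [a] }; // nothing points to it
let fromSpace = [a, b, c, garbage];

const result = scavenge([a]);
fromSpace = result.fromSpace; // swap: the survivors form the new From space
const survivorIds = fromSpace.map((o) => o.id);
console.log(survivorIds); // [ 'a', 'b', 'c' ]
```

Only objects reachable from the roots are ever touched, which is why this scheme suits a space where most objects die young.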

Promotion of new generation objects

As mentioned earlier, the new generation memory space holds objects with shorter lifetimes, while the old generation memory space holds objects with longer lifetimes. An object is promoted from the new generation to the old generation in either of two cases:

1. The object has already survived a collection

During garbage collection, if an object is found to have already survived a previous Scavenge, it is promoted to the old generation memory space.

2. The To space is too full

When an object is copied from the From space to the To space, if more than 25% of the To space is already in use, the object is promoted directly to the old generation memory space. The threshold exists because the To space becomes the From space after the swap, and a nearly full From space would leave little room for new allocations.
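The two promotion rules can be written as a small decision function. This is a sketch of the rules as described above, not V8's actual implementation; survivedBefore, toSpaceUsed and toSpaceSize are hypothetical inputs chosen for the example.

```javascript
// Toy promotion check for one object that is alive during a Scavenge.
const TO_SPACE_LIMIT = 0.25; // the 25% threshold described above

function shouldPromote({ survivedBefore, toSpaceUsed, toSpaceSize }) {
  // Rule 1: the object already survived an earlier collection.
  if (survivedBefore) return true;
  // Rule 2: the To space is already more than 25% full.
  return toSpaceUsed / toSpaceSize > TO_SPACE_LIMIT;
}

console.log(shouldPromote({ survivedBefore: true, toSpaceUsed: 0, toSpaceSize: 16 }));  // true
console.log(shouldPromote({ survivedBefore: false, toSpaceUsed: 2, toSpaceSize: 16 })); // false
console.log(shouldPromote({ survivedBefore: false, toSpaceUsed: 5, toSpaceSize: 16 })); // true
```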

Old generation garbage collection

Having covered garbage collection in the new generation, let's look at the old generation. First of all, the old generation memory space is structured differently from the new generation: it is a single continuous region rather than being split into From and To halves.

[Figure: Old generation memory space]

Garbage collection in the old generation memory space uses two techniques: Mark-Sweep and Mark-Compact.

Mark-Sweep

Mark-Sweep works in two phases. The mark phase traverses the heap and marks the objects that are still alive; the sweep phase then releases the address space of every unmarked object in place, as shown in the following figure (the red memory areas are the ones to be recycled).

[Figure: Mark-Sweep]

As the figure above shows, Mark-Sweep has a drawback: the free memory is fragmented after collection. Mark-Compact was proposed to solve this problem.
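The fragmentation is easy to see in a toy model where the heap is an array of slots. The record layout (id, refs, marked) is an assumption made for this sketch and does not reflect V8's internal structures.

```javascript
// Mark phase: flag every object reachable from the given object.
function mark(obj) {
  if (obj.marked) return;
  obj.marked = true;
  obj.refs.forEach(mark);
}

// Toy Mark-Sweep: a slot holds an object or null when it is free.
function markSweep(heap, roots) {
  roots.forEach(mark);
  // Sweep phase: release every unmarked slot in place.
  return heap.map((slot) => {
    if (slot === null || !slot.marked) return null;
    slot.marked = false; // reset the flag for the next cycle
    return slot;
  });
}

const d = { id: 'd', refs: [], marked: false };
const c = { id: 'c', refs: [], marked: false };
const b = { id: 'b', refs: [d], marked: false };
const a = { id: 'a', refs: [], marked: false };

// Only b is a root, so a and c are unreachable.
const heap = markSweep([a, b, c, d], [b]);
const layout = heap.map((slot) => (slot ? slot.id : 'free'));
console.log(layout); // [ 'free', 'b', 'free', 'd' ]
```

The two free slots are not adjacent, so an allocation that needs two contiguous slots would fail even though enough memory is free in total.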

Mark-Compact

Mark-Compact resembles Cheney's algorithm used in the new generation in that surviving objects are moved. After marking, the live objects are moved toward one end of the memory region, and everything beyond the boundary of the live objects is then reclaimed in one pass.

[Figure: Mark-Compact]

The figure above shows the process of garbage collection using mark compact in the old generation memory space.
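The compaction step can be sketched on a toy heap made of an array of slots. The object shape and the markCompact function are assumptions for illustration; a real collector also has to rewrite pointers to moved objects, which this model avoids because JavaScript references stay valid.

```javascript
// Toy Mark-Compact: a slot holds an object { id, refs } or null when free.
// Returns the index of the first free slot after compaction.
function markCompact(heap, roots) {
  // Mark phase: collect every object reachable from the roots.
  const live = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (live.has(obj)) continue;
    live.add(obj);
    stack.push(...obj.refs);
  }

  // Compact phase: slide live objects toward index 0, keeping their order.
  let free = 0;
  for (let i = 0; i < heap.length; i++) {
    const slot = heap[i];
    if (slot !== null && live.has(slot)) {
      heap[free++] = slot;
    }
  }

  // Everything past the boundary is released in one pass.
  for (let i = free; i < heap.length; i++) {
    heap[i] = null;
  }
  return free;
}

const d = { id: 'd', refs: [] };
const c = { id: 'c', refs: [] };
const b = { id: 'b', refs: [d] };
const a = { id: 'a', refs: [] };

const heap = [a, b, c, d];
const boundary = markCompact(heap, [b]); // only b is a root
const layout = heap.map((slot) => (slot ? slot.id : 'free'));
console.log(layout, boundary); // [ 'b', 'd', 'free', 'free' ] 2
```

Compared with Mark-Sweep, the free slots end up contiguous, at the cost of moving the surviving objects.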

Summary

This article introduced memory management and garbage collection in Node:

  • The memory limit that V8 imposes on large objects and how to work around it
  • Several methods of getting memory usage
  • The concept of the new generation and the old generation
  • Garbage collection schemes for the new generation and the old generation