5. JVM series (garbage collection mechanism)

Time:2022-1-14

Preface

In Java, an object that can no longer be reached through any reference is garbage, and it should be reclaimed so that it stops occupying memory.

How is garbage identified?

1. Reference counting

Reference counting is the first approach that comes to mind. Each time an object gains a reference, its count is incremented by 1; each time a reference is removed, the count is decremented by 1. When the count reaches 0, the object is garbage and should be reclaimed.
Consider the following example:

Example 1
String a = new String("hello");
a = null;

We created an object of type String whose value is "hello", and the reference a points to it. At this point the object is referenced by a, so its reference count is 1. Then we set a = null, which removes that reference and decrements the count by 1. The count is now 0, so the object is considered garbage and should be reclaimed.
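The counting in Example 1 can be sketched with a toy class. This is only an illustration of the bookkeeping: CountedObject and its methods are hypothetical names, and the JVM does not expose or maintain such a counter for ordinary objects.

```java
// Toy model of reference counting (hypothetical class, not a JVM API).
class CountedObject {
    private final String value;
    private int refCount = 0;   // number of references currently pointing at this object

    CountedObject(String value) { this.value = value; }

    String value()         { return value; }
    void addReference()    { refCount++; }        // a reference starts pointing here
    void removeReference() { refCount--; }        // a reference is removed
    int refCount()         { return refCount; }
    boolean isGarbage()    { return refCount == 0; }
}

class RefCountDemo {
    public static void main(String[] args) {
        CountedObject hello = new CountedObject("hello");
        hello.addReference();                     // models: String a = new String("hello");
        System.out.println(hello.isGarbage());    // false
        hello.removeReference();                  // models: a = null;
        System.out.println(hello.isGarbage());    // true
    }
}
```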

Example 2
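public class ReferenceCountingGC {

  // Reference to another ReferenceCountingGC object, used to build a cycle
  public Object instance;

  // name only labels the object in this example; it is not stored
  public ReferenceCountingGC(String name) {
  }

  public static void testGC(){

    ReferenceCountingGC a = new ReferenceCountingGC("objA");
    ReferenceCountingGC b = new ReferenceCountingGC("objB");

    // The two objects now reference each other
    a.instance = b;
    b.instance = a;

    // Drop the external references; only the cycle remains
    a = null;
    b = null;
  }
}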

In Example 2 we create two ReferenceCountingGC objects, pointed to by a and b, and each object has a member variable instance.
Then a.instance = b makes the instance field of the object referenced by a point to the object referenced by b, and b.instance = a makes the instance field of the object referenced by b point to the object referenced by a. Finally we set a = null and b = null. Let's look at the following figure:

image.png

The two objects are no longer referenced by a and b, but their instance fields still refer to each other: the instance field of object A points to object B, and the instance field of object B points to object A. Both objects are therefore still referenced after we set a = null and b = null, and neither reference count is 0. Reference counting alone cannot tell that they are garbage, so they would never be reclaimed and their memory would be wasted.
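To make the stuck counts concrete, here is a minimal sketch that replays Example 2 with explicit counters. CountedNode and CycleDemo are hypothetical names introduced for illustration; they model what a pure reference-counting collector would record, not what the JVM actually does.

```java
// Toy model: each node keeps its own reference count, as a reference-counting collector would.
class CountedNode {
    final String name;
    int refCount = 0;
    CountedNode instance;          // mirrors the instance field in Example 2

    CountedNode(String name) { this.name = name; }

    // Point this node's field at target and update both affected counts.
    void setInstance(CountedNode target) {
        if (instance != null) instance.refCount--;
        instance = target;
        if (target != null) target.refCount++;
    }
}

class CycleDemo {
    // Replays Example 2 and returns the final counts of objA and objB.
    static int[] run() {
        CountedNode objA = new CountedNode("objA");
        objA.refCount++;           // local variable a points at objA
        CountedNode objB = new CountedNode("objB");
        objB.refCount++;           // local variable b points at objB
        objA.setInstance(objB);    // a.instance = b
        objB.setInstance(objA);    // b.instance = a
        objA.refCount--;           // a = null
        objB.refCount--;           // b = null
        return new int[] { objA.refCount, objB.refCount };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println(counts[0] + " " + counts[1]);   // prints "1 1"
    }
}
```

Both counts stay at 1 even though nothing outside the pair can reach either object, which is exactly the case reference counting cannot detect.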

2. GC roots tracing algorithm (reachability analysis)

The general process of the GC roots tracing algorithm is as follows:
Starting from the GC roots, every object that can be reached by following references is live, and every object that cannot be reached is treated as garbage. So what are the GC roots?
The GC roots are a set of active references, including:

  • All currently loaded classes
  • Reference-type static variables of Java classes
  • Reference-type constants in the runtime constant pool of Java classes
  • References to objects in the GC heap held by some static data structures of the VM
  • etc.

The general process is shown below:
image.png
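The tracing step can be sketched as a plain graph traversal. In this simplified model, objects are strings, refs maps each object to the objects it references, and roots lists the objects directly referenced by GC roots; all of these names are assumptions made for the example, not JVM structures.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Reachability {
    // Returns every object reachable from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> live = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (live.add(obj)) {   // first visit: mark it and follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("x", List.of("y"));
        refs.put("objA", List.of("objB"));   // the cycle from Example 2
        refs.put("objB", List.of("objA"));
        // Only x is referenced by a root; objA and objB are never visited.
        System.out.println(reachable(refs, List.of("x")));
    }
}
```

Because objA and objB are not reachable from any root, they are absent from the live set even though they reference each other, so the cycle that defeated reference counting is treated as garbage here.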

How is garbage collected?

Once the JVM has identified the garbage, how does it reclaim it? In short, there are three garbage collection algorithms: mark-sweep, copying, and mark-compact.

  • Mark-sweep algorithm
    It has two phases, marking and sweeping. In the marking phase, every object reachable from the GC roots is marked; the objects left unmarked are garbage. In the sweeping phase, the unmarked objects are cleared. The drawback is heavy memory fragmentation: the free memory is no longer contiguous. Objects can still be allocated in fragmented memory, but allocation is less efficient than in contiguous memory.
  • Copying algorithm
    Memory is divided into two blocks, and only one is used at a time. During garbage collection, the live objects in the block in use, found by tracing from the GC roots, are copied to the unused block. Everything in the block that was in use is then cleared, and the two blocks swap roles. The drawback is that only half of the memory is usable at any time, which is a significant waste.
  • Mark-compact algorithm
    This is an optimized version of mark-sweep, with a marking phase and a compaction phase. The marking phase is the same as in mark-sweep. In the compaction phase, all live objects are moved to one end of memory, and everything beyond the boundary is reclaimed.
  • Summary
    Mark-sweep produces a lot of fragmentation, but it does not need to move objects, so it suits regions where many objects survive.
    Copying halves the usable memory, but it leaves memory contiguous and suits regions where few objects survive.
    Mark-compact is an optimized version of mark-sweep: it also marks, but then moves the live objects to one end so that the free memory is contiguous.
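The difference between the three algorithms is easiest to see on a toy heap. The sketch below assumes a heap modelled as an array in which each slot holds one object name or null when free, and it takes the set of marked (live) objects as given. ToyCollectors and its methods are hypothetical, and real collectors work on raw memory rather than arrays of names.

```java
import java.util.Arrays;
import java.util.Set;

// Toy heap: each array slot holds one object name, or null when the slot is free.
class ToyCollectors {
    // Mark-sweep: free unmarked slots; survivors stay where they are, leaving holes.
    static String[] markSweep(String[] heap, Set<String> marked) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !marked.contains(result[i])) result[i] = null;
        }
        return result;
    }

    // Copying: copy marked objects into a separate, empty to-space; the from-space is discarded.
    static String[] copy(String[] fromSpace, Set<String> marked) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null && marked.contains(obj)) toSpace[next++] = obj;
        }
        return toSpace;
    }

    // Mark-compact: slide marked objects to the front of the same space, then free the tail.
    // (Works on a clone only so the demo can reuse its input.)
    static String[] markCompact(String[] heap, Set<String> marked) {
        String[] result = heap.clone();
        int next = 0;
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && marked.contains(result[i])) result[next++] = result[i];
        }
        Arrays.fill(result, next, result.length, null);
        return result;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e", null };
        Set<String> marked = Set.of("a", "c", "e");
        System.out.println(Arrays.toString(markSweep(heap, marked)));   // [a, null, c, null, e, null]
        System.out.println(Arrays.toString(copy(heap, marked)));        // [a, c, e, null, null, null]
        System.out.println(Arrays.toString(markCompact(heap, marked))); // [a, c, e, null, null, null]
    }
}
```

Mark-sweep leaves holes between the survivors, while copying and mark-compact both end with contiguous free space; the copying version needs a second space of the same size to do so.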

Generational collection

We have introduced three garbage collection algorithms above, but in practice the JVM does not rely on just one of them. Instead, it chooses different algorithms according to the situation.
Generational collection means applying different garbage collection algorithms to different memory regions of the JVM. For the young generation, where few objects survive, the copying algorithm is used: because few objects survive, only a small number of live objects have to be copied to the unused area, and the used area can then be cleared as a whole. For the old generation, where more objects survive, mark-sweep or mark-compact can be used, because that way fewer objects have to be moved.
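A rough sketch of the idea follows, under several assumptions that go beyond the text above: objects are plain names, the live set is supplied by the caller, and an object is moved to the old generation after surviving PROMOTION_AGE young collections. That threshold and all class and method names are hypothetical; the sketch only shows the two generations being collected with different strategies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ToyGenerationalHeap {
    static final int PROMOTION_AGE = 2;   // assumed threshold, for illustration only
    final Map<String, Integer> young = new LinkedHashMap<>();   // name -> collections survived
    final List<String> old = new ArrayList<>();

    void allocate(String name) { young.put(name, 0); }

    // Young collection, copying style: only live objects are carried over.
    void minorGc(Set<String> live) {
        Map<String, Integer> survivors = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : young.entrySet()) {
            if (!live.contains(e.getKey())) continue;        // dead: simply not copied
            int age = e.getValue() + 1;
            if (age >= PROMOTION_AGE) old.add(e.getKey());   // long-lived: move to old generation
            else survivors.put(e.getKey(), age);
        }
        young.clear();
        young.putAll(survivors);
    }

    // Old collection, mark-sweep style: drop objects that are no longer live.
    void majorGc(Set<String> live) {
        old.removeIf(name -> !live.contains(name));
    }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = new ToyGenerationalHeap();
        heap.allocate("cache");
        heap.allocate("temp");
        heap.minorGc(Set.of("cache"));   // temp dies young; cache survives once
        heap.minorGc(Set.of("cache"));   // cache survives again and is promoted
        System.out.println(heap.young + " " + heap.old);   // {} [cache]
    }
}
```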

Partitioning

The generational approach above divides the heap into the young generation and the old generation according to how long objects live, and then applies a different garbage collection algorithm to each generation. The JVM also has a partitioning approach: the whole heap is divided into contiguous small regions, and each region is used and collected independently. The following figure summarizes the JVM garbage collection mechanism analyzed above:

image.png
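The partitioning idea, in which each region is used and collected independently, can be sketched as follows. ToyRegionHeap is a hypothetical model that stores object names per region and takes the live set as input; it does not reflect how any particular JVM collector lays out or selects its regions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ToyRegionHeap {
    // Each region holds its own set of objects and can be collected on its own.
    final List<Set<String>> regions = new ArrayList<>();

    ToyRegionHeap(int regionCount) {
        for (int i = 0; i < regionCount; i++) regions.add(new HashSet<>());
    }

    void allocate(int region, String name) { regions.get(region).add(name); }

    // Collect a single region; the other regions are left untouched.
    // Returns how many objects were reclaimed.
    int collectRegion(int region, Set<String> live) {
        Set<String> objects = regions.get(region);
        int before = objects.size();
        objects.retainAll(live);
        return before - objects.size();
    }

    public static void main(String[] args) {
        ToyRegionHeap heap = new ToyRegionHeap(2);
        heap.allocate(0, "a");
        heap.allocate(0, "b");
        heap.allocate(1, "c");
        System.out.println(heap.collectRegion(0, Set.of("a")));   // 1
        System.out.println(heap.regions.get(1));                  // [c]
    }
}
```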