JVM learning (4): garbage collection


Areas where garbage is collected: heap, method area

The heap and the method area of the runtime data area are shared by all threads, and these are the areas that get recycled

The other areas are private to each thread


When to recycle:

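The following pattern is common in development

List<Object> list = new ArrayList<>();
// Business logic code
return;

This is not ideal. list is a local variable, and it should be set to null once it is no longer needed so that the object becomes eligible for collection sooner. If the method returns right away, the reference disappears with the stack frame anyway, so this matters mainly in long-running methods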


Run the following code with the JVM parameters -XX:+PrintGCDetails -XX:+UseSerialGC

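public class ReferenceCountingGC {
    private static final int MB = 1024 * 1024;
    Object instance = null;
    // Occupies some memory so that the effect of a collection is visible in the GC log
    private byte[] size = new byte[2 * MB];

    public static void main(String[] args) {
        ReferenceCountingGC o1 = new ReferenceCountingGC();
        ReferenceCountingGC o2 = new ReferenceCountingGC();
        // The two objects reference each other
        o1.instance = o2;
        o2.instance = o1;
        o1 = null;
        o2 = null;

        // Request a collection so the log shows whether the pair is reclaimed
        System.gc();
    }
}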

The printed GC log contains entries of the form [area name: memory used before GC -> memory used after GC (total size of this area), time taken]

[Tenured: 0K->643K(349568K), 0.0016527 secs]


Reference counter: each time a reference to an object is created, its counter is increased by one; each time a reference becomes invalid, the counter is decreased by one; when the counter equals 0, the object can no longer be referenced

After new, o1 and o2 are strong references, and the two objects then reference each other. Although both variables are finally set to null, neither object's counter is 0. Under pure reference counting they would never be recycled, but the GC log shows that they are in fact recycled, so the JVM does not rely on reference counting to decide whether an object is alive
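The counter arithmetic for the two objects can be traced with a toy model. This is only a sketch: the Node class, its refCount field, and the method names are hypothetical and are not how any JVM tracks objects.

```java
// Toy model of reference counting; all names here are hypothetical.
class RefCountSketch {
    static class Node {
        int refCount = 0;   // number of references currently pointing at this node
        Node instance;      // mirrors the instance field of ReferenceCountingGC

        void pointTo(Node target) {
            instance = target;
            target.refCount++;
        }
    }

    // Replays the steps of main() above and returns both counters at the end.
    static int[] countsAfterDroppingLocals() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;       // local variable o1 refers to a
        b.refCount++;       // local variable o2 refers to b
        a.pointTo(b);       // o1.instance = o2
        b.pointTo(a);       // o2.instance = o1
        a.refCount--;       // o1 = null
        b.refCount--;       // o2 = null
        return new int[] {a.refCount, b.refCount};
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        System.out.println(counts[0] + " " + counts[1]); // prints "1 1"
    }
}
```

Both counters stay at 1 even though nothing outside the pair can reach it, which is why a pure reference counter would leak such cycles.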


Four reference types:

(1) Strong reference: the ordinary references found throughout code, such as Object obj = new Object(). As long as the strong reference exists, the object will not be recycled. If memory runs out, an OutOfMemoryError is thrown instead

(2) Soft reference: while memory is sufficient, the object is not recycled; when memory is insufficient, it is recycled before an OutOfMemoryError is thrown. A soft reference is used as follows (java.lang.ref.SoftReference)

        Object object = new Object();
        SoftReference<Object> test = new SoftReference<>(object);

(3) Weak reference: an object that is only weakly referenced survives only until the next garbage collection; it is recycled then regardless of whether memory is sufficient (java.lang.ref.WeakReference)

(4) Phantom reference (also translated as virtual reference): it has no effect on the object's lifetime and cannot be used to obtain the object. Its only purpose is to receive a system notification when the object is garbage collected (java.lang.ref.PhantomReference)
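A minimal sketch of how the reference classes behave through their get() methods. The class and method names are made up for illustration. The last lines of main call System.gc(), which is only a request, so no particular output is promised for them.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsSketch {
    // While a strong reference is still held, soft and weak references both return the object.
    static boolean reachableWhileStronglyHeld() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        return soft.get() == strong && weak.get() == strong;
    }

    // PhantomReference.get() always returns null; the reference is only useful with a queue.
    static boolean phantomGetIsNull() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(reachableWhileStronglyHeld());
        System.out.println(phantomGetIsNull());

        // No strong reference is kept to the array, so it is only weakly reachable.
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);
        System.gc(); // a request only; the JVM decides whether a collection happens
        System.out.println("weak referent cleared: " + (weak.get() == null));
    }
}
```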


Reachability analysis:

In Java, there are four types of objects that can be used as GC roots:

(1) The object referenced in the virtual machine stack (the local variable table in the stack frame)

(2) The object referenced by the static property of the class in the method area

(3) The object referenced by a constant in the method area

(4) Objects referenced by JNI (native methods) in the native method stack, such as person after test(person) is called with the following declaration

private native void test(Person person);


Treating references as a graph, any object that can be reached from a GC root by following references from object to object is not recyclable. As soon as an object is disconnected from every GC root, it is recyclable
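The idea can be sketched as a plain graph traversal. Here objects are represented by strings and references by a map, both of which are assumptions made for the example; a real collector walks actual object fields.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilitySketch {
    // Returns every object that can be reached from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (marked.add(current)) {            // first visit: mark it and follow its references
                List<String> targets = refs.get(current);
                if (targets != null) {
                    pending.addAll(targets);
                }
            }
        }
        return marked;
    }

    // Hypothetical object graph: root -> a -> b, plus an unreachable cycle c <-> d.
    static Map<String, List<String>> demoGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("c", Arrays.asList("d"));
        refs.put("d", Arrays.asList("c"));
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(demoGraph(), Arrays.asList("root"));
        System.out.println("b is live: " + live.contains("b")); // true
        System.out.println("c is live: " + live.contains("c")); // false
    }
}
```

The cycle between c and d is not marked because nothing connects it to a root, which is exactly the case that reference counting cannot handle.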



Before the actual collection, reachability analysis is performed to mark the objects that may be declared dead

Traversing every reference on each GC would be a very heavy workload

In addition, the reference relationships must not change while the analysis runs, so all executing threads have to pause and wait until the analysis is finished

Therefore it is unrealistic to walk the whole reference chain from scratch every time. To deal with this problem, conservative GC came first, followed later by exact (accurate) GC

Exact GC relies on the OopMap, a mapping table that records the type of the data at each location, so the VM knows exactly which locations hold object references

In short, the OopMap tells the GC where the references are, and therefore which data can be recycled


Safepoint:

With the OopMap, HotSpot can complete GC roots enumeration quickly and accurately

But this raises another question: where should the OopMap be generated?

While the program runs, references change constantly. Generating an OopMap for every instruction would take up too much space, so the information is recorded only at specific positions, called safepoints

A GC pause can begin only when threads are at safepoints, which guarantees that every reference change has been recorded before the pause

If too few safepoints are selected, the GC has to wait too long for threads to reach one. If too many are selected, the runtime overhead becomes too high.

The selection criterion is whether a position "has the characteristic of letting the program execute for a long time", and the most obvious sign of that is instruction sequence reuse.

Safepoints are therefore generally placed at method calls, loop jumps, and exception throws.

In short, the safepoint tells us where GC can take place


STW: an abbreviation for stop the world, which means that all normal user threads are paused


Now the question is how to get every thread to stop at a safepoint. There are two schemes: preemptive interruption and active interruption.

Preemptive interruption:

When GC occurs, all threads are interrupted first. If a thread is found not to be at a safepoint, it is resumed and allowed to run to the nearest safepoint. Almost no virtual machine uses this scheme now.

Disadvantage: a thread blocked in Thread.sleep() or wait() cannot run to a safepoint

Active interruption:

Instead of acting on the threads directly, the GC sets a flag. Each thread actively polls this flag and suspends itself if the flag is true. The polling points coincide with the safepoints, plus the places where memory is allocated for new objects

HotSpot uses active interruption
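Active interruption can be imitated in ordinary Java with a volatile flag that a worker polls at its loop jump. This is only an analogy under stated assumptions: the class, the flag, and the latches are hypothetical, and HotSpot implements its polling inside the VM rather than with application-level code like this.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of active interruption; all names here are hypothetical.
class SafepointPollSketch {
    private volatile boolean pauseRequested = false; // the flag every thread polls
    private volatile boolean stopped = false;
    private final CountDownLatch parked = new CountDownLatch(1);
    private final CountDownLatch resume = new CountDownLatch(1);
    private final AtomicLong work = new AtomicLong();

    private void runLoop() {
        while (!stopped) {
            work.incrementAndGet();      // stands in for user code
            if (pauseRequested) {        // poll at the loop jump
                parked.countDown();      // report arrival at the safepoint
                awaitQuietly(resume);    // suspend until the "GC" is done
            }
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Returns true if the worker made no progress while it was paused.
    static boolean workerFreezesWhilePaused() {
        SafepointPollSketch s = new SafepointPollSketch();
        Thread worker = new Thread(s::runLoop);
        worker.start();

        s.pauseRequested = true;         // the "GC" asks threads to stop
        awaitQuietly(s.parked);          // wait until the worker has suspended itself
        long before = s.work.get();
        long after = s.work.get();       // unchanged: the worker is parked

        s.stopped = true;                // let the worker exit once it is resumed
        s.resume.countDown();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return before == after && before > 0;
    }

    public static void main(String[] args) {
        System.out.println(workerFreezesWhilePaused()); // prints "true"
    }
}
```

The worker is never stopped from outside. It notices the flag at its own polling point and parks itself, which is the essence of the active scheme.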


Mark-sweep algorithm:

The algorithm first marks all the objects that need to be recycled, and then recycles all marked objects once marking is complete

Disadvantages:

(1) Low efficiency

(2) After the sweep, a large number of discontinuous memory fragments are left behind
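The fragmentation problem can be seen in a small simulation. The heap is modelled as an int array in which 0 means a free slot, an assumption made purely for illustration. The boolean array here records which objects are alive, the inverse of the marks described above, which makes the example easier to read.

```java
import java.util.Arrays;

class MarkSweepSketch {
    // heap[i] is an object id, or 0 for a free slot; live[i] is true for surviving objects.
    static int[] sweep(int[] heap, boolean[] live) {
        int[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (!live[i]) {
                result[i] = 0;           // dead object: free the slot where it is
            }
        }
        return result;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        boolean[] live = {true, false, true, false, true, false};
        int[] after = sweep(heap, live);
        System.out.println(Arrays.toString(after));  // [1, 0, 3, 0, 5, 0]
        System.out.println(largestFreeRun(after));   // 1
    }
}
```

Three slots are free after the sweep, but no two of them are adjacent, so an object that needs two contiguous slots still cannot be allocated.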


Copying algorithm:

The memory is divided into two blocks of equal size, and only one block is used at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass

Disadvantage: usable memory is reduced to half of the original, so utilization is low
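A minimal sketch of the copy step, again modelling a memory block as an int array in which 0 means free (an assumption for illustration only).

```java
import java.util.Arrays;

class CopyingSketch {
    // Copies the surviving objects of the used block into a fresh block, packed from index 0.
    static int[] copyLive(int[] from, boolean[] live) {
        int[] to = new int[from.length];
        int next = 0;
        for (int i = 0; i < from.length; i++) {
            if (live[i]) {
                to[next++] = from[i];
            }
        }
        return to;
    }

    public static void main(String[] args) {
        int[] from = {1, 2, 3, 4};
        boolean[] live = {true, false, false, true};
        int[] to = copyLive(from, live);
        Arrays.fill(from, 0);                       // the old block is cleaned in one pass
        System.out.println(Arrays.toString(to));    // [1, 4, 0, 0]
        System.out.println(Arrays.toString(from));  // [0, 0, 0, 0]
    }
}
```

The survivors end up contiguous, so there is no fragmentation, at the price of keeping one block idle at all times.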


Mark-compact algorithm:

The marking process is the same as in the mark-sweep algorithm, but the recyclable objects are not cleaned up directly. Instead, all surviving objects are moved to one end, and the memory beyond the end boundary is then cleaned in one go
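The move-to-one-end step might look like the following sketch, using the same illustrative model of an int array where 0 means a free slot. The boolean array again records the survivors.

```java
import java.util.Arrays;

class MarkCompactSketch {
    // Slides surviving objects to the start of the heap in place and
    // returns the boundary index: everything from there on is free.
    static int compact(int[] heap, boolean[] live) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                heap[free++] = heap[i];
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = 0;                 // clean the memory beyond the boundary
        }
        return free;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        boolean[] live = {true, false, true, false, true, false};
        int boundary = compact(heap, live);
        System.out.println(Arrays.toString(heap)); // [1, 3, 5, 0, 0, 0]
        System.out.println(boundary);              // 3
    }
}
```

Unlike copying, no spare block is needed, and unlike mark-sweep, the free space ends up contiguous.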


Generational collection algorithm:


Generally, the Java heap is divided into the young generation and the old generation, so that the most suitable collection algorithm can be adopted for each. In the young generation, a large number of objects die in every collection, so the copying algorithm is selected: only the few survivors have to be copied to complete the collection

In the old generation, the survival rate is high and there is no extra space to guarantee allocation, so the mark-sweep or mark-compact algorithm must be used

[Figure: Java memory diagram]

This is a more refined approach. In real projects most objects die shortly after they are created, so every collection reclaims a large number of them, and the objects that survive many collections are eventually promoted to an older generation

A vivid metaphor: it is like soldiers in battle. Every time a soldier survives, his rank goes up, and those who survive to the end are the generals


Allocation strategy:

(1) Large objects enter the old generation directly. Typical large objects are long strings and large arrays

With the parameter -XX:PretenureSizeThreshold, objects larger than the given size are allocated directly in the old generation, which avoids copying large amounts of memory between the Eden space and the two Survivor spaces

(2) Long-lived objects enter the old generation. Each time an object survives a Minor GC, its age increases by 1. By default, an object is promoted to the old generation at age 15

(3) If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects of that age or older enter the old generation directly, without having to reach age 15

(4) Before a Minor GC, check whether the largest contiguous free space in the old generation is greater than the average size of the objects promoted to the old generation in earlier collections. If it is greater, attempt the Minor GC. If it is smaller, the risk is not taken and a Full GC is performed instead
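Rule (3) can be written down as a small calculation. This sketch follows the rule exactly as worded above; the method name and the array layout are assumptions, and a real JVM may compute its threshold differently in detail.

```java
class TenuringSketch {
    static final int MAX_AGE = 15; // default promotion age mentioned in rule (2)

    // sizeByAge[age] is the total size of Survivor objects with that age (index 0 unused).
    // Returns the age at or above which objects are promoted to the old generation.
    static int promotionAge(long[] sizeByAge, long survivorCapacity) {
        for (int age = 1; age < sizeByAge.length && age < MAX_AGE; age++) {
            // "greater than half of the Survivor space", written without integer division
            if (sizeByAge[age] * 2 > survivorCapacity) {
                return age;
            }
        }
        return MAX_AGE;
    }

    public static void main(String[] args) {
        // Age 3 alone occupies 60 of 100 units, more than half of the Survivor space.
        System.out.println(promotionAge(new long[] {0, 10, 20, 60, 5}, 100)); // 3
        // No single age exceeds half, so the default of 15 applies.
        System.out.println(promotionAge(new long[] {0, 10, 20, 30}, 100));    // 15
    }
}
```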


Garbage collector:

(1) Serial collector (-XX:+UseSerialGC)

A single-threaded collector. It not only uses a single thread for garbage collection, it also has to suspend all other worker threads (STW) until the collection finishes

The young generation uses the copying algorithm, while the old generation uses the mark-compact algorithm

Applicable scenario: desktop applications (Eclipse, Burp Suite)


(2) Serial Old collector

The old-generation version of the Serial collector


(3) ParNew collector (-XX:+UseParNewGC)

The multithreaded version of the Serial collector. It is no different from Serial except that it collects the young generation with multiple threads; the old generation is still collected by a single thread

The young generation uses the copying algorithm, while the old generation uses the mark-compact algorithm

Applicable scenario: the preferred young-generation collector on the server side


(4) Parallel Scavenge collector (-XX:+UseParallelGC)

A young-generation collector whose goal is controllable throughput

Throughput = time spent running user code / (time spent running user code + garbage collection time)
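As a quick worked example of the definition, with times chosen arbitrarily for illustration:

```java
class ThroughputSketch {
    // Throughput as defined above: user time divided by user time plus GC time.
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        // 99 ms of user code and 1 ms of garbage collection
        System.out.println(throughput(99, 1)); // 0.99
    }
}
```

A program that spends 1 ms collecting for every 99 ms of user code therefore has a throughput of 99%.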

The shorter the pause time, the better suited the collector is to programs that interact with the user, because good response speed improves the user experience. High throughput uses CPU time efficiently, so the program's computing tasks finish as soon as possible

The difference from the ParNew collector is that the user can control the GC pause time of user threads (-XX:MaxGCPauseMillis) and the throughput target (-XX:GCTimeRatio)

Applicable scenario: background computation with little user interaction


(5) Parallel Old collector (-XX:+UseParallelOldGC)

The old-generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm

With it, both the young generation and the old generation are collected by multiple threads


(6) CMS (Concurrent Mark Sweep) collector (-XX:+UseConcMarkSweepGC)

The CMS collector is very complex. Its goal is the shortest possible collection pause time. CMS is based on the mark-sweep algorithm, and marking consists of the following phases:

Initial mark: requires STW; marks the objects directly referenced by the GC roots

Concurrent mark: traces the object graph step by step from those objects, concurrently with the user threads

Remark: requires STW; re-marks the objects whose references changed during the previous two phases, for example (String s = null; s = "HelloWorld!";)

Common parameters:

-XX:CMSInitiatingOccupancyFraction sets the old-generation occupancy percentage at which a CMS collection is triggered

-XX:+UseCMSCompactAtFullCollection performs a compaction when a full GC completes

-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly: these two options are generally used together. A higher value reduces the frequency of CMS GC; a lower value increases the frequency and reduces the duration of each GC

Default number of threads CMS uses for collection: (number of CPUs + 3) / 4
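A few sample values for this formula, assuming the division is integer division (the helper name is hypothetical):

```java
class CmsThreadsSketch {
    // (number of CPUs + 3) / 4 with integer division, as given above.
    static int defaultCmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {1, 4, 8, 16}) {
            System.out.println(cpus + " CPUs -> " + defaultCmsThreads(cpus) + " thread(s)");
        }
    }
}
```

Under that assumption, 1 and 4 CPUs give 1 thread, 8 CPUs give 2, and 16 CPUs give 4.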

Note: the CMS collector can only be used for the old generation

Applicable scenario: Web project


(7) G1 collector

The G1 collector is completely different from the previous ones. In terms of algorithm, it combines the mark-compact and copying algorithms

The collection steps are:

(1) Scan the GC roots

(2) Update the remembered set (RSet), the data structure that records references to the objects being collected

(3) Process the remembered set to find references from the old generation into the young generation

(4) Copy the surviving objects to a Survivor area or to the old generation

(5) Clean up


The rest is similar to CMS and is not described in detail here


G1 application scenario: server-side programs


Different collectors work together:

In the diagram, the collectors above the horizontal line work on the young generation and those below it work on the old generation. G1 cannot be combined with any of the other collectors




JDK 1.8 default garbage collector: Parallel Scavenge + Parallel Old

JDK 1.9 default garbage collector: G1
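To see which collectors a particular JVM is actually running with, the standard management API can be queried. This is a small sketch with a made-up class name; the names it prints depend on the JVM and on the flags used, so no output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectorsSketch {
    // Names of the garbage collectors the current JVM reports through JMX.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```

Running it once with default settings and once with a flag such as -XX:+UseSerialGC shows how the reported names change.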
