You have to know about JVM garbage collection

Time:2022-5-7

Contents

1、 Four reference types
1.1 strong reference
1.2 soft reference
1.3 weak reference
1.4 phantom reference

2、 How to judge whether an object is garbage
2.1 reference counting method
2.2 reachability analysis from GC roots

3、 Garbage collection algorithms
3.1 mark-sweep
3.2 mark-compact
3.3 mark-copy

4、 Garbage collectors
4.1 classification and characteristics
4.1.1 serial
4.1.2 throughput priority
4.1.3 response time priority
4.2 serial garbage collector details
4.2.1 Serial
4.2.2 Serial-Old
4.2.3 flow chart
4.3 throughput priority garbage collector details
4.3.1 JVM related parameters
4.3.2 flow chart
4.4 response time priority garbage collector details
4.4.1 JVM related parameters
4.4.2 flow chart
4.4.3 characteristics of CMS

5、 G1 garbage collector
5.1 relevant JVM parameters
5.2 features
5.3 G1 young generation garbage collection
5.4 G1 old generation garbage collection

1、 Four reference types

1.1 strong reference

An object can be reclaimed only when no GC root can reach it through a chain of strong references.

1.2 soft reference

  • When an object is reachable only through soft references, it survives an ordinary garbage collection; it is reclaimed only if memory is still insufficient after that collection and another collection is triggered
  • The SoftReference object itself can be released with the help of a reference queue

1.3 weak reference

  • When an object is reachable only through weak references, it is reclaimed at the next garbage collection, regardless of whether memory is sufficient
  • The WeakReference object itself can be released with the help of a reference queue

1.4 phantom reference

  • A phantom reference must be used together with a reference queue. Its main use is with ByteBuffer (direct memory): when the referenced object is reclaimed, the phantom reference is put on the queue, and the ReferenceHandler thread calls the cleanup method associated with the phantom reference to release the direct memory
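
The difference between strong, soft, and weak references can be tried out with the standard java.lang.ref classes. The following is only a minimal sketch: the class name ReferenceKindsSketch and its method are made up for illustration, and System.gc() is only a request, so the values printed by main can differ between JVMs and runs.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of any library.
final class ReferenceKindsSketch {

    /** True if a weakly referenced object survives a GC request while a strong reference exists. */
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();                        // strong reference
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> weak = new WeakReference<>(strong, queue);
        System.gc();                                         // a request, not a guarantee
        // 'strong' is still used here, so the object stays strongly reachable:
        // the weak reference is neither cleared nor enqueued.
        return weak.get() == strong && queue.poll() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("kept while strongly held: " + survivesWhileStronglyHeld());

        ReferenceQueue<byte[]> queue = new ReferenceQueue<>();
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024], queue);
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);
        System.gc();
        // Usually the weak referent is gone and the soft referent is still there,
        // but neither outcome is guaranteed.
        System.out.println("weak referent cleared: " + (weak.get() == null));
        System.out.println("soft referent cleared: " + (soft.get() == null));
        // remove(timeout) waits for an enqueued reference and returns null on timeout.
        System.out.println("weak reference enqueued: " + (queue.remove(1000) == weak));
    }
}
```

Polling the queue is what lets a program find and drop the SoftReference or WeakReference objects whose referents are already gone.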

2、 How to judge whether an object is garbage

2.1 reference counting method

Every time a new reference to an object is created, the reference count of that object is increased by 1; every time a reference is removed, it is decreased by 1. If the reference count of an object is 0, the object is garbage.

Advantages: simple implementation and high efficiency

Disadvantages: if two objects reference each other but are not referenced by anything else, they should be reclaimed as garbage, yet their counts never drop to 0 because of the mutual reference, so they cannot be reclaimed.
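
The cycle problem can be reproduced with a hand-rolled counter. This is a toy model, assuming a hypothetical Node class that maintains its own count; the JVM itself does not work this way.

```java
// Toy model of reference counting; all names are hypothetical.
final class RefCountSketch {

    static final class Node {
        final String name;
        int refCount;   // number of incoming references
        Node field;     // single outgoing reference

        Node(String name) { this.name = name; }

        /** Redirects the outgoing reference and updates both counts. */
        void pointTo(Node target) {
            if (field != null) field.refCount--;
            field = target;
            if (target != null) target.refCount++;
        }
    }

    /** Builds a <-> b, drops the external references, and returns the remaining counts. */
    static int[] cycleCountsAfterRelease() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++;    // simulated reference from a local variable
        b.refCount++;
        a.pointTo(b);    // a -> b
        b.pointTo(a);    // b -> a
        a.refCount--;    // the simulated local variables go out of scope
        b.refCount--;
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCountsAfterRelease();
        System.out.println("a=" + counts[0] + ", b=" + counts[1]); // prints a=1, b=1
    }
}
```

Both counts stay at 1 although nothing outside the pair can reach it, so a pure reference-counting collector would never free these two objects.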

2.2 reachability analysis from GC roots

Start from the GC root objects and follow references downward. Objects that cannot be reached this way are not referenced by anything live, and these unreachable objects are considered garbage.

Currently, the following objects can be used as GC root objects:

1. Objects referenced by the local variable table in a stack frame
2. Objects referenced from the native method stack
3. Objects referenced by static fields of classes
4. Objects referenced by constants in the method area
5. Live Thread objects
6. The Bootstrap ClassLoader
7. Classes loaded through the bootstrap classloader and the extension classloader
8. Objects being locked by synchronized
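
Reachability analysis is a graph traversal. The sketch below assumes a hypothetical object graph stored as a map from object name to the names it references; a real JVM walks actual object fields instead.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: objects are names, references are map entries.
final class ReachabilitySketch {

    /** Returns every object reachable from the roots by following outgoing references. */
    static Set<String> mark(List<String> roots, Map<String, List<String>> refs) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) {   // first visit: follow its references
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    /** Example graph: root -> a -> b, plus an unreachable cycle c <-> d. */
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("c", Arrays.asList("d"));
        refs.put("d", Arrays.asList("c"));
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = mark(Arrays.asList("root"), exampleGraph());
        System.out.println("c is garbage: " + !live.contains("c")); // prints c is garbage: true
    }
}
```

Unlike reference counting, the cycle c <-> d is correctly treated as garbage because no root leads to it.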

3、 Garbage collection algorithms

The previous section shows how to tell which objects are garbage and need to be reclaimed. Which algorithms are used to reclaim them?

3.1 mark-sweep

This one is easy to understand: when a GC is triggered, reachability analysis is first run from the GC roots over all objects, so that garbage objects can be told apart from live ones; after marking has finished, the sweep operation reclaims the garbage.

In short: mark first, then sweep.

Disadvantages: after sweeping, a large number of non-contiguous memory fragments are left behind. With too many fragments, a later allocation of a large object may fail to find a large enough contiguous block, which triggers another GC early.
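
Fragmentation after sweeping can be shown on a toy heap. The sketch assumes a hypothetical heap of eight one-slot objects identified by number, where 0 means a free slot and the live set is the result of the mark phase.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy heap model; names and layout are assumptions for illustration.
final class MarkSweepSketch {

    /** Sweep: every occupied slot whose object id is not marked live becomes free (0). */
    static void sweep(int[] heap, Set<Integer> live) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && !live.contains(heap[i])) {
                heap[i] = 0;
            }
        }
    }

    /** Length of the longest run of consecutive free slots. */
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3, 5, 7)); // result of the mark phase
        sweep(heap, live);
        System.out.println(Arrays.toString(heap));  // prints [1, 0, 3, 0, 5, 0, 7, 0]
        System.out.println(largestFreeRun(heap));   // prints 1
    }
}
```

Four slots are free in total, yet an object that needs two adjacent slots cannot be placed.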

3.2 mark-compact

This algorithm works basically the same way as mark-sweep; only the second step differs. While reclaiming garbage, it also compacts memory by moving the surviving objects together, which is the biggest difference.

Advantages: in the end, no space is wasted on memory fragments.

Disadvantages: the cost of moving objects during compaction should not be underestimated.
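
A sliding compaction on a toy heap might look like the sketch below. It assumes a hypothetical array heap of one-slot objects where 0 means free; a real collector also has to update every reference to a moved object, which is where much of the cost comes from.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy heap model; names and layout are assumptions for illustration.
final class MarkCompactSketch {

    /** Slides live objects to the start of the heap, keeping their order; returns the first free index. */
    static int compact(int[] heap, Set<Integer> live) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && live.contains(heap[i])) {
                heap[next++] = heap[i];              // next <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(heap, next, heap.length, 0);     // everything after the live objects is free
        return next;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3, 5, 7)); // result of the mark phase
        int firstFree = compact(heap, live);
        System.out.println(Arrays.toString(heap));  // prints [1, 3, 5, 7, 0, 0, 0, 0]
        System.out.println(firstFree);              // prints 4
    }
}
```

After compaction the free space is one contiguous block, so allocation only needs to advance a single pointer.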

3.3 mark-copy

This method is quite different from the first two:

The storage area is divided into two parts: from and to. The from area may contain objects, while the to area is always kept empty, ready to receive data in the next collection.

Two pointers refer to these areas: the from pointer and the to pointer. During a collection, the surviving objects are copied from the from area into the to area, the from area is cleared, and the two pointers are swapped.

Advantages: this algorithm is very suitable for short-lived objects that are created and die soon afterwards, because only the few survivors have to be copied

Disadvantages: one memory area is always unused, which wastes space.
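
The from/to swap can be sketched with two arrays. This is a toy model under the assumption of one-slot objects identified by number; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy semi-space heap; names and layout are assumptions for illustration.
final class CopyingSketch {
    int[] from;   // space in which objects are allocated
    int[] to;     // kept empty between collections
    int used;     // number of occupied slots in 'from'

    CopyingSketch(int semiSpaceSlots) {
        from = new int[semiSpaceSlots];
        to = new int[semiSpaceSlots];
    }

    /** Bump allocation in the from space; returns false when it is full. */
    boolean allocate(int id) {
        if (used == from.length) return false;
        from[used++] = id;
        return true;
    }

    /** Copies survivors into the to space, clears the from space, then swaps the two roles. */
    void collect(Set<Integer> live) {
        int next = 0;
        for (int i = 0; i < used; i++) {
            if (live.contains(from[i])) to[next++] = from[i];
        }
        Arrays.fill(from, 0);
        int[] tmp = from;
        from = to;
        to = tmp;
        used = next;
    }

    public static void main(String[] args) {
        CopyingSketch heap = new CopyingSketch(4);
        for (int id = 1; id <= 4; id++) heap.allocate(id);
        heap.collect(new HashSet<>(Arrays.asList(2, 4)));
        System.out.println(Arrays.toString(heap.from)); // prints [2, 4, 0, 0]
        System.out.println(Arrays.toString(heap.to));   // prints [0, 0, 0, 0]
    }
}
```

The work done is proportional to the number of survivors, not to the amount of garbage, which is why copying suits areas where most objects die quickly.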

4、 Garbage collectors

A garbage collector is a concrete implementation of the three garbage collection algorithms above

In Java, objects stored in different “generations” have different characteristics. Therefore, the JVM does not commit to a single garbage collection algorithm. Instead, it uses different garbage collectors (and algorithms) according to the characteristics of the objects in each “generation”.

4.1 classification and characteristics

4.1.1 serial

characteristics:

  • Single thread
  • Small heap memory, suitable for personal computers

4.1.2 throughput priority

characteristics:

  • Multithreading
  • Large heap memory, suitable for multi-core CPUs
  • Keeps the total STW time per unit of time as short as possible, that is, the proportion of garbage collection time in the total running time over a period. The smaller the proportion, the better: it means most of the time is spent running application code

4.1.3 response time priority

characteristics:

  • Multithreading
  • Large heap memory, suitable for multi-core CPUs
  • Keeps each single STW pause as short as possible. It only aims for short individual collections and does not care how many collections happen in a given period.

4.2 serial garbage collector details

JVM switch: -XX:+UseSerialGC = Serial + SerialOld

4.2.1 Serial
  • Works in the young generation
  • Uses the copying algorithm
  • Single thread

4.2.2 Serial-Old
  • Works in the old generation
  • Uses the mark-compact algorithm
  • Single thread

4.2.3 flow chart

[Figure: serial garbage collector flow chart]
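
To check which collectors a switch actually enables, the running JVM can be asked through the standard java.lang.management API. This is a small sketch; the class name ActiveCollectorsSketch is made up, and the reported names are implementation-specific (on HotSpot with -XX:+UseSerialGC they are typically Copy and MarkSweepCompact).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the java.lang.management calls are real API.
final class ActiveCollectorsSketch {

    /** Names of the collectors the running JVM reports, usually one per generation. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with different switches, for example -XX:+UseSerialGC, and compare the output.
        System.out.println(collectorNames());
    }
}
```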

4.3 throughput priority garbage collector details

Enabled by default in JDK 1.8. The algorithms are the same as in the serial collectors, but the collection is done by multiple threads.

4.3.1 JVM related parameters
  • -XX:+UseParallelGC

    • Works in the young generation
    • Copying algorithm
  • -XX:+UseParallelOldGC

    • Works in the old generation
    • Mark-compact algorithm
  • -XX:+UseAdaptiveSizePolicy

    Dynamically adjusts the ratio of Eden to survivor

  • -XX:GCTimeRatio=ratio (default = 99)

    Proportion = 1 / (1 + ratio)

    The goal is for GC time to take up no more than this proportion of the total running time.

  • -XX:MaxGCPauseMillis=ms (200 ms is the G1 default; the parallel collector has no pause goal unless this flag is set)

    Target maximum time of a single garbage collection

  • -XX:ParallelGCThreads=n

    Specifies the number of parallel GC threads. The default is the number of CPU cores (a fraction of them on machines with more than 8 cores)

    As a result, the CPU may be fully loaded while a garbage collection is running
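
The GCTimeRatio formula is easy to check numerically. A minimal sketch, with a hypothetical helper name:

```java
// Hypothetical helper that only evaluates the formula 1 / (1 + ratio).
final class GcTimeRatioSketch {

    /** Largest fraction of total running time that GC is supposed to use. */
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(maxGcFraction(99)); // prints 0.01
        System.out.println(maxGcFraction(19)); // prints 0.05
    }
}
```

With the default of 99, the goal is at most 1% of the time in GC, for example no more than 1 minute out of every 100 minutes of running time. A ratio of 19 relaxes the goal to 5%.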

4.3.2 flow chart
[Figure: throughput priority garbage collector flow chart]

4.4 response time priority garbage collector details

4.4.1 JVM related parameters
  • -XX:+UseParNewGC

    Works in the young generation

  • -XX:+UseConcMarkSweepGC

    Works in the old generation. If a concurrent collection fails, it falls back to SerialOld garbage collection.

    Uses the mark-sweep algorithm.

    It is concurrent: in some garbage collection phases, the GC threads run at the same time as the user threads.

  • -XX:ConcGCThreads=n

    Specifies that n threads are used for the concurrent phases of the collection

  • -XX:CMSInitiatingOccupancyFraction=percent

    Floating garbage is introduced in the analysis below. User threads keep allocating while CMS runs, so in order to deal with floating garbage in time, an old generation collection is triggered as soon as old generation occupancy reaches percent.

  • -XX:+CMSScavengeBeforeRemark

    The CMS old generation cycle includes a remark phase, which checks whether objects considered garbage have been referenced again. Remark also has to run reachability analysis starting from the young generation. (Reachability analysis only works in one direction: you cannot ask an object who references it. As with a binary search tree, you have to search downward from the root to see whether the object can be found.) The young generation usually holds many objects, and tracing from them one by one slows remark down. So if this switch is turned on, a young generation collection is carried out before remark.

  • -XX:ParallelGCThreads=n

    Specifies the number of parallel GC threads. The default is the number of CPU cores.

4.4.2 flow chart
[Figure: response time priority (CMS) garbage collector flow chart]

The figure above shows the workflow of the CMS garbage collector during an old generation GC:

  1. Initial mark: mark the root objects
    1. Triggers STW
    2. There are few root objects, so marking is fast
  2. Concurrent mark: starting from the root objects, mark the remaining objects
    1. Does not trigger STW
    2. Marking runs concurrently with the user threads
  3. Remark: re-check the garbage objects found in step 2
    1. Triggers STW
    2. Needed because the user threads kept running during concurrent marking, so some garbage objects may have been referenced again
  4. Concurrent sweep: apply the sweep step of mark-sweep to the garbage objects determined at the end of step 3
    1. Does not trigger STW
    2. Runs concurrently with the user threads

4.4.3 characteristics of CMS
  1. In some phases, the GC threads run concurrently with the user threads

  2. STW time is kept as short as possible

  3. Objects that change from garbage back to live during concurrent marking are filtered out in the final remark step, but one case cannot be handled:

    Floating garbage newly produced by the user threads while CMS runs concurrently is not recognized and can only be removed by the next garbage collection

  4. In the whole cycle, only initial mark and remark need STW; the other steps run concurrently with the user threads

  5. The CMS old generation collector may degrade to the SerialOld old generation collector:

    When an object cannot be allocated and still does not fit after a young generation collection, an old generation collection is needed. If the object still does not fit after that, the concurrent collection is considered to have failed, and CMS falls back to SerialOld for one old generation collection. If the object still does not fit after this cleanup, an old generation OOM error (OutOfMemoryError) is reported.
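
The four phases of the CMS old generation cycle and their STW behaviour, as described above, can be summarized in a small enum. This is only a model for illustration, with hypothetical names; it does not call into the JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical summary of the phases described in the text.
final class CmsPhasesSketch {

    enum Phase {
        INITIAL_MARK(true),      // marks root objects, short pause
        CONCURRENT_MARK(false),  // traces from the roots while user threads run
        REMARK(true),            // re-checks objects changed during concurrent marking
        CONCURRENT_SWEEP(false); // reclaims garbage while user threads run

        final boolean stopsTheWorld;

        Phase(boolean stopsTheWorld) {
            this.stopsTheWorld = stopsTheWorld;
        }
    }

    /** Phases during which user threads are paused. */
    static List<Phase> pausingPhases() {
        List<Phase> result = new ArrayList<>();
        for (Phase p : Phase.values()) {
            if (p.stopsTheWorld) result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pausingPhases()); // prints [INITIAL_MARK, REMARK]
    }
}
```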

5、 G1 garbage collector

5.1 relevant JVM parameters

  • -XX:+UseG1GC

    Enables the G1 garbage collector

  • -XX:G1HeapRegionSize=n

    Sets the size of each region. G1 divides the whole heap into regions of equal size; if this flag is not set, the JVM picks a size that gives roughly 2048 regions, so each region is about heap space / 2048. There is a hard rule: the region size must be a power of 2 (1 MB to 32 MB in JDK 8).

    Two formulas for the initial number of regions (n = total number of regions):

    Number of young generation regions = 5% * n

    Number of regions left for the old generation = 95% * n

    In the traditional layout, the young generation is further divided into the Eden area and the survivor (from, to) areas with a ratio of 8:1:1. This division also applies in G1, but the three do not share one region: Eden, from, and to each occupy their own independent regions.

  • -XX:InitiatingHeapOccupancyPercent=45

    When old generation occupancy reaches this percentage of the heap (45 by default), old generation garbage collection is triggered.

  • -XX:G1MixedGCCountTarget=n

    After marking, garbage collection in mixed mode is spread over up to n rounds (8 by default)

  • -XX:MaxGCPauseMillis=ms

    The target maximum pause time of a garbage collection. If the time is very short, G1 only reclaims the regions with the highest reclamation value, so that the pause goal can be met.
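
How the region size relates to the heap size can be sketched as follows. This is a simplification and not HotSpot's exact logic: it assumes a target of 2048 regions, rounding down to a power of 2, and the JDK 8 bounds of 1 MB and 32 MB; the class and method names are hypothetical.

```java
// Simplified model of G1 region sizing; not the actual HotSpot implementation.
final class G1RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;    // assumed lower bound
    static final long MAX_REGION = 32 * MB;   // assumed upper bound (JDK 8)
    static final long TARGET_REGIONS = 2048;  // assumed target region count

    /** Heap size / 2048, rounded down to a power of 2 and clamped to the bounds. */
    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGIONS, 1);
        size = Long.highestOneBit(size);      // largest power of 2 <= size
        return Math.min(Math.max(size, MIN_REGION), MAX_REGION);
    }

    /** Initial young generation size in regions: 5% of all regions. */
    static long initialYoungRegions(long heapBytes) {
        long regions = heapBytes / regionSize(heapBytes);
        return regions * 5 / 100;
    }

    public static void main(String[] args) {
        long heap = 4L * 1024 * MB;                      // 4 GB heap
        System.out.println(regionSize(heap) / MB);       // prints 2
        System.out.println(initialYoungRegions(heap));   // prints 102
    }
}
```

For a 4 GB heap this gives 2048 regions of 2 MB each, of which about 102 start out as young generation regions.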

5.2 features

  • Collects all generations

    G1 is responsible for garbage collection in every generation. This is unlike the earlier collectors, where for example Serial (young generation) has to cooperate with SerialOld (old generation), and CMS (old generation) has to cooperate with ParNew (young generation)

  • G1 divides the heap differently

    Before G1, the heap space was “cut” into two parts: the young generation and the old generation.

    G1 does not divide the heap in the traditional way. It splits the whole heap into regions of equal size, and each region belongs to either the young generation or the old generation. The regions of one generation do not have to be physically contiguous. A region also does not stay in the same generation for the whole run of the program; its role changes dynamically.

  • A collection does not scan and reclaim a whole generation

    Reclamation value: an attribute calculated for every region (survival rate of its objects, estimated collection time, expected reclaimed space…)

    When collecting, regions with a high reclamation value are given priority, which reduces the area that has to be collected.

  • Large object storage

    In earlier generational models, when a large object is allocated and the young generation still cannot hold it after a collection, but the old generation has enough space, the large object goes directly into the old generation;

    In G1, very large objects are not stored this way. Instead, they are placed in dedicated regions called humongous regions. During a young generation or old generation collection, the humongous regions are cleaned up as well.

5.3 G1 young generation garbage collection

As described above, the young generation initially gets 5% of all regions. This value is quite small, so when the young generation runs out of regions, the JVM assigns more regions to it;

The young generation can grow to at most 60% of all regions; when its regions are used up, a young generation garbage collection is carried out.

A young generation garbage collection causes STW.

The algorithm is the same as in the other young generation collectors: the copying algorithm.

  1. STW first: all user threads are brought to a safepoint

  2. Analyze the objects in Eden, copy the surviving objects to the to survivor region, then clear Eden, and then turn the original to region into the from region.

    When the age of an object reaches a threshold, it is promoted to the old generation.

5.4 G1 old generation garbage collection

The trigger for old generation garbage collection is the -XX:InitiatingHeapOccupancyPercent parameter.

Note, however, that this old generation collection is actually a mixed collection, which cleans up the young generation, the old generation, and humongous regions together.

Like young generation collection, it still uses the copying algorithm, but the collection process resembles that of CMS, the response time priority collector for the old generation.

The process is divided into:

  1. Initial mark, STW occurs
  2. Concurrent mark
  3. Final mark, STW occurs
  4. Based on the reclamation value of each region, the regions with the best payoff are selected and their live objects are copied out (this copying step pauses the user threads). Unlike CMS, which does not collect selectively, G1 does this to keep STW time under control.
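
Selecting regions by reclamation value under a pause budget can be sketched as a greedy choice. This is a rough illustration, not G1's actual policy: the Region class, the value formula (garbage per estimated millisecond), and the sample numbers are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of choosing regions for a mixed collection.
final class G1CollectionSetSketch {

    static final class Region {
        final String name;
        final long garbageKb;          // space that would be freed
        final double estimatedMillis;  // predicted cost of collecting the region

        Region(String name, long garbageKb, double estimatedMillis) {
            this.name = name;
            this.garbageKb = garbageKb;
            this.estimatedMillis = estimatedMillis;
        }

        /** Assumed reclamation value: freed space per millisecond of pause. */
        double value() {
            return garbageKb / estimatedMillis;
        }
    }

    /** Greedily picks the most valuable regions whose total cost fits the pause budget. */
    static List<Region> choose(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((x, y) -> Double.compare(y.value(), x.value())); // highest value first
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.estimatedMillis <= pauseBudgetMillis) {
                chosen.add(r);
                spent += r.estimatedMillis;
            }
        }
        return chosen;
    }

    static List<Region> exampleRegions() {
        return Arrays.asList(
                new Region("A", 900, 10.0),   // value 90
                new Region("B", 400, 2.0),    // value 200
                new Region("C", 100, 5.0),    // value 20
                new Region("D", 600, 6.0));   // value 100
    }

    public static void main(String[] args) {
        for (Region r : choose(exampleRegions(), 10.0)) {
            System.out.println(r.name);   // prints B, then D
        }
    }
}
```

With a 10 ms budget, B and D are chosen (8 ms in total), while A and C are left for a later round. This mirrors how a shorter pause goal leads to fewer regions being reclaimed per collection.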

Note: there may be factual errors or typos in this article. You are welcome to point them out.