Learn JVM performance optimization


Hands-on performance optimization

9.1 Revisiting the JVM

We drew a diagram earlier showing the path from class file to class loader to the runtime data area. We can now enrich that diagram to show the overall physical structure of the JVM.


Execution engine: used to execute JVM bytecode instructions

There are two main ways to implement it:

(1) translate the incoming bytecode instructions into another virtual machine's instructions at load or execution time;

(2) translate the incoming bytecode instructions into the instruction set of the host's native CPU at load or execution time.

These two approaches correspond to interpreted execution and just-in-time (JIT) compilation of bytecode.

9.2 heap memory overflow

9.2.1 code



Remember to set parameters such as -Xmx20m -Xms20m
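The controller code itself is not reproduced above; as a minimal sketch of what such an endpoint does (a plain main method standing in for the /heap handler, so the class and method names here are my own), keep allocating and holding references until the heap gives out:

```java
import java.util.ArrayList;
import java.util.List;

public class HeapOomDemo {

    // Keep allocating 1 MB blocks and holding references to them, so the
    // collector can reclaim nothing; with -Xmx20m this fails almost at once.
    static int fillHeap() {
        List<byte[]> hold = new ArrayList<>();
        try {
            while (true) {
                hold.add(new byte[1024 * 1024]);
            }
        } catch (OutOfMemoryError e) {
            int blocks = hold.size();
            hold.clear();              // drop the references so we can report
            return blocks;
        }
    }

    public static void main(String[] args) {
        System.out.println("OutOfMemoryError after " + fillHeap() + " MB allocated");
    }
}
```

Without the -Xmx20m cap the loop simply runs longer before failing, which is why the small heap setting above matters for the demo.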

9.2.2 operation results

Visit -> http://localhost:8080/heap

Exception in thread "http-nio-8080-exec-2" java.lang.OutOfMemoryError: GC overhead limit exceeded

9.2.3 review jps and jinfo



9.2.4 review jmap: manual export and automatic export via parameters

Manual export with jmap: jmap -dump:format=b,file=heap.hprof <pid>



Parameter auto export:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=heapdump.hprof


9.3 method area memory overflow

For example, adding class information to the method area

9.3.1 ASM dependency and class code



9.3.2 code

Set the size of Metaspace, for example -XX:MetaspaceSize=50m -XX:MaxMetaspaceSize=50m
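The ASM-based class generator is not reproduced here. As a stand-in that needs no ASM dependency (class and method names are my own), the sketch below defines many copies of the same class through throwaway class loaders; each copy's metadata occupies fresh Metaspace, which is the same pressure the ASM version creates:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class MetaspaceDemo {

    // Each loader defines one extra copy of the class; copies in different
    // loaders are distinct classes, so their metadata cannot be shared.
    static class SingleUseLoader extends ClassLoader {
        Class<?> define(byte[] bytes) {
            return defineClass(null, bytes, 0, bytes.length);
        }
    }

    static List<Class<?>> loadCopies(int n) {
        try {
            byte[] bytes = classBytes(MetaspaceDemo.class);
            List<Class<?>> keep = new ArrayList<>();   // hold references so nothing unloads
            for (int i = 0; i < n; i++) {
                keep.add(new SingleUseLoader().define(bytes));
            }
            return keep;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Read this class's own .class bytes back off the classpath.
    static byte[] classBytes(Class<?> c) throws IOException {
        String res = c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getClassLoader().getResourceAsStream(res)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int r; (r = in.read(buf)) != -1; ) out.write(buf, 0, r);
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        // Raise the count (and run with -XX:MaxMetaspaceSize=50m) to trigger
        // java.lang.OutOfMemoryError: Metaspace
        System.out.println("defined " + loadCopies(100).size() + " class copies");
    }
}
```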

9.3.3 operation results

Visit -> http://localhost:8080/nonheap



9.4 virtual machine stack

9.4.1 code: demonstrating StackOverflowError
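A minimal sketch of the demonstration (the class and method names are my own): an unbounded recursion that counts how many frames fit before the stack bursts.

```java
public class StackOverflowDemo {
    static int depth = 0;

    // Each call pushes one more stack frame; eventually the thread's
    // stack space is exhausted and StackOverflowError is thrown.
    static void recurse() {
        depth++;
        recurse();
    }

    static int overflowDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // the error, not a return, is what ends the recursion
        }
        return depth;
    }

    public static void main(String[] args) {
        // With the default stack size this is typically tens of thousands
        // of frames; run with -Xss128k and the depth drops sharply.
        System.out.println("StackOverflowError at depth " + overflowDepth());
    }
}
```

Comparing the printed depth with and without -Xss128k makes the effect of the stack-size flag directly visible.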



9.4.2 operation results


9.4.3 description

Each method call pushes a stack frame onto the virtual machine stack, so a recursive call pushes a new frame on every invocation. When the recursion goes too deep, the stack space runs out and a StackOverflowError is thrown.

-Xss128k: sets the stack size of each thread. Since JDK 5 the default stack size is 1 MB per thread (before that, 256 KB). Adjust it according to the memory the application's threads need: with the same physical memory, a smaller value allows more threads to be created, but the operating system limits the number of threads per process, so they cannot grow without bound. An empirical upper limit is about 3000-5000 threads.

Thread stack size is a double-edged sword. Set too small, stack overflow may occur, especially with deep recursion or large loops in a thread. Set too large, fewer stacks can be created, and a heavily multi-threaded application may hit an out-of-memory error.

9.5 thread deadlock

9.5.1 code
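A minimal sketch of the classic deadlock (names are my own): two threads acquire the same two locks in opposite order, so each ends up holding one lock while waiting forever for the other.

```java
public class DeadlockDemo {
    static final Object lockA = new Object();
    static final Object lockB = new Object();

    // t1 holds lockA and waits for lockB; t2 holds lockB and waits for
    // lockA. Neither can proceed, which is exactly what jstack reports.
    static boolean runAndCheck() {
        Thread t1 = new Thread(() -> {
            synchronized (lockA) {
                pause(200);                        // let t2 grab lockB first
                synchronized (lockB) { }
            }
        }, "t1");
        Thread t2 = new Thread(() -> {
            synchronized (lockB) {
                pause(200);
                synchronized (lockA) { }
            }
        }, "t2");
        t1.setDaemon(true);                        // daemons: the JVM can still exit
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(1000);                         // give up waiting after a second
            t2.join(1000);
        } catch (InterruptedException ignored) { }
        return t1.isAlive() && t2.isAlive();       // both stuck => deadlocked
    }

    static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) { }
    }

    public static void main(String[] args) {
        System.out.println(runAndCheck()
                ? "deadlocked, inspect with jstack <pid>"
                : "finished");
    }
}
```

While the process is stuck you can run jstack against its pid, which is what the next subsection analyzes.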



9.5.2 operation results


9.5.3 jstack analysis


Scroll to the end of the printed output to find the deadlock report


9.5.4 jvisualvm


Dump thread information


9.6 garbage collection

As memory is used, it will inevitably run short or reach its configured limit, at which point the memory space needs to be garbage collected.

9.6.1 timing of garbage collection

GC is performed automatically by the JVM and depends on the JVM's runtime environment, so its timing is uncertain.

Of course, we can request garbage collection manually, for example by calling System.gc() to notify the JVM, but we cannot control when it actually runs. That is, System.gc() merely suggests to the JVM that it should collect.

However, calling this method manually is not recommended, because the resulting collection consumes a lot of resources.

Generally, garbage collection occurs in the following situations:

(1) when the Eden space or a Survivor space is insufficient

(2) when there is not enough space in the old generation

(3) when there is not enough space in the method area


Although the timing of garbage collection is uncertain, the heap collection process can be walked through again using the object-lifecycle story below.

The life of an object

I'm an ordinary Java object, born in the Eden space. In Eden I also met a little brother who looks just like me, and we played there for a long time.

One day Eden got so crowded that I was forced to move to the "From" space of the Survivor area. Ever since then I have drifted, sometimes in Survivor's "From" space, sometimes in its "To" space, never settling down. When I turned 18, my father said I was grown up and it was time to go out into the world.

So I moved to the old generation. It is crowded there, and everyone is quite old; I got to know many of them. I lived in the old generation for another 20 years (each GC adds a year to your age) and was then collected.


9.6.2 preparation of experimental environment

My machine runs JDK 1.8 and Tomcat 8.5. You can also run Tomcat on Linux and download the GC logs from there.

9.6.3 GC log file

Review and refine the garbage collector diagram


To analyze the log information you first need a GC log file, so logging has to be configured. You have seen these parameters before.

-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:$CATALINA_HOME/logs/gc.log

For example, open catalina.bat in windows, and add

set JAVA_OPTS=%JAVA_OPTS% -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:gc.log 

In this way, when you start Tomcat with startup.bat, you can get the gc.log file in the current directory

You can see that the default collector is ParallelGC, and its parallel GC log looks like this

[throughput priority]

2019-06-10T23:21:53.305+0800: 1.303: [GC (Allocation Failure) [PSYoungGen: 65536K->10748K(76288K)] 65536K->15039K(251392K), 0.0113277 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]

Reading the figures: PSYoungGen: 65536K->10748K(76288K) is the young generation before and after the collection (total young size in parentheses); the following 65536K->15039K(251392K) is the whole heap before and after (total heap size in parentheses).
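If you want to pull the before/after figures out of such a line programmatically, a small regex sketch (my own, not part of the original text) shows where the whole-heap numbers sit, namely right after the closing bracket of the young-generation part:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogParse {
    static final String SAMPLE =
        "2019-06-10T23:21:53.305+0800: 1.303: [GC (Allocation Failure) "
      + "[PSYoungGen: 65536K->10748K(76288K)] 65536K->15039K(251392K), 0.0113277 secs] "
      + "[Times: user=0.00 sys=0.00, real=0.01 secs]";

    // The whole-heap figures follow "] ": "65536K->15039K(251392K)" means
    // heap-before -> heap-after (total heap size). Returns the amount freed.
    static int freedKb(String line) {
        Matcher m = Pattern.compile("\\] (\\d+)K->(\\d+)K\\((\\d+)K\\)").matcher(line);
        if (!m.find()) return -1;
        return Integer.parseInt(m.group(1)) - Integer.parseInt(m.group(2));
    }

    public static void main(String[] args) {
        System.out.println("heap freed: " + freedKb(SAMPLE) + "K");  // 50497K
    }
}
```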

Note that the young generation shrank by more than the whole heap did; the difference is the space taken by objects promoted into the old generation.

CMS log

[pause time first]

Parameter setting


Restart Tomcat to get the GC log. The log format here is similar to the one above, so no further analysis is needed.

G1 log

G1 log format reference link:


[pause time first]



Parameter setting



9.6.4 GC log file analysis tool gceasy

You can compare the throughput and pause times of different garbage collectors.

GCViewer


9.6.5 G1 tuning

Criteria for deciding whether to use the G1 garbage collector


(1) more than 50% of the heap is occupied by live objects

(2) the rate of object allocation and promotion varies greatly

(3) garbage collections take a long time

(1) use the G1 garbage collector: -XX:+UseG1GC

Modify the configuration parameters, collect the GC log, and analyze throughput and response time with GCViewer

Throughput   Min Pause   Max Pause   Avg Pause   GC count
99.16%       0.00016s    0.0137s     0.00559s    12

(2) adjust the memory size to get GC log analysis


For example, set the heap memory size, collect the GC log, and analyze throughput and response time with GCViewer

Throughput   Min Pause   Max Pause   Avg Pause   GC count
98.89%       0.00021s    0.01531s    0.00538s    12

(3) adjust the maximum pause time

-XX:MaxGCPauseMillis=200 sets the maximum GC pause time target

For example, set the maximum pause time, collect the GC log, and analyze throughput and response time with GCViewer

Throughput   Min Pause   Max Pause   Avg Pause   GC count
98.96%       0.00015s    0.01737s    0.00574s    12

(4) percentage of heap memory used when starting concurrent GC

-XX:InitiatingHeapOccupancyPercent=45: G1 uses this to decide when to trigger a concurrent GC cycle, based on the occupancy of the whole heap, not just one generation. A value of 0 means "run GC cycles constantly". The default is 45, i.e. the cycle starts when 45% of the heap is occupied.

For example, set the percentage parameter, collect the GC log, and analyze throughput and response time with GCViewer

Throughput   Min Pause   Max Pause   Avg Pause   GC count
98.11%       0.00406s    0.00532s    0.00469s    12

9.6.6 best practices for G1 tuning

Suggestions on the official website:


(1) do not set the sizes of the young and old generations manually; set only the size of the whole heap

  • The G1 collector adjusts the sizes of the young and old generations as it runs

  • In fact, by adapting the generation sizes it adjusts the rate and age of object promotion, in order to meet the pause time target set for the collector

  • Setting the sizes manually means giving up G1's automatic tuning

(2) continuously optimize the pause time target

In general, setting this value to 100ms or 200ms is fine (it varies by situation), but 50ms is not reasonable. If the pause target is set too short, G1 cannot keep up with the rate at which garbage is created and eventually degenerates into a full GC. Tuning this parameter is therefore a continuous process of gradually approaching the best value. The pause time is only a target and cannot always be met.

(3) use -XX:ConcGCThreads=n to increase the number of marking threads

If the IHOP threshold is set too high, you risk evacuation failure, i.e. running out of space while objects are being moved. If it is set too low, marking cycles run too frequently, and mixed collection cycles may reclaim little space.

If the IHOP value is set reasonably but the concurrent cycle still takes too long, try increasing the number of concurrent threads with -XX:ConcGCThreads.

(4) mixedgc tuning





(5) increase the heap memory size appropriately

9.7 one figure summarizes JVM performance optimization


That's the full text. Thanks for reading!

Related reading:

Interview the JVM, say these to the interviewer, and you will be impressed!

After reading this article, my grandma knows the memory model and garbage collection in the JVM!

Learn JVM configuration parameters and tool usage
