Huawei Cloud database GaussDB (for Cassandra) unveiled, part 2: troubleshooting abnormal memory growth

Time:2021-6-13

Abstract: Abnormal memory growth can be fatal for a program. It may trigger an out-of-memory (OOM) condition, crash the process, and interrupt the business, so planning and controlling memory usage carefully is particularly important.

This article is shared from the Huawei Cloud community post "Huawei Cloud database GaussDB (for Cassandra) unveiled, part 2: troubleshooting abnormal memory growth". Original author: Gauss Cassandra official.

Huawei Cloud database GaussDB (for Cassandra) is a cloud-native NoSQL database built on an architecture that separates compute from storage, and it is compatible with the Cassandra ecosystem. It relies on a shared storage pool to achieve strong consistency and to keep data safe and reliable. Its core features are storage-compute separation, low cost, and high performance.

Problem description

While developing its own architecture, GaussDB (for Cassandra) ran into several challenging problems, such as high CPU usage, memory leaks, abnormal memory growth, and high latency. These are typical problems in development, and abnormal memory growth is among the hardest to analyze. It can be fatal for a program because it may trigger OOM, crash the process, and interrupt the business, so memory must be planned, used, and controlled carefully. Adjusting the cache capacity, the bloom filter size, and the memtable size changes how much memory is used and can improve performance and read/write latency.

During offline testing, we found that the memory of the database kernel only grew and never shrank over long runs. This abnormal growth led us to suspect a memory leak.

Analysis & verification

First, we divided memory by usage into two parts: on-heap and off-heap. We confirmed that the problem was in off-heap memory and then analyzed that part further. We also introduced tcmalloc, a more efficient memory manager, to address the abnormal memory growth. The analysis and verification process is described below.

Identify the abnormal memory area

Using the JDK jmap command and Cassandra monitoring (with the jvm.memory monitoring items configured), we collected the JVM heap memory and the memory of the whole process once per minute.

We ran the test case until the total memory of the database kernel reached its maximum. The curves of heap memory and process memory showed that heap memory stayed relatively stable and did not keep rising, while the overall process memory kept rising over the same period. Because the two growth curves diverged, the problem had to be in off-heap memory.
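The comparison of the two curves can be sketched in a few lines of Python. This is only an illustration: the Sample type, the helper names, and all numbers below are made up, and off-heap usage is approximated as process memory minus heap memory.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    minute: int        # sampling time, in minutes since the test started
    heap_mb: float     # JVM heap usage (for example from jmap)
    process_mb: float  # memory of the whole process


def growth_per_minute(values: List[float]) -> float:
    """Average change per one-minute interval between the first and last value."""
    if len(values) < 2:
        return 0.0
    return (values[-1] - values[0]) / (len(values) - 1)


def off_heap_series(samples: List[Sample]) -> List[float]:
    """Approximate off-heap usage as process memory minus heap memory."""
    return [s.process_mb - s.heap_mb for s in samples]


# Made-up numbers for illustration: heap is flat, process memory climbs.
samples = [
    Sample(0, 8000.0, 12000.0),
    Sample(1, 8100.0, 12300.0),
    Sample(2, 7950.0, 12600.0),
    Sample(3, 8000.0, 12900.0),
]

heap_growth = growth_per_minute([s.heap_mb for s in samples])
off_heap_growth = growth_per_minute(off_heap_series(samples))
print(f"heap growth: {heap_growth:.1f} MB/min")
print(f"off-heap growth: {off_heap_growth:.1f} MB/min")
```

A flat heap curve combined with a steadily positive off-heap growth rate is the pattern that points away from the JVM heap.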

Analysis and verification of off-heap memory

glibc memory management

Using the pmap command to print the address space layout of the process, we found a large number of 64 MB memory blocks and many memory fragments. This is related to how glibc allocates memory. The usage of off-heap memory tracked that of the whole process, which supported the suspicion that off-heap memory was the cause. In addition, glibc's conditions for returning memory to the operating system are strict, so memory is not released promptly and fragments accumulate. When there are too many fragments and a lot of idle memory is wasted, the peak memory usage of the process may exceed the planned maximum and even lead to OOM.
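As a rough way to count such blocks automatically, the sketch below parses text in the /proc/<pid>/maps format instead of pmap output (an assumption made here for easier parsing). It relies on the heuristic that a glibc arena usually appears as a writable anonymous mapping followed directly by a reserved, inaccessible one, together spanning 64 MiB. The sample addresses and the library path are invented.

```python
import re
from typing import List, NamedTuple

ARENA_BYTES = 64 * 1024 * 1024  # size of the blocks seen in the pmap output

# One line of /proc/<pid>/maps: "start-end perms offset dev inode [path]"
MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+\S+\s+\S+\s+(\d+)\s*(.*)$"
)


class Region(NamedTuple):
    start: int
    end: int
    perms: str
    anonymous: bool  # no backing file and no name such as [heap] or [stack]


def parse_maps(maps_text: str) -> List[Region]:
    """Parse maps-format text into regions, skipping lines that do not match."""
    regions = []
    for line in maps_text.splitlines():
        m = MAPS_LINE.match(line.strip())
        if m:
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            anonymous = m.group(4) == "0" and not m.group(5)
            regions.append(Region(start, end, m.group(3), anonymous))
    return regions


def count_64mb_blocks(maps_text: str) -> int:
    """Count writable+reserved anonymous pairs that together span 64 MiB."""
    regions = parse_maps(maps_text)
    count = 0
    for used, reserved in zip(regions, regions[1:]):
        if (used.anonymous and reserved.anonymous
                and used.perms.startswith("rw")
                and reserved.perms.startswith("---")
                and used.end == reserved.start
                and reserved.end - used.start == ARENA_BYTES):
            count += 1
    return count


# Invented sample: two 64 MiB blocks followed by a file-backed mapping.
SAMPLE_MAPS = """\
7f1230000000-7f1230021000 rw-p 00000000 00:00 0
7f1230021000-7f1234000000 ---p 00000000 00:00 0
7f1234000000-7f1234a00000 rw-p 00000000 00:00 0
7f1234a00000-7f1238000000 ---p 00000000 00:00 0
7f1238000000-7f1238200000 r-xp 00000000 08:01 131 /usr/lib/libexample.so
"""

print(count_64mb_blocks(SAMPLE_MAPS))  # 2
```

Tracking this count over time alongside process memory gives a quick signal of whether the number of 64 MB blocks grows with the load.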

tcmalloc memory management

We introduced the tcmalloc memory manager to replace glibc's ptmalloc, in order to reduce memory fragmentation and use memory more efficiently. For this analysis, tcmalloc was compiled from the gperftools-2.7 source code. Running the same test case, memory still rose, but more slowly than before. The pmap output showed that the small memory blocks and fragments seen earlier were significantly reduced. This indicates that tcmalloc has some optimization effect and confirms the earlier conjecture about excessive memory fragmentation.

However, the abnormal memory growth remained, as if tcmalloc were reclaiming memory late or not at all. In fact, tcmalloc is rather "reluctant" to return memory to the operating system: it keeps freed memory so that later allocations can reuse it directly, which reduces the number of system calls and improves performance. We therefore called the ReleaseFreeMemory interface manually. The effect was not obvious, and the reason was unknown at first (possibly some free memory could not be released).

Manually triggering tcmalloc's ReleaseFreeMemory interface

To verify this, we changed the cache capacity as follows.

  1. First, set the cache capacity to 6 GB, then apply read load until the 6 GB cache is full.
  2. Change the cache capacity to 2 GB. To release the memory quickly, manually call tcmalloc's ReleaseFreeMemory interface. This had no effect, so we suspected that the continued memory growth under tcmalloc was related to this interface.
  3. Add logging at several points in the ReleaseFreeMemory path, then restart the process and test again. The logs showed an error: the madvise system call was failing.

Code location:
[Screenshot of the code location; image not available]

Error log information:
[Screenshot of the error log; image not available]

  4. Analyzing the code around the failed call showed that tcmalloc releases memory in "round robin" fashion: if releasing one span fails partway through, the remaining spans are not released and the ReleaseFreeMemory call ends. This matches the earlier observation that calling the interface had almost no effect. Each call released only a few tens of MB, because the failed system call terminated the release logic.
  5. Next, we analyzed why the madvise system call failed. By patching this call in the kernel, we found that it failed because the memory behind the address range passed in was locked. In that state the system call fails with an invalid argument error (EINVAL).
  6. Whether memory is locked depends on code that calls the mlock system call and on the system's ulimit configuration. Reviewing the relevant code revealed nothing abnormal. Checking the ulimit configuration showed that max locked memory was unlimited. After changing it to 16 MB and restarting the Cassandra process, the test showed that memory was released effectively.
  7. With the test still running, memory no longer rose continuously. Under sustained load, memory rises to its peak, stops rising, stays stable, and matches the planned usage. After the load decreases or stops, memory slowly trends downward.
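The effect of stopping at the first failed release can be illustrated with a small Python model. This is a simplification for illustration, not tcmalloc's code: a span is reduced to a size and a locked flag, and a locked span stands in for one whose madvise call fails. The span sizes are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, bool]  # (size in MB, locked?)


def release_free_spans(spans: List[Span]) -> int:
    """Release spans in order and stop at the first one that cannot be released.

    Returns the number of MB released. Simplified model only: a locked span
    represents a span whose madvise call fails.
    """
    released_mb = 0
    for size_mb, locked in spans:
        if locked:
            break  # one failure ends the whole release pass
        released_mb += size_mb
    return released_mb


# Hypothetical free list of about 4 GB in which the third span is locked.
locked_case = [(16, False), (32, False), (64, True)] + [(64, False)] * 62
unlocked_case = [(size, False) for size, _ in locked_case]

print(release_free_spans(locked_case))    # 48
print(release_free_spans(unlocked_case))  # 4080
```

In this model a single locked span near the front limits the release to a few tens of MB, even though almost all of the free memory behind it could have been returned.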

Solution & summary

  1. Introduce tcmalloc to optimize memory management. Well-regarded memory managers include Google's tcmalloc and Facebook's jemalloc.
  2. Change the system's max locked memory setting.
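A quick way to check the max locked memory setting from inside a process on Linux is sketched below, using Python's standard resource module. The helper names are made up for this example. The second helper reads the VmLck field from /proc/<pid>/status text, which reports how much memory a process currently has locked.

```python
import resource


def describe_memlock_limit() -> str:
    """Return the soft 'max locked memory' limit of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)  # value in bytes
    if soft == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{soft // 1024} KB"


def locked_kb(status_text: str) -> int:
    """Extract VmLck (locked memory, in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmLck:"):
            return int(line.split()[1])
    return 0  # field not present


print("max locked memory:", describe_memlock_limit())
```

An "unlimited" result together with a large VmLck value for the database process would match the situation described above, where locked memory prevented tcmalloc from returning pages.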

Plan the maximum amount of memory the process needs sensibly, and reserve some headroom. Memory growth that does not match expectations needs further analysis. Memory problems are closely tied to the program itself. Key system settings should be changed with caution and their impact evaluated. We also reviewed all similar settings.

A release-free-memory command was added for the back end to call, to mitigate the problem of tcmalloc holding memory without releasing it. However, executing this command locks the whole page heap, which may cause memory allocation requests to hang, so it must be used with care.

A dynamically configurable tcmalloc_release_rate parameter was added in the back end to adjust how often tcmalloc returns memory to the operating system. The reasonable range of this value is [0, 10]. 0 means never return memory; the larger the value, the more often memory is returned. The default value is 1.
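For reference, gperftools also reads a start-time setting, the TCMALLOC_RELEASE_RATE environment variable. The sketch below builds a process environment with that variable and preloads tcmalloc through LD_PRELOAD. Both the preload approach and the library path are assumptions for illustration; the article does not say how tcmalloc was loaded, and the dynamic back-end parameter it describes is not shown here.

```python
import os
from typing import Dict, Optional


def tcmalloc_env(release_rate: float, lib_path: str,
                 base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build an environment that preloads tcmalloc with a given release rate.

    lib_path is a hypothetical location of the tcmalloc shared library.
    """
    if not 0 <= release_rate <= 10:
        raise ValueError("release rate should be in the range [0, 10]")
    env = dict(os.environ if base is None else base)
    env["LD_PRELOAD"] = lib_path                      # use tcmalloc instead of ptmalloc
    env["TCMALLOC_RELEASE_RATE"] = str(release_rate)  # 0 means never return memory
    return env


# Hypothetical library path; adjust to where tcmalloc was installed.
env = tcmalloc_env(5, "/opt/gperftools/lib/libtcmalloc.so", base={})
print(env["TCMALLOC_RELEASE_RATE"])  # 5
```

The resulting dictionary can be passed as the env argument of subprocess.Popen when starting the process under test.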

Epilogue

This article analyzed the memory growth problems encountered during development. With a better memory manager and finer-grained, more intuitive monitoring of the database's memory state at run time, we can keep the database running stably and with high performance.