Analysis of a memory surge in the back-end service of a .NET recruitment website

Time: 2021-11-2

1: Background

1. Tell a story

Some time ago, a friend contacted me on WeChat and said that his program's memory surged periodically, and asked how to solve it. After talking it over with him, I learned that memory normally sits at around 5 GB and soars to 10 GB or more at certain points in time. In other words, the picture is a baseline of roughly 5 GB with periodic spikes above 10 GB.


So the next step is to find out what that inexplicable extra 5-6 GB is. Let's ask WinDbg.

2: WinDbg analysis

1. Determine whether the problem is managed or unmanaged

From the description, this is most likely a problem at the managed level, but for the sake of completeness we will still take a look with !address -summary and !eeheap -gc.


0:000> !address -summary

--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free                                   1164      7f5`58f12000 (   7.958 TB)           99.48%
<unknown>                              6924        a`6de84000 (  41.717 GB)  97.90%    0.51%
Stack                                  1123        0`16340000 ( 355.250 MB)   0.81%    0.00%
Image                                  4063        0`1607d000 ( 352.488 MB)   0.81%    0.00%
Heap                                     71        0`0c9ea000 ( 201.914 MB)   0.46%    0.00%
TEB                                     374        0`002ec000 (   2.922 MB)   0.01%    0.00%
Other                                    13        0`001c6000 (   1.773 MB)   0.00%    0.00%
PEB                                       1        0`00001000 (   4.000 kB)   0.00%    0.00%

--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_PRIVATE                            5423        a`87200000 (  42.111 GB)  98.83%    0.51%
MEM_IMAGE                              7033        0`1e5d6000 ( 485.836 MB)   1.11%    0.01%
MEM_MAPPED                              113        0`01908000 (  25.031 MB)   0.06%    0.00%

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE                               1164      7f5`58f12000 (   7.958 TB)           99.48%
MEM_RESERVE                            4165        8`1b873000 (  32.430 GB)  76.11%    0.40%
MEM_COMMIT                             8404        2`8b86b000 (  10.180 GB)  23.89%    0.12%


0:000> !eeheap -gc
Number of GC Heaps: 32
------------------------------
Heap 0 (00000000004106d0)
generation 0 starts at 0x0000000082eb0e58
generation 1 starts at 0x0000000082d79b20
generation 2 starts at 0x000000007fff1000
ephemeral segment allocation context: none
         segment             begin         allocated              size
000000007fff0000  000000007fff1000  0000000083f80128  0x3f8f128(66646312)
Large object heap starts at 0x000000087fff1000
         segment             begin         allocated              size
000000087fff0000  000000087fff1000  0000000883fe4190  0x3ff3190(67056016)
0000000927ff0000  0000000927ff1000  000000092bfe2430  0x3ff1430(67048496)
0000000a81c50000  0000000a81c51000  0000000a8221c858  0x5cb858(6076504)
Heap Size:               Size: 0xc53ef40 (206827328) bytes.
------------------------------
...
Heap 31 (0000000019c84130)
generation 0 starts at 0x0000000844fc5170
generation 1 starts at 0x0000000844f851f8
generation 2 starts at 0x000000083fff1000
ephemeral segment allocation context: none
         segment             begin         allocated              size
000000083fff0000  000000083fff1000  0000000845171ca0  0x5180ca0(85462176)
Large object heap starts at 0x00000008fbff1000
         segment             begin         allocated              size
00000008fbff0000  00000008fbff1000  00000008fffe2290  0x3ff1290(67048080)
000000094bff0000  000000094bff1000  000000094ea2ebb8  0x2a3dbb8(44293048)
000000096bff0000  000000096bff1000  000000096dbdec00  0x1bedc00(29285376)
Heap Size:               Size: 0xd79d6e8 (226088680) bytes.
------------------------------
GC Heap Size:            Size: 0x1f1986a88 (8348265096) bytes.
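
As a quick cross-check of the two outputs above, the hexadecimal sizes can be converted and compared. The sketch below is plain JavaScript meant to run in Node.js, outside WinDbg. The helper names are made up for illustration, and it assumes that WinDbg's "GB" means 2^30 bytes, which is consistent with the 10.180 GB shown for MEM_COMMIT.

```javascript
"use strict";

// Sizes copied from the outputs above; the backtick in WinDbg's 64-bit notation is dropped.
const memCommitBytes = parseInt("28b86b000", 16); // MEM_COMMIT   2`8b86b000
const gcHeapBytes = parseInt("1f1986a88", 16);    // GC Heap Size 0x1f1986a88

// Hypothetical helpers, not part of WinDbg or SOS.
function toGiB(bytes) { return bytes / Math.pow(2, 30); }
function toDecimalGB(bytes) { return bytes / 1e9; }

// Fraction of committed memory that belongs to the managed (GC) heap.
const managedShare = gcHeapBytes / memCommitBytes;

console.log("MEM_COMMIT: " + toGiB(memCommitBytes).toFixed(2) + " GiB");
console.log("GC heap:    " + toGiB(gcHeapBytes).toFixed(2) + " GiB, "
    + toDecimalGB(gcHeapBytes).toFixed(2) + " decimal GB");
console.log("Managed share of committed memory: " + (managedShare * 100).toFixed(1) + "%");
```

The GC heap is 8,348,265,096 bytes, which is about 8.3 GB in decimal units or about 7.8 GiB, so it makes up roughly three quarters of the committed memory.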

From the output, the managed heap accounts for about 8.3 GB of the roughly 10 GB of committed memory, so the problem is clearly in the managed layer. Now that we know the general direction, we can look at the managed heap. In my past experience, this kind of problem is usually caused by the program creating a huge number of instances of some class, so the next command is !dumpheap -stat:


0:000> !dumpheap -stat
Statistics:
              MT    Count    TotalSize Class Name
...
000007fe9ddd5fc0   341280     30032640 System.ServiceModel.Description.MessagePartDescription
000007fe9c4865a0   866349     41584752 System.Xml.XmlDictionaryString
000007fe9defb098   937801     45014448 System.Xml.XmlDictionaryString
000007fe9c66bd28   105052     45086880 System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[System.Xml.XmlDictionaryString, System.Runtime.Serialization]][]
000007fe9e0f4d20   113299     49050864 System.Collections.Generic.Dictionary`2+Entry[[System.String, mscorlib],[System.Xml.XmlDictionaryString, System.Runtime.Serialization]][]
00000000003c9190    44573    618414438      Free
000007fef8f6c168   428410   1209974642 System.Char[]
000007fef8f4f1b8  2849758   1246912848 System.Object[]
000007fef8f6f058   531963   1670620873 System.Byte[]
000007fef8f6aee0  2368431   2382587716 System.String

Embarrassingly, my past experience did not hold this time. The biggest consumers are the basic types Byte[], String, Char[], and Object[]. These basic types are hard to investigate: you either keep narrowing the results with the -min and -max filters, or write a script that groups the instances by size and sorts them. My rough script is as follows:

"use strict";

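/*
   Group the objects of a managed heap type, identified by its method table (MT), by object size.
*/

let platform = 64;                  // pointer width of the dump, in bits
let mtlist = ["000007fef8f4f1b8"];  // method tables to analyze (here System.Object[])
let maxlimit = 100;                 // maximum number of rows and addresses to print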

function initializeScript() { return [new host.apiVersionSupport(1, 7)]; }
function log(str) { host.diagnostics.debugLog(str + "\n"); }
function exec(str) { log("\n" + str); return host.namespace.Debugger.Utility.Control.ExecuteCommand(str); }
function invokeScript() { for (var mt of mtlist) { groupby_mtsize_inheap(mt); } }

// Count the objects of one type (MT) per object size
function groupby_mtsize_inheap(mt) {
    var size_group = {};
    var commandText = "!dumpheap -mt " + mt;
    var output = exec(commandText);
    for (var line of output) {
        if (line == "" || line.indexOf("Address") > -1) continue;
        if (line.indexOf("Statistics") > -1) break;
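        // The size column follows the address and MT columns; parse it as decimal
        var size = parseInt(line.substring(Math.ceil(platform / 2) + 1).trim(), 10);
        if (isNaN(size)) continue;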

        if (!size_group[size]) size_group[size] = 0;

        size_group[size]++;
    }
    show_top10_format(mt, size_group);
}

function show_top10_format(mt, size_group) {
    var maparr = [];

    // Convert the size -> count map into an array so that it can be sorted
    for (var size in size_group) {
        maparr.push({ "size": size, "count": size_group[size], "totalsize": (size * size_group[size]) });
    }

    maparr.sort(function (a, b) { return b.totalsize - a.totalsize });

    var topTotalSize = 0;

    // Print the size buckets, largest total size first
    for (var i = 0; i < Math.min(maparr.length, maxlimit); i++) {
        var size = maparr[i].size;
        var count = maparr[i].count;
        // Math.round takes a single argument, so totals are whole megabytes
        var totalsize = Math.round(maparr[i].totalsize / 1024 / 1024);

        topTotalSize += totalsize;

        log("size=" + size + ",count=" + count + ",totalsize=" + totalsize + "M");
    }

    log("Total:" + topTotalSize + "M");

    // List the addresses of the objects in the bucket with the largest total size
    if (maparr.length > 0) {
        var size = maparr[0].size;
        var output = exec("!dumpheap -mt " + mt + " -min 0n" + size + " -max 0n" + size + " -short").Take(maxlimit);
        for (var line of output) {
            log(line);
        }
    }
}
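
The grouping logic of the script can also be tried outside the debugger. The following is a minimal sketch for Node.js, not the script above: the sample lines are invented in the shape of `!dumpheap -mt` output (the addresses are hypothetical), and it splits each line on whitespace instead of using a fixed column offset.

```javascript
"use strict";

// Hypothetical sample lines shaped like `!dumpheap -mt <MT>` output
// (address, method table, size); the addresses are invented for illustration.
const sampleLines = [
    "         Address               MT     Size",
    "0000000a61c51000 000007fef8f6aee0 31116490",
    "0000000a63a00000 000007fef8f6aee0 29285946",
    "0000000a65800000 000007fef8f6aee0 29285946",
    "0000000080001000 000007fef8f6aee0       26",
    "",
    "Statistics:",
    "              MT    Count    TotalSize Class Name"
];

// Count objects per size, skipping the header and stopping at the "Statistics" footer.
function groupBySize(lines) {
    const groups = new Map();
    for (const line of lines) {
        if (line.trim() === "" || line.indexOf("Address") > -1) continue;
        if (line.indexOf("Statistics") > -1) break;
        const columns = line.trim().split(/\s+/);
        const size = parseInt(columns[2], 10);
        if (isNaN(size)) continue;
        groups.set(size, (groups.get(size) || 0) + 1);
    }
    return groups;
}

// Turn the map into rows sorted by total bytes, largest first.
function rankByTotalSize(groups) {
    const rows = [];
    for (const [size, count] of groups) {
        rows.push({ size: size, count: count, totalsize: size * count });
    }
    rows.sort(function (a, b) { return b.totalsize - a.totalsize; });
    return rows;
}

const ranked = rankByTotalSize(groupBySize(sampleLines));
for (const row of ranked) {
    console.log("size=" + row.size + ",count=" + row.count + ",totalsize=" + row.totalsize);
}
```

Sorting by total bytes rather than by count is what brings a handful of huge objects to the top, ahead of the many small ones.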

Next, pass in the method table address of System.String (000007fef8f6aee0) and look at the sorted result. The simplified output is as follows:


!dumpheap -mt 000007fef8f6aee0
size=29285946,count=2,totalsize=56M
size=29285540,count=2,totalsize=56M
size=29285502,count=2,totalsize=56M
size=29285348,count=2,totalsize=56M
size=27455186,count=2,totalsize=52M
size=31116504,count=1,totalsize=30M
size=31116490,count=1,totalsize=30M
size=31116306,count=1,totalsize=30M
size=31115934,count=1,totalsize=30M
size=31115920,count=1,totalsize=30M
size=31115718,count=1,totalsize=30M
size=29286342,count=1,totalsize=28M
size=29285898,count=1,totalsize=28M
...
Total:1198M

You can see that there are many very large strings, each between roughly 27 and 31 million bytes. What are these strings? Let's pick one and export it to a text file to take a look.


0:000> !dumpheap -mt 000007fef8f6aee0 -min 0n31116490 -max 0n31116490 -short 
0000000a61c51000
0:000> !do 0000000a61c51000 
Name:        System.String
MethodTable: 000007fef8f6aee0
EEClass:     000007fef88d3720
Size:        31116490(0x1daccca) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      <String is invalid or too large to print>

Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
000007fef8f6dc90  40000aa        8         System.Int32  1 instance         15558232 m_stringLength
000007fef8f6c1c8  40000ab        c          System.Char  1 instance               50 m_firstChar
000007fef8f6aee0  40000ac       18        System.String  0   shared           static Empty
                                 >> Domain:Value  00000000003fb620:NotInit  000000001ca30bd0:NotInit  000000001f7b21a0:NotInit  000000001f8940c0:NotInit  0000000027dc46b0:NotInit  00000000281bd720:NotInit  00000000282b7ee0:NotInit  <<

0:000> .writemem D:\dumps\xxxx\string.txt 0000000a61c51000 L?0x1daccca
Writing 1daccca bytes..........
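
Because .writemem starts at the object address, the exported file begins with the object header rather than with the text itself. According to the Offset column in the !do output, m_stringLength sits at offset 0x8 and m_firstChar at offset 0xc. The sketch below (Node.js, with hypothetical helper names) decodes such a buffer as UTF-16 and looks for "JVBERi0", which is how "%PDF-1" appears in base64. It uses a small fake buffer instead of the real dump, and searching for the marker anywhere in the text is an assumption, since the dump does not show where the PDF payload begins.

```javascript
"use strict";

const LENGTH_OFFSET = 0x8; // m_stringLength (Int32), per the Offset column above
const CHARS_OFFSET = 0xc;  // m_firstChar, start of the UTF-16 text

// Decode the characters of a System.String object dumped from its object address.
function decodeDumpedString(buf) {
    const charCount = buf.readInt32LE(LENGTH_OFFSET);
    const end = Math.min(CHARS_OFFSET + charCount * 2, buf.length);
    return buf.toString("utf16le", CHARS_OFFSET, end);
}

// "%PDF-1" encoded as base64 starts with "JVBERi0".
function containsBase64PdfHeader(text) {
    return text.indexOf("JVBERi0") !== -1;
}

// Hypothetical stand-in for the exported file: header bytes, length, then UTF-16 text.
function makeFakeStringObject(text) {
    const chars = Buffer.from(text, "utf16le");
    const buf = Buffer.alloc(CHARS_OFFSET + chars.length);
    buf.writeInt32LE(text.length, LENGTH_OFFSET);
    chars.copy(buf, CHARS_OFFSET);
    return buf;
}

const payload = Buffer.from("%PDF-1.4 demo", "latin1").toString("base64");
const dumped = makeFakeStringObject("prefix:" + payload);
const text = decodeDumpedString(dumped);
console.log(containsBase64PdfHeader(text));
```

With the real export, the buffer would come from reading the file written by .writemem. The !do output is consistent with this layout: 15558232 characters at 2 bytes each is 31116464 bytes, 26 bytes less than the reported object size of 31116490.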


Judging from the content, it is actually the base64 encoding of a PDF. Investigating the char[] and byte[] instances in the same way shows that most of them are PDFs as well. My guess is that, while processing the PDFs, the program converts back and forth between byte[], char[], and string, so in theory most of these objects should be unrooted. In fact, !heapstat -iu also shows that about 5.5 GB of unrooted objects are waiting to be collected by the GC.


0:000> !heapstat -iu                                        
Heap             Gen0         Gen1         Gen2          LOH                                        
Heap0        17625808      1274680     47745824    140181016                                        
...                                    
Total       357486256     28100616   2229673376   5733004848                                        
                                        
Free space:                                                 Percentage                                        
Heap0         3962240           24     11211224       298616SOH: 22% LOH:  0%                                        
Heap1         5625856          144      9857168       302152SOH: 27% LOH:  0%                                        
...                                    
Heap31        1448576           24     19957312       218024SOH: 25% LOH:  0%                                        
Total       181492784         1136    431825856      5183128                                        
                                        
Unrooted objects:                                           Percentage                                        
Heap0        12163928       243584        42872    137153536SOH: 18% LOH: 97%                                        
...                                    
Heap31         236832       239272      1435840    139770656SOH:  2% LOH: 99%                                        
Total       164954952      7948448     29066480   5530423784                                        
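
To put the "Total" rows above into proportion, the per-generation numbers can be summed and compared. This is only an arithmetic sketch in JavaScript (Node.js); the variable and function names are made up for illustration.

```javascript
"use strict";

// "Total" rows copied from the !heapstat -iu output above: Gen0, Gen1, Gen2, LOH (bytes).
const heapTotals = [357486256, 28100616, 2229673376, 5733004848];
const unrootedTotals = [164954952, 7948448, 29066480, 5530423784];

function sum(values) {
    return values.reduce(function (acc, v) { return acc + v; }, 0);
}

const heapBytes = sum(heapTotals);
const unrootedBytes = sum(unrootedTotals);

// Share of the large object heap, and of the whole GC heap, that is unrooted.
const lohUnrootedShare = unrootedTotals[3] / heapTotals[3];
const overallUnrootedShare = unrootedBytes / heapBytes;

console.log("GC heap total:  " + heapBytes + " bytes");
console.log("Unrooted total: " + unrootedBytes + " bytes");
console.log("Unrooted share of LOH:  " + (lohUnrootedShare * 100).toFixed(1) + "%");
console.log("Unrooted share of heap: " + (overallUnrootedShare * 100).toFixed(1) + "%");
```

The heap totals add up to 8,348,265,096 bytes, the same figure that !eeheap -gc reported, and about 96% of the large object heap is unrooted.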

3: Summary

The main cause of this periodic memory surge is that the program receives too many PDF files from upstream. These are all large objects, and converting them back and forth between char[], string, and byte[] consumes an excessive amount of memory in a short time.

Finally, my personal suggestions:

  1. For the large number of PDFs, could a third-party OSS product be used, so that the service avoids some unnecessary memory usage?
  2. Could the data-cleaning service be rate-limited, or could its load be spread out more evenly?

Later, my friend told me that after adding filtering and doing some business-process optimization, the problem was solved. I suspect many of you have run into this kind of problem in practice, so please leave a comment to share your own solutions.