Using dotnet dump to find the reason why. Net core 3.0 takes up 100% of CPU

Time:2020-6-28

The company’s products have been continuously upgraded in line with. Net core 3.0 preview. Once deployed to a Linux server, there will occasionally be a process that takes up 100% CPU
Because the service is deployed in the cloud, remote debugging cannot be used; this problem cannot be reproduced on Linux server or windows development machine in the LAN. I am envious when I think of Java’s jstack. I think that. Net core has been out for so long, so I’d better try to find it, and finally I found a blog introduction diagnostics improvements in. Net core 3.0

  This article introduces three tools

dotnet counters: real time statistics of runtime, including CPU, memory, GC, exception, etc
• dotnet trace: similar performance detector
• dotnet dump: use this tool when the program crashes

This time, we use dotnet dump. Even if the program does not crash, we can also dump the program snapshot for analysis

Experimental environment

ubuntu-16.04.5-desktop-amd64
SDK 3.0.100-preview6-012264

1. Create a new simple console program (only. Net core 3.0 program, not. Net core 2.2) to simulate 100% CPU usage


mkdir NetCoreDumpTest && cd NetCoreDumpTest
dotnet new console

edit Program.cs


namespace NetCoreDumpTest
{
 using System;
 using System.Threading.Tasks;
 class Program
 {
  static void Main(string[] args)
  {
   Task.Factory.StartNew(() => PrintNumber("Print", 5));
   Console.WriteLine("Press any key to exit.");
   Console.ReadKey();
  }
  static void PrintNumber(string message, int startNumber)
  {
   var number = startNumber;
   while (true)
    Console.WriteLine($"{message} {number++}");
  }
 }
}

2. Install dotnet dump

dotnet tool install --global dotnet-dump --version 1.0.4-preview6.19311.1

Tips

If you are using bash, you can add it to your profile by running the following command:
cat << \EOF >> ~/.bash_profile
# Add .NET Core SDK tools
export PATH=”$PATH:/home/****/.dotnet/tools”
EOF
You can add it to the current session by running the following command:
export PATH=”$PATH:/home/****/.dotnet/tools”
You can invoke the tool using the following command: dotnet-dump
Tool ‘dotnet-dump’ (version ‘1.0.4-preview6.19311.1’) was successfully installed.

It is recommended to add $home /. Dotnet / tools to the path, OK, do as you like, and remember to use the following command to make the settings take effect immediately

source ~/.bash_profile

3. Use dotnet NetCoreDumpTest.dll Start our problem program, and then use PS – EF | grep dotnet to view the process ID of the program. You can see that the process ID is 3411


ps -ef | grep dotnet
z*****e  3411 1464 22 07:51 pts/8 00:00:59 dotnet NetCoreDumpTest.dll
z*****e  3431 2935 0 07:55 pts/9 00:00:00 grep --color=auto dotnet

For process 3411, we also need to know which thread accounts for the CPU. Using top-hp 3411, we can list all threads. Because top refreshes every 3 seconds, it may take several seconds to see which thread takes up a relatively high CPU. Here we can see the thread with PID = 3418 (please understand the process ID and thread ID of Linux by yourself)


top -Hp 3411
 PID USER  PR NI VIRT RES SHR S %CPU %MEM  TIME+ COMMAND
 3418 z*****e 20 0 2997700 29060 22400 R 10.3 1.4 0:20.68 dotnet
 3411 z*****e 20 0 2997700 29060 22400 S 0.0 1.4 0:00.11 dotnet
 3412 z*****e 20 0 2997700 29060 22400 S 0.0 1.4 0:00.02 dotnet
 3413 z*****e 20 0 2997700 29060 22400 S 0.0 1.4 0:00.00 dotnet
 3414 z*****e 20 0 2997700 29060 22400 S 0.0 1.4 0:00.00 dotnet
 3415 z*****e 20 0 2997700 29060 22400 S 0.0 1.4 0:00.01 dotnet
 3416 z*****e 20 0 2997700 29060 22400 S 0.0 1.4 0:00.00 dotnet
 3417 z*****e 20 0 2997700 29060 22400 S 0.0 1.4 0:00.00 dotnet
 3421 z*****e 20 0 2997700 29060 22400 S 0.0 1.4 0:00.00 dotnet

To get dump, you can only dump the process, so we input 3411


dotnet-dump collect -p 3411
Writing minidump with heap to /tmp/core_20190623_075649
Complete

4. Analysis

dotnet-dump analyze core_20190623_075649

Use clrthreads to view all threads

>clrthreads
ThreadCount:      4
UnstartedThread:  0
BackgroundThread: 3
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
                                                                                                        Lock
 DBG   ID OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
   0    1  d53 0000000001307D80    20020 Preemptive  0000000000000000:0000000000000000 0000000001306450 1     Ukn
   4    2  d57 000000000135BBD0    21220 Preemptive  0000000000000000:0000000000000000 0000000001306450 0     Ukn (Finalizer)
   6    3  d59 00007F666C0009F0  1020220 Preemptive  0000000000000000:0000000000000000 0000000001306450 0     Ukn (Threadpool Worker)
   7    4  d5a 000000000130DA40  1021220 Preemptive  00007F6678106860:00007F6678106F20 0000000001306450 1     Ukn (Threadpool Worker)

The hexadecimal of thread 3418 that we care about is D5A, that is, the last line. Its DBG is 7. We need to use setthread 7 to set it as the thread of current operation

Then use clrstack to get the thread call information


> setthread 7
> clrstack
OS Thread Id: 0xd5a (7)
  Child SP    IP Call Site
00007F6715561558 00007f671a2bd4bd [InlinedCallFrame: 00007f6715561558] Interop+Sys.Write(System.Runtime.InteropServices.SafeHandle, Byte*, Int32)
00007F6715561558 00007f669f669a9e [InlinedCallFrame: 00007f6715561558] Interop+Sys.Write(System.Runtime.InteropServices.SafeHandle, Byte*, Int32)
00007F6715561540 00007F669F669A9E ILStubClass.IL_STUB_PInvoke
00007F67155615E0 00007F669F67333E System.ConsolePal.Write(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Boolean)
00007F67155616A0 00007F669F67360C System.ConsolePal.Write(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte[], Int32, Int32, Boolean) [/_/src/System.Console/src/System/ConsolePal.Unix.cs @ 1236]
00007F67155616C0 00007F669F672B2A System.IO.StreamWriter.Flush(Boolean, Boolean) [/_/src/System.Private.CoreLib/shared/System/IO/StreamWriter.cs @ 261]
00007F6715561710 00007F669F6729F3 System.IO.StreamWriter.WriteLine(System.String) [/_/src/System.Private.CoreLib/shared/System/IO/StreamWriter.cs @ 474]
00007F6715561760 00007F669F6727D3 System.IO.TextWriter+SyncTextWriter.WriteLine(System.String) [/_/src/System.Private.CoreLib/shared/System/IO/TextWriter.cs @ 891]
00007F67155617A0 00007F669F672770 System.Console.WriteLine(System.String) [/_/src/System.Console/src/System/Console.cs @ 550]
00007F67155617C0 00007F669F663791 NetCoreDumpTest.Program.PrintNumber(System.String, Int32) [/home/zhouke/NetCoreDumpTest/Program.cs @ 18]
00007F6715561800 00007F669F6636D9 NetCoreDumpTest.Program+<>c.<Main>b__0_0()
00007F6715561820 00007F669F1872A1 System.Threading.Tasks.Task.InnerInvoke() [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs @ 2466]
00007F6715561840 00007F669F18CBC2 System.Threading.Tasks.Task+<>c.<.cctor>b__274_0(System.Object) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs @ 2445]
00007F6715561850 00007F669F171AF2 System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(System.Threading.Thread, System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) [/_/src/System.Private.CoreLib/shared/System/Threading/ExecutionContext.cs @ 289]
00007F6715561890 00007F669F187111 System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs @ 2406]
00007F6715561910 00007F669F186F28 System.Threading.Tasks.Task.ExecuteEntryUnsafe(System.Threading.Thread) [/_/src/System.Private.CoreLib/shared/System/Threading/Tasks/Task.cs @ 2344]
00007F6715561930 00007F669F186EBB System.Threading.Tasks.Task.ExecuteFromThreadPool(System.Threading.Thread)
00007F6715561940 00007F669F17B754 System.Threading.ThreadPoolWorkQueue.Dispatch() [/_/src/System.Private.CoreLib/shared/System/Threading/ThreadPool.cs @ 663]
00007F67155619C0 00007F669F169A5B System.Threading._ThreadPoolWaitCallback.PerformWaitCallback() [/_/src/System.Private.CoreLib/src/System/Threading/ThreadPool.CoreCLR.cs @ 29]
00007F6715561D50 00007f6718a1ccaf [DebuggerU2MCatchHandlerFrame: 00007f6715561d50]

It’s a bit of a Java call stack, but we found our problem code


NetCoreDumpTest.Program.PrintNumber(System.String, Int32)

Sometimes we want to know what parameter is passed in which leads to high CPU consumption. We can add the parameter – A to clrstack


> clrstack -a
..............
00007F0DD6FFC7C0 00007F0D6EEF3791 NetCoreDumpTest.Program.PrintNumber(System.String, Int32) [/home/zhouke/NetCoreDumpTest/Program.cs @ 18]
  PARAMETERS:
    message (0x00007F0DD6FFC7E8) = 0x00007f0d4800b8b0
    startNumber (0x00007F0DD6FFC7E4) = 0x0000000000000005
  LOCALS:
    0x00007F0DD6FFC7E0 = 0x000000000014e42b
    0x00007F0DD6FFC7DC = 0x0000000000000001
...............

You can see that in parameters, startnumber is the value type, and you can directly see that the value is 5, while message is the reference type, pointing to 0x00007f0d4800b8b0. In this case, you need to use the dumpobj command


> dumpobj 0x00007f0d4800b8b0
Name:    System.String
MethodTable: 00007f0d6ef70f90
EEClass:   00007f0d6eede1c0
Size:    32(0x20) bytes
File:    /home/zhouke/dotnet/shared/Microsoft.NETCore.App/3.0.0-preview6-27804-01/System.Private.CoreLib.dll
String:   Print
Fields:
       MT  Field  Offset         Type VT   Attr      Value Name
00007f0d6ef6a138 400022b    8     System.Int32 1 instance        5 _stringLength
00007f0d6ef66f38 400022c    c     System.Char 1 instance        50 _firstChar
00007f0d6ef70f90 400022d   108    System.String 0  static 00007f0d47fff360 Empty

OK, you can see that it is a string with the content of “print”

If message is a complex type, you can check the information below fields for further details

Clrstack also has an experimental parameter – I, which helps to view various variable information. Lldb is needed. According to the official tutorial, I haven’t succeeded in the experiment yet

To view the process ID and thread ID, it is more convenient to use htop (to be installed), press F4 to filter, and enter dotnet

This figure is the result of rerunning the problem program. The process ID and thread ID are different from the previous one

In the second line, the process id = 1650 is white. In the first line, the CPU consumption is high. The problem thread id = 1658

summary

The above is the reason analysis of using dotnet dump to search. Net core 3.0 to occupy 100% CPU, which is introduced by Xiaobian. I hope it can help you. If you have any questions, please leave a message to me, and Xiaobian will reply you in time. Thank you very much for your support of the developepaer website!
If you think this article is helpful to you, welcome to reprint, please indicate the source, thank you!