Remember the CPU system of.NET hospital official account system.

Time:2021-7-2

1: Background

1. Tell a story

Last Thursday, a friend came up with a Wx consultation programCPU + threadI hope I can help you find out the situation of double high, as shown in the figure below:

From the screenshot, it’s just that the thread is exploding, but I don’t see the CPU exploding. What’s interesting is that this friend said:It's been manually recycledI don’t know why I want to laugh, but I cry with a smile.

Maybe my friend knows the old rules and sent me two dumps. Then I can start work. Since there are thousands of threads in every minute during the peak period, I share the article with you the other daySerial portThe problem is very similar. It must be that individual threads cannot exitlockThe CLR needs to create more thread pool threads to cope with the cumulative number of threadsWork QueueSo it’s still a prioritySynchronous block tableAgain, ten people use locks and eight people use locks.

2: WinDbg analysis

1. Find the CLR synchronization block table

You can use it!syncblkSee if there’s a lock.

0:000> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info  SyncBlock Owner
   95 00000262b8a30ca8          193         1 00000262b8a36b50 116b8  53   0000025e2a8ded70 System.Object
  118 00000262b8a32098          107         1 00000262bad503b0 710c 135   00000260ea8a9b00 NLog.Logger
  200 00000262ba236cc8           13         1 00000262b9df1660 8858  69   0000025e2a8dcdf0 System.Object
-----------------------------
Total           305
CCW             3
RCW             6
ComClassFactory 0
Free            116

Although there are more than normal holding lock values in the hexagram:193,107,13But intuition tells me, is it a deadlock??? usesosexExtended!dlkIs it true that the command can be retrieved automatically?

0:000> !dlk
Examining SyncBlocks...
Scanning for ReaderWriterLock instances...
Scanning for holders of ReaderWriterLock locks...
Scanning for ReaderWriterLockSlim instances...
Scanning for holders of ReaderWriterLockSlim locks...
Examining CriticalSections...
Scanning for threads waiting on SyncBlocks...
Scanning for threads waiting on ReaderWriterLock locks...
Scanning for threads waiting on ReaderWriterLocksSlim locks...
Scanning for threads waiting on CriticalSections...
No deadlocks detected.

From the last line, there’s no problemdeadlocksIt seems that my intuition is wrong.

I can only look back at the tallest one193, which means that there is one thread holding the lock (thread 53) and 96 threads waiting for the lock. It’s easy to solve the problem of lock. Just check its thread stack.

2. View thread stack

To be more stable, I’ll use it!dumpstackCall out the managed and unmanaged stacks of 53 threads, as shown in the following figure:

As you can see from the call stack above, the program usesNLog.WriteAfter writing the log, it finally gets stuck in the logcalling ntdll!NtCreateFileI’m also surprised at this Win32 function. Is the disk writing speed too low? I immediately asked my friend if it was SSD or not. My friend said it might not be. Moreover, my friend said that it could be up to 600m log in half an hour during the peak period. I think the problem lies in the slow disk writing…

3. Do you really decide to let the disk carry the pot?

It doesn’t seem appropriate to give this answer to a friend, let a friend change SSD? How can we do if we can’t handle the volume of logs? So the implication is: farming is responsible, and cattle are also responsible. How can we find responsibility from them??? Look back at the call stack.

0:053> !clrstack
OS Thread Id: 0x116b8 (53)
        Child SP               IP Call Site
0000006d65d3d238 00007ff849ac65b4 [InlinedCallFrame: 0000006d65d3d238] NLog.Internal.Win32FileNativeMethods.CreateFile(System.String, FileAccess, Int32, IntPtr, CreationDisposition, NLog.Targets.Win32FileAttributes, IntPtr)
0000006d65d3d238 00007ff7d2d8c33e [InlinedCallFrame: 0000006d65d3d238] NLog.Internal.Win32FileNativeMethods.CreateFile(System.String, FileAccess, Int32, IntPtr, CreationDisposition, NLog.Targets.Win32FileAttributes, IntPtr)
0000006d65d3d1f0 00007ff7d2d8c33e DomainBoundILStubClass.IL_STUB_PInvoke(System.String, FileAccess, Int32, IntPtr, CreationDisposition, NLog.Targets.Win32FileAttributes, IntPtr)
0000006d65d3d300 00007ff7d2d8bcdc NLog.Internal.FileAppenders.BaseFileAppender.WindowsCreateFile(System.String, Boolean)
0000006d65d3d380 00007ff7d2d8b94f NLog.Internal.FileAppenders.BaseFileAppender.TryCreateFileStream(Boolean)
0000006d65d3d3e0 00007ff7d2d8b673 NLog.Internal.FileAppenders.BaseFileAppender.CreateFileStream(Boolean)
0000006d65d3d440 00007ff7d2d8b501 NLog.Internal.FileAppenders.RetryingMultiProcessFileAppender.Write(Byte[])
0000006d65d3d490 00007ff7d2d8aca0 NLog.Targets.FileTarget.WriteToFile(System.String, NLog.LogEventInfo, Byte[], Boolean)
0000006d65d3d4e0 00007ff7d2a44dd3 NLog.Targets.FileTarget.ProcessLogEvent(NLog.LogEventInfo, System.String, Byte[])
0000006d65d3d550 00007ff7d2a485c9 NLog.Targets.Target.Write(NLog.Common.AsyncLogEventInfo)
0000006d65d3d590 00007ff7d2a487b7 NLog.Targets.Target.WriteAsyncLogEvent(NLog.Common.AsyncLogEventInfo)
0000006d65d3d610 00007ff7d2a48ab5 NLog.LoggerImpl.WriteToTargetWithFilterChain(NLog.Internal.TargetWithFilterChain, NLog.LogEventInfo, NLog.Common.AsyncContinuation)
0000006d65d3d670 00007ff7d2a38c45 NLog.LoggerImpl.Write(System.Type, NLog.Internal.TargetWithFilterChain, NLog.LogEventInfo, NLog.LogFactory)
0000006d65d3d6d0 00007ff7d2a39282 NLog.Logger.Trace(System.String)

I don’t know if you have found that thread 53 TMD not only deals with business, but also calls Win32(User mode kernel mode)Write file, who can bear it???

An efficient log system, go should beDedicated thread + log buffer queueI found the information of NLog. Hey, NLog really provides this solution.

So we have to optimize the default configuration of NLog. It seems that we can end this article. No, now that we’re all here, I have to find some developer responsibilities.

3. How to find the responsibility of developers

If you are careful, do you think there is something missing? Yes, it’s the synchronization block. There are three pieces of information on the hexagram. Right? For the convenience of checking, I’ll post it again.

0:000> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info  SyncBlock Owner
   95 00000262b8a30ca8          193         1 00000262b8a36b50 116b8  53   0000025e2a8ded70 System.Object
  118 00000262b8a32098          107         1 00000262bad503b0 710c 135   00000260ea8a9b00 NLog.Logger
  200 00000262ba236cc8           13         1 00000262b9df1660 8858  69   0000025e2a8dcdf0 System.Object

Index = 95 is related to nloger. How can index = 118 be related to nlogerNLog.LoggerWhat about the relevance? Next, let’s take these two objects0000025e2a8ded70, 00000260ea8a9b00Source code export, you can use!gcroot + !name2ee + !savemodule

0:053> !gcroot 0000025e2a8ded70
Thread 116b8:
    0000006d65d3d590 00007ff7d2a487b7 NLog.Targets.Target.WriteAsyncLogEvent(NLog.Common.AsyncLogEventInfo)
        rbp-48: 0000006d65d3d5b8
            ->  0000025e2a8ded70 System.Object
0:053> !name2ee *!NLog.Targets.Target.WriteAsyncLogEvent
--------------------------------------
Module:      00007ff7d2b172d8
Assembly:    NLog.dll
Token:       0000000006000b5e
MethodDesc:  00007ff7d2be3330
Name:        NLog.Targets.Target.WriteAsyncLogEvent(NLog.Common.AsyncLogEventInfo)
JITTED Code Address: 00007ff7d2a48700
--------------------------------------
0:053> !savemodule 00007ff7d2b172d8 E:\dumps.dll
3 sections in file
section 0 - VA=2000, VASize=7faa4, FileAddr=200, FileSize=7fc00
section 1 - VA=82000, VASize=3e8, FileAddr=7fe00, FileSize=400
section 2 - VA=84000, VASize=c, FileAddr=80200, FileSize=200

0:053> !gcroot 00000260ea8a9b00
Thread 710c:
    0000006d68f3df30 00007ff7d2d8a3b2 xxx.Logger.log(System.String)
        rdi: 
            ->  00000260ea8a9b00 NLog.Logger
0:053> !name2ee *!xxx.Logger.log
--------------------------------------
Module:      00007ff7d29b5558
Assembly:    xxx.dll
Token:       0000000006001ead
MethodDesc:  00007ff7d29b9a38
Name:        xxx.Logger.log(System.String)
JITTED Code Address: 00007ff7d2d8a260
--------------------------------------
0:053> !savemodule 00007ff7d29b5558 E:\dumps.dll
3 sections in file
section 0 - VA=2000, VASize=221cf0, FileAddr=200, FileSize=221e00
section 1 - VA=224000, VASize=3c8, FileAddr=222000, FileSize=400
section 2 - VA=226000, VASize=c, FileAddr=222400, FileSize=200

Open with ilspy2.dllAfter that, I found the interesting partLogger.log()Code, it’s really interesting… This is shown below.

public class Logger
{
	private static Logger Log = LogManager.GetLogger("");

	private static object lockCache = new object();

	public static void WriteLog(string message)
	{
		Task.Run(delegate
		{
			log(message);
		});
	}

	public static void log(string message)
	{
		try
		{
			if (message.Contains("xxxxxxx"))
			{
				lock (Log)
				{
					Log.Warn("    " + message + "\r\n\r\n");
				}
			}
			else
			{
				lock (Log)
				{
					Log.Info("    " + message + "\r\n\r\n");
				}
			}
		}
		catch (Exception)
		{
		}
	}

	public static void WriteLog(string message, params object[] args)
	{
		lock (Log)
		{
			Log.Info("    " + string.Format(message, args));
		}
	}
}

Actually inlog()Method added a lock, this is how distrust NLog ha, there is a little bit inWriteLog()MethodTask.RunLog. No wonder my friend said that there are thousands of threads per minute. This time I understand…

When I make complaints about it, I end it. I see another dump, but I don’t want to make complaints about it.

0:000> !t
ThreadCount:      200
UnstartedThread:  0
BackgroundThread: 200
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
                                                                                                        Lock  
       ID OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
  78   47 afb8 000001cd7abbf1d0  3029220 Preemptive  000001CBB81648C0:000001CBB8166318 000001cd798a9d30 2     MTA (Threadpool Worker) System.IO.FileLoadException 000001cbb81644a8

0:000> !PrintException /d 000001cbb81644a8
Exception object: 000001cbb81644a8
Exception type:   System.IO.FileLoadException
Message: this file is in use by another program and cannot be accessed by the process( Exception from HResult: 0x80070020)
InnerException:   
StackTrace (generated):
    SP               IP               Function
    0000001B3703E750 0000000000000000 mscorlib_ni!System.Runtime.InteropServices.Marshal.ThrowExceptionForHRInternal(Int32, IntPtr)+0x1
    0000001B3703E750 00007FF7D2D30D87 UNKNOWN!NLog.Internal.FileAppenders.BaseFileAppender.WindowsCreateFile(System.String, Boolean)+0x157
    0000001B3703E7D0 00007FF7D2D3092F UNKNOWN!NLog.Internal.FileAppenders.BaseFileAppender.TryCreateFileStream(Boolean)+0x5f
    0000001B3703E830 00007FF7D2D30593 UNKNOWN!NLog.Internal.FileAppenders.BaseFileAppender.CreateFileStream(Boolean)+0xd3

StackTraceString: 
HResult: 80070020
The current thread is unmanaged

There’s moreProcess occupancy exception。。。 And the exception stack is not the familiar file creation functionWindowsCreateFileIs that right??? Well, curiosity drives me to decide to get that file name. I can switch to thread 78 and use it!clrstack -aCall out parameters and local variables, and find the final file name.

0:078> !clrstack -a
OS Thread Id: 0xafb8 (78)
0000001b3703e750 00007ff7d2d30ce1 NLog.Internal.FileAppenders.BaseFileAppender.WindowsCreateFile(System.String, Boolean)
    PARAMETERS:
        this () = 0x000001c9771abf40
0:078> !do 0x000001c9771abf40
Name:        NLog.Internal.FileAppenders.RetryingMultiProcessFileAppender
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff830f88760  40000dd        8        System.Random  0 instance 000001c9771abf80 random
00007ff830f99808  40000de       10        System.String  0 instance 000001c9772fd418 k__BackingField

0:078> !DumpObj /d 000001c9772fd418
Name:        System.String
MethodTable: 00007ff830f99808
EEClass:     00007ff830876cb8
Size:        142(0x8e) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
String:      D:\xxx\wwwroot\WebService\log21-04-16\file.txt

Remember the second screenshot at the beginning of the article? My friend opened multiple copies of the WebService program. Unexpectedly, they all wrote one file. This is taboo…

3: Summary

So many people make complaints about it. Maybe my friends and I are doing it.Medical industryThe pressure from Party A is still great. Finally, the optimization measures are given as follows.

  • Modify NLog configuration file, supportProprietary thread + queueMode to release the business thread.

  • The writing and calling methods of NLog are too messy, so it needs to be re encapsulated. It only needs to provide an interface to the outside world. To use it, you need to trust it.

  • Conditional upgrade to SSD.

The last egg is the feedback of good news

More high quality dry goods: see my GitHub:dotnetfly

图片名称