Another. Net program is dead, using WinDbg to analyze a real case


1: Background

1. Tell a story

The day before yesterday, a fan friend left a message in the background and asked me to look at his WinForm programUI not responding + 410 threadsWhat is the situation, as shown in the figure below:

To tell you the truth, I especially like to see these real cases, just like doctors, just stay in the theory and those demos, there is no future. If there are friends who can’t make it in this area, I can help you to read dump for free, and then send a blog to explain it.

Well, let’s get down to business. Now that my fans have mentioned Gao Da410Threads, my instinctive response is either high load or wild threads, the latter are mostly numerous new threads stuck in a lock.

WinForm has a high load, which I haven’t encountered so far. If it’s stuck on a lock, it basically belongs to this category. With this preconceived idea, we can sacrifice WinDbg.

2: WinDbg analysis

1. Find the CLR synchronization block table

Ten people use locks, eight people use locks, so use them first!syncblkLook at the lock of the program.

0:000> !syncblk
Index         SyncBlock MonitorHeld Recursion Owning Thread Info          SyncBlock Owner
   76   070e5fa4           67         1 17367570 15e8 218   03e6dd68 System.IO.Ports.SerialStream
Total           789
CCW             39
RCW             2
ComClassFactory 1
Free            535

I’ll go. Judging from the hexagrams, the situation is very bad. Let me make a simple analysis.

  • MonitorHeld = 67

This 67 means that there is currently one thread holding the lock and 33 threads are waiting for the lock. Some friends must want to ask how to calculate it? Simple: when a thread holds a lockMonitorHeld+1When a thread is waiting for a lockMonitorHeld+2So the expression is:67= [1 + 66=(33*2)]

  • Owning Thread Info = 17367570 15e8 218

The above three information indicate the current thread. You can see the last 218. It is the thread ID mapped by WinDbg. If you don’t believe it, you can use it! Let’s find out.

0:000> !t
ThreadCount:      315
UnstartedThread:  0
BackgroundThread: 302
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1  c64 00cc3de0     24220 Preemptive  042E1884:00000000 00cbc0a0 0     STA 
 214  240 1398 16702b90   1029220 Preemptive  00000000:00000000 00cbc0a0 0     MTA (Threadpool Worker) 
 215  323  b5c 12ab7260   1029220 Preemptive  00000000:00000000 00cbc0a0 0     MTA (Threadpool Worker) 
 216  290 1858 16c21c98   1029220 Preemptive  00000000:00000000 00cbc0a0 0     MTA (Threadpool Worker) 
 218  117 15e8 17367570   1029220 Preemptive  00000000:00000000 00cbc0a0 1     MTA (Threadpool Worker) 

Yes, the culprit 218 holds the lock, causing 33 threads to wait for it innocently…

  • SyncBlock Owner = System.IO.Ports.SerialStream

You may wonder which object lock holds? fromSyncBlock OwnerIt seems to beSerialStreamIt turns out that my brother is playing serial port coding. Let me worship him first.

2. View thread stack

We know that 218 is responsible for this. Next, let’s take a look at its thread stack. What are they doing?

Some friends may not understand the call stack above. I drew a diagram:

As you can see from the figure, the thread from ThreadPool is in the user-definedDataReceivedThe method is stuck. I use it for convenience!DumpILLook at the IL code for this method.

0:218> !name2ee *!xxx.TYAComYB.DataReceived
Module:      03b10cc4
Assembly:    YKit.dll
Token:       06000108
MethodDesc:  08533584
Name:        xxx.TYAComYB.DataReceived(System.Object, System.IO.Ports.SerialDataReceivedEventArgs)
JITTED Code Address: 08644dc0

0:218> !dumpil 08533584
ilAddr = 05dc2dd8
IL_0000: nop 
IL_0001: nop 
IL_0002: nop 
IL_0003: ret

This code actually hides the hook, using!dumpilYou can’t see the code. No wonder you see a similar confusion method on the thread stackxxx.TYAComYB.EYLlXL2bKH()However, there is no problem with disassembly, which is simplified as follows:

0:218> !U /d 08644edf
08644ddd e86edaffff      call    08642850 (, mdToken: 060004b6)
08644df4 e807deffff      call    08642c00 (xxxx.YBComParam.get_DataPacketLen(), mdToken: 0600010c)
08644dfb b92a3e136e      mov     ecx,offset mscorlib_ni!System.GC.ReRegisterForFinalize(System.Object)  (mscorlib_ni+0x3e2a) (6e133e2a)
08644e00 e80fd460f8      call    00c52214 (JitHelp: CORINFO_HELP_NEWARR_1_VC)
08644e15 e8e6ddffff      call    08642c00 (xxx.YBComParam.get_DataPacketLen(), mdToken: 0600010c)
08644e22 e8edac4d68      call    System_ni+0x13fb14 (70b1fb14) (System.IO.Ports.SerialPort.Read(Byte[], Int32, Int32), mdToken: 06004173)
08644e2e ff153836b103    call    dword ptr ds:[3B13638h] (xxxx.LogKit.WriteLine(System.Exception), mdToken: 06000183)
08644e59 e8a2ddffff      call    08642c00 (xxxx.YBComParam.get_DataPacketLen(), mdToken: 0600010c)
08644e64 ff1580355308    call    dword ptr ds:[8533580h] (xxxx.TYAComYB.EYLlXL2bKH(), mdToken: 06000107)
08644e9b ff15a4265308    call    dword ptr ds:[85326A4h] (xxxx.YBComParam.get_DataPacketStart(), mdToken: 0600010e)
08644ea8 e837e34e66      call    mscorlib_ni!System.Convert.ToByte(System.String, Int32) (6eb331e4)
08644ed9 ff1580355308    call    dword ptr ds:[8533580h] (xxxx.TYAComYB.EYLlXL2bKH(), mdToken: 06000107)

Anyway, I’ve done a lot of things, so I’m too lazy to analyze them.

Next, let’s see how the 33 threads are stuckSerialStreamWhere are you going? You can use it~*e !clrstackScan all the threads and take a few.

0:218> ~*e !clrstack
OS Thread Id: 0xc64 (0)
Child SP       IP Call Site
OS Thread Id: 0x13d8 (330)
Child SP       IP Call Site
1b1aec90 77c8016d [GCFrame: 1b1aec90] 
1b1aee30 77c8016d [GCFrame: 1b1aee30] 
1b1aede0 77c8016d [HelperMethodFrame: 1b1aede0] System.Threading.Monitor.ReliableEnter(System.Object, Boolean ByRef)
1b1aee70 710d6b54 System.IO.Ports.SerialPort.CatchReceivedEvents(System.Object, System.IO.Ports.SerialDataReceivedEventArgs)
1b1aeeac 710d9520 System.IO.Ports.SerialStream+EventLoopRunner.CallReceiveEvents(System.Object)
1b1aeec0 6e45e356 System.Threading.QueueUserWorkItemCallback.WaitCallback_Context(System.Object)
1b1aeec8 6e43da07 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
1b1aef34 6e43d956 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
1b1aef48 6e45f120 System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
1b1aef5c 6e45e929 System.Threading.ThreadPoolWorkQueue.Dispatch()
1b1aefac 6e45e7d5 System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
1b1af1d4 71382552 [DebuggerU2MCatchHandlerFrame: 1b1af1d4]

I’ll go. It’s stuckSystem.IO.Ports.SerialPort.CatchReceivedEventsHere, and it is provided by the framework, which is very confusing.

3. Analyze serialport source code

To see the source code of the serialport class, you can use ilspy, as shown in the following figure:

If you look at the picture I just drew, is it clearer? The reason isdataReceived(this, e);The triggered user callback function is not finished, which leads to a large number of threads waiting at lock.

3: Summary

In order to understand why there are so many threads created in the bottom layer, I specially checked the serial port class SerialPort, saying that the data sent by the serial port sender can be actively received by the receiver or passively received by the receiver, and passivity is suchEvent modeWhen the receiver receives the data sent by the sender, the operating system will let the CLR handle the callback event through thread, so from the perspective of hexagram, it is a typical lock wait caused by the receiver’s insufficient processing capacity.

Two optimization measures are proposed

  • promotexxx.TYAComYB.DataReceivedThe processing ability of business logic in method.

  • Increase the reservoir, let the bottom layer of waterlock (serialStream)Get released as soon as possible.

More high quality dry goods: see my GitHub:dotnetfly


Recommended Today

Detailed explanation of how to customize redis command with Lua

preface As a very successful database, redis provides a wealth of data types and commands. Using these, we can easily and efficiently complete many cache operations, but there are always some special problems or needs to be solved. At this time, we may need to customize our own redis data structure and commands. Redis command […]