Interpreting and applying the C# Span source code

Time:2021-11-26

1: Background

1. Tell a story

I’ve been too busy these past two days to publish on schedule; sorry about that. A few days ago a friend in the group asked when the next Span article would be out. Well, here it is! Readers of the previous article will know that Span gives .NET programs unified access to three kinds of memory: stack, managed heap, and unmanaged memory. It is also a first-class citizen in the .NET base class library, where many existing classes support Span / ReadOnlySpan.

  • String support for Span / ReadOnlySpan
public sealed class String
    {
        [MethodImpl(MethodImplOptions.InternalCall)]
        [NullableContext(0)]
        public extern String(ReadOnlySpan<char> value);
    }
  • StringBuilder support for Span / ReadOnlySpan
public sealed class StringBuilder : ISerializable
    {
        public unsafe StringBuilder Append(ReadOnlySpan<char> value)
        {
            if (value.Length > 0)
            {
                fixed (char* value2 = &MemoryMarshal.GetReference(value))
                {
                    Append(value2, value.Length);
                }
            }
            return this;
        }
    }
  • Int32 support for Span / ReadOnlySpan
public readonly struct Int32
    {
        public static int Parse(ReadOnlySpan<char> s, NumberStyles style = NumberStyles.Integer, IFormatProvider? provider = null)
        {
            NumberFormatInfo.ValidateParseStyleInteger(style);
            return Number.ParseInt32(s, style, NumberFormatInfo.GetInstance(provider));
        }
    }

If even these general-purpose, basic types support Span / ReadOnlySpan, more complex types certainly do too, so its standing is self-evident. Next, let’s talk about how Span itself works.
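As a minimal sketch of how the three overloads listed above are called from user code (the class name SpanApiDemo, the method name, and the sample input are made up for illustration):

```csharp
using System;
using System.Text;

public static class SpanApiDemo
{
    // Builds "year/month" from a "yyyy-MM-dd" text using the three span-aware overloads.
    public static string YearSlashMonth(string text)
    {
        ReadOnlySpan<char> span = text.AsSpan();

        // int.Parse(ReadOnlySpan<char>, ...) parses without creating a substring.
        int year = int.Parse(span.Slice(0, 4));

        // new string(ReadOnlySpan<char>) copies the characters into a new string.
        string month = new string(span.Slice(5, 2));

        // StringBuilder.Append(ReadOnlySpan<char>) appends the characters directly.
        var sb = new StringBuilder();
        sb.Append(year).Append('/').Append(month.AsSpan());
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(YearSlashMonth("2021-11-26"));
    }
}
```

Only the string constructor allocates here, because it has to produce a new string; the Parse and Append calls read the characters in place.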

2: Research on span principle

1. Span source code analysis

I assume you can already use Span flexibly to solve practical problems at work. With that foundation, let’s analyze Span in depth from two angles, its source code and its user-mode memory layout, starting with the source code.

public readonly ref struct Span<T>
    {
        internal readonly ByReference<T> _pointer;

        private readonly int _length;
    }

The ref struct in the code above shows that Span is a value type that can only live on the stack, and that _pointer and _length are its two instance fields. After reading these two fields you may already have a picture in your mind. It probably looks like this.

It is clear that Span maps a contiguous region of memory. The size of the region is controlled by _length and the starting position by _pointer. Doesn’t that look a lot like a pointer? Yes. The language team has to keep you safe while still delivering high performance, and it goes to great pains to achieve both.
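The start-plus-length idea can be sketched in a few lines. This is an illustrative example, not code from the runtime; the class and method names are invented. It uses a span over stack memory and a span that acts as a window over part of an array:

```csharp
using System;

public static class SpanWindowDemo
{
    // Works for any contiguous memory: it only needs a start reference and a length.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int total = 0;
        foreach (int v in values)
        {
            total += v;
        }
        return total;
    }

    public static int SumOnStack()
    {
        // Stack memory: the three ints are not allocated on the managed heap.
        Span<int> onStack = stackalloc int[] { 1, 2, 3 };
        return Sum(onStack);
    }

    public static int[] WriteThroughWindow()
    {
        // Managed heap memory: the span is a window over elements 3, 4, 5.
        int[] nums = { 1, 2, 3, 4, 5, 6 };
        Span<int> window = nums.AsSpan(2, 3);
        window[0] = 30;   // no copy was made, so this changes nums[2]
        return nums;
    }

    public static void Main()
    {
        Console.WriteLine(SumOnStack());
        Console.WriteLine(string.Join(",", WriteThroughWindow()));
    }
}
```

Unmanaged memory can be wrapped the same way through the Span constructor that takes a pointer and a length, but that requires an unsafe context, so it is left out of this sketch.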

2. Span user-mode analysis

The picture has been drawn, but many readers will still want to see it for themselves and try it out. No challenge is too scary, so let me first turn the picture above into code:

static void Main(string[] args)
        {
            var nums = new int[] { 1, 2, 3, 4, 5, 6 };

            var span = new Span<int>(nums);

            Console.ReadLine();
        }

Next, I use WinDbg to find the span on the thread stack.

0:000> !clrstack -l
OS Thread Id: 0x181c (0)
        Child SP               IP Call Site
000000963277E5D0 00007ffc3e601434 ConsoleApp1.Program.Main(System.String[]) [E:\net5\ConsoleApp2\ConsoleApp1\Program.cs @ 13]
    LOCALS:
        0x000000963277E618 = 0x000001e956b8ab10
        0x000000963277E608 = 0x000001e956b8ab20

From the last line of the output, we can see that the stack address of span is 0x000000963277E608 and that its content is 0x000001e956b8ab20. According to the picture, 0x000001e956b8ab20 should be the memory address of element 1 of the nums array, which can be verified with dp.

0:000> dp 0x000001e956b8ab20
000001e9`56b8ab20  00000002`00000001 00000004`00000003
000001e9`56b8ab30  00000006`00000005 00000000`00000000
000001e9`56b8ab40  00007ffc`3e6c4388 00000000`00000000

The first two lines of the dump show the array elements 1, 2, 3, 4, 5, 6 in order (each 8-byte value holds two 4-byte ints). Some readers may wonder why the address stored in nums does not point at array element 1. Let me explain. First use dp to dump the memory at the array’s address.

0:000> dp 0x000001e956b8ab10
000001e9`56b8ab10  00007ffc`3e69f090 00000000`00000006
000001e9`56b8ab20  00000002`00000001 00000004`00000003
000001e9`56b8ab30  00000006`00000005 00000000`00000000

The first row is 00007ffc3e69f090 0000000000000006. The first 8 bytes are the method table address of the array, and the next 8 bytes hold 6, the number of elements in the array. If you don’t believe it, here is a screenshot:

Span consists of _pointer + _length. You have just seen _pointer, so where is the value of _length? Because Span is a struct, its fields are stored inline on the stack, so you need to use dp on the span’s own stack address, the lower of the two local addresses listed above.

I think that is clear enough by now. If you are still a little confused, take some time to think it through.
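The same two facts can also be checked from code without a debugger. The sketch below is an illustration with invented names; it assumes a non-empty array and uses Unsafe.AreSame and MemoryMarshal.GetReference to test whether the span’s start reference is the array’s first element:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class SpanLayoutDemo
{
    // True when the span's start reference is the address of array[0].
    // Assumes the array is not empty.
    public static bool StartsAtFirstElement(int[] array)
    {
        var span = new Span<int>(array);
        return Unsafe.AreSame(ref MemoryMarshal.GetReference(span), ref array[0]);
    }

    public static void Main()
    {
        var nums = new int[] { 1, 2, 3, 4, 5, 6 };
        var span = new Span<int>(nums);

        Console.WriteLine(StartsAtFirstElement(nums));
        Console.WriteLine(span.Length);   // the value kept in _length
    }
}
```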

3: Span in practice with string and List

Span has so many application scenarios that this article cannot list them all. Here are two examples to give you a feel for its power.

1. Application on string

Case: how do you efficiently compute the value of the user input 10+20?

1) Traditional substring approach

The traditional method is very simple. The code is as follows:

static void Main(string[] args)
        {
            var word = "10+20";

            var splitIndex = word.IndexOf("+");

            var num1 = int.Parse(word.Substring(0, splitIndex));

            var num2 = int.Parse(word.Substring(splitIndex + 1));

            var sum = num1 + num2;

            Console.WriteLine($"{num1}+{num2}={sum}");

            Console.ReadLine();
        }

The result is easy to compute, but look closely: is there a problem here? To extract the two numbers from word, I called Substring twice, which means two strings are allocated on the managed heap. If I run this 10,000 times, will there be 20,000 strings on the managed heap? The modified code is as follows:

for (int i = 0; i < 10000; i++)
            {
                var num1 = int.Parse(word.Substring(0, splitIndex));

                var num2 = int.Parse(word.Substring(splitIndex + 1));

                var sum = num1 + num2; 
            }

Then look at the number of strings on the managed heap:

0:000> !dumpheap -type String -stat
Statistics:
              MT    Count    TotalSize Class Name
00007ffc53a81e18    20167       556538 System.String

There are 20,167 strings on the managed heap, which is terrible and really adds work for the GC. (The other 167 are strings created by the system itself.) The next question: is there a way to replace Substring so that no temporary strings are generated?

2) New span practices

If you understood the Span structure diagram, you should be able to use _pointer + _length to slice the string, right? The code is as follows:

for (int i = 0; i < 10000; i++)
            {
                var num1 = int.Parse(word.AsSpan(0, splitIndex));

                var num2 = int.Parse(word.AsSpan(splitIndex + 1));

                var sum = num1 + num2; 
            }

Then verify on the managed heap that no temporary strings were created:

0:000> !dumpheap -type String -stat
Statistics:
              MT    Count    TotalSize Class Name
00007ffc53a51e18      167        36538 System.String

You can see that only the 167 system strings remain. The 20,000 temporary strings are gone, which takes a lot of pressure off the GC.
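If WinDbg is not at hand, a rough check is possible from code. The sketch below is one possible way to do it, not the method used above: it wraps both variants in methods (names invented here) and compares the managed bytes allocated on the current thread using GC.GetAllocatedBytesForCurrentThread. The exact numbers depend on the runtime, so none are claimed here.

```csharp
using System;

public static class ParseAllocationDemo
{
    public static int SumWithSubstring(string word)
    {
        int splitIndex = word.IndexOf('+');
        int num1 = int.Parse(word.Substring(0, splitIndex));    // allocates a string
        int num2 = int.Parse(word.Substring(splitIndex + 1));   // allocates a string
        return num1 + num2;
    }

    public static int SumWithSpan(string word)
    {
        int splitIndex = word.IndexOf('+');
        int num1 = int.Parse(word.AsSpan(0, splitIndex));       // view over word
        int num2 = int.Parse(word.AsSpan(splitIndex + 1));      // view over word
        return num1 + num2;
    }

    // Managed bytes allocated on this thread while running the parser 10,000 times.
    public static long AllocatedBytes(Func<string, int> parse, string word)
    {
        parse(word);   // warm-up, so one-time allocations are not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < 10000; i++)
        {
            parse(word);
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        const string word = "10+20";
        Console.WriteLine($"Substring: {AllocatedBytes(SumWithSubstring, word)} bytes");
        Console.WriteLine($"AsSpan:    {AllocatedBytes(SumWithSpan, word)} bytes");
    }
}
```

The Substring variant should report far more allocated bytes than the AsSpan variant, in line with the heap statistics above.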

2. Application on list

Span is most often applied to arrays. After all, an array is contiguous memory on the managed heap, so it is easy for a Span to open a window onto it. In fact, starting with .NET 5, a Span can also provide a view over a List, not just an array. The screenshot is as follows:

Because add, remove, and update operations on a List can shrink, grow, or reallocate the underlying array, a span taken earlier is not guaranteed to keep pointing at the list’s current memory. Therefore, once a Span is taken over a List, the List should be treated as immutable, with no items added or removed, which is also the official recommendation.
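The screenshot is not reproduced here. The .NET 5 API that exposes such a view is CollectionsMarshal.AsSpan, and the following is a minimal sketch of using it (the class and method names are invented for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class ListSpanDemo
{
    // Doubles every item in place through a span over the list's backing array.
    public static void DoubleInPlace(List<int> list)
    {
        Span<int> span = CollectionsMarshal.AsSpan(list);
        for (int i = 0; i < span.Length; i++)
        {
            span[i] *= 2;
        }
        // Do not add or remove items while the span is in use: a resize would
        // leave the span pointing at the old backing array.
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        DoubleInPlace(list);
        Console.WriteLine(string.Join(",", list));
    }
}
```

Changing existing items through the span is fine because it does not resize the list; it is adding and removing items that can invalidate the view.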

4: Summary

Overall, Span plays an increasingly important role in the underlying .NET framework. I believe it has great potential as .NET Core keeps pushing for higher performance. Start learning it now!

For more high-quality content, see my GitHub: dotnetfly
