Directly deserializing simple types with Protobuf

Time: 2021-9-10

I had an idea: it would be ideal to have a high-performance data protocol that works across platforms, lets two different programs read each other's data, and preferably supports serializing objects directly.

Goals

  1. Support serialization of arbitrary objects
  2. Support getting type information from a string such as System.String and deserializing with it
  3. Support direct serialization and deserialization of simple objects

Options

XML serialization

When it comes to serialization, the XML serialization that .NET provides is very handy. However, many types are not supported, such as Dictionary<>. Moreover, powerful as it is, the XML tag mechanism adds redundant content and takes up more space.
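The Dictionary<> limitation can be checked with a small probe. This is only a sketch: the XmlProbe class and the CanXmlSerialize method are illustrative names, not part of .NET, and the exact exception type thrown is an assumption, so the two likely candidates are both caught.

```csharp
using System;
using System.Xml.Serialization;

public static class XmlProbe
{
    // Returns false when XmlSerializer refuses to build a serializer for the type.
    public static bool CanXmlSerialize(Type type)
    {
        try
        {
            // Constructing the serializer reflects over the type and fails for unsupported ones.
            _ = new XmlSerializer(type);
            return true;
        }
        catch (Exception ex) when (ex is NotSupportedException || ex is InvalidOperationException)
        {
            return false;
        }
    }
}
```

Under these assumptions, XmlProbe.CanXmlSerialize(typeof(List<string>)) should return true, while the same call with typeof(Dictionary<string, int>) should return false.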

Binary serialization

Binary serialization supports arbitrary objects, and .NET provides BinaryFormatter for it:

// code from https://stackoverflow.com/questions/7442164/c-sharp-and-net-how-to-serialize-a-structure-into-a-byte-array-using-binary
MyObject obj = new MyObject();
byte[] bytes;
IFormatter formatter = new BinaryFormatter();
using (MemoryStream stream = new MemoryStream())
{
   formatter.Serialize(stream, obj);
   bytes = stream.ToArray();
}

This method supports serialization of arbitrary objects, but it has a problem: it is strictly bound to the type model, only supports exchanging messages between identical assembly versions, and does not support programs written in other languages.
I have used this method before for quickly saving and loading data within a single program. In that case the only goal is to save an object's state. It is very convenient, and I think it fits that job well.

Protobuf

Protobuf is the data format used in gRPC. It is cross-platform and supports multiple languages. The data is binary and very compact. Enough said.

To use the protobuf protocol in .NET, two libraries are commonly used: Google.Protobuf and protobuf-net. I won't repeat the detailed differences here; there is an article that compares them on several points. Because I prefer to use C#'s type system directly, I follow that article's suggestion and use protobuf-net.

protobuf-net

When both sides of the communication are .NET programs, using protobuf-net does not require writing proto files; the two sides can simply share a reference to the data classes. If you need to communicate with non-.NET programs, you can also use tooling to read a proto file and generate the classes. Let's revisit the goals and handle them one by one.

  1. Support serialization of arbitrary objects

Protobuf serializes through defined entity classes, so in that sense it also supports arbitrary objects. I won't elaborate here; see the official website for detailed usage.

  2. Support getting type information from a string such as System.String and deserializing with it

One pain point has always been whether general objects can be restored from serialized content, that is, when the object type is unknown at compile time. The approach is to save the type's name as a string; when deserialization is needed, load the type by that name and deserialize the content into it. This requires at least some reflection.

// Requires: using System; using System.Collections.Generic; using System.IO; using ProtoBuf;
static void Main(string[] args)
{
    var ps = new List<string> { "1346dfg", "31461sfghj", "24576sth" };
    // Save the type name so the type can be looked up again later
    var name = ps.GetType().FullName;
    using (FileStream ms = new FileStream("d:\\a.txt", FileMode.Create))
    {
        Serializer.Serialize(ms, ps);
    }
    using (FileStream ms = new FileStream("d:\\a.txt", FileMode.Open))
    {
        // The data is already a List<string>, but the declared return type is object, so it can be cast.
        dynamic data = Serializer.Deserialize(Type.GetType(name), ms);
        Console.WriteLine(data[1]);
    }
}

This uses the FullName property of Type. For a built-in type, say ps were a String, FullName would be System.String, which is simple. But in this case FullName is System.Collections.Generic.List`1[[System.String, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], which is suddenly much more complicated. Worse, it explicitly names CoreLib and its version. If the data needs to be deserialized in .NET Core 3.1, this will not work.

Let's try to solve this. System.String is not unique to .NET 5; the other .NET platforms have it too, so we need a way to strip the tail that follows System.String.

You could start from FullName, but it is rather long and I don't like manipulating strings directly, so let's start from the Type instead.

var ty = Type.GetType(name);
Console.WriteLine(ty.Name);
Console.WriteLine(ty.Namespace);
Console.WriteLine(ty.GenericTypeArguments[0].Name);
Console.WriteLine(ty.GenericTypeArguments[0].Namespace);
// Build the type name from the parts printed above (ms is the open stream from the previous example)
dynamic data = Serializer.Deserialize(Type.GetType($"{ty.Namespace}.{ty.Name}" +
    $"[{ty.GenericTypeArguments[0].Namespace}.{ty.GenericTypeArguments[0].Name}]"), ms);

Console.WriteLine(data[1]);

With this small change, manually joining the Namespace and Name properties gets us what we want.

The `1 in List`1 means the generic type has exactly one type parameter, which I hard-coded here. Other generic types may have several parameters; you need to detect them and adjust the code that builds the type name.

Following this idea, the complete code is:

static void Main(string[] args)
{
    var ps = new List<string> { "1346dfg", "31461sfghj", "24576sth" };
    var ty = ps.GetType();
    // Save the type name without assembly details
    var name = $"{ty.Namespace}.{ty.Name}" +
            $"[{ty.GenericTypeArguments[0].Namespace}.{ty.GenericTypeArguments[0].Name}]";
    // A real program would not go through a file, so this uses a MemoryStream instead.
    using (MemoryStream ms = new MemoryStream())
    {
        Serializer.Serialize(ms, ps);
        // Rewind the stream so reading starts from the beginning
        ms.Position = 0;
        // Deserialize using the type name
        dynamic data = Serializer.Deserialize(Type.GetType(name), ms);
        Console.WriteLine(data[1]);
    }
}
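
The hard-coded single type argument above can be generalized by walking GenericTypeArguments recursively. The following is a minimal sketch, not a complete solution: TypeNames and Short are hypothetical names, and it assumes the types involved are neither nested types nor arrays of generic types, which this naming scheme does not handle.

```csharp
using System;
using System.Linq;

public static class TypeNames
{
    // Hypothetical helper: builds "Namespace.Name[Arg1,Arg2]" with no assembly details.
    public static string Short(Type type)
    {
        // Namespace + Name gives e.g. "System.String" or "System.Collections.Generic.List`1".
        var baseName = $"{type.Namespace}.{type.Name}";

        // Non-generic types need nothing more.
        if (!type.IsConstructedGenericType)
        {
            return baseName;
        }

        // Recurse so that nested generic arguments are shortened as well.
        var args = type.GenericTypeArguments.Select(Short);
        return $"{baseName}[{string.Join(",", args)}]";
    }
}
```

For example, TypeNames.Short(typeof(Dictionary<int, string>)) yields System.Collections.Generic.Dictionary`2[System.Int32,System.String], which Type.GetType can resolve again as long as every type involved lives in the core library or in the calling assembly.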

  3. Support direct serialization and deserialization of simple objects

By simple objects I mean system-defined types that have their own TypeCode and are not Object, plus generic collections of those types. Here are a few that I use often.

Built-in types

Types defined in the System namespace, including DateTime and Int32, all take the form System.TypeName.

Note that int and float do not work; you need to use Int32 and Single.
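The reason is that int and float are C# keywords, while Type.GetType only understands CLR type names. If stored names might contain the keywords, a small lookup table can translate them first. This is a sketch under that assumption; CSharpAliases and Resolve are hypothetical names, and the table is deliberately partial.

```csharp
using System;
using System.Collections.Generic;

public static class CSharpAliases
{
    // Partial, illustrative map from C# keywords to CLR type names.
    private static readonly Dictionary<string, string> ClrNames =
        new Dictionary<string, string>
        {
            ["int"] = "System.Int32",
            ["long"] = "System.Int64",
            ["float"] = "System.Single",
            ["double"] = "System.Double",
            ["bool"] = "System.Boolean",
            ["string"] = "System.String",
        };

    // Accepts either a C# keyword or a CLR name; returns null when nothing matches.
    public static Type Resolve(string name)
    {
        string clrName;
        if (!ClrNames.TryGetValue(name, out clrName))
        {
            clrName = name; // Not a known keyword, so treat it as a CLR name.
        }
        return Type.GetType(clrName);
    }
}
```

With this in place, Type.GetType("int") still returns null, while CSharpAliases.Resolve("int") and CSharpAliases.Resolve("System.Int32") both return typeof(int).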

Generic collection + built-in type

Generic collections are defined in the System.Collections.Generic namespace, so the combined form is System.Collections.Generic.GenericName`ParameterCount[System.TypeName]. Two examples:

List<string> is System.Collections.Generic.List`1[System.String]

Dictionary<int, string> is System.Collections.Generic.Dictionary`2[[System.Int32],[System.String]]

Built-in type arrays

Append [] directly after the name, giving the form System.TypeName[].
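The name forms listed above can be checked in one place by passing each to Type.GetType. This is a small sketch; NameForms, Samples and ResolveAll are illustrative names, and the sample strings are the ones from the text plus System.DateTime and System.Int32[] as stand-ins for the built-in and array forms.

```csharp
using System;

public static class NameForms
{
    // One sample per form: built-in type, generic collections, built-in array.
    public static readonly string[] Samples =
    {
        "System.DateTime",
        "System.Collections.Generic.List`1[System.String]",
        "System.Collections.Generic.Dictionary`2[[System.Int32],[System.String]]",
        "System.Int32[]",
    };

    // Resolves every sample; an entry is null if the name could not be resolved.
    public static Type[] ResolveAll()
    {
        return Array.ConvertAll(Samples, name => Type.GetType(name));
    }
}
```

Printing the result of NameForms.ResolveAll() is a quick way to confirm, on whichever runtime the reader targets, that none of the names depends on an assembly version.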

Additional notes

Protobuf does not require the type used for deserialization to be exactly the same as the type that was serialized. For example, an array is interchangeable with a List or an IEnumerable. As a result, some separately defined types with a simple structure can be deserialized through built-in types, so the original class does not need to be loaded during deserialization, which simplifies things.

[ProtoContract]
public class Message
{
    [ProtoMember(1)]
    public List<string> values { get; set; }
}

static void Main(string[] args)
{
    var ps = new Message { values = new List<string> { "1346dfg", "31461sfghj", "24576sth" } };

    using (FileStream ms = new FileStream("d:\\a.txt", FileMode.Create))
    {
        Serializer.Serialize(ms, ps);
    }

    using (FileStream ms = new FileStream("d:\\a.txt", FileMode.Open))
    {
        // The list is deserialized without using the Message class, whose FullName here is "ConsoleApp6.Program+Message".
        dynamic data = Serializer.Deserialize(Type.GetType("System.Collections.Generic.List`1[System.String]"), ms);
        Console.WriteLine(data[1]);
    }
}

Also, since dynamic is not checked at compile time, anyone worried about mistakes can convert the result to a concrete type instead. Here is a code snippet that may help.

// Convert an object to type T
public static T ConvertTo<T>(object value)
{
    return (T)Convert.ChangeType(value, typeof(T));
}

If only a handful of types are possible, you can use a switch statement to test the object and convert it to T for type-safe operations. If not, it is better to use interfaces to define the common behavior of the classes. One answer on this topic offers some suggestions and is worth a look.
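For the limited-types case, the switch can use C# type patterns so that each branch works with a statically typed variable. This is a minimal sketch; SafeRead and Describe are hypothetical names, and the three handled types are only examples of what a program might expect.

```csharp
using System;
using System.Collections.Generic;

public static class SafeRead
{
    // Illustrative helper: narrows an untyped result to one of a few expected types.
    public static string Describe(object data)
    {
        switch (data)
        {
            case List<string> list:
                // list is a List<string> here, checked by the compiler.
                return "list of " + list.Count + " strings";
            case string[] array:
                return "array of " + array.Length + " strings";
            case int number:
                return "int " + number;
            default:
                // Covers null and any type that was not anticipated.
                return "unsupported";
        }
    }
}
```

The object returned by Serializer.Deserialize could be passed to such a method in place of the dynamic variable, so unexpected types end up in the default branch rather than failing at the point of use.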

References