Converting between protobuf and JSON

Time:2019-11-8

Preface

At work we recently started using Google's protobuf to build REST APIs. So far, the main benefit I have noticed is that the interface is strictly defined; beyond that and protobuf's built-in features, nothing special stands out yet. Protobuf is said to be smaller and faster to serialize than JSON, but our current requirements probably do not need that performance. Still, it is a new technology, and I am happy to learn it.

In an MVC architecture, protobuf is a controller-layer technology. To keep the layers separate and make the service layer independent of protobuf, the protobuf entity classes (called ProtoBeans here) need to be converted into POJOs. Converting a ProtoBean to JSON came up while implementing that, which is the reason for this article. Converting ProtoBeans to POJOs is more complex and will be covered in one or more separate articles.

This article sat unfinished for a long time because I wanted to read the implementations of both JsonFormat classes first. I meant to write after reading them, but reading them is tiring, so I decided to write first.

To keep the text readable, the links mentioned in the article are collected at the end rather than placed inline.

The protobuf file used in the test is as follows:

syntax = "proto3";

import "google/protobuf/any.proto";

option java_package = "io.gitlab.donespeak.javatool.toolprotobuf.proto";
package data.proto;

message OnlyInt32 {
    int32 int_val = 1;
}

message BaseData {
    double double_val = 1;
    float float_val = 2;
    int32 int32_val = 3;
    int64 int64_val = 4;
    uint32 uint32_val = 5;
    uint64 uint64_val = 6;
    sint32 sint32_val = 7;
    sint64 sint64_val = 8;
    fixed32 fixed32_val = 9;
    fixed64 fixed64_val = 10;
    sfixed32 sfixed32_val = 11;
    sfixed64 sfixed64_val = 12;
    bool bool_val = 13;
    string string_val = 14;
    bytes bytes_val = 15;

    repeated string re_str_val = 17;
    map<string, BaseData> map_val = 18;
}

message DataWithAny {
    double double_val = 1;
    float float_val = 2;
    int32 int32_val = 3;
    int64 int64_val = 4;
    bool bool_val = 13;
    string string_val = 14;
    bytes bytes_val = 15;

    repeated string re_str_val = 17;
    map<string, BaseData> map_val = 18;

    google.protobuf.Any anyVal = 102;
}

Available tools

There are two tools for converting a ProtoBean to JSON. One is com.google.protobuf/protobuf-java-util; the other is com.googlecode.protobuf-java-format/protobuf-java-format. Their performance and output still need to be compared. This article uses com.google.protobuf/protobuf-java-util, because the JsonFormat in protobuf-java-format serializes a map as a list of {"key": "", "value": ""} objects, whereas the JsonFormat in protobuf-java-util serializes it as the expected key-value object.

<!-- https://mvnrepository.com/artifact/com.google.protobuf/protobuf-java-util -->
<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java-util</artifactId>
    <version>3.7.1</version>
</dependency>

<!-- https://mvnrepository.com/artifact/com.googlecode.protobuf-java-format/protobuf-java-format -->
<dependency>
    <groupId>com.googlecode.protobuf-java-format</groupId>
    <artifactId>protobuf-java-format</artifactId>
    <version>1.4</version>
</dependency>
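The difference in map output described above can be pictured with a small JDK-only sketch. It does not call either library; it only hand-builds the two JSON shapes for a string map, so the class and method names (MapJsonShapes, asObject, asEntryList) are made up for illustration, and no string escaping is done.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hand-rolled illustration of two JSON shapes for a map field; no escaping, not a real serializer. */
final class MapJsonShapes {

    /** Key-value object, the shape the article attributes to protobuf-java-util. */
    static String asObject(Map<String, String> map) {
        StringJoiner out = new StringJoiner(",", "{", "}");
        map.forEach((k, v) -> out.add("\"" + k + "\":\"" + v + "\""));
        return out.toString();
    }

    /** List of key/value entry objects, the shape the article attributes to protobuf-java-format. */
    static String asEntryList(Map<String, String> map) {
        StringJoiner out = new StringJoiner(",", "[", "]");
        map.forEach((k, v) -> out.add("{\"key\":\"" + k + "\",\"value\":\"" + v + "\"}"));
        return out.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the printed output is stable
        Map<String, String> mapVal = new LinkedHashMap<>();
        mapVal.put("a", "1");
        mapVal.put("b", "2");
        System.out.println(asObject(mapVal));    // {"a":"1","b":"2"}
        System.out.println(asEntryList(mapVal)); // [{"key":"a","value":"1"},{"key":"b","value":"2"}]
    }
}
```

The first shape can be read directly as a JSON object by most clients, which is the reason given above for choosing protobuf-java-util.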

Code implementation

import com.google.protobuf.Message;
import com.google.protobuf.util.JsonFormat;

import java.io.IOException;

/**
 * Notes:
 * <ul>
 * <li>this implementation cannot process a message that has a field of type Any</li>
 * <li>enum values are converted to the enum's string name</li>
 * <li>bytes values are converted to a Base64-encoded string</li>
 * </ul>
 * @author Yang Guanrong
 * @date 2019/08/20 17:11
 */
public class ProtoJsonUtils {

    public static String toJson(Message sourceMessage)
            throws IOException {
        String json = JsonFormat.printer().print(sourceMessage);
        return json;
    }

    public static Message toProtoBean(Message.Builder targetBuilder, String json) throws IOException {
        JsonFormat.parser().merge(json, targetBuilder);
        return targetBuilder.build();
    }
}

General data types such as int, double, float, long, and string convert as expected. An enum field in protobuf is converted to a string holding the name of the enum value. A bytes field is converted to a Base64-encoded string.
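For the bytes case, the proto3 JSON mapping represents the value as Base64 text rather than as the raw characters. The following JDK-only sketch shows what that text looks like for a small payload; it does not call JsonFormat, and the class name BytesAsJsonText is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Shows the Base64 text that stands in for raw bytes in JSON; uses only the JDK. */
final class BytesAsJsonText {

    /** Encodes raw bytes as standard Base64 with padding. */
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Reverses encode(): Base64 text back to the raw bytes. */
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        String text = encode(raw);
        System.out.println(text); // aGVsbG8=
        System.out.println(new String(decode(text), StandardCharsets.UTF_8)); // hello
    }
}
```

A client reading the JSON therefore has to Base64-decode a field such as bytes_val before it can use the content.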

Any and oneof

Any and oneof are two special types in protobuf. Converting a oneof field to JSON works normally: the JSON field name is the name of whichever oneof member was set.

Any needs special handling. Converting it directly throws an exception like the one below, saying that the type named by the type URL cannot be found.

com.google.protobuf.InvalidProtocolBufferException: Cannot find type for url: type.googleapis.com/data.proto.BaseData

    at com.google.protobuf.util.JsonFormat$PrinterImpl.printAny(JsonFormat.java:807)
    at com.google.protobuf.util.JsonFormat$PrinterImpl.access$900(JsonFormat.java:639)
    at com.google.protobuf.util.JsonFormat$PrinterImpl$1.print(JsonFormat.java:709)
    at com.google.protobuf.util.JsonFormat$PrinterImpl.print(JsonFormat.java:688)
    at com.google.protobuf.util.JsonFormat$PrinterImpl.printSingleFieldValue(JsonFormat.java:1183)
    at com.google.protobuf.util.JsonFormat$PrinterImpl.printSingleFieldValue(JsonFormat.java:1048)
    at com.google.protobuf.util.JsonFormat$PrinterImpl.printField(JsonFormat.java:972)
    at com.google.protobuf.util.JsonFormat$PrinterImpl.print(JsonFormat.java:950)
    at com.google.protobuf.util.JsonFormat$PrinterImpl.print(JsonFormat.java:691)
    at com.google.protobuf.util.JsonFormat$Printer.appendTo(JsonFormat.java:332)
    at com.google.protobuf.util.JsonFormat$Printer.print(JsonFormat.java:342)
    at io.gitlab.donespeak.javatool.toolprotobuf.ProtoJsonUtil.toJson(ProtoJsonUtil.java:12)
    at io.gitlab.donespeak.javatool.toolprotobuf.ProtoJsonUtilTest.toJson2(ProtoJsonUtilTest.java:72)
    ...

To solve this, the type that the type URL refers to has to be registered manually. I found the answer in Tom Rothschild's article Protocol Buffers, Part 3 – JSON Format, after a long search. In fact, the comment on the print method already notes that it throws an exception when the message contains an unknown Any type.

/**
* Converts a protobuf message to JSON format. Throws exceptions if there
* are unknown Any types in the message.
*/
public String print(MessageOrBuilder message) throws InvalidProtocolBufferException {
    ...
}

A TypeRegistry is used to resolve Any messages in the JSON conversion. You must provide a TypeRegistry containing all message types used in Any message fields, or the JSON conversion will fail because data in Any message fields is unrecognizable. You don’t need to supply a TypeRegistry if you don’t use Any message fields.

Class JsonFormat.TypeRegistry @JavaDoc
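The lookup that fails in the stack trace can be modeled without the library. The sketch below is a toy under the assumption that a type registry is essentially a table keyed by the full message name; ToyTypeRegistry and its methods are hypothetical, and plain strings stand in for real descriptors.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a type registry: maps a full message name to a label instead of a real Descriptor. */
final class ToyTypeRegistry {

    private final Map<String, String> typesByFullName = new HashMap<>();

    /** Registers a message type by its full name and returns this registry for chaining. */
    ToyTypeRegistry add(String fullName) {
        typesByFullName.put(fullName, "descriptor of " + fullName);
        return this;
    }

    /** Resolves a type URL, or fails with the same message text as the stack trace in the article. */
    String resolve(String typeUrl) {
        // The type name is whatever follows the last '/' in the URL
        String fullName = typeUrl.substring(typeUrl.lastIndexOf('/') + 1);
        String found = typesByFullName.get(fullName);
        if (found == null) {
            throw new IllegalStateException("Cannot find type for url: " + typeUrl);
        }
        return found;
    }

    public static void main(String[] args) {
        String url = "type.googleapis.com/data.proto.BaseData";
        try {
            new ToyTypeRegistry().resolve(url); // nothing registered, so this fails
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        ToyTypeRegistry registered = new ToyTypeRegistry().add("data.proto.BaseData");
        System.out.println(registered.resolve(url));
    }
}
```

An empty registry has no entry for data.proto.BaseData, which corresponds to calling JsonFormat.printer() without usingTypeRegistry on a message that contains an Any.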

The ProtoJsonUtils implementation above therefore cannot handle data of type Any. A TypeRegistry has to be supplied before the conversion will work.

@Test
public void toJson() throws IOException {
    // Several different descriptors can be added to one TypeRegistry
    JsonFormat.TypeRegistry typeRegistry = JsonFormat.TypeRegistry.newBuilder()
        .add(DataTypeProto.BaseData.getDescriptor())
        .build();
    // usingTypeRegistry returns a new Printer instead of modifying the existing one
    JsonFormat.Printer printer = JsonFormat.printer()
        .usingTypeRegistry(typeRegistry);

    String json = printer.print(DataTypeProto.DataWithAny.newBuilder()
        .setAnyVal(
            Any.pack(
                DataTypeProto.BaseData.newBuilder().setInt32Val(1235).build()))
        .build());

    System.out.println(json);
}

The implementation above raises an obvious problem: a message with an Any field can only be converted to JSON after every message type the field may contain has been registered. The same applies to JsonFormat.parser().merge(json, targetBuilder): the related message types must first be registered with the parser. This inevitably leads to a lot of duplicated code.

To get around this, I tried to collect the Descriptor of every message held in an Any field directly from the Message and then create the Printer from them, which would have given a general conversion method. It failed in the end. I expected to get stuck on repeated or map fields, but those turned out not to be a problem, at least for the ProtoBean to JSON direction. The real problem is that the design of Any cannot meet this requirement.

A brief look at Any: its source code is short. The relevant parts can be condensed as follows:

public final class Any
    extends GeneratedMessageV3 implements AnyOrBuilder {

    // typeUrl_ will hold a java.lang.String value
    private volatile Object typeUrl_;
    private ByteString value_;
    
    private static String getTypeUrl(String typeUrlPrefix, Descriptors.Descriptor descriptor) {
        return typeUrlPrefix.endsWith("/")
            ? typeUrlPrefix + descriptor.getFullName()
            : typeUrlPrefix + "/" + descriptor.getFullName();
    }

    public static <T extends com.google.protobuf.Message> Any pack(T message) {
        return Any.newBuilder()
            .setTypeUrl(getTypeUrl("type.googleapis.com",
                                message.getDescriptorForType()))
            .setValue(message.toByteString())
            .build();
    }

    public static <T extends Message> Any pack(T message, String typeUrlPrefix) {
        return Any.newBuilder()
            .setTypeUrl(getTypeUrl(typeUrlPrefix,
                                message.getDescriptorForType()))
            .setValue(message.toByteString())
            .build();
    }

    public <T extends Message> boolean is(Class<T> clazz) {
        T defaultInstance = com.google.protobuf.Internal.getDefaultInstance(clazz);
        return getTypeNameFromTypeUrl(getTypeUrl()).equals(
            defaultInstance.getDescriptorForType().getFullName());
    }

    private volatile Message cachedUnpackValue;

    @java.lang.SuppressWarnings("unchecked")
    public <T extends Message> T unpack(Class<T> clazz) throws InvalidProtocolBufferException {
        if (!is(clazz)) {
            throw new InvalidProtocolBufferException("Type of the Any message does not match the given class.");
        }
        if (cachedUnpackValue != null) {
            return (T) cachedUnpackValue;
        }
        T defaultInstance = com.google.protobuf.Internal.getDefaultInstance(clazz);
        T result = (T) defaultInstance.getParserForType().parseFrom(getValue());
        cachedUnpackValue = result;
        return result;
    }
    ...
}

From this code it is easy to see that an Any field stores the packed message only as serialized data, with no link back to the original message object. When a message is packed, Any saves its bytes in the ByteString value_ and builds a typeUrl_. So from an Any object alone we cannot recover the class of the message that was used to construct it: typeUrl_ is only a description, and the original class cannot be obtained from it by reflection or similar means. The unpack method works by first obtaining a default instance from the given class and then using parseFrom to restore the original value. I am curious why Any does not keep the original class of the value, or simply define value as a Message object. That would be more convenient to process and should not affect serialization. There is still a lot to learn before I can see through the designers' intent.

In the end, there is no way to write the general method I had in mind for converting any Message directly to JSON. Since it cannot be that smart, the message types that may appear have to be registered manually.
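The string handling in the excerpt can be tried on its own. This JDK-only sketch uses plain strings in place of descriptors; typeUrl repeats the branching of getTypeUrl from the excerpt, while typeName is an assumption about what getTypeNameFromTypeUrl does (take the part after the last slash), since that method is not shown. TypeUrlSketch and its method names are made up.

```java
/** JDK-only sketch of the type URL handling in the Any excerpt; plain strings stand in for descriptors. */
final class TypeUrlSketch {

    /** Same branching as getTypeUrl in the excerpt: add "/" only when the prefix lacks one. */
    static String typeUrl(String prefix, String fullName) {
        return prefix.endsWith("/") ? prefix + fullName : prefix + "/" + fullName;
    }

    /** Assumption: the type name is the part after the last '/', or empty when there is none. */
    static String typeName(String typeUrl) {
        int pos = typeUrl.lastIndexOf('/');
        return pos == -1 ? "" : typeUrl.substring(pos + 1);
    }

    /** Mirrors the idea of Any.is: only names are compared, no class object is stored or needed. */
    static boolean isType(String typeUrl, String expectedFullName) {
        return typeName(typeUrl).equals(expectedFullName);
    }

    public static void main(String[] args) {
        String url = typeUrl("type.googleapis.com", "data.proto.BaseData");
        System.out.println(url);                                  // type.googleapis.com/data.proto.BaseData
        System.out.println(isType(url, "data.proto.BaseData"));  // true
        System.out.println(isType(url, "data.proto.OnlyInt32")); // false
    }
}
```

Everything here works on names only, which matches the observation that an Any carries a description of its content but not the class itself.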

package io.gitlab.donespeak.javatool.toolprotobuf;

import com.google.protobuf.Descriptors;
import com.google.protobuf.Message;
import com.google.protobuf.util.JsonFormat;

import java.io.IOException;
import java.util.List;

public class ProtoJsonUtilV1 {

    private final JsonFormat.Printer printer;
    private final JsonFormat.Parser parser;

    public ProtoJsonUtilV1() {
        printer = JsonFormat.printer();
        parser = JsonFormat.parser();
    }

    public ProtoJsonUtilV1(List<Descriptors.Descriptor> anyFieldDescriptor) {
        JsonFormat.TypeRegistry typeRegistry = JsonFormat.TypeRegistry.newBuilder().add(anyFieldDescriptor).build();
        printer = JsonFormat.printer().usingTypeRegistry(typeRegistry);
        parser = JsonFormat.parser().usingTypeRegistry(typeRegistry);
    }

    public String toJson(Message sourceMessage) throws IOException {
        String json = printer.print(sourceMessage);
        return json;
    }

    public Message toProto(Message.Builder targetBuilder, String json) throws IOException {
        parser.merge(json, targetBuilder);
        return targetBuilder.build();
    }
}

Implementation with Gson

While searching for material I also found a conversion approach built on Gson, in Alexander Moses' article Converting Protocol Buffers data to Json and back with Gson Type Adapters. I do not agree with every point in it, though. For one, protobuf plugins are in decent shape: one is easy to find for IDEA and for VS Code, and Eclipse can use protobuf-dt (protobuf-dt has some problems, which I may write about another time). The article itself is very clear. Here I adapt his implementation to be more general.

This implementation is still built on JsonFormat, so it does not support Any either. To support Any, modify it along the lines of the code above. I will not make that change here.

package io.gitlab.donespeak.javatool.toolprotobuf;

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonParser;
import com.google.gson.TypeAdapter;
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonWriter;
import com.google.protobuf.Message;
import com.google.protobuf.util.JsonFormat;
import io.gitlab.donespeak.javatool.toolprotobuf.proto.DataTypeProto;

import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/**
 * @author Yang Guanrong
 * @date 2019/08/31 17:23
 */
public class ProtoGsonUtil {

    public static String toJson(Message message) {
        return getGson(message.getClass()).toJson(message);
    }

    public static <T extends Message> T toProto(Class<T> klass, String json) {
        return getGson(klass).fromJson(json, klass);
    }

    /**
     * If this method is ever made public, first confirm that Gson is immutable; otherwise it should stay private.
     *
     * @param messageClass the concrete generated message class to register an adapter for
     * @param <E> the message type
     * @return a Gson instance that delegates messageClass to JsonFormat
     */
    private static <E extends Message> Gson getGson(Class<E> messageClass) {
        // Register the adapter for the class that was passed in, not for one hard-coded message type
        return new GsonBuilder()
            .registerTypeAdapter(messageClass, new MessageAdapter<>(messageClass))
            .create();
    }

    private static class MessageAdapter<E extends Message> extends TypeAdapter<E> {

        private Class<E> messageClass;

        public MessageAdapter(Class<E> messageClass) {
            this.messageClass = messageClass;
        }

        @Override
        public void write(JsonWriter jsonWriter, E value) throws IOException {
            jsonWriter.jsonValue(JsonFormat.printer().print(value));
        }

        @Override
        public E read(JsonReader jsonReader) throws IOException {
            try {
                // The concrete message class is required here; Message itself has no static newBuilder method
                Method method = messageClass.getMethod("newBuilder");
                // Invoke the static method, so no receiver object is passed
                Message.Builder builder = (Message.Builder) method.invoke(null);

                JsonParser jsonParser = new JsonParser();
                JsonFormat.parser().merge(jsonParser.parse(jsonReader).toString(), builder);
                return messageClass.cast(builder.build());
            } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
                e.printStackTrace();
                throw new ProtoJsonConversionException(e);
            }
        }
    }

    public static void main(String[] args) {
        DataTypeProto.OnlyInt32 data = DataTypeProto.OnlyInt32.newBuilder()
            .setIntVal(100)
            .build();

        String json = toJson(data);
        System.out.println(json);

        System.out.println(toProto(DataTypeProto.OnlyInt32.class, json));
    }
}
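The reflective newBuilder call inside MessageAdapter.read can be tried in isolation. The sketch below uses a hypothetical stand-in class, FakeMessage, instead of a generated protobuf message, so it runs with the JDK alone; only the lookup-and-invoke pattern carries over to the real code.

```java
import java.lang.reflect.Method;

/** Reflection demo: call a static no-arg factory by name, as MessageAdapter.read does with newBuilder. */
final class StaticFactoryReflection {

    /** Hypothetical stand-in for a generated message class; not a protobuf type. */
    public static final class FakeMessage {
        private final int intVal;
        private FakeMessage(int intVal) { this.intVal = intVal; }
        public int getIntVal() { return intVal; }
        public static Builder newBuilder() { return new Builder(); }

        public static final class Builder {
            private int intVal;
            public Builder setIntVal(int v) { this.intVal = v; return this; }
            public FakeMessage build() { return new FakeMessage(intVal); }
        }
    }

    /** Looks up the public static newBuilder method on the given class and invokes it. */
    static Object newBuilderOf(Class<?> messageClass) {
        try {
            Method method = messageClass.getMethod("newBuilder");
            return method.invoke(null); // null receiver: the method is static
        } catch (ReflectiveOperationException e) {
            // Covers NoSuchMethodException, IllegalAccessException and InvocationTargetException
            throw new IllegalStateException("No usable static newBuilder() on " + messageClass, e);
        }
    }

    public static void main(String[] args) {
        FakeMessage.Builder builder = (FakeMessage.Builder) newBuilderOf(FakeMessage.class);
        System.out.println(builder.setIntVal(100).build().getIntVal()); // 100
    }
}
```

The lookup only succeeds on a class that declares the static method, which is why the adapter keeps the concrete message class rather than working with the Message interface.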

References

  • com.google.protobuf/protobuf-java-util @ GitHub
  • com.googlecode.protobuf-java-format/protobuf-java-format @ GitHub
  • Protocol Buffers, Part 3 — JSON Format
  • Converting Protocol Buffers data to Json and back with Gson Type Adapters
  • Any source code @ GitHub
  • Any official documentation