Default serialization in J2SE

Time:2021-12-30

By default, serialization saves everything, including data that does not need to be saved. Generally, we only need to save the logical data. Fields that do not need to be saved can be marked with the keyword transient.

Here is an example:

  import java.io.*;

  public class Serial implements Serializable {
    int company_id;
    String company_addr;

    transient boolean company_flag;
  }

The company_flag field will then take no part in serialization or deserialization, but you take on responsibility for its initial value. This is one of the problems serialization often causes. Serialization is equivalent to a public constructor that accepts only a data stream. This way of constructing objects lies outside the language, but it is still a constructor in every practical sense. If your class cannot initialize the field some other way, you need to provide an additional readObject method that first deserializes normally and then initializes the fields marked transient.
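As a minimal sketch of that readObject pattern: the class name TransientDemo, the helper computeFlag, and the rule it applies (the flag is true when the address is non-empty) are assumptions made for illustration, not part of the original example.

```java
import java.io.*;

class TransientDemo implements Serializable {
    private static final long serialVersionUID = 1L;

    int company_id;
    String company_addr;
    transient boolean company_flag; // never written to the stream

    TransientDemo(int company_id, String company_addr) {
        this.company_id = company_id;
        this.company_addr = company_addr;
        this.company_flag = computeFlag();
    }

    // Hypothetical rule: the flag is derived from the address.
    private boolean computeFlag() {
        return company_addr != null && !company_addr.isEmpty();
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();       // restores company_id and company_addr
        company_flag = computeFlag(); // without this line the flag would be false
    }

    // Serializes the object into memory and reads it back.
    static TransientDemo roundTrip(TransientDemo original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientDemo) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TransientDemo copy = roundTrip(new TransientDemo(1001, "com1"));
        // prints "1001 com1 true"
        System.out.println(copy.company_id + " " + copy.company_addr + " " + copy.company_flag);
    }
}
```

No constructor of the class runs during deserialization, so readObject is the only place where the transient field can be restored.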

Used inappropriately, Java's default serialization behavior can hurt performance and, in the worst case, cause a stack overflow. Implementations of some data structures are full of references between nodes, including circular ones. The default serialization does not understand your object structure, so it saves the object state through an expensive "graph traversal", which is not only slow but may also overflow the stack. In such cases you should provide your own writeObject and readObject to replace the default behavior.
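One way to replace the default behavior is to mark the internal links transient and write only the logical data. The sketch below assumes a hypothetical singly linked IntChain class. Its nodes are never serialized directly: writeObject stores the element count followed by the values, and readObject rebuilds the chain in a loop.

```java
import java.io.*;

class IntChain implements Serializable {
    private static final long serialVersionUID = 1L;

    // Internal node; it is not Serializable and never reaches the stream.
    private static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    private transient Node head;
    private transient Node tail;
    private transient int size;

    void add(int value) {
        Node node = new Node(value);
        if (head == null) { head = node; } else { tail.next = node; }
        tail = node;
        size++;
    }

    int size() { return size; }

    long sum() {
        long total = 0;
        for (Node n = head; n != null; n = n.next) total += n.value;
        return total;
    }

    // Writes the logical content only: the count, then each value.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);
        for (Node n = head; n != null; n = n.next) out.writeInt(n.value);
    }

    // Rebuilds the chain iteratively; transient fields start as null and 0.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int count = in.readInt();
        for (int i = 0; i < count; i++) add(in.readInt());
    }

    static IntChain roundTrip(IntChain original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (IntChain) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because both methods iterate instead of following node references recursively, the depth of the call stack does not grow with the length of the chain.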

Compatibility issues

Compatibility has always been a complex and troublesome problem.

No compatibility:

First, let's look at what to pay attention to when the goal is not to be compatible. There are many situations where compatibility is deliberately avoided. For example, war3 cannot read replays from earlier versions after an upgrade.

Compatibility is also a matter of version control. Java controls it through the serialVersionUID (stream unique identifier). By default this UID is implicit: it is computed from many factors such as the class name and method names. In theory the mapping is one-to-one, that is, unique. If the UID in the stream differs from that of the local class, deserialization fails with an InvalidClassException.
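To see which UID the runtime associates with a class, the standard java.io.ObjectStreamClass API can be queried. In the sketch below, the classes UidDemo, Implicit, and Explicit and the value 42 are placeholders invented for the example.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class UidDemo {
    // No explicit serialVersionUID: the runtime computes one from the class definition.
    static class Implicit implements Serializable {
        int company_id;
        String company_addr;
    }

    // Explicit serialVersionUID: the declared value is used as-is.
    static class Explicit implements Serializable {
        private static final long serialVersionUID = 42L;
        int company_id;
        String company_addr;
    }

    // Returns the UID that serialization will use for the given Serializable class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("implicit: " + uidOf(Implicit.class));
        System.out.println("explicit: " + uidOf(Explicit.class));
    }
}
```

The implicit value depends on the class definition, so no particular number is shown here. The explicit one is always the declared constant.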

When we want to create a new version artificially (even though the implementation has not changed) and abandon the previous one, we can do so by declaring the UID explicitly:

  private static final long serialVersionUID=????;

You can make up a version number, but take care not to reuse one. In this way, data written by one version will fail with an InvalidClassException when the other version deserializes it. We can catch this exception in the old version and prompt the user to upgrade to the new version.
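Catching the exception might look like the sketch below. Two versions of one class cannot live in a single file, so the example simulates a stream from another version by flipping one bit of the UID stored in the serialized bytes. That trick, the byte offset arithmetic (which relies on the standard stream layout with an ASCII class name), and the names UpgradeDemo, Save, and withForeignUid are assumptions made purely for the demonstration.

```java
import java.io.*;

class UpgradeDemo {
    static class Save implements Serializable {
        private static final long serialVersionUID = 2L;
        int level = 7;
    }

    static byte[] serialize(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo-only helper: pretends the stream came from a class with a different UID.
    // Layout: 4 header bytes, TC_OBJECT, TC_CLASSDESC, 2-byte name length,
    // the class name, then the 8-byte serialVersionUID.
    static byte[] withForeignUid(byte[] stream, Class<?> type) {
        byte[] copy = stream.clone();
        int uidStart = 8 + type.getName().length();
        copy[uidStart + 7] ^= 0x01; // flip the lowest bit of the UID
        return copy;
    }

    static String load(byte[] stream) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            Save save = (Save) in.readObject();
            return "loaded level " + save.level;
        } catch (InvalidClassException e) {
            // The UID in the stream does not match the local class.
            return "This file was written by another version; please upgrade.";
        } catch (IOException | ClassNotFoundException e) {
            return "unreadable: " + e;
        }
    }

    public static void main(String[] args) {
        byte[] good = serialize(new Save());
        System.out.println(load(good));
        System.out.println(load(withForeignUid(good, Save.class)));
    }
}
```

InvalidClassException is caught before the more general IOException so that the upgrade prompt is shown only for a version mismatch.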

Maintaining compatibility when changes are minor (a special case of downward compatibility):

Sometimes your class gains a few unimportant non-private methods while its logical fields stay the same, and you naturally want the old and new versions to remain compatible. This too is achieved by declaring the UID explicitly. Let's verify it.

Old version:

  import java.io.*;

  public class Serial1 implements Serializable {

    int company_id;
    String company_addr;

    public Serial1(int company_id, String company_addr) {
      this.company_id = company_id;
      this.company_addr = company_addr;
    }

    public String toString() {
      return "DATA: " + company_id + " " + company_addr;
    }
  }

New version:

  import java.io.*;

  public class Serial1 implements Serializable {

    int company_id;
    String company_addr;

    public Serial1(int company_id, String company_addr) {
      this.company_id = company_id;
      this.company_addr = company_addr;
    }

    public String toString() {
      return "DATA: " + company_id + " " + company_addr;
    }

    public void todo() {} // irrelevant method
  }

First serialize with the old version, then read the data with the new version. An error occurs:

  java.io.InvalidClassException: Serial.Serial1; local class incompatible: stream classdesc serialVersionUID = 762508508425139227, local class serialVersionUID = 1187169935661445676

Next, we add an explicit UID declaration to the new version, using the value reported for the stream:

  private static final long serialVersionUID = 762508508425139227L;

Run it again, and the new object is created without error:

  DATA: 1001 com1

How to maintain upward compatibility:

Upward compatibility means that the old version can read a data stream serialized by the new version. This often comes up when the data on our server has been updated and we still want old clients to be able to deserialize the new data stream until they are upgraded. It can be called a semi-automatic matter.

Generally speaking, serialVersionUID is the flag that decides whether deserialization can proceed in Java. If the value differs, deserialization fails. If it matches, deserialization goes ahead regardless. In this process, for upward compatibility, the extra content in the new data stream is ignored. For downward compatibility, everything in the old data stream is restored, and the fields of the new class that the old stream does not cover keep their default values. Thanks to this, as long as we keep the serialVersionUID unchanged, upward compatibility is achieved automatically.

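Of course, once the new version removes or changes existing content, the situation is different. Even if the UID remains unchanged, the old version can no longer rely on what it reads: a removed field is left at its default value, and a change such as a different declared type for a field makes deserialization throw an exception. Because of this, keep in mind that once a class implements serialization and must stay compatible in both directions, it cannot be modified casually!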

Testing confirms this. Interested readers can try it for themselves.

How to maintain downward compatibility:

From the above, you might take it for granted that downward compatibility is automatic as long as the serialVersionUID stays the same. In fact, downward compatibility is more complicated, because we are responsible for the fields that the old stream does not initialize and must make sure they are usable.

So we must use readObject:

  private void readObject(java.io.ObjectInputStream in)
      throws IOException, ClassNotFoundException {
    in.defaultReadObject(); // deserialize the object first
    if (ver == 5552) {
      // previous version 5552
      // … initialize other fields
    } else if (ver == 5550) {
      // previous version 5550
      // … initialize other fields
    } else {
      // version too old, not supported
      throw new InvalidClassException("unsupported version: " + ver);
    }
  }

Careful readers will notice that for in.defaultReadObject() to execute successfully, the serialVersionUID must match, so ver here cannot be the serialVersionUID. It is a field we planted in advance, final long ver = XXXX;, and it must not be marked transient. There are therefore at least three requirements for maintaining downward compatibility:

  1. The serialVersionUID is consistent

  2. Our own version identifier, final long ver = XXXX;, is planted in advance

  3. All fields are guaranteed to be initialized
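The three requirements can be sketched together in one class. Everything specific below is an assumption for illustration: the class name VersionedSave, the region field standing in for "a field that old data does not contain", the default "unknown", and the current version number 5553. Old data is simulated by passing an old ver and a null region to the constructor. Note also that ver is assigned in the constructor rather than with a constant initializer, because the compiler may inline reads of a final field initialized with a constant, in which case the value restored from the stream would never be observed.

```java
import java.io.*;

class VersionedSave implements Serializable {
    private static final long serialVersionUID = 1L; // requirement 1: never changes
    static final long CURRENT_VER = 5553L;           // hypothetical current version

    private final long ver;   // requirement 2: our own marker, not transient
    final int company_id;
    String region;            // hypothetical field unknown to versions 5552 and 5550

    VersionedSave(long ver, int company_id, String region) {
        this.ver = ver;
        this.company_id = company_id;
        this.region = region;
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // deserialize the object first
        if (ver == CURRENT_VER) {
            // nothing to migrate
        } else if (ver == 5552L || ver == 5550L) {
            // requirement 3: give fields missing from old data a usable value
            if (region == null) region = "unknown";
        } else {
            throw new InvalidClassException(
                    VersionedSave.class.getName(), "unsupported version " + ver);
        }
    }

    static VersionedSave roundTrip(VersionedSave original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (VersionedSave) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this in place, data marked 5552 comes back with region set to "unknown", current data keeps its own region, and any other marker is rejected with an InvalidClassException.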

Discussing the compatibility policy:

As we can see, maintaining downward compatibility is troublesome, and as the number of versions grows, maintenance becomes difficult and tedious. Which programs should adopt which compatibility strategy for serialization is beyond the scope of this article, but the requirements for a game's save function and for the document compatibility of a word processor are certainly different. The save function of an RPG is generally expected to maintain downward compatibility. If Java serialization is used there, it can be prepared according to the three points analyzed above, and in that case object serialization is a workable choice. For a word processor, the document compatibility requirements are much higher. The usual strategy is to require good downward compatibility and as much upward compatibility as possible. Object serialization is generally not used for this. A well-designed document structure solves the problem better.

Data consistency and constraint problems

Remember that deserialization is another form of "public constructor", but one that builds objects without any checks, which is uncomfortable. The necessary checks therefore have to be added, and this is done in readObject():

  private void readObject(java.io.ObjectInputStream in)
      throws IOException, ClassNotFoundException {
    in.defaultReadObject(); // deserialize the object first
    // … check and initialize
  }

For structural reasons, a function called initialize is usually used to do the checking and initializing, throwing an exception on failure. Checking and initializing are easy to forget, which often leads to problems. Another problem is that when the parent class does not define readObject(), a subclass can easily forget to call the corresponding initialize function. This brings us back to why constructors were introduced in the first place: to prevent subclasses from forgetting to call initialization functions and causing all sorts of problems. So if you want to maintain data consistency, you must add readObject().
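A small sketch of the initialize pattern follows. The class name CheckedCompany and the two constraints (a positive company_id and a non-null company_addr) are hypothetical. The same check runs in the constructor and in readObject, where a violation is reported as an InvalidObjectException.

```java
import java.io.*;

class CheckedCompany implements Serializable {
    private static final long serialVersionUID = 1L;

    int company_id;
    String company_addr;

    CheckedCompany(int company_id, String company_addr) {
        this.company_id = company_id;
        this.company_addr = company_addr;
        initialize();
    }

    // Hypothetical constraints shared by the constructor and readObject.
    private void initialize() {
        if (company_id <= 0) {
            throw new IllegalStateException("company_id must be positive");
        }
        if (company_addr == null) {
            throw new IllegalStateException("company_addr must not be null");
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // deserialize the object first
        try {
            initialize();       // then apply the same checks as the constructor
        } catch (IllegalStateException e) {
            throw new InvalidObjectException(e.getMessage());
        }
    }

    static CheckedCompany roundTrip(CheckedCompany original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CheckedCompany) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

To try the failure path, a bad stream can be simulated by setting company_id to -1 on a valid object before serializing it. Reading that data back then fails instead of producing an inconsistent object.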

Security issues

The topic of security is beyond the scope of this article, but you should know that an attacker can prepare a malicious data stream for your class and try to produce an invalid object. When you need to protect your object data, you can generally check and initialize with the methods above, but some references are not easy to check. The solution is to make a defensive copy of the important parts. A good approach is recommended here: instead of copying individual fields, copy the entire object. It looks like this:

  Object readResolve() throws ObjectStreamException;

This method is called immediately after readObject(), and the object it returns replaces the deserialized object. In other words, the object that readObject() produced is discarded right away.

  Object readResolve() throws ObjectStreamException {
    // xxx1 and xxx2 have just been deserialized; this is a defensive copy
    return new Serial2(this.xxx1, this.xxx2);
  }

Although this costs some time, the method can be used for classes where safety is especially important. If checking data consistency and constraints one by one is too troublesome, this method can also be used, but its cost should be considered and the following limitation noted. One obvious disadvantage of readResolve() is that when the parent class implements it, subclasses are put in an awkward position. If the parent's readResolve() is protected or public and the subclass does not override it, deserializing the subclass ends up producing a parent class object, which is neither what we want nor easy to spot. Requiring subclasses to override readResolve() is undoubtedly a burden. In other words, protecting a class with readResolve() is not a good approach for classes meant to be inherited. For those, we can only use the first method and write a defensive readObject().

So my suggestion is: in general, use readResolve() only to protect final classes.
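A sketch of the whole-object copy on a final class might look like this. The class SafeCompany, its fields, and the transient builtByConstructor flag are hypothetical. The flag exists only to make visible that the object handed back to the caller went through the normal constructor, with its checks and its copy of the array.

```java
import java.io.*;

final class SafeCompany implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int id;
    private final int[] branchCodes;              // mutable, so it is always copied
    private transient boolean builtByConstructor; // false unless a constructor ran

    SafeCompany(int id, int[] branchCodes) {
        if (id <= 0) throw new IllegalArgumentException("id must be positive");
        this.id = id;
        this.branchCodes = branchCodes.clone();
        this.builtByConstructor = true;
    }

    int id() { return id; }
    int[] branchCodes() { return branchCodes.clone(); }
    boolean builtByConstructor() { return builtByConstructor; }

    // Called right after deserialization; the returned object replaces the
    // one read from the stream, which is then discarded.
    private Object readResolve() throws ObjectStreamException {
        return new SafeCompany(id, branchCodes);
    }

    static SafeCompany roundTrip(SafeCompany original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SafeCompany) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the class is final, readResolve() can be private and no subclass can be caught out by it.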
