The Java String class

Time: 2021-6-8

Implementation principle

In Java 6 and earlier, a String object wraps a char array and has four main member variables: the char array, the offset, the character count, and the hash value.

In Java 7 and Java 8, the String class no longer has the offset and count variables. The advantage is that each String object takes up slightly less memory.

Starting from Java 9, the char array field is replaced by a byte array field, and a new field, coder, is maintained to identify the encoding format.

A char takes up 16 bits (2 bytes), so storing characters that fit in a single byte in a char array wastes half of the space. To save memory, the String class in JDK 9 stores strings in a byte array, whose elements are 8 bits (1 byte) each.

The coder field tells the String class how to interpret the byte array, for example when computing the string length or when calling indexOf(). By default, coder has two possible values: 0 represents Latin-1 (a single-byte encoding) and 1 represents UTF-16. If the string contains only Latin-1 characters, coder is 0; otherwise it is 1.
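The coder field is private, so it cannot be read directly. As a rough sketch of the decision it records, the hypothetical helper class below (CoderSketch and its methods are illustrative names, not JDK APIs) checks whether every char of a string fits in one Latin-1 byte and estimates the length of the backing byte array under that assumption.

```java
// Sketch only: mirrors the Latin-1 / UTF-16 decision described above.
// CoderSketch and its methods are hypothetical helpers, not part of the JDK.
class CoderSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Returns 0 if every char fits in one byte (<= 0xFF), otherwise 1.
    static byte coderFor(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Estimated backing array length: 1 byte per char for Latin-1, 2 for UTF-16.
    static int estimatedBytes(String s) {
        return s.length() << coderFor(s);
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String chinese = "\u4f60\u597d"; // two CJK characters, outside Latin-1
        System.out.println(coderFor(ascii));         // 0
        System.out.println(coderFor(chinese));       // 1
        System.out.println(estimatedBytes(ascii));   // 5
        System.out.println(estimatedBytes(chinese)); // 4
    }
}
```

Under this estimate, a string made only of Latin-1 characters needs half the array space it needed with a char array.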

Immutability

Looking at the source of the String class, we can see that the class is declared final, so it cannot be subclassed. The char array inside it is also private and final, and no public method modifies its contents, so a String object cannot be changed once it has been created.
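A minimal sketch of what this means in practice: methods that appear to change a string actually return a new String and leave the receiver untouched. The class name ImmutabilitySketch and its methods are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: String methods never modify the receiver; they return a new String.
class ImmutabilitySketch {
    static String[] demo() {
        String original = "abc";
        String upper = original.toUpperCase();  // new object, original untouched
        String joined = original.concat("def"); // likewise
        return new String[] { original, upper, joined };
    }

    // Because the key cannot change after insertion, the lookup still succeeds.
    static int lookup() {
        Map<String, Integer> cache = new HashMap<>();
        String key = "abc";
        cache.put(key, 1);
        key.replace('a', 'x'); // result discarded; key is still "abc"
        return cache.get("abc");
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", demo())); // abc, ABC, abcdef
        System.out.println(lookup());                  // 1
    }
}
```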

The immutable string object has the following advantages:

First, it keeps String objects safe. If String objects were mutable, they could be modified maliciously.

Second, it guarantees that the hash value never changes, so it can be cached, and a string can safely be used as a key in HashMap-like containers that implement key-value caching.

Third, it makes the string constant pool possible.

In Java, there are usually two ways to create string objects:

The first is through a string constant, such as String str = "abc"

The second is through new, such as String str = new String("abc")

When the first method is used, the constant string "abc" is placed in the constant pool structure of the class file at compile time, and "abc" is created in the string constant pool when the class is loaded; str then references that String object in the constant pool. This avoids repeatedly creating String objects with the same value and saves memory.

With String str = new String("abc"), the constant string "abc" is likewise placed in the constant pool structure of the class file at compile time and created in the string constant pool when the class is loaded. Then, when new executes, the JVM calls the String constructor and creates a new String object in heap memory, whose char array refers to the char array of the "abc" string in the constant pool. Finally, str refers to this heap object, which is a different reference from the "abc" string in the constant pool.

Objects and references: the content of an object is stored in memory, and the stored content is located through its memory address. A reference holds that memory address.

For example, in String str = new String("abc"), the variable str points to the storage address of the String object. In other words, str is not the object itself but a reference to it.
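The difference between the two ways of creating a string can be observed by comparing references with == and contents with equals. The sketch below uses a made-up class name, CreationSketch.

```java
import java.util.Arrays;

// Sketch: literals share one pooled object, new String(...) always creates another.
class CreationSketch {
    static boolean[] compare() {
        String a = "abc";             // reference to the pooled "abc"
        String b = "abc";             // same pooled object
        String c = new String("abc"); // separate object on the heap
        return new boolean[] { a == b, a == c, a.equals(c) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [true, false, true]
    }
}
```

The two literals are the same reference, while the object created with new has equal content but a different reference.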

String concatenation

Concatenating constants

String str = "ab" + "cd" + "ef";

View the compiled bytecode

0 ldc #2 <abcdef>
2 astore_1
3 return

You can see that the compiler optimizes the code to:

String str = "abcdef";
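This folding can also be observed without reading bytecode: a concatenation of compile-time constants yields the very same pooled object as the equivalent literal. The sketch below (ConstantFoldingSketch is an illustrative name) also shows that final local variables initialized with constants are folded in the same way.

```java
// Sketch: concatenations of compile-time constants are folded by the compiler.
class ConstantFoldingSketch {
    static boolean literalsFolded() {
        String str = "ab" + "cd" + "ef";
        return str == "abcdef"; // same pooled object
    }

    static boolean finalLocalsFolded() {
        final String a = "ab"; // constant variable
        final String b = "cd"; // constant variable
        return (a + b) == "abcd"; // a + b is still a constant expression
    }

    public static void main(String[] args) {
        System.out.println(literalsFolded());    // true
        System.out.println(finalLocalsFolded()); // true
    }
}
```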

Concatenating variables

String a = "ab";
String b = "cd";
String c = a + b;

View the compiled bytecode

 0 ldc #2 <ab>
 2 astore_1
 3 ldc #3 <cd>
 5 astore_2
 6 new #4 <java/lang/StringBuilder>
 9 dup
10 invokespecial #5 <java/lang/StringBuilder.<init>>
13 aload_1
14 invokevirtual #6 <java/lang/StringBuilder.append>
17 aload_2
18 invokevirtual #6 <java/lang/StringBuilder.append>
21 invokevirtual #7 <java/lang/StringBuilder.toString>
24 astore_3
25 return

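This shows that when String variables are added, the compiler that produced this bytecode (Java 8 or earlier) uses StringBuilder under the hood, so the code is effectively rewritten as follows. Since Java 9, javac emits an invokedynamic call to StringConcatFactory instead, but the result is still a new String object.

String c = new StringBuilder().append(a).append(b).toString();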
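A short sketch of the consequence: because a + b is evaluated at run time, the result is a newly created object and not the pooled literal, even though the contents are equal. The class name VariableConcatSketch is made up for this example.

```java
import java.util.Arrays;

// Sketch: concatenating non-final variables creates a new String at run time.
class VariableConcatSketch {
    static boolean[] compare() {
        String a = "ab";
        String b = "cd";
        String c = a + b; // not a constant expression
        String d = new StringBuilder().append(a).append(b).toString();
        return new boolean[] {
            c == "abcd",          // different objects
            c.equals("abcd"),     // same content
            c.equals(d),          // explicit StringBuilder gives the same content
            c.intern() == "abcd"  // intern returns the pooled literal
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [false, true, true, true]
    }
}
```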

String.intern

String a = new String("abc").intern();
String b = new String("abc").intern();
System.out.print(a == b);

Output results:

true

For a string constant, the object is placed in the constant pool by default. For example: String a = "123"

For a string created with new, the object is created in heap memory, and a String object for the literal is also created in the constant pool. The char array of the heap object refers to the char array of the string in the constant pool, and the reference to the heap object is returned. For example: String b = new String("abc")

When the intern method is called, the JVM checks whether the string constant pool already contains a string equal to this object. If it does not, JDK 1.6 copies the heap string into the constant pool and returns a reference to the copy. The original string in heap memory, once nothing references it, is reclaimed by the garbage collector.

From JDK 1.7 on, because the string constant pool has been moved into the heap, the string is not copied; instead, a reference to the first such string in the heap is added to the constant pool. If an equal string is already in the pool, the reference in the constant pool is returned.

Now let's analyze the code block above:

First, the string "abc" is created as a String object in the constant pool when the class is loaded.

When variable a is created, new String() creates a String object in heap memory whose char array refers to that of the string in the constant pool. The call to intern then looks in the constant pool for a string equal to this object; since one exists, it returns the reference to the string in the constant pool.

Creating variable b goes through exactly the same steps: new String() creates another String object in heap memory, and intern returns the reference to the same string in the constant pool.

The two String objects in heap memory are not referenced by anything, so they will be garbage collected. Both a and b therefore refer to the same object in the constant pool.

If a String object is created at run time, it is created directly in heap memory, not in the constant pool. When intern is called on such a dynamically created String, JDK 1.6 creates a copy in the constant pool and returns a reference to that copy. From JDK 1.7 on, a reference to the heap string is put into the constant pool instead. When another String object in the heap with the same content later calls intern, the reference already in the constant pool is returned, which points to the same address as the first string.
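The behaviour for run-time strings can be sketched as follows. The class name InternSketch is hypothetical, and the value is built from System.nanoTime() only on the assumption that such a string is not already in the pool. The last comparison depends on the JVM version, as described above, so no result is stated for it.

```java
import java.util.Arrays;

// Sketch: interning strings that are created at run time.
class InternSketch {
    static boolean[] demo() {
        // Built at run time, so the value is not a compile-time constant.
        String first = new StringBuilder("intern-").append(System.nanoTime()).toString();
        String second = new String(first); // equal content, different object

        String canonical = first.intern();
        return new boolean[] {
            first == second,              // false: two heap objects
            canonical == second.intern(), // true: one canonical reference per value
            canonical.equals(first),      // true: content is unchanged
            canonical == first            // version dependent (JDK 1.6 vs 1.7 and later)
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```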

[Figure: summary of how String objects are created and where they are allocated in memory]

One thing to note about the intern method is that its use must be weighed against the actual scenario. The constant pool is implemented much like a hash table, and the more entries a hash table holds, the longer lookups take. If too many strings are interned, the whole string constant pool is burdened.

Determining whether strings are equal

Running environment: JDK 1.8

String s1 = new String("1") + new String("1") builds a new String object "11" in the heap. After s1.intern(), because the constant pool holds no reference to an equal string, a reference to the heap string "11" is recorded in the constant pool. String s2 = "11" then returns that heap string, so s1 == s2.
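The scenario just described can be sketched as below; this is a reconstruction from the description, not necessarily the author's original listing, and InternOrderSketch is a made-up name. Whether s1 == s2 holds depends on the JDK version and on "11" not already being in the pool when intern is called, so no result is stated for that comparison.

```java
import java.util.Arrays;

// Sketch: intern() on a run-time string, followed by use of the equal literal.
class InternOrderSketch {
    static boolean[] demo() {
        String s1 = new String("1") + new String("1"); // "11" built on the heap
        s1.intern();                                   // may register s1 in the pool
        String s2 = "11";                              // the pooled "11"
        return new boolean[] {
            s1 == s2,         // JVM dependent, see the explanation above
            s1.equals(s2),    // true: same content
            s1.intern() == s2 // true: intern always returns the pooled reference
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```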

If you run the following code on JDK 1.7 or later, the result is true, but on JDK 1.6 it is false:

String s1 = new String("1") + new String("1");
System.out.println(s1.intern() == s1);

StringBuilder and StringBuffer

Because the value of a String is immutable, every operation that changes a string produces a new String object. This is inefficient and wastes limited memory.

Unlike String, objects of the StringBuffer and StringBuilder classes can be modified many times without a new object being produced for each modification.

The StringBuilder class was introduced in Java 5. The biggest difference between StringBuilder and StringBuffer is that the methods of StringBuilder are not thread safe (they are not synchronized).

Because StringBuilder is faster than StringBuffer, it is recommended in most cases. However, when the application requires thread safety, StringBuffer should be used.
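A minimal sketch of both cases, with a made-up class name BuilderSketch: a single StringBuilder reused inside a loop on one thread, and a StringBuffer shared by several threads, where the synchronized append means no characters are lost.

```java
// Sketch: StringBuilder for single-threaded use, StringBuffer when shared by threads.
class BuilderSketch {
    // One StringBuilder is reused for every append in the loop.
    static String joinNumbers(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    // StringBuffer.append is synchronized, so concurrent appends are all recorded.
    static int concurrentAppend(int threads, int perThread) {
        StringBuffer buffer = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    buffer.append('x');
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(joinNumbers(4));            // 0,1,2,3
        System.out.println(concurrentAppend(4, 1000)); // 4000
    }
}
```

Replacing the StringBuffer with a StringBuilder in concurrentAppend would remove that guarantee, since its append is not synchronized.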

This work is licensed under a CC agreement. Reprints must credit the author and link to this article.