About map (1)

Time:2021-4-16

The difference between HashMap and Hashtable

① HashMap is not thread-safe; Hashtable is thread-safe (its public methods are synchronized);
② Because of this synchronization, Hashtable is less efficient than HashMap;
③ HashMap allows at most one null key and any number of null values; Hashtable allows neither null keys nor null values;
④ By default, the initial array size of HashMap is 16 and that of Hashtable is 11. On expansion, HashMap grows to 2x its old capacity and Hashtable grows to 2x + 1;
⑤ HashMap re-hashes the key's hashCode() before computing the array subscript, while Hashtable uses the object's hashCode() directly.
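
Point ③ is easy to check by hand. The following is a minimal sketch; the class name NullKeyDemo and the helper accepts() are made up for illustration, while the put() calls are the standard java.util API.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Hypothetical helper: returns true if the map accepts the pair,
    // false if put() throws a NullPointerException.
    static boolean accepts(Map<String, String> map, String key, String value) {
        try {
            map.put(key, value);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(new HashMap<>(), null, "v"));   // true: one null key is allowed
        System.out.println(accepts(new HashMap<>(), "k", null));   // true: null values are allowed
        System.out.println(accepts(new Hashtable<>(), null, "v")); // false: null key rejected
        System.out.println(accepts(new Hashtable<>(), "k", null)); // false: null value rejected
    }
}
```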

Data structure of HashMap

A hash table implemented as an array plus linked lists (separate chaining), combining the advantages of arrays and linked lists. Since JDK 1.8, when the length of a linked list exceeds 8, the list is converted into a red-black tree.

How HashMap works

Under the hood, HashMap is an array of buckets, each of which holds a singly linked list. The list nodes are instances of the inner class Node, which implements Map.Entry. Entries are stored and retrieved through the put() and get() methods.

When storing an object, pass the key/value pair (K/V) to the put() method:
(1) First, hash(K) is called to calculate the hash value of K, and the array subscript is then calculated from that hash and the array length;
(2) The array is resized if necessary (when the number of elements in the container exceeds capacity * loadFactor, the capacity is doubled to 2n);
(3) I. If the bucket at that subscript is empty, the node is inserted directly; if it is not empty, a collision occurs;
II. On a collision, if an existing node has the same hash and equals() returns true for its key, the value stored for that key is updated;
III. On a collision, if no existing key matches (equals() returns false), the new node is appended to the tail of the linked list (tail insertion) or added to the red-black tree. (JDK 1.7 and earlier use head insertion; JDK 1.8 uses tail insertion.)
(When collisions make the linked list longer than TREEIFY_THRESHOLD = 8, the linked list is converted into a red-black tree.)

When getting an object, pass K to the get() method:

(1) First, hash(K) is called to calculate the hash value of K and obtain the array subscript of the linked list that holds the key;
(2) The linked list is traversed in order, and equals() is used to find the node whose key matches K; the value V of that node is returned.
hashCode() determines where an entry is stored; equals() determines whether two keys are actually equal.
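
The put and get steps above can be sketched as a tiny chained hash map. This is an illustration only, not the JDK source: the class TinyHashMap is hypothetical, the capacity is assumed to be a power of two, and resizing and red-black trees are left out on purpose.

```java
import java.util.Objects;

class TinyHashMap<K, V> {
    // One entry in a bucket's singly linked list.
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    private final Node<K, V>[] table;

    @SuppressWarnings("unchecked")
    TinyHashMap(int capacity) { // assumption: capacity is a power of two
        table = (Node<K, V>[]) new Node[capacity];
    }

    // Same spreading step as described for JDK 1.8: high 16 bits XOR low 16 bits.
    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    V put(K key, V value) {
        int h = hash(key);
        int i = (table.length - 1) & h;          // array subscript
        Node<K, V> node = table[i];
        if (node == null) {                      // empty bucket: insert directly
            table[i] = new Node<>(h, key, value);
            return null;
        }
        while (true) {
            if (node.hash == h && Objects.equals(node.key, key)) {
                V old = node.value;              // same key: update the value
                node.value = value;
                return old;
            }
            if (node.next == null) {             // different key: tail insertion
                node.next = new Node<>(h, key, value);
                return null;
            }
            node = node.next;
        }
    }

    V get(Object key) {
        int h = hash(key);
        for (Node<K, V> n = table[(table.length - 1) & h]; n != null; n = n.next) {
            if (n.hash == h && Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

Calling put("Aa", 1) and then put("Aa", 3) on such a map returns 1 from the second call, because the key already exists and only its value is replaced.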

 

What happens when two objects have the same hashCode

Having the same hashCode does not mean that two objects are equal (that is decided by the equals method), but it does mean that their array subscripts are the same, so a "collision" occurs. Because HashMap stores the entries of a bucket in a linked list, the new node is simply added to that linked list.
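
A concrete case, as a small sketch: the strings "Aa" and "BB" are a well-known pair with the same hashCode (2112) that are not equal. The class name SameHashCodeDemo and the method build() are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class SameHashCodeDemo {
    // Stores two keys that collide but are not equal.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                  // false
        Map<String, Integer> map = build();
        // Both entries survive in the same bucket and are told apart by equals().
        System.out.println(map.size() + " " + map.get("Aa") + " " + map.get("BB")); // 2 1 2
    }
}
```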

The implementation of hash and the reasons for its implementation

In JDK 1.8, hash() XORs the high 16 bits of hashCode() into the low 16 bits: (h = k.hashCode()) ^ (h >>> 16). The design balances speed and hash quality: it costs only one shift and one XOR, and it avoids the collisions that would occur if the high-order bits never took part in the subscript calculation.

Reason for using XOR operator

Because the high 16 bits are XORed into the low 16 bits, a change in any one of the 32 bits of the object's hashCode() changes the return value of hash(), which reduces collisions as much as possible.
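
The effect can be seen with two hash codes that differ only in their high 16 bits. This is a minimal sketch; spread() and indexFor() are illustrative names, and the subscript is assumed to be computed as (n - 1) & hash for a table of length n.

```java
class HashSpreadDemo {
    // High 16 bits XOR low 16 bits, as in the formula above.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Array subscript for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int a = 0x10000; // differs from b only in the high 16 bits
        int b = 0x20000;
        // Without spreading, both land in bucket 0 of a 16-slot table.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));                 // 0 0
        // With spreading, the high bits influence the subscript.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16)); // 1 2
    }
}
```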

 

The process of put method in HashMap

1. Call the hash function to get the hash value of the key, then calculate its array subscript;
2. If there is no hash conflict, put the node directly into the array; if there is a hash conflict, append it to the end of the bucket's linked list;
3. If the length of the linked list exceeds the threshold (TREEIFY_THRESHOLD == 8), the linked list is converted into a red-black tree; if a tree bin shrinks to 6 nodes or fewer during a resize (UNTREEIFY_THRESHOLD == 6), the red-black tree is converted back into a linked list;
4. If the key of the node already exists, replace its value;
5. If the number of key-value pairs exceeds the threshold (12 by default, that is, capacity 16 * load factor 0.75), call the resize() method to expand the array.

How to expand an array

Create a new array with twice the capacity of the old array, and recalculate the storage location of the nodes in the old array. There are only two positions of nodes in the new array: the original subscript position or the original subscript + the size of the old array.
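
The "two positions" rule follows from the subscript being the low bits of the hash: doubling the capacity exposes exactly one more bit. A minimal sketch, with hypothetical method names and capacities assumed to be powers of two:

```java
class ResizeSplitDemo {
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    // New subscript computed from the old one: the bit (hash & oldCap)
    // decides whether the node stays or moves up by oldCap.
    static int indexAfterResize(int hash, int oldCap) {
        int oldIndex = indexFor(hash, oldCap);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        // Both hashes sit in bucket 5 of a 16-slot table.
        System.out.println(indexFor(5, 16) + " " + indexFor(21, 16));                 // 5 5
        // After doubling to 32, one stays at 5 and the other moves to 5 + 16.
        System.out.println(indexAfterResize(5, 16) + " " + indexAfterResize(21, 16)); // 5 21
    }
}
```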

 

 

Another thread-safe class in Java that is very similar to HashMap is ConcurrentHashMap. What is the difference between ConcurrentHashMap and Hashtable in thread synchronization?

The ConcurrentHashMap class in java.util.concurrent provides a thread-safe and efficient HashMap implementation.
Hashtable locks with the synchronized keyword on its methods (that is, it locks the whole object);
ConcurrentHashMap uses segment locks in JDK 1.7, and CAS (a lock-free algorithm) + synchronized in JDK 1.8.
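
From the caller's side, this thread safety looks as follows. The sketch is illustrative: the class ConcurrentCountDemo, the key "hits", and the 30-second timeout are assumptions, while merge(), newFixedThreadPool() and awaitTermination() are standard JDK 8+ calls.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ConcurrentCountDemo {
    // Several threads increment the same key; merge() performs each
    // read-modify-write atomically, so no update is lost.
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000)); // 40000
    }
}
```

The same loop against a plain HashMap could lose updates or corrupt the table, which is why it is not shown as a counterexample here.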

 

ConcurrentHashMap 
1、Important fields
  private transient volatile int sizeCtl;
When it is negative: -1 means that the table is being initialized, and -N means that N - 1 threads are expanding it;
When it is 0, the table has not been initialized;
When it is positive, it is the initial capacity or the threshold for the next expansion.
2、Data structure
Node is the basic unit of the storage structure; it implements Map.Entry to store a key-value pair;
TreeNode extends Node, but is organized as a binary tree; it is the node type used to store data in a red-black tree;
TreeBin is a container that wraps the TreeNodes of a bucket; it handles the red-black tree transformations and the locking for that bucket.

3、put() method when storing objects
If the table has not been initialized, initTable() is called to initialize it;
If there is no hash conflict (the bucket is empty), the node is inserted directly with CAS, without locking;
If the table is being expanded, the thread helps with the expansion first;
If there is a hash conflict, the first node of the bucket is locked to ensure thread safety. There are two cases: for a linked list, the list is traversed to the end and the node is inserted there; for a red-black tree, the node is inserted according to the red-black tree structure;
If the number of nodes in the linked list exceeds the threshold of 8, the list is converted to a red-black tree, and break then leaves the loop;
If the addition succeeds, addCount() is called to update the size and check whether expansion is needed.
4、Expansion
transfer(): the default capacity is 16, and each expansion doubles the capacity.
helpTransfer(): lets a thread that encounters an expansion in progress help move the buckets, which makes the expansion more efficient.
5、get() method when getting objects
Calculate the hash value, locate the index position in the table, and return the value if the first node matches;
If the bucket is being moved by an expansion, the find() method of the ForwardingNode that marks the bucket is called to look the key up, and the value is returned on a match;
If neither applies, the remaining nodes are traversed in order and the value is returned on a match; otherwise null is returned.

The difference between HashMap and ConcurrentHashMap

Apart from locking, there is not much difference in principle. HashMap allows null keys and null values, but ConcurrentHashMap allows neither.
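
A short check of the null rule for ConcurrentHashMap, as a sketch; the class ConcurrentNullDemo and its helper methods are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentNullDemo {
    // Runs the action and reports whether it threw a NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    static boolean rejectsNullKey() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        return throwsNpe(() -> map.put(null, "v"));
    }

    static boolean rejectsNullValue() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        return throwsNpe(() -> map.put("k", null));
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullKey());   // true
        System.out.println(rejectsNullValue()); // true
    }
}
```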

Why ConcurrentHashMap is more efficient than Hashtable

  Hashtable
It uses a single lock (locking the whole table) to handle concurrency, so multiple threads compete for one lock and easily block each other;
  ConcurrentHashMap
JDK 1.7 uses ReentrantLock + Segment + HashEntry, which is equivalent to dividing a HashMap into multiple segments and assigning a lock to each segment, so that multiple threads can access different segments at the same time. Lock granularity: one Segment, which contains multiple HashEntry nodes.
JDK 1.8 uses CAS + synchronized + Node + red-black tree. Lock granularity: one Node (the first node of a bucket; Node implements Map.Entry). The lock granularity is reduced.

 

Concurrency level of ConcurrentHashMap

The concurrency level is the maximum number of threads that can update a ConcurrentHashMap at the same time without lock contention (in the JDK 1.7 segment-based implementation, the number of segments). The default value is 16, and it can be set in the constructor.

When the user sets the concurrency level, ConcurrentHashMap uses the smallest power of 2 greater than or equal to that value as the actual concurrency level (if the user sets it to 17, the actual concurrency level is 32).
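
The rounding rule can be sketched with a simple doubling loop. The method name roundUpToPowerOfTwo is hypothetical, and the loop is only meant to show the arithmetic, not to reproduce the JDK constructor.

```java
class PowerOfTwoDemo {
    // Smallest power of two that is >= n, for 1 <= n <= 2^30.
    static int roundUpToPowerOfTwo(int n) {
        int size = 1;
        while (size < n) {
            size <<= 1; // double until the requested value is covered
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(16)); // 16
        System.out.println(roundUpToPowerOfTwo(17)); // 32
    }
}
```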

 

Changes to HashMap in JDK 8
In Java 1.8, if the length of a linked list exceeds 8, the list is converted to a red-black tree. (The number of buckets must be at least 64; if it is smaller, the table is only expanded.)
When a hash collision occurs, Java 1.7 inserts at the head of the linked list, while Java 1.8 inserts at the tail.
In Java 1.8, Entry is replaced by Node.

 

Why use a red-black tree rather than a plain binary search tree? Why not use red-black trees all the time?

 

A red-black tree is chosen to overcome a defect of the plain binary search tree: in special cases (for example, keys inserted in sorted order), a binary search tree degenerates into a linear structure that is as deep as the original linked list, so searching it is very slow.

After new data is inserted, a red-black tree may need to restore balance through left rotations, right rotations, and recoloring. The purpose of introducing the red-black tree is to find data quickly and to solve the search-depth problem of long linked lists. A red-black tree is an approximately balanced binary tree, and maintaining that balance has a cost, but the cost is lower than traversing a long linear list. Therefore the red-black tree is used only when the list length exceeds 8; when the list is short, there is no need for a red-black tree at all, and introducing one would only slow things down.
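
The degenerate case is easy to reproduce with a naive, unbalanced binary search tree. This sketch is illustrative only (the class DegenerateBstDemo is hypothetical and has nothing to do with HashMap's internals): sorted input yields a tree as deep as a linked list, while a friendlier insertion order yields a shallow one.

```java
class DegenerateBstDemo {
    static final class Node {
        final int key;
        Node left, right;

        Node(int key) {
            this.key = key;
        }
    }

    // Plain BST insert with no rebalancing.
    static Node insert(Node root, int key) {
        Node fresh = new Node(key);
        if (root == null) {
            return fresh;
        }
        Node cur = root;
        while (true) {
            if (key < cur.key) {
                if (cur.left == null) { cur.left = fresh; return root; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = fresh; return root; }
                cur = cur.right;
            }
        }
    }

    // Number of nodes on the longest root-to-leaf path.
    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    static int heightAfterInserting(int[] keys) {
        Node root = null;
        for (int k : keys) {
            root = insert(root, k);
        }
        return height(root);
    }

    public static void main(String[] args) {
        System.out.println(heightAfterInserting(new int[] {1, 2, 3, 4, 5, 6, 7})); // 7
        System.out.println(heightAfterInserting(new int[] {4, 2, 6, 1, 3, 5, 7})); // 3
    }
}
```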

 

Red-black tree
1. Each node is either red or black
2. The root node is always black
3. If a node is red, its child nodes must be black (the converse does not hold: a black node may have black children)
4. Each leaf node is a black null node (NIL node)
5. Every path from a node to any of its descendant leaf (NIL) nodes must contain the same number of black nodes (that is, the same black height)
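
The five rules can be turned into a small checker. This is a sketch under stated assumptions: the class RedBlackCheck and its Node type are hypothetical, a null reference stands for a black NIL leaf, and the binary-search-tree ordering of keys is not checked.

```java
class RedBlackCheck {
    static final class Node {
        final boolean red;      // rule 1: every node is either red or black
        final Node left, right; // null stands for a black NIL leaf (rule 4)

        Node(boolean red, Node left, Node right) {
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red;
    }

    // Returns the black height of the subtree, or -1 if rule 3 or rule 5 is broken.
    static int blackHeight(Node n) {
        if (n == null) {
            return 1; // a NIL leaf counts as one black node
        }
        if (n.red && (isRed(n.left) || isRed(n.right))) {
            return -1; // rule 3: a red node must not have a red child
        }
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l == -1 || r == -1 || l != r) {
            return -1; // rule 5: both sides must have the same black height
        }
        return l + (n.red ? 0 : 1);
    }

    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) != -1; // rule 2: black root
    }

    public static void main(String[] args) {
        Node ok = new Node(false, new Node(true, null, null), new Node(true, null, null));
        Node lopsided = new Node(false, new Node(false, null, null), null);
        System.out.println(isValid(ok));       // true
        System.out.println(isValid(lopsided)); // false: black heights differ
    }
}
```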