Talking about Java collection

Time:2021-1-26

Preface

Most programming languages provide arrays for storing objects, and the array is one of the most important data structures. However, the length of an array is fixed when it is created and cannot be changed afterwards, which makes arrays awkward to use. For this reason, Java added the collections framework in JDK 1.2 to store and manipulate objects.

The containers in Java adopt the idea of “holding objects” and are built around two root interfaces, Collection and Map. Let’s take a look at these two types of containers:

  1. Collection: a collection of objects. It mainly stores single elements.
  2. Map: a mapping table of “key-value pairs”. It mainly stores key-value pairs.

Collection

The Collection interface is the root interface. It is further divided into three sub-interfaces, List, Set and Queue, which all inherit from Collection but serve different purposes. A List maintains insertion order when storing elements; a Set does not contain duplicate elements; a Queue determines the order in which elements are produced according to its queuing rules (usually the same order in which they were inserted).

Because they all inherit from the Collection interface, they share some common operations:

public interface Collection<E> extends Iterable<E> {

In Java 8, the interface also adds a default method:

//Default method newly added in Java 8

Those are the APIs of Collection. Its sub-interfaces and implementing classes inherit these methods, but each implementation can use a different data structure.
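As a minimal sketch of that last point, the same Collection calls can be run against two different implementations. The class and method names here (CollectionBasicsDemo, fillAndPrune) are made up for illustration; only the java.util calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

class CollectionBasicsDemo {
    // Runs the same Collection operations against whichever implementation is passed in.
    static int fillAndPrune(Collection<String> c) {
        c.add("Jan");
        c.addAll(Arrays.asList("Feb", "Mar", "Feb"));
        c.remove("Jan");                      // removes one matching element
        c.removeIf(s -> s.startsWith("M"));   // default method added in Java 8
        return c.size();
    }

    public static void main(String[] args) {
        // A list keeps both "Feb" entries, a set keeps only one.
        System.out.println(fillAndPrune(new ArrayList<>()));
        System.out.println(fillAndPrune(new HashSet<>()));
    }
}
```

The method body never names a concrete class, which is why the result differs only because of the data structure behind the interface.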

Iterator & Iterable

Iterator is the iterator in Java; it is the object used to step through the elements of a collection one by one. Let’s take a look at the Iterator interface:

public interface Iterator<E> {

The Collection interface inherits from the Iterable interface, whose iterator() method produces an Iterator object for traversing the collection:

public interface Iterable<T> {
    Iterator<T> iterator();
    // JDK 1.8
    default void forEach(Consumer<? super T> action) {
        Objects.requireNonNull(action);
        for (T t : this) {
            action.accept(t);
        }
    }
    default Spliterator<T> spliterator() {
        return Spliterators.spliteratorUnknownSize(iterator(), 0);
    }
}

The Iterable interface provides a way to obtain an Iterator object, so any collection that implements Iterable can use an iterator to traverse and manipulate its elements.

Let’s use an Iterator to traverse a collection that implements the Iterable interface:

LinkedList<String> list = new LinkedList<>();
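A fuller version of this traversal might look like the sketch below. The helper name joinWithIterator and the sample values are assumptions chosen for illustration; hasNext() and next() are the standard Iterator methods.

```java
import java.util.Iterator;
import java.util.LinkedList;

class IteratorTraversalDemo {
    // Walks any Iterable with an explicit Iterator and joins the elements with commas.
    static String joinWithIterator(Iterable<String> source) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(",");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<>();
        list.add("Jan");
        list.add("Feb");
        list.add("Mar");
        System.out.println(joinWithIterator(list));
    }
}
```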

But this is rather verbose. Since JDK 1.5, any object that implements the Iterable interface can be traversed with the for-each statement, which is syntactic sugar in Java. As follows:

for (String s : list) {
    System.out.println(s);
}

ListIterator exists in List collections and is a more powerful iterator than Iterator.

public interface ListIterator<E> extends Iterator<E> {

As the methods above show, ListIterator can move in both directions, and it can report the indexes of the previous and next elements relative to the iterator’s current position:

public static void main(String[] args) {
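The two-way movement can be sketched as follows. This is an illustrative example, not the article's original listing; the method name upperCaseThenReverse is hypothetical, while nextIndex(), set(), hasPrevious() and previous() are real ListIterator methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    // Forward pass: tag each element with its index. Backward pass: collect in reverse.
    static List<String> upperCaseThenReverse(List<String> input) {
        List<String> list = new ArrayList<>(input);
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            int index = it.nextIndex();            // index of the element next() will return
            String s = it.next();
            it.set(index + ":" + s.toUpperCase()); // replaces the element just returned
        }
        List<String> reversed = new ArrayList<>();
        while (it.hasPrevious()) {                 // the iterator is now at the end of the list
            reversed.add(it.previous());
        }
        return reversed;
    }

    public static void main(String[] args) {
        System.out.println(upperCaseThenReverse(Arrays.asList("jan", "feb", "mar")));
    }
}
```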

fail-fast

fail-fast is an error detection mechanism in Java collections. When the structure of a collection is modified while it is being iterated (by another thread, or by the same thread inside the loop), the fail-fast mechanism may be triggered, and a ConcurrentModificationException is thrown.

The simplest example is removing an element while traversing with the for-each syntax:

List<String> list = new ArrayList<>();
list.add("Jan");
list.add(null);
list.add("Feb");
list.add("Mar");
System.out.println(Arrays.toString(list.toArray()));
for (String s : list) {
    if (s == null) list.remove(s);
}
System.out.println(Arrays.toString(list.toArray()));

Running this code throws an exception:

Exception in thread "main" java.util.ConcurrentModificationException
    at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1042)
    at java.base/java.util.ArrayList$Itr.next(ArrayList.java:996)
    at xxx.xxx.Xxxx.java:22)

We know that for-each is just syntactic sugar for Iterator operations. Let’s look at the source code named in the error above:

public E next() {

During the Iterator’s next operation, the following check is performed:

final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}

The value of modCount is changed during the remove operation:

public E remove(int index) {

This makes the values of modCount and expectedModCount unequal, so a java.util.ConcurrentModificationException is thrown and the traversal terminates.

To solve the problem above, here are two solutions.

Use the Iterator’s own remove operation to delete, or the removeIf default method added in JDK 1.8.

Iterator<String> iterator = list.iterator();

Alternatively, use the concurrent container CopyOnWriteArrayList in place of ArrayList.
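The three options above can be sketched side by side. The class and method names are hypothetical; the sample data mirrors the failing example, and each method should end with the null entry removed and no exception thrown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;

class SafeRemovalDemo {
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("Jan", null, "Feb", "Mar"));
    }

    // Option 1: remove through the iterator, which keeps its expected modCount in sync.
    static List<String> withIteratorRemove() {
        List<String> list = sample();
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() == null) {
                it.remove();
            }
        }
        return list;
    }

    // Option 2: the removeIf default method added in JDK 1.8.
    static List<String> withRemoveIf() {
        List<String> list = sample();
        list.removeIf(Objects::isNull);
        return list;
    }

    // Option 3: CopyOnWriteArrayList iterates over a snapshot, so removing inside for-each is safe.
    static List<String> withCopyOnWrite() {
        List<String> list = new CopyOnWriteArrayList<>(sample());
        for (String s : list) {
            if (s == null) {
                list.remove(s);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(withIteratorRemove());
        System.out.println(withRemoveIf());
        System.out.println(withCopyOnWrite());
    }
}
```

Note that CopyOnWriteArrayList copies the backing array on every write, so it tends to suit lists that are read far more often than they are modified.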

Now let’s take a look at the three sub-interfaces that inherit from the Collection interface.

List

The List interface extends the Collection interface. It is ordered and allows duplicate elements.

List adds the following methods:

// Inserts all elements of collection c at the specified position

The List classes we use most often are ArrayList and LinkedList.

ArrayList

ArrayList uses an array underneath to store objects:

transient Object[] elementData;

It also implements the RandomAccess marker interface, which signals that lookup by index is very fast.

When an element is added at the end, the operation is simply elementData[size++] = e, so appending in order is very fast.

When ArrayList deletes an element, however, it uses System.arraycopy to shift the elements that follow it:

System.arraycopy(elementData, index+1, elementData, index, numMoved)

If many elements have to be copied, performance suffers.
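To make the cost of that copy concrete, here is a deliberately simplified array-backed list. It is a sketch for illustration only, not the JDK's ArrayList: the class name ArrayRemoveSketch, the initial capacity of 4 and the doubling growth policy are all assumptions.

```java
import java.util.Arrays;

class ArrayRemoveSketch {
    private Object[] elementData = new Object[4]; // assumed initial capacity
    private int size;

    // Appending writes into the next free slot, growing the array only when it is full.
    void add(Object e) {
        if (size == elementData.length) {
            elementData = Arrays.copyOf(elementData, size * 2); // simplified growth policy
        }
        elementData[size++] = e;
    }

    // Removing shifts every element after index one slot to the left.
    Object remove(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Object old = elementData[index];
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(elementData, index + 1, elementData, index, numMoved);
        }
        elementData[--size] = null; // clear the vacated slot
        return old;
    }

    Object[] toArray() {
        return Arrays.copyOf(elementData, size);
    }

    public static void main(String[] args) {
        ArrayRemoveSketch list = new ArrayRemoveSketch();
        for (String s : new String[] {"a", "b", "c", "d", "e"}) {
            list.add(s);
        }
        list.remove(1); // shifts "c", "d" and "e" to the left
        System.out.println(Arrays.toString(list.toArray()));
    }
}
```

Removing the last element moves nothing, while removing the first moves every remaining element, which is where the performance loss comes from.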

LinkedList

LinkedList uses a doubly linked list underneath to store objects. It is accessed sequentially and allows any element to be stored (including null). LinkedList also implements the Deque interface, which inherits from Queue, so this class can be used as a queue.

public class LinkedList<E>

Because LinkedList implements the Deque interface, its operations work at both ends. For example, with the following methods you can choose whether to add at the head or at the tail.

public void addFirst(E e) {
    linkFirst(e);
}
public void addLast(E e) {
    linkLast(e);
}

LinkedList uses Node to represent each node in the doubly linked list, as follows:

private static class Node<E> {

The source code above shows the data structure of LinkedList. Because of it, reading or updating an element by position in a LinkedList means walking the list node by node, whereas ArrayList supports random access. Insertion and deletion at a known node, however, are faster than in ArrayList, because only the neighbouring pointers have to be relinked.
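A short usage sketch of the two ends is shown below. The class name LinkedListEndsDemo and the sample values are assumptions; addFirst, addLast, get and removeFirst are the real LinkedList methods.

```java
import java.util.LinkedList;

class LinkedListEndsDemo {
    static LinkedList<String> build() {
        LinkedList<String> list = new LinkedList<>();
        list.addLast("Feb");
        list.addFirst("Jan"); // relinks the head, no elements are shifted
        list.addLast("Mar");  // relinks the tail
        return list;
    }

    public static void main(String[] args) {
        LinkedList<String> list = build();
        System.out.println(list);
        System.out.println(list.get(1));        // positional access walks the nodes
        System.out.println(list.removeFirst()); // queue-style removal from the head
    }
}
```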

Vector

Vector’s data structure is similar to ArrayList’s: both use an array, Object[] elementData, to store data, but Vector uses the synchronized keyword for locking:

public synchronized boolean add(E e) {
    modCount++;
    ensureCapacityHelper(elementCount + 1);
    elementData[elementCount++] = e;
    return true;
}

The add method is locked with the synchronized keyword, and other methods such as get and remove are locked as well. Vector therefore relies on synchronized to guarantee thread safety, but it is inefficient and not recommended.

We generally recommend using ArrayList instead of Vector.

Stack

Stack inherits from Vector. It is a last-in, first-out container: elements are pushed onto the Stack, and the element popped is always the one that was pushed most recently.

Stack provides the following methods:

// Pushes an element onto the stack

But because Stack inherits from Vector, it also exposes all of Vector’s APIs, so we don’t recommend it. We can use the Deque interface to implement a stack instead.
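As a small sketch of that recommendation, a Deque can stand in for Stack through its push and pop methods. The class name DequeStackDemo and the string-reversal task are made up for illustration; ArrayDeque is one possible Deque implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DequeStackDemo {
    // Reverses a string by pushing every character and popping them back off.
    static String reverse(String text) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            stack.push(c);          // adds at the head
        }
        StringBuilder sb = new StringBuilder();
        while (!stack.isEmpty()) {
            sb.append(stack.pop()); // removes from the head: last in, first out
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("stack"));
    }
}
```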

Thread safety

The thread-safe Vector class is not recommended, and ArrayList and LinkedList have no thread-safety mechanism. If we need thread safety, we can use the static method synchronizedList() of the Collections class to obtain a thread-safe List, or use CopyOnWriteArrayList.
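A minimal sketch of the first approach follows. The class name, thread count and loop size are arbitrary choices for illustration; Collections.synchronizedList is the real JDK method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListDemo {
    // Several threads append to one wrapped list; the final size shows no add was lost.
    static int concurrentAdds(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentAdds(4, 1000));
    }
}
```

Individual calls on the wrapper are synchronized, but iterating over it still requires the caller to synchronize on the list manually.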

How to ensure that a collection cannot be modified?

You can use the Collections.unmodifiableCollection(Collection c) method to create a read-only collection; any operation that tries to change the collection will then throw a java.lang.UnsupportedOperationException.

Example code:

List<String> list = new ArrayList<>();
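The example might continue along these lines. This is a sketch rather than the original listing; the helper name rejectsAdd and the sample values are assumptions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class UnmodifiableDemo {
    // Returns true if the collection refuses to accept a new element.
    static boolean rejectsAdd(Collection<String> c) {
        try {
            c.add("Mar");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("Jan");
        Collection<String> readOnly = Collections.unmodifiableCollection(list);
        System.out.println(rejectsAdd(readOnly));

        // The wrapper is a view: changes made through the original list remain visible.
        list.add("Feb");
        System.out.println(readOnly.size());
    }
}
```

Because the wrapper is only a view, anyone who still holds the original list can modify it; the read-only guarantee applies to the wrapper reference alone.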

Queue

A Queue is a first-in, first-out (FIFO) linear data structure. Let’s take a look at the Queue interface:

public interface Queue<E> extends Collection<E> {

As we can see above, the Queue interface is a sub-interface of Collection that mainly adds six methods, which fall into two groups for adding, removing and examining elements. add/remove/element form one group, which throws an exception on failure; offer/poll/peek form the other, which returns a special value (null or false) on failure. An insertion can fail only when the queue is bounded and has no free space; add then throws an exception, whereas offer returns false. For a bounded queue, therefore, use offer rather than add.

Queue itself is only an interface that defines the basic queue operations. The actual behaviour comes from the classes and sub-interfaces that extend it.
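The difference between the two groups can be sketched with a bounded queue. ArrayBlockingQueue with capacity 1 is used here only as a convenient bounded implementation; the class and method names are hypothetical.

```java
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

class QueueGroupsDemo {
    // Records what each call does on a full queue and on an empty queue.
    static String describe() {
        Queue<String> queue = new ArrayBlockingQueue<>(1); // bounded, capacity 1
        StringBuilder log = new StringBuilder();
        log.append(queue.offer("Jan")).append(' ');  // succeeds
        log.append(queue.offer("Feb")).append(' ');  // queue is full: returns false
        try {
            queue.add("Feb");                        // queue is full: throws
        } catch (IllegalStateException e) {
            log.append("add-threw ");
        }
        log.append(queue.peek()).append(' ');        // looks at the head without removing it
        log.append(queue.poll()).append(' ');        // removes the head
        log.append(queue.poll()).append(' ');        // queue is empty: returns null
        try {
            queue.remove();                          // queue is empty: throws
        } catch (NoSuchElementException e) {
            log.append("remove-threw");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```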

PriorityQueue

PriorityQueue is a priority queue based on a binary heap, which is implemented with an array. A Comparator can be specified; if no Comparator is passed in, elements are ordered by their natural ordering:

public class PriorityQueue<E> extends AbstractQueue<E>

The comment in the source describes transient Object[] queue: the priority queue is represented as a balanced binary heap, in which the two children of queue[n] are queue[2*n+1] and queue[2*(n+1)]. If no comparator is passed in, elements are ordered by their natural ordering.

public boolean offer(E e) {
    if (e == null)
        throw new NullPointerException();
    modCount++;
    int i = size;
    if (i >= queue.length)
        grow(i + 1);
    size = i + 1;
    if (i == 0)
        queue[0] = e;
    else
        siftUp(i, e);
    return true;
}

As the offer method above shows, PriorityQueue is not thread-safe and does not support null elements. If you need thread safety, use the java.util.concurrent.PriorityBlockingQueue class.
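A brief usage sketch of both orderings follows. The helper name sorted is an assumption; passing a null Comparator to the PriorityQueue constructor selects natural ordering, as the JDK documentation states.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PriorityQueueDemo {
    // Offers the values, then polls until empty; poll() always returns the current head.
    static List<Integer> sorted(Comparator<Integer> cmp, int... values) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(cmp); // null means natural ordering
        for (int v : values) {
            pq.offer(v);
        }
        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            out.add(pq.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sorted(null, 3, 1, 2));
        System.out.println(sorted(Comparator.reverseOrder(), 3, 1, 2));
    }
}
```

Only poll() is guaranteed to respect the priority order; iterating over a PriorityQueue directly visits the heap array in no particular order.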

Deque

The Deque interface extends Queue. It is a double-ended queue in which elements can enter and leave at both ends. Its API has operations for the First end and the Last end: add/remove/get form one group, which throws an exception on failure; offer/poll/peek form the other, which returns a special value on failure.

public interface Deque<E> extends Queue<E> {

When using them, stick to methods from the same group.

The main classes that implement the Deque interface are ArrayDeque and LinkedList, along with concurrent containers such as ConcurrentLinkedDeque and LinkedBlockingDeque.

Here we only introduce ArrayDeque. LinkedList was covered above, and PriorityQueue, which implements Queue but not Deque, was covered in the previous section.

ArrayDeque

ArrayDeque implements the Deque interface and inherits from AbstractCollection, so it is part of the collection framework built on the Collection interface:

public class ArrayDeque<E> extends AbstractCollection<E>

Let’s take a look at the addFirst method implemented by ArrayDeque:

public void addFirst(E e) {
    if (e == null)
        throw new NullPointerException();
    elements[head = (head - 1) & (elements.length - 1)] = e;
    if (head == tail)
        doubleCapacity();
}

From addFirst we can see that its operations are not thread-safe, that null cannot be inserted, and that the capacity is expanded when head == tail.
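The index expression in addFirst deserves a closer look. The sketch below isolates the masking trick under the assumption, which holds for the version of the source shown above, that the array length is always a power of two. The class and method names are hypothetical.

```java
class CircularIndexSketch {
    // Moves one slot backwards in a circular array whose length is a power of two.
    static int previousIndex(int head, int length) {
        return (head - 1) & (length - 1);
    }

    // Moves one slot forwards in the same kind of array.
    static int nextIndex(int tail, int length) {
        return (tail + 1) & (length - 1);
    }

    public static void main(String[] args) {
        // From slot 0 the head wraps around to the last slot instead of going negative.
        System.out.println(previousIndex(0, 16));
        System.out.println(previousIndex(5, 16));
        // From the last slot the tail wraps around to slot 0.
        System.out.println(nextIndex(15, 16));
    }
}
```

With a power-of-two length, length - 1 is a mask of all one bits, so the bitwise AND behaves like a modulo that also handles -1 correctly. That is why no element ever needs to be shifted when adding at the front.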

LinkedList also implements Deque, so a comparison with ArrayDeque is inevitable.

ArrayDeque is backed by an array, while LinkedList is backed by a doubly linked list. In general a linked list performs better than an array for insertion and deletion, while an array performs better for lookup. But ArrayDeque only adds and removes at its two ends and never shifts the elements in its array, so its efficiency is good as well.

ArrayDeque can be used as a queue or as a stack, so we can use it instead of Stack to implement a stack.

Set

The Set interface extends the Collection interface. Its defining feature is that it contains no duplicate elements.

public interface Set<E> extends Collection<E> {
    int size();
    boolean isEmpty();
    boolean contains(Object o);
    Iterator<E> iterator();
    Object[] toArray();
    <T> T[] toArray(T[] a);
    boolean add(E e);
    boolean remove(Object o);
    boolean containsAll(Collection<?> c);
    boolean addAll(Collection<? extends E> c);
    boolean retainAll(Collection<?> c);
    boolean removeAll(Collection<?> c);
    void clear();
    boolean equals(Object o);
    int hashCode();
    @Override
    default Spliterator<E> spliterator() {
        return Spliterators.spliterator(this, Spliterator.DISTINCT);
    }
}

Comparing the Set interface with Collection shows that Set has exactly the same methods as Collection and adds no extra functionality.

The common implementation classes of Set are HashSet, LinkedHashSet and TreeSet.

HashSet

From the HashSet source code we can see that it uses the keys of a HashMap underneath to store elements. Its main characteristic is that it is unordered.

public class HashSet<E>

From the add method we can also see that whether HashSet is thread-safe is determined by HashMap. HashMap itself is not thread-safe, so HashSet is not thread-safe either.
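The idea of storing set elements as map keys can be sketched in a few lines. This is a simplified illustration, not the JDK's HashSet; the class name MapBackedSetSketch is an assumption, and the shared dummy value plays the role of a placeholder for every key.

```java
import java.util.HashMap;

class MapBackedSetSketch<E> {
    // One shared dummy object is stored as the value for every key.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns null only when the key was absent, so that means the element is new.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSetSketch<String> set = new MapBackedSetSketch<>();
        System.out.println(set.add("Jan"));
        System.out.println(set.add("Jan")); // duplicate: the key already exists
        System.out.println(set.size());
    }
}
```

Uniqueness comes for free from the map's keys, and so does the lack of thread safety: every call simply delegates to the unsynchronized HashMap.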

LinkedHashSet

LinkedHashSet inherits from HashSet:

public class LinkedHashSet<E>
    extends HashSet<E>
    implements Set<E>, Cloneable, java.io.Serializable {
    public LinkedHashSet() {
        super(16, .75f, true);
    }
}

It calls the following constructor of its parent class HashSet:

HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<>(initialCapacity, loadFactor);
}

In this constructor of the parent class, the HashMap is replaced by a LinkedHashMap. LinkedHashMap is implemented with a HashMap plus a doubly linked list, so the insertion order is preserved. LinkedHashSet is not thread-safe either.

TreeSet

TreeSet uses a TreeMap underneath, whose data structure is a red-black tree, so the elements it stores are kept in sorted order. You can supply a custom comparator or rely on the natural ordering of the elements.

public class TreeSet<E> extends AbstractSet<E>

If you need a custom ordering, use the following constructor to pass in a Comparator:

public TreeSet(Comparator<? super E> comparator) {
    this(new TreeMap<>(comparator));
}

With natural ordering, TreeSet cannot store null, and it is not thread-safe. Because TreeSet contains no duplicates and keeps its elements sorted, it can be used to implement something like a school grade ranking list.

class Student implements Comparable<Student> {
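One way such a ranking could look is sketched below. The fields name and score, the tie-breaking rule and the class names are all assumptions made for illustration, since the original Student class is not shown in full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class GradeRankingDemo {
    // Hypothetical student type: ordered by score (highest first), then by name.
    static class Student implements Comparable<Student> {
        final String name;
        final int score;

        Student(String name, int score) {
            this.name = name;
            this.score = score;
        }

        @Override
        public int compareTo(Student other) {
            int byScore = Integer.compare(other.score, score);
            return byScore != 0 ? byScore : name.compareTo(other.name);
        }
    }

    // Adds the students to a TreeSet and returns the names in ranking order.
    static List<String> ranking(Student... students) {
        TreeSet<Student> set = new TreeSet<>();
        for (Student s : students) {
            set.add(s); // an element that compares as equal to an existing one is ignored
        }
        List<String> names = new ArrayList<>();
        for (Student s : set) {
            names.add(s.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(ranking(
                new Student("Ann", 90), new Student("Bob", 95), new Student("Cid", 90)));
    }
}
```

TreeSet decides equality through compareTo rather than equals, so two students with the same score and name count as one entry here.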

That completes our general introduction to the Collection family. Next, let’s look at Map.

Map

A Map stores a mapping table of key-value pairs. Each key is unique, and we can use a key to look up the corresponding value.

Let’s take a look at the definition of the Map interface:

public interface Map<K,V> {

These methods show the functionality that Map provides. Implementations of Map go in two different directions:

  • AbstractMap: an abstract class that implements some of the general functionality of Map. Most Map implementations inherit from it, for example HashMap.
  • SortedMap: a sub-interface of Map that defines an ordering of the keys, by a Comparator object or by natural ordering, for example TreeMap.

Some of the most commonly used implementation classes are HashMap, TreeMap, LinkedHashMap and ConcurrentHashMap.

HashMap

HashMap is the most commonly used Map, and it inherits from the AbstractMap class:

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable

It decides where to store an entry by computing the hash value of the key object, so the Map cannot guarantee insertion order.

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

Before JDK 1.8, the underlying implementation was an array plus singly linked lists; starting with JDK 1.8 it is an array plus singly linked lists plus red-black trees. When a hash collision occurs, HashMap links the elements that map to the same bucket into a linked list. When the length of a list grows beyond 8 and the length of the array is at least 64, the singly linked list is converted into a red-black tree.
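To see what the hash method above buys, the sketch below repeats its spreading step and combines it with the bucket index calculation (n - 1) & hash, which HashMap uses internally but which is not shown in the excerpt above. The class and method names are hypothetical.

```java
class HashSpreadSketch {
    // Same computation as HashMap.hash: fold the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two.
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // Integer 65536 has hashCode 0x10000: its low 4 bits are all zero.
        System.out.println(65536 & 15);             // index without spreading
        System.out.println(bucketIndex(65536, 16)); // index after spreading
        // Keys 1 and 17 still collide in a 16-slot table and would share a bucket.
        System.out.println(bucketIndex(1, 16) == bucketIndex(17, 16));
    }
}
```

Small tables only look at the lowest bits of the hash, so mixing in the high bits helps keys that differ only in their upper bits land in different buckets.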

TreeMap

TreeMap is a Map with sorting capability, and it implements the NavigableMap interface:

public interface NavigableMap<K,V> extends SortedMap<K,V> {

NavigableMap inherits from the SortedMap interface, and TreeMap can receive a Comparator for custom ordering:

public TreeMap(Comparator<? super K> comparator) {
    this.comparator = comparator;
}

If no Comparator is supplied, the keys are sorted by their natural ordering by default. The key class must therefore implement the Comparable interface; otherwise a java.lang.ClassCastException is thrown at runtime.

TreeMap’s underlying data structure is a red-black tree, and the whole structure is kept in sorted order. With natural ordering a null key cannot be inserted, but null values are allowed.

Here is a small example:

class Student {
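The ordering rules described above can be sketched as follows. This example uses String keys and a plain Object key rather than the article's Student class, whose body is not shown; the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

class TreeMapOrderingDemo {
    // Returns the keys in iteration order; a null comparator means natural ordering.
    static List<String> keysInOrder(Comparator<String> cmp) {
        TreeMap<String, Integer> map = new TreeMap<>(cmp);
        map.put("banana", 2);
        map.put("apple", 1);
        map.put("cherry", 3);
        return new ArrayList<>(map.keySet());
    }

    // A key that does not implement Comparable is rejected when no Comparator is given.
    static boolean rejectsNonComparableKey() {
        TreeMap<Object, String> map = new TreeMap<>();
        try {
            map.put(new Object(), "x");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(keysInOrder(null));
        System.out.println(keysInOrder(Comparator.reverseOrder()));
        System.out.println(rejectsNonComparableKey());
    }
}
```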

LinkedHashMap

LinkedHashMap adds a doubly linked list on top of HashMap to preserve the order of the elements.

public class LinkedHashMap<K,V>

Here’s a little example:

LinkedHashMap<String, Integer> map = new LinkedHashMap<>();

The output of this example follows the insertion order, so the map is ordered.
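A self-contained sketch of the same behaviour is shown below. The helper name keysAfterInserts and the sample entries are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LinkedHashMapOrderDemo {
    // Inserts the same entries into any Map and reports the key iteration order.
    static List<String> keysAfterInserts(Map<String, Integer> map) {
        map.put("Mar", 3);
        map.put("Jan", 1);
        map.put("Feb", 2);
        map.put("Mar", 33); // updating an existing key does not move it
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterInserts(new LinkedHashMap<>())); // insertion order
        System.out.println(keysAfterInserts(new HashMap<>()));       // order depends on hashing
    }
}
```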

IdentityHashMap

IdentityHashMap inherits from the AbstractMap abstract class and implements the Map interface. Its superclass and implemented interfaces are basically the same as HashMap’s.

public class IdentityHashMap<K,V>
    extends AbstractMap<K,V>
    implements Map<K,V>, java.io.Serializable, Cloneable

However, IdentityHashMap uses an Object[] table array to store elements, and it can store keys that are equal according to equals(), because IdentityHashMap uses item == k, that is, reference equality, to judge whether two keys are the same.

Let’s look at the following example:

IdentityHashMap<String, Integer> identityHashMap = new IdentityHashMap<>();

Here jan1 and jan2 are treated as the same key, because both refer to the same "Jan" in the string constant pool. If you use the following instead:

IdentityHashMap<String, Integer> identityHashMap = new IdentityHashMap<>();

This shows that jan2 now points to a string object created with new in the heap, so it is a different key. Likewise, identityHashMap.get(new String("Jan")) returns null, because that call also creates a new string object in the heap.

IdentityHashMap is hash-based storage, so it is unordered, and it is not thread-safe.
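The three cases described above can be put together in one sketch. The variable names jan1 and jan2 follow the text; the class and method names are hypothetical.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityHashMapDemo {
    // Two identical literals are the same interned object, so they are one key.
    static int sizeWithLiterals() {
        Map<String, Integer> map = new IdentityHashMap<>();
        String jan1 = "Jan";
        String jan2 = "Jan";
        map.put(jan1, 1);
        map.put(jan2, 2);
        return map.size();
    }

    // A string created with new is a distinct object, so it is a second key.
    static int sizeWithNewString() {
        Map<String, Integer> map = new IdentityHashMap<>();
        String jan1 = "Jan";
        String jan2 = new String("Jan");
        map.put(jan1, 1);
        map.put(jan2, 2);
        return map.size();
    }

    // Looking up with a fresh object never matches, even though equals() would be true.
    static Integer lookupWithFreshKey() {
        Map<String, Integer> map = new IdentityHashMap<>();
        map.put("Jan", 1);
        return map.get(new String("Jan"));
    }

    public static void main(String[] args) {
        System.out.println(sizeWithLiterals());
        System.out.println(sizeWithNewString());
        System.out.println(lookupWithFreshKey());
    }
}
```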

WeakHashMap

WeakHashMap inherits from the AbstractMap abstract class and implements the Map interface. It is also a hash-based Map, so most of its functionality is the same as HashMap’s.

public class WeakHashMap<K,V>
    extends AbstractMap<K,V>
    implements Map<K,V>

But the Entry implemented in WeakHashMap is different:

private static class Entry<K,V> extends WeakReference<Object> implements Map.Entry<K,V> {

Entry inherits from WeakReference. A WeakReference does not prevent the GC from reclaiming the key: once the key is no longer strongly referenced anywhere else, a GC cycle can clear it.

WeakHashMap also maintains a ReferenceQueue, which records the keys that have been cleared:

private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

Each call to put/size and similar methods invokes the expungeStaleEntries method, which deletes from the table the Entry objects whose keys have been cleared, keeping the table in sync.

private void expungeStaleEntries() {

WeakHashMap is usually used as a cache for key-value pairs that only need to be kept for a short time. It is also not thread-safe.
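The behaviour can be observed with a small experiment, sketched below with hypothetical names. System.gc() is only a request, so the second half of main may report either 0 or 1 depending on the JVM; no particular output is promised here.

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakHashMapDemo {
    // While the caller still holds the key, the entry has to stay in the map.
    static boolean retainsWhileReferenced() {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "cached value");
        System.gc();                   // a hint only; the key is still strongly reachable
        return cache.containsKey(key); // using key here keeps it reachable
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("kept while referenced: " + retainsWhileReferenced());

        Map<Object, String> cache = new WeakHashMap<>();
        cache.put(new Object(), "cached value"); // no strong reference to the key remains
        for (int i = 0; i < 50 && !cache.isEmpty(); i++) {
            System.gc();               // request a collection; the JVM may ignore it
            Thread.sleep(20);
        }
        System.out.println("entries after GC attempts: " + cache.size());
    }
}
```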

Hashtable

Hashtable also implements the Map interface. Its underlying implementation is an array plus linked lists.

public class Hashtable<K,V>

Methods such as put/remove use the synchronized keyword for synchronization, which is a relatively simple approach.

public synchronized V put(K key, V value) {
    if (value == null) {
        throw new NullPointerException();
    }
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    for(; entry != null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }
    addEntry(hash, key, value, index);
    return null;
}

This makes Hashtable a thread-safe Map, but it also makes its performance poor, much like Vector, so it is gradually being phased out.

If we want a thread-safe Map, we can use the ConcurrentHashMap class in the java.util.concurrent package.
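A minimal sketch of ConcurrentHashMap under concurrent updates follows. The class name, the "hits" key and the thread counts are arbitrary illustrative choices; merge is a real method that ConcurrentHashMap performs atomically.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCounterDemo {
    // Several threads increment one shared counter entry.
    static int countConcurrently(int threads, int perThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000));
    }
}
```

A separate get followed by put would not be safe here even on a ConcurrentHashMap, because another thread could update the entry between the two calls; merge avoids that gap.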

Summary

This article gives an overall introduction to Java containers and the characteristics of each. Let’s summarize:

  1. To store single objects, use a class that implements the Collection interface; to store key-value pairs, use a class that implements the Map interface.
  2. Classes that implement the List interface are ordered collections that allow duplicates, such as ArrayList, LinkedList and Vector.
  3. Classes that implement the Set interface store elements without duplicates, such as HashSet, LinkedHashSet and TreeSet.
  4. The Queue interface defines the basic operations of a queue, and Deque defines the operations of a double-ended queue.
  5. Classes that implement the Map interface include the hash-based HashMap, the sortable TreeMap and the weakly referenced WeakHashMap.

When choosing a container, we usually compare data structure, thread safety, whether duplicates are allowed, ordering and the application scenario to pick the one we need.
