Java collection notes (list, queue, set and map)

Time:2022-5-26

The following figure shows the overall framework of Java collections. Yellow boxes represent interfaces, green boxes represent abstract classes, and blue boxes represent concrete classes. Solid lines represent inheritance, and dotted lines represent implementation. The abstract class AbstractCollection appears twice in the figure. This only makes the connecting lines easier to draw and keeps the relationships clear.

Figure: Java collections framework

As the figure above shows, collections in Java fall into two categories: Collection and Map. A Collection holds single values, so each instance object is stored as an individual element. A Map holds pairs of values, so each pair of instance objects is stored as a key-value pair. Collection has three kinds of sub-collections: List, Queue and Set. The sections below describe the concrete implementation classes of List, Queue, Set and Map.

1. List

The elements stored in a List are ordered, and duplicate elements are allowed. List has three concrete implementation classes: ArrayList, Vector and LinkedList.

1.1 Differences among ArrayList, Vector and LinkedList

  1. ArrayList maintains an array internally. When the capacity is insufficient, it expands automatically by copying the original array into a larger one. It therefore shares the strengths and weaknesses of arrays. The strength is fast random access. The weakness is that the elements (object references or primitive values) must be stored contiguously, with no gaps between them. As a result, random lookup and traversal of an ArrayList are very efficient, but inserting or deleting anywhere other than the end is expensive, because the following elements must be copied and shifted [1].
  2. Vector is also implemented with an array. Unlike ArrayList, its methods are synchronized, which keeps adding, deleting, modifying and querying elements thread-safe. Synchronization has a cost, however, so Vector is less efficient than ArrayList.
  3. LinkedList is implemented with a linked list, so it suits frequent insertion and deletion. Access by index is relatively slow, because reaching an element requires walking the list node by node from one end.
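
The three classes share the List interface, so the differences above are mostly about cost rather than behavior. The following minimal sketch illustrates this: the same middle insertion works on all three, while the standard marker interface java.util.RandomAccess reports which implementations offer fast indexed access. The class name ListKindsDemo and its helper methods are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;
import java.util.Vector;

// Hypothetical demo class; not part of the JDK.
class ListKindsDemo {
    // True when the list implements the RandomAccess marker interface,
    // which array-backed lists use to advertise fast indexed access.
    static boolean hasFastRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Inserts a value at the middle index. For array-backed lists this
    // shifts the later elements; for LinkedList it walks to the node first.
    static List<Integer> insertInMiddle(List<Integer> list, int value) {
        list.add(list.size() / 2, value);
        return list;
    }

    public static void main(String[] args) {
        List<Integer> seed = Arrays.asList(1, 2, 3, 4);
        List<List<Integer>> lists = new ArrayList<>();
        lists.add(new ArrayList<>(seed));
        lists.add(new Vector<>(seed));
        lists.add(new LinkedList<>(seed));
        for (List<Integer> list : lists) {
            insertInMiddle(list, 99);
            System.out.println(list.getClass().getSimpleName()
                + " fastRandomAccess=" + hasFastRandomAccess(list)
                + " contents=" + list);
        }
    }
}
```

All three lists end up with the same contents; only ArrayList and Vector report fast random access.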

The Vector class also has a subclass, Stack, which implements the stack structure. However, this official stack implementation is widely considered poorly designed [2] and is usually avoided. Instead, use LinkedList directly. It implements the Deque interface (a sub-interface of Queue), so it provides methods for operating on the head and tail elements. This lets a LinkedList object serve as a stack, a queue or a double-ended queue, as discussed later in the description of Deque.

1.2 Capacity expansion in ArrayList and Vector

  1. When no initial capacity is specified, an ArrayList object starts with an empty array, so its actual initial capacity is 0. (The "initial capacity of ten" in the Javadoc refers to the capacity reached when the first element is added.) The source code is as follows:

    /**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    Here DEFAULTCAPACITY_EMPTY_ELEMENTDATA is an empty array:

    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    When the first element is added, if the required minimum capacity minCapacity is less than the default capacity DEFAULT_CAPACITY = 10, the array is expanded to 10; otherwise it is expanded directly to minCapacity. On later expansions, the preferred new capacity newCapacity is the current capacity oldCapacity plus 50% (the source computes the 50% with a one-bit right shift). If newCapacity does not exceed minCapacity, the array is expanded directly to minCapacity. This logic is in the newCapacity method of ArrayList. The source code is as follows:

    /**
     * Returns a capacity at least as large as the given minimum capacity.
     * Returns the current capacity increased by 50% if that suffices.
     * Will not return a capacity greater than MAX_ARRAY_SIZE unless
     * the given minimum capacity is greater than MAX_ARRAY_SIZE.
     *
     * @param minCapacity the desired minimum capacity
     * @throws OutOfMemoryError if minCapacity is less than zero
     */
    private int newCapacity(int minCapacity) {
        int oldCapacity = elementData.length;
        // A one-bit right shift yields 50% of the current capacity
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity <= 0) {
            if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
                return Math.max(DEFAULT_CAPACITY, minCapacity);
            if (minCapacity < 0) // overflow: the value wrapped past Integer.MAX_VALUE
                throw new OutOfMemoryError();
            return minCapacity;
        }
        // Keep the result within MAX_ARRAY_SIZE unless minCapacity itself is larger
        return (newCapacity - MAX_ARRAY_SIZE <= 0)
            ? newCapacity
            : hugeCapacity(minCapacity);
    }
  2. When no initial capacity is specified, a Vector object creates an array with a default capacity of 10. See the source code:

    public Vector(int initialCapacity, int capacityIncrement) {
        super();
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        // Create the backing array with the requested initial capacity
        this.elementData = new Object[initialCapacity];
        this.capacityIncrement = capacityIncrement;
    }
    public Vector(int initialCapacity) {
        this(initialCapacity, 0);
    }
    public Vector() {
        this(10); // the default initial capacity is 10
    }

    Note the capacityIncrement parameter of the two-argument constructor. It sets the expansion increment this.capacityIncrement, which defaults to 0 when not specified. When expansion is needed, the method first checks whether capacityIncrement is greater than 0. If so, the preferred new capacity newCapacity is the current capacity oldCapacity plus capacityIncrement; otherwise it is double oldCapacity. If newCapacity does not exceed the required minimum capacity minCapacity, the array is expanded directly to minCapacity. This logic is in the newCapacity method of Vector. The source code is as follows:

    /**
     * Returns a capacity at least as large as the given minimum capacity.
     * Will not return a capacity greater than MAX_ARRAY_SIZE unless
     * the given minimum capacity is greater than MAX_ARRAY_SIZE.
     *
     * @param minCapacity the desired minimum capacity
     * @throws OutOfMemoryError if minCapacity is less than zero
     */
    private int newCapacity(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + ((capacityIncrement > 0) ?
                                         capacityIncrement : oldCapacity);
        if (newCapacity - minCapacity <= 0) {
            if (minCapacity < 0) // overflow
                throw new OutOfMemoryError();
            return minCapacity;
        }
        // Keep the result within MAX_ARRAY_SIZE unless minCapacity itself is larger
        return (newCapacity - MAX_ARRAY_SIZE <= 0)
            ? newCapacity
            : hugeCapacity(minCapacity);
    }

In short, with default settings and when the required minimum capacity does not exceed the preferred new capacity:

ArrayList starts with a capacity of 0, grows to 10 on the first expansion, and then grows by 50% of its current capacity on each later expansion;

Vector starts with a capacity of 10 and doubles its capacity on each expansion.
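
The two growth rules can be worked through with a small sketch. ArrayList does not expose its capacity, so the sketch only re-implements the arithmetic described above in simplified form (it ignores the overflow and MAX_ARRAY_SIZE handling); Vector does expose a public capacity() method, so its growth can be observed directly. The class GrowthDemo and its method names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class; a simplified model, not the JDK implementation.
class GrowthDemo {
    // Simplified ArrayList rule: 0 -> 10 on first expansion, then +50%.
    static int nextArrayListCapacity(int oldCapacity) {
        if (oldCapacity == 0) {
            return 10; // first expansion of a default-constructed list
        }
        return oldCapacity + (oldCapacity >> 1);
    }

    // Simplified Vector rule: add capacityIncrement if positive, else double.
    static int nextVectorCapacity(int oldCapacity, int capacityIncrement) {
        return oldCapacity + (capacityIncrement > 0 ? capacityIncrement : oldCapacity);
    }

    // Capacities reached after each of the given number of expansions.
    static List<Integer> arrayListGrowth(int expansions) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = 0;
        for (int i = 0; i < expansions; i++) {
            capacity = nextArrayListCapacity(capacity);
            capacities.add(capacity);
        }
        return capacities;
    }

    // Adds the given number of elements and reports the real Vector capacity.
    static int vectorCapacityAfter(Vector<Integer> vector, int elements) {
        for (int i = 0; i < elements; i++) {
            vector.add(i);
        }
        return vector.capacity();
    }

    public static void main(String[] args) {
        System.out.println(arrayListGrowth(5));                          // [10, 15, 22, 33, 49]
        System.out.println(vectorCapacityAfter(new Vector<>(), 11));      // 20
        System.out.println(vectorCapacityAfter(new Vector<>(10, 3), 11)); // 13
    }
}
```

Adding an 11th element forces one expansion in both Vector cases: the default Vector doubles from 10 to 20, while the one built with capacityIncrement = 3 grows from 10 to 13.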

2. Queue

Queue is the interface for the queue structure. The elements stored in a queue are also ordered, and duplicate elements are allowed. However, a queue does not allow random access by index. It only allows operations on the head and tail elements: new elements are added at the tail, and the earliest added element is retrieved from the head (first in, first out, FIFO).

Queue has a sub-interface, Deque, which is the interface for the double-ended queue structure. Elements can be added or accessed at both ends of a double-ended queue, so an object of a class that implements Deque can be used as a queue, a double-ended queue or a stack. Two common classes that implement Deque are ArrayDeque and LinkedList. As the names suggest, ArrayDeque is implemented with an array, and LinkedList is implemented with a linked list.
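
As a minimal sketch of the point above, the same Deque type can act as a stack (push and pop at the head) or as a FIFO queue (offer at the tail, poll at the head). ArrayDeque is used here; LinkedList could be substituted without changing the calls. The class DequeDemo and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class DequeDemo {
    // Stack usage: push and pop both work on the head, so order is reversed.
    static List<Integer> drainAsStack(List<Integer> input) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Integer value : input) {
            stack.push(value);
        }
        List<Integer> out = new ArrayList<>();
        while (!stack.isEmpty()) {
            out.add(stack.pop());
        }
        return out;
    }

    // Queue usage: offer adds at the tail, poll removes from the head (FIFO).
    static List<Integer> drainAsQueue(List<Integer> input) {
        Deque<Integer> queue = new ArrayDeque<>();
        for (Integer value : input) {
            queue.offer(value);
        }
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3);
        System.out.println(drainAsStack(input)); // [3, 2, 1]
        System.out.println(drainAsQueue(input)); // [1, 2, 3]
    }
}
```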

As the collection framework diagram above shows, the concrete class that implements only the Queue interface is PriorityQueue. It does not follow FIFO order; it is a queue ordered by priority. Internally, PriorityQueue maintains a balanced binary heap ordered by element priority, and the elements are stored in that heap. Every element must therefore be comparable: either the element type implements the Comparable interface (as wrapper types such as Integer do, which gives them a natural order), or a Comparator is supplied to define the ordering. The head of a PriorityQueue is the minimum element under that ordering, and the head is the element returned each time the queue is read.
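
A short sketch of this behavior: elements are polled from a PriorityQueue in priority order rather than insertion order, and supplying a different Comparator changes which element counts as the minimum. The class PriorityQueueDemo and the helper drain are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical demo class; not part of the JDK.
class PriorityQueueDemo {
    // Loads the input into a PriorityQueue ordered by the given comparator,
    // then polls until empty. Each poll removes the current head.
    static List<Integer> drain(Collection<Integer> input, Comparator<Integer> order) {
        PriorityQueue<Integer> queue = new PriorityQueue<>(order);
        queue.addAll(input);
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(5, 1, 8, 2);
        // Natural order: the smallest value is the head.
        System.out.println(drain(input, Comparator.naturalOrder())); // [1, 2, 5, 8]
        // Reversed order: the largest value becomes the "minimum" under the comparator.
        System.out.println(drain(input, Comparator.reverseOrder())); // [8, 5, 2, 1]
    }
}
```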

3. Set

The Set interface corresponds to the concept of a set in mathematics. A Set does not allow duplicate elements and allows at most one null element (a TreeSet using natural ordering rejects null altogether). The common implementation classes of the Set interface are HashSet, LinkedHashSet and TreeSet. Each is implemented by wrapping an object of the corresponding Map implementation class: a HashSet wraps a HashMap, a LinkedHashSet wraps a LinkedHashMap, and a TreeSet wraps a TreeMap. Because a Set holds single values, it only cares about the keys of the underlying map; every value slot is filled with the same placeholder Object named PRESENT.

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
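
The wrapping idea can be sketched in a few lines. The class below is a simplified, hypothetical illustration (MapBackedSet is not a JDK class and omits iteration, bulk operations and everything else a real Set needs); it only shows how storing elements as map keys with a shared dummy value yields set semantics.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified illustration of a set backed by a map.
class MapBackedSet<E> {
    // Shared dummy value stored against every key.
    private static final Object PRESENT = new Object();

    private final Map<E, Object> map = new HashMap<>();

    // Returns true only if the element was not already present,
    // because put returns the previous value (null for a new key).
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // Returns true only if the element was present before removal.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        System.out.println(set.add("a"));  // true
        System.out.println(set.add("a"));  // false, duplicate key
        System.out.println(set.size());    // 1
    }
}
```

Swapping the HashMap for a LinkedHashMap or a TreeMap in such a sketch would change the iteration order of the keys, which is why each Set class inherits the characteristics of its backing map.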

The characteristics of the three Set implementation classes therefore match those of their corresponding Map implementation classes. In brief [3]:

  1. The elements stored in a HashSet are unordered, and accessing an element takes close to constant time. HashSet is therefore the best choice when the order of elements does not need to be maintained.
  2. A LinkedHashSet keeps its elements in insertion order, and traversal visits them in that order. LinkedHashSet is therefore suitable when the insertion order must be preserved.
  3. A TreeSet sorts the stored elements, so by default the elements must be comparable (they are sorted in ascending order by default). A Comparator object can also be supplied to define the ordering. TreeSet is therefore suitable when elements must be traversed in natural or custom order.
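
The ordering differences in the list above can be seen with a small sketch that feeds the same input to each implementation. The class SetOrderDemo and the helper iterationOrder are hypothetical names. No particular order is shown for HashSet, because its iteration order is unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; not part of the JDK.
class SetOrderDemo {
    // Adds all input elements to the given set and returns them
    // in the set's own iteration order. Duplicates are dropped.
    static List<String> iterationOrder(Set<String> set, List<String> input) {
        set.addAll(input);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "fig", "apple");
        // Insertion order is preserved.
        System.out.println(iterationOrder(new LinkedHashSet<>(), input)); // [pear, apple, fig]
        // Elements are sorted in ascending natural order.
        System.out.println(iterationOrder(new TreeSet<>(), input));       // [apple, fig, pear]
        // Order is unspecified; only the size is predictable.
        System.out.println(iterationOrder(new HashSet<>(), input).size()); // 3
    }
}
```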

The implementation of the internal data structure is described in the map section below.

4. Map

(This section is still being organized.)

References

  1. The difference between Vector and ArrayList in Java

  2. Bruce Eckel, On Java 8

  3. Differences between HashMap, LinkedHashMap and TreeMap (repost)