Read the source code and review the basics through LinkedList

Time:2020-11-18

Preface

This article is based on JDK 1.8.

Last time, when I briefly introduced ArrayList, I mentioned that it implements the RandomAccess interface and therefore supports random access. I said then that this interface is easier to understand alongside LinkedList. Today it is time to keep that promise and start reading LinkedList.

LinkedList is also one of the collections that everyday programmers use most often. Given

List list1 = new ArrayList();
List list2 = new LinkedList();

how do you choose?

In fact, the biggest difference between the two is how they are implemented, and the names already tell us: ArrayList is based on an array, while LinkedList is based on a linked list. So the key is the difference between arrays and linked lists.

Doesn't this illustrate a very important truth: fundamentals, fundamentals, fundamentals. If you want to be a real programmer, whether you were formally trained or came to programming from another field, you should work hard to build a solid foundation.

Back to the point: ArrayList is backed by an array, so lookup by index is fast while insertion and deletion are slow; LinkedList is backed by a linked list, so insertion and deletion are fast while lookup is slow. These statements are only relative, though, and knowing just these two points is not enough, so read on.

Class signature

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable

Figure: LinkedList inheritance hierarchy

Since the previous article left out quite a lot, here are some supplementary explanations:

Generics

Collection classes have been generic since Java 1.5. The main purpose is to check types at compile time so that elements of the wrong type cannot be added. In short:

Without generics, as in List list1 = new LinkedList();, the element type defaults to Object, so an element of any type can be added. When an element is taken out it has to be cast, which increases the chance of errors.

With generics, as in List<String> list2 = new LinkedList<String>();, the String on the right of the equals sign can be omitted. Now the compiler checks types: adding a non-String element fails compilation outright, and no cast is needed when reading elements. Of course, this touches on type erasure, which is not the focus of this article; I may write about it later if needed.

Because most of the time we want a collection to store elements of a single type, it is important to use generics (declaring the type up front). There is also a general principle behind this: the earlier a mistake is exposed, the better.
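The difference between the raw and the generic declaration can be seen in a small sketch. The class and method names (GenericsDemo, rawListFailsAtRuntime, typedListNeedsNoCast) are made up for illustration and are not part of the JDK:

```java
import java.util.LinkedList;
import java.util.List;

class GenericsDemo {
    // Raw list: elements are treated as Object, so a wrong type is only
    // discovered when the cast fails at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List raw = new LinkedList();
        raw.add("text");
        raw.add(42); // compiles, although it is probably a mistake
        try {
            String s = (String) raw.get(1); // an Integer cannot be cast to String
            return s == null;               // never reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generic list: the compiler rejects wrong types and no cast is needed.
    static String typedListNeedsNoCast() {
        List<String> typed = new LinkedList<>();
        typed.add("text");
        // typed.add(42); // would not compile
        return typed.get(0);
    }

    public static void main(String[] args) {
        System.out.println(rawListFailsAtRuntime());
        System.out.println(typedListNeedsNoCast());
    }
}
```

With the raw list the mistake shows up only when the element is read; with the generic list it cannot even be written.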

Serializable and Cloneable

LinkedList implements the Cloneable and Serializable interfaces, so it supports cloning and serialization.

Deque

LinkedList implements the Deque interface, and Deque extends the Queue interface, which means LinkedList can be used as a queue to achieve "first in, first out".

List and AbstractList

The previous article skipped one detail that many people wonder about: the abstract class AbstractList already implements the List interface, so why does ArrayList implement List again while extending AbstractList? The same goes for today's protagonist: LinkedList extends AbstractSequentialList, which in turn extends AbstractList, so why does LinkedList implement List separately?

Figure: AbstractList and the List interface

There are two answers on Stack Overflow:

One user said that he asked the author of these classes, who replied that this was a design flaw at the time and has simply been left in place. (Personally, I think this claim still needs to be verified.)

The second answer shows that if a class does not directly implement the List interface again, unexpected results may occur when using a proxy. (From a practical point of view this makes sense, but on closer consideration the collection classes appeared in JDK 1.2 while the proxy class appeared in 1.3, so the logic is questionable.)

My personal understanding:

When designing the collection classes, the experts fully considered the possibility of future optimization.

Specifically, this comes down to understanding the difference between an interface and an abstract class, especially before Java 8. An interface is a specification, which makes it convenient to plan a system. An abstract class is partially implemented, which helps reduce redundant code. In other words, the abstract class here is essentially a utility class that happens to implement the List interface. Moreover, because Java only allows single inheritance, the abstract class may be replaced one day.

In interface-oriented programming we write List list = new LinkedList();. If LinkedList gets a better implementation in the future and no longer extends AbstractSequentialList, then because it implements List directly, the old code above will keep working as long as the internal implementation is sound. Conversely, if LinkedList did not implement List directly and stopped extending AbstractSequentialList, the old code above would no longer compile, and backward compatibility would be broken.
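To make the idea of programming against the interface concrete, here is a minimal sketch. The class InterfaceOrientedDemo and its method joinAll are hypothetical names; the point is only that the caller depends on List, not on a concrete class:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class InterfaceOrientedDemo {
    // Depends only on the List interface, so any implementation can be passed in.
    static String joinAll(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String item : items) {
            sb.append(item);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> linked = new LinkedList<>();
        linked.add("a");
        linked.add("b");
        List<String> array = new ArrayList<>(linked);
        System.out.println(joinAll(linked));
        System.out.println(joinAll(array));
    }
}
```

Swapping the implementation on the right-hand side of the declaration does not affect joinAll at all.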

RandomAccess interface (not implemented)

LinkedList does not implement the RandomAccess interface. ArrayList does; the interface is discussed here for comparison.

Note that random access here means index-based access, that is, the E get(int index) method defined by the List interface. This also means that both ArrayList and LinkedList must implement this method.

Back to the essence of the question: why does the array-based ArrayList support fast random access while the linked-list-based LinkedList does not?

Again it comes down to the most basic knowledge: an array is a contiguous block of memory in which each element occupies a fixed-size slot, so it is easy to locate a given index. In a linked list, each node holds a pointer to the next node in addition to its data, and the nodes are not necessarily contiguous in memory. To get the value at an index, you can only traverse from the head (or from the tail).

The RandomAccess interface is a marker interface with no methods. Its only purpose is to let code use instanceof to check whether a collection implementation supports random access.

List list1 = new LinkedList();
if (list1 instanceof RandomAccess) {
    //...
}

The marker itself matters little. The key to this question is how differently ArrayList and LinkedList implement the get method of the List interface, which is discussed later.
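One way such an instanceof check might be used is to pick an iteration strategy. The following is only a sketch; RandomAccessDemo, isRandomAccess and sum are names invented for this example:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessDemo {
    static boolean isRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Uses get(i) only when the list advertises cheap indexed access;
    // otherwise walks the list once with its iterator.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);
            }
        } else {
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> array = new ArrayList<>();
        List<Integer> linked = new LinkedList<>();
        for (int i = 1; i <= 4; i++) {
            array.add(i);
            linked.add(i);
        }
        System.out.println(isRandomAccess(array) + " " + sum(array));
        System.out.println(isRandomAccess(linked) + " " + sum(linked));
    }
}
```

Both branches compute the same result; the check only avoids calling get(i) repeatedly on a list where each call has to walk the nodes.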

Variables

//Number of elements actually stored
transient int size = 0;

/**
 * Pointer to the first node. The first node has no reference to a previous node.
 *
 * Invariant: (first == null && last == null) ||
 *            (first.prev == null && first.item != null)
 */
transient Node<E> first;

/**
 * Pointer to the last node. The last node has no reference to a next node.
 *
 * Invariant: (first == null && last == null) ||
 *            (last.next == null && last.item != null)
 */
transient Node<E> last;

//Node type: holds the stored element plus references to the next and previous nodes
private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

Note the Node type here: it shows that LinkedList is implemented as a doubly linked list. Why not a singly linked list? The most important reason is lookup efficiency. As mentioned above, lookup in a linked list is relatively slow. With a singly linked list, wherever the index is, traversal can only start from the head, which takes up to N steps; with a doubly linked list, we first check whether the index is in the first half or the second half and then decide whether to start from the head or from the tail, which takes at most about N / 2 steps. Of course, the disadvantage of a doubly linked list is that it needs more storage, which from another angle reflects the idea of trading space for time.
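The saving can be made concrete by counting how many links have to be followed to reach an index. This is a small arithmetic sketch, not JDK code; TraversalCost and its methods are hypothetical names, and the half-way test mirrors the index < (size >> 1) check used by LinkedList:

```java
class TraversalCost {
    // Singly linked list: always start at the head and follow `index` links.
    static int stepsFromHead(int index) {
        return index;
    }

    // Doubly linked list: start from whichever end is nearer.
    static int stepsFromNearerEnd(int index, int size) {
        return index < (size >> 1) ? index : size - 1 - index;
    }

    public static void main(String[] args) {
        int size = 10;
        for (int index = 0; index < size; index++) {
            System.out.println(index + ": " + stepsFromHead(index)
                    + " vs " + stepsFromNearerEnd(index, size));
        }
    }
}
```

For a list of 10 elements, the last index costs 9 steps from the head but 0 steps from the tail, and no index costs more than 4 steps when the nearer end is chosen.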

The two variables above, first and last, are essentially object references. They are no different from s in Student s = new Student(), except that first must point to the head node of the list and last must point to its tail node. They act as markers so that we can start traversing from the head or from the tail at any time.

Constructor

//No-argument constructor
public LinkedList() {
}

//Constructs a LinkedList from the specified collection by calling addAll
public LinkedList(Collection<? extends E> c) {
    this();
    addAll(c);
}
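As a quick usage sketch of the second constructor (CopyConstructorDemo and copyOf are names made up for this example), the new list receives the elements of the source collection in order and is independent of it afterwards:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class CopyConstructorDemo {
    // Builds a new LinkedList holding the elements of the given list, in order.
    static LinkedList<String> copyOf(List<String> source) {
        return new LinkedList<>(source);
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b", "c");
        LinkedList<String> copy = copyOf(source);
        copy.add("d"); // only the copy changes
        System.out.println(source);
        System.out.println(copy);
    }
}
```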

Common methods

There are many commonly used methods (too many to fit in a single screenshot). They fall into two main groups: those from the List hierarchy and those from the Deque hierarchy.

Methods in the List hierarchy:

Figure: methods of LinkedList under the List hierarchy

Here we mainly look at two, add and get.

add(E e)

Add elements to the end of the linked list, return true if successful

//Add an element to the end of the linked list. If successful, return true
public boolean add(E e) {
    linkLast(e);
    return true;
}


void linkLast(E e) {
    //1. Copy a reference l pointing to the tail node
    final Node<E> l = last;
    //2. Wrap the element to be added in a node whose prev points to the tail node
    final Node<E> newNode = new Node<>(l, e, null);
    //3. last points to the newly constructed node
    last = newNode;
    //4. If the list was empty, first also points to the new node
    if (l == null)
        first = newNode;
    //5. If the list was not empty, the next of the old tail node points to the new node
    else
        l.next = newNode;

    //Number of stored elements + 1
    size++;
    //Modification count + 1
    modCount++;
}

The key is the linkLast(E e) method, which handles two cases: adding an element to an empty list and adding an element to a non-empty list.

The knowledge involved here is very basic, and these are fundamental linked-list operations, but they are hard to describe clearly in words alone, so here are some simple diagrams (this is my first time drawing them, so please bear with the imperfections).

linkLast(E e) method
The basic form of a doubly linked list

Figure: the basic form of a doubly linked list

Adding to an empty linked list

This corresponds to comments 1, 2, 3 and 4 in the linkLast(E e) method.

An empty linked list has no nodes, which means both first and last point to null.
Figure: empty linked list

1. Copy a reference l pointing to the tail node (blue part)

Figure: adding to an empty list, step 1
At this point, the copied reference l also points to null.

2. Wrap the element to be added in a node newNode whose prev points to l, that is, null

Figure: adding to an empty list, step 2

3. last points to the newly constructed node (red part)
Figure: adding to an empty list, step 3

4. The list was empty, so first also points to the new node
Figure: adding to an empty list, step 4

At this point, both first and last point to the only node. The local reference newNode still exists, but it no longer matters.

Adding to a non-empty linked list

This corresponds to comments 1, 2, 3 and 5 in the linkLast(E e) method.

1. Copy a reference l pointing to the tail node (blue part)
Figure: adding to a non-empty list, step 1

2. Wrap the element to be added in a node newNode whose prev points to the tail node (blue part)
Figure: adding to a non-empty list, step 2

3. last points to the newly constructed node (red part)
Figure: adding to a non-empty list, step 3

5. The next of the old tail node points to the new node (green part)
Figure: adding to a non-empty list, step 5

At this point, the local references newNode and l still exist, but they no longer matter.
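The two cases can also be tried out in a stripped-down list. This is a sketch that follows the same steps as linkLast but is not the JDK class: TailAppendSketch, forward and backward are hypothetical names, and modCount is left out for brevity:

```java
class TailAppendSketch<E> {
    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.item = item;
            this.next = next;
            this.prev = prev;
        }
    }

    private Node<E> first;
    private Node<E> last;
    int size;

    void linkLast(E e) {
        final Node<E> l = last;                         // 1. remember the old tail
        final Node<E> newNode = new Node<>(l, e, null); // 2. new node, prev = old tail
        last = newNode;                                 // 3. last points to the new node
        if (l == null) {
            first = newNode;                            // 4. the list was empty
        } else {
            l.next = newNode;                           // 5. old tail links forward
        }
        size++;
    }

    // Walks the next pointers from the head.
    String forward() {
        StringBuilder sb = new StringBuilder();
        for (Node<E> x = first; x != null; x = x.next) {
            sb.append(x.item).append(' ');
        }
        return sb.toString().trim();
    }

    // Walks the prev pointers from the tail.
    String backward() {
        StringBuilder sb = new StringBuilder();
        for (Node<E> x = last; x != null; x = x.prev) {
            sb.append(x.item).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        TailAppendSketch<String> list = new TailAppendSketch<>();
        list.linkLast("a"); // empty case: steps 1, 2, 3, 4
        list.linkLast("b"); // non-empty case: steps 1, 2, 3, 5
        list.linkLast("c");
        System.out.println(list.forward());
        System.out.println(list.backward());
    }
}
```

Reading the list in both directions is a simple way to check that both the next and the prev pointers were set correctly.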

add(int index, E element)

Adds an element to the specified location

public void add(int index, E element) {
    checkPositionIndex(index);

    if (index == size)
        linkLast(element);
    else
        linkBefore(element, node(index));
}

As you can see, the method first checks that the specified index is valid, that is, index >= 0 and index <= size.

If index == size, this is equivalent to inserting at the end of the linked list, so linkLast is called directly.

Otherwise, node(index) is called to find the node currently at that position, and the result is passed to linkBefore.
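The index rule can be checked directly against java.util.LinkedList. In this sketch only the helper class AddAtIndexDemo and its method throwsOutOfBounds are invented names:

```java
import java.util.LinkedList;

class AddAtIndexDemo {
    // Returns true if add(index, ...) rejects the index.
    static boolean throwsOutOfBounds(LinkedList<String> list, int index) {
        try {
            list.add(index, "x");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<>();
        list.add("a");
        list.add("c");
        list.add(1, "b");           // middle: goes through linkBefore
        list.add(list.size(), "d"); // index == size: goes through linkLast
        System.out.println(list);
        System.out.println(throwsOutOfBounds(list, list.size() + 1));
    }
}
```

An index equal to size is accepted because it means "append", while anything beyond size or below zero is rejected.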

node(index)

node(index) returns the node at the specified index. It uses the idea mentioned earlier: first determine whether the index is in the first half or the second half, then decide whether to traverse from the head or from the tail.

Node<E> node(int index) {
    // assert isElementIndex(index);

    //If the index is in the first half, traverse forward from the head
    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        //If the index is in the second half, traverse backward from the tail
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

linkBefore

Back to linkBefore: its parameters are the element to insert and the node currently at the specified position.

void linkBefore(E e, Node<E> succ) {
    // assert succ != null;
    //1. Copy a reference to the node before the target position
    final Node<E> pred = succ.prev;
    //2. Construct a new node: prev points to the node before the target position, next points to the node originally at the target position
    final Node<E> newNode = new Node<>(pred, e, succ);
    //3. The prev of the node originally at the target position points to the new node
    succ.prev = newNode;
    //4. If inserting at the head, first points to the new node
    if (pred == null)
        first = newNode;
    //5. Otherwise, the next of the node before the target position points to the new node
    else
        pred.next = newNode;

    size++;
    modCount++;
}

As the process above shows, the key step is the linkBefore method. Again, diagrams help:

Adding at the head:

1. Copy a reference to the node before the target position

Node<E> pred = succ.prev;

Figure: linkBefore at the head node, step 1

Here pred is simply null.

2. Construct a new node whose prev points to the node before the target position and whose next points to the node originally at the target position

Node<E> newNode = new Node<>(pred, e, succ);

Figure: linkBefore at the head node, step 2

3. The prev of the node originally at the target position points to the new node

succ.prev = newNode;

Figure: linkBefore at the head node, step 3

4. first points to the new node

first = newNode;

Figure: linkBefore at the head node, step 4

Adding in the middle

As shown in the figure, suppose the new node is inserted at the position of the third node, i.e. index = 2. The link between the second and third nodes must then be broken and re-established through the new node.
Figure: linkBefore in the middle of the list

1. Copy a reference to the node before the target position, that is, the second node

Node<E> pred = succ.prev;

Figure: linkBefore in the middle of the list, step 1

2. Construct a new node whose prev points to the copied previous node and whose next points to the node originally at the target position

Node<E> newNode = new Node<>(pred, e, succ);

Figure: linkBefore in the middle of the list, step 2

3. The prev of the node originally at the target position points to the new node

succ.prev = newNode;

Figure: linkBefore in the middle of the list, step 3

5. The next of the node before the target position points to the new node

pred.next = newNode;

Figure: linkBefore in the middle of the list, step 4
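Both the head case and the middle case can be exercised in a small stand-alone list. As before, this is a sketch modelled on the steps above rather than the JDK source: InsertBeforeSketch, forward and backward are hypothetical names, and modCount is omitted:

```java
class InsertBeforeSketch<E> {
    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.item = item;
            this.next = next;
            this.prev = prev;
        }
    }

    private Node<E> first;
    private Node<E> last;
    int size;

    private void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null) first = newNode; else l.next = newNode;
        size++;
    }

    // Finds the node at index, starting from the nearer end.
    private Node<E> node(int index) {
        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++) x = x.next;
            return x;
        }
        Node<E> x = last;
        for (int i = size - 1; i > index; i--) x = x.prev;
        return x;
    }

    private void linkBefore(E e, Node<E> succ) {
        final Node<E> pred = succ.prev;                    // 1. node before the target
        final Node<E> newNode = new Node<>(pred, e, succ); // 2. new node between pred and succ
        succ.prev = newNode;                               // 3. target links back to new node
        if (pred == null) {
            first = newNode;                               // 4. inserted at the head
        } else {
            pred.next = newNode;                           // 5. predecessor links forward
        }
        size++;
    }

    void add(int index, E e) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        if (index == size) linkLast(e); else linkBefore(e, node(index));
    }

    String forward() {
        StringBuilder sb = new StringBuilder();
        for (Node<E> x = first; x != null; x = x.next) sb.append(x.item).append(' ');
        return sb.toString().trim();
    }

    String backward() {
        StringBuilder sb = new StringBuilder();
        for (Node<E> x = last; x != null; x = x.prev) sb.append(x.item).append(' ');
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        InsertBeforeSketch<String> list = new InsertBeforeSketch<>();
        list.add(0, "b"); // empty list: linkLast
        list.add(0, "a"); // head insert: steps 1, 2, 3, 4
        list.add(2, "d"); // index == size: linkLast
        list.add(2, "c"); // middle insert: steps 1, 2, 3, 5
        System.out.println(list.forward());
        System.out.println(list.backward());
    }
}
```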

get(int index)

public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
}

The get method fetches an element by index and essentially just calls node(index). As mentioned earlier, the doubly linked list improves efficiency to some extent, reducing the worst case from N steps to N / 2, but the time complexity is still a constant multiple of N, so this method should not be used casually. When random access is needed, use ArrayList; consider LinkedList when access is mostly sequential traversal and insertions and deletions far outnumber lookups by index. The get method is required by the List interface, and its cost is one of the reasons LinkedList does not implement the RandomAccess interface.

Methods in the Deque hierarchy

When we use LinkedList as a queue or a stack, we mainly use the methods from the Deque hierarchy.

Figure: methods of LinkedList under the Deque hierarchy

If you look closely, you will find that many of these methods are essentially duplicates. For example, push(E e) simply calls addFirst(E e),

addFirst(E e) in turn calls linkFirst(E e) directly, and pop() calls removeFirst() directly.

Why does one operation have so many names?
In fact, LinkedList plays different roles when viewed from different perspectives. You could say that elements can be added and removed at either end.

I recommend reading the corresponding Javadoc comments carefully.

As a queue

The basic characteristic of a queue is "first in, first out", which corresponds to adding elements at the tail of the list and removing them from the head.

The corresponding methods are offer(E e), peek() and poll().

public boolean offer(E e) {
    return add(e);
}


public boolean add(E e) {
    linkLast(e);
    return true;
}

As you can see, offer essentially adds an element at the tail of the linked list, which was already covered under linkLast(E e).

/**
 * Retrieves, but does not remove, the head (first element) of this list.
 *
 * @return the head of this list, or {@code null} if this list is empty
 * @since 1.5
 */
public E peek() {
    final Node<E> f = first;
    return (f == null) ? null : f.item;
}

The peek() method returns the first element of the queue without removing it. In other words, calling peek() repeatedly returns the same element.

/**
 * Retrieves and removes the head (first element) of this list.
 *
 * @return the head of this list, or {@code null} if this list is empty
 * @since 1.5
 */
public E poll() {
    final Node<E> f = first;
    return (f == null) ? null : unlinkFirst(f);
}

The poll() method returns the first element of the queue and removes it from the queue. In other words, calling poll() repeatedly returns successive elements.

Obviously, the poll method is more in line with the concept of queues.
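The difference between peek() and poll() is easy to see in a short usage sketch. QueueDemo and drain are names made up for this example; the Queue methods themselves are the standard ones:

```java
import java.util.LinkedList;
import java.util.Queue;

class QueueDemo {
    // Removes elements from the head until the queue is empty.
    static String drain(Queue<String> queue) {
        StringBuilder sb = new StringBuilder();
        String head;
        while ((head = queue.poll()) != null) {
            sb.append(head);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Queue<String> queue = new LinkedList<>();
        queue.offer("a");
        queue.offer("b");
        queue.offer("c");
        System.out.println(queue.peek()); // head stays in the queue
        System.out.println(queue.peek()); // same element again
        System.out.println(drain(queue)); // elements leave in insertion order
        System.out.println(queue.poll()); // empty queue: null instead of an exception
    }
}
```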

The deletion methods are not explained in detail here, because once the add methods above are understood, deletion is straightforward: it is nothing more than reconnecting the pointers around the removed node. There is no need to spend space on it here. You may as well draw it yourself, which will help you understand.
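For readers who prefer code to drawing, removing the head node might look like the following sketch. It is written for this article and is not the JDK's unlinkFirst: HeadRemovalSketch, removeHead and isEmpty are hypothetical names, and an empty list simply yields null, as poll() does:

```java
class HeadRemovalSketch<E> {
    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.item = item;
            this.next = next;
            this.prev = prev;
        }
    }

    private Node<E> first;
    private Node<E> last;
    int size;

    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null) first = newNode; else l.next = newNode;
        size++;
    }

    // Detaches the head node and returns its element, or null if the list is empty.
    E removeHead() {
        final Node<E> f = first;
        if (f == null) {
            return null;
        }
        final E element = f.item;
        final Node<E> next = f.next;
        f.item = null;        // clear the removed node so it holds no stale references
        f.next = null;
        first = next;         // the second node becomes the head
        if (next == null) {
            last = null;      // the list is now empty
        } else {
            next.prev = null; // the new head has no predecessor
        }
        size--;
        return element;
    }

    boolean isEmpty() {
        return first == null && last == null;
    }

    public static void main(String[] args) {
        HeadRemovalSketch<String> list = new HeadRemovalSketch<>();
        list.linkLast("a");
        list.linkLast("b");
        System.out.println(list.removeHead());
        System.out.println(list.removeHead());
        System.out.println(list.isEmpty());
    }
}
```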

As a stack

The basic characteristic of a stack is "first in, last out", which corresponds to adding elements at the head of the linked list and removing them from the head as well.

The corresponding methods are push(E e) and pop().

public void push(E e) {
    addFirst(e);
}

public void addFirst(E e) {
    linkFirst(e);
}

As you can see, push calls addFirst, which then calls linkFirst(E e). This is the same idea as inserting at the head, described under add(int index, E element); only the method names differ.

public E pop() {
    return removeFirst();
}

The pop() method removes and returns the first element.
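A short usage sketch shows the stack behaviour. StackDemo and reverse are hypothetical names; push, pop and isEmpty are the standard Deque methods:

```java
import java.util.Deque;
import java.util.LinkedList;

class StackDemo {
    // Pushes every character, then pops them all: the last one pushed comes out first.
    static String reverse(String text) {
        Deque<Character> stack = new LinkedList<>();
        for (char c : text.toCharArray()) {
            stack.push(c);
        }
        StringBuilder sb = new StringBuilder();
        while (!stack.isEmpty()) {
            sb.append(stack.pop());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("abc"));
    }
}
```

Declaring the variable as Deque rather than LinkedList keeps the code focused on the stack role that is being used.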

Summary

This article covers the most basic content related to LinkedList. It is mostly a review of fundamentals, including Java knowledge and basic data-structure knowledge such as linked-list operations. It is also the first time I have drawn diagrams to explain a problem; sometimes a picture really is worth a thousand words. My strongest impression after writing this is that fundamentals are very important: they determine how far you can go.

I hope my article can bring you a little help!