Who says an ordered linked list can’t do binary search? It just needs to evolve!

Time: 2020-10-21

Preface

This article belongs to the album http://dwz.win/HjK , click to unlock more knowledge of data structures and algorithms.

Hello, this is Tongge.

In the last section, we learned everything about hash tables, especially how they evolved. After that section, I believe you can walk an interviewer through how the hash table developed from its beginnings to its current form.

However, can the ultimate form of HashMap only be realized as “array + linked list + red-black tree”? Is there an alternative? Why doesn’t Java use that alternative?

In this section, we’ll learn another data structure: the skip list. I’ll divide it into two sections. The first introduces how the skip list evolved; the second implements the skip list in code and uses it to rewrite HashMap.

OK, let’s start with the first section on the skip list.

Ordered array

As we all know, arrays support random access: they can locate an element directly by its subscript, with time complexity O(1).

So what is the use of this random access feature in addition to finding elements by subscript?

Imagine an ordered array. How can I find a specified element in it as quickly as possible?


The simple method is to traverse the array from the beginning and return when you reach the element you are searching for. For example, it takes 6 comparisons to find the element 8, and 8 comparisons to find the element 10.

Therefore, the time complexity of searching this way is O(n).
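
To make the counting concrete, here is a minimal sketch of the linear scan. The array contents are an assumption, chosen so that the counts match the ones quoted in the text; the class and method names are hypothetical.

```java
final class LinearSearchDemo {
    // Returns how many elements were examined before target was found, or -1 if it is absent.
    static int comparisonsToFind(int[] sorted, int target) {
        for (int i = 0; i < sorted.length; i++) {
            if (sorted[i] == target) {
                return i + 1; // positions 0..i were examined
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed sample data, not taken from the original figure.
        int[] a = {1, 3, 4, 6, 7, 8, 9, 10};
        System.out.println(comparisonsToFind(a, 8));  // prints 6
        System.out.println(comparisonsToFind(a, 10)); // prints 8
    }
}
```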


The fast method: because the array is ordered, we can use binary search. Start from the middle. If the specified element is smaller than the middle element, continue in the left half; if it is larger, continue in the right half; repeat until the specified element is found. For example, to find the element 8, first look at the middle position (7 / 2 = 3). The element there is smaller than 8, so the left pointer moves to the middle position plus 1, which is position 4, and the new middle position becomes 4 + (7 - 4) / 2 = 5. It takes only two probes to find the element 8.

Binary search improves efficiency by far more than a little. Even in the worst case, its time complexity is only O(log n).
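
The pointer arithmetic above can be sketched as follows. This is only an illustration: the array contents are an assumption picked to reproduce the positions 3 and 5 from the example, and the names are hypothetical.

```java
final class BinarySearchDemo {
    // Returns {index, probes}; index is -1 when target is absent.
    static int[] search(int[] sorted, int target) {
        int left = 0;
        int right = sorted.length - 1;
        int probes = 0;
        while (left <= right) {
            int mid = left + (right - left) / 2; // same formula as 4 + (7 - 4) / 2 in the text
            probes++;
            if (sorted[mid] == target) {
                return new int[] {mid, probes};
            }
            if (sorted[mid] < target) {
                left = mid + 1;  // continue in the right half
            } else {
                right = mid - 1; // continue in the left half
            }
        }
        return new int[] {-1, probes};
    }

    public static void main(String[] args) {
        // Assumed sample data, not taken from the original figure.
        int[] a = {1, 3, 4, 6, 7, 8, 9, 10};
        int[] result = search(a, 8);
        System.out.println("index=" + result[0] + ", probes=" + result[1]); // index=5, probes=2
    }
}
```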


Ordered linked list

Above, we introduced fast search in ordered arrays. Next, let’s take a look at the ordered linked list.


With an ordered linked list, when I want to find the element 8, I can only search from the head of the list until I reach 8, so the time complexity is O(n). It seems that there is no better way.
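
A rough sketch of why the linked list is stuck at O(n): a node only knows its successor, so there is no way to jump to the middle. The node class, the sample values, and the method names below are assumptions for illustration.

```java
final class OrderedListDemo {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // Builds a singly linked list from values that the caller supplies in ascending order.
    static Node fromSorted(int... values) {
        Node dummy = new Node(0);
        Node tail = dummy;
        for (int v : values) {
            tail.next = new Node(v);
            tail = tail.next;
        }
        return dummy.next;
    }

    // Returns how many nodes were visited to reach target, or -1 if it is absent.
    // Because the list is ordered, the walk can stop at the first larger value.
    static int nodesVisited(Node head, int target) {
        int visited = 0;
        for (Node n = head; n != null && n.value <= target; n = n.next) {
            visited++;
            if (n.value == target) {
                return visited;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Node head = fromSorted(1, 3, 4, 6, 7, 8, 9, 10); // assumed sample data
        System.out.println(nodesVisited(head, 8)); // prints 6
    }
}
```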


Let’s consider the difference between ordered arrays and ordered linked lists. An ordered array can jump straight to its middle element because elements can be accessed quickly through an index (subscript). So can we get similar behavior by adding indexes to an ordered linked list?

The answer is yes. An ordered linked list with indexes is a skip list. Let’s jump to the skip list below.

Skip list

The first question: how do we add an index to an ordered linked list?

Here, we need to introduce the concept of a “level”. Treat the original linked list as level 0, then select some elements and extend them upward to form the level 1 index. Similarly, on top of the level 1 index, select some elements and extend them upward to form the level 2 index, and so on until you feel there are enough index levels. Yes, the skip list is that arbitrary; build as many levels as you like ^^

Suppose, for the ordered linked list above, I add the following indexes:


The second question: where do we enter the skip list? At 6? 3? 1? 9?

None of them seems to work. Therefore, we need to add a special node, the head node, in front of the first element. After head nodes are added to the skip list above, each level starts with its own head: H0, H1 and H2.


At this point, as long as we start from the node H2, we can quickly find any element in the skip list.

For example, to find the element 8, start at H2 and look to the right. The next node is 6, which is smaller than 8, so jump to 6 and look to the right again. The next node is 9, which is larger than 8, so we can’t jump there. Drop down to 6 on level 1 and look right again: it is 9 once more, so we still can’t jump. Drop down one more step to 6 on level 0. On level 0 we can only follow the linked list forward node by node until we reach 8. In short, the path is H2 -> 6 -> 6 (level 1) -> 6 (level 0) -> 7 -> 8.
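
The walk just described can be sketched with nodes that carry a right pointer and a down pointer. Everything here is an assumption for illustration: the node layout, the helper names, the path labels, and the level contents (level 0 holds 1, 3, 4, 6, 7, 8, 9, 10; level 1 holds 3, 6, 9; level 2 holds 6, 9), which are reconstructed from the insertion steps in this article.

```java
import java.util.ArrayList;
import java.util.List;

final class SkipListSearchDemo {
    static final class Node {
        final Integer value; // null marks a head node
        Node right;
        Node down;
        Node(Integer value) { this.value = value; }
    }

    // Builds one level on top of the level headed by `below` (null for level 0) and returns its head.
    static Node level(Node below, int... values) {
        Node head = new Node(null);
        head.down = below;
        Node tail = head;
        Node lower = below;
        for (int v : values) {
            Node n = new Node(v);
            // Advance in the lower level to the node holding the same value.
            while (lower != null && (lower.value == null || lower.value != v)) {
                lower = lower.right;
            }
            n.down = lower;
            tail.right = n;
            tail = n;
        }
        return head;
    }

    // Returns the nodes visited while searching for target, labelled like "H2" or "6@L1".
    static List<String> searchPath(Node top, int target) {
        int lvl = 0;
        for (Node n = top; n.down != null; n = n.down) {
            lvl++;
        }
        List<String> path = new ArrayList<>();
        Node cur = top;
        path.add("H" + lvl);
        while (true) {
            // Jump right while the next node is not larger than the target.
            while (cur.right != null && cur.right.value <= target) {
                cur = cur.right;
                path.add(cur.value + "@L" + lvl);
            }
            if (cur.down == null) {
                break; // level 0 reached, nowhere left to drop
            }
            cur = cur.down;
            lvl--;
            path.add((cur.value == null ? "H" : cur.value + "@L") + lvl);
        }
        return path;
    }

    // The walk always ends on level 0, at the last node that is not larger than target.
    static boolean contains(Node top, int target) {
        List<String> path = searchPath(top, target);
        return path.get(path.size() - 1).equals(target + "@L0");
    }

    public static void main(String[] args) {
        Node h0 = level(null, 1, 3, 4, 6, 7, 8, 9, 10);
        Node h1 = level(h0, 3, 6, 9);
        Node h2 = level(h1, 6, 9);
        System.out.println(searchPath(h2, 8)); // [H2, 6@L2, 6@L1, 6@L0, 7@L0, 8@L0]
    }
}
```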


As you can see, the whole process is a series of jumps, which is why it is called a skip list.

There are relatively few elements here, so you may not see much advantage. With many elements, if every two elements form one index node above them, and every two index nodes form one index node on the next level up, the result looks much like a balanced binary tree.


As you can see, each step can cut the search range in half. Therefore, the query time complexity of the skip list is O(log n).

However, a perfectly balanced skip list like this is impractical, because keeping it balanced requires rebalancing whenever an element is inserted or deleted, which greatly reduces efficiency. Therefore, we generally use randomness to decide whether an element or index node should produce an index node above it.
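
One common way to make that random decision is to flip a coin repeatedly. The sketch below assumes a fair coin (probability 1/2 per level), which the text does not specify, and caps the result at one level above the current top level, the rule used in the insertion walkthrough of this article. The names are hypothetical.

```java
import java.util.Random;

final class RandomLevelDemo {
    // Returns how many index levels a new element gets above level 0.
    // Each extra level is granted with probability 1/2 (an assumption),
    // and the result never exceeds currentTopLevel + 1.
    static int randomIndexLevels(Random rnd, int currentTopLevel) {
        int levels = 0;
        while (levels <= currentTopLevel && rnd.nextBoolean()) {
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        int[] histogram = new int[4];
        for (int i = 0; i < 1000; i++) {
            histogram[randomIndexLevels(rnd, 2)]++;
        }
        // Roughly half of the elements get no index, a quarter get one level, and so on.
        for (int levels = 0; levels < histogram.length; levels++) {
            System.out.println(levels + " index level(s): " + histogram[levels]);
        }
    }
}
```

With this rule about half of the elements stay on level 0 only, so on average the levels thin out by half on the way up, without any rebalancing.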

The third question: when is the index built?

The best time is when an element is inserted, because the index will be needed right afterwards. Why? Whether you insert, delete, or query, you must first search for the element’s position before taking the next step. Put bluntly, every operation searches through the index, so the index must already exist, and the natural moment to build it is when an element is inserted.

OK, let me take you through the whole process of creating a skip list step by step:

  1. In the initial state, there is only one head node, H0 (well, apart from the “tongge read the source code” watermark in the figure, naughty ^^).


  2. Insert element 4. It goes right after H0, and a random draw decides whether to build an index above it. The result is no index.


  3. Insert element 3, searching from H0. The next element after H0 is 4, which is larger than 3, so 3 is placed between H0 and 4. Then the random draw is asked whether to form an index, and this time 3 forms an index upward. At the same time, H0 also grows an index H1 upward. The result: level 1 is H1 -> 3, and level 0 is H0 -> 3 -> 4.


  4. Insert element 9, searching from H1 along H1 -> 3 -> 3 -> 4 without finding a larger element, so 9 is inserted after 4. Then ask whether to form an index. The random draw says yes, and a two-level index at that (at most one level more than the current top level). The result: level 2 is H2 -> 9, level 1 is H1 -> 3 -> 9, and level 0 is H0 -> 3 -> 4 -> 9.


  5. Then elements 1 and 7 are inserted uneventfully, with no index for either.


  6. Insert element 6. Following the index, the search route is H2 -> H1 -> 3 -> 3 -> 4, where we find that the element after 4 is 7, so 6 is placed between 4 and 7. Then decide whether to form an index. The random draw says yes, and again a two-level index. This is the troublesome case: when the index of element 6 is built, the links 3 -> 9 and H2 -> 9 have to be modified. The result: level 2 is H2 -> 6 -> 9, and level 1 is H1 -> 3 -> 6 -> 9.


  7. Finally, elements 8 and 10 are inserted without incident, and neither gets an index. The final result: level 2 is H2 -> 6 -> 9, level 1 is H1 -> 3 -> 6 -> 9, and level 0 is H0 -> 1 -> 3 -> 4 -> 6 -> 7 -> 8 -> 9 -> 10.


As you can see, the skip list is a very random data structure. Even if the same elements are inserted again in the same order, the resulting skip list may be completely different. It is that capricious, which is why I like the skip list so much.
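
The seven insertion steps can be replayed with a small sketch. Two things differ from a real skip list and are assumptions made for illustration: each node stores an array of forward pointers (one per level) instead of separate index nodes, and the number of index levels is passed in explicitly instead of being drawn at random, so that the walkthrough is reproducible. Values are assumed to be distinct, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SkipListInsertDemo {
    static final int MAX_LEVEL = 16; // hypothetical cap on the number of index levels

    static final class Node {
        final int value;
        final Node[] next; // next[i] is the successor of this node on level i
        Node(int value, int topLevel) {
            this.value = value;
            this.next = new Node[topLevel + 1];
        }
    }

    final Node head = new Node(Integer.MIN_VALUE, MAX_LEVEL); // plays the role of H0, H1, H2, ...
    int topLevel = 0; // highest level currently in use

    // Inserts value with the requested number of index levels above level 0,
    // capped at one level above the current top level.
    void insert(int value, int indexLevels) {
        int newTop = Math.max(0, Math.min(indexLevels, Math.min(topLevel + 1, MAX_LEVEL)));
        Node[] prev = new Node[MAX_LEVEL + 1]; // last node before the insertion point, per level
        Node cur = head;
        for (int lvl = topLevel; lvl >= 0; lvl--) {
            while (cur.next[lvl] != null && cur.next[lvl].value < value) {
                cur = cur.next[lvl];
            }
            prev[lvl] = cur;
        }
        for (int lvl = topLevel + 1; lvl <= newTop; lvl++) {
            prev[lvl] = head; // a brand-new level starts at the head
        }
        Node node = new Node(value, newTop);
        for (int lvl = 0; lvl <= newTop; lvl++) {
            node.next[lvl] = prev[lvl].next[lvl]; // e.g. 6 takes over the link to 9
            prev[lvl].next[lvl] = node;           // e.g. 3 -> 9 becomes 3 -> 6
        }
        topLevel = Math.max(topLevel, newTop);
    }

    // Lists the values stored on one level, left to right.
    List<Integer> level(int lvl) {
        List<Integer> out = new ArrayList<>();
        for (Node n = head.next[lvl]; n != null; n = n.next[lvl]) {
            out.add(n.value);
        }
        return out;
    }

    public static void main(String[] args) {
        SkipListInsertDemo list = new SkipListInsertDemo();
        // {value, index levels} in the order used by the walkthrough.
        int[][] steps = {{4, 0}, {3, 1}, {9, 2}, {1, 0}, {7, 0}, {6, 2}, {8, 0}, {10, 0}};
        for (int[] s : steps) {
            list.insert(s[0], s[1]);
        }
        for (int lvl = list.topLevel; lvl >= 0; lvl--) {
            System.out.println("level " + lvl + ": " + list.level(lvl));
        }
    }
}
```

Replacing the explicit second argument with a random draw gives the capricious behavior described above: the same insertion order can produce a different index every time.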

The fourth question: the process of inserting elements was described above. What does deletion look like?

Deletion also starts by finding the element, but with a small, subtle difference that is hard to describe in the abstract. For example, to delete the element 6, can I take the path H2 -> 6 -> 6?

No, because along this path, after the level 1 index node 6 is deleted, the link 3 -> 9 cannot be repaired. Therefore, when deleting an element, you must take the path H2 -> H1 -> 3 -> 3 -> 4 -> 6, remembering the last index node visited on each level along the way, so that the links of every level can be repaired correctly after 6 is deleted.

Delete 6 as follows:

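
A minimal sketch of deletion, under the same kind of assumptions as any toy implementation: each node stores an array of forward pointers (one per level), index levels are passed in explicitly rather than drawn at random, values are distinct, and all names are hypothetical. The point to notice is the strict `<` in the search loop: it stops just before the target on every level, which is the path H2 -> H1 -> 3 -> 3 -> 4 from the text, and the remembered predecessors are what make the repair possible.

```java
import java.util.ArrayList;
import java.util.List;

final class SkipListDeleteDemo {
    static final int MAX_LEVEL = 16; // hypothetical cap on the number of index levels

    static final class Node {
        final int value;
        final Node[] next; // next[i] is the successor of this node on level i
        Node(int value, int topLevel) {
            this.value = value;
            this.next = new Node[topLevel + 1];
        }
    }

    final Node head = new Node(Integer.MIN_VALUE, MAX_LEVEL);
    int topLevel = 0;

    // Finds, on every level in use, the last node whose value is smaller than the given value.
    private Node[] predecessors(int value) {
        Node[] prev = new Node[MAX_LEVEL + 1];
        Node cur = head;
        for (int lvl = topLevel; lvl >= 0; lvl--) {
            while (cur.next[lvl] != null && cur.next[lvl].value < value) {
                cur = cur.next[lvl];
            }
            prev[lvl] = cur;
        }
        return prev;
    }

    void insert(int value, int indexLevels) {
        int newTop = Math.max(0, Math.min(indexLevels, Math.min(topLevel + 1, MAX_LEVEL)));
        Node[] prev = predecessors(value);
        for (int lvl = topLevel + 1; lvl <= newTop; lvl++) {
            prev[lvl] = head;
        }
        Node node = new Node(value, newTop);
        for (int lvl = 0; lvl <= newTop; lvl++) {
            node.next[lvl] = prev[lvl].next[lvl];
            prev[lvl].next[lvl] = node;
        }
        topLevel = Math.max(topLevel, newTop);
    }

    // Removes value from every level it appears on; returns false if it is absent.
    boolean delete(int value) {
        Node[] prev = predecessors(value);
        Node target = prev[0].next[0];
        if (target == null || target.value != value) {
            return false;
        }
        for (int lvl = 0; lvl < target.next.length; lvl++) {
            prev[lvl].next[lvl] = target.next[lvl]; // e.g. repairs 3 -> 9 and H2 -> 9
        }
        while (topLevel > 0 && head.next[topLevel] == null) {
            topLevel--; // drop levels that became empty
        }
        return true;
    }

    List<Integer> level(int lvl) {
        List<Integer> out = new ArrayList<>();
        for (Node n = head.next[lvl]; n != null; n = n.next[lvl]) {
            out.add(n.value);
        }
        return out;
    }

    public static void main(String[] args) {
        SkipListDeleteDemo list = new SkipListDeleteDemo();
        int[][] steps = {{4, 0}, {3, 1}, {9, 2}, {1, 0}, {7, 0}, {6, 2}, {8, 0}, {10, 0}};
        for (int[] s : steps) {
            list.insert(s[0], s[1]);
        }
        list.delete(6);
        for (int lvl = list.topLevel; lvl >= 0; lvl--) {
            System.out.println("level " + lvl + ": " + list.level(lvl));
        }
    }
}
```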

Well, at this point I can’t help thinking of a small possible optimization in Java’s skip list, ConcurrentSkipListMap. In ConcurrentSkipListMap, search, insertion, and deletion all follow the same search path as deletion. In fact, this could be optimized simply by taking the other path when inserting and searching.


Students interested in the source code of ConcurrentSkipListMap can look up its source code analysis.

Well, that’s all we have to say about the theory of the skip list.
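
For readers who have not used it, here is a short usage sketch of the JDK class `java.util.concurrent.ConcurrentSkipListMap`. It only shows the public sorted-map behavior (keys stay ordered however they are inserted) and says nothing about the internal search path discussed above; the sample keys and the class name `JdkSkipListDemo` are assumptions.

```java
import java.util.concurrent.ConcurrentSkipListMap;

final class JdkSkipListDemo {
    // Builds a map from keys inserted in the same order as the walkthrough in this article.
    static ConcurrentSkipListMap<Integer, String> sample() {
        ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
        for (int key : new int[] {4, 3, 9, 1, 7, 6, 8, 10}) {
            map.put(key, "v" + key);
        }
        return map;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> map = sample();
        System.out.println(map.keySet());      // [1, 3, 4, 6, 7, 8, 9, 10]
        System.out.println(map.ceilingKey(5)); // 6, the smallest key >= 5
        System.out.println(map.floorKey(5));   // 4, the largest key <= 5
        map.remove(6);
        System.out.println(map.keySet());      // [1, 3, 4, 7, 8, 9, 10]
    }
}
```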

Postscript

In this section, we walked through the whole process of searching, inserting, and deleting elements in the skip list, completely and clearly, using step-by-step diagrams. Did you get it? Can you now take on the interviewer?

However, many students may say “talk is cheap, show me the code”. OK. In the next section, I will show the details of a skip list implementation in code and rewrite HashMap with a skip list.

Follow the official account “tongge read the source code” to unlock more knowledge of source code, fundamentals, and architecture.