[LeetCode topic] #1 Two Sum: exploring the principle and source code of HashMap


0 Preface

This series uses LeetCode problems to analyze the data structures and algorithmic principles behind their best solutions.
This article uses problem #1, Two Sum, to analyze how HashMap works in Java and why it performs so well.

1 Problem Description

Two Sum

Given an array of integers nums and an integer target value target, find the two integers in the array whose sum equals target, and return their array indices.

You can assume that each input will correspond to only one answer. However, the same element in the array cannot be used twice.

You can return answers in any order.

Example 1:
Input: nums = [2,7,11,15], target = 9
Output: [0,1]
Explanation: Because nums[0] + nums[1] == 9, return [0, 1].

Example 2:
Input: nums = [3,2,4], target = 6
Output: [1,2]

Example 3:
Input: nums = [3,3], target = 6
Output: [0,1]


Constraints:

  • 2 <= nums.length <= 10^3
  • -10^9 <= nums[i] <= 10^9
  • -10^9 <= target <= 10^9
  • There will only be one valid answer

2 Best solution and the question it raises

The easiest solution to think of is brute-force enumeration with nested loops: its space complexity is only O(1), but its time complexity is O(N^2). The better solution uses a hash table (such as HashMap in Java) to speed up lookups, trading space for time, so that both the time and space complexity are O(N).
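The hash-table approach can be sketched roughly as follows. This is a minimal illustration, not necessarily the code behind the link below; the class name TwoSumSketch and the decision to throw when no pair exists are assumptions made here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
class TwoSumSketch {
    // Returns the indices of the two elements that add up to target.
    static int[] twoSum(int[] nums, int target) {
        // Maps each value already visited to the index where it was seen.
        Map<Integer, Integer> indexByValue = new HashMap<>();
        for (int i = 0; i < nums.length; i++) {
            int complement = target - nums[i];
            // Look up the complement before storing nums[i], so the same
            // element is never paired with itself.
            Integer j = indexByValue.get(complement);
            if (j != null) {
                return new int[] {j, i};
            }
            indexByValue.put(nums[i], i);
        }
        // The problem guarantees an answer; throwing here is an assumption.
        throw new IllegalArgumentException("no two elements sum to target");
    }
}
```

Calling TwoSumSketch.twoSum(new int[] {2, 7, 11, 15}, 9) returns the indices 0 and 1. Each element is visited once and each map operation is O(1) on average, which is where the overall O(N) comes from.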

The detailed description and code can be found at: https://leetcode-cn.com/probl…

The question here is: why does using a hash table improve lookup efficiency so significantly?

3 Principle analysis of the hash table

1. Why does the hash table structure exist, and what is it composed of?

The familiar structures include the array and the linked list, each with its own advantages and disadvantages:

  • Array: its storage is contiguous, so its space cost is relatively high, but its lookup time is small. Addressing (lookup by index) is efficient, while insertion and deletion are generally inefficient. That is, fast query, slow insertion and deletion.
  • Linked list: its storage is discrete, so its space cost is low. Addressing (lookup by index) is inefficient, while insertion and deletion are generally efficient. That is, slow query, fast insertion and deletion.

The hash table is built on top of both the array and the linked list in order to combine the advantages of these two structures.

Each entry in a hash table contains four parts:

  • Key: the key
  • Value: the value
  • Hash: the hash value of the key. An operation on this hash yields the index into the table
  • Next: a reference to the next entry, forming a linked list that stores entries whose indexes conflict
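The four parts listed above can be sketched as a small Java class. The name EntrySketch and its helper methods are hypothetical and exist only for illustration; the bit operations follow what java.util.HashMap has done since JDK 1.8 (XOR the high 16 bits of hashCode() into the low bits, then mask with the table length minus one), but this is a simplified sketch and not the JDK source.

```java
// Hypothetical, simplified entry type mirroring the four parts of an entry.
class EntrySketch<K, V> {
    final int hash;          // cached hash of the key
    final K key;             // the key
    V value;                 // the value
    EntrySketch<K, V> next;  // next entry in the same bucket, or null

    EntrySketch(int hash, K key, V value, EntrySketch<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    // Mixes the high bits of hashCode() into the low bits so that small
    // tables still benefit from the whole hash.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Maps a hash to a bucket index; tableLength must be a power of two,
    // which makes the mask equivalent to a non-negative modulo.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }
}
```

For example, EntrySketch.indexFor(EntrySketch.spread("Aa"), 16) selects one of the 16 buckets; two keys that produce the same index are chained together through next.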

2. Why is the query efficiency of a hash table higher than that of an array?

A hash table works somewhat like a dictionary. Just as a dictionary lets you quickly look up a value by its key, a hash table looks up a value by its key, with a hash transformation (i.e. hash calculation) in between. The hash value determines the index at which the value is stored, so the data can be fetched directly instead of scanning the elements one by one.

A given key always yields the same hash value, and the index is derived from the hash by some operation. Different keys can still end up with the same hash value or index, and such conflicts are hard to avoid. To resolve them, a linked list is introduced to store the entries that map to the same index.
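To make the conflict handling concrete, here is a minimal sketch of a chained hash table. The class ChainedTableSketch is hypothetical; it has a fixed number of buckets, never resizes, and does not accept null keys, so it illustrates the idea only and is not how HashMap is implemented. The strings "Aa" and "BB" are a convenient test pair because they have the same hashCode() in Java.

```java
// Hypothetical minimal chained hash table with String keys and int values.
class ChainedTableSketch {
    private static final class Node {
        final String key;
        int value;
        Node next; // next node in the same bucket, or null

        Node(String key, int value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed power-of-two size; a real table would grow as it fills up.
    private final Node[] buckets = new Node[16];

    private int indexOf(String key) {
        return (buckets.length - 1) & key.hashCode();
    }

    // Updates the value if the key already exists, otherwise prepends a
    // new node to the bucket's linked list.
    void put(String key, int value) {
        int i = indexOf(key);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                n.value = value;
                return;
            }
        }
        buckets[i] = new Node(key, value, buckets[i]);
    }

    // Jumps to the bucket by index, then walks only that bucket's list.
    Integer get(String key) {
        for (Node n = buckets[indexOf(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

After put("Aa", 1) and put("BB", 2), both entries sit in the same bucket, yet get("Aa") and get("BB") still return their own values because equals() is checked while walking the list.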

3. What is the time complexity of hashMap.containsKey(key)?

The structure of HashMap differs between JDK versions:

  • Before JDK 1.8, HashMap was composed of an array and linked lists.
  • Since JDK 1.8, HashMap is composed of an array, linked lists and red-black trees.

A red-black tree is characterized by fast queries and comparatively slower insertion and deletion.
A red-black tree is a binary search tree with node coloring and related properties added, which keep the tree relatively balanced and ensure that the worst-case time complexity of search, insertion and deletion is O(log N). Because the query time complexity of a linked list is O(N), a very long linked list is converted into a red-black tree to improve efficiency.

The red-black tree was introduced to solve the lookup performance degradation caused by overly long linked lists when conflicts occur. In JDK 1.8, when the length of a linked list exceeds 8 (and the table has at least 64 buckets; otherwise the table is resized instead), the linked list is converted into a red-black tree. The worst-case time complexity is O(N) before the conversion and O(log N) after it.

Recommended video explaining the principle of HashMap (p6-p9): https://www.bilibili.com/vide…
Recommended reading on the principle of red-black trees: https://www.cnblogs.com/yyxt/…

To sum up:
Before JDK 1.8, HashMap.containsKey(key) is O(1) at best and O(N) at worst.
Since JDK 1.8, HashMap.containsKey(key) is O(1) at best and O(log N) at worst.
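The worst case can be provoked deliberately with keys that all share one hash value. The sketch below uses a hypothetical key type, BadKey, whose hashCode() is constant, so every entry lands in the same bucket. It only shows that lookups remain correct in that situation and does not measure timing. BadKey implements Comparable, which HashMap may use to order keys inside a tree bin on JDK 1.8 and later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class CollisionSketch {
    // A key type with a deliberately terrible hashCode(): every instance
    // hashes to the same value and therefore to the same bucket.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Builds a map whose n entries all collide in a single bucket.
    static Map<BadKey, Integer> build(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }
}
```

With build(1000), containsKey(new BadKey(999)) still returns true. Before JDK 1.8 that lookup walks a list of up to 1000 nodes, while on JDK 1.8 and later the bucket has been converted to a tree and the search takes a logarithmic number of steps.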

4 HashMap source code analysis

This article explains the source code very clearly, so the analysis will not be repeated here.

If you don’t understand the article, you can take a look at the video mentioned above first.