The implementation principle of the Python dictionary

Time:2021-4-6

Example code

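a = {}            # create an empty dictionary
a['key1'] = 1     # insert the first key
a['key123'] = 6   # insert a second key, which may collide with key1
del a['key1']     # delete the first key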

Underlying implementation

  1. The Python interpreter executes a = {}

    When the interpreter reaches this line, it allocates a block of consecutive memory slots, each of which can hold an entry. For this example, assume five slots (CPython's real tables are sized in powers of two, with a minimum of eight).
  2. The Python interpreter executes a['key1'] = 1

    The interpreter hashes key1 to get an integer hash value. Because the table has five slots, it takes the hash modulo 5, which yields one of the positions 0, 1, 2, 3 or 4. For example, if key1 maps to 1, the value is stored in position 1. This mapping from hash value to position is the idea behind a hash table, also known as a hash map. However, two different keys may map to the same position.
  3. The Python interpreter executes a['key123'] = 6

    key123 may also map to position 1, but position 1 is already occupied by key1. In that case the interpreter derives a new position from key123's hash value and the occupied position, again taking the remainder so that the result stays inside the table. It repeats this until it finds a free slot and stores the entry there.

    What about deleting values?

  4. The Python interpreter executes del a['key1']

    If position 1 were simply emptied, a later lookup of a['key123'] would start at the same hashed position, find it empty, and wrongly conclude that the key does not exist. How is this solved?

    The slot is not actually cleared on deletion. Instead, it is marked as deleted. When a lookup reaches a marked slot, it knows that an entry once lived there, so it computes the next position just as it did on insertion and continues until it finds the key. This way of resolving hash collisions is known as open addressing.
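The four steps above can be sketched as a toy table. This is a minimal illustration and not CPython's implementation: the class name ToyDict, the EMPTY and DELETED markers, and the explicit h argument are all hypothetical, and it swaps in simple linear probing (try the next slot) where CPython uses a different, perturbation-based probe sequence. Hash values are passed in by hand so that the collision is reproducible.

```python
# Toy open-addressing table. Assumptions: 5 slots, linear probing,
# and caller-supplied hash values (h) instead of Python's hash().
EMPTY = object()    # slot has never been used
DELETED = object()  # marker left behind by a deletion


class ToyDict:
    def __init__(self, size=5):
        self.slots = [EMPTY] * size

    def _probe(self, h):
        """Yield slot indexes starting at h % size, wrapping around once."""
        size = len(self.slots)
        for step in range(size):
            yield (h + step) % size

    def set(self, key, value, h):
        target = None
        for i in self._probe(h):
            slot = self.slots[i]
            if slot is EMPTY:
                if target is None:
                    target = i
                break  # key cannot be stored beyond a never-used slot
            if slot is DELETED:
                if target is None:
                    target = i  # remember the first reusable slot
                continue
            if slot[0] == key:
                target = i  # key already present: overwrite in place
                break
        if target is None:
            raise RuntimeError("table full; a real dict would resize first")
        self.slots[target] = (key, value)
        return target

    def get(self, key, h):
        for i in self._probe(h):
            slot = self.slots[i]
            if slot is EMPTY:
                break  # a never-used slot ends the search
            if slot is not DELETED and slot[0] == key:
                return slot[1]
        raise KeyError(key)

    def delete(self, key, h):
        for i in self._probe(h):
            slot = self.slots[i]
            if slot is EMPTY:
                break
            if slot is not DELETED and slot[0] == key:
                self.slots[i] = DELETED  # mark the slot, do not empty it
                return
        raise KeyError(key)


d = ToyDict()
pos_key1 = d.set("key1", 1, h=1)      # position 1 is free
pos_key123 = d.set("key123", 6, h=1)  # position 1 is taken, so probe onward
d.delete("key1", h=1)                 # position 1 becomes DELETED, not EMPTY
value = d.get("key123", h=1)          # lookup steps past the marker
```

Because the deleted slot holds a marker instead of being emptied, the lookup of key123 keeps probing past position 1 and still finds the value.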

Dictionary expansion

The dictionary is expanded once its free space falls below one third of its total capacity, that is, once it is about two-thirds full. A larger table is allocated and every entry is placed again, because positions depend on the table length, which has changed. As a result, about a third of the table or more always stays unused, and with a large amount of data that unused space becomes large too. This is one of the shortcomings of the algorithm, which trades space for time, so dictionaries are a memory-hungry choice for very large datasets.
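One rough way to observe this expansion is to watch the dictionary's memory footprint while inserting keys. The sketch below assumes CPython, where sys.getsizeof reports the size of the dict object including its table; the helper name growth_points is hypothetical, and the exact byte counts and the item counts at which growth happens vary between Python versions.

```python
import sys


def growth_points(n=100):
    """Return (item_count, size_in_bytes) for each change in the dict's size."""
    d = {}
    last = sys.getsizeof(d)
    points = [(0, last)]
    for i in range(n):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last:  # the table was reallocated
            points.append((len(d), size))
            last = size
    return points


points = growth_points()
for count, size in points:
    print(f"{count:>3} items -> {size} bytes")
```

The size jumps in steps and not by a little on each insertion, which reflects the table being reallocated at a larger capacity before it fills up completely.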

This work is licensed under a CC agreement. Reprints must credit the author and link to this article.