In the previous chapter we covered the most important part of a hash table, the hash function, and implemented our own in both pseudocode and C. Collisions are inevitable in any hash function. How can we deal with them effectively? That is the subject of this chapter.
Handling collisions
A hash function maps an infinite number of possible inputs to a finite number of outputs. When different inputs map to the same output, a collision occurs. Every hash table implementation has its own way of dealing with collisions.
Our hash table will handle collisions using a technique called open addressing with double hashing. Double hashing uses two hash functions to calculate the index at which an item should be stored after i collisions.
Double hashing
After i collisions, the index to use is given by:
index = (hash_a(string) + i * hash_b(string)) % num_buckets
When no collision has occurred, i = 0, so the index is simply hash_a of the string. After a collision, the index is offset by hash_b of the string, once for each collision so far.
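To make the formula concrete, here is a minimal sketch that works on two already-computed hash values. The helper name probe_index and the sample numbers are assumptions for illustration only; they are not part of the tutorial's code.

```c
#include <assert.h>

/* Hypothetical helper: the bucket to try after `i` collisions, given two
 * precomputed, non-negative hash values. The modulo is applied to the whole
 * sum so the result always lies in [0, num_buckets). */
static int probe_index(int hash_a, int hash_b, int i, int num_buckets) {
    assert(num_buckets > 0);
    return (hash_a + i * hash_b) % num_buckets;
}
```

With hash_a = 3, hash_b = 4 and 7 buckets, the sequence of indexes for i = 0, 1, 2 is 3, 0, 4: the first attempt uses hash_a alone, and each further collision moves the index by another hash_b.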
It is possible for hash_b to return 0, which reduces the second term to 0. The hash table would then try to insert the item into the same bucket over and over. We can mitigate this by adding 1 to the result of hash_b, so that it is never 0:
index = (hash_a(string) + i * (hash_b(string) + 1)) % num_buckets
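The effect of the + 1 can be seen by comparing the two formulas side by side. This is only a sketch: the names probe_unadjusted and probe_adjusted are hypothetical, and the hash values are passed in as plain integers rather than computed from a string.

```c
#include <assert.h>

/* Hypothetical helper: the formula without the + 1 adjustment. */
static int probe_unadjusted(int hash_a, int hash_b, int i, int num_buckets) {
    assert(num_buckets > 0);
    return (hash_a + i * hash_b) % num_buckets;
}

/* Hypothetical helper: the formula with 1 added to the second hash,
 * so the step between attempts is at least 1. */
static int probe_adjusted(int hash_a, int hash_b, int i, int num_buckets) {
    assert(num_buckets > 0);
    return (hash_a + i * (hash_b + 1)) % num_buckets;
}
```

With hash_a = 5, hash_b = 0 and 11 buckets, probe_unadjusted returns 5 for every i, so each retry lands on the same occupied bucket. probe_adjusted returns 5, 6, 7, 8 for i = 0 to 3. One caveat: if hash_b + 1 happens to equal num_buckets, the step is again a multiple of the table size and every attempt wraps back to the same bucket, so the adjustment reduces the problem rather than ruling it out.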
Algorithm implementation
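// hash_table.c
// Returns the bucket index to try for string s on the given attempt
// (attempt is the number of collisions seen so far, starting at 0).
static int ht_get_hash(const char* s, const int num_buckets, const int attempt) {
    const int hash_a = ht_hash(s, HT_PRIME_1, num_buckets);
    const int hash_b = ht_hash(s, HT_PRIME_2, num_buckets);
    // Adding 1 to hash_b keeps the step between attempts from being 0.
    return (hash_a + (attempt * (hash_b + 1))) % num_buckets;
}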
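The following sketch shows how ht_get_hash might be used to search for a free bucket. Several parts are assumptions made so the example is self-contained: the ht_hash shown here is a stand-in polynomial hash, not necessarily the one from the previous chapter; the values of HT_PRIME_1 and HT_PRIME_2 are placeholders; and find_free_bucket and the occupied array are hypothetical names that do not appear in the tutorial.

```c
#include <assert.h>
#include <stddef.h>

#define HT_PRIME_1 151 /* placeholder value, assumed for this sketch */
#define HT_PRIME_2 163 /* placeholder value, assumed for this sketch */

/* Stand-in polynomial string hash (an assumption, see above).
 * Returns a value in [0, m). */
static int ht_hash(const char* s, const int a, const int m) {
    long hash = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        hash = (hash * a + (unsigned char)s[i]) % m;
    }
    return (int)hash;
}

static int ht_get_hash(const char* s, const int num_buckets, const int attempt) {
    const int hash_a = ht_hash(s, HT_PRIME_1, num_buckets);
    const int hash_b = ht_hash(s, HT_PRIME_2, num_buckets);
    return (hash_a + (attempt * (hash_b + 1))) % num_buckets;
}

/* Hypothetical helper: try successive attempts until a bucket whose
 * occupied flag is 0 is found. Gives up after num_buckets attempts and
 * returns -1. */
static int find_free_bucket(const char* key, const int* occupied,
                            const int num_buckets) {
    assert(num_buckets > 0);
    for (int attempt = 0; attempt < num_buckets; attempt++) {
        const int index = ht_get_hash(key, num_buckets, attempt);
        if (!occupied[index]) {
            return index;
        }
    }
    return -1;
}
```

With this stand-in hash and 53 buckets, the key "a" maps to bucket 44 on the first attempt. If bucket 44 is marked occupied, the next attempt yields 36, and after that 28. Capping the loop at num_buckets attempts is a design choice of the sketch so that a full table ends the search instead of looping forever.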
Chapter 1: hash functions
Next Chapter: Completing the Hash Table API