Technology sharing | hash join MySQL 8

Time:2020-2-17

By Erik fr ø Seth
Guan Changlong
Original: https://mysqlserverteam.com/h

For a long time, it is only a variant of nested loop algorithm that performs join queries in MySQL. With the release of MySQL 8.0.18, you can now join using hash join. This blog article will introduce how it works, how long it takes to use it, and how it compares with the old join algorithm in MySQL in terms of performance.

What is hash join?

Hash join is a way to perform a join in which a hash table is used to find matching rows between two inputs (an input is one or more tables). It is usually more efficient than nested loop join, especially if one of the inputs can be held in memory. To see how it works, we’ll use the following query as an example:

SELECT
  given_name, country_name
FROM
  persons JOIN countries ON persons.country_id = countries.country_id;

① Construction stage

Generally, hash join is divided into two stages: construction stage and exploration stage.

In the build phase, the service uses the join attribute as the hash table key to build an in memory hash table that stores rows from one of the inputs. This input is also known as build input, let’s assume thatcountriesSpecify as build input. Ideally, the service would choose the smaller of the two inputs as the build input (in bytes, not rows).

Becausecountries.country_idIs a join condition that belongs to the build input, so it is used as a key in the hash table. Once all rows are stored in the hash table, the build phase is complete.

Technology sharing | hash join MySQL 8

② Detection phase

In the probe phase, the service starts from the probe input (in our examplepersons)Read rows. For each line, the service usespersons.country_idThe value in is used as a lookup key to detect whether the hash table matches the row. For each match, a merged row is sent to the client. Finally, the service scans each input only once, using a constant time lookup to find a matching row between the two inputs.

Technology sharing | hash join MySQL 8

This approach works well assuming that the service can store the entire build input in memory. The amount of memory available is determined by the system variablejoin_buffer_sizeControl, which can be adjusted during operation. But what happens if the build input is greater than the available memory? We overflow to disk!

③ Overflow to disk

When the build phase memory is full, the server writes out the rest of the build input to multiple block files on disk. The server tries to set the number of blocks so that the largest block is just right for memory (we’ll see why soon), but the strict limit is 128 block files per input. The hash value of the join attribute is calculated to determine which block file the row is written to. Note that in the illustration, a different hash function is used than the one used in the memory build phase. We’ll see why later.

Technology sharing | hash join MySQL 8

In the probe phase, the server probes the matching rows in the hash table as if everything were suitable for memory. But in addition, a row may match a row in the build input written to disk. As a result, each line from the probe input is also written to a set of block files. Use the hash functions and formulas used when writing build input to disk to determine which block file to write rows to. In this way, we can determine that the matching lines between the two inputs are in the same pair of block files.

Technology sharing | hash join MySQL 8

After the probe phase, we start to read the block file from disk. Typically, the server uses the first set of block files as build and probe input to perform the build and probe phases. We load the first block file in the build input into a hash table in memory. This explains why we want the largest block to fit exactly in memory. If the block file is too large, we need to break it up into smaller blocks. After loading the building blocks, we read the corresponding block file from the probe input and probe for matches in the hash table, just as everything is suitable for memory. After processing the first pair of block files, we will move to the next pair of block files and continue until all the pair of block files have been processed.

Technology sharing | hash join MySQL 8

You may now have guessed why two different hash functions should be used when dividing rows into block files and writing rows to the hash table. If we want to use the same hash function for two operations, we will get a very bad hash table when we load the building block file into the hash table, because all lines in the same block file will be hashed to the same value.

How can I use it?

Hash join is enabled by default, so you don’t need to do anything to use it. It is worth noting that hash join is built on a new iterator actuator, which means that you must use theEXPLAIN FORMAT = treeTo see if hash join will be used:

mysql> EXPLAIN FORMAT=tree
    -> SELECT
    ->   given_name, country_name
    -> FROM
    ->   persons JOIN countries ON persons.country_id = countries.country_id;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Inner hash join (countries.country_id = persons.country_id)  (cost=0.70 rows=1)
    -> Table scan on countries  (cost=0.35 rows=1)
    -> Hash
        -> Table scan on persons  (cost=0.35 rows=1)     |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

In general, if you join tables together using one or more equal join conditions, and the join condition does not have an index, hash join is used. If indexes are available, MySQL tends to use nested loops with index lookups.

We have introduced a new optimizer switch that enables you to disable hash join for any query:

mysql> SET optimizer_switch="hash_join=off";
Query OK, 0 rows affected (0.00 sec)

mysql> EXPLAIN FORMAT=tree
    -> SELECT
    ->   given_name, country_name
    -> FROM
    ->   persons JOIN countries ON persons.country_id = countries.country_id;
+----------------------------------------+
| EXPLAIN                                |
+----------------------------------------+
|                                         |
+----------------------------------------+
1 row in set (0.00 sec)

When hash join is disabled, MySQL will return to the block nesting loop to use the old executor (block nesting loops are not supported by iterator executors). This switch makes it easy to compare the performance of hash join and block nested loops.

If the build input is too large to fit in memory and use disk, you can increase the size of the connection buffer. In contrast to block nesting loops, hash join allocates memory incrementally, which means it will never use more memory than it needs. Therefore, when using hash join connections, it is safer to use a larger connection buffer size.

Performance data!

We did some benchmarking to see how hash joins compare to block nesting loops, and the results looked like this:

Technology sharing | hash join MySQL 8

You can view a demonstration of the results here. First of all, I must mention that we did disable all indexes in this test. This is for the optimizer to use block nested loops and hash joins to create execution plans, so the numbers you see here do not show an overall improvement in dbt-3 execution time. This test is done to highlight the difference between block nested loops and hash joins. But we can see that in all queries using hash join, hash join is obviously better than block nested loop. The buffer pool is resized so that all data is in memory, and the connection buffer size remains the same as the default (about 250kb). The significant improvement is due to the fact that hash join is scanned only once per input, and it uses constant time lookups to find matching rows between two tables.

Unfortunately, there are some limitations in the current implementation of hash join:

MySQL only supports internal hash joins, which means that inverse, semi and outer joins are still performed using block nesting loops.
The optimizer / Planner uses block nested loops to perform joins. But you should use hash join more often.
We hope to eliminate these two limitations in the future, but even if they exist, hash joins should make your queries run faster.