This is how Kafka’s time wheel principle is solved!


[Abstract] The time wheel is the basis of Kafka’s efficient delayed tasks. It models time the way a real-life clock does. The time wheel is not limited to Kafka; it is a general way of representing time. This article introduces the principle of Kafka’s time wheel.

Kafka has several delayed operations, such as DelayedFetch, DelayedProduce, and DelayedHeartbeat. In Kafka, adding, advancing, executing, and removing timed tasks are all implemented with the time wheel. (The time wheel is not a design unique to Kafka but a general technique; Netty uses it as well.)

1. What is the time wheel

Refer to the two pictures on the Internet (excerpt from…

[Figures: structure of the Kafka time wheel]

These two figures illustrate the structure of the Kafka time wheel. Like a real timepiece, it is composed of multiple circular arrays, each containing 20 time units and representing one time dimension (one wheel). For example, in the first-layer time wheel each array element represents 1 ms, so one full turn is 20 ms. When the delay no longer fits within those 20 ms, the task is carried to the second layer, where each “slot” represents 20 ms and one full turn is 400 ms, and so on.
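The carry rule can be checked with a little arithmetic. The sketch below is only an illustration: the function name `level_for_delay` is hypothetical, the 1 ms tick and 20 slots come from the example above, and it assumes that a delay equal to one full turn already overflows to the next layer.

```python
TICK_MS = 1      # granularity of the first-layer wheel (from the example above)
WHEEL_SIZE = 20  # number of slots in every wheel (from the example above)


def level_for_delay(delay_ms, tick_ms=TICK_MS, wheel_size=WHEEL_SIZE):
    """Return (layer, slot width in ms, full turn in ms) for a given delay."""
    level = 1
    span_ms = tick_ms * wheel_size
    # Assumption: a delay that does not fit in one turn moves up a layer.
    while delay_ms >= span_ms:
        tick_ms = span_ms               # one slot above covers a full turn below
        span_ms = tick_ms * wheel_size
        level += 1
    return level, tick_ms, span_ms


for delay in (5, 20, 350, 400, 5000):
    print(delay, level_for_delay(delay))
# 5 (1, 1, 20)
# 20 (2, 20, 400)
# 350 (2, 20, 400)
# 400 (3, 400, 8000)
# 5000 (3, 400, 8000)
```

With these numbers, three layers already cover delays of up to 8 seconds using only 60 slots in total.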

A delayed task goes through three stages: entering the time wheel, demotion, and execution when due.

  • Entering the time wheel
  1. Find the time wheel “level” that matches the delay (like the “hour”, “minute”, or “second” level of a clock). This is an “upgrading” process: the task moves up one level at a time until it reaches a level whose span covers the delay.
  2. Calculate the position within that wheel and insert the task. Each bucket is a doubly linked list that may hold multiple delayed tasks, which is one of the main reasons the time wheel is efficient (discussed later).
  3. If this is the first insertion into the bucket, the bucket must also be added to the DelayQueue. The DelayQueue is introduced to solve the “empty advance” problem (discussed later).


  • Demotion
  1. When time “advances” to a bucket, the tasks in that bucket have used up their time in the current wheel and need to be “demoted”, that is, moved into a wheel with finer granularity. This reinsertion follows the same steps as entering the time wheel.


  • Execution when due
  1. During reinsertion, any task found to have already expired is executed instead of being placed in a bucket.
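The three stages can be sketched in a few lines of Python. This is a minimal single-threaded illustration under stated assumptions, not Kafka's implementation: the names `TimingWheel` and `tick` are chosen for the example, buckets are plain lists rather than doubly linked lists, and the driver visits every millisecond, so the DelayQueue is deliberately left out.

```python
class TimingWheel:
    def __init__(self, tick_ms, wheel_size, start_ms, on_expire):
        self.tick_ms = tick_ms
        self.wheel_size = wheel_size
        self.interval = tick_ms * wheel_size             # one full turn
        self.current = start_ms - start_ms % tick_ms     # rounded down to a tick
        self.buckets = [[] for _ in range(wheel_size)]   # lists stand in for linked lists
        self.overflow = None                             # next layer, created lazily
        self.on_expire = on_expire

    def add(self, expire_ms, task):
        if expire_ms < self.current + self.tick_ms:
            self.on_expire(task)                         # already due: execute
        elif expire_ms < self.current + self.interval:
            slot = (expire_ms // self.tick_ms) % self.wheel_size
            self.buckets[slot].append((expire_ms, task))
        else:                                            # "upgrade" to a coarser wheel
            if self.overflow is None:
                self.overflow = TimingWheel(self.interval, self.wheel_size,
                                            self.current, self.on_expire)
            self.overflow.add(expire_ms, task)


def tick(root, now_ms):
    """Naive driver: drain the slot each wheel has reached, then reinsert."""
    due, wheel = [], root
    while wheel is not None:
        if now_ms % wheel.tick_ms == 0:                  # this wheel's pointer moves
            wheel.current = now_ms
            slot = (now_ms // wheel.tick_ms) % wheel.wheel_size
            due.extend(wheel.buckets[slot])
            wheel.buckets[slot] = []
        wheel = wheel.overflow
    for expire_ms, task in due:                          # demote or execute
        root.add(expire_ms, task)


fired = []
clock = {"now": 0}
root = TimingWheel(1, 20, 0, lambda task: fired.append((clock["now"], task)))
root.add(5, "a")       # stays in the first layer
root.add(350, "b")     # second layer, demoted at 340 ms
root.add(5000, "c")    # third layer, demoted at 4800 ms
for t in range(1, 5001):
    clock["now"] = t
    tick(root, t)
print(fired)  # [(5, 'a'), (350, 'b'), (5000, 'c')]
```

Note that reinsertion always goes through the first-layer wheel's `add`, so the same code path decides whether a task is executed, placed in a finer bucket, or kept at a coarser layer.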


The overall process is as follows:

[Figure: overall flow of a delayed task through the time wheel]

2. Advance of time

An intuitive approach is to have a thread tick continuously, like a clock in real life. In most cases, however, most buckets in the time wheel are empty, so moving the pointer forward accomplishes nothing. To reduce this “empty advance”, Kafka introduces a DelayQueue whose elements are buckets. Time is “pushed forward” only when a bucket expires, that is, when queue.poll returns a result. This reduces the overhead of the ExpiredOperationReaper thread spinning idly.
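The idea can be illustrated with a single-layer wheel whose clock only moves when a bucket is taken from a queue. In this hedged sketch, the class names `Bucket` and `QueueDrivenWheel` are hypothetical, a `heapq` heap stands in for Java's DelayQueue, and real waiting on wall-clock time is not modeled: `advance_to_next` simply jumps to the next non-empty bucket.

```python
import heapq


class Bucket:
    def __init__(self):
        self.expiration = None   # start of the slot this bucket currently represents
        self.tasks = []


class QueueDrivenWheel:
    def __init__(self, tick_ms=1, wheel_size=20):
        self.tick_ms = tick_ms
        self.wheel_size = wheel_size
        self.current = 0
        self.buckets = [Bucket() for _ in range(wheel_size)]
        self.queue = []          # heap of (expiration, slot); stand-in for DelayQueue
        self.wakeups = 0

    def add(self, expire_ms, task):
        # Single layer only: the expiration must fall within the current turn.
        upper = self.current + self.tick_ms * self.wheel_size
        assert self.current + self.tick_ms <= expire_ms < upper
        slot = (expire_ms // self.tick_ms) % self.wheel_size
        bucket = self.buckets[slot]
        bucket.tasks.append(task)
        start = expire_ms - expire_ms % self.tick_ms
        if bucket.expiration != start:       # first insert since the last drain
            bucket.expiration = start
            heapq.heappush(self.queue, (start, slot))

    def advance_to_next(self):
        """Jump straight to the next non-empty bucket; None if nothing is pending."""
        if not self.queue:
            return None
        expiration, slot = heapq.heappop(self.queue)
        self.wakeups += 1
        self.current = expiration
        bucket = self.buckets[slot]
        tasks, bucket.tasks, bucket.expiration = bucket.tasks, [], None
        return expiration, tasks


wheel = QueueDrivenWheel()
wheel.add(3, "a")
wheel.add(3, "b")
wheel.add(17, "c")
print(wheel.advance_to_next())  # (3, ['a', 'b'])
print(wheel.advance_to_next())  # (17, ['c'])
print(wheel.advance_to_next())  # None
print(wheel.wakeups)            # 2
```

A thread ticking every millisecond would have woken up 17 times to reach the last task; driven by the queue, it wakes up only twice, once per non-empty bucket.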


3. Why use time wheel

For delayed tasks, the more obvious choices are DelayQueue and ScheduledThreadPoolExecutor. Compared with these, the biggest advantage of the time wheel is its time complexity.

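Time complexity comparison: DelayQueue and ScheduledThreadPoolExecutor both keep their tasks in a binary heap, so inserting a task costs O(log n), whereas inserting a task into a time wheel bucket costs O(1).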

Therefore, in theory, the performance advantage of the time wheel (TimingWheel) becomes more pronounced as the number of tasks grows.

The main reasons for the high performance of the Kafka time wheel are summarized below:

(1) The time wheel structure and the doubly linked list buckets give insertion O(1) time complexity.

(2) The bucket design lets multiple tasks be “merged”: multiple insertions into the same bucket enqueue it in the DelayQueue only once. This also reduces the number of elements in the DelayQueue and the depth of its heap, so its insert and poll operations cost less.
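The effect of merging on the queue size is easy to see in a small experiment. The sketch below is an illustration under assumptions: the two function names are hypothetical, a `heapq` heap stands in for the DelayQueue, and the workload (10,000 tasks expiring between 20 ms and 399 ms, grouped into 20 ms buckets) is made up for the example.

```python
import heapq


def per_task_heap_size(expirations):
    """One heap entry per task, as with a plain delay queue of tasks."""
    heap = []
    for e in expirations:
        heapq.heappush(heap, e)              # O(log n) for every task
    return len(heap)


def per_bucket_heap_size(expirations, tick_ms):
    """One heap entry per non-empty bucket, as with the time wheel."""
    heap, buckets = [], {}
    for e in expirations:
        start = e - e % tick_ms              # start of the slot the task falls in
        if start not in buckets:             # only the first insert enqueues the bucket
            buckets[start] = []
            heapq.heappush(heap, start)
        buckets[start].append(e)             # O(1) append for every task
    return len(heap)


expirations = [20 + i % 380 for i in range(10_000)]   # 20 ms .. 399 ms
print(per_task_heap_size(expirations))         # 10000
print(per_bucket_heap_size(expirations, 20))   # 19
```

The heap holding buckets never grows beyond the number of slots that are in use, no matter how many tasks are added, which keeps both its size and its depth small.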
