Kafka replica and ISR design (I)


In Kafka, each partition's log is kept as several identical copies to improve system availability. These copies are called replicas.

Kafka's replicas are divided into leader replicas and follower replicas. The leader replica serves read and write requests from clients. A follower replica only passively synchronizes data from the leader replica and does not serve reads or writes externally.

If all the nodes hosting a partition's replicas are running normally, the leader replica remains unchanged, but no system is perfectly stable. Once the node hosting the leader replica has a problem, the follower replicas compete to become the new leader. Not every follower replica is qualified to compete, however: a follower whose data lags far behind the leader's is clearly not eligible. Kafka therefore maintains a set of qualified replicas, collectively referred to as the ISR (in-sync replicas).

Replicas are removed from and added back to the ISR over time.
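The eligibility rule above can be sketched in a few lines. This is only an illustration, assuming the new leader is the first replica in the partition's assignment order that is both alive and in the ISR; the function name `elect_leader` and the broker ids are hypothetical and not Kafka's API.

```python
def elect_leader(assigned_replicas, isr, live_brokers):
    """Return the first assigned replica that is alive and in the ISR, else None.

    assigned_replicas: broker ids in assignment order (hypothetical data)
    isr: set of broker ids currently considered in sync
    live_brokers: set of broker ids that are currently running
    """
    for broker_id in assigned_replicas:
        if broker_id in isr and broker_id in live_brokers:
            return broker_id
    return None


assigned = [1, 2, 3]
isr = {1, 3}  # broker 2 has fallen behind and is not in the ISR

# Broker 1 (the old leader) is down; broker 2 is alive but not eligible.
print(elect_leader(assigned, isr, live_brokers={2, 3}))  # 3

# No live replica is in the ISR, so no eligible leader exists.
print(elect_leader(assigned, isr, live_brokers={2}))  # None
```

Broker 2 is skipped even though it is running, because a replica outside the ISR may be missing messages that the old leader had already accepted.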

Key concept points

The following figure describes the key concepts in a Kafka log. These concepts relate to message production, message consumption, the ISR, and the replica synchronization mechanism.
[Figure: Kafka replica and ISR design (I)]

  • First message offset: the offset of the first message contained in the replica
  • High watermark (HW): the HW of the leader replica determines the range of messages that consumers can read. Only messages below the HW can be consumed
  • Log end offset (LEO): the LEO always points to the position where the next message will be written. Messages between the leader's HW and LEO have not yet been fully replicated. Only after all the replicas in the ISR have advanced their LEO does the leader's HW move right, indicating that the message has been written successfully.
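The relationship between HW and LEO can be made concrete with a small sketch. It assumes the leader's HW is simply the smallest LEO among the replicas in the ISR (leader included); the `Replica` class and the offsets are hypothetical illustration data, not Kafka code.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    broker_id: int
    leo: int  # log end offset: where the next message will be written


def high_watermark(isr):
    """Assumed rule: HW is the minimum LEO across all in-sync replicas."""
    return min(replica.leo for replica in isr)


leader = Replica(broker_id=1, leo=10)
isr = [leader, Replica(broker_id=2, leo=8), Replica(broker_id=3, leo=9)]

hw = high_watermark(isr)
print(hw)               # 8
print(leader.leo - hw)  # 2 messages on the leader are not yet fully replicated

# Once the slowest follower advances its LEO, the HW moves right.
isr[1].leo = 10
print(high_watermark(isr))  # 9
```

Under this rule the HW can never pass the slowest replica in the ISR, which is why a lagging follower that stays in the ISR delays the point at which new messages become visible to consumers.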


The ISR is the set of replicas that Kafka maintains as being in sync with the leader and therefore eligible to compete for leadership.

Why a follower replica may be out of sync with the leader replica:

  • Fetch speed cannot keep up: for a period of time the follower replica cannot keep up with the rate at which the leader replica receives messages. For example, the follower's network I/O is blocked, which greatly slows down its synchronization from the leader
  • Process stuck: the follower replica cannot send requests to the leader for a while, for example because it is in frequent GC pauses
  • Newly created replica: the user increases the number of replicas, and the newly created replica has to catch up with the leader's progress after startup. During this period the new follower replica is usually not in sync with the leader replica

The replica.lag.max.messages parameter is used to detect the first problem, a follower whose fetch speed cannot keep up. If a replica in the ISR falls behind the leader replica by more messages than this setting, it is kicked out of the ISR.

This parameter was removed in later Kafka versions. Why was it removed?

It had a drawback. Consider the following situation: the production rate of Kafka producers is not stable and has peaks and troughs. At a peak, a large number of messages arrive at once, the difference between a replica in the ISR and the leader exceeds this value, and the replica is kicked out of the ISR.

However, as the production rate stabilizes and drops, the follower replica, which has been trying to catch up the whole time, catches up with the leader replica again and is added back to the ISR. The replica is thus removed and re-added repeatedly even though it never stopped working.
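A small simulation makes the drawback visible. It is only a sketch: the threshold, the fetch rate, and the per-tick production numbers are made up, and `in_sync_by_count` is a hypothetical helper standing in for a message-count check in the style of replica.lag.max.messages.

```python
def in_sync_by_count(leader_leo, follower_leo, max_lag_messages):
    """Count-based check: in sync while the message gap stays within the limit."""
    return leader_leo - follower_leo <= max_lag_messages


MAX_LAG = 4      # hypothetical threshold, in messages
FETCH_RATE = 3   # messages the follower can copy per tick (assumed constant)
produced_per_tick = [2, 2, 10, 2, 2, 2]  # a single burst at the third tick

leader_leo = 0
follower_leo = 0
history = []
for produced in produced_per_tick:
    leader_leo += produced
    # The follower keeps fetching at full speed on every tick.
    follower_leo = min(leader_leo, follower_leo + FETCH_RATE)
    history.append(in_sync_by_count(leader_leo, follower_leo, MAX_LAG))

print(history)  # [True, True, False, False, False, True]
```

The follower never stalls in this run, yet a single burst pushes it out of the ISR for three ticks before it is re-admitted. A threshold expressed in messages has to be tuned to the traffic pattern, which is hard to do when the production rate varies.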


The replica.lag.time.max.ms parameter is used to detect the other two situations: if a follower replica fails to request data from the leader replica within this time, it is kicked out of the ISR.

Since replica.lag.max.messages has been removed in newer versions, replica.lag.time.max.ms is also used to detect a follower whose fetch speed cannot keep up. For this problem the check is time-based: as long as the follower replica has not lagged behind the leader continuously for longer than this parameter, it is considered in sync. Once the lag persists beyond this parameter, it is considered out of sync.
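The time-based check can be sketched as follows. This is a simplified model rather than Kafka's implementation: it assumes the leader records, per follower, the last time that follower's LEO had reached the leader's LEO, and compares the elapsed time against the time-based threshold (replica.lag.time.max.ms in Kafka's broker configuration). The class `FollowerState` and all timestamps are hypothetical.

```python
class FollowerState:
    """Tracks when a follower was last fully caught up with the leader."""

    def __init__(self, now_ms):
        self.last_caught_up_ms = now_ms

    def on_fetch(self, follower_leo, leader_leo, now_ms):
        # Only a fetch that reaches the leader's LEO resets the clock.
        if follower_leo >= leader_leo:
            self.last_caught_up_ms = now_ms

    def is_in_sync(self, now_ms, max_lag_ms):
        return now_ms - self.last_caught_up_ms <= max_lag_ms


MAX_LAG_MS = 10_000  # assumed value for the time-based threshold

follower = FollowerState(now_ms=0)

# A burst leaves the follower behind at t=3s, but only briefly.
follower.on_fetch(follower_leo=7, leader_leo=14, now_ms=3_000)
print(follower.is_in_sync(now_ms=3_000, max_lag_ms=MAX_LAG_MS))   # True

# It catches up at t=8s, so the clock resets.
follower.on_fetch(follower_leo=20, leader_leo=20, now_ms=8_000)
print(follower.is_in_sync(now_ms=15_000, max_lag_ms=MAX_LAG_MS))  # True

# A stuck follower sends no more fetches; by t=20s it is out of sync.
print(follower.is_in_sync(now_ms=20_000, max_lag_ms=MAX_LAG_MS))  # False
```

With a single time threshold, a short burst no longer evicts a healthy follower, while a stuck process, a slow follower, and a new replica that stays behind for too long are all caught by the same rule.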