[Kafka] Setting automatic cleanup rules for __consumer_offsets

Time:2021-4-24

This method carries some risk. Please read the whole article before applying it.

The commands in this article write to ZooKeeper, so they can be run while Kafka is stopped.


Cause

After a few months, the Kafka server ran out of storage. Analyzing the disk usage showed that the topic Kafka creates automatically, "__consumer_offsets", was taking up a lot of space. This topic records the offsets that consumers have committed for each topic. Its cleanup rules differ from those of ordinary topics, and in some special cases it may never be cleaned up, which eventually exhausts the server's disk.
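To confirm how much space the offsets topic really occupies, its partition directories can be summed up on disk. The following is only a rough sketch: the function name `offsets_topic_kb` is made up for illustration, and the log directory `/tmp/kafka-logs` is an assumption that must be replaced with the broker's actual `log.dirs` value.

```shell
# Sum the on-disk size (in KB) of all __consumer_offsets partition directories.
# $1 is the Kafka log directory (the value of log.dirs on the broker).
offsets_topic_kb() {
    log_dir="$1"
    total=0
    for dir in "$log_dir"/__consumer_offsets-*; do
        # Skip the unexpanded pattern when no partition directory exists.
        [ -d "$dir" ] || continue
        size=$(du -sk "$dir" | awk '{print $1}')
        total=$((total + size))
    done
    echo "$total"
}

# Assumed path; use the log.dirs value from your own server.properties.
offsets_topic_kb /tmp/kafka-logs
```

Running it before and after the change gives a simple way to check whether the cleanup actually freed space.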

View cleanup policy

Because the Kafka version on this server is old, the commands here use the --zookeeper parameter instead of --bootstrap-server.

Use the following command to view the cleanup policy:
./kafka-configs.sh --zookeeper zk01:2181,zk02:2181,zk03:2181/kafka --entity-type topics --entity-name __consumer_offsets --describe
On this server, the following result is returned:
Configs for topics:__consumer_offsets are segment.bytes=104857600,cleanup.policy=compact,compression.type=uncompressed
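Each log segment file is 100 MB, the cleanup policy is compact, and the compression type is uncompressed.
With compact as the only policy, nothing is removed by age, so if compaction does not keep up, the topic keeps growing and the server space will eventually be used up.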
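When checking several servers, it can help to pull individual values out of that result line. Below is a small sketch; `config_value` is a hypothetical helper, and it only understands the single-line format shown above (newer Kafka versions may print the configs differently).

```shell
# Print the value of one config key from a "Configs for topics:... are k=v,k=v" line.
# $1 = the describe output line, $2 = the config key to look up.
config_value() {
    printf '%s\n' "$1" \
        | sed 's/^.* are //' \
        | tr ',' '\n' \
        | awk -F= -v key="$2" '$1 == key {print $2}'
}

line='Configs for topics:__consumer_offsets are segment.bytes=104857600,cleanup.policy=compact,compression.type=uncompressed'

config_value "$line" cleanup.policy                          # prints: compact
echo "$(( $(config_value "$line" segment.bytes) / 1024 / 1024 )) MB"   # prints: 100 MB
```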

Fix

First, delete the special cleanup policy set on __consumer_offsets:
./kafka-configs.sh --zookeeper zk01:2181,zk02:2181,zk03:2181/kafka --entity-type topics --entity-name __consumer_offsets --alter --delete-config cleanup.policy
This is said to make the cleanup policy consistent with that of ordinary topics, but just in case, explicitly add a cleanup configuration as well:
./kafka-configs.sh --zookeeper zk01:2181,zk02:2181,zk03:2181/kafka --alter --entity-name __consumer_offsets --entity-type topics --add-config retention.ms=604800000
./kafka-configs.sh --zookeeper zk01:2181,zk02:2181,zk03:2181/kafka --alter --entity-name __consumer_offsets --entity-type topics --add-config cleanup.policy=delete

These two commands change the cleanup logic of __consumer_offsets to "delete data older than 7 days, with cleanup policy delete".
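The value 604800000 is simply 7 days expressed in milliseconds. If a different retention period is wanted, the number can be recomputed with a one-line helper such as the one below (`days_to_ms` is a name made up for this example):

```shell
# Convert a number of days to milliseconds, the unit retention.ms expects.
days_to_ms() {
    echo $(( $1 * 24 * 60 * 60 * 1000 ))
}

days_to_ms 7    # prints 604800000
```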

Once Kafka is running, you can see a large amount of data being marked for deletion. After about one minute (if the delete delay parameter has not been changed), Kafka automatically deletes the data older than 7 days, and server space should no longer be a concern.

Risk

Although the disk space problem is solved, one question remains: can this make the offset records of some partitions disappear and lead to repeated consumption?
For example, a topic holds 200 records. A consumer consumed 100 of them eight days ago, and during those eight days nothing was consumed or produced.
If the consumer resumes today, its offset record has not changed for more than 7 days and has probably been deleted. Will the consumer start again from message 0?
In the current environment, the topics themselves keep data for only 7 days, which avoids this problem here, but the change may still cause problems in other setups.
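The scenario above boils down to comparing the age of the last commit with the retention window. The sketch below only illustrates that comparison; `offset_may_be_deleted` is a hypothetical helper, and the real outcome also depends on segment boundaries (the active segment is not deleted) and on the consumer's `auto.offset.reset` setting, so treat the result as "may", not "will".

```shell
# Succeed if an offset committed $1 days ago is older than a retention window of $2 days.
offset_may_be_deleted() {
    commit_age_ms=$(( $1 * 24 * 60 * 60 * 1000 ))
    retention_ms=$(( $2 * 24 * 60 * 60 * 1000 ))
    [ "$commit_age_ms" -gt "$retention_ms" ]
}

# The example from the text: last commit 8 days ago, retention.ms set to 7 days.
if offset_may_be_deleted 8 7; then
    echo "committed offset may already be deleted"
else
    echo "committed offset is still within retention"
fi
```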
