How to implement Kafka consumption from a specified offset in Spring Boot

Time:2021-8-28

This article introduces how to consume Kafka from a specified offset in Spring Boot. It is explained in detail through example code, which should be a useful reference for your study or work. Friends in need can refer to it.

Re-consumption scenarios are hard to avoid when consuming from Kafka. For example, suppose we store records in a database after consuming them from Kafka. If the database goes down at some point, the records consumed during the outage cannot be stored. To make up for that data loss, we can seek the Kafka consumer's offset back to its value at some time before the outage and then consume again.
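As a concrete sketch of "the offset at a certain time": the consumer API can translate a timestamp into an offset with offsetsForTimes. The topic name and timestamp below are illustrative assumptions, and the consumer is assumed to already be assigned the partition:

//Sketch: find the offset of the first record at or after a given timestamp
//(for example, just before the database outage), then seek there
TopicPartition partition = new TopicPartition("my-topic", 0);
long outageStartMillis = Instant.parse("2021-08-28T10:00:00Z").toEpochMilli();

Map<TopicPartition, OffsetAndTimestamp> offsets =
  consumer.offsetsForTimes(Collections.singletonMap(partition, outageStartMillis));

OffsetAndTimestamp found = offsets.get(partition);
if (found != null) {
 consumer.seek(partition, found.offset());
}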

First create a Kafka consumer service

@Service
@Slf4j
//Implement the CommandLineRunner interface so that its run method runs automatically when Spring Boot starts.
public class TspLogbookAnalysisService implements CommandLineRunner {
 @Override
 public void run(String... args) {
  //do something
 }
}

Building the Kafka consumption model

Each topic on the Kafka server has multiple partitions, and each partition maintains its own offset. Our goal is to have each Kafka consumer start consuming from a specified offset.

Here we use a one-to-one consumption model of consumer -> partition, where each consumer thread manages its own partition.

@Service
@Slf4j
public class TspLogbookAnalysisService implements CommandLineRunner {
 //Declare the number of consuming threads equal to the number of Kafka partitions. One partition corresponds to one consuming thread
 private static final int consumeThreadNum = 9;
 //Specifically specify the offset to start consumption for each partition
 private List<Long> partitionOffsets = Lists.newArrayList(1111L, 1112L, 1113L, 1114L, 1115L, 1116L, 1117L, 1118L, 1119L);
 
 private ExecutorService executorService = Executors.newFixedThreadPool(consumeThreadNum);

 @Override
 public void run(String... args) {
  //Loop traversal to create consumption thread
  IntStream.range(0, consumeThreadNum)
    .forEach(partitionIndex -> executorService.submit(() -> startConsume(partitionIndex)));
 }
}
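Note that consumeThreadNum = 9 hardcodes the partition count. As an alternative sketch, the count can be queried from the broker at startup via partitionsFor, here reusing the buildKafkaConfig() and kafkaProperties that appear below:

//Sketch: derive the partition count instead of hardcoding it, using a
//short-lived consumer built from the same configuration
try (KafkaConsumer<String, byte[]> probe = new KafkaConsumer<>(buildKafkaConfig())) {
 int partitionCount = probe.partitionsFor(kafkaProperties.getKafkaTopic()).size();
 log.info("topic {} has {} partitions", kafkaProperties.getKafkaTopic(), partitionCount);
}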

Offset handling in the Kafka consumer

Declare the Kafka consumer configuration

private Properties buildKafkaConfig() {
 Properties kafkaConfiguration = new Properties();
 kafkaConfiguration.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "");
 kafkaConfiguration.put(ConsumerConfig.GROUP_ID_CONFIG, "");
 kafkaConfiguration.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "");
 kafkaConfiguration.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "");
 kafkaConfiguration.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "");
 kafkaConfiguration.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "");
 kafkaConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG,"");
 kafkaConfiguration.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "");
 //... more configuration items

 return kafkaConfiguration;
}
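For reference, a filled-in version of this configuration might look like the following; the broker address, group id, and tuning values are illustrative assumptions, not values from this article:

private Properties buildKafkaConfig() {
 Properties kafkaConfiguration = new Properties();
 //Illustrative values only - replace with your environment's settings
 kafkaConfiguration.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
 kafkaConfiguration.put(ConsumerConfig.GROUP_ID_CONFIG, "tsp-logbook-group");
 kafkaConfiguration.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "500");
 kafkaConfiguration.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
 kafkaConfiguration.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
 //With no committed offset, "earliest" starts from the beginning of the partition
 kafkaConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
 //Offsets are committed manually in this article, so auto commit is disabled
 kafkaConfiguration.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
 return kafkaConfiguration;
}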

Create the Kafka consumer, process the offset, and start the consumption task

private void startConsume(int partitionIndex) {
 //Create Kafka consumer
 KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(buildKafkaConfig());

 try {
  //Specify the consumption partition corresponding to the consumer
  TopicPartition partition = new TopicPartition(kafkaProperties.getKafkaTopic(), partitionIndex);
  consumer.assign(Lists.newArrayList(partition));

  //Offset processing of consumer
  if (CollectionUtils.isNotEmpty(partitionOffsets) && partitionOffsets.size() == consumeThreadNum) {
   Long seekOffset = partitionOffsets.get(partitionIndex);
   log.info("partition:{} , offset seek from {}", partition, seekOffset);
   consumer.seek(partition, seekOffset);
  }
  
  //Start consumption data task
  kafkaRecordConsume(consumer, partition);
 } catch (Exception e) {
  log.error("kafka consume error:{}", ExceptionUtils.getFullStackTrace(e));
 } finally {
  try {
   consumer.commitSync();
  } finally {
   consumer.close();
  }
 }
}
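One caveat: with while (true) in the consume loop, the finally block above is only reached when an exception escapes. If a clean shutdown path is also wanted, KafkaConsumer.wakeup() is the one consumer method that is safe to call from another thread; a minimal sketch, assuming a hypothetical consumers list field and that the consume loop lets WakeupException propagate rather than swallowing it:

//Sketch of a shutdown hook (field and method names are hypothetical).
//Assumes each KafkaConsumer created in startConsume registers itself here.
private final List<KafkaConsumer<String, byte[]>> consumers = new CopyOnWriteArrayList<>();

@PreDestroy
public void shutdown() {
 //wakeup() makes a blocked poll() throw WakeupException, letting each
 //consume thread fall through to its finally block, commit, and close
 consumers.forEach(KafkaConsumer::wakeup);
 executorService.shutdown();
}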

Data consumption logic and offset operations

private void kafkaRecordConsume(KafkaConsumer<String, byte[]> consumer, TopicPartition partition) {
 while (true) {
  try {
   ConsumerRecords<String, byte[]> records = consumer.poll(TspLogbookConstants.POLL_TIMEOUT);
   //Specific processing flow
   records.forEach((k) -> handleKafkaInput(k.key(), k.value()));

    //Very important: log the current consumer's offset and partition information (if you later need to consume from a specified offset again, you can recover the offset and partition from these logs)
   if (records.count() > 0) {
    String currentOffset = String.valueOf(consumer.position(partition));
    log.info("current records size is:{}, partition is: {}, offset is:{}", records.count(), consumer.assignment(), currentOffset);
   }
 
   //Offset commit  
   consumer.commitAsync();
  } catch (Exception e) {
   log.error("handlerKafkaInput error{}", ExceptionUtils.getFullStackTrace(e));
  }
 }
}
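The commitAsync() above commits whatever position the last poll() advanced to. If offsets should only be committed after records are durably stored (the database scenario from the beginning of the article), committing an explicit offset map synchronously is an alternative; a minimal sketch:

//Sketch: commit an explicit offset for this partition only. OffsetAndMetadata
//takes the offset of the NEXT record to read; after a fully processed poll(),
//consumer.position(partition) already points there
Map<TopicPartition, OffsetAndMetadata> toCommit = Collections.singletonMap(
  partition, new OffsetAndMetadata(consumer.position(partition)));
consumer.commitSync(toCommit);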

That is the whole content of this article. I hope it is helpful to your study, and I hope you will support developpaer.