Talk about Kafka: compiling the Kafka source code and building a source-reading environment

Time: 2021-11-25

1、 Foreword

The Kafka version Lao Zhou compiles here is 2.7. Why build the source-reading environment on this version? Because it is reasonably new. And why not a version after 2.7? Because 2.8, for example, begins removing the ZooKeeper dependency and is not considered stable or recommended for production. So version 2.7 is what we build and study.

2、 Environment preparation

  • JDK: 1.8.0_241
  • Scala: 2.12.8
  • Gradle: 6.6
  • ZooKeeper: 3.4.14

3、 Environment construction

3.1 JDK environment construction

There is not much to say here; any machine used for Java development already has a JDK installed.

3.2 Scala environment construction

Download link: https://www.scala-lang.org/download/2.12.8.html


Lao Zhou is on macOS here; pick the download that matches your own system.

3.2.1 configuring Scala environment variables

Enter the following in the terminal to edit the file:

vim ~/.bash_profile

# Adjust the path to wherever you installed Scala
SCALA_HOME=/Users/Riemann/Tools/scala-2.12.8
export SCALA_HOME
export PATH=$PATH:$SCALA_HOME/bin

# Run the following on the command line to make the environment variables take effect.
source ~/.bash_profile

3.2.2 verification

Enter the following command in the terminal:

scala -version

If a prompt like the following appears, the Scala environment has been set up successfully.
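
On my machine it prints something like this (the copyright line may differ slightly):

Scala code runner version 2.12.8 -- Copyright 2002-2018, LAMP/EPFL and Lightbend, Inc.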

3.3 Gradle environment construction

First, go to Gradle's official distribution page: https://services.gradle.org/distributions/

We select the release we want to install: gradle-x.x-bin.zip is the binary release to download, gradle-x.x-src.zip is the source code, and gradle-x.x-all.zip bundles everything. My local version is gradle-6.6.

Gradle does not need to be installed; simply unzip the downloaded archive into a local directory.

3.3.1 configuring gradle environment variables

Enter the following in the terminal to edit the file:

vim ~/.bash_profile

# Adjust the path to wherever you installed Gradle
GRADLE_HOME=/Users/Riemann/Tools/gradle-6.6
export GRADLE_HOME
export PATH=$PATH:$GRADLE_HOME/bin

# Run the following on the command line to make the environment variables take effect.
source ~/.bash_profile

3.3.2 verification

Enter the following command in the terminal:

gradle -v

If output like the following appears, the Gradle environment has been set up successfully.

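On my machine it prints roughly the following (build metadata lines abridged):

------------------------------------------------------------
Gradle 6.6
------------------------------------------------------------

Build time:   2020-08-10 ...
JVM:          1.8.0_241 (Oracle Corporation ...)
OS:           Mac OS X ...
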
3.4 ZooKeeper environment construction

Lao Zhou already has a ZooKeeper instance set up in a Linux environment that can be used directly, but the setup steps are given below anyway; they are much the same on any system~

3.4.1 Download

wget http://mirrors.hust.edu.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz

3.4.2 decompression

tar -zxvf zookeeper-3.4.14.tar.gz

3.4.3 enter the zookeeper-3.4.14 directory and create the data folder

cd zookeeper-3.4.14
mkdir data

3.4.4 modify configuration file

cd conf
mv zoo_sample.cfg zoo.cfg

3.4.5 modify the data attribute in zoo.cfg

dataDir=/root/zookeeper-3.4.14/data
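
For reference, a minimal zoo.cfg for a standalone node looks roughly like this (the other values are the defaults carried over from zoo_sample.cfg):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/root/zookeeper-3.4.14/data
clientPort=2181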

3.4.6 zookeeper service startup

Enter the bin directory and start the service with the following command:

./zkServer.sh start

The following output indicates a successful startup:

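With ZooKeeper 3.4.14 a successful start looks roughly like this (the config path reflects where you unpacked it):

ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.14/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

You can double-check with ./zkServer.sh status, which should report Mode: standalone.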

3.5 construction of Kafka source code environment

Download the source package of the corresponding version from the official website: http://kafka.apache.org/downloads

After downloading and unpacking, the source tree still needs its dependency jars imported. I import the project with IDEA; after importing, set the previously configured Gradle as the Gradle home in IDEA's settings.

3.5.1 import Kafka source code into IDEA

Open the unpacked source directory in IDEA and import it as a Gradle project, selecting the Gradle installed in section 3.3 as the Gradle home.
3.5.2 modify build.gradle

At this point the dependency jars still cannot be downloaded. You need to switch the repositories to a mirror inside China; otherwise downloads are very slow and frequently fail with a "time out" error.

Go into the Kafka source directory, modify the build.gradle file, and add the Aliyun mirror configuration on top of the original configuration.

buildscript {
    repositories {
        maven {
            url 'http://maven.aliyun.com/nexus/content/groups/public/'
        }
        maven {
            url 'http://maven.aliyun.com/nexus/content/repositories/jcenter'
        }
    }
}
 
allprojects {
    repositories {
        maven {
            url 'http://maven.aliyun.com/nexus/content/groups/public/'
        }
        maven {
            url 'http://maven.aliyun.com/nexus/content/repositories/jcenter'
        }
    }
}

3.5.3 code construction

The project can be built from the command line or with Gradle inside IDEA's graphical interface. The IDEA route is easier to operate, but the Gradle commands are given here as well.

./gradlew clean build -x test

If the wrapper jar cannot be downloaded, fetch the required gradle-wrapper.jar directly from the link below, copy it manually into the gradle/wrapper subdirectory under the Kafka source path, and then re-execute the gradlew build command to build the project.

Link: https://pan.baidu.com/s/1W6EHysWY3ZWQZRWNdNZn3Q  Extraction code: hpj5
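
Alternatively, since a standalone Gradle was installed in section 3.3, the wrapper can be bootstrapped with it instead of copying the jar by hand; the Kafka README describes this approach (a sketch, run from the source root):

cd kafka-2.7.0-src
gradle          # running the default task once bootstraps the gradle wrapper
./gradlew clean build -x test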

Other Gradle commands:

# Build the jar packages
./gradlew jar

# Generate project files for your IDE, depending on whether you use IDEA or Eclipse
./gradlew idea
./gradlew eclipse

# Build the source jar
./gradlew srcJar

# Build the aggregated Javadoc
./gradlew aggregatedJavadoc

# Clean the build output
./gradlew clean

4、 Code structure


4.1 code package structure

  • bin directory: Kafka's command-line tool scripts; the well-known kafka-server-start and kafka-console-producer scripts live here.
  • checkstyle directory: code-style specification and automated checking.

    What is checkstyle? Debates about code formatting never end, and to this day there is no final verdict on what is right and what is wrong. Over time, though, sets of conventions have emerged; there is no absolute right or wrong, and what matters is the convention a project settles on. The best known is the Google Style Guide, and checkstyle is an automation plug-in built around such styles to check whether code conforms to the specification.

    The files in this directory define the code-format specification for the project. The corresponding checkstyle configuration and the automatic (Scala) code-format configuration can both be seen in build.gradle.


  • clients directory: the Kafka client code, e.g. the producer and consumer implementations.
  • config directory: Kafka's configuration files, of which the most important is server.properties.
  • connect directory: source code of the Connect component. Kafka Connect moves data between Kafka and external systems in real time.
  • core directory: the broker-side code; all of Kafka's server-side code lives in this directory.
  • docs directory: Kafka design documents and component diagrams.
  • examples directory: Kafka usage examples.
  • generator directory: Kafka's message-class generation module, which generates the corresponding Java classes from the message JSON definitions under the clients module. In the build.gradle file you can see that a processMessages task is defined for this.

  • gradle directory: Gradle scripts, dependency definitions and other related files.
  • jmh-benchmarks directory: classes related to Kafka's code micro-benchmarks.

    JMH (Java Microbenchmark Harness) is a tool suite dedicated to code micro-benchmarking. What is a micro-benchmark? In short, a benchmark at the method level, with precision down to the microsecond range. When you have located a hot method and want to optimize it further, JMH lets you quantify the results of the optimization.

    Typical JMH application scenarios include:

    • Finding out precisely how long a method takes to execute, and how the execution time correlates with the input;
    • Comparing the throughput of different implementations of an interface under the same conditions to find the optimal implementation.
  • kafka-logs directory: generated by the log.dirs setting in the server.properties file.
  • log4j-appender directory:

    A log4j appender that produces log messages to Kafka

    This directory contains the KafkaLog4jAppender class.

  • raft directory: code related to the Raft consensus protocol.
  • streams directory:

    Kafka Streams is a client library for building applications and microservices, where the input and output data are stored in Kafka clusters.

    It provides a Kafka-based stream-processing library with concrete classes that developers call directly; how the whole application runs is largely in the developer's hands, which makes it convenient to use and debug.

    Kafka Streams is a library for building stream-processing applications, specifically ones whose input is a Kafka topic and whose output is another Kafka topic (or calls to external services, updates to databases, and so on). It lets you do this in a distributed, fault-tolerant way.

  • tests directory: describes how to run Kafka's system integration and performance tests.
  • tools directory: tooling modules.
  • vagrant directory: describes how to run Kafka in a Vagrant virtual environment, with the related script files and documentation.

    Vagrant is a Ruby-based tool for creating and deploying virtualized development environments. It uses Oracle's open-source VirtualBox virtualization system together with Chef to create automated virtual environments.

4.2 project structure

The project structure mainly centers on the core directory, which is Kafka's core package: cluster management, partition management, storage management, replica management, consumer-group management, network communication, consumption management and other core classes all live here.

  • admin package: implements the administrative commands;
  • api package: encapsulates the request and response DTO objects;
  • cluster package: cluster-related objects; for example, the Replica class represents a partition replica and the Partition class represents a partition;
  • common package: general-purpose utility code;
  • controller package: classes and key modules related to the KafkaController (KC). A Kafka cluster has only one active (leader) KC, which is responsible for partition and replica management and for keeping cluster metadata in sync across the cluster;
  • coordinator package: holds the GroupCoordinator code for the consumer side and the TransactionCoordinator code for transactions. Reading this package, especially the consumer-side GroupCoordinator code, is the key to understanding the design of the broker-side coordinator component.
  • log package: holds Kafka's core log-structure code, including logs, log segments, index files and so on. The log compaction mechanism is also implemented under this package; it is a very important source package.
  • network package: encapsulates Kafka's server-side network-layer code, in particular SocketServer.scala, the concrete implementation of Kafka's Reactor pattern, which is well worth reading.
  • consumer package: will be deprecated and replaced by the consumer classes under the clients package.
  • server package: as the name implies, Kafka's server-side main code. It contains a large number of classes, and many key Kafka components live here, such as the state machines and the purgatory delay mechanism.
  • tools package: utility classes.

5、 Environment verification

Now let’s verify whether the Kafka source code environment is built successfully.

5.1 First, create a new resources directory under the core/src/main directory, then copy the log4j.properties configuration file from the config directory into the resources directory.

5.2 modify the server.properties file in the config directory

log.dirs=/Users/Riemann/Code/framework-source-code-analysis/kafka-2.7.0-src/kafka-logs

Other configurations in the server.properties file need not be modified for the time being.

5.3 configure kafka.Kafka in IDEA

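As a sketch, the run configuration uses roughly the following values (the paths are from my machine, and the exact field names depend on your IDEA version):

Main class:         kafka.Kafka
Program arguments:  config/server.properties
Working directory:  /Users/Riemann/Code/framework-source-code-analysis/kafka-2.7.0-src
Use classpath of:   core (main)
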
5.4 starting Kafka broker

If the startup is successful, the console output is normal and output like the following can be seen:

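The key line to look for is roughly of this shape (timestamp omitted; the broker id may differ):

INFO [KafkaServer id=0] started (kafka.server.KafkaServer)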

5.5 The following exceptions may occur:

5.5.1 Exception 1

log4j:WARN No appenders could be found for logger (kafka.utils.Log4jControllerRegistration$).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Add the two logging jars, slf4j-log4j12-1.7.30.jar and log4j-1.2.17.jar, to the project structure. Alternatively, you can add the corresponding configuration in build.gradle to pull in the packages.

Method 1: add the two jars manually in IDEA via File -> Project Structure -> Modules -> Dependencies.
Method 2:

compile group: 'log4j', name: 'log4j', version: '1.2.17'
compile group: 'org.slf4j', name: 'slf4j-api', version: '1.7.30'
compile group: 'org.slf4j', name: 'slf4j-log4j12', version: '1.7.30'

These dependencies are added to the core module in the build.gradle file:

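A sketch of where the lines go, assuming the stock layout of the 2.7 build.gradle (place them alongside the existing entries in the core project's dependency block):

project(':core') {
    // ... existing configuration ...
    dependencies {
        // ... existing dependencies ...
        // added so the log4j configuration is honoured at runtime
        compile group: 'log4j', name: 'log4j', version: '1.2.17'
        compile group: 'org.slf4j', name: 'slf4j-api', version: '1.7.30'
        compile group: 'org.slf4j', name: 'slf4j-log4j12', version: '1.7.30'
    }
}
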
5.5.2 Exception 2

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

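This warning means that no SLF4J binding is on the runtime classpath, so SLF4J falls back to a no-op logger. The fix is the same as for exception 1: make sure the slf4j-log4j12 binding (together with log4j) has been added to the core module by one of the two methods above, and the log output will come back.
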
5.6 sending and consuming messages

We use Kafka's own command-line scripts to verify the source environment built above.

First, we go to the ${KAFKA_HOME}/bin directory and create a topic named topic_test with the kafka-topics.sh script:

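A sketch of the create command (the partition and replication-factor values here are my choice; Kafka 2.7 also still accepts the older --zookeeper localhost:2181 form):

./kafka-topics.sh --bootstrap-server localhost:9092 --create --topic topic_test --partitions 1 --replication-factor 1

On success the script prints a confirmation along the lines of "Created topic topic_test".
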
Then we start a consumer on the command line with kafka-console-consumer.sh to consume the topic_test topic, as follows:

./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic_test

Next, we start a command-line producer with kafka-console-producer.sh to produce data into the topic_test topic, as follows:

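A sketch of the producer command (Kafka 2.7 accepts --bootstrap-server here; older releases use --broker-list instead):

./kafka-console-producer.sh --bootstrap-server localhost:9092 --topic topic_test
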
When we type a message and press enter, the message is sent to the topic_test topic.


After entering messages and pressing enter, we can see them arrive at the consumer.
With that verified, Lao Zhou will analyze the Kafka broker source code piece by piece in subsequent articles. Stay tuned~
