Kafka Connector Deep Dive: the JDBC Source Connector

Date: 2021-10-11

Abstract

Our project needs to use Kafka Streams to load data from a MySQL database and then perform ETL-style filtering. The approach: Kafka Connect imports the MySQL data into a Kafka topic, and the records in that topic are then filtered and de-duplicated.

Content

1、 Kafka installation

  • Kafka Connect ships with Apache Kafka (it has been included since version 0.9, so any 1.0+ release has it). First, check whether your installation supports Connect: intuitively, the bin directory should contain the connect-* scripts and the config directory the connect-* properties files (a quick command-line check follows the screenshots below);

Bin directory: (screenshot)

Config directory: (screenshot)
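A quick way to verify this from the Kafka installation directory (a minimal sketch; the file names shown are those shipped with Kafka 1.0.x):

# the Connect scripts and sample worker configs ship with the Kafka distribution
ls bin | grep connect       # connect-standalone.sh, connect-distributed.sh
ls config | grep connect    # connect-standalone.properties, connect-distributed.properties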

  • We use kafka_2.11-1.0.1, where 2.11 is the Scala version and 1.0.1 is the Kafka version (both are encoded in the core jar name under libs, as shown below);
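For example, the version is visible in the name of the core jar under libs (output trimmed to the relevant jar):

# naming convention: kafka_<scala-version>-<kafka-version>.jar
ls libs/ | grep kafka_2.11
# kafka_2.11-1.0.1.jar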

2、 Download the Kafka Connect JDBC plugin

Go to https://www.confluent.io/hub/… and download it;
Select the corresponding version and download it. Extracting the archive gives the plugin's directory structure, which includes lib and etc directories (screenshots omitted).

Copy the jar files from the plugin's lib directory into Kafka's libs directory:
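For example (a sketch; the extracted plugin directory name and the Kafka install path are assumptions, adjust them to your environment):

# put the connector jar and its dependencies on Kafka's classpath
cp confluentinc-kafka-connect-jdbc-*/lib/*.jar /opt/kafka_2.11-1.0.1/libs/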

3、 Copy the MySQL JDBC driver to Kafka's libs directory

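The driver is not bundled with the plugin, so it has to be added separately. For example (the driver version and paths are illustrative):

# the MySQL JDBC driver (Connector/J) must also be on Kafka's classpath
cp mysql-connector-java-5.1.47.jar /opt/kafka_2.11-1.0.1/libs/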

4、 The connect-mysql-source.properties configuration file

Copy the sample properties file from the plugin's etc directory into Kafka's config directory and rename it to connect-mysql-source.properties;

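For example (a sketch; the sample file name may differ between plugin versions, and the paths are illustrative):

# start from the quickstart config shipped with the plugin and rename it
cp confluentinc-kafka-connect-jdbc-*/etc/source-quickstart-sqlite.properties \
   /opt/kafka_2.11-1.0.1/config/connect-mysql-source.properties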

Modify the configuration to match your local data source:

# A simple example that copies a single table from a MySQL database. The first few settings are
# required for all connectors: a name, the connector class to run, and the maximum number of
# tasks to create:
name=test-source-mysql-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=10
# The remaining configs are specific to the JDBC source connector. Here we connect to a MySQL
# database, whitelist one table, and output to topics prefixed with 'connect-mysql-', e.g. the
# table 'ocm_blacklist_number' is written to the topic 'connect-mysql-ocm_blacklist_number'.
#connection.url=jdbc:mysql://192.168.101.3:3306/databasename?user=xxx&password=xxx
connection.url=jdbc:mysql://127.0.0.1:3306/us_app?user=root&password=root
table.whitelist=ocm_blacklist_number
# bulk mode re-imports the whole table on every poll; incrementing and timestamp modes are also available
mode=bulk
#timestamp.column.name=time
#incrementing.column.name=id
topic.prefix=connect-mysql-

Configuration reference: https://www.jianshu.com/p/9b1…
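If you do not want to re-import the whole table on every poll, the connector also supports incrementing, timestamp, and timestamp+incrementing modes. A minimal sketch, assuming the table has an auto-increment id column and a last-modified column named update_time (both column names are assumptions for illustration):

# incremental load: only rows with a larger id or a newer update_time are fetched on each poll
mode=timestamp+incrementing
incrementing.column.name=id
timestamp.column.name=update_time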

5、 Modify config/connect-standalone.properties in the Kafka directory

bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true

internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
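Note: because the plugin jars were copied into libs in step 2, no extra classpath setting is needed here. As an alternative on Kafka 0.11 and later (a sketch, assuming the plugin was extracted under /opt/connectors), you can leave libs untouched and point Connect at the plugin directory instead:

# load connector plugins from this directory instead of copying jars into libs/
plugin.path=/opt/connectors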

6、 Start Kafka Connect

bin/connect-standalone.sh config/connect-standalone.properties config/connect-mysql-source.properties

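Once the worker is up, you can confirm that the connector and its task are running through the Connect REST API (a sketch, assuming the default rest.port of 8083):

# list registered connectors, then inspect the status of ours
curl http://localhost:8083/connectors
curl http://localhost:8083/connectors/test-source-mysql-jdbc-autoincrement/status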

Note: connect-standalone.sh runs Connect in single-node (standalone) mode. There is also a distributed cluster mode; to use it, modify connect-distributed.properties and start the workers with connect-distributed.sh, as sketched below.
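A sketch of the distributed workflow (host and config values are illustrative): start one or more workers with connect-distributed.sh, then submit the connector configuration as JSON through the REST API instead of a properties file:

bin/connect-distributed.sh config/connect-distributed.properties

curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "test-source-mysql-jdbc-autoincrement",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "10",
    "connection.url": "jdbc:mysql://127.0.0.1:3306/us_app?user=root&password=root",
    "table.whitelist": "ocm_blacklist_number",
    "mode": "bulk",
    "topic.prefix": "connect-mysql-"
  }
}'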

7、 Consume the topic and check whether the import succeeded

Start a console consumer and read the topic connect-mysql-ocm_blacklist_number from the beginning. If you can see records coming through, the connector is configured successfully.

./kafka-console-consumer.sh --zookeeper 127.0.0.1:2181 --topic connect-mysql-ocm_blacklist_number --from-beginning

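Because the worker uses JsonConverter with schemas.enable=true, each record in the topic is wrapped in a schema/payload envelope, roughly like the following (the column name is an assumption; the actual fields mirror the table's columns):

{
  "schema": {
    "type": "struct",
    "name": "ocm_blacklist_number",
    "optional": false,
    "fields": [
      {"field": "number", "type": "string", "optional": true}
    ]
  },
  "payload": {
    "number": "13800000000"
  }
}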

References:

https://blog.csdn.net/u014686…

Kafka Streams reference:
https://www.infoq.cn/article/…