Practice of database synchronization with Canal + Kafka


In a microservice architecture, each service owns its database, so services often need to exchange data. For example, the data in service B's database comes from service A's database; when service A's data changes, the change must be synchronized to service B.

The first solution:

Whenever service A's code writes the relevant data, it also calls an interface of service B, and service B writes the data to the new database. This looks simple, but it has many pitfalls. Service A accumulates a lot of interface-calling synchronization code, which makes the project more complex and harder to maintain over time. Interface calls are also not a reliable transport: there is no retry mechanism and no record of the synchronization position, and the business code has to decide how to handle failed calls, sudden bursts of calls, and so on. That is a lot of work, so we ruled this solution out.

The second solution:

Synchronize through the database's binlog. This solution is independent of service A and introduces no code coupling with it. Data is transmitted directly over a TCP connection, which is better than interface calls. It is a mature production approach and many binlog synchronization middleware tools exist, so our focus was on which tool would best give us a stable, sufficiently fast, and easy-to-deploy high-availability solution.

After investigation, we chose canal. canal is Alibaba's MySQL binlog incremental subscription and consumption component, and it has been proven in production. It can also be combined with other common middleware such as Kafka and Elasticsearch, and canal-go, its Go client library, meets our needs in Go. For other details, refer to the canal GitHub home page.

Schematic diagram

OK, let's do it! We need to synchronize the data changes of database A to database B. Following the wiki, I quickly started a canal-server service with Docker and wrote the canal-client logic directly with canal-go. canal-go connects directly to canal-server; canal-server and canal-client communicate over a socket. The transport protocol is TCP, and the interaction protocol is Google Protocol Buffers 3.0.


1. Canal connects to database A and poses as a slave

2. canal-client establishes a connection with Canal and subscribes to the corresponding database tables

3. Database A changes and writes to the binlog. Canal sends a dump request to the database, obtains and parses the binlog, and sends the parsed data to canal-client

4. canal-client receives the data and synchronizes it to the new database

Protocol Buffers serialization is very fast. After deserialization we get the data of each row, stored in an array as field name and field value pairs. The code is simple:

func Handler(entry protocol.Entry) {
    var keys []string
    rowChange := &protocol.RowChange{}
    if err := proto.Unmarshal(entry.GetStoreValue(), rowChange); err != nil {
        return // skip entries that cannot be decoded
    }
    eventType := rowChange.GetEventType()
    for _, rowData := range rowChange.GetRowDatas() { // traverse each row of data
        var columns []*protocol.Column
        if eventType == protocol.EventType_DELETE || eventType == protocol.EventType_UPDATE {
            columns = rowData.GetBeforeColumns() // column values before the change
        } else if eventType == protocol.EventType_INSERT {
            columns = rowData.GetAfterColumns() // column values after the change
        }
        for _, column := range columns {
            keys = append(keys, column.GetName()) // collect the field names
        }
    }
}
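
The last step of the flow, writing a parsed row into the new database, might look roughly like the sketch below. The `Column` struct and the `buildInsert` helper are hypothetical stand-ins invented for illustration; they are not canal-go types, and the article does not show how the author actually builds the SQL. The sketch only shows how field name and field value pairs could be turned into a parameterized statement.

```go
package main

import (
	"fmt"
	"strings"
)

// Column is a hypothetical stand-in for one parsed binlog column
// (field name plus field value); it is not the canal-go type.
type Column struct {
	Name  string
	Value string
}

// buildInsert turns the columns of one row into a parameterized INSERT
// statement plus its argument list, keeping the column order stable.
func buildInsert(table string, cols []Column) (string, []interface{}) {
	names := make([]string, 0, len(cols))
	marks := make([]string, 0, len(cols))
	args := make([]interface{}, 0, len(cols))
	for _, c := range cols {
		names = append(names, "`"+c.Name+"`")
		marks = append(marks, "?")
		args = append(args, c.Value)
	}
	query := fmt.Sprintf("INSERT INTO `%s` (%s) VALUES (%s)",
		table, strings.Join(names, ", "), strings.Join(marks, ", "))
	return query, args
}

func main() {
	query, args := buildInsert("user", []Column{
		{Name: "id", Value: "1"},
		{Name: "name", Value: "alice"},
	})
	fmt.Println(query)
	fmt.Println(args...)
}
```

Passing values as placeholders rather than concatenating them into the statement keeps the replayed data from being interpreted as SQL.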

Problems encountered

For high availability and higher performance, we create multiple canal-client instances that form a cluster to parse the data and synchronize it to the new database. This raises an important question: how do we guarantee the order in which the canal-client cluster consumes the binlog?

Our binlog uses row mode, and each write operation produces a binlog entry. A simple example: insert a record a and modify it immediately. Two messages are sent to canal-client. If, because of the network or other reasons, the update message is processed before the insert message, the record does not exist yet and the update ends up having no effect.
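
The ordering problem can be reproduced with a small in-memory sketch. The `Event` type and the `apply` function are hypothetical simplifications, not canal-go types; an UPDATE on a missing row is modeled as a no-op, which is an assumption about how the target database behaves.

```go
package main

import "fmt"

// Event is a hypothetical, simplified row-change message.
type Event struct {
	Type  string // "INSERT" or "UPDATE"
	ID    int
	Value string
}

// apply replays events against an in-memory table. An UPDATE for a row
// that does not exist yet changes nothing.
func apply(events []Event) map[int]string {
	table := make(map[int]string)
	for _, e := range events {
		switch e.Type {
		case "INSERT":
			table[e.ID] = e.Value
		case "UPDATE":
			if _, ok := table[e.ID]; ok {
				table[e.ID] = e.Value
			}
		}
	}
	return table
}

func main() {
	insert := Event{Type: "INSERT", ID: 1, Value: "a"}
	update := Event{Type: "UPDATE", ID: 1, Value: "a2"}

	inOrder := apply([]Event{insert, update})
	reordered := apply([]Event{update, insert})

	fmt.Println("in order:", inOrder[1])    // the row holds the updated value a2
	fmt.Println("reordered:", reordered[1]) // the update was lost, the row holds a
}
```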

What can we do? Canal can be combined with a message queue and supports Kafka, RabbitMQ, and RocketMQ, which is excellent. We enforce message order at the message queue layer (how is described later).

Choose Canal + Kafka scheme

We chose the industry benchmark for message queues: Kafka. UCloud provides Kafka and RocketMQ message queue products, which make it quick and easy to build a message queue system, speeding up development and simplifying operation and maintenance.

Now let’s explore it

① Select Kafka message queue product and apply for activation

② After activation, create a Kafka cluster in the management console and select the hardware configuration that fits your needs

A Kafka + ZooKeeper cluster is built. Awesome!

The console also includes node management, topic management, and consumer group management, so it is very convenient to modify the configuration directly in the console.

On the monitoring side, the data includes Kafka production and consumption QPS, cluster monitoring, and ZooKeeper monitoring, giving a fairly complete set of monitoring metrics.

Kafka configuration of canal

Configuring Canal to work with Kafka is also very simple. vi /usr/local/canal/conf/

# ...
# Options: tcp (default), kafka, RocketMQ
canal.serverMode = kafka
# ...
# Kafka / RocketMQ cluster configuration
canal.mq.servers = ...
canal.mq.retries = 0
# In flagMessage mode, this value can be increased, but it should not exceed the upper limit of the MQ message body size
canal.mq.batchSize = 16384
canal.mq.maxRequestSize = 1048576
# In flatMessage mode, please increase this value; 50-200 is recommended
canal.mq.lingerMs = 1
canal.mq.bufferMemory = 33554432
# Canal batch size, 50K by default; because of Kafka's maximum message body limit, do not exceed 1M (keep it below 900K)
canal.mq.canalBatchSize = 50
# Timeout for Canal get requests, in ms; empty means no timeout
canal.mq.canalGetTimeout = 100
# Whether messages are flat JSON objects
canal.mq.flatMessage = false
canal.mq.compressionType = none
canal.mq.acks = all
# Whether Kafka message delivery uses transactions
canal.mq.transaction = false

# mq config
# dynamic topic route by schema or table regex,mytest2..*,.*..*
# hash partition config

Details are as follows:

Solving the problem of sequential consumption

See the configuration line below

We configured Kafka partition hashing, and each topic corresponds to one table. As a result, the data of a table is only ever pushed to one fixed partition, then delivered to a consumer for processing and synchronized to the new database. This solves the binlog ordering problem described earlier. Even if we deploy multiple Kafka consumers as a cluster, within a consumer group each partition is consumed by only one consumer, so the data of one table is always processed by the same consumer in order. The price is that a single table is no longer processed in parallel. Still, I personally feel that with Kafka's powerful processing architecture, Kafka is unlikely to become a bottleneck for our business. Our goal is also not real-time consistency: within a certain delay, the two databases reach eventual consistency.
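
The idea of pinning a table to one partition can be sketched as follows. This is only an illustration under stated assumptions: `partitionFor`, `route`, and `Message` are hypothetical names, and FNV-1a stands in for whatever hash Canal actually applies, which the article does not specify.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a table name to a partition index. FNV-1a is used only
// for illustration; it is not claimed to be the hash Canal itself uses.
func partitionFor(table string, partitions int) int {
	h := fnv.New32a()
	h.Write([]byte(table))
	return int(h.Sum32() % uint32(partitions))
}

// Message is a hypothetical binlog event tagged with a per-table sequence number.
type Message struct {
	Table string
	Seq   int
}

// route appends each message to the queue of its table's partition, so the
// messages of one table stay together and keep their relative order.
func route(msgs []Message, partitions int) [][]Message {
	queues := make([][]Message, partitions)
	for _, m := range msgs {
		p := partitionFor(m.Table, partitions)
		queues[p] = append(queues[p], m)
	}
	return queues
}

func main() {
	msgs := []Message{
		{Table: "user", Seq: 1}, {Table: "order", Seq: 1},
		{Table: "user", Seq: 2}, {Table: "order", Seq: 2},
		{Table: "user", Seq: 3},
	}
	for p, q := range route(msgs, 3) {
		fmt.Println("partition", p, q)
	}
}
```

Because the partition depends only on the table name, two tables may share a partition, but one table is never spread across several, which is what preserves the per-table order.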

The figure below shows the final synchronization architecture. Every service node is deployed as a cluster, and all of them run on UCloud's UK8S service, which ensures the high availability of the service nodes.

Canal also runs as a cluster with failover, but only one Canal instance processes the binlog at any given time; the others are standbys. When the working Canal instance goes down, one of the standbys switches to the working state. Only one Canal instance can work at a time because the binlog must be read sequentially.
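
The active/standby behavior can be modeled with a toy sketch like the one below. The `Cluster` type and its methods are hypothetical and exist only to illustrate the "exactly one worker" rule; the sketch leaves out the coordination that a real Canal deployment needs to agree on which instance is active.

```go
package main

import "fmt"

// Cluster is a hypothetical model of active/standby instances:
// at most one healthy instance is active at any time.
type Cluster struct {
	nodes  []string
	up     map[string]bool
	active string
}

// NewCluster marks every node healthy and elects the first one.
func NewCluster(nodes ...string) *Cluster {
	c := &Cluster{nodes: nodes, up: make(map[string]bool)}
	for _, n := range nodes {
		c.up[n] = true
	}
	c.elect()
	return c
}

// elect picks the first healthy node as the single active instance.
func (c *Cluster) elect() {
	c.active = ""
	for _, n := range c.nodes {
		if c.up[n] {
			c.active = n
			return
		}
	}
}

// Fail marks a node as down; a standby takes over only if the failed
// node was the active one.
func (c *Cluster) Fail(node string) {
	c.up[node] = false
	if c.active == node {
		c.elect()
	}
}

// Active returns the instance currently allowed to read the binlog.
func (c *Cluster) Active() string { return c.active }

func main() {
	c := NewCluster("canal-1", "canal-2", "canal-3")
	fmt.Println("active:", c.Active())
	c.Fail("canal-1")
	fmt.Println("active after failure:", c.Active())
}
```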

In addition, we use this architecture to synchronize cache invalidation. The cache pattern we use is Cache-Aside. As before, invalidating the cache at every place in the code where data changes would complicate the code. So, on top of the architecture above, we moved the complex logic that triggers cache invalidation into the kafka-client, which achieves the decoupling we wanted.

At present, this synchronization architecture is running normally. If problems come up in the future, we will continue to update it.
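
A minimal sketch of consumer-side invalidation is shown below. Everything in it is an assumption made for illustration: the in-memory `Cache` stands in for a real cache, `RowChange` is a simplified change message, and the `<table>:<primary key>` key scheme in `cacheKey` is invented, since the article does not describe the real one.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a tiny in-memory stand-in for a real cache service.
type Cache struct {
	mu   sync.Mutex
	data map[string]string
}

func NewCache() *Cache { return &Cache{data: make(map[string]string)} }

func (c *Cache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.data[key]
	return v, ok
}

func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

func (c *Cache) Delete(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.data, key)
}

// RowChange is a hypothetical, simplified change message read from Kafka.
type RowChange struct {
	Table string
	ID    string
}

// cacheKey is an assumed key scheme: "<table>:<primary key>".
func cacheKey(table, id string) string { return table + ":" + id }

// invalidate drops the cached entry of a changed row; with Cache-Aside,
// the next read misses and reloads the value from the database.
func invalidate(c *Cache, change RowChange) {
	c.Delete(cacheKey(change.Table, change.ID))
}

func main() {
	cache := NewCache()
	cache.Set(cacheKey("user", "1"), "alice")
	invalidate(cache, RowChange{Table: "user", ID: "1"})
	_, ok := cache.Get(cacheKey("user", "1"))
	fmt.Println("still cached after change:", ok)
}
```

With this arrangement the business services never touch the cache on writes; they only write to the database, and the binlog drives the invalidation.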

Author: Carly, application R&D engineer at UCloud