EMQ X Enterprise (MQTT broker) + Apache Kafka: building a high-performance IoT message processing backend

Time:2020-5-22

Background

In Internet of Things (IoT) projects, the messages generated by devices are not only exchanged between devices but are also consumed by business systems to implement functions such as security auditing, traffic billing, data statistics, and notification triggering. These requirements can be met with a prototype system like the following:

[Figure: prototype system in which each business link obtains message data directly from EMQ X]

In this prototype, EMQ X must maintain a separate data channel for each business link so that each link can obtain message data from EMQ X according to its own needs. This approach has several problems:

  • Each business link needs its own data channel to EMQ X, and establishing and maintaining these channels consumes additional resources. Slow data synchronization can seriously degrade the high-speed message exchange of EMQ X;
  • As the business grows, every new business link requires changes to the whole system;
  • Because the links process messages at different speeds and at different times, some of them will block when the message volume is large, which can lead to data loss, reduced system stability, and other serious consequences.

These problems closely resemble those encountered in Internet applications today, namely data integration and data synchronization between multiple business systems. Internet applications generally introduce a message queue for peak shaving, rate limiting, and queued processing, which decouples data from business logic. EMQ X provides bridging to message and stream middleware such as RabbitMQ, Kafka, RocketMQ, and Pulsar, so IoT projects can use the same model to solve the problems above.

Taking a common IoT scenario as an example, this article describes how to use the EMQ X message middleware together with the open-source stream processing platform Kafka to process massive volumes of IoT message data: storing the data streams in a highly reliable, fault-tolerant way, preserving the order of the streams, and efficiently providing the message data to multiple business links.

Business scenario

Suppose there is a smart door lock project in which every lock reports its status once a minute and whenever its state changes (for example, when it is opened or closed). The MQTT topic for these reports is as follows (QoS = 1):

devices/{client_id}/state

Each device sends its data as JSON, including the battery level, the unlocking status, and the operation result. The content looks like this:

{
  "process_id": "7802441525528958",
  "action": "unlock",
  "battery": 83.4,
  "lock_state": 1,
  "version": 1.1,
  "client_id": "10083618796833171"
}
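
As a rough illustration of how a business system might handle such a report, the following Python sketch extracts the lock ID from the topic and parses the JSON payload. The function name `parse_state_report` and the rule that the `client_id` in the payload must match the topic are assumptions made for this example; they are not part of EMQ X or of the original scheme.

```python
import json
import re

# Topic layout taken from the article: devices/{client_id}/state
STATE_TOPIC = re.compile(r"^devices/([^/+#]+)/state$")


def parse_state_report(topic: str, payload: str) -> dict:
    """Return the decoded report, or raise ValueError if it is malformed."""
    match = STATE_TOPIC.match(topic)
    if match is None:
        raise ValueError(f"not a state topic: {topic}")
    # json.JSONDecodeError is a subclass of ValueError, so bad JSON also raises.
    report = json.loads(payload)
    # Hypothetical consistency rule: the ID in the topic must match the payload.
    if report.get("client_id") != match.group(1):
        raise ValueError("client_id in payload does not match topic")
    return report


sample = """{
  "process_id": "7802441525528958",
  "action": "unlock",
  "battery": 83.4,
  "lock_state": 1,
  "version": 1.1,
  "client_id": "10083618796833171"
}"""

report = parse_state_report("devices/10083618796833171/state", sample)
print(report["action"], report["battery"])
```

Rejecting a report whose topic and payload disagree is only one possible policy; a real system might log and keep such messages for the audit trail instead.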

Each door lock also subscribes to its own unique topic to receive remote unlocking commands. The MQTT topic for issued commands is as follows (QoS = 1):

devices/{client_id}/command

The issued data includes the unlocking instruction and the message encryption and verification information:

{
  "process_id": "7802441525528958",
  "action": "unlock",
  "nonce_str": "u7u4p0n8",
  "ts": 1574744434,
  "sign": "e9f5af7deaa28563"
}

The uplink and downlink message data are used by the following three business links:

  • Message notification: notify the users bound to the lock of the unlocking status through the configured notification method (SMS or email);
  • Status monitoring: analyze and process the status information that the door lock reports periodically. If the battery level or status is abnormal, trigger an alarm to inform the user;
  • Security audit: analyze the uplink and downlink message data, record users' unlocking behavior, and guard against tampering with or replay of downlink instructions.
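
To make the replay-protection part of the security audit more concrete, here is a minimal Python sketch that uses the `nonce_str` and `ts` fields of the command payload. The class name `ReplayGuard` and the 300-second window are hypothetical choices for this example. Verifying the `sign` field is deliberately left out, because the article does not specify the signing algorithm.

```python
import time
from typing import Optional


class ReplayGuard:
    """Reject commands whose nonce was already seen or whose ts is too old."""

    def __init__(self, max_age_seconds: int = 300):
        self.max_age_seconds = max_age_seconds  # hypothetical window
        self._seen = {}  # nonce_str -> ts of the command that carried it

    def accept(self, command: dict, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        ts = command["ts"]
        nonce = command["nonce_str"]
        # Forget nonces that fell out of the window so memory stays bounded.
        self._seen = {n: t for n, t in self._seen.items()
                      if now - t <= self.max_age_seconds}
        if abs(now - ts) > self.max_age_seconds:
            return False  # stale or future-dated command
        if nonce in self._seen:
            return False  # same nonce inside the window: treat as replay
        self._seen[nonce] = ts
        return True


guard = ReplayGuard(max_age_seconds=300)
cmd = {"process_id": "7802441525528958", "action": "unlock",
       "nonce_str": "u7u4p0n8", "ts": 1574744434, "sign": "e9f5af7deaa28563"}
print(guard.accept(cmd, now=1574744440))  # True: first sighting
print(guard.accept(cmd, now=1574744441))  # False: same nonce again
```

Because a replayed command carries its original `ts`, a nonce that has been pruned from memory is still rejected later by the age check.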

In this scheme, EMQ X bridges the messages on the topics above to Kafka for use by the business systems, which decouples the business systems from EMQ X.

client_id is the door lock ID, which is also the MQTT client ID that the door lock uses to connect to EMQ X.

Scheme introduction

Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The goal of the project is to provide a unified, high-throughput, low-latency platform for processing real-time data.

Kafka has the following characteristics:

  • High throughput: throughput on the order of hundreds of thousands of messages per second under high concurrency, with thousands of clients reading and writing at the same time;
  • Low latency: latency can be as low as a few milliseconds, which makes it easy to build real-time streaming applications;
  • Data reliability: message data is stored securely in a distributed manner and replicated across a fault-tolerant cluster, processed strictly in queue order, and backed by message transaction support to guarantee data integrity and consumption reliability;
  • Cluster fault tolerance: with n replicas across multiple nodes, up to n-1 nodes are allowed to fail;
  • Scalability: the cluster can be expanded dynamically.

In this scheme, Kafka is integrated to provide a message queue and message bus for delivering messages between the EMQ X message server and the applications. The producer (EMQ X) appends data to the end of the queue, and each consumer (business link) reads the data in order and processes it independently. This architecture balances performance and data reliability, reduces system complexity, and improves scalability. The prototype of the scheme is as follows:

[Figure: prototype of the scheme, with EMQ X bridging messages to Kafka for consumption by the business links]
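
The queue model described above, in which one producer appends to the end and each consumer reads in order at its own pace, can be illustrated with a toy in-memory log. This is only a conceptual sketch in Python: the class `TopicLog` and the group names are invented for the example, and real Kafka adds partitions, replication, and durable storage.

```python
class TopicLog:
    """Toy stand-in for one Kafka topic partition: an append-only list."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # consumer group name -> next offset to read

    def append(self, record):
        """Producer side: add a record to the end and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Consumer side: read the next records for this group, in order."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for i in range(3):
    log.append({"seq": i})

# A fast consumer and a slow consumer read the same records independently.
print(log.poll("notification"))           # all three records
print(log.poll("audit", max_records=1))   # only the first record so far
print(log.poll("audit", max_records=10))  # the remaining two records
```

Because each group keeps its own offset, a slow business link falls behind without blocking the producer or the other links, which is the decoupling this scheme relies on.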

EMQ X Enterprise installation

Install

If you are new to EMQ X, it is recommended to get started quickly with the EMQ X guide.

Visit the EMQ website to download the installation package for your operating system. Because data persistence is an enterprise feature, you need to download EMQ X Enterprise Edition (you can apply for a trial license). At the time of writing, the latest version of EMQ X Enterprise is v3.4.4. The steps for installing from the downloaded zip package are as follows:

## Unzip the downloaded installation package
unzip emqx-ee-macosx-v3.4.4.zip
cd emqx

## Copy the license file to the etc/ directory used by EMQ X; the license must be obtained through a trial application or a purchase
cp ../emqx.lic ./etc

## Start EMQ X in console mode
./bin/emqx console

Modify configuration

The configuration files needed in this article are as follows:

  1. The EMQ X Enterprise license file, to be overwritten with a valid license:
etc/emqx.lic
  2. The configuration file of the EMQ X Kafka bridge plug-in, used to configure the Kafka connection information and the topics to bridge:
etc/plugins/emqx_bridge_kafka.conf

Fill in the plug-in configuration as shown below according to your actual deployment. For other configuration items, read the configuration file carefully and adjust them, or use the defaults.

## Kafka connection address
bridge.kafka.servers = 127.0.0.1:9092

## Hooks to process. Because messages are transmitted with QoS 1, the acked hook can be used.
## Comment out the other, unrelated event and message hooks.

## bridge.kafka.hook.client.connected.1     = {"topic":"client_connected"}
## bridge.kafka.hook.client.disconnected.1  = {"topic":"client_disconnected"}
## bridge.kafka.hook.session.subscribed.1   = {"filter":"#", "topic":"session_subscribed"}
## bridge.kafka.hook.session.unsubscribed.1 = {"filter":"#", "topic":"session_unsubscribed"}
## bridge.kafka.hook.message.deliver.1      = {"filter":"#", "topic":"message_deliver"}

## "filter" is the MQTT topic to process and "topic" is the Kafka topic to write to.
## Register multiple hooks to process uplink and downlink messages.

## Use the publish hook for reported state
bridge.kafka.hook.message.publish.1        = {"filter":"devices/+/state", "topic":"message_state"}

## Use the acked hook for issued commands, so that a message is stored only after it has been delivered
bridge.kafka.hook.message.acked.1       = {"filter":"devices/+/command", "topic":"message_command"}
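
To show how a filter such as devices/+/state selects MQTT topics, the following Python sketch implements standard MQTT wildcard matching (`+` for exactly one level, `#` for all remaining levels) and maps a topic to the Kafka topic named in the two rules above. The helper names `topic_matches` and `kafka_topic_for` are made up for this example. It does not reproduce the plug-in's internals, and it ignores both the hook type (publish versus acked) and the special handling of topics that begin with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """MQTT wildcard matching: '+' is one level, '#' is the remaining levels."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # '#' matches the parent level and everything below
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    return len(filter_levels) == len(topic_levels)


# Mirrors the two hook rules in the configuration above.
RULES = [
    ("devices/+/state", "message_state"),
    ("devices/+/command", "message_command"),
]


def kafka_topic_for(mqtt_topic: str):
    """Return the Kafka topic for an MQTT topic, or None if it is not bridged."""
    for topic_filter, kafka_topic in RULES:
        if topic_matches(topic_filter, mqtt_topic):
            return kafka_topic
    return None


print(kafka_topic_for("devices/10083618796833171/state"))
print(kafka_topic_for("devices/10083618796833171/other"))
```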

Kafka installation and initialization

Install Kafka with Docker and map port 9092 for connections. Kafka depends on ZooKeeper, so the complete installation commands are given below:

## Install ZooKeeper
docker run -d --name zookeeper -p 2181 -t wurstmeister/zookeeper

## Install and configure Kafka
docker run -d --name kafka --publish 9092:9092 \
        --link zookeeper --env KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
        --env KAFKA_ADVERTISED_HOST_NAME=127.0.0.1 \
        --env KAFKA_ADVERTISED_PORT=9092 \
        wurstmeister/kafka:latest

Pre-create the topics to be used in Kafka:

## Enter the Kafka Docker container
docker exec -it kafka bash

## Uplink data topic: message_state
kafka-topics.sh --create --zookeeper zookeeper:2181 --replication-factor 1 --partitions 1 --topic message_state

## Downlink data topic: message_command
kafka-topics.sh --create --zookeeper zookeeper:2181 --replication-factor 1 --partitions 1 --topic message_command

At this point, you can restart EMQ X and load the plug-in to apply the configuration above:

./bin/emqx stop

./bin/emqx start

## Or use console mode to see more information
./bin/emqx console

## Load the plug-in
./bin/emqx_ctl plugins load emqx_bridge_kafka

## The following prompt is shown after the plug-in loads successfully
Plugin load emqx_bridge_kafka loaded successfully.

Simulation test

Start consuming with the Kafka console consumer

The detailed implementation of the three business links is beyond the scope of this article. Here you only need to confirm that messages are written to Kafka, which you can do by viewing the data in the topics with Kafka's built-in console consumer:

## Enter the Kafka Docker container
docker exec -it kafka bash

## Uplink data topic
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic message_state --from-beginning

## Open another window to view the downlink data topic
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic message_command --from-beginning

Once started, each command blocks and waits for data on its topic. Leave them running and continue with the following steps.

Sending and receiving simulated test data

The WebSocket tool in the EMQ X management console can simulate the uplink and downlink business data of a smart door lock. Open http://127.0.0.1:18083 in a browser to enter the EMQ X management console, open Tool -> WebSocket, and enter the connection information to establish an MQTT connection that simulates a door lock device. Set the Client ID in the connection information according to the business specification; this article uses 10083618796833171.

Subscribe to the downlink control topic

According to the business requirements, the door lock must subscribe to its own downlink control topic devices/{client_id}/command. Here, subscribe to the topic devices/10083618796833171/command and set QoS = 1.

[Screenshot: subscribing to the downlink control topic in the WebSocket tool]

Simulate issuing a command

Send an unlocking command to the door lock control topic devices/{client_id}/command. The data published here is:

  • Topic: devices/10083618796833171/command
  • QoS: 1
  • Payload:

    {
      "process_id": "7802441525528958",
      "action": "unlock",
      "nonce_str": "u7u4p0n8",
      "ts": 1574744434,
      "sign": "e9f5af7deaa28563"
    }

After the command is published successfully, the Publish interface of the management console shows the received message:

[Screenshot: message received in the management console]

At the same time, the consumer of the Kafka message_command topic receives one or more messages (the EMQ X acked hook fires once for each client that actually receives the message). Each message is in JSON format; formatted, the content is as follows:

{
  "client_id": "10083618796833171",
  "username": "",
  "from": "10083618796833171",
  "topic": "devices/10083618796833171/command",
  "payload": "eyAgICJwcm9jZXNzX2lkIjogIjc4MDI0NDE1MjU1Mjg5NTgiLCAgICJhY3Rpb24iOiAidW5sb2NrIiwgICAibm9uY2Vfc3RyIjogInU3dTRwMG44IiwgICAidHMiOiAxNTc0NzQ0NDM0LCAgICJzaWduIjogImU5ZjVhZjdkZWFhMjg1NjMiIH0=",
  "qos": 1,
  "node": "[email protected]",
  "ts": 1574751635845
}

This message contains information about the MQTT receiving and publishing clients, plus the Base64-encoded payload:

  • client_id: client ID of the receiving client
  • username: username of the receiving client
  • from: client ID of the publishing client
  • topic: target topic the message was published to
  • payload: message payload, Base64-encoded
  • qos: QoS of the message
  • node: node that processed the message
  • ts: timestamp, in milliseconds, at which the hook was triggered
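
A consumer has to undo the Base64 encoding before it can read the original device JSON. The Python sketch below shows one way to do that with the standard library. The function name `decode_bridged_message` is an assumption for this example, and the sample record is built in the code with a shortened payload rather than copied from the Kafka output.

```python
import base64
import json


def decode_bridged_message(raw: str) -> dict:
    """Parse the Kafka record value and replace 'payload' with decoded JSON."""
    envelope = json.loads(raw)
    payload_bytes = base64.b64decode(envelope["payload"])
    envelope["payload"] = json.loads(payload_bytes.decode("utf-8"))
    return envelope


# Build a record shaped like the bridged message (shortened payload).
command = {"process_id": "7802441525528958", "action": "unlock"}
raw_record = json.dumps({
    "client_id": "10083618796833171",
    "from": "10083618796833171",
    "topic": "devices/10083618796833171/command",
    "payload": base64.b64encode(json.dumps(command).encode("utf-8")).decode("ascii"),
    "qos": 1,
    "ts": 1574751635845,
})

message = decode_bridged_message(raw_record)
print(message["topic"], message["payload"]["action"])
```

This assumes the device payload is itself JSON, as in this scenario; for binary payloads the consumer would stop after `base64.b64decode`.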

Simulate reporting status

Publish status data to the door lock state topic devices/{client_id}/state. The data published here is:

  • Topic: devices/10083618796833171/state
  • QoS: 1
  • Payload:

    {
      "process_id": "7802441525528958",
      "action": "unlock",
      "battery": 83.4,
      "lock_state": 1,
      "version": 1.1,
      "client_id": "10083618796833171"
    }

After the report is published successfully, the consumer of the Kafka message_state topic receives one message (the EMQ X publish hook is triggered by the publish itself, regardless of whether the topic has subscribers or how many there are). The message is in JSON format; formatted, the content is as follows:

{
  "client_id": "10083618796833171",
  "username": "",
  "topic": "devices/10083618796833171/state",
  "payload": "eyAgICJwcm9jZXNzX2lkIjogIjc4MDI0NDE1MjU1Mjg5NTgiLCAgICJhY3Rpb24iOiAidW5sb2NrIiwgICAiYmF0dGVyeSI6IDgzLjQsICAgImxvY2tfc3RhdGUiOiAxLCAgICJ2ZXJzaW9uIjogMS4xLCAgICJjbGllbnRfaWQiOiAiMTAwODM2MTg3OTY4MzMxNzEiIH0=",
  "qos": 1,
  "node": "[email protected]",
  "ts": 1574753026269
}

This message contains only information about the MQTT publishing client, plus the Base64-encoded payload:

  • client_id: client ID of the publishing client
  • username: username of the publishing client
  • topic: target topic the message was published to
  • payload: message payload, Base64-encoded
  • qos: QoS of the message
  • node: node that processed the message
  • ts: timestamp, in milliseconds, at which the hook was triggered

So far, we have completed all the steps for bridging messages from EMQ X to Kafka. Once a business system is connected to Kafka, it can make business decisions based on the number of messages consumed, the client_id of the message publisher or subscriber, and the content of the message payload, and thereby implement the required business functions.
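
As one example of such a business decision, the status monitoring link could apply simple rules to each decoded state report. The Python sketch below is purely illustrative: the 20 percent battery threshold, the assumption that `lock_state` is either 0 or 1, and the function name `check_state` are all invented for this example and are not defined by the article.

```python
LOW_BATTERY_PERCENT = 20.0  # hypothetical alarm threshold
VALID_LOCK_STATES = (0, 1)  # hypothetical: the article only shows the value 1


def check_state(report: dict) -> list:
    """Return a list of alert strings for one decoded state report."""
    alerts = []
    battery = report.get("battery")
    if battery is None:
        alerts.append("missing battery reading")
    elif battery < LOW_BATTERY_PERCENT:
        alerts.append(f"low battery: {battery}%")
    lock_state = report.get("lock_state")
    if lock_state not in VALID_LOCK_STATES:
        alerts.append(f"unexpected lock_state: {lock_state}")
    return alerts


healthy = {"battery": 83.4, "lock_state": 1, "client_id": "10083618796833171"}
weak = {"battery": 12.5, "lock_state": 1, "client_id": "10083618796833171"}
print(check_state(healthy))  # []
print(check_state(weak))     # ['low battery: 12.5%']
```

Returning a list of alerts rather than raising keeps the rule evaluation separate from whatever notification channel (SMS, email) the business link uses.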

Performance testing

If you are interested in the performance of this scheme, you can test it with the mqtt-jmeter plug-in. Note that during performance testing you need to make sure the EMQ X cluster, the Kafka cluster, the Kafka consumers, and the JMeter test cluster are all properly tuned and configured, so that the results reflect the best performance achievable under that configuration.

Summary

This article has shown the role that the EMQ X + Kafka IoT message processing scheme plays in message communication and business processing. With this scheme, you can build a loosely coupled, high-performance, fault-tolerant IoT message processing platform that processes data efficiently and safely.

This article does not implement specific business logic; readers can extend the scheme based on the business prototype and system architecture presented here. Because the integration approach for the other message and stream processing middleware supported by EMQ X, such as RabbitMQ, RocketMQ, and Pulsar, is similar to that for Kafka, readers can also use this article as a reference and freely choose the components that suit their own technology stack.


For more information, please visit our website emqx.io or follow our open-source project at github.com/emqx/emqx. For detailed documentation, please see the official documentation.