Jiniu Technology Practice Sharing Series
The Jiniu Technology Practice Sharing Series is a regular program of online technical talks that Jiniu organizes together with top venture capital firms and technical experts, aimed at enterprises and engineers.
Each session takes a different technical topic and discusses it in depth with industry experts, focusing on solving practical engineering problems and promoting technical innovation. Sessions start at 20:00 on Wednesday every two weeks. Institutions, enterprises, industry experts, and engineers are all welcome to sign up.
Guest introductions
Zhang Hao, cloud product manager at Tencent
Responsible for product planning, iteration, and performance and experience optimization of Tencent Cloud IaaS-layer products, including message queuing, elastic block storage, and load balancing.
Yan Erhui, senior storage architect at Tencent Cloud
He has long and deep experience in large-scale storage, PaaS, and virtualization. He currently works mainly on the design and development of Internet middleware.
Zhou Weiyue, senior R&D engineer at Tencent Cloud
Responsible for the design and development of the virtualization resource scheduling and operations system for the Tencent Cloud IaaS layer.
Usage scenarios and value of message queues
Analysis of CMQ infrastructure
CMQ vs. open-source RabbitMQ stress test
CMQ case studies and best practices
01 | Usage scenarios of message queues
Decoupling of sending and receiving: the sender and the receiver do not need to know about each other, or even that the other exists;
Hiding platform differences: different platforms interact through messages and only need to care about sending and reading them;
Peak shaving and valley filling, which improves the system's ability to absorb bursts: the sender is never blocked, burst messages are buffered on the CMQ server, and consumers read them at the rate they can actually handle;
One producer, many consumers: a message can be subscribed to by several types of consumers, and the producer only needs to produce it once;
Cross-IDC and WAN transmission: CMQ supports producing and consuming messages across different IDCs and cities, connects clients to the nearest access point automatically, and is transparent to the business.
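The peak-shaving scenario in the list above can be illustrated with a minimal sketch. It uses Python's standard-library `queue.Queue` as an in-memory stand-in for the CMQ server; the function names `burst_producer` and `consume_at_own_pace` are hypothetical, and no CMQ SDK is involved.

```python
import queue

# In-memory stand-in for the message queue server (an assumption for
# illustration; a real CMQ queue is a remote, replicated service).
buffer = queue.Queue()

def burst_producer(n):
    """Send n messages at once; put() on an unbounded Queue does not block."""
    for i in range(n):
        buffer.put(f"order-{i}")

def consume_at_own_pace(batch_size):
    """Pull at most batch_size messages, modelling a consumer's limited capacity."""
    pulled = []
    for _ in range(batch_size):
        try:
            pulled.append(buffer.get_nowait())
        except queue.Empty:
            break
    return pulled

burst_producer(1000)              # traffic spike
first = consume_at_own_pace(100)  # the consumer handles only 100 per round
backlog = buffer.qsize()          # the rest waits in the queue
```

The producer finishes immediately even though the consumer is ten times slower; the difference simply sits in the queue as backlog until the consumer catches up.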
02 | Analysis of CMQ infrastructure
With distributed systems now commonplace, message middleware is widely used to exchange data and decouple components within a system and between platforms. CMQ is a highly reliable, strongly consistent, and scalable distributed message queue developed by Tencent Cloud. It is widely used inside Tencent, including for WeChat and Mobile QQ red packets, Tencent phone credit top-ups, and advertising orders, and it is now offered externally through Tencent Cloud. This article introduces the core technical principles of CMQ.
By usage scenario, message middleware falls roughly into two categories: high reliability and high performance. CMQ mainly targets financial, trading, order, and similar business scenarios that demand high reliability and availability.
As shown in Figure 1, taking Tencent's recharge system as an example, the system uses CMQ to asynchronously decouple the transaction module, the delivery module, and the settlement system. This greatly reduces coupling between modules and also reduces the impact of sudden bursts of requests on the back-end systems. At the beginning of the month, the system transmits more than one billion messages a day through CMQ, with a peak of more than 100,000 messages per second. At the highest point, hundreds of millions of messages are held in the backlog, and CMQ's accumulation capacity buffers the pressure on the back-end consumer modules.
Figure 1 – structure of a recharge system
The overall structure of CMQ is shown in Figure 2. This article focuses on how the back-end broker set is implemented. A set normally consists of three nodes: multiple replicas ensure message reliability, and multiple nodes improve availability. Reliability and availability can be raised further by adding nodes to the set as business requirements dictate. The internal structure of a CMQ set is shown in Figure 3.
Figure 2 – overall architecture of CMQ
Figure 3 – internal structure of broker set
The following sections cover high data reliability, strong consistency, system availability, scalability, and full-path message tracing in turn.
High reliability assurance
Reliability is ensured in three areas: reliable production, reliable storage (accumulation), and reliable consumption.
As shown in Figure 3 above, once more than half of the brokers in the set have flushed a message produced by the client to disk, an acknowledgment is returned to tell the client that production succeeded. If the client does not receive the acknowledgment within a certain period, it must retry so that the message is eventually sent successfully.
One problem introduced by reliable production is message duplication. When the network misbehaves, the CMQ broker may well have stored the message successfully while only the acknowledgment packet was lost. After the client retries, the broker then holds two copies of the same message. Because deduplicating messages on the broker is expensive, the business logic must make message handling idempotent.
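One common way for the business side to achieve idempotency is to key each operation on a business identifier carried in the message body. The following is a minimal sketch under that assumption; the field name `order_id` and the in-memory `processed_orders` set are hypothetical, and a real system would keep the processed keys in durable storage.

```python
processed_orders = set()   # assumption: durable storage in a real system
balances = {"B": 0}

def handle_credit(msg):
    """Apply a credit at most once per order_id, so duplicate deliveries are harmless."""
    if msg["order_id"] in processed_orders:
        return False                      # duplicate caused by a producer retry: skip
    balances[msg["account"]] += msg["amount"]
    processed_orders.add(msg["order_id"])
    return True

msg = {"order_id": "o-1001", "account": "B", "amount": 10}
# The second call simulates the duplicate left on the broker after a retry.
results = [handle_credit(msg), handle_credit(msg)]
```

The key has to come from the business (for example an order number), because a retried send produces a second message that the broker treats as distinct from the first.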
In a CMQ set, one node is the leader and the others are followers. The leader handles production and consumption of all messages. When a produce request reaches the leader, the Raft consensus module appends it to the Raft log in order and flushes it to disk synchronously. At the same time, the leader sends the new Raft log entries, in order, to the followers over the network. Each follower flushes the entry to disk synchronously and returns success. Once the leader has confirmation that more than half of the nodes hold the entry, it commits the request to the MQ state machine, which applies it to the corresponding queue. The overall logic is shown in Figure 4.
Figure 4 – schematic diagram of data storage principle
It follows that any message acknowledged to the client has been stored on the disks of at least two nodes, which greatly reduces the risk of data loss from disk failure. In addition, a check value is recorded together with the data when it is written to disk. Before a consumer reads the data, the CMQ broker verifies it against this check value to ensure that the message is complete and valid.
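The "more than half of the nodes" rule can be written down in a few lines. This is a sketch of the counting rule only, not of CMQ's or Raft's full replication logic; the function names are illustrative.

```python
def majority(n_nodes):
    """Smallest number of nodes that is more than half of the set."""
    return n_nodes // 2 + 1

def is_committed(acks, n_nodes):
    """acks: names of the nodes (leader included) that have flushed the entry to disk."""
    return len(acks) >= majority(n_nodes)

nodes = ["A", "B", "C"]
acks = {"A"}                                  # the leader has flushed its own log
before = is_committed(acks, len(nodes))       # one copy is not enough
acks.add("B")                                 # one follower confirms
after = is_committed(acks, len(nodes))        # two of three: the entry is committed
```

With three nodes the majority is two, which is why an acknowledged message is on at least two disks; with five nodes it would be three.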
When a consumer pulls a message, it specifies a hiding time for that message. The consumer must explicitly confirm the message by deleting it within the hiding time. If the message has not been deleted when the hiding time expires, it becomes visible again and can be consumed again.
Explicit confirmation by deletion prevents messages from being lost when delivery or processing fails.
The CMQ broker handles a consumption confirmation much as it handles a produced message: it is also a write operation. The difference is that the data written is the message ID and the message status.
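The hiding-time behaviour described above can be sketched as a toy, single-node queue. The class `HidingQueue` and its methods are hypothetical and do not reflect the CMQ API; time is passed in explicitly so the example is deterministic.

```python
class HidingQueue:
    """Toy queue with hide-on-receive and explicit delete (no replication)."""

    def __init__(self):
        self._messages = {}      # msg_id -> [body, visible_at]
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = [body, 0.0]
        return self._next_id

    def receive(self, now, hide_seconds):
        """Return the first visible message and hide it until now + hide_seconds."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + hide_seconds
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Explicit confirmation: the message is removed for good."""
        self._messages.pop(msg_id, None)

q = HidingQueue()
q.send("credit 10 yuan to B")
first = q.receive(now=0.0, hide_seconds=30)     # delivered and hidden for 30 s
hidden = q.receive(now=10.0, hide_seconds=30)   # still hidden, nothing to deliver
again = q.receive(now=31.0, hide_seconds=30)    # never deleted, so it is redelivered
q.delete(again[0])                              # the consumer confirms this time
gone = q.receive(now=100.0, hide_seconds=30)    # deleted messages never come back
```

If the consumer crashes between `receive` and `delete`, the message simply reappears after the hiding time, which is what protects against loss during processing.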
Strong consistency
Suppose a set has three nodes, A, B, and C, where A is the leader and B and C are followers. As shown in the figure above, request data that has been acknowledged to the client exists on at least two nodes, say A and B. If leader A now fails, followers B and C automatically elect a new leader. The Raft algorithm used by CMQ guarantees that the new leader is a node with the most complete log, which here must be B. B then continues to serve requests. B and A hold the same complete view of all data that has been acknowledged to users, so the data is strongly consistent.
If a network partition separates A from B and C (as shown in Figure 5), leader A can no longer process requests because it cannot get replies from more than half of the nodes in the set. After the election timeout, B and C elect a new leader, and the CMQ access layer switches over automatically. The Raft algorithm ensures that the new leader also has the complete view of acknowledged data.
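The reason the new leader must hold the acknowledged data comes from Raft's voting rule: a node grants its vote only to a candidate whose log is at least as up to date as its own. The sketch below models just that comparison for the A, B, C example. It deliberately ignores term increments, one-vote-per-term bookkeeping, and timeouts, and the log positions are made-up values.

```python
def log_is_up_to_date(candidate_last, voter_last):
    """Each argument is (last_log_term, last_log_index).
    Raft's rule: the higher last term wins; on equal terms the longer log wins."""
    return candidate_last >= voter_last   # tuple comparison: term first, then index

def count_votes(candidate, logs):
    """Votes a candidate can collect from reachable nodes (it votes for itself)."""
    return sum(
        1 for name, last in logs.items()
        if name == candidate or log_is_up_to_date(logs[candidate], last)
    )

# Leader A has failed. B holds the acknowledged entry at index 5; C lags at index 4.
survivors = {"B": (2, 5), "C": (2, 4)}
set_size = 3
quorum = set_size // 2 + 1

b_votes = count_votes("B", survivors)   # B votes for itself and C accepts B's log
c_votes = count_votes("C", survivors)   # B refuses C because C's log is behind
```

Here B reaches the quorum of two and C cannot, so only the node that already stores the acknowledged entry can become leader.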
As mentioned above, the leader handles production and consumption of all messages. When the leader fails, the remaining followers in the set automatically elect a new leader, and client requests are automatically redirected to it. The recovery time objective (RTO) depends on the configured election timeout and is currently about 5 seconds. The general process is shown in Figure 6. See the Raft paper for the details of the election algorithm.
A single CMQ set prioritizes CP (consistency and partition tolerance) in terms of the CAP theorem. Messages can be produced and consumed as long as more than half of the nodes in the set are working normally. CMQ also has strong monitoring and scheduling capabilities: it can quickly reschedule and migrate queues to restore service and keep unavailable time to a minimum.
Horizontal expansion, infinite accumulation
Figure 7 – horizontal expansion
The set described above is transparent to users, who never need to be aware of it. The CMQ controller server schedules and relocates queues in real time according to the load on each set. If the request volume for a queue exceeds what its current set can serve, the controller server can distribute the queue's routes across multiple sets to increase concurrency. For services that need a very large backlog, the same route scheduling raises the accumulation limit, which in theory makes accumulation unbounded.
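The idea of spreading one queue's routes over several sets can be sketched as a small routing table. Everything here is an assumption made for illustration: the class `QueueRouter`, the set names, and the round-robin policy are hypothetical, and the source does not describe how the controller server actually chooses a set.

```python
import itertools

class QueueRouter:
    """Hypothetical controller-side routing table: queue name -> list of sets."""

    def __init__(self):
        self.routes = {}
        self._cursors = {}

    def assign(self, queue_name, sets):
        self.routes[queue_name] = list(sets)
        self._cursors[queue_name] = itertools.cycle(self.routes[queue_name])

    def scale_out(self, queue_name, new_set):
        """Add a set when load exceeds what the current sets can serve."""
        self.assign(queue_name, self.routes[queue_name] + [new_set])

    def pick_set(self, queue_name):
        """Round-robin over the sets that currently serve the queue (assumed policy)."""
        return next(self._cursors[queue_name])

router = QueueRouter()
router.assign("orders", ["set-1"])
router.scale_out("orders", "set-2")          # the queue now spans two sets
targets = [router.pick_set("orders") for _ in range(4)]
```

Once a queue spans more than one set, messages sent back to back may land on different sets, which is one reason strict ordering is hard to guarantee in general.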
At present, CMQ guarantees strict message order only under certain conditions, such as a single producer process with a single consumer process, or a queue whose consumption window is set to 1.
Full-path message tracing
In CMQ, the complete path of a message involves three roles: producer, broker, and consumer. While handling a message, each role adds its own information to the trace path. Aggregating this information yields the status and the complete path so far of any message, which gives strong data support for troubleshooting in production and makes it much easier for the business to locate problems.
To summarize, CMQ is a distributed message queue that uses the Raft algorithm to ensure high reliability and strong consistency of data. It is mainly used in order and transaction scenarios. Idempotent handling of messages must be ensured by the business side, and message order is guaranteed only under certain conditions.
For workloads that care more about high performance and high throughput, Tencent Cloud offers a separate message engine that is protocol-compatible with Kafka and well suited to big data scenarios. Its principles will be covered in a later article.
03 | CMQ vs. open-source RabbitMQ stress test
RabbitMQ is a representative open-source message middleware. It is widely used in enterprise systems, in scenarios with high requirements for data consistency, stability, and reliability.
CMQ likewise emphasizes highly reliable message delivery. What advantages does Tencent Cloud CMQ have over RabbitMQ?
In addition to production and consumption acknowledgment, CMQ provides consumption backtracking.
The user tells CMQ to retain produced messages for a given number of days. Consumption can then be rewound to any point in time within that retention period and restarted from there. When the user's business logic has gone wrong, this time-based message replay is very helpful for recovery.
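Time-based replay can be pictured as moving a consumption cursor back over a retained, time-ordered log. The sketch below is an in-memory illustration only; `ReplayableLog` and `rewind_to` are hypothetical names and say nothing about how CMQ stores or indexes messages.

```python
import bisect

class ReplayableLog:
    """Append-only log that keeps consumed messages so a consumer can rewind."""

    def __init__(self):
        self._timestamps = []   # send times, assumed non-decreasing
        self._bodies = []
        self.cursor = 0         # index of the next message to consume

    def append(self, timestamp, body):
        self._timestamps.append(timestamp)
        self._bodies.append(body)

    def consume(self):
        if self.cursor >= len(self._bodies):
            return None
        body = self._bodies[self.cursor]
        self.cursor += 1
        return body

    def rewind_to(self, timestamp):
        """Move the cursor to the first message sent at or after timestamp."""
        self.cursor = bisect.bisect_left(self._timestamps, timestamp)

log = ReplayableLog()
for t, body in [(100, "m1"), (200, "m2"), (300, "m3")]:
    log.append(t, body)

consumed = [log.consume() for _ in range(3)]   # normal consumption
log.rewind_to(150)                             # a bug was found: replay from t=150
replayed = [log.consume(), log.consume()]
```

Because consumed messages are retained rather than discarded, recovering from a faulty deployment only requires fixing the consumer and rewinding the cursor.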
Network I/O: CMQ can produce and consume messages in batches, while RabbitMQ does not support batch production. With large numbers of small messages, CMQ issues fewer requests and has lower average latency.
File I/O: CMQ writes produced and consumed messages sequentially to a single file and flushes to disk periodically, making full use of the file system cache. RabbitMQ first puts persistent messages into an in-memory queue for state transitions, then writes them to a log cache, and finally to the message file and the index file (the index file is written sequentially, the message file randomly). This involves three I/O operations, so performance suffers.
CPU: RabbitMQ's log caching and state transition operations are complex and consume a lot of CPU.
Both CMQ and RabbitMQ can use multiple machines as hot standbys to improve availability. CMQ is based on the Raft algorithm, which is simple and easy to maintain. RabbitMQ uses the GM (Guaranteed Multicast) algorithm, which is harder to learn.
In the Raft protocol, as soon as a majority of nodes return success to the leader, the leader can apply the request and return success to the client.
GM arranges all nodes in the cluster into a ring. A log entry propagates from the leader to the next node in turn. When the entry arrives back at the leader, the leader sends an acknowledgment around the ring, and when the leader receives that acknowledgment again, all nodes in the ring are known to be synchronized.
GM returns success to the client only after the log entry has been synchronized to all nodes in the cluster, whereas Raft requires only a majority of nodes to complete synchronization. Compared with GM, Raft cuts the waiting time on the synchronization path roughly in half.
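The difference between waiting for a majority and waiting for every node can be made concrete with a small latency model. This is a simplified sketch: it assumes followers are contacted in parallel in both cases (so it understates the cost of passing a message around a ring), and the acknowledgment times are invented numbers, not measurements.

```python
def majority_commit_latency(follower_latencies_ms):
    """The leader counts itself, so it needs acks from (majority - 1) followers.
    Commit time is set by the slowest follower inside that majority."""
    n_nodes = len(follower_latencies_ms) + 1
    needed_followers = n_nodes // 2          # majority minus the leader itself
    if needed_followers == 0:
        return 0
    return sorted(follower_latencies_ms)[needed_followers - 1]

def all_nodes_commit_latency(follower_latencies_ms):
    """Waiting for every node means waiting for the slowest one."""
    return max(follower_latencies_ms)

# Hypothetical ack times (ms) of the two followers in a three-node set.
latencies = [2, 9]
majority_wait = majority_commit_latency(latencies)
all_nodes_wait = all_nodes_commit_latency(latencies)
```

With a majority rule, one slow or busy follower does not delay the client; with an all-nodes rule, every request is as slow as the slowest replica.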
Stress test results
In strict internal stress testing, with the same network, CPU, and memory environment and with reliable delivery guaranteed, CMQ achieved more than 4 times the QPS of RabbitMQ.
Case study: WeChat red packets at the Spring Festival Gala
The Spring Festival Gala red packet event involves four large systems working together: WeChat, WeChat Pay, the red packet system, and the TenPay system. Each is introduced briefly below:
Red packet system: sending, grabbing, and opening personal red packets, and viewing the list;
TenPay system: high-performance storage of payment orders, asynchronous posting of funds to accounts, and real-time display of user balances and bills;
WeChat access: ensures the quality of public network access for WeChat users;
WeChat Pay: the entry point for online transactions.
Red packets are a hot spot in a distributed system. As a typical example, “user A sends a 10 yuan red packet to user B” involves the following steps:
Read the balance of account A
Deduct 10 yuan from account A
Write the result back to account A (first confirmation)
Read the balance of account B
Open the red packet from A to B and read its value
Add 10 yuan to account B
Write the result back to account B
To keep the data consistent, these steps can have only two outcomes: either all succeed, or any failure rolls everything back. During the operation, distributed locks must also be taken on accounts A and B to avoid dirty data. In the huge distributed cluster behind WeChat red packets, this becomes extremely complex.
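The all-or-nothing requirement is easiest to see on a single database, where a local transaction provides it directly. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `account` table; it is a single-node illustration and does not involve the distributed locks that the real multi-system case would need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 15), ("B", 0)])
conn.commit()

def transfer(src, dst, amount):
    """Credit dst and debit src atomically; any failure rolls both back."""
    try:
        with conn:   # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer("A", "B", 10)       # A: 15 -> 5, B: 0 -> 10
failed = transfer("A", "B", 10)   # A would go negative, so B's credit is undone too
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

When accounts A and B live in different systems, no single local transaction covers both updates, which is where the cost of distributed transactions and locking comes from.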
The WeChat red packet system introduced Tencent Cloud CMQ to avoid the overhead of distributed transactions. For the same scenario, user A sending a 10 yuan red packet to user B, the strategy after introducing CMQ is described below.
In the seventh step above, user B has opened the red packet containing 10 yuan and the amount is being posted to B's account. Because of the high concurrency on the day, this final posting operation often fails.
The red packet team forwards all failed posting requests to CMQ. When updating user B's account balance fails, the mobile client shows a waiting status. The posting system then keeps pulling the request from CMQ and retrying the update. CMQ ensures that the 10 yuan posting message is never lost before it is taken out and processed.
On New Year's Eve, users sending, opening, and posting red packets generate requests on the scale of billions. With the traditional transactional approach, the concurrency pressure would be amplified and the system would crash.
CMQ ensures reliable storage and transmission of red packet messages and writes three replicas in real time so that no data is lost. When posting funds fails, the posting system can retry asynchronously as many times as needed, pulling data from CMQ until it succeeds, which also provides peak shaving and valley filling. This avoids the drawbacks of the traditional approach, such as rollback on failure and frequent database polling.
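The retry-until-success pattern can be sketched with an in-memory queue standing in for CMQ. The function `post_credit` is a hypothetical posting call that is made to fail twice to simulate overload; in a real deployment, "putting the message back" would correspond to not deleting it so that it becomes visible again.

```python
from collections import deque

retry_queue = deque()          # stand-in for a CMQ queue holding failed postings
balances = {"B": 0}
attempts = {"count": 0}

def post_credit(account, amount):
    """Hypothetical posting call that fails twice under load, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise TimeoutError("posting system overloaded")
    balances[account] += amount

def drain(max_rounds):
    """Pull a message, try to post it, and keep it queued if posting fails."""
    for _ in range(max_rounds):
        if not retry_queue:
            break
        msg = retry_queue.popleft()
        try:
            post_credit(msg["account"], msg["amount"])
        except TimeoutError:
            retry_queue.append(msg)    # not lost: it will be pulled again

retry_queue.append({"account": "B", "amount": 10})
drain(max_rounds=5)
```

The failed posting costs only a queued message rather than a rollback across several systems, and the retries happen at whatever pace the posting system can sustain.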
Q1: How should startups choose a message queue?
A1: There are many open-source message queues, but their implementations are complex and their operation and maintenance costs are high. For startups, the most convenient choice is a message queue provided by a cloud service provider, which can be used on demand at very low cost.
Q2: How long is a message produced to CMQ kept if it is not consumed?
A2: This is controlled by a parameter that sets the maximum lifetime of a message in the queue. Once the specified time has elapsed since the message was sent to the queue, the message is deleted whether or not it has been consumed. The unit is seconds, and the valid range is 60 to 1296000 seconds, that is, 1 minute to 15 days.
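As a quick check of the numbers in A2, the sketch below converts the stated range to days and applies the expiry rule. Only the bounds of 60 and 1296000 seconds come from the answer; the constant and function names are illustrative and are not CMQ parameter names.

```python
MIN_RETENTION_SECONDS = 60          # lower bound stated in the answer
MAX_RETENTION_SECONDS = 1_296_000   # upper bound stated in the answer

def is_expired(sent_at, now, retention_seconds):
    """A message expires once retention_seconds have passed since it was sent,
    whether or not it has been consumed."""
    if not MIN_RETENTION_SECONDS <= retention_seconds <= MAX_RETENTION_SECONDS:
        raise ValueError("retention must be between 60 and 1296000 seconds")
    return now - sent_at >= retention_seconds

max_days = MAX_RETENTION_SECONDS / 86_400             # 86,400 seconds per day
still_kept = is_expired(sent_at=0, now=3_599, retention_seconds=3_600)
dropped = is_expired(sent_at=0, now=3_600, retention_seconds=3_600)
```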
Q3: What are the advantages over Kafka?
A3: Kafka focuses on throughput, whereas CMQ emphasizes reliable delivery with no message loss. The Kafka version of CMQ will also be launched soon, so stay tuned.
Q4: Is CMQ's consumption mode push or pull, or a combination of the two?
A4: Queues currently use pull. The upcoming topic mode will support push, with delivery to HTTP, SMS, email, and queues.
Q5: Which queues support pub/sub?
A5: CMQ has two product forms: queue and topic. A queue has no pub/sub function. A topic does, and topics and queues can also be chained together, with a queue acting as a subscriber of a topic.
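The relationship between a topic and its subscribing queues can be sketched as a simple fan-out. The `Topic` class below is a hypothetical in-memory model and not the CMQ API; each subscriber is a plain `deque` standing in for a queue.

```python
from collections import deque

class Topic:
    """Toy pub/sub topic: every published message is copied to each subscriber."""

    def __init__(self):
        self.subscribers = []    # queues subscribed to this topic

    def subscribe(self, q):
        self.subscribers.append(q)

    def publish(self, body):
        for q in self.subscribers:
            q.append(body)

billing, audit = deque(), deque()   # two independent consumer queues
topic = Topic()
topic.subscribe(billing)
topic.subscribe(audit)
topic.publish("order-1 paid")       # produced once, delivered to both queues
```

Each subscribing queue then behaves like any other queue: its consumers pull at their own pace, independently of the other subscribers.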
This talk was given by Zhang Hao of Tencent in the Jiniu online technology sharing group. To join future technical sharing sessions, follow the official account ji-niu.