Kafka comes up often in our work, yet many of its internal mechanisms remain unfamiliar to us, so I have recently been digging into how it works. We know Kafka is a classic message engine, famous for its high performance and high availability. So the question is: how does it achieve them? In what form does it persist messages? Since it writes to disk, why is it so fast? How does it guarantee messages are not lost? With this series of questions in mind, let’s lift Kafka’s veil.
First, let’s think about this problem: why do we need a message engine at all? Why not just use RPC? Take an order system as an example: when a user places an order, we first reduce the product inventory, then deduct the payment from the user’s account and credit the merchant’s account… Finally, we may send a push notification or text message telling the user that the order succeeded and telling the merchant that a new order has arrived.
If the whole ordering process is synchronous and blocking, latency grows, users wait longer, and the experience suffers. At the same time, the longer the chain of dependencies, the greater the risk of failure. To speed up the response and reduce risk, we can pull the steps that do not have to sit on the critical path out of the main chain and decouple them from the core business. The core of placing an order is keeping inventory, the user’s payment, and the merchant’s credit consistent; the notifications can be asynchronous. That way, the ordering flow is neither blocked by notifying the merchant or the user, nor does it report an order failure just because a notification fails.
The next step is how to design a message engine. From a macro point of view, a message engine only needs to support three things: sending, storing, and receiving messages.
Then, as shown in the figure above, a simple message queue model emerges: the engine stores the sender’s messages, and when a receiver asks the engine for data, the engine serves it out of storage. Since persistent storage is involved, slow disk IO has to be considered.

In addition, there may be more than one receiver. Take the order example: after an order completes, a completion event is published as a message. The team responsible for user-side push needs to consume this message, and so does the team responsible for merchant-side push. The simplest idea is to copy the message into two sets, but that seems wasteful.

High availability also needs to be considered, so the engine should have replicas. With replicas, if an engine node goes down, we can elect a replica to take over its work. Replicas alone are not enough, though. There may be multiple senders, and having all of them send data to a single leader (master) node puts too much pressure on one node. You might say: don’t we have replicas? Let receivers read messages directly from a replica. That brings another problem: what if the replica lags behind the leader? Fall back to reading the leader when a message is missing? If so, the engine’s design becomes more complex and less reasonable.

So we need a way to spread the load of a single node without leaning on replicas for reads. The answer is partitioning (sharding). Since a single leader node is under too much pressure, we split it into multiple leader nodes; all we need then is a good load-balancing algorithm to distribute messages evenly across the partition nodes. With that, we can sketch a producer-consumer model roughly like this.
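The store-and-forward idea above can be sketched as a toy in-memory engine. This is only an illustration of the model, not how any real engine is built: the name `TinyBroker` and its methods are invented here, and a real engine persists to disk and replicates across nodes.

```python
from collections import defaultdict

class TinyBroker:
    """A toy, in-memory message engine: producers append messages to a
    per-topic log, and consumers pull from any offset they choose."""

    def __init__(self):
        self.logs = defaultdict(list)  # topic -> list of messages (the "storage")

    def send(self, topic, message):
        self.logs[topic].append(message)   # producer side: append to the log
        return len(self.logs[topic]) - 1   # offset where the message was stored

    def poll(self, topic, offset, max_messages=10):
        # consumer side: read forward from an offset; the log is shared,
        # so many receivers can read the same data without copying it
        return self.logs[topic][offset:offset + max_messages]

broker = TinyBroker()
broker.send("orders", "order-1 created")
broker.send("orders", "order-2 created")
print(broker.poll("orders", 0))  # ['order-1 created', 'order-2 created']
```

Because consumers track their own offsets, the user-push team and the merchant-push team can both read the same log independently instead of each needing a private copy of the messages.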
However, these are only rough ideas; the concrete implementation is far more complex. With this series of questions and ideas in mind, let’s see how Kafka actually does it.
Thinking and Implementation
How to design a message
The message is the source of everything: the whole system exists to move messages from one end to the other, and that starts with the message structure. The message body should not be too large, since a large body increases both storage cost and network transmission overhead; it should carry only the necessary information, with as little redundancy as possible. Messages should also support compression: a compressed body is smaller, which further reduces storage and network costs.

Messages should be persisted, but consumed messages cannot be kept forever, and very old messages are unlikely to be consumed again, so a mechanism is needed to clean up old messages and free disk space. The key is identifying which messages are old. The best approach is to stamp each message with its production time, use the timestamp to find old messages, and delete them at the appropriate moment.

Messages also need to be numbered. The number both marks a message’s position and lets consumers locate the message they want.

Storing a huge number of messages is itself a problem. Keeping all messages in one file makes queries slow and old-data cleanup awkward, so segmentation is used: the big log file is cut into multiple smaller log files, which improves maintainability. Inserting a message is then just appending to the end of the active segment. When looking a message up, however, loading a whole segment into memory and scanning it record by record would cost too much memory, so an indexing mechanism is needed to speed up access to the right message.
Summary: a Kafka message carries its creation time and sequence number (offset) and supports compression; the log that stores messages is stored in segments, and the segments are indexed.
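The segmented, indexed layout above can be sketched roughly as follows. The `Segment` class and its index interval are invented for illustration (Kafka’s real index maps offsets to byte positions in a file and indexes by bytes, not by record count), but the idea is the same: a sparse index lets a lookup jump close to the target offset and then scan only a few records forward.

```python
import bisect
import time

class Segment:
    """One log segment: records are appended at the end; a sparse index
    maps some offsets to their position within the segment."""

    INDEX_INTERVAL = 2  # index every 2nd record (toy value for illustration)

    def __init__(self, base_offset):
        self.base_offset = base_offset
        self.records = []   # list of (offset, timestamp, payload)
        self.index = []     # sparse, sorted list of (offset, position)

    def append(self, offset, payload, ts=None):
        pos = len(self.records)
        self.records.append((offset, ts or time.time(), payload))
        if pos % self.INDEX_INTERVAL == 0:
            self.index.append((offset, pos))

    def lookup(self, offset):
        # binary-search the sparse index for the nearest indexed offset <= target,
        # then scan forward from that position
        i = bisect.bisect_right([o for o, _ in self.index], offset) - 1
        if i < 0:
            return None
        for rec in self.records[self.index[i][1]:]:
            if rec[0] == offset:
                return rec
        return None

seg = Segment(base_offset=100)
for i, payload in enumerate(["a", "b", "c", "d", "e"]):
    seg.append(100 + i, payload)
print(seg.lookup(103)[2])  # "d": jumps to indexed offset 102, scans forward one record
```

The timestamp stored with each record is what a retention job would use to decide when a whole segment is old enough to delete.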
From a macro point of view, the message engine just sends and receives messages, but there is a problem: producer A needs to send messages to both consumer B and consumer C. How can B and C each consume only the data they need? The simple idea is to tag each message, so that consumers pick out their own messages by tag and skip the rest. But that is not elegant, and CPU time is wasted on filtering. The more effective way is to ensure that messages for B are never delivered to C, and messages for C are never delivered to B. This is the topic. Different kinds of business are distinguished by topic; each consumer subscribes only to the topics it cares about, and producers publish the messages consumers need to the agreed topics. Put simply, messages are classified by topic.
Summary: a topic is a logical concept that divides messages along business lines; each consumer only needs to care about its own topics.
How to ensure the order of partitions
From the above, we know that the purpose of partitioning is to spread the load off a single node. Combined with topics and messages, the rough hierarchy is topic -> partition -> message. You may ask: if partitioning exists to relieve a single node, why not use multiple topics instead of multiple partitions? With multiple machine nodes, we could deploy different topics on different nodes, which also looks distributed. It seems feasible at first glance, but it falls apart on closer thought: in the end we serve the business, and this approach forces one business’s topic to be split into several topics, breaking up the definition of the business.
Well, now that there are multiple partitions, message allocation becomes a problem. If a topic’s data is too concentrated on one partition, the distribution is uneven. To solve this, a good allocation algorithm is essential.
Kafka supports round-robin allocation: with multiple partitions, messages are distributed evenly across the partitions in turn. Note that the data within each partition is ordered, but overall order across partitions cannot be guaranteed. If your business depends strongly on message order, consider this scheme carefully. For example, if the producer sends three messages A, B, and C in turn and they land in three different partitions, a possible consumption order is B, A, C.
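A round-robin partitioner is simple to sketch. The class below is an illustration of the idea, not Kafka’s producer code: it just cycles through partition numbers so keyless messages spread evenly.

```python
import itertools

class RoundRobinPartitioner:
    """Spreads messages without keys evenly across partitions, one at a time."""

    def __init__(self, num_partitions):
        self._cycle = itertools.cycle(range(num_partitions))

    def partition(self, message):
        # the message content is ignored; each call moves to the next partition
        return next(self._cycle)

p = RoundRobinPartitioner(3)
print([p.partition(m) for m in ["A", "B", "C", "D"]])  # [0, 1, 2, 0]
```

Messages A, B, and C land in partitions 0, 1, and 2, which is exactly why their relative consumption order is no longer guaranteed.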
So how do we ensure message order? Globally, as long as the number of partitions is greater than 1, total order can never be guaranteed; you could set the partition count to 1, but then throughput becomes a problem. Looking at real business scenarios, what we usually need is order for one user’s or one product’s messages. It does not matter whether user A’s messages or user B’s messages come first, since there is no correlation between them, but we may need user A’s own messages to stay in order: for example, if the messages describe the user’s behavior, the order of behaviors must not be shuffled. Here we can use key hashing: hashing on the user ID always maps the same user to a fixed partition. We know a partition is internally ordered, so one user’s messages are guaranteed to be ordered, while different users spread across different partitions, still exploiting the multi-partition design.
Summary: Kafka cannot guarantee global message order, but messages within a single partition are ordered.
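Key hashing can be sketched in a few lines. The hash function choice here is an assumption for illustration: Kafka’s default partitioner actually uses murmur2 over the key bytes, while this sketch uses `hashlib.md5` (Python’s built-in `hash()` is randomized per process for strings, so it would not be stable).

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically maps a key to a partition, so the same key
    (e.g. the same user ID) always lands on the same partition."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All of user-42's events go to one partition, so their relative order
# is preserved; user-7's events may land elsewhere, in parallel.
events = [("user-42", "login"), ("user-7", "click"), ("user-42", "pay")]
user42_partitions = {partition_for(k, 3) for k, _ in events if k == "user-42"}
assert len(user42_partitions) == 1
```

One design note: because the mapping depends on `num_partitions`, changing the partition count later re-shuffles keys to different partitions, which breaks per-key ordering across the resize.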
How to design a reasonable consumer model
With the message model designed, consumers are essential. The simplest consumer is a process or thread that pulls messages directly from the broker. That is perfectly reasonable, but what if production outpaces consumption? The first idea is to add another consumer and raise throughput with multiple consumers. But then another problem appears: what if two consumers consume the same message? Locking is one solution, but it hurts efficiency. You might say that consumption is essentially reading, reads can be shared, and as long as the business is idempotent, duplicate consumption does no harm. But if 10 consumers scramble for the same message, 9 of them waste their resources for nothing. So while we add consumers to improve throughput, we also need to ensure each consumer receives messages nobody else has processed. This is the consumer group, and a group can contain multiple consumers. We know topics are partitioned, so it suffices for each consumer in the group to subscribe to different partitions. Ideally, every consumer is assigned the same number of partitions. If the assignment is skewed (some consumers get more partitions, some fewer), some consumers end up very busy while others idle, which is unreasonable; this calls for a balanced assignment strategy.
There are three main Kafka consumer partition allocation strategies:
- Range: this strategy works per topic. It divides the topic’s partition count by the number of consumers; if there is a remainder, the partitions do not divide evenly, and the consumers at the front of the sorted list each get one extra partition. At first glance this is fair: the numbers simply do not divide evenly. But if consumers subscribe to multiple topics and each topic leaves a remainder, the consumers at the front end up with many more partitions in total.
For example, suppose there are two topics with three partitions each and two consumers, C1 and C2. Because the division is done per topic, the final result is:
- C1 consumption topic0-p0, topic0-p1, topic1-p0, topic1-p1
- C2 consumption topic0-p2, topic1-p2
In the end, consumer C1 has two more partitions than consumer C2, even though moving one of C1’s partitions to C2 would balance the load.
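The range calculation above can be sketched directly (a simplified illustration that assumes every consumer subscribes to every topic, which matches this example):

```python
def range_assign(consumers, topic_partitions):
    """Range strategy sketch: computed per topic; after integer division,
    the first consumers (in sorted order) absorb the remainder."""
    assignment = {c: [] for c in consumers}
    for topic, n_parts in sorted(topic_partitions.items()):
        per, extra = divmod(n_parts, len(consumers))
        start = 0
        for i, c in enumerate(sorted(consumers)):
            count = per + (1 if i < extra else 0)  # front consumers get +1
            assignment[c] += [f"{topic}-p{p}" for p in range(start, start + count)]
            start += count
    return assignment

# Two topics, three partitions each, two consumers: C1 gets 4, C2 gets 2
print(range_assign(["C1", "C2"], {"topic0": 3, "topic1": 3}))
```

Because the remainder is handed to the same front consumer for every topic, the imbalance compounds as the number of subscribed topics grows.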
- RoundRobin: this strategy sorts all consumers in the group and all partitions of the topics they subscribe to in lexicographic order, then deals the partitions out to the consumers one by one. Suppose there are two topics with three partitions each and three consumers. The assignment looks roughly like this:
- C0 consumption topic0-p0, topic1-p0
- C1 consumption topic0-p1, topic1-p1
- C2 consumption topic0-p2, topic1-p2
It seems perfect, but suppose there are three topics with different partition counts: topic0 has one partition, topic1 has two, and topic2 has three; and suppose consumer C0 subscribes to topic0, consumer C1 subscribes to topic0 and topic1, and consumer C2 subscribes to topic0, topic1, and topic2. The assignment then looks roughly like this:
- C0 consumption topic0-p0
- C1 consumption topic1-p0
- C2 consumption topic1-p1, topic2-p0, topic2-p1, topic2-p2
So RoundRobin is not perfect either. Even ignoring differences in throughput between partitions, C2’s consumption burden is clearly heavier, and topic1-p1 could have been assigned to consumer C1.
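The round-robin assignor with uneven subscriptions can be simulated as follows. This is a sketch of the algorithm as described above, assuming every topic has at least one subscriber: walk the sorted partition list and hand each partition to the next consumer in the circular order that subscribes to its topic.

```python
import itertools

def roundrobin_assign(subscriptions, topic_partitions):
    """RoundRobin sketch: sort consumers and partitions lexicographically,
    then deal each partition to the next subscribed consumer in the cycle."""
    consumers = sorted(subscriptions)
    cycle = itertools.cycle(consumers)
    current = next(cycle)
    assignment = {c: [] for c in consumers}
    partitions = sorted(f"{t}-p{i}" for t, n in topic_partitions.items()
                        for i in range(n))
    for part in partitions:
        topic = part.rsplit("-p", 1)[0]
        while topic not in subscriptions[current]:  # skip non-subscribers
            current = next(cycle)
        assignment[current].append(part)
        current = next(cycle)
    return assignment

subs = {"C0": {"topic0"},
        "C1": {"topic0", "topic1"},
        "C2": {"topic0", "topic1", "topic2"}}
# Reproduces the skewed result above: C2 ends up with four partitions
print(roundrobin_assign(subs, {"topic0": 1, "topic1": 2, "topic2": 3}))
```

The skew appears because only C2 subscribes to topic2, so the cycle keeps skipping C0 and C1 for those partitions while the earlier deals already consumed C1’s turn.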
- Sticky: both Range and RoundRobin have shortcomings; in some situations a more balanced assignment exists, but they fail to find it.
Sticky’s first goal: make the partition distribution as even as possible. Take the RoundRobin case above, where the three topics have 1, 2, and 3 partitions respectively: C1 could consume topic1-p1 but did not get it. In this case, sticky assigns topic1-p1 to C1.
Sticky’s second goal: keep each assignment as close to the previous assignment as possible. This mainly addresses reassigning partitions after a rebalance. Suppose three consumers C0, C1, and C2 all subscribe to topic0, topic1, topic2, and topic3, each with two partitions. The assignment is roughly as follows:
At this point, RoundRobin would produce much the same assignment. But if consumer C1 now exits, only C0 and C2 remain in the group, and C1’s partitions must be redistributed to them. Let’s look at how RoundRobin rebalances:
Notice that topic1-p1, which originally belonged to C0, is now assigned to C2, and topic1-p0, which originally belonged to C2, is now assigned to C0. This can cause duplicate consumption: before a consumer has had time to commit its offsets, its partition is handed to a new consumer, which then consumes the same messages again. In theory, after C1 exits there is no need to move C0’s or C2’s partitions at all; just split C1’s partitions between them. That is exactly sticky’s approach:
It should be noted that in the sticky policy, when the two goals conflict, making the distribution as even as possible takes precedence over keeping the assignment close to the previous one.
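The sticky rebalance idea can be sketched as follows. Since the figures for this example are not reproduced here, the `before` assignment below is what RoundRobin would deal out for four topics with two partitions each across C0, C1, and C2; the function itself is a simplified illustration of stickiness, not Kafka’s full StickyAssignor.

```python
def sticky_rebalance(previous, departed):
    """Sticky sketch: survivors keep everything they had; only the departed
    consumer's partitions move, each to the currently least-loaded survivor."""
    assignment = {c: list(parts) for c, parts in previous.items() if c != departed}
    for part in previous[departed]:
        # pick the survivor with the fewest partitions (name breaks ties)
        least = min(assignment, key=lambda c: (len(assignment[c]), c))
        assignment[least].append(part)
    return assignment

before = {"C0": ["topic0-p0", "topic1-p1", "topic3-p0"],
          "C1": ["topic0-p1", "topic2-p0", "topic3-p1"],
          "C2": ["topic1-p0", "topic2-p1"]}
# C1 leaves: C0 and C2 keep their partitions, and only C1's three move
print(sticky_rebalance(before, "C1"))
```

Because no surviving consumer loses a partition, none of them re-consume messages whose offsets they had not yet committed, which is the duplicate-consumption problem RoundRobin’s full reshuffle can trigger.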
Summary: Kafka supports the three partition assignment strategies above by default and also supports custom assignment, which you implement yourself. In terms of balance, RoundRobin is better than Range, and Sticky is better than RoundRobin, so it is recommended to use the best strategy your client version supports.