3 – Topics and queues


The original message queue was a queue in the strict sense. In computer science, a “queue” is a data structure with a complete and strict definition. Wikipedia defines a queue as a first-in, first-out (FIFO) linear list, which in practice is usually implemented with a linked list or an array. Elements can only be inserted at the back end (called the rear) and deleted at the front end (called the front).

This definition contains several key points. The first is first in, first out. The implicit requirement is that messages keep a strict order as they enter and leave the queue: they must be read from the queue in the same order in which they were written to it. In addition, a queue has no separate “read” operation. Reading means dequeuing, that is, “deleting” the message from the queue.
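As a minimal sketch of these two points, the following uses Python's standard `collections.deque` as the queue. The message names are made up for illustration.

```python
from collections import deque

queue = deque()

# Enqueue: messages can only be added at the rear.
for message in ["m1", "m2", "m3"]:
    queue.append(message)

# Dequeue: "reading" a message removes it from the front.
received = [queue.popleft() for _ in range(3)]

print(received)    # ['m1', 'm2', 'm3'], the same order they were written in
print(len(queue))  # 0, nothing is left to be read a second time
```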

Early message queues were designed according to the “queue” data structure. Let’s take a look at this figure. When a producer sends a message, it is an enqueue operation, and when a consumer receives a message, it is a dequeue operation, that is, a deletion. The container on the server that stores messages is naturally called a “queue”.

This is the original message model: the queue model.

If multiple producers send messages to the same queue, the messages that can be consumed from this queue are the union of all messages produced by these producers, in the natural order in which the producers sent them. If multiple consumers receive messages from the same queue, these consumers actually compete with each other. Each consumer receives only part of the messages in the queue, because any given message can be received by only one of them.
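The competition between consumers can be sketched in a few lines. This is a simplified illustration: the consumer names are hypothetical, and the consumers politely take turns, whereas real consumers would pull in whatever order they happen to be ready.

```python
from collections import deque

queue = deque(f"order-{i}" for i in range(1, 7))
received = {"consumer-a": [], "consumer-b": []}

# Each pull removes the message from the queue,
# so the other consumer can never see it.
consumers = list(received)
turn = 0
while queue:
    name = consumers[turn % len(consumers)]
    received[name].append(queue.popleft())
    turn += 1

print(received["consumer-a"])  # ['order-1', 'order-3', 'order-5']
print(received["consumer-b"])  # ['order-2', 'order-4', 'order-6']
```

Neither consumer sees the full set of orders, which is exactly the limitation discussed next.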

If a message needs to be distributed to multiple consumers, each consumer must receive the full set of messages. For example, order data needs to reach the risk control system, the analysis system, the payment system, and so on. In this case a single queue cannot meet the demand. One feasible solution is to create a separate queue for each consumer and let the producer send a copy to each.

Obviously, this is a clumsy approach. Copying the same message to multiple queues wastes resources. More importantly, the producer must know how many consumers there are. Sending a separate message for each consumer violates the original “decoupling” intent of the message queue.

To solve this problem, another message model evolved: the publish-subscribe model.

In the publish-subscribe model, the sender of a message is called the publisher, the receiver is called the subscriber, and the container in which the server stores messages is called a topic. Publishers send messages to topics, and subscribers need to “subscribe to a topic” before receiving messages. A “subscription” here is not only an action; it can also be thought of as a logical copy of the topic for consumption. Within each subscription, the subscriber can receive all messages of the topic.
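One way to picture a subscription as a “logical copy” is a single stored message log plus one read position per subscription. The `Topic` class below is a hypothetical in-memory sketch under that assumption, not the API of any real product.

```python
class Topic:
    """Hypothetical in-memory topic: one message log, one read position per subscription."""

    def __init__(self):
        self.messages = []    # each message is stored once
        self.positions = {}   # subscription name -> index of the next message to read

    def publish(self, message):
        self.messages.append(message)

    def subscribe(self, name):
        # Assumption: a new subscription starts at the first stored message.
        self.positions.setdefault(name, 0)

    def receive(self, name):
        """Return the next unread message for this subscription, or None."""
        position = self.positions[name]
        if position >= len(self.messages):
            return None
        self.positions[name] = position + 1
        return self.messages[position]


orders = Topic()
for system in ("risk-control", "analysis", "payment"):
    orders.subscribe(system)
orders.publish("order-1001 created")

# Every subscription receives the same message, yet it is stored only once.
deliveries = {system: orders.receive(system) for system in orders.positions}
```

The publisher calls `publish` once and does not need to know how many subscriptions exist.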

For a long time in the history of messaging, the queue model and the publish-subscribe model coexisted. Some message queues, such as ActiveMQ, support both. Let’s compare the two models carefully. Producers correspond to publishers, consumers to subscribers, and queues to topics, with no essential difference. The biggest difference between the models is whether a message can be consumed more than once.

In fact, if the publish-subscribe model has only one subscriber, it is basically the same as the queue model. In other words, the publish-subscribe model is functionally compatible with the queue model. Most modern message queue products use the publish-subscribe model, with some exceptions.

Message model of RabbitMQ

The exception is RabbitMQ, one of the few products that still adheres to the queue model. How does it solve the problem of multiple consumers? In RabbitMQ, an exchange sits between the producer and the queue. The producer does not care which queue the message goes to; it simply sends the message to the exchange. The policies configured on the exchange determine which queues the message is delivered to.

If the same message needs to be consumed by multiple consumers, you configure the exchange to deliver the message to multiple queues. Each queue stores a complete copy of the message and serves one consumer. This achieves the same effect as the newer publish-subscribe model, where a message can be consumed by multiple subscribers.
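The idea can be sketched in memory as follows. The `Exchange` class here is a hypothetical stand-in that copies every message to all bound queues; it is not RabbitMQ's client API, and real exchanges support other routing policies as well.

```python
from collections import deque


class Exchange:
    """Hypothetical stand-in for an exchange that copies each message to every bound queue."""

    def __init__(self):
        self.bound_queues = {}

    def bind(self, queue_name):
        self.bound_queues[queue_name] = deque()

    def publish(self, message):
        # The producer only talks to the exchange; it never names a queue.
        for queue in self.bound_queues.values():
            queue.append(message)


exchange = Exchange()
for queue_name in ("risk-control", "analysis", "payment"):
    exchange.bind(queue_name)
exchange.publish("order-1001 created")

# Each consumer drains its own queue, so every system sees the message once.
consumed = {name: queue.popleft() for name, queue in exchange.bound_queues.items()}
```

Compared with the earlier one-queue-per-consumer workaround, the copying still happens, but it moves from the producer to the exchange, so the producer stays decoupled from the consumers.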

Message model of RocketMQ

RocketMQ uses a standard publish-subscribe model. In RocketMQ’s glossary, the concepts of producer, consumer, and topic are exactly the same as in the publish-subscribe model described above.

However, RocketMQ also has the concept of a queue, and it is a very important one. What is the role of the queue in RocketMQ? To answer that, we need to start with how message queues handle consumption.

Almost all message queue products use a very simple “request-acknowledge” mechanism to ensure that messages are not lost during delivery because of network or server failures. The approach is simple. On the production side, the producer first sends the message to the server, that is, the broker. After the server receives the message and writes it to the topic or queue, it sends an acknowledgement back to the producer.

If the producer does not receive an acknowledgement from the server, or receives a failure response, it resends the message. On the consumption side, after the consumer receives a message and completes its business logic (for example, saving the data to a database), it also sends the server an acknowledgement of successful consumption. The server considers a message successfully consumed only after it receives this acknowledgement. Otherwise, it redelivers the message to the consumer until it receives the corresponding acknowledgement.
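The producer half of this mechanism can be sketched as a retry loop. `FlakyBroker` and `send_with_retry` are hypothetical names invented for this illustration; the broker simply fails a fixed number of times so that the example is deterministic.

```python
class FlakyBroker:
    """Hypothetical broker that loses the first `failures` send attempts."""

    def __init__(self, failures):
        self.failures = failures
        self.stored = []

    def send(self, message):
        if self.failures > 0:
            self.failures -= 1
            return False           # no acknowledgement reaches the producer
        self.stored.append(message)
        return True                # acknowledgement: the message was written


def send_with_retry(broker, message, max_attempts=5):
    """Resend until the broker acknowledges; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if broker.send(message):
            return attempt
    raise RuntimeError(f"no acknowledgement after {max_attempts} attempts")


broker = FlakyBroker(failures=2)
attempts = send_with_retry(broker, "order-1001 created")
print(attempts)       # 3
print(broker.stored)  # ['order-1001 created']
```

The sketch leaves out one case: a real broker could store the message and lose only the acknowledgement, in which case the resend would produce a duplicate.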

This acknowledgement mechanism ensures the reliability of message delivery. However, it also brings a significant problem on the consumption side. What is the problem? To preserve message order, the next message cannot be consumed until the current one has been consumed successfully; otherwise there would be gaps in the consumed messages, which violates the ordering principle.

In other words, at most one consumer instance can consume from each topic at any time, so overall consumption throughput cannot be improved by scaling out the number of consumers. To solve this problem, RocketMQ adds the concept of queues under the topic.

Each topic contains multiple queues, and these queues make parallel production and consumption by multiple instances possible. Note that RocketMQ guarantees message order only within a queue; strict ordering is not guaranteed at the topic level.
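A small sketch shows what “ordered within a queue, not across the topic” means in practice. The routing rule (order ID modulo the number of queues) and the queue count are assumptions made for this example, not something the text above prescribes.

```python
NUM_QUEUES = 3  # assumption for the example
queues = [[] for _ in range(NUM_QUEUES)]


def send(order_id, event):
    # Hypothetical routing rule: the same order always lands in the same queue.
    queues[order_id % NUM_QUEUES].append((order_id, event))


for order_id, event in [(1, "created"), (2, "created"), (1, "paid"),
                        (3, "created"), (2, "paid"), (1, "shipped")]:
    send(order_id, event)

# Within its queue, the events of order 1 keep the order they were sent in.
order_1_events = [event for oid, event in queues[1 % NUM_QUEUES] if oid == 1]
print(order_1_events)  # ['created', 'paid', 'shipped']
```

Nothing, however, relates the position of order 3's event in queue 0 to the events in the other queues, so a consumer reading the queues in parallel cannot rebuild the original topic-wide send order.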

In RocketMQ, the concept of a subscriber is embodied by the consumer group. Each consumer group consumes a complete copy of the messages in the topic. The consumption progress of different consumer groups does not affect each other, that is, a message consumed by consumer group 1 will also be consumed by consumer group 2.

A consumer group contains multiple consumers. Consumers in the same group compete with each other, and each consumer is responsible for part of the messages the group consumes. If a message is consumed by consumer1, the other consumers in the same group will not receive it again.

Because the messages in a topic need to be consumed multiple times by different consumer groups, consumed messages are not deleted immediately. Therefore, RocketMQ needs to maintain a consumer offset on each queue for each consumer group. Messages before this position have been consumed, and messages after it have not. Every time a message is successfully consumed, the position is incremented by one. The consumer offset is a very important concept: when we use a message queue, most cases of message loss are caused by mishandling it.
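For a single queue, the bookkeeping might look like the sketch below. The group names and the `consume` helper are hypothetical, and the business logic is reduced to a comment.

```python
queue = ["m0", "m1", "m2", "m3"]       # messages stay in the queue after consumption
offsets = {"group1": 0, "group2": 0}   # one consumption position per consumer group


def consume(group):
    """Return the next message for `group`, advancing its offset on success."""
    position = offsets[group]
    if position >= len(queue):
        return None
    message = queue[position]
    # ... business logic would run here; the position moves only after it succeeds
    offsets[group] = position + 1
    return message


first = [consume("group1") for _ in range(3)]
second = [consume("group2")]

print(first)       # ['m0', 'm1', 'm2']
print(second)      # ['m0'], group2 is unaffected by group1's progress
print(offsets)     # {'group1': 3, 'group2': 1}
print(len(queue))  # 4, nothing has been deleted
```

If the offset were advanced before the business logic succeeded, a failure at that point would skip the message for good, which is one way the message loss mentioned above can happen.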

At any time, within the same consumer group, only one consumer can occupy a given queue, which ensures the order of messages on that queue.
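A minimal sketch of this rule, assuming a simple round-robin assignment (a hypothetical strategy chosen for illustration; real products may use other strategies):

```python
def assign_queues(queue_ids, consumers):
    """Give every queue exactly one owner within a consumer group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, queue_id in enumerate(queue_ids):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(queue_id)
    return assignment


assignment = assign_queues([0, 1, 2, 3], ["consumer1", "consumer2"])
print(assignment)  # {'consumer1': [0, 2], 'consumer2': [1, 3]}

crowded = assign_queues([0, 1], ["consumer1", "consumer2", "consumer3"])
print(crowded)     # {'consumer1': [0], 'consumer2': [1], 'consumer3': []}
```

The second call shows a consequence of the rule: once a group has more consumers than the topic has queues, the extra consumers have nothing to consume.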

Kafka’s message model

Let’s look at another common message queue, Kafka. Kafka’s message model is exactly the same as RocketMQ’s. All the corresponding RocketMQ concepts, and the acknowledgement mechanism in production and consumption described above, apply fully to Kafka. The only difference is that what RocketMQ calls a queue has a different name in Kafka: “partition”. Its meaning and function are the same.

This work is licensed under a CC agreement. Reprints must credit the author and link to this article.