Basic knowledge of message queue

Time:2022-5-11

What is a message queue

A message queue is a way for services to communicate asynchronously, and it is an important component of distributed systems. It mainly addresses application decoupling, asynchronous messaging and traffic peak shaving, and it helps build architectures that are high-performance, highly available, scalable and eventually consistent.
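As a rough illustration of the asynchronous, decoupled communication described above, the sketch below uses Python's standard queue.Queue as an in-process stand-in for a real message queue. The function name run_demo and the "order-N" messages are made up for this example; in a real system a broker would sit between separate services.

```python
import queue
import threading


def run_demo(num_messages=5):
    """Producer and consumer only share the queue, never call each other."""
    mq = queue.Queue()
    processed = []

    def consumer():
        # The consumer drains the queue at its own pace.
        while True:
            msg = mq.get()
            if msg is None:  # sentinel: the producer has finished
                break
            processed.append(msg.upper())

    worker = threading.Thread(target=consumer)
    worker.start()

    # The producer enqueues a burst of messages without waiting for
    # each one to be processed (the queue absorbs the peak).
    for i in range(num_messages):
        mq.put(f"order-{i}")
    mq.put(None)

    worker.join()
    return processed
```

Because there is a single consumer and the queue is FIFO, the messages are processed in the order they were sent.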

Performance comparison of existing message queues

Characteristic            | ActiveMQ                  | RabbitMQ             | Kafka                     | RocketMQ                  | NSQ
Development language      | Java                      | Erlang               | Scala                     | Java                      | Go
Single-machine throughput | 10,000 level              | 10,000 level         | 100,000 level             | 100,000 level             | 10,000 level
Latency                   | ms level                  | µs level             | ms level                  | ms level                  | -
Availability              | High (master-slave)       | High (master-slave)  | Very high (distributed)   | Very high (distributed)   | Very high (distributed)
API completeness          | High                      | High                 | High                      | High                      | Low
Multi-language support    | Supported, Java preferred | Language independent | Supported, Java preferred | Supported, Java preferred | Supported
Quick start provided      | Yes                       | Yes                  | Yes                       | Yes                       | No

Functional characteristics:

  • ActiveMQ: a mature product that has been used in many companies; plenty of documentation; good support for various protocols;
  • RabbitMQ: developed in Erlang, so it has strong concurrency, good performance and low latency; rich management interface;
  • Kafka: only the major MQ functions are supported, and features such as message query and message backtracking are not provided. It was designed for big data and is widely used in that field;
  • RocketMQ: complete MQ functions and good extensibility;
  • NSQ: developed in Go, distributed and high-performance, but the documentation is currently incomplete and deployment is troublesome.

Several questions about message queues:

1. How to ensure that the message queue is highly available?

  • In most cases, high availability is achieved with clusters, just as it is for databases;
  • Take RabbitMQ as an example. It commonly uses two cluster modes: the default mode and the mirror mode. The mirror mode is the most widely used and is shown in the figure below;

[Figure: RabbitMQ mirror mode cluster]

  • A RocketMQ cluster can run in multi-master mode, in multi-master multi-slave mode with asynchronous replication, or in multi-master multi-slave mode with synchronous double write. The deployment architecture of the multi-master multi-slave mode is as follows:

[Figure: RocketMQ multi-master multi-slave deployment architecture]

2. How to ensure that messages are not consumed repeatedly?

There is no fixed answer; it has to be handled according to the business scenario. Take RabbitMQ as an example: RabbitMQ does not guarantee that a message is delivered only once. If the business requires strictly non-repeated processing, it can be achieved in the following ways:

  • Give each message a unique ID, and make sure that the successful processing of the message and the record in the de-duplication table are written together, so that one cannot exist without the other;
  • When a message is received, insert it into the database using the message's unique ID as the primary key. A repeated message then causes a primary key conflict, so it is not processed a second time.
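The two approaches above can be combined in a small sketch. It assumes an SQLite database stands in for the business database; the table names (processed, orders) and the function names (make_store, handle) are hypothetical and only serve the illustration.

```python
import sqlite3


def make_store():
    """Create an in-memory database with a de-duplication table."""
    conn = sqlite3.connect(":memory:")
    # The message ID is the primary key, so a second insert of the same ID fails.
    conn.execute("CREATE TABLE processed (msg_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE orders (msg_id TEXT, body TEXT)")
    return conn


def handle(conn, msg_id, body):
    """Process a message at most once.

    Returns True if the message was processed, False if it was a duplicate.
    """
    try:
        # One transaction: the de-duplication record and the business write
        # are committed together, or rolled back together on any error.
        with conn:
            conn.execute("INSERT INTO processed (msg_id) VALUES (?)", (msg_id,))
            conn.execute(
                "INSERT INTO orders (msg_id, body) VALUES (?, ?)", (msg_id, body)
            )
        return True
    except sqlite3.IntegrityError:
        # Primary key conflict: this message ID has already been handled.
        return False
```

Calling handle twice with the same msg_id leaves a single row in orders; the second call returns False and changes nothing.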

3. How to ensure the reliable transmission of messages?

Reliable transmission should be analyzed from three angles: the producer losing data, the message queue losing data, and the consumer losing data. The following takes RabbitMQ as an example:

  • Producer loses data: RabbitMQ provides transaction and confirm modes to ensure that producers do not lose messages. With the transaction mechanism, the producer starts a transaction (channel.txSelect()) before sending the message and then sends it. If any exception occurs while sending, the transaction is rolled back (channel.txRollback()); if sending succeeds, the transaction is committed (channel.txCommit()). The disadvantage is reduced throughput, so production environments generally prefer the confirm mode. Once a channel enters confirm mode, every message published on it is assigned a unique ID. Once the message has been delivered to all matching queues, RabbitMQ sends an ACK to the producer, which tells the producer that the message has correctly arrived at the destination queue. If RabbitMQ fails to process the message, it sends a NACK, and the producer can then retry the send.
  • Message queue loses data: the usual remedy is to enable persistence to disk, used together with the confirm mechanism. The ACK is sent to the producer only after the message has been persisted to disk. If RabbitMQ dies before the message is persisted, the producer does not receive the ACK and resends the message. Enabling persistence takes two steps: 1. Set the queue's durable flag to true, which makes it a persistent queue; 2. Set deliveryMode = 2 when sending a message. With these settings, RabbitMQ can recover the data even if it crashes and restarts.
  • Consumer loses data: consumers generally lose data because automatic acknowledgement is enabled. In this mode the receipt of a message is confirmed automatically and RabbitMQ deletes the message immediately, so it is lost if the consumer fails while processing it. The solution is to acknowledge messages manually after they have been processed.
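The confirm-and-retry and manual acknowledgement ideas above can be modelled with a toy in-memory broker. This is only a sketch of the logic and not the RabbitMQ client API: ToyBroker, publish_with_retry and the fail_first_n parameter are invented for the example, and disk persistence is not modelled.

```python
from collections import deque


class ToyBroker:
    """In-memory stand-in for a broker (hypothetical, not a real client)."""

    def __init__(self, fail_first_n=0):
        self.queue = deque()       # messages waiting for delivery
        self.unacked = {}          # delivery tag -> message awaiting consumer ack
        self._fail = fail_first_n  # number of publishes to reject (simulated NACK)
        self._next_tag = 0

    def publish(self, msg):
        """Return True (ACK) once the message is stored, False (NACK) otherwise."""
        if self._fail > 0:
            self._fail -= 1
            return False
        self.queue.append(msg)
        return True

    def deliver(self):
        """Hand the next message to a consumer but keep it until it is acked."""
        msg = self.queue.popleft()
        self._next_tag += 1
        self.unacked[self._next_tag] = msg
        return self._next_tag, msg

    def ack(self, tag):
        """Manual acknowledgement: only now is the message really removed."""
        del self.unacked[tag]

    def requeue_unacked(self):
        """Simulate a consumer that disconnected without acknowledging."""
        for tag in sorted(self.unacked):
            self.queue.append(self.unacked.pop(tag))


def publish_with_retry(broker, msg, max_attempts=3):
    """Resend until the broker confirms; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if broker.publish(msg):
            return attempt
    raise RuntimeError("message was not confirmed by the broker")
```

For example, with ToyBroker(fail_first_n=1) the first publish is rejected and publish_with_retry succeeds on the second attempt. If a consumer receives the message and fails before calling ack, requeue_unacked puts it back in the queue instead of losing it.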

AMQP (Advanced Message Queuing Protocol)


AMQP core concepts

  • Broker: the application that receives and distributes messages. The RabbitMQ server is a message broker;
  • Virtual host: a virtual address used for logical isolation and top-level message routing. A virtual host can contain several exchanges and queues, but it cannot contain two exchanges or two queues with the same name;
  • Connection: the TCP connection between a publisher or consumer and the broker. Disconnection is initiated only by the client; the broker does not close the connection unless there is a network failure or a problem with the broker service;
  • Channel: a channel for reading and writing messages. If a new connection were established for every access to RabbitMQ, the overhead of setting up TCP connections under a large message volume would be huge and efficiency would be low. A channel is a logical connection established inside a connection. If the application is multithreaded, each thread usually creates its own channel for communication. AMQP methods carry a channel ID that lets clients and the broker identify channels, so channels are completely isolated from each other. As a lightweight connection, a channel greatly reduces the operating system's overhead of establishing TCP connections;
  • Exchange: the first stop of a message when it arrives at the broker. The exchange matches the routing key against its lookup table according to the distribution rules and distributes the message to queues. The common types are direct (point-to-point) and topic (publish-subscribe);
  • Queue: where messages finally arrive and wait for a consumer to take them away;
  • Binding: the virtual connection between an exchange and a queue. A binding can contain a routing key. The binding information is stored in the exchange's lookup table and serves as the basis for distributing messages;
  • Routing key: a routing rule that the virtual host can use to determine how to route a specific message;
  • Message: the data transmitted between the server and the application. It consists of properties and a body. Properties describe the message, such as its priority, delay and other advanced features; the body is the content of the message.
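To make the relationship between exchanges, bindings, routing keys and queues more concrete, here is a minimal sketch of how a direct and a topic exchange might pick target queues. ToyExchange and topic_matches are hypothetical names, not part of any client library. The topic matching follows the AMQP convention that words are separated by ".", "*" matches exactly one word and "#" matches zero or more words.

```python
def topic_matches(pattern, routing_key):
    """Check a topic binding pattern against a routing key."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i, j):
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, j2) for j2 in range(j, len(k) + 1))
        if j < len(k) and (p[i] == "*" or p[i] == k[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


class ToyExchange:
    """Keeps a binding table and routes by routing key (illustration only)."""

    def __init__(self, kind):
        if kind not in ("direct", "topic"):
            raise ValueError("kind must be 'direct' or 'topic'")
        self.kind = kind
        self.bindings = []  # list of (binding key, queue name)

    def bind(self, queue_name, binding_key):
        self.bindings.append((binding_key, queue_name))

    def route(self, routing_key):
        """Return the names of the queues that should receive the message."""
        matched = []
        for binding_key, queue_name in self.bindings:
            if self.kind == "direct":
                ok = binding_key == routing_key
            else:
                ok = topic_matches(binding_key, routing_key)
            if ok and queue_name not in matched:
                matched.append(queue_name)
        return matched
```

A direct exchange delivers only on an exact key match, while a topic exchange with bindings such as "order.*" and "#" can deliver one message to several queues.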