Getting started with RabbitMQ

Time: 2021-2-16

A message queue (Message Queue, hereinafter MQ) is often used for asynchronous data transfer between systems. Without MQ we are left with polling or interface callbacks at the application layer, which is unsatisfactory in both efficiency and coupling. We could also maintain long-lived connections between systems and send and receive data in real time over raw sockets. If we extract these capabilities into a piece of middleware shared by all the systems in a project, what we get is what we call MQ today.


Contrast & Choice

The following compares two popular options with highly active communities, RabbitMQ and Kafka. Redis is included in the comparison as well.

A simple, small system can get by with Redis. Redis is simple and easy to use: it provides a queue-like list structure and supports the publish/subscribe pattern. But Redis is, after all, a cache database whose main responsibility is not message queuing. It lacks features such as message reachability (loss prevention), reliability (distribution, clustering), asynchrony and transaction handling, which then have to be handled at the application layer.

RabbitMQ: written in Erlang, with high single-node throughput, but it only supports a clustered mode, not a truly distributed one. Reliability relies on two different nodes in the cluster holding the master queue and the mirror queue. When the node hosting the master queue goes down, the mirror queue is promoted to master and takes over the client's queue operations. Note that the mirror queue exists only for mirroring; it is not designed to carry client read/write load. All reads and writes go through the master queue, so each queue has a single-point performance bottleneck. On the consumer side, RabbitMQ supports both pull and push modes.

Kafka: written in Scala and distributed by design, so for the same queue the cluster throughput is certainly higher than RabbitMQ's. Kafka only supports the pull mode. The drawback of pull is that if the broker has no messages to consume, the consumer keeps polling until a new message arrives. To avoid this, Kafka provides parameters that let the consumer block until new messages arrive (or until a certain amount of data has accumulated, enabling batch fetches). So in scenarios where messages are not sent very frequently, I think this is better than push: it reduces the number of polls, does not have to hold a connection forever, and real-time behaviour is basically guaranteed. RabbitMQ's pull mode does not support this mechanism.

In fact, as far as throughput is concerned, unless we expect million-level concurrency there is little difference between the two. As for RabbitMQ's per-queue single-point bottleneck, we can split one queue into several queues by some rule and partition messages on the business side, which also improves throughput. Compared with Kafka, RabbitMQ provides relatively complete mechanisms for message routing, message expiration/delay/retention and message fault tolerance. These capabilities cannot be replaced simply by piling on hardware in a short time, so if they are required, RabbitMQ is the right choice.

This also explains why Kafka is commonly used for logging systems. First, compared with business traffic, log writes are extremely frequent; a single request may generate dozens of log entries, which demands high throughput. Moreover, related logs usually span systems and businesses and cannot be split at a fine granularity, which limits the room RabbitMQ has to improve throughput. In addition, real-time log recording does not have high consistency or reliability requirements: no special strategy is needed and losing a few entries does not matter, so RabbitMQ's advantages cannot come into play.

To sum up, RabbitMQ is recommended for the business layer.


RabbitMQ concepts and points of attention

Main concepts of RabbitMQ:

  1. Connection: an AMQP 0-9-1 connection in RabbitMQ; it corresponds one-to-one to an underlying TCP connection.
  2. Channel: the channel used for message delivery.
  3. Queue: the queue to which messages are delivered via an exchange.
  4. Exchange: used for message routing; four types (fanout, direct, topic and headers) decide which queue(s) a message is posted to based on the routing key (see the declaration sketch after this list).
  5. routeKey: the routing key.
  6. DeadLetter: the dead-letter mechanism.
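
To make these concepts concrete, here is a minimal sketch using the official Java client from Kotlin; the exchange, queue and routing-key names are made up for illustration, and broker address and error handling are simplified.

import com.rabbitmq.client.BuiltinExchangeType
import com.rabbitmq.client.ConnectionFactory

fun declareTopology() {
    val factory = ConnectionFactory().apply { host = "localhost" }
    factory.newConnection().use { conn ->
        val channel = conn.createChannel()
        try {
            // A durable direct exchange and a durable queue, bound together by a routing key.
            channel.exchangeDeclare("order.exchange", BuiltinExchangeType.DIRECT, true)
            channel.queueDeclare("order.queue", true, false, false, null)
            channel.queueBind("order.queue", "order.exchange", "order.created")
        } finally {
            channel.close()
        }
    }
}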

There is plenty of material about these online, so we will not repeat it here and will focus on specific details instead. The notes below are excerpted from RabbitMQ best practices; it is suggested to understand the main concepts of RabbitMQ before reading them.

In RabbitMQ, message confirmation is divided into sender confirmation and consumer confirmation.

  • Sender side:
    ConfirmListener: a callback telling whether the message reached the exchange; it requires implementing two methods, handleAck and handleNack. Generally the sender only needs to make sure the message reaches the exchange, without caring whether it is correctly delivered to a particular queue; that is something RabbitMQ and the consumer need to worry about. Based on this, even if RabbitMQ cannot find any queue to deliver to, it still acks to the sender, and the sender can consider the message correctly delivered without caring that no queue received it. You can set an alternate-exchange: RabbitMQ will send messages that cannot be routed to any queue to the exchange specified by alternate-exchange, which plays the role of a dead-letter exchange (DLX). A DLX is no different from an ordinary exchange; it simply routes messages that could not be handled.
    ReturnListener: when the exchange exists but no receiving queue can be found, if mandatory=true is set when publishing, the sender can register a ReturnListener to receive the returned message.
    Note that when sending a message, if the exchange does not exist, the message is discarded directly and there is no ack or nack.
  • Consumer side: by default messages are auto-acked, i.e. a message is acked as soon as it reaches the consumer, regardless of whether the business processing succeeds. In most cases we only want to treat a message as correctly consumed after the business logic has run, so we can enable manual acknowledgement with autoAck=false and let the consumer decide when to ack.
    requeue: set when the consumer nacks or rejects a message, telling RabbitMQ whether to re-deliver it.
    By default, messages discarded by the queue are simply dropped, but you can set the x-dead-letter-exchange argument so that discarded messages are sent to the exchange it specifies, which becomes the DLX. (A combined sketch of publisher confirms and manual consumer acks follows this list.)
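
The following is a minimal sketch of both confirmation sides with the official Java client; it assumes channel is an already-open Channel, and the exchange, queue and routing-key names are illustrative, not from the original text.

import com.rabbitmq.client.AMQP
import com.rabbitmq.client.Channel
import com.rabbitmq.client.ConfirmListener
import com.rabbitmq.client.DefaultConsumer
import com.rabbitmq.client.Envelope
import com.rabbitmq.client.ReturnListener

fun confirmationSketch(channel: Channel) {
    // ---- Sender side: publisher confirms ----
    channel.confirmSelect() // put this channel into confirm mode
    channel.addConfirmListener(object : ConfirmListener {
        override fun handleAck(deliveryTag: Long, multiple: Boolean) { /* message reached the exchange */ }
        override fun handleNack(deliveryTag: Long, multiple: Boolean) { /* broker could not accept the message */ }
    })
    // With mandatory = true, messages that reach the exchange but match no queue are returned here.
    channel.addReturnListener(ReturnListener { replyCode, replyText, exchange, routingKey, properties, body ->
        println("returned ($replyCode $replyText): exchange=$exchange routingKey=$routingKey")
    })
    channel.basicPublish("order.exchange", "order.created", true, null, "hello".toByteArray())

    // ---- Consumer side: manual ack with autoAck = false ----
    channel.basicConsume("order.queue", false, object : DefaultConsumer(channel) {
        override fun handleDelivery(consumerTag: String, envelope: Envelope,
                                    properties: AMQP.BasicProperties, body: ByteArray) {
            try {
                // ... business processing ...
                channel.basicAck(envelope.deliveryTag, false)
            } catch (e: Exception) {
                // requeue = false: let the message go to the DLX (if configured) instead of looping forever
                channel.basicNack(envelope.deliveryTag, false, false)
            }
        }
    })
}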

An important design goal of the lazy queue is to support longer queues, that is, to hold more messages. A lazy queue writes messages directly to the file system, whether they are persistent or not. Note that if non-persistent messages are stored in a lazy queue, memory usage stays stable, but those messages are lost after a restart.
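
For example, a classic queue can be declared lazy via the x-queue-mode argument; this is a minimal sketch assuming channel is an open Channel, and the queue name is made up.

val args = mapOf<String, Any>("x-queue-mode" to "lazy")
channel.queueDeclare("big.queue", true, false, false, args) // durable, non-exclusive, non-auto-delete, lazy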

In topic mode, a binding key such as test# is useless and cannot match keys like test.1. It has to be written as test.#, because wildcards only count as whole words separated by dots; presumably this is simply what the RabbitMQ specification requires.
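
A small sketch of the difference, assuming the queue and topic exchange (names are illustrative) already exist:

// "test#" would be treated as one literal word; "test.#" is a pattern matching "test.1", "test.a.b", ...
channel.queueBind("test.queue", "test.topic.exchange", "test.#")
channel.basicPublish("test.topic.exchange", "test.1", null, "msg".toByteArray())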

There does not seem to be much discussion online about whether to use one exchange or multiple exchanges in RabbitMQ (which suggests there is no significant performance difference between the two schemes). So we will not add complexity: we keep the simple model of one exchange corresponding to multiple queues, or split exchanges by business.

When creating a connection you can provide a ConnectionName, e.g. newConnection(connectionName). However, the connectionName seems to be only for humans to read (for example in the management console); it is not required to be unique.

Next, let’s talk about connection and channel.


Connection and channel

Why single these two out? Because RabbitMQ does not give us an out-of-the-box connection-reuse component. As we all know, creating and destroying connections is expensive, yet taking the official Java client SDK as an example, ConnectionFactory.newConnection() creates a new connection every time. The SDK has no built-in connection pool, so this work has to be handled separately.
The same two concepts also exist in NIO (if you are not familiar with NIO, see the blogger's earlier article on the Reactor pattern), and both carry the idea of connection reuse; perhaps RabbitMQ borrowed the idea from NIO.

What is the channel for? Data transmission is done over the connection, so why a channel? The purpose of the channel is to reuse the connection at another level and solve concurrent data transmission from multiple threads. If multiple threads transmit data directly over one connection at the same time, data frames can easily get interleaved. One connection can create multiple channels; message sending, receiving, routing and so on are bound to a channel rather than a connection, and each message carries a channel ID to identify it. Arguably these details could have been hidden from users, since exposing them adds complexity. The official documentation gives some notes on channel usage and caveats that need to be considered when coding.

As a rule of thumb, Applications should prefer using a Channel per thread instead of sharing the same Channel across multiple threads.
The official suggestion is that each thread use its own channel and that different threads do not share a channel; otherwise concurrent use easily produces interleaved data frames (the same problem as multiple threads sharing a connection directly), in which case the broker will simply close the underlying connection.
Channels consume resources, so it is not a good idea to have hundreds of channels open in a process at the same time. A classic anti-pattern to avoid is opening a channel for each published message. Channels are meant to be reasonably long-lived, and opening a new one is a network round trip, which makes this pattern extremely inefficient.

Consuming in one thread and publishing in another thread on a shared channel can be safe.

Multiple channels on one connection are dispatched by a java.util.concurrent.ExecutorService. We can set a custom executor with ConnectionFactory#setSharedExecutor.

On the consumer side, message acks need to be performed by the thread that received the delivery; otherwise a channel-level exception may be raised and the channel will be closed.
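
Putting these notes together, here is a minimal sketch of the "one channel per thread" guideline over a shared connection; the exchange name, routing key, pool sizes and connection name are all made up for illustration.

import com.rabbitmq.client.ConnectionFactory
import java.util.concurrent.Executors

fun channelPerThreadSketch() {
    val factory = ConnectionFactory().apply {
        host = "localhost"
        // Optional custom executor used to dispatch consumer callbacks for this connection's channels.
        setSharedExecutor(Executors.newFixedThreadPool(4))
    }
    val connection = factory.newConnection("channel-per-thread-demo")
    val workers = Executors.newFixedThreadPool(4)
    repeat(4) { worker ->
        workers.execute {
            // Each worker owns one long-lived channel; channels are never shared across threads.
            val channel = connection.createChannel()
            try {
                for (n in 1..100) {
                    channel.basicPublish("demo.exchange", "demo.key", null,
                        "worker-$worker message-$n".toByteArray())
                }
            } finally {
                channel.close()
            }
        }
    }
    // The connection itself stays open for the application's lifetime; close it (and the executor) on shutdown.
}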

Generally speaking, channels also need to be reused, though their number can be one or two orders of magnitude higher than the number of connections. We can design a simple connection-pool scheme: a PooledConnectionFactory acts as a connection container that keeps several long-lived connections available for external use, and each connection has its own PooledChannelFactory that maintains some long-lived channels. Spring Boot provides an AMQP component, Spring AMQP, which already implements a similar scheme and also hides the channel, but the original sin of Spring component design (code fragmentation, annotations flying everywhere) increases the component's own complexity and takes away control over the details. Of course, it also offers some other optional features. In fact we only need a simple connection pool, so let's implement it ourselves.


Simple connection pool implementation & use

Let's go straight to the code. First, define a connection configuration class:

import org.springframework.beans.factory.annotation.Value
import org.springframework.stereotype.Component

@Component
data class ConnectionConfig(
    @Value("\${rabbitmq.userName:guest}")
    var userName: String,
    @Value("\${rabbitmq.password:guest}")
    var password: String,
    @Value("\${rabbitmq.host:localhost}")
    var host: String,
    @Value("\${rabbitmq.port:5672}")
    var port: Int,
    @Value("\${rabbitmq.virtualHost:/}")
    var virtualHost: String
)

These items can be set in the configuration file, for example:
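
Assuming Spring Boot's usual property loading, the keys above map to entries like these in application.properties (the values shown are just the fallback defaults from the annotations):

rabbitmq.host=localhost
rabbitmq.port=5672
rabbitmq.userName=guest
rabbitmq.password=guest
rabbitmq.virtualHost=/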

The factory class holds a BlockingQueue that stores PooledConnection instances. PooledConnection wraps RabbitMQ's Connection; why we wrap it will be explained later.

import com.rabbitmq.client.ConnectionFactory
import org.slf4j.Logger
import org.slf4j.LoggerFactory
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.beans.factory.annotation.Value
import org.springframework.stereotype.Component
import java.io.IOException
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.TimeoutException
import java.util.concurrent.atomic.AtomicInteger

@Component
class PooledConnectionFactory(@Autowired private val connectionConfig: ConnectionConfig,
                              @Value("\${rabbitmq.maxConnectionCount:5}")
                              private val maxConnectionCount: Int) {
    private val _logger: Logger by lazy {
        LoggerFactory.getLogger(PooledConnectionFactory::class.java)
    }

    private val _connQueue = ArrayBlockingQueue<PooledConnection>(maxConnectionCount)

    // Number of connections created so far
    private val _connCreatedCount = AtomicInteger()

    private val _factory by lazy {
        buildConnectionFactory()
    }

    private fun buildConnectionFactory(): ConnectionFactory {
        val factory = ConnectionFactory()
        with(connectionConfig) {
            factory.username = userName
            factory.password = password
            factory.virtualHost = virtualHost
            factory.host = host
            factory.port = port
        }
        return factory
    }

    @Throws(IOException::class, TimeoutException::class)
    fun newConnection(): PooledConnection {
        var conn = _connQueue.poll()
        if (conn == null) {
            if (_connCreatedCount.getAndIncrement() < maxConnectionCount) {
                try {
                    conn = PooledConnection(_factory.newConnection(), _connQueue)
                } catch (e: Exception) {
                    _connCreatedCount.decrementAndGet()
                    _logger.error("Error creating RabbitMQ connection", e)
                    throw e
                }
            } else {
                _connCreatedCount.decrementAndGet()
                conn = _connQueue.take()
            }
        }
        return conn
    }
}

Note that the newConnection method uses an AtomicInteger to keep the created-connection count thread-safe.

Now let's look at PooledConnection, which implements the Closeable interface. The code is written in Kotlin; for Closeable, Kotlin provides the extension function use(), which automatically closes the receiver after the block finishes (whether or not an exception is thrown), similar to C#'s using(). We will see how to use it later.

import com.rabbitmq.client.Connection
import org.slf4j.Logger
import org.slf4j.LoggerFactory
import java.io.Closeable
import java.io.IOException
import java.util.concurrent.BlockingQueue

class PooledConnection(private val connection: Connection,
                       private val container: BlockingQueue<PooledConnection>) : Closeable {
    private val _logger: Logger by lazy {
        LoggerFactory.getLogger(PooledConnection::class.java)
    }

    override fun close() {
        val offered = container.offer(this)
        if (!offered) {
            val message = "RabbitMQ connection pool is full, unable to release the current connection"
            _logger.error(message)
            throw IOException(message)
        }
    }

    fun get() = connection
}

Note that the close() function does not actually close anything; it puts the connection back into the pool. If we used RabbitMQ's Connection directly, close() would really close it.
The get() function exposes the underlying RabbitMQ Connection for producers and consumers to use.

That’s all

Take the producer side as an example:

    /**
     * Send a message.
     *
     * @param data the data to be sent
     * @param exchange the name of the exchange to send to
     * @param routeKey the routing key used by the exchange to route the message to a queue
     */
    @Throws(IOException::class)
    fun send(data: Any, exchange: String, routeKey: String) = factory.newConnection().use {
        val conn = it.get()
        val channel = conn.createChannel()
        try {
            val properties = AMQP.BasicProperties.Builder()
                .contentType("application/json")
                .deliveryMode(2) // persistent message, to avoid loss before it is processed; the default is 1
                .build()
            channel.basicPublish(exchange, routeKey, properties, JSON.toJSONString(data).toByteArray())
        } catch (e: Exception) {
            logger.error(e.message, e)
            throw e
        } finally {
            channel.close()
        }
    }

So easy! Notice the use of use().

The channel pool can be implemented in a similar way.
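
Since the text only sketches the idea, here is a minimal sketch of what the PooledChannelFactory mentioned earlier might look like, mirroring the connection pool above; the PooledChannel class and its details are my own assumptions, not code from the original.

import com.rabbitmq.client.Channel
import com.rabbitmq.client.Connection
import java.io.Closeable
import java.io.IOException
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.BlockingQueue
import java.util.concurrent.atomic.AtomicInteger

class PooledChannel(private val channel: Channel,
                    private val container: BlockingQueue<PooledChannel>) : Closeable {
    // "Closing" returns the channel to the pool instead of closing the underlying channel.
    override fun close() {
        if (!container.offer(this)) {
            throw IOException("RabbitMQ channel pool is full, unable to release the current channel")
        }
    }

    fun get() = channel
}

class PooledChannelFactory(private val connection: Connection,
                           private val maxChannelCount: Int) {
    private val _channelQueue = ArrayBlockingQueue<PooledChannel>(maxChannelCount)
    private val _channelCreatedCount = AtomicInteger()

    fun newChannel(): PooledChannel {
        var ch = _channelQueue.poll()
        if (ch == null) {
            if (_channelCreatedCount.getAndIncrement() < maxChannelCount) {
                try {
                    ch = PooledChannel(connection.createChannel(), _channelQueue)
                } catch (e: Exception) {
                    _channelCreatedCount.decrementAndGet()
                    throw e
                }
            } else {
                _channelCreatedCount.decrementAndGet()
                ch = _channelQueue.take() // block until a channel is returned to the pool
            }
        }
        return ch
    }
}

Usage would mirror the connection pool: pooledChannelFactory.newChannel().use { it.get().basicPublish(...) }.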


Reference material

How to choose between RabbitMQ and Kafka?
The difference between Kafka and RabbitMQ
RabbitMQ publish/subscribe practice: implementation of a delayed retry queue