1. Basic theory of RocketMQ

1.1 development history

Alibaba's message middleware originated from the Colorful Stone project in 2001. Notify was born during this period and was used for the flow of core transaction messages.
In 2010, B2B began to use ActiveMQ as its message core on a large scale. With the rapid development of Alibaba's business, there was an urgent need for a message middleware that supported sequential messages and could accumulate massive numbers of messages. MetaQ 1.0 was born in 2011.
In 2012, MetaQ reached version 3.0, and the general-purpose message engine RocketMQ was abstracted from it. RocketMQ was subsequently open-sourced, and Alibaba's message middleware officially came into public view.
By 2015, RocketMQ had been through the baptism of Double 11 for several years and performed excellently in availability, reliability and stability. At the same time, cloud computing was becoming popular: Alibaba launched AliwareMQ 1.0 based on RocketMQ and began to provide message services to tens of thousands of Alibaba Cloud enterprises.
In 2016, MetaQ carried a flow of trillions of messages during Double 11, crossing a new milestone, and RocketMQ entered Apache incubation.

1.2 RocketMQ basic terms

1.2.1 message

Represents a message. A message is uniquely identified by its MessageId. When sending, the user can set a MessageKey to facilitate subsequent query and tracking.
A message must specify a topic, which works like a mail address. A message can also carry an optional tag so that consumers can filter messages by tag, as well as additional key-value pairs. For example, you may need a business key to locate a message on the broker when diagnosing problems during development.

1.2.2 topic

A topic divides messages by subject. Producers send messages to a specified topic, and consumers receive messages by subscribing to it. A topic has no strong coupling to senders or consumers: a sender can deliver messages to multiple topics at the same time, and a consumer can subscribe to messages from multiple topics. In RocketMQ, topic is a logical concept; message storage is not separated by topic.
A topic represents the first-level type of a message. For example, messages in an e-commerce system can be divided into transaction messages, logistics messages, and so on. A message must have a topic. A topic is the most fine-grained subscription unit: a consumer group can subscribe to messages from multiple topics.

1.2.3 queue

The physical management unit of messages. A topic can have multiple queues. The introduction of queues makes message storage distributed and clustered, giving it the ability to scale horizontally.
Topic and queue have a one-to-many relationship: a topic can contain multiple queues, which are mainly used for load balancing. When sending a message, the user only specifies the topic, and the producer selects which queue to send to according to the topic's routing information. When a consumer subscribes to a topic, it decides which queues to consume from according to the load-balancing policy.
RocketMQ is a disk-based message queue. For the same consumer group, a queue supports only one consuming thread, so too few queues will cause consumption to lag far behind message production. Therefore, in a real production environment, a topic is usually configured with multiple queues to support multiple consumers. Refer to the following figure:



In RocketMQ, all message queues are persistent data structures of unlimited length. "Unlimited length" means that each storage unit in the queue has a fixed size and is addressed by an offset, which is a Java long. In theory it will not overflow within 100 years, so the queue can be regarded as infinitely long. In addition, a queue only keeps the data of recent days; earlier data is deleted according to the expiration time.
You can think of a message queue as an array of unlimited length, with the offset as its subscript.
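As an illustration only (not RocketMQ's actual storage code), this mental model of a queue as an unbounded array indexed by a long offset can be sketched like this; the class and method names are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;

//A toy model of a message queue: an append-only sequence addressed by a long offset.
//Purely illustrative; RocketMQ's real queues are persistent index files over the commitlog.
public class ToyQueue {
    private final List<String> store = new ArrayList<>();
    private long baseOffset = 0; //offset of the first retained entry (older entries expired)

    public long append(String msg) {
        store.add(msg);
        return baseOffset + store.size() - 1; //the offset acts as the array subscript
    }

    public String read(long offset) {
        return store.get((int) (offset - baseOffset));
    }
}
```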

1.2.4 offset

When storing messages, RocketMQ generates a message index file for each queue under each topic. Each queue maintains an offset that records the number of messages currently in the queue.

1.2.5 tag

A tag can be regarded as a further refinement of a topic. Generally, messages with different purposes within the same business module are distinguished by introducing tags. A tag represents the second-level type of a message; for example, transaction messages can be divided into transaction-creation messages, transaction-completion messages, and so on.
RocketMQ provides this two-level message classification for easy and flexible control.
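To make the two-level classification concrete, here is a minimal, hypothetical sketch of how a filter on a tag expression like "TagA || TagB" behaves; the real filtering is performed by the RocketMQ broker and client, not by this code:

```java
import java.util.Arrays;
import java.util.List;

//Toy tag filter: a subscription expression such as "TagA || TagB" matches a message
//whose tag equals any of the listed tags; "*" matches everything.
public class TagFilter {
    public static boolean matches(String expression, String tag) {
        if ("*".equals(expression)) {
            return true;
        }
        List<String> wanted = Arrays.asList(expression.split("\\|\\|"));
        return wanted.stream().map(String::trim).anyMatch(t -> t.equals(tag));
    }
}
```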

1.2.6 NameServer basic concepts

The NameServer is the state server of the whole message queue cluster; through it, each component of the cluster can learn global information. Machines in all roles report their status to the NameServer periodically. If a machine fails to report before the timeout, the NameServer considers it out of service, and other components remove it from the available list.
The NameServer maintains this configuration and status information, and the other roles cooperate through it.
NameServers can be deployed independently of each other. Other roles report status information to multiple NameServer machines at the same time, achieving hot backup. The NameServer itself is stateless: the broker, topic and other status information held by a NameServer is not stored permanently, but is reported periodically by the various roles and kept in memory.

The NameServer is relatively stable for two reasons:
(1) NameServers are independent of each other and do not communicate with one another. If a single NameServer fails, the others are unaffected; even if all NameServers fail, business systems can keep running.
(2) The NameServer does not perform frequent reads and writes, so its performance overhead is very small and its stability is very high.

Significance of the NameServer:



Service discovery mechanism: when a service is requested, the client learns about all service instances from the registry. The client then uses a load-balancing algorithm to select one of the available service instances and sends the request to it.



Summary: the NameServer is an almost stateless node that can be deployed as a cluster, with no information synchronization between nodes.

1.2.7 broker

The broker is the core module of RocketMQ. It is responsible for receiving and storing messages and provides push/pull interfaces to deliver messages to consumers. A consumer can choose to read data from a master or a slave.
Brokers usually exist in the form of clusters. A broker cluster is composed of multiple master/slave groups, and there is no data interaction between the master nodes in the cluster.
The broker also provides a message query function: you can query messages by MessageId or MessageKey.
The broker synchronizes its topic configuration information to the NameServer in real time.
Broker deployment is relatively complex. Brokers are divided into masters and slaves. A master can correspond to multiple slaves, but a slave corresponds to only one master. The correspondence between master and slave is defined by giving them the same brokerName and different brokerIds: a brokerId of 0 indicates a master, and a non-zero brokerId indicates a slave. Multiple masters can also be deployed. Each broker establishes a long-lived connection with every node in the NameServer cluster and registers its topic information with all NameServers periodically.
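As an illustrative sketch of the brokerName/brokerId correspondence (addresses are placeholders; the property keys follow the standard broker.conf format), a master and its slave might be configured as:

```properties
# master node (brokerId = 0)
brokerClusterName=DefaultCluster
brokerName=broker-a
brokerId=0
namesrvAddr=192.168.0.1:9876;192.168.0.2:9876
brokerRole=ASYNC_MASTER

# slave node of the same group: same brokerName, non-zero brokerId
# brokerName=broker-a
# brokerId=1
# brokerRole=SLAVE
```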

1.2.8 producer

The message producer runs in the user's process. The producer obtains the routing information of all brokers from the NameServer, chooses which broker to send a message to according to the load-balancing strategy, and then calls the broker interface to submit the message.
The producer establishes a long-lived connection with one node in the NameServer cluster (randomly selected, but different from the last one), periodically fetches topic routing information from the NameServer, establishes long-lived connections to the masters that provide the topic's service, and sends heartbeats to those masters regularly.

1.2.9 producerGroup

A producer group is simply multiple producers that send the same type of message.

1.2.10 consumer

The message consumer also runs in the user's process. After obtaining the routing information of all brokers from the NameServer, the consumer sends pull requests to the brokers to fetch message data. A consumer can be started in two modes: broadcast and cluster. In broadcast mode, a message is delivered to every consumer; in cluster mode, a message is delivered to only one consumer.
The consumer establishes a long-lived connection with one node in the NameServer cluster (randomly selected, but different from the last one), periodically fetches topic routing information from the NameServer, establishes long-lived connections to the masters and slaves that provide the topic's service, and sends heartbeats to them regularly.

1.2.11 consumerGroup

A consumer group, similar to a producer group, consists of multiple consumer instances that consume the same type of message.

1.3 message sending

1.3.1 simplified process

The simplest path from sending to receiving a message involves a producer, a topic, and a consumer.



The producer first sends the message to the topic, and the consumer then fetches the message from the topic. The topic here is only a concept.

1.3.2 detailed process

Detailed message sending and receiving process:



The message is sent to the queue for marking:



1.3.3 three ways to send messages

Functionally, RocketMQ supports three ways to send messages: synchronous, asynchronous and one-way. Sequential messages only support synchronous sending. Let's briefly walk through the three ways to understand the differences between them.

Synchronous send (sync)
In synchronous mode, the send call returns only after the message has been completely sent, so this method carries the cost of waiting synchronously for the result.
This method has an internal retry mechanism: the client retries a certain number of times (2 by default; see DefaultMQProducer#getRetryTimesWhenSendFailed) before declaring the send a failure. The same message may therefore be delivered to the broker more than once, so the application developer needs to handle idempotency on the consumer side.

public class SyncProducer {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("producer_demo");
        //Specify the nameserver address; change it to your own.
        //Multiple addresses can be separated by ";"
        producer.setNamesrvAddr("127.0.0.1:9876");
        /*
         * Before using the producer object, you must call start() to initialize it, and only once.
         * Note: remember that you must not call the start method every time you send a message.
         */
        producer.start();
        for (int i = 0; i <= 100; i++) {
            /*
             * Build the message.
             * Topic: the topic the message belongs to.
             * Tags: can be understood as a reclassification of messages, so that consumers can specify filtering conditions on the MQ server.
             * Keys: the business key attribute of the message. Keep it globally unique as far as possible, so that you can query and resend the message through the management console when it cannot be received normally. Note: normal sending and receiving is not affected if it is not set.
             * Body: the body can be any binary data; MQ does not interpret it. Producer and consumer need to agree on the serialization and deserialization format.
             */
            Message msg = new Message("TopicTest", "TagA", "Keys", ("test rocketmq " + i).getBytes("UTF-8"));
            try {
                //Send a synchronous message
                SendResult sendResult = producer.send(msg);
                System.out.printf("%s%n", sendResult);
            } catch (Exception e) {
                //Sending failed and needs retry handling: resend the message, or persist the data for compensation.
                System.out.println(new Date() + " Send mq message failed. Topic is:" + msg.getTopic());
            }
        }
        producer.shutdown();
    }
}
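Because the retry mechanism can deliver the same message more than once, the consumer side needs to deduplicate. Below is a minimal, illustrative idempotency sketch keyed on the business key; class and method names are invented, and a real system would typically use a database or cache with expiry instead of an in-memory set:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

//Toy consumer-side deduplication: process a message only if its business key
//has not been seen before; Set.add() returns false for duplicates.
public class IdempotentConsumer {
    private final Set<String> processedKeys = ConcurrentHashMap.newKeySet();

    public boolean process(String businessKey) {
        if (!processedKeys.add(businessKey)) {
            return false; //duplicate delivery, skip
        }
        //... real business logic would run here ...
        return true;
    }
}
```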

Asynchronous send (async)
In asynchronous mode, the send call returns immediately; when the message has been completely sent, the callback SendCallback is invoked to inform the sender whether the send succeeded or failed. Asynchronous mode is usually used in response-time-sensitive business scenarios, where the caller cannot bear the cost of waiting synchronously for the result.
Like synchronous sending, asynchronous mode also retries internally, 2 times by default (DefaultMQProducer#getRetryTimesWhenSendAsyncFailed). The same message may be delivered to the broker more than once, so the application developer needs to handle idempotency on the consumer side.

public class AsynProducer {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("producer_demo");
        producer.setNamesrvAddr("127.0.0.1:9876");
        producer.start();
        for (int i = 0; i <= 100; i++) {
            Message msg = new Message("TopicTest", "TagA", "Keys", ("test rocketmq " + i).getBytes("UTF-8"));
            try {
                //Send the message asynchronously; the result is returned to the client through the callback.
                producer.send(msg, new SendCallback() {
                    @Override
                    public void onSuccess(SendResult sendResult) {
                        //Message sent successfully
                        System.out.println("success information:" + sendResult.toString());
                        System.out.println("send message success. regionId=" + sendResult.getRegionId() + ", msgId=" + sendResult.getMsgId());
                    }

                    @Override
                    public void onException(Throwable throwable) {
                        //Sending failed and needs retry handling: resend the message, or persist the data for compensation.
                        System.out.println("fail information:" + throwable.getMessage());
                    }
                });
            } catch (Exception e) {
                //Sending failed and needs retry handling: resend the message, or persist the data for compensation.
                System.out.println(new Date() + " Send mq message failed. Topic is:" + msg.getTopic());
            }
        }
        producer.shutdown();
    }
}

One-way send (oneway)
In one-way mode, the sender returns immediately after sending the message and does not wait for an ACK from the broker confirming delivery. This method has the highest throughput, but carries a risk of message loss, so it is suitable for unimportant messages such as log collection.

public class OneWayProducer {

    public static void main(String[] args) throws MQClientException, InterruptedException, UnsupportedEncodingException {
        DefaultMQProducer producer = new DefaultMQProducer("producer_demo");
        producer.setNamesrvAddr("127.0.0.1:9876");
        producer.start();

        for (int i = 0; i <= 10; i++) {
            Message msg = new Message("TopicTest", "TagA", "Keys", ("test rocketmq " + i).getBytes("UTF-8"));
            try {
                //There is no request-response handling in oneway mode: once sending fails, the data is lost because there is no retry.
                //If data must not be lost, use the reliable synchronous or reliable asynchronous mode instead.
                producer.sendOneway(msg);
            } catch (Exception e) {
                //Sending failed and needs retry handling: resend the message, or persist the data for compensation.
                System.out.println(new Date() + " Send mq message failed. Topic is:" + msg.getTopic());
            }
        }
        producer.shutdown();
    }
}

1.4 message storage

Topic is a logical concept. In fact, messages are recorded in the form of queues on each broker.



The following conclusions can be drawn from the picture above:
(1) Messages sent by producers are recorded in queues on the broker
(2) The data of one topic may exist on multiple brokers
(3) One broker can hold multiple queues
In other words, each topic is divided into several logical queues on the brokers. Each logical queue holds part of the message data; however, what it stores is not the real message data but a message index pointing into the commitlog.

1.5 message consumption

1.5.1 broadcast consumption

In broadcast consumption, a message is consumed by every consumer: even if the consumers belong to the same consumer group, each consumer in the group consumes the message once. At the level of message distribution, the concept of consumerGroup can be considered meaningless in broadcast consumption. It is applicable to scenarios where a message needs to be fanned out: for example, when an order is placed successfully and both the financial system and the customer-service system must be notified. The consumption mode can be set to broadcast by modifying the messageModel on the consumer: consumer.setMessageModel(MessageModel.BROADCASTING)

1.5.2 cluster consumption

In cluster consumption, the consumer instances in a consumerGroup evenly share the messages sent by producers. For example, if a topic has nine messages and a consumer group has three instances (possibly three processes or three machines), each instance consumes only three of them. If the consumer does not specify a consumption mode, cluster consumption is used by default, which suits most message-consuming businesses.
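The even split described above can be sketched as a toy modulo-based allocation of queue indices among consumer instances. This is only an illustration under simplified assumptions; RocketMQ's real rebalancing strategies (such as AllocateMessageQueueAveragely) assign contiguous ranges and handle remainders differently:

```java
import java.util.ArrayList;
import java.util.List;

//Toy allocation: queue q goes to consumer (q % consumerCount), so with 9 queues
//and 3 instances each instance ends up consuming exactly 3 queues.
public class AverageAllocator {
    public static List<Integer> allocate(int queueCount, int consumerCount, int consumerIndex) {
        List<Integer> result = new ArrayList<>();
        for (int q = 0; q < queueCount; q++) {
            if (q % consumerCount == consumerIndex) {
                result.add(q);
            }
        }
        return result;
    }
}
```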

1.6 network architecture




Description of several roles in the figure above:
(1) NameServer: the naming server (or registry) of the RocketMQ cluster. It is stateless (in fact the data on each NameServer instance may be temporarily inconsistent, but regular updates keep it consistent most of the time) and manages the metadata of the cluster (for example, KV configuration and the registration information of topics and brokers).
(2) Broker (master): the master node of the RocketMQ message broker server. It receives messages sent by producers, serves consumption requests from consumers, and stores messages on disk;
(3) Broker (slave): the backup node of the RocketMQ message broker server. It synchronizes the master's messages in synchronous or asynchronous mode for backup, ensuring the high availability of the RocketMQ cluster;
(4) Producer (message producer): an ordinary message producer, which sends messages to the master nodes based on the RocketMQ client module.
The communication links in the figure above:
(1) Producer and NameServer: each producer establishes a TCP connection with one instance in the NameServer cluster and pulls topic routing information from that instance;
(2) Producer and broker: the producer establishes TCP connections with the master brokers associated with the topics it sends to, which are used for delivering messages and for periodic heartbeats;
(3) Broker and NameServer: a broker (master or slave) establishes a TCP connection with every NameServer instance. When a broker starts, it registers its configured topic information with every machine in the NameServer cluster.
That is, every NameServer holds the broker's topic routing configuration. There is no connection between masters, while each master and its slaves are connected;

2. Going deeper into RocketMQ

2.1 load balancing of sending messages



Messages are sent to the queues by polling, so each queue receives an average number of messages. By adding machines, queue capacity can be expanded horizontally. You can also select which queue to send to in a customized way. Note: multiple queues can be deployed on one machine or spread across multiple machines.

2.1.1 algorithm for selecting queue when sending message

There are two types. One is to simply send the message: the client selects the queue with a built-in algorithm that the caller cannot change. The other allows a custom queue-selection algorithm (three algorithms are built in; if none of them fits, you can implement your own).

public class org.apache.rocketmq.client.producer.DefaultMQProducer {
    //Only the messages are passed in; the queue is selected by the default algorithm
    public SendResult send(Collection<Message> msgs) {}
    //The queue to send to is chosen by a custom algorithm
    public SendResult send(Collection<Message> msgs, MessageQueue messageQueue) {}
}

send(msg, mq) usage scenario

Sometimes we don't want the default queue-selection algorithm but need to customize it. The most common scenario is sequential messages: sequential sending generally requires that messages with a certain characteristic go to the same queue, which guarantees ordering, because a single queue is ordered.

Principle analysis

Three algorithms are built in: SelectMessageQueueByRandom, SelectMessageQueueByHash and SelectMessageQueueByMachineRoom. They all implement a common interface, org.apache.rocketmq.client.producer.MessageQueueSelector. If you want custom logic, implement the interface and override the select method.
This is a typical strategy pattern: different algorithms have different implementation classes under one top-level interface.


public class SelectMessageQueueByRandom implements MessageQueueSelector {
    private Random random = new Random(System.currentTimeMillis());

    @Override
    public MessageQueue select(List<MessageQueue> mqs, Message msg, Object arg) {
        //mqs.size(): the number of queues. If there are 4 queues, the value is a random number between 0 and 3.
        int value = random.nextInt(mqs.size());
        return mqs.get(value);
    }
}


public class SelectMessageQueueByHash implements MessageQueueSelector {
    @Override
    public MessageQueue select(List<MessageQueue> mqs, Message msg, Object arg) {
        int value = arg.hashCode();
        //Take the absolute value to guard against negative hash codes, a point worth noting in everyday development
        if (value < 0) {
            value = Math.abs(value);
        }
        //Modulo by the number of queues
        value = value % mqs.size();
        return mqs.get(value);
    }
}
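As a standalone illustration of the hash strategy (the class name and sharding keys are hypothetical, not the real client classes), the point is that the same sharding key, such as an order ID, always lands on the same queue index, which is what makes it useful for sequential messages:

```java
//Toy version of hash-based queue selection: the same sharding key
//always maps to the same queue index, so messages for one order stay ordered.
public class HashQueueSelect {
    public static int select(Object shardingKey, int queueCount) {
        int value = shardingKey.hashCode();
        if (value < 0) {
            value = Math.abs(value); //guard against negative hash codes
        }
        return value % queueCount;
    }
}
```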


public class SelectMessageQueueByMachineRoom implements MessageQueueSelector {
    private Set<String> consumeridcs;

    @Override
    public MessageQueue select(List<MessageQueue> mqs, Message msg, Object arg) {
        return null;
    }

    public Set<String> getConsumeridcs() {
        return consumeridcs;
    }

    public void setConsumeridcs(Set<String> consumeridcs) {
        this.consumeridcs = consumeridcs;
    }
}

Custom algorithm

public class MySelectMessageQueue implements MessageQueueSelector {
    @Override
    public MessageQueue select(List<MessageQueue> mqs, Message msg, Object arg) {
        return mqs.get(0);
    }
}

send(msg) usage scenario

This is generally used when there is no special need, because the default queue-selection algorithm is very good and already covers various optimization scenarios for us. We can call it the "random increasing modulo" algorithm.

Principle analysis

// {@link org.apache.rocketmq.client.impl.producer.DefaultMQProducerImpl#sendDefaultImpl}
//This is the core of message sending

//Select the queue to send the message to
MessageQueue mq = null;
for (int times = 0; times < 3; times++) {
    //Null on the first pass
    String lastBrokerName = null == mq ? null : mq.getBrokerName();
    //Select a queue with the following method
    MessageQueue mqSelected = this.selectOneMessageQueue(topicPublishInfo, lastBrokerName);
    if (mqSelected != null) {
        //Assign to mq: if the first attempt fails, mq already has a value on the next retry (the next loop iteration)
        mq = mqSelected;
        // ... the message is actually sent here ...
        //It is crucial to be able to answer the following two questions:
        //1. When are entries put into faultItemTable?
        //2. Why can isAvailable() judge whether a broker is available only after a period of time?
        this.updateFaultItem(mq.getBrokerName(), endTimestamp - beginTimestampPrev, false);
    }
}

The main entry for selecting a queue

public MessageQueue selectOneMessageQueue(final TopicPublishInfo tpInfo, final String lastBrokerName) {
    //The default is false, meaning broker fault delay is not enabled
    if (this.sendLatencyFaultEnable) {
        try {
            //Take the thread-local number and increment it
            int index = tpInfo.getSendWhichQueue().getAndIncrement();
            for (int i = 0; i < tpInfo.getMessageQueueList().size(); i++) {
                //(number + i) % queue size
                int pos = Math.abs(index++) % tpInfo.getMessageQueueList().size();
                if (pos < 0) {
                    pos = 0;
                }
                MessageQueue mq = tpInfo.getMessageQueueList().get(pos);
                //Check whether the broker this queue belongs to is available
                if (latencyFaultTolerance.isAvailable(mq.getBrokerName())) {
                    //If this is not a retry, return the queue directly;
                    //if it is a retry and the selected queue belongs to the same broker as last time, return it as well.
                    if (null == lastBrokerName || mq.getBrokerName().equals(lastBrokerName)) {
                        return mq;
                    }
                }
            }
            //If no queue is available, pick a relatively good broker, regardless of queue availability
            final String notBestBroker = latencyFaultTolerance.pickOneAtLeast();
            int writeQueueNums = tpInfo.getQueueIdByBroker(notBestBroker);
            if (writeQueueNums > 0) {
                final MessageQueue mq = tpInfo.selectOneMessageQueue();
                if (notBestBroker != null) {
                    mq.setBrokerName(notBestBroker);
                    mq.setQueueId(tpInfo.getSendWhichQueue().getAndIncrement() % writeQueueNums);
                }
                return mq;
            } else {
                latencyFaultTolerance.remove(notBestBroker);
            }
        } catch (Exception e) {
            log.error("Error occurred when selecting message queue", e);
        }
        //Fall back to randomly selecting a queue
        return tpInfo.selectOneMessageQueue();
    }
    //When sendLatencyFaultEnable == false (the default), select the queue with this method
    return tpInfo.selectOneMessageQueue(lastBrokerName);
}

Do not enable broker fault delay

Since sendLatencyFaultEnable is false by default, let's look at the logic when sendLatencyFaultEnable == false.

public MessageQueue selectOneMessageQueue(final String lastBrokerName) {
    //Null the first time; not null on the second attempt (i.e. when retrying).
    if (lastBrokerName == null) {
        //Queue-selection logic for the first attempt
        return selectOneMessageQueue();
    } else {
        //The first send failed; queue-selection logic for the retry
        int index = this.sendWhichQueue.getAndIncrement();
        for (int i = 0; i < this.messageQueueList.size(); i++) {
            int pos = Math.abs(index++) % this.messageQueueList.size();
            if (pos < 0)
                pos = 0;
            MessageQueue mq = this.messageQueueList.get(pos);
            //Filter out the broker whose send failed last time
            if (!mq.getBrokerName().equals(lastBrokerName)) {
                return mq;
            }
        }
        return selectOneMessageQueue();
    }
}

Now let's look at the queue-selection logic for the first attempt.

public MessageQueue selectOneMessageQueue() {
    //Each thread holds a ThreadLocal variable that stores a random number
    // {@link org.apache.rocketmq.client.common.ThreadLocalIndex#getAndIncrement}
    //Take out the number, increment it, and take it modulo the queue count
    int index = this.sendWhichQueue.getAndIncrement();
    int pos = Math.abs(index) % this.messageQueueList.size();
    if (pos < 0) {
        pos = 0;
    }
    return this.messageQueueList.get(pos);
}

This is still somewhat random, but the highlight is that the stored number is incremented on every pick (getAndIncrement is a CAS +1) before being taken modulo the queue length.

When the first send fails, lastBrokerName holds the broker that was just selected and failed (mq = mqSelected). On retry, lastBrokerName has a value, meaning the previously selected broker failed. The sendWhichQueue thread-local value is incremented, and the message queues are traversed until one that is not on the last broker is found, thereby avoiding the broker whose send failed last time.

For example, if the number is 1 and the queue length is 4, then 1 % 4 = 1. If that send fails and a retry begins, the number has already been bumped from 1 to 2 by the previous pick, so the retry computes 2 % 4 = 2 and the failed broker is filtered out directly.
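The increment-then-modulo retry behavior can be sketched independently of the real client classes (the class, queue layout and broker names here are all hypothetical):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

//Toy model of retry queue selection: a counter is incremented on every pick,
//and on retry any queue belonging to the broker that just failed is skipped.
public class RetrySelect {
    private final AtomicInteger sendWhichQueue = new AtomicInteger(0);

    //queueBrokers.get(i) is the broker name that queue i lives on
    public String select(List<String> queueBrokers, String lastBrokerName) {
        int index = sendWhichQueue.getAndIncrement();
        for (int i = 0; i < queueBrokers.size(); i++) {
            int pos = Math.abs(index++) % queueBrokers.size();
            String broker = queueBrokers.get(pos);
            if (lastBrokerName == null || !broker.equals(lastBrokerName)) {
                return broker;
            }
        }
        //Every queue is on the failed broker: fall back to plain modulo selection
        return queueBrokers.get(Math.abs(sendWhichQueue.getAndIncrement()) % queueBrokers.size());
    }
}
```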

Now let's look at the queue-selection logic for the retry.

// +1
int index = this.sendWhichQueue.getAndIncrement();
for (int i = 0; i < this.messageQueueList.size(); i++) {
    //Modulo
    int pos = Math.abs(index++) % this.messageQueueList.size();
    if (pos < 0)
        pos = 0;
    MessageQueue mq = this.messageQueueList.get(pos);
    //Filter out the broker whose send failed last time
    if (!mq.getBrokerName().equals(lastBrokerName)) {
        return mq;
    }
}
//If no usable queue is found, fall back to the default selection
return selectOneMessageQueue();

Enable broker failure delay

That is, the logic in the following if

if (this.sendLatencyFaultEnable) {

First compute (number + 1) % queue size, then check whether the broker of the selected queue is available. If it is available and this is not a retry (or, on a retry, the queue belongs to the same broker as last time), the queue is returned directly. So how does it decide whether a broker is available?

// {@link org.apache.rocketmq.client.latency.LatencyFaultToleranceImpl#isAvailable(String)}
public boolean isAvailable(final String name) {
    final FaultItem faultItem = this.faultItemTable.get(name);
    if (faultItem != null) {
        return faultItem.isAvailable();
    }
    return true;
}

// {@link org.apache.rocketmq.client.latency.LatencyFaultToleranceImpl.FaultItem#isAvailable()}
public boolean isAvailable() {
    return (System.currentTimeMillis() - startTimestamp) >= 0;
}

So when are entries put into faultItemTable, and why can isAvailable() tell whether a broker is available only after a period of time? The answer lies in the method called after a message is sent, shown above:

// {@link org.apache.rocketmq.client.impl.producer.DefaultMQProducerImpl#updateFaultItem}
//Send start time
beginTimestampPrev = System.currentTimeMillis();
sendResult = this.sendKernelImpl(msg, mq, communicationMode, sendCallback, topicPublishInfo, timeout);
//Send end time
endTimestamp = System.currentTimeMillis();
//Update broker latency
this.updateFaultItem(mq.getBrokerName(), endTimestamp - beginTimestampPrev, false);

The detailed logic is as follows:

// {@link org.apache.rocketmq.client.latency.MQFaultStrategy#updateFaultItem}
public void updateFaultItem(final String brokerName, final long currentLatency, boolean isolation) {
    if (this.sendLatencyFaultEnable) {
        //isolation is false here, and currentLatency is the time the send took, as in:
        // this.updateFaultItem(mq.getBrokerName(), endTimestamp - beginTimestampPrev, false);
        long duration = computeNotAvailableDuration(isolation ? 30000 : currentLatency);
        this.latencyFaultTolerance.updateFaultItem(brokerName, currentLatency, duration);
    }
}

private long[] latencyMax = {50L, 100L, 550L, 1000L, 2000L, 3000L, 15000L};
private long[] notAvailableDuration = {0L, 0L, 30000L, 60000L, 120000L, 180000L, 600000L};

//Compare the send latency against the latency levels in latencyMax and pick the matching
//entry from notAvailableDuration in MQFaultStrategy; the broker is then recorded in the
//faultItemTable with that unavailable duration.
private long computeNotAvailableDuration(final long currentLatency) {
    for (int i = latencyMax.length - 1; i >= 0; i--) {
        //E.g. if currentLatency is 10 ms, no threshold in latencyMax matches,
        //so the loop falls through and 0 is returned
        if (currentLatency >= latencyMax[i])
            return this.notAvailableDuration[i];
    }
    return 0;
}

// {@link org.apache.rocketmq.client.latency.LatencyFaultToleranceImpl#updateFaultItem()}
//Essentially this assigns startTimestamp = current time + notAvailableDuration, which isAvailable() checks;
//that is, isAvailable() returns true right away only when notAvailableDuration == 0.
public void updateFaultItem(final String name, final long currentLatency, final long notAvailableDuration) {
    FaultItem old = this.faultItemTable.get(name);
    if (null == old) {
        final FaultItem faultItem = new FaultItem(name);
        faultItem.setStartTimestamp(System.currentTimeMillis() + notAvailableDuration);

        old = this.faultItemTable.putIfAbsent(name, faultItem);
        if (old != null) {
            old.setStartTimestamp(System.currentTimeMillis() + notAvailableDuration);
        }
    } else {
        old.setStartTimestamp(System.currentTimeMillis() + notAvailableDuration);
    }
}

Rocketmq predicts an available time for each broker (current time + notAvailableDuration); once the current time passes that point, the broker is considered available again. The entries of notAvailableDuration correspond one-to-one to the latency intervals in latencyMax, so a broker's next available time is predicted from the observed currentLatency. With that in mind, look at this again:

public boolean isAvailable() {
    return (System.currentTimeMillis() - startTimestamp) >= 0;
}

Judged by the send latency: anything below 550 ms falls into an interval whose notAvailableDuration is 0, so the broker remains available. Above that, the unavailable window grows with the latency; the broker is treated as a suboptimal choice and is skipped for that window.

Summary

When the fault delay mechanism is disabled, queues are selected round-robin for sending; on a retry, queues on the broker that just failed are filtered out
When the fault delay mechanism is enabled, a broker's availability is predicted through rocketmq's latency-based prediction mechanism
If the broker that failed last time has become available again, its queues can still be selected
If none of the above yields a queue, one is chosen at random for sending
Every send records the call duration and whether an error occurred, and from that latency the broker's next available time is predicted
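The mechanism above can be condensed into a small runnable sketch (an assumed simplification, not the real LatencyFaultToleranceImpl / MQFaultStrategy classes): each broker gets a "next available" timestamp computed from its last observed send latency.

```java
import java.util.concurrent.ConcurrentHashMap;

public class LatencySketch {
    static final long[] LATENCY_MAX = {50L, 100L, 550L, 1000L, 2000L, 3000L, 15000L};
    static final long[] NOT_AVAILABLE_DURATION = {0L, 0L, 30000L, 60000L, 120000L, 180000L, 600000L};

    // brokerName -> timestamp after which the broker is considered available again
    final ConcurrentHashMap<String, Long> startTimestamp = new ConcurrentHashMap<>();

    static long computeNotAvailableDuration(long currentLatency) {
        for (int i = LATENCY_MAX.length - 1; i >= 0; i--) {
            if (currentLatency >= LATENCY_MAX[i])
                return NOT_AVAILABLE_DURATION[i];
        }
        return 0; // latency below 550 ms: no penalty
    }

    void updateFaultItem(String broker, long currentLatency) {
        startTimestamp.put(broker, System.currentTimeMillis() + computeNotAvailableDuration(currentLatency));
    }

    boolean isAvailable(String broker) {
        Long ts = startTimestamp.get(broker);
        return ts == null || System.currentTimeMillis() - ts >= 0;
    }
}
```

A slow send (say 2.5 s) pushes the broker's available time two minutes into the future, while a fast send leaves it immediately selectable.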

2.2 message storage

2.2.1 storage model



The hierarchical structure of the rocketmq file storage model is shown in the figure above. By category and function, the conceptual model can be roughly divided into five layers, analyzed one by one below:
(1) Rocketmq business processor layer: the business logic entry for reading and writing messages on the broker side. This layer mainly covers business-related processing (the concrete operation type is resolved from the requestCode in RemotingCommand, and the corresponding processing flow is then executed), such as pre-checks and validation, constructing the MessageExtBrokerInner object, decoding and deserialization, and building the response object;
(2) Rocketmq data storage component layer: this layer is mainly DefaultMessageStore, the storage core class of rocketmq and the access entry for message data files. Its putMessage() and getMessage() methods read and write the commitlog files in which messages are stored (the concrete read/write access still relies on the CommitLog model in the next layer). In addition, when this component is initialized, many storage-related background service threads are started, including AllocateMappedFileService (MappedFile pre-allocation service thread), ReputMessageService (message dispatch/replay service thread), HAService (broker master-slave high availability service thread), StoreStatsService (storage statistics service thread) and IndexService (index file service thread);
(3) Rocketmq storage logical object layer: this layer mainly contains the three model classes directly related to data file storage: IndexFile, ConsumeQueue and CommitLog. IndexFile provides access to the index files, ConsumeQueue provides access to the logical consumption queues, and CommitLog provides access to the files in which messages are stored. Together these three model classes form the overall structure of the rocketmq storage layer (they are analyzed in depth later);
(4) Encapsulated file memory mapping layer: rocketmq mainly uses MappedByteBuffer and FileChannel from JDK NIO to read and write data files. MappedByteBuffer, which memory-maps disk files, is used for reading and writing large files and is wrapped in rocketmq as the MappedFile class; its size limitation was mentioned above. Each kind of large file (IndexFile / ConsumeQueue / CommitLog) is split into multiple fixed-size segments during storage (a single IndexFile is about 400MB, a single ConsumeQueue file about 5.72MB, and a single CommitLog file 1GB). The file name of each segment is its starting byte offset within the whole logical file, i.e. the total size of all preceding segments, which is how the segments concatenate into one large file. The MappedFile class provides the read/write services for each individual segment (sequential write, random read, flushing memory to disk, memory cleanup and other file-related services);
(5) Disk storage layer: mainly the disks on which the rocketmq server is deployed. Here the effects of different disk types (such as SSD versus ordinary HDD) and disk performance parameters (such as IOPS, throughput and access latency) on sequential write / random read operations need to be considered.
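The segment-naming convention in layer (4) can be sketched as follows (a hypothetical helper modeled on rocketmq's UtilAll.offset2FileName; the constant and method names here are illustrative): a segment's file name is just its 20-digit, zero-padded starting offset, so the segments stitch together into one logical large file.

```java
public class Offset2FileName {
    static final long COMMITLOG_FILE_SIZE = 1024L * 1024 * 1024; // 1 GB per commitlog segment

    // A segment file's name is its starting offset, zero-padded to 20 digits.
    static String offset2FileName(long offset) {
        return String.format("%020d", offset);
    }

    // Which segment file a given global offset lives in.
    static String fileForOffset(long globalOffset) {
        long fileStart = (globalOffset / COMMITLOG_FILE_SIZE) * COMMITLOG_FILE_SIZE;
        return offset2FileName(fileStart);
    }
}
```

Given a global offset, the owning segment and the offset inside it are therefore recoverable by simple arithmetic, without any index.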

2.2.2 storage process



(1) Types and disadvantages of rocketmq message storage structure
The above figure shows the overall message storage architecture of rocketmq. Rocketmq adopts a hybrid storage structure: all queues under a single broker instance share one log data file (the commitlog). Kafka, by contrast, uses an independent storage structure with one file per partition. In the author's view, the drawback of the hybrid structure is that reads become more random and therefore less efficient; in addition, consuming messages depends on the consumequeue, and building this logical consumption queue carries some overhead.

(2) In depth analysis of rocketmq message storage architecture
As the overall architecture diagram above shows, rocketmq's hybrid storage structure keeps data and index parts separate for producer and consumer respectively. The producer sends messages to the broker, and the broker flushes the messages to disk either synchronously or asynchronously, persisting them to the commitlog. As long as a message has been persisted to the commitlog file on disk, it will not be lost, so the consumer is guaranteed an opportunity to consume it; it does not matter if consumption lags a little. Even if the consumer fails to fetch the message on its first pull, the broker can hold the pull request through the long-polling mechanism and serve it again after a short delay.
Concretely, rocketmq uses a background service thread on the broker side, ReputMessageService, to continuously dispatch requests and asynchronously build the ConsumeQueue (logical consumption queue) and IndexFile (index file) data. The consumer can then locate the messages to consume through the ConsumeQueue. The ConsumeQueue, acting as the index for consumption, stores for the queue messages under a given topic the starting physical offset in the commitlog, the message size, and the hash code of the message tag. The IndexFile only serves message lookup by key or by time range (this lookup path does not affect the main flow of sending and consuming messages).

(3) Pagecache and MMAP memory mapping
Here, it is necessary to briefly introduce the concept of page cache. The operating system implements all file I / O requests of the system through the page cache mechanism. For the operating system, disk files are composed of a series of data blocks in order. The size of data blocks is determined by the operating system itself. A standard page size in x86 Linux is 4KB.
When processing file I / O requests, the operating system kernel first looks in the page cache (each data block in the page cache is set with file and offset address information). If it misses, it starts disk I / O, loads the data block in the disk file into a free block in the page cache, and then copies it into the user buffer.
The page cache itself will also pre read the data files. For the first read request operation of each file, the system will read the requested page and a few subsequent pages at the same time. Therefore, in order to improve the hit rate of page cache (try to keep the accessed pages in physical memory), from the perspective of hardware, the larger the physical memory, the better. From the operating system level, when accessing the page cache, even if only 1K messages are accessed, the system will pre read more data in advance. The next time the message is read, it is likely to hit the memory.
In rocketmq, the ConsumeQueue stores little data and is read sequentially. With the read-ahead of the page cache mechanism, ConsumeQueue reads perform close to memory speed, and performance holds up even when messages pile up. For the commitlog files that store the message bodies, reading message content involves more random access, which can seriously hurt performance; choosing an appropriate system IO scheduling algorithm, for example setting it to "NOOP" when the block storage is SSD, also improves random read performance.
In addition, rocketmq reads and writes files mainly through MappedByteBuffer, using the FileChannel model in NIO to map the physical file on disk directly into user-space memory addresses (this mmap approach removes the cost of traditional IO copying file data back and forth between buffers in kernel address space and buffers in user application address space), turning file operations into direct memory operations and greatly improving read/write efficiency. Note that memory mapping with MappedByteBuffer has several limitations, one being that only about 1.5 ~ 2GB of a file can be mapped into user-space virtual memory at a time, which is why rocketmq sets a single commitlog data file to 1GB by default.
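The MappedByteBuffer technique can be illustrated with a minimal, self-contained JDK NIO example (plain JDK code, not rocketmq source): the file is mapped once, and subsequent reads and writes are plain memory operations rather than read()/write() system calls with extra buffer copies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    // Maps a temp file into memory, writes the payload through the mapping,
    // flushes, and reads it back straight from the mapped memory.
    public static String writeAndReadBack(String payload) {
        try {
            Path file = Files.createTempFile("mmap-demo", ".dat");
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
                buf.put(bytes);   // sequential write into the mapping
                buf.force();      // flush dirty pages to disk (like rocketmq's flush service)
                buf.position(0);
                byte[] back = new byte[bytes.length];
                buf.get(back);    // read directly from the mapped region
                return new String(back, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```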

2.3 six load balancing algorithms for rocketmq consumer message queuing

When rocketmq is started, the load balancing thread will be started. The process is as follows:

//Click into MQClientInstance.start() above; RebalanceService extends ServiceThread,
//and ServiceThread implements the Runnable interface
//Going one level deeper, MQClientInstance.doRebalance() leads to the following
//..click in layer by layer until you reach the RebalanceImpl#rebalanceByTopic method
AllocateMessageQueueStrategy strategy = this.allocateMessageQueueStrategy;

AllocateMessageQueueStrategy is the interface behind the consumer message queue load balancing algorithms.
There are six implementations of this interface in rocketmq-4.3.0:
AllocateMessageQueueAveragely: averaging algorithm
AllocateMessageQueueAveragelyByCircle: circular averaging algorithm
AllocateMessageQueueByConfig: load balancing according to configuration
AllocateMessageQueueByMachineRoom: machine room load balancing algorithm
AllocateMessageQueueConsistentHash: consistent hash load balancing algorithm
AllocateMachineRoomNearby: nearby machine room strategy
When the client does not specify one, rocketmq uses the AllocateMessageQueueAveragely averaging algorithm by default.

2.3.1 AllocateMessageQueueAveragely average load balancing algorithm

As the name suggests, the averaging algorithm divides the queues evenly. The method has four parameters: consumerGroup (consumer group name), currentCID (current consumer ID), mqAll (all message queues under the current topic) and cidAll (all consumer IDs under the current consumer group). The idea is to compute an average share and then assign each consumer a contiguous block of queues. Assuming the queue size is 8 (numbered 0-7) and there are 3 consumers (numbered 0-2), the allocation result is: consumer 0: queues 0, 1, 2; consumer 1: queues 3, 4, 5; consumer 2: queues 6, 7.
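A runnable sketch of this averaging logic, modeled on AllocateMessageQueueAveragely (simplified: queues are represented as plain integers, and the real class's null checks are omitted):

```java
import java.util.ArrayList;
import java.util.List;

public class AverageAllocate {
    static List<Integer> allocate(List<Integer> mqAll, List<String> cidAll, String currentCID) {
        List<Integer> result = new ArrayList<>();
        int index = cidAll.indexOf(currentCID);
        if (index < 0) return result;
        int mod = mqAll.size() % cidAll.size();
        // The first `mod` consumers each take one extra queue
        int averageSize = mqAll.size() <= cidAll.size() ? 1
                : (mod > 0 && index < mod ? mqAll.size() / cidAll.size() + 1
                                          : mqAll.size() / cidAll.size());
        int startIndex = (mod > 0 && index < mod) ? index * averageSize : index * averageSize + mod;
        int range = Math.min(averageSize, mqAll.size() - startIndex);
        for (int i = 0; i < range; i++)
            result.add(mqAll.get((startIndex + i) % mqAll.size()));
        return result;
    }
}
```

With 8 queues and 3 consumers this reproduces the contiguous split described above.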

2.3.2 AllocateMessageQueueAveragelyByCircle circular average allocation algorithm

Circular allocation can be pictured as all consumers forming a ring and the queues being dealt around that ring in turn. Where AllocateMessageQueueAveragely assigns contiguous queues, the circular strategy assigns interleaved ones. The core code is a single for loop and easy to follow. Suppose there are 8 MQs and 3 consumers; after allocation the result is {0, 3, 6}, {1, 4, 7}, {2, 5}.
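The circular strategy reduces to a single stride loop; a sketch modeled on AllocateMessageQueueAveragelyByCircle (queues again shown as plain integers):

```java
import java.util.ArrayList;
import java.util.List;

public class CircleAllocate {
    static List<Integer> allocate(List<Integer> mqAll, List<String> cidAll, String currentCID) {
        List<Integer> result = new ArrayList<>();
        int index = cidAll.indexOf(currentCID);
        if (index < 0) return result;
        // Step around the queue list with a stride equal to the number of consumers
        for (int i = index; i < mqAll.size(); i += cidAll.size())
            result.add(mqAll.get(i));
        return result;
    }
}
```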



2.3.3 AllocateMessageQueueByMachineRoom machine room allocation algorithm

Machine room allocation: first find the valid machine room information (i.e. the message queues) according to the broker names in the MQs, then divide them equally. The logic computes a quotient and a remainder. The difference from AllocateMessageQueueAveragely is that every consumer first receives the quotient's worth of message queues, and the remainder is then handed out one by one starting from the first consumer. Suppose there are 8 MQs and 3 consumers: the quotient is 2 and the remainder is 2. Each consumer first receives two MQs, {0, 1}, {2, 3}, {4, 5}, and the remaining 2 are then distributed from the start, giving {0, 1, 6}, {2, 3, 7}, {4, 5}.
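The quotient-plus-remainder hand-out described above can be sketched as follows (an assumed simplification of the per-room step; the full AllocateMessageQueueByMachineRoom's room-filtering logic is omitted):

```java
import java.util.ArrayList;
import java.util.List;

public class RoomAllocate {
    static List<Integer> allocate(int queueCount, int consumerCount, int consumerIndex) {
        List<Integer> result = new ArrayList<>();
        int avg = queueCount / consumerCount;          // quotient: everyone's base share
        for (int i = 0; i < avg; i++)
            result.add(consumerIndex * avg + i);
        int remainderStart = avg * consumerCount;      // leftover queues start here
        if (remainderStart + consumerIndex < queueCount)
            result.add(remainderStart + consumerIndex); // hand remainder out from the first consumer
        return result;
    }
}
```

For 8 queues and 3 consumers this yields exactly the {0, 1, 6}, {2, 3, 7}, {4, 5} split from the text.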

2.3.4 allocatemessagequeueconsistenthash consistent hash load balancing algorithm

The purpose of consistent hash load balancing is to ensure that the same requests fall on the same server as much as possible. Why as much as possible? Because the server will go online and offline, most requests should not be affected when a few servers change.

Problems of ordinary hash algorithm
The ordinary hash algorithm can be understood simply as hashing the key and taking the modulus of the number of servers, i.e. hash(key) % n. If one server goes down or a new one is added, n changes and nearly every mapping changes with it. A simple example: a redis cluster of 4 servers. If key1 were stored at random, finding it might require traversing all 4 servers, which is inefficient; instead key1 is hashed and the modulus locates it on one server, so lookups go straight there. But the earlier problem remains: if a server is added to or removed from the cluster, the values produced by the hash-and-modulus change, and a cache avalanche can occur in a short time.

Consistent hash algorithm
Hash ring: the ordinary hash above takes the modulus of the number of servers; consistent hashing instead takes the modulus of 2^32, organizing the whole hash space into a ring covering 0 ~ 2^32 - 1.
Physical node: each server (IP + port) is hashed and mapped to a node on the ring. When a request arrives, its key is hashed onto the ring and the nearest server clockwise handles the request.
Virtual node: when there are few servers on the ring, the distribution is uneven and a large share of requests can land on one server. To avoid this, virtual nodes are introduced, e.g. cloning virtual nodes for each physical node by appending suffixes. If two physical nodes each clone three virtual nodes, there are 8 nodes on the ring in total; a cloned virtual node ultimately resolves to its real physical node, but it spreads the requests out effectively.

Compared with ordinary hash, the advantage of consistent hash is that its requests mapped to the ring are sent to the server closest to it on the ring. If one server goes down or a new server is added, the only requests affected are those between this server and the previous server node, and others will not be affected.
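A minimal consistent-hash ring with virtual nodes (an illustrative sketch using TreeMap, not rocketmq's AllocateMessageQueueConsistentHash; the hash function here is a toy one):

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class HashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public HashRing(int virtualNodes) { this.virtualNodes = virtualNodes; }

    private static int hash(String s) {
        // Simple deterministic non-negative hash for the sketch;
        // real implementations use a better-distributed function.
        return Math.floorMod(s.hashCode() * 0x9E3779B1, Integer.MAX_VALUE);
    }

    public void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++)
            ring.put(hash(node + "#VN" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++)
            ring.remove(hash(node + "#VN" + i));
    }

    // A key goes to the first node clockwise from its hash position.
    public String route(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }
}
```

The defining property holds: when a node is removed, only the keys that were routed to it move; every other key keeps its server.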

2.3.5 AllocateMessageQueueByConfig configured load balancing

There is not much to say about this one: the queue assignment is whatever the user configures.

2.3.6 AllocateMachineRoomNearby nearby machine room strategy

This strategy wraps another allocation strategy and prefers to give each consumer the queues deployed in its own machine room, falling back to queues in other rooms only when the local room has none.



2.3 sequential messages

2.3.1 what are sequential messages

Sequential message (FIFO message) is a message type provided by MQ to publish and consume in strict order. Sequential message consists of two parts: sequential publishing and sequential consumption.
There are two types of sequential messages:
Partition order: all messages in a partition are published and consumed in first in first out order
Global Order: all messages in a topic are published and consumed in the order of first in first out

This is the definition of sequential messages on Alibaba cloud. Sequential messages are divided into sequential publishing and sequential consumption.

So is sending messages from multiple threads sequential publishing?
If there is no causal relationship between threads, there is no order between them. A user sending messages from multiple threads is effectively saying they do not care about the relative order of messages across threads: messages from different threads are not published sequentially, while messages from the same thread are. This part the user has to guarantee.
For sequential consumption, it must be guaranteed that messages from the same sending thread are processed in that same order (why not simply say they must be consumed in one thread? see below)
Global order is really a special case of partition order where the topic has only one partition (global order is not discussed further below: it faces performance problems, and most scenarios do not require it)

2.3.2 how to ensure sequence

In the MQ model, the sequence needs to be guaranteed by three stages:
1. Keep the order when the message is sent
2. When the message is stored, it shall be consistent with the sending order
3. Keep the order of message consumption consistent with that of storage
Keeping order when sending means that messages with ordering requirements are sent synchronously from the same thread. Storage order being consistent with send order means that messages A and B sent from the same thread must be stored with A physically before B. Consumption being consistent with storage means that messages A and B must be processed in the order A before B once they reach the consumer.



For the original data of messages of two orders: A1, B1, B2, A2, A3, B3 (order of occurrence in absolute time):
When sending, the messages of order a need to keep the order of A1, A2 and A3, and the messages of order B are the same, but the messages of order a and B have no order relationship, which means that the messages of order a and B can be sent in different threads
During storage, the order of messages of orders a and B needs to be guaranteed respectively, but the order of messages between orders a and B can not be guaranteed
A1, B1, B2, A2, A3, B3 are acceptable
A1, A2, B1, B2, A3, B3 are acceptable
A1, A3, B1, B2, A2, B3 are unacceptable
The simple way to keep order during consumption is to "do nothing" and not reorder the received messages, i.e. ensure the messages of one partition are processed by only one thread. Of course, if A and B share a partition, they can still be split across threads after receipt, but the benefit should be weighed against the extra complexity.

2.3.3 implementation of rocketmq sequential message



The figure above introduces the principle of rocketmq sequential messages: messages of different orders are routed to different partitions. The documentation only covers the producer side; on the consumer side, having only one thread consume a partition is enough to preserve message order. The concrete implementation is as follows.

Producer side
The only thing the producer must do to keep message order is route related messages to a specific partition. In rocketmq, partition selection is done through a MessageQueueSelector.


  • List<MessageQueue> mqs: all partitions under the topic being sent to

  • Message msg: the message object

  • Object arg: an extra argument passed through by the user

    For example, the following implementation ensures that messages of the same order are routed to the same partition:

long orderId = ((Order) arg).getOrderId();
return mqs.get((int) (orderId % mqs.size()));
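Stripped of the client API, the routing rule itself is just a modulus; a self-contained sketch (the real entry point is MessageQueueSelector#select(List<MessageQueue>, Message, Object); a queue is shown here as a plain index):

```java
public class OrderRouting {
    // Route all messages of one order to a fixed partition index.
    static int selectQueue(long orderId, int queueCount) {
        return (int) (orderId % queueCount);
    }
}
```

In the real client this rule is plugged in via producer.send(msg, selector, orderId), so every message carrying the same orderId lands on the same MessageQueue.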

Consumer end
There are two types of rocketmq consumers: MQPullConsumer and MQPushConsumer.
With MQPullConsumer the user controls the threads and actively fetches messages from the server, one MessageQueue at a time. The list msgFoundList in PullResult is naturally consistent with the storage order; the user must preserve the consumption order after receiving the batch.
With MQPushConsumer the user registers a MessageListener to consume messages, and the client must preserve message order when invoking the MessageListener. The implementation in rocketmq is as follows:



(1) PullMessageService fetches messages from the broker in a single thread
(2) PullMessageService adds the messages to the ProcessQueue (the ProcessQueue is a message cache), then submits a consumption task to ConsumeMessageOrderlyService
(3) ConsumeMessageOrderlyService runs on multiple threads; each thread must acquire the MessageQueue lock before consuming
(4) After acquiring the lock, it takes messages from the ProcessQueue

The core idea for preserving consumption order:
(1) Messages are added to the ProcessQueue by a single thread, so the messages inside the ProcessQueue are in order
(2) A consumption task is submitted as "consume one MQ once"; the task takes its messages from the ProcessQueue, so it is also in order (whichever thread wins the lock, it consumes in the ProcessQueue's message order)
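The per-queue lock idea can be demonstrated with a small sketch (not rocketmq source; the ProcessQueue is stood in for by a ConcurrentLinkedQueue and the MessageQueue lock by a plain monitor): many threads run consume tasks, yet messages still leave the buffer in order.

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OrderlyConsumeSketch {
    public static List<Integer> consume(List<Integer> arrivals, int threads) {
        ConcurrentLinkedQueue<Integer> processQueue = new ConcurrentLinkedQueue<>(arrivals);
        List<Integer> consumed = new CopyOnWriteArrayList<>();
        Object queueLock = new Object(); // stands in for the MessageQueue lock
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                while (true) {
                    synchronized (queueLock) {   // only one thread works on this queue at a time
                        Integer msg = processQueue.poll();
                        if (msg == null) return;
                        consumed.add(msg);       // "process" while holding the lock
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed;
    }
}
```

Because taking a message and processing it happen under the same lock, the output order matches the ProcessQueue order regardless of which thread wins each round.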

Relationship between order and exception

Ordered messages require both the producer and the consumer to preserve order: the producer must route messages to the correct partition, and on the consumer side only one thread may process the data of each partition. This brings some drawbacks:
(1) Sequential sending cannot take advantage of the cluster's failover feature, because the messageQueue cannot be swapped out for a retry
(2) The routing policy used for sending can create hot spots: some messageQueues may carry a particularly large data volume
(3) The consumption parallelism depends on the number of partitions
(4) A failed message cannot be skipped during consumption

Since the messageQueue cannot be swapped out for retries, it needs replicas of its own; availability can be guaranteed through consensus algorithms such as Raft and Paxos, or the messageQueue can be backed by other highly available storage.
For the hot-spot problem there seems to be no good general solution; one can only split the messageQueue and tune the routing so that messages spread across messageQueues as evenly as possible.
In theory the limited consumption parallelism is not a big problem, because the number of messageQueues can be adjusted.
Not being able to skip a failed message is unavoidable, since skipping could corrupt subsequent processing. However, policies can be offered: the user decides whether to skip based on the error type, and features such as a retry queue let the user re-consume the message "elsewhere" after skipping.

2.4 message de-duplication

The root cause of message duplication is that the network is unreachable. This problem cannot be avoided as long as data is exchanged through the network. So the way to solve this problem is to bypass it. Then the question becomes: if the consumer receives two identical messages, what should be done?
1. The business logic of the consumer processing the message remains idempotent
2. Give every message a unique number, and record successful processing in a de-duplication log table together with the processing itself
Point 1 is easy to understand: as long as processing is idempotent, no matter how many duplicates arrive, the final result is the same. Point 2 uses a log table to record the IDs of successfully processed messages; if a new message's ID is already in the table, the message is not processed again.
The first solution clearly belongs on the consumer side rather than in the messaging system. Point 2 can be implemented either by the messaging system or by the business side. Since duplicate messages are actually rare in normal operation, implementing de-duplication inside the messaging system would inevitably hurt its throughput and availability, so it is best handled by the business side. This is also why rocketmq does not solve message duplication itself.

Rocketmq does not guarantee that messages are never duplicated. If the business strictly requires no duplicates, de-duplication must be done on the business side.
So where is the msgId recorded? In a cache, of course. One concrete approach:
• when the consumer receives a message, it calls the incr method provided by redis, with msgid as the key (unique), and value is incremented from 1 by default.
• when the incr return value is 1, set its expiration time to two minutes later, and the message needs to be consumed.
• when the incr return value is greater than 1, the message is ignored.

public long incr(String key, Date expireTime) {
    long count = redisNumber.incre(key);
    if (count == 1) {
        //First occurrence of this msgId: set the two-minute expiry (shown here via an
        //assumed helper on redisNumber; use your redis client's expire-at call)
        redisNumber.expireAt(key, expireTime);
    }
    return count;
}

for (MessageExt msg : msgs) {
    long currentTime = System.currentTimeMillis();
    currentTime += Constants.MSG_EXPIRES_TIME_MILLS;
    Date expireTime = new Date(currentTime);
    long msgIdCount = redisCacheHelper.incr(msg.getKeys(), expireTime);
    if (msgIdCount > 1) {
        continue; //duplicate message, ignore it
    }
    //...normal consumption...
}

2.5 message stacking

The main function of message middleware is asynchronous decoupling. Another important function is absorbing front-end traffic peaks to keep the back-end system stable, which requires the middleware to have a certain message accumulation capability. Message accumulation comes in two forms:
Messages accumulate in an in-memory buffer. Once the buffer is exceeded, messages can be discarded according to some discard policy, as described in the CORBA Notification specification. This suits businesses that can tolerate message loss. Here the accumulation capability mainly depends on the buffer size, and performance degrades little after accumulation, because the amount of data in memory has limited impact on external access capability.
Messages accumulate in a persistent storage system, such as a DB, KV store or file. When a message misses the in-memory cache, disk access is unavoidable, producing a large amount of read IO; the read IO throughput then directly determines the access capability after accumulation.

There are four main points to evaluate message stacking capacity:
• how many messages and bytes can be stacked? That is, the stacking capacity of messages.
• whether the message throughput will be affected by message stacking after message stacking.
• whether the normal consumer will be affected after the message is stacked.
• after message accumulation, how much throughput is used to access messages accumulated on disk.

2.5.1 handling of production faults with message backlog

First, find out what is causing the accumulation: too many producers and too few consumers, or something else. In short, locate the problem first, then check whether the consumption speed is normal. If it is, the backlog can be relieved temporarily by bringing more consumers online.

Improve consumption parallelism

Most message consumption behaviors are IO intensive, that is, they may operate the database or call RPC. The consumption speed of this kind of consumption behavior lies in the throughput of the back-end database or external system. By increasing the consumption parallelism, the total consumption throughput can be improved, but when the parallelism increases to a certain extent, it will decrease. Therefore, the application must set a reasonable degree of parallelism. There are several ways to modify the consumption parallelism:

Under the same consumer group, increase the number of consumer instances to raise parallelism (note that consumer instances beyond the number of subscribed queues contribute nothing). You can add machines or start multiple processes on existing machines. You can also raise a single consumer's parallel consumption threads via the parameters consumeThreadMin and consumeThreadMax.

Batch consumption

If the business flow supports batch consumption, consumption throughput can be raised substantially. For example, in an order-deduction application, processing one order at a time may take 1 s, while processing 10 orders in one go may take only 2 s. Set the consumer's consumeMessageBatchMaxSize parameter, which defaults to 1, i.e. one message per consumption; if set to N, at most N messages are consumed each time.

Skip unimportant messages

When messages accumulate and consumption cannot catch up with the sending speed, you can choose to discard the unimportant messages:

public ConsumeConcurrentlyStatus consumeMessage(
        List<MessageExt> msgs, ConsumeConcurrentlyContext context) {
    long offset = msgs.get(0).getQueueOffset();
    String maxOffset = msgs.get(0).getProperty(MessageConst.PROPERTY_MAX_OFFSET);
    long diff = Long.parseLong(maxOffset) - offset;
    if (diff > 10100) {
        // Special handling of message stacking: drop the batch by acking it
        return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
    }
    // Normal consumption process
    return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
}

As shown in the code above, when more than 10100 messages have accumulated in a queue, the consumer discards some or all of them by returning success immediately, so that consumption can quickly catch up with the sending speed.

What if the numbers of consumers and queues do not match, and even bringing more servers online cannot drain the accumulated messages in a short time? Then:
• prepare a temporary topic (newTopic)
• give it several times as many queues as the original topic
• distribute the queues across multiple brokers
• have one online consumer act as a message porter, moving messages from oldTopic to newTopic without any business-logic processing
• have N online consumers consume newTopic concurrently
• fix the consumer bug
• restore the original consumers and resume consuming the original topic (oldTopic)
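The porter step above can be simulated without a broker: drain the old topic's queue and redistribute the messages round-robin over the new topic's more numerous queues, applying no business logic. An in-memory sketch (plain queues stand in for broker queues; all names are illustrative):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class PorterSketch {
    // Move every message from oldTopic into newTopicQueues round-robin,
    // without touching message content (no business logic).
    public static void port(Queue<String> oldTopic, List<Queue<String>> newTopicQueues) {
        int i = 0;
        String msg;
        while ((msg = oldTopic.poll()) != null) {
            newTopicQueues.get(i % newTopicQueues.size()).add(msg);
            i++;
        }
    }

    public static void main(String[] args) {
        Queue<String> oldTopic = new ArrayDeque<>();
        for (int n = 0; n < 8; n++) oldTopic.add("msg-" + n);
        List<Queue<String>> newTopic = new ArrayList<>();
        for (int q = 0; q < 4; q++) newTopic.add(new ArrayDeque<>());
        port(oldTopic, newTopic);
        // 8 messages spread over 4 queues: 2 each, ready for 4 consumers.
        System.out.println(newTopic.get(0).size()); // 2
    }
}
```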

2.6 delay message

2.6.1 what is a delay message

Delayed messages are messages that cannot be consumed immediately after being sent to the broker; they become consumable only at a specific point in time or after a specific wait. RocketMQ supports delayed messages, but not with arbitrary time precision: only specific delay levels are supported, such as 5s, 10s, 1m, and so on.

2.6.2 use method of delay message

(1) Broker.conf configuration file

#Broker.conf configuration file 
brokerClusterName = DefaultCluster 
brokerName = broker-a 
brokerId = 0 
deleteWhen = 04 
fileReservedTime = 48 
brokerRole = ASYNC_MASTER 
flushDiskType = ASYNC_FLUSH 
#You can set the message delay level 
messageDelayLevel = 1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h

Delay configuration description:
This configuration item defines the delay time of each level, starting from level 1, and the time for any level can be modified. Supported time units are s, m, h and d, representing seconds, minutes, hours and days respectively.
The values above are the defaults and can be adjusted manually, but they are usually sufficient and modifying them is not recommended.
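The level-to-delay mapping can be read directly off the messageDelayLevel string: delay level N (1-based) is simply the N-th entry. A small illustrative parser, not part of the client API:

```java
public class DelayLevelSketch {
    public static final String DEFAULT_LEVELS =
        "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";

    // Return the delay of the given 1-based level in milliseconds.
    public static long delayMillis(String config, int level) {
        String entry = config.trim().split("\\s+")[level - 1];
        long value = Long.parseLong(entry.substring(0, entry.length() - 1));
        switch (entry.charAt(entry.length() - 1)) {
            case 's': return value * 1000L;
            case 'm': return value * 60_000L;
            case 'h': return value * 3_600_000L;
            case 'd': return value * 86_400_000L;
            default: throw new IllegalArgumentException(entry);
        }
    }

    public static void main(String[] args) {
        System.out.println(delayMillis(DEFAULT_LEVELS, 3));  // 10000  (10s)
        System.out.println(delayMillis(DEFAULT_LEVELS, 18)); // 7200000 (2h)
    }
}
```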
(2) Set message delay level

//Create the message object 
Message message = new Message("topic-A", "tagB", ("hello" + i).getBytes(RemotingHelper.DEFAULT_CHARSET)); 
//Set the message delay level (with the default configuration, level 3 corresponds to 10s) 
message.setDelayTimeLevel(3);

2.6.3 usage scenario of delay message

From everyday shopping we know there is a gap between placing an order and paying for it. The flow involves an order service and a payment service; when payment completes, the payment service updates the order status in the database to mark the order complete. A common requirement is that an order unpaid for more than 30 minutes, called a timeout order, must be closed.

The traditional approach is a scheduled task that scans the order table at intervals, finds timeout orders, and closes them. Its drawback is that the scans touch a large amount of data and put heavy pressure on the database.

Delayed messages handle this requirement more efficiently: when a customer places an order, send a message containing the order number to RocketMQ with a delay of 30 minutes, and add an order-timeout service that subscribes to the delayed messages and processes the order table accordingly.
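The flow above can be sketched with the JDK's DelayQueue standing in for the broker's delay level (a real producer would call setDelayTimeLevel on the message instead; the 50 ms delay stands in for 30 minutes, and all names are illustrative):

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class OrderTimeoutSketch {
    static class DelayedOrder implements Delayed {
        final String orderId;
        final long dueAtMillis;
        DelayedOrder(String orderId, long delayMillis) {
            this.orderId = orderId;
            this.dueAtMillis = System.currentTimeMillis() + delayMillis;
        }
        public long getDelay(TimeUnit unit) {
            return unit.convert(dueAtMillis - System.currentTimeMillis(),
                                TimeUnit.MILLISECONDS);
        }
        public int compareTo(Delayed o) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                                o.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    // Enqueue one "close this order" message and block until it is due;
    // the real order-timeout service would then check payment status.
    public static String closeAfter(String orderId, long delayMillis) {
        try {
            DelayQueue<DelayedOrder> queue = new DelayQueue<>();
            queue.put(new DelayedOrder(orderId, delayMillis));
            return queue.take().orderId; // take() blocks until the delay expires
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Order placed: the delayed close message becomes visible 50 ms later.
        System.out.println("close unpaid order " + closeAfter("order-1", 50));
    }
}
```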



2.7 transaction messages

Rocketmq supports not only ordinary messages and sequential messages, but also transaction messages. First, let’s discuss what transaction messages are and why they need to be supported.

2.7.1 related concepts

Based on its message definition, rocketmq extends two related concepts to transaction messages:
1. Half (prepare) message
A half message is a special message type: a message in this state cannot yet be consumed. When a transaction message has been delivered to the broker successfully but the broker has not received the producer's secondary confirmation, the message is in a "temporarily unconsumable" state, and a transaction message in this state is called a half message.
2. Message status check
The secondary confirmation the producer sends to the broker may fail to arrive due to network jitter, a producer restart, and so on. If the broker finds that a transaction message has stayed in the half-message state for a long time, it actively initiates a check-back to the producer to query the transaction state (commit or rollback) of that message. Message status check thus mainly solves the timeout problem in distributed transactions.

2.7.2 execution process



The above is the transaction message execution flow chart provided on the official website. The following is an analysis of the specific process:
1. Step 1: the producer sends a half message to the broker;
2. Step 2: the broker ACKs; the half message was sent successfully;
3. Step 3: the producer executes the local transaction;
4. Step 4: when the local transaction finishes, the producer sends a secondary confirmation (commit or rollback) to the broker according to the transaction status. On commit, the broker delivers the message to the consumer side for consumption; on rollback, the message is marked failed, cleared after a period of time, and never delivered. Normally the distributed transaction is now complete; what remains is the timeout case, where the broker still has not received the secondary confirmation after some time;
5. Step 5: on timeout, the broker actively initiates a status check to the producer;
6. Step 6: the producer handles the check-back and returns the execution result of the corresponding local transaction;
7. Step 7: the broker commits or rolls back based on the check-back result, as in step 4.
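In the real client this flow is driven by TransactionMQProducer with a TransactionListener (executeLocalTransaction / checkLocalTransaction). The broker-side state machine itself can be sketched in memory as follows (all names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class TxMessageSketch {
    enum State { HALF, COMMITTED, ROLLED_BACK }

    // Broker side: transaction messages keyed by transaction id.
    private final Map<String, State> store = new HashMap<>();

    // Steps 1-2: the producer sends a half message; invisible to consumers.
    public void sendHalf(String txId) { store.put(txId, State.HALF); }

    // Step 4: secondary confirmation after the local transaction.
    public void confirm(String txId, boolean commit) {
        store.put(txId, commit ? State.COMMITTED : State.ROLLED_BACK);
    }

    // Steps 5-7: check-back for a half message that stayed too long;
    // the producer reports its local transaction outcome.
    public void checkBack(String txId, boolean localTxSucceeded) {
        if (store.get(txId) == State.HALF) confirm(txId, localTxSucceeded);
    }

    // Only committed messages are visible to consumers.
    public boolean visibleToConsumer(String txId) {
        return store.get(txId) == State.COMMITTED;
    }

    public static void main(String[] args) {
        TxMessageSketch broker = new TxMessageSketch();
        broker.sendHalf("tx-1");
        // The confirmation was lost: the half message stays unconsumable...
        // ...until the broker checks back and learns the local tx succeeded.
        broker.checkBack("tx-1", true);
        System.out.println(broker.visibleToConsumer("tx-1")); // true
    }
}
```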

2.7.3 actual cases

Let’s take a transfer scenario as an example to illustrate this problem: Bob transfers 100 yuan to Smith.
In the figure, the local transaction (deducting from Bob's account) and sending the asynchronous message must succeed or fail together: if the deduction succeeds, the message must be sent successfully; if the deduction fails, the message must not be sent. The question is: do we deduct the money first, or send the message first?

Let’s start by sending a message. The schematic diagram is as follows:



The problem is: if the message is sent successfully but the deduction fails, the consumer will consume the message and add money to Smith’s account.

Since sending the message first does not work, let's try deducting the money first. The diagram is as follows:



The existing problems are similar to the above: if the deduction is successful and the message fails to be sent, Bob will deduct the money, but the Smith account does not add money.

There may be many ways to solve this problem. For example, put the message send directly inside Bob's deduction transaction: if the send fails, throw an exception and roll the transaction back. This approach also follows the principle that the best way to handle a problem is to keep it from arising in the first place.

Rocketmq supports transaction messages. Let’s see how rocketmq is implemented.



Rocketmq obtains the address of the message when sending the prepared message in the first stage, executes the local transaction in the second stage, and in the third stage uses the address obtained in the first stage to access the message and modify its status. A careful reader may spot another problem: what if the confirmation message fails to be sent? Rocketmq periodically scans the transaction messages in the message cluster; when it finds a prepared message, it checks back with the sender to determine whether Bob's money was actually deducted, and then decides, according to the policy set by the sender, whether to roll back or to continue sending the confirmation message. This ensures that the message send and the local transaction succeed or fail together.

3 high reliability

3.1 rocketmq availability

Deploy multiple masters to prevent a single point of failure.



3.2 rocketmq reliability

3.2.1 message sending

The send method of the producer supports internal retries. The retry logic is as follows:
1. Retry at most 3 times.
2. If a send fails, rotate to the next broker.
3. The total time spent in the method does not exceed the value of sendMsgTimeout, which is 10 s by default.
Note that if sending to the broker produces a timeout exception, the send is not retried.
Even with this strategy, successful delivery is not guaranteed. To make sure a message is sent, it is recommended to do this: if the synchronous send call fails, store the message in a database, and have a background thread retry it periodically until it reaches the broker.
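The recommended DB-backed fallback can be sketched as follows, with a predicate standing in for the synchronous send and a queue standing in for the database table (all names are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;

public class ReliableSendSketch {
    // Stand-in for a DB table holding messages awaiting retry.
    public final Queue<String> pendingTable = new ArrayDeque<>();

    // Try the synchronous send; on failure, persist for background retry.
    public void send(String msg, Predicate<String> syncSend) {
        if (!syncSend.test(msg)) pendingTable.add(msg);
    }

    // Background job: periodically retry everything persisted earlier.
    public int retryPending(Predicate<String> syncSend) {
        int delivered = 0;
        for (int i = pendingTable.size(); i > 0; i--) {
            String msg = pendingTable.poll();
            if (syncSend.test(msg)) delivered++;
            else pendingTable.add(msg); // keep it for the next round
        }
        return delivered;
    }

    public static void main(String[] args) {
        ReliableSendSketch s = new ReliableSendSketch();
        s.send("m1", m -> false);                  // broker unreachable: stored
        System.out.println(s.pendingTable.size()); // 1
        int ok = s.retryPending(m -> true);        // broker back: retried
        System.out.println(ok);                    // 1
    }
}
```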

3.2.2 broker services

All messages sent to the broker can be flushed to disk synchronously or asynchronously, so in general reliability is very high.
With synchronous flushing, the send returns success only after the message has been written to the physical file, so it is very reliable.
With asynchronous flushing, messages are lost only if the machine goes down. A broker process may crash, but whole-machine downtime or crashes are rare unless power is suddenly lost.

3.2.3 message consumption

Consumption and storage structure of rocketmq



Under normal conditions, the producer sends a message to the broker; the message content is written to the commitlog, the message's location (index) within the commitlog is written to the consumequeue, and the consumer reads the consumequeue to find and consume the message.
CONSUME_SUCCESS indicates successful consumption and is the status returned by normal business code.
RECONSUME_LATER indicates that the current consumption failed and should be retried later.
In rocketmq, a message is considered successfully consumed only when the business consumer returns CONSUME_SUCCESS. If RECONSUME_LATER is returned, rocketmq treats the consumption as failed and redelivers the message.
To guarantee that a message is consumed successfully at least once, rocketmq sends messages that failed consumption back to the broker and redelivers them to the consumer at a later time (10 seconds by default, configurable). If a message keeps failing, then after the failures accumulate to a certain count (16 by default) it is delivered to the dead letter queue, which must then be monitored for manual intervention.
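The redelivery policy can be sketched as a small function: a message is redelivered up to 16 times by default, and only lands in the dead letter queue if every attempt fails (the names and loop are illustrative, not the broker's actual implementation):

```java
public class RetrySketch {
    // A message gets its first delivery plus up to maxRetries redeliveries
    // (16 by default) before moving to the dead letter queue.
    public static String finalDestination(int failuresBeforeSuccess, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            boolean success = attempt >= failuresBeforeSuccess;
            if (success) return "consumed";
        }
        return "dead-letter-queue";
    }

    public static void main(String[] args) {
        System.out.println(finalDestination(3, 16));  // consumed, after 3 retries
        System.out.println(finalDestination(99, 16)); // dead-letter-queue
    }
}
```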
