RabbitMQ / high-concurrency interview questions


From: PHP Chinese website

1. What is RabbitMQ?

RabbitMQ is a message queuing technology that uses AMQP (Advanced Message Queuing Protocol). Its main feature is that the consumer does not need to know whether the producer exists, which gives a high degree of decoupling between services.

2. Why use RabbitMQ?

  1. In a distributed system, it provides a range of useful capabilities such as asynchronous processing, peak shaving, and load balancing.
  2. It has a persistence mechanism, so messages being processed and messages waiting in a queue can be saved.
  3. It decouples consumers from producers.
  4. In high-concurrency scenarios, a message queue turns concurrent synchronous access into serial access, which acts as a form of rate limiting and eases the load on the database.
  5. A message queue can be used to place orders asynchronously: the request is put in the queue and the order logic runs in the background.
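
Points 4 and 5 can be illustrated with a minimal sketch that uses only the Python standard library. It is not RabbitMQ code: `queue.Queue` stands in for the message queue, and the names `run_burst`, `pending`, and `processed` are made up for this example. A burst of orders is accepted immediately, while a single background worker applies them one at a time.

```python
import queue
import threading

def run_burst(order_ids):
    """Accept a burst of orders at once, then process them one at a time."""
    pending = queue.Queue()      # stands in for the message queue
    processed = []               # stands in for serialized database writes

    def worker():
        while True:
            order_id = pending.get()
            if order_id is None:         # sentinel: no more work
                break
            processed.append(order_id)   # the slow "database" step would go here

    t = threading.Thread(target=worker)
    t.start()
    for order_id in order_ids:   # the producer only enqueues and moves on
        pending.put(order_id)
    pending.put(None)
    t.join()
    return processed

result = run_burst(range(5))
```

Because there is only one worker, the database sees the orders serially and in arrival order, no matter how quickly they were submitted.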

3. Scenarios for using RabbitMQ

  1. Asynchronous communication between services
  2. Sequential consumption
  3. Scheduled tasks
  4. Request peak shaving

4. How do you ensure that messages are sent to RabbitMQ correctly? How do you ensure that the receiver consumes the message?

Sender confirmation mode

If a channel is set to confirm mode, every message published on that channel is assigned a unique ID.
Once the message has been delivered to its destination queue, or written to disk in the case of a persistent message, the channel sends an acknowledgement to the producer (including the unique ID of the message).
If an internal RabbitMQ error causes the message to be lost, a NACK (not acknowledged) message is sent instead.
Sender confirmation mode is asynchronous: the producer application can keep sending messages while it waits for confirmations. When an acknowledgement arrives, the producer's callback method is triggered to process it.
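
The bookkeeping described above can be sketched in plain Python. This is a toy model, not a real client API: the class `ConfirmChannel` and its methods are invented for illustration, and the broker's reply is simulated by calling `broker_confirms` by hand.

```python
class ConfirmChannel:
    """Toy stand-in for a channel in confirm mode (hypothetical, not a client API)."""

    def __init__(self, on_ack, on_nack):
        self.next_seq = 1          # ID assigned to the next published message
        self.unconfirmed = {}      # ID -> message body still awaiting confirmation
        self.on_ack = on_ack
        self.on_nack = on_nack

    def publish(self, body):
        seq = self.next_seq
        self.next_seq += 1
        self.unconfirmed[seq] = body
        return seq                 # returns at once; the confirmation comes later

    def broker_confirms(self, seq, ok):
        """Simulates the broker answering with an ACK (ok=True) or a NACK."""
        body = self.unconfirmed.pop(seq)
        callback = self.on_ack if ok else self.on_nack
        callback(seq, body)

acked, to_retry = [], []
ch = ConfirmChannel(
    on_ack=lambda seq, body: acked.append(seq),
    on_nack=lambda seq, body: to_retry.append(body),
)
first = ch.publish("order-1")
second = ch.publish("order-2")     # sent before the first confirmation arrives
ch.broker_confirms(first, ok=True)
ch.broker_confirms(second, ok=False)
```

The point of the sketch is that publishing never blocks: the producer keeps a map of unconfirmed messages and reacts in callbacks, for example by queuing a NACKed message for retry.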

Receiver confirmation mechanism

After receiving each message, the consumer must confirm it (receiving a message and confirming it are two separate operations). RabbitMQ can safely delete a message from the queue only after the consumer has confirmed it.
Redelivery here is not driven by a timeout. RabbitMQ decides whether to resend a message only when the consumer's connection is interrupted. That is, as long as the connection stays open, RabbitMQ gives the consumer enough time to process the message, which helps ensure eventual consistency of the data. (Recent RabbitMQ releases do enforce a configurable acknowledgement timeout on delivered messages, so very long processing may need that setting adjusted.)

Several special cases are listed below.

If a consumer receives a message and then disconnects or unsubscribes before confirming it, RabbitMQ treats the message as undelivered and redelivers it to the next subscribed consumer. (This can cause the same message to be consumed twice, so the consumer needs to deduplicate.)
If a consumer receives a message but does not confirm it, and the connection stays open, RabbitMQ considers the consumer busy and, once the consumer's limit of unconfirmed messages (the prefetch count) is reached, will not send it more messages.
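
The first special case can be modeled with a small in-memory sketch. `ToyQueue` and its methods are hypothetical names used only here; the real broker tracks unconfirmed deliveries per channel, which this simplifies to one list per consumer name.

```python
from collections import deque

class ToyQueue:
    """In-memory model of delivery, confirmation, and redelivery (illustrative only)."""

    def __init__(self, messages):
        self.ready = deque(messages)   # messages waiting to be delivered
        self.unacked = {}              # consumer name -> delivered but unconfirmed messages

    def deliver(self, consumer):
        msg = self.ready.popleft()
        self.unacked.setdefault(consumer, []).append(msg)
        return msg

    def ack(self, consumer, msg):
        # Only now is the message really removed from the queue.
        self.unacked[consumer].remove(msg)

    def disconnect(self, consumer):
        # Unconfirmed messages return to the front of the queue for redelivery.
        for msg in reversed(self.unacked.pop(consumer, [])):
            self.ready.appendleft(msg)

q = ToyQueue(["m1", "m2"])
q.deliver("c1")                  # c1 receives m1 but never confirms it
q.disconnect("c1")               # the connection drops
redelivered = q.deliver("c2")    # m1 goes to the next consumer
q.ack("c2", redelivered)
```

Since `c1` may already have acted on `m1` before it disconnected, `c2` can end up processing the same message a second time, which is why deduplication on the consumer side matters.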

5. How do you avoid repeated message delivery or consumption?

During message production, the MQ generates an internal message ID (inner-msg-id) for each message a producer sends and uses it as the basis for deduplication when delivery fails and the message is retransmitted, so that duplicate messages do not enter the queue.

During message consumption, the message body must carry a bizId (globally unique within the same business, such as a payment ID, order ID, or post ID) that is used as the basis for deduplication, so that the same message is not consumed twice.
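
A minimal sketch of consumer-side deduplication by business ID follows. The message layout (`biz_id`, `amount`) and the class name `IdempotentConsumer` are assumptions made for this example, and the in-memory `set` stands in for whatever shared store (a database unique key, for instance) a real system would likely use.

```python
class IdempotentConsumer:
    """Applies each business ID at most once (illustrative sketch)."""

    def __init__(self):
        self.seen = set()      # assumption: a real system would use a shared, durable store
        self.applied = []      # side effects that were actually performed

    def handle(self, message):
        biz_id = message["biz_id"]
        if biz_id in self.seen:
            return False       # duplicate delivery: skip the side effect
        self.seen.add(biz_id)
        self.applied.append(message["amount"])
        return True

consumer = IdempotentConsumer()
results = [consumer.handle(m) for m in [
    {"biz_id": "pay-1001", "amount": 30},
    {"biz_id": "pay-1001", "amount": 30},   # the same payment, redelivered
    {"biz_id": "pay-1002", "amount": 12},
]]
```

The redelivered payment is recognized and ignored, so the amount is applied only once.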

6. What transport are messages based on?

Creating and destroying TCP connections is expensive, and the number of concurrent connections is limited by system resources, which would cause a performance bottleneck. RabbitMQ therefore transmits data over channels. A channel is a virtual connection established inside a real TCP connection, so many channels can share one TCP connection. The number of channels per connection is not subject to operating system connection limits; it is bounded only by the channel limit (channel_max) negotiated for the connection.

7. How are messages distributed?

If at least one consumer subscribes to the queue, messages are sent to the consumers in round-robin fashion. Each message is delivered to only one subscribed consumer (provided that consumer processes the message normally and confirms it). To have several consumers each receive the message, use routing to deliver it to multiple queues.
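
Round-robin distribution can be sketched in a few lines. The function name `dispatch` and the consumer labels are made up for the example, and the sketch assumes every consumer processes and confirms each message normally.

```python
from itertools import cycle

def dispatch(messages, consumers):
    """Hand each message to exactly one consumer, taking turns."""
    assignment = {c: [] for c in consumers}
    turn = cycle(consumers)
    for msg in messages:
        assignment[next(turn)].append(msg)
    return assignment

assignment = dispatch(["m1", "m2", "m3", "m4", "m5"], ["A", "B"])
```

Consumer A ends up with the first, third, and fifth messages and consumer B with the second and fourth; no message is delivered twice.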

8. How are messages routed?

The flow is: message producer -> routing -> one or more queues. When a message is published to an exchange, it carries a routing key, which is set when the message is created. A queue is bound to an exchange with a routing key of its own (the binding key).

After the message arrives at the exchange, RabbitMQ matches the message's routing key against the queue's binding key (different exchange types apply different matching rules).

The commonly used exchanges fall into the following three types:

·Fanout: when the exchange receives a message, it broadcasts it to all bound queues

·Direct: if the routing keys match exactly, the message is delivered to the corresponding queue

·Topic: lets messages from different sources reach the same queue. With a topic exchange the binding key can contain wildcards: * matches exactly one word and # matches zero or more words, where words are separated by dots
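
The three matching rules can be sketched as pure functions. This is an illustration of the rules, not the broker's implementation: `topic_matches`, `route`, and the sample bindings are names invented for the example.

```python
def topic_matches(pattern, routing_key):
    """Topic rule: '*' matches exactly one word, '#' matches zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' either matches nothing, or swallows one word and stays active
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))

def route(exchange_type, bindings, routing_key):
    """bindings is a list of (queue_name, binding_key) pairs."""
    if exchange_type == "fanout":
        return [q for q, _ in bindings]                    # key is ignored
    if exchange_type == "direct":
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unknown exchange type: {exchange_type}")

bindings = [("errors", "*.error"), ("audit", "order.#"), ("payments", "pay.created")]
fanout_hit = route("fanout", bindings, "anything")
direct_hit = route("direct", bindings, "pay.created")
topic_hit = route("topic", bindings, "order.error")
```

With these bindings, a message with routing key `order.error` published to a topic exchange reaches both the `errors` and `audit` queues, which is how one message can be consumed by several consumers.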

9. How do you ensure that messages are not lost?

Use message persistence. The premise, of course, is that the queue itself is also durable. RabbitMQ ensures that persistent messages survive a server restart by writing them to a persistent log file on disk. When a persistent message is published to a durable exchange, RabbitMQ sends its response only after the message has been committed to the log file.

Once a consumer has consumed a persistent message from a durable queue, RabbitMQ marks that message in the persistence log as waiting for garbage collection. If RabbitMQ restarts before a persistent message has been consumed, it automatically rebuilds the exchanges and queues (and bindings) and republishes the messages in the persistence log to the appropriate queues.
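
The write-then-replay idea can be sketched with an append-only file. This is only a conceptual model under stated assumptions: the JSON-lines format, the `DurableLog` class, and the file name are invented here and do not reflect RabbitMQ's actual on-disk storage.

```python
import json
import os
import tempfile

class DurableLog:
    """Append-only log of publish/consumed records (hypothetical format)."""

    def __init__(self, path):
        self.path = path

    def publish(self, msg_id, body):
        self._append({"op": "publish", "id": msg_id, "body": body})

    def consumed(self, msg_id):
        self._append({"op": "consumed", "id": msg_id})

    def _append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())   # respond to the producer only after this point

    def recover(self):
        """Replay the log after a restart and return messages not yet consumed."""
        pending = {}
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                if rec["op"] == "publish":
                    pending[rec["id"]] = rec["body"]
                else:
                    pending.pop(rec["id"], None)
        return pending

log_path = os.path.join(tempfile.mkdtemp(), "messages.log")
log = DurableLog(log_path)
log.publish(1, "a")
log.publish(2, "b")
log.consumed(1)
recovered = DurableLog(log_path).recover()   # a fresh object simulates a restart
```

After the simulated restart only message 2 is recovered, because message 1 had already been marked as consumed.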

10. What are the benefits of using RabbitMQ?

  1. High decoupling between services

  2. High-performance asynchronous communication

  3. Traffic peak shaving

11. RabbitMQ cluster

Mirror cluster mode

In this mode, a queue you create, both its metadata and its messages, exists on multiple instances. Each time you write a message to the queue, the message is automatically synchronized to the copies of that queue on the other instances.

The advantage is that if any one machine goes down, the other machines can still be used. There are two disadvantages. First, the performance overhead is high: synchronizing every message to all machines puts heavy pressure on network bandwidth. Second, this mode does not scale: if a queue is heavily loaded and you add machines, the new machines also hold all of the queue's data, so there is no way to scale the queue linearly.

12. Disadvantages of MQ

Reduced system availability

The more external dependencies a system introduces, the more likely it is to fail. Originally, system A simply called the interfaces of systems B, C, and D; with only the four systems A, B, C, and D, everything worked fine. Now you add an MQ. What if the MQ goes down? If the MQ goes down, the whole system crashes.

Increased system complexity

Once an MQ is added, how do you ensure that messages are not consumed repeatedly? How do you deal with message loss? How do you guarantee the order of message delivery? These are a headache, and there are many such problems.

13. Consistency issues

System A finishes processing and returns success, so the caller thinks the request succeeded. But of the three systems B, C, and D, what if B and D write to the database successfully while C fails? Then your data is inconsistent.

So a message queue is actually a complex piece of architecture. Introducing it brings many advantages, but you also have to build additional technical solutions and architecture to avoid the disadvantages it brings. Once that is done, you may find that system complexity has increased by an order of magnitude, perhaps 10 times. But when it matters, it still has to be used.

14. Distributed transactions

Commit in phases (as in two-phase commit). There is an arbiter (coordinator) that sends messages to all nodes. The transaction succeeds only after all nodes have acknowledged; otherwise, you have to wait for a resend.
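
A minimal two-phase commit sketch, assuming that is the scheme meant above. The `Participant` class and the function `two_phase_commit` are hypothetical, and the sketch simplifies the failure path to an immediate rollback instead of waiting for a resend.

```python
class Participant:
    """A node that votes in phase 1 and applies the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"

def two_phase_commit(participants):
    """The coordinator commits only if every participant acknowledges phase 1."""
    votes = [p.prepare() for p in participants]   # phase 1: ask every node
    if all(votes):
        for p in participants:                    # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:                        # any refusal aborts everyone
        p.rollback()
    return False

good = [Participant("B"), Participant("C"), Participant("D")]
bad = [Participant("B"), Participant("C", can_commit=False), Participant("D")]
ok_result = two_phase_commit(good)
failed_result = two_phase_commit(bad)
```

In the second run system C refuses, so B and D are rolled back as well, which avoids the inconsistent state where only some of the systems have written their data.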

15. How do you design for a sudden traffic surge during a live broadcast?

  1. Add machines behind Nginx

  2. Use a CDN to cache static pages

  3. Use a Redis queue so that users are let in gradually

  4. Add caching for user data, such as user information

  5. Use a master-slave database setup

  6. Elastic scaling

  7. Rate limiting and circuit breaking
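
As one possible way to implement the rate limiting in item 7, here is a token bucket sketch. The article does not name an algorithm, so the choice of token bucket is an assumption, as are the class name and parameters. The clock is passed in explicitly so the behavior is deterministic; a real service would probably read `time.monotonic()` instead.

```python
class TokenBucket:
    """Allows short bursts up to `capacity`, then `rate` requests per second."""

    def __init__(self, rate, capacity):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = 0.0             # time of the previous call

    def allow(self, now):
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                # over the limit: reject or queue the request

bucket = TokenBucket(rate=1.0, capacity=2)
decisions = [bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0, 1.0)]
```

The first two requests use up the burst allowance, the third is rejected, and one more request is admitted after a second has passed.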

This work is licensed under a CC agreement; reprints must credit the author and link to this article.