Reliable message eventual consistency (asynchronous assurance)

Time:2021-7-17

Preface

Consistency design is an important issue in distributed systems. If a system uses multiple subsystems to store and read data at the same time, it must define what consistency means in order to meet its functional requirements. If different data subsystems return inconsistent results, users may be confused, and more serious data problems or system errors can follow. There are several levels of consistency, each suited to different business scenarios. For industries such as finance that require high data consistency, traditional transactions provide a strong consistency guarantee. In distributed systems with high performance and availability requirements, it is also acceptable to give up some degree of strong consistency in exchange for a better user experience.

I previously wrote a note about TCC transactions, which covered why distributed transactions arise and some of the solutions. This time I will summarize the design ideas behind the eventual consistency scheme.

Scenario

Message sending consistency means that the business action that generates a message and the sending of that message stay consistent with each other.

In other words, if the business operation succeeds, the message it generates must also be sent successfully; otherwise the message is lost.

Eventual consistency can be used in scenarios such as the following:

  • Asynchronous bookkeeping in the accounting system that corresponds to a payment system
  • Adding points to an ordinary points account

In other words, systems that rely on eventual consistency usually do not roll back when a data operation fails. The user or the system log records that the operation failed, but the inconsistency is not repaired automatically; it remains until a later operation succeeds.

PS: some readers will surely wonder: what if my code throws an error while the program is running, and the eventual consistency solution fails as a result? What should I do? Lots of question marks, right?

If the code throws an error, the problem is in your business code, not in the eventual consistency solution.

Implementation process

Eventual consistency can be achieved with the help of message middleware, message queues and similar tools, and the technical solution needs to be tailored to your own business.

What we introduce here is a reliable message service system based on RabbitMQ that completes the transaction. The specific flow is as follows:

  1. The active application first sends the message to the message middleware, and the message status is marked as “to be confirmed”;
  2. After receiving the message, the message middleware persists it to the message store but does not yet deliver it to the passive application;
  3. The message middleware returns the persistence result (success / failure), and the active application decides how to proceed based on that result:

    • Failure: abandon the business operation and end (returning the failure to the upper layer if necessary);
    • Success: execute the business operation;
  4. After the business operation completes, the active application sends its result (success / failure) to the message middleware;
  5. After receiving the business operation result, the message middleware acts on it:

    • Failure: delete the message from the message store and end;
    • Success: update the message status in the message store to “to be sent” (sendable), and then deliver the message;
  6. The passive application listens for and receives messages in the “to be sent” state, and performs its business processing;
  7. After business processing completes, the passive application sends an ack to the message middleware to confirm that the message has been received (the message middleware then deletes the message from the queue).
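The seven steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not the real RabbitMQ-based service: the names `Middleware`, `pre_send`, `report_result`, `ack` and `run_active_application` are hypothetical, and the "message store" is a plain dictionary.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    TO_BE_CONFIRMED = "to be confirmed"
    TO_BE_SENT = "to be sent"


@dataclass
class Message:
    msg_id: str
    body: str
    status: Status = Status.TO_BE_CONFIRMED


class Middleware:
    """Hypothetical in-memory stand-in for the middleware and its message store."""

    def __init__(self):
        self.store = {}      # msg_id -> Message (the "persisted" messages)
        self.delivered = []  # messages handed to the passive application

    def pre_send(self, msg_id, body):
        # Steps 1-3: persist as "to be confirmed", do not deliver yet,
        # and report the persistence result to the caller.
        self.store[msg_id] = Message(msg_id, body)
        return True

    def report_result(self, msg_id, business_ok):
        # Step 5: delete on failure; mark as sendable and deliver on success.
        if not business_ok:
            self.store.pop(msg_id, None)
            return
        msg = self.store[msg_id]
        msg.status = Status.TO_BE_SENT
        self.delivered.append(msg)

    def ack(self, msg_id):
        # Step 7: the passive application confirms receipt; drop the message.
        self.store.pop(msg_id, None)


def run_active_application(mw, msg_id, body, business_operation):
    """Steps 1-4 as seen from the active application."""
    if not mw.pre_send(msg_id, body):
        return False  # persistence failed: abandon the business operation
    try:
        ok = bool(business_operation())
    except Exception:
        ok = False    # assumption: an exception counts as a failed operation
    mw.report_result(msg_id, ok)
    return ok
```

Note that the message is stored before the business operation runs, so a crash between steps 3 and 4 leaves a “to be confirmed” record behind rather than a silently lost message.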

In addition to the flow above, the message system should also provide a message acknowledgment (ack) service and a message status query service.

Exception handling process


From the perspective of the active application


From the perspective of the middleware


Summary and handling of abnormal situations


Implementation of the scheme


Composition of message system

  1. Message service subsystem:

This is the most important subsystem. It receives and stores pre-sent messages and provides the follow-up confirmation functions. It generally needs to implement the following interface services (the caller is given in parentheses):

  • Store a pre-sent message (active application system)
  • Confirm and send a message (active application system)
  • Query messages whose status confirmation has timed out (message status confirmation subsystem)
  • Confirm that a message has been successfully consumed (passive application system)
  • Query messages whose consumption confirmation has timed out (message recovery subsystem)
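To make the five interfaces concrete, here is a minimal in-memory sketch. All names (`MessageService`, `save_pre_sent`, `confirm_and_send`, and so on) are hypothetical, the status strings follow the wording used in this article, and the `publish` callable stands in for whatever actually hands the message to MQ; a real service would persist to a database instead of a dictionary.

```python
import time
from dataclasses import dataclass, field

PENDING = "to be confirmed"
SENDING = "sending"


@dataclass
class StoredMessage:
    msg_id: str
    body: str
    status: str = PENDING
    updated_at: float = field(default_factory=time.time)


class MessageService:
    def __init__(self, publish):
        self._messages = {}      # msg_id -> StoredMessage
        self._publish = publish  # assumed callable that hands a message to MQ

    def save_pre_sent(self, msg_id, body):
        """Store a pre-sent message (called by the active application)."""
        self._messages[msg_id] = StoredMessage(msg_id, body)

    def confirm_and_send(self, msg_id):
        """Confirm and send a message (called by the active application)."""
        msg = self._messages[msg_id]
        msg.status = SENDING
        msg.updated_at = time.time()
        self._publish(msg)

    def find_confirm_timeouts(self, older_than_seconds, now=None):
        """Messages still awaiting business confirmation after the timeout."""
        return self._timeouts(PENDING, older_than_seconds, now)

    def confirm_consumed(self, msg_id):
        """Confirm successful consumption (called by the passive application)."""
        self._messages.pop(msg_id, None)

    def find_consume_timeouts(self, older_than_seconds, now=None):
        """Messages sent but not confirmed as consumed after the timeout."""
        return self._timeouts(SENDING, older_than_seconds, now)

    def _timeouts(self, status, older_than_seconds, now):
        now = time.time() if now is None else now
        return [m for m in self._messages.values()
                if m.status == status
                and now - m.updated_at >= older_than_seconds]
```

The two query methods are what the status confirmation and recovery subsystems would poll; the optional `now` parameter is only there to make the timeout logic easy to test.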
  2. Message management subsystem:

It provides a visual management interface for querying and managing the data in the reliable message service system. For example, you can view dead messages and manually resend them through the interface.

  3. Message status confirmation subsystem:

It handles one kind of abnormal situation. When the message service subsystem has received and saved a pre-sent message but, because of some failure, never receives the confirmation, the message cannot stay in the database forever. In this case, the message status confirmation subsystem periodically retrieves the messages that are still waiting for confirmation and calls the business query interface of the active application system to check them. Depending on the result of the check, it either sends the message or deletes the data.
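One sweep of that periodic check could look roughly like the sketch below. The function name and its three callables are hypothetical placeholders for the business query interface and the message service calls; skipping a message when the query itself fails is an assumption on my part, so that it is simply retried on the next sweep.

```python
def confirm_pending_messages(pending_ids, query_business_status,
                             confirm_and_send, delete_message):
    """Run one sweep over messages stuck in the "to be confirmed" state.

    pending_ids: ids of messages whose confirmation has timed out.
    query_business_status: callable(msg_id) -> True if the business
        operation succeeded, False if it failed; may raise if the
        active application cannot be reached.
    """
    sent, deleted, skipped = [], [], []
    for msg_id in pending_ids:
        try:
            succeeded = query_business_status(msg_id)
        except Exception:
            # Assumption: leave the message untouched and retry next sweep.
            skipped.append(msg_id)
            continue
        if succeeded:
            confirm_and_send(msg_id)
            sent.append(msg_id)
        else:
            delete_message(msg_id)
            deleted.append(msg_id)
    return sent, deleted, skipped
```

In practice this function would be driven by a scheduler (a cron job or a timer loop) rather than called by hand.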

  4. Message recovery subsystem:

Once a message has received business confirmation, it must be sent to MQ and successfully consumed by the consumer; it must not be lost. The message recovery subsystem periodically fetches the timed-out messages whose status is “sending” but whose consumption has not been confirmed, and resends them.
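A resend sweep might be sketched as follows. The retry limit `MAX_RESENDS` and the `dead` flag are assumptions: the article only says that dead messages exist and can be resent manually from the management interface, not when a message becomes dead. `SendingMessage` and `recover_timed_out` are hypothetical names.

```python
from dataclasses import dataclass

MAX_RESENDS = 3  # assumed limit; the article does not specify one


@dataclass
class SendingMessage:
    msg_id: str
    body: str
    resend_count: int = 0
    dead: bool = False


def recover_timed_out(messages, publish, max_resends=MAX_RESENDS):
    """Resend each timed-out message once, or mark it dead at the limit.

    messages: timed-out messages in the "sending" state.
    publish: assumed callable that hands a message to MQ again.
    """
    for msg in messages:
        if msg.dead:
            continue
        if msg.resend_count >= max_resends:
            # Stop retrying; leave it for manual handling in the
            # message management subsystem.
            msg.dead = True
            continue
        msg.resend_count += 1
        publish(msg)
```

Because the same message can be published more than once, the consumer side has to tolerate duplicates.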

  5. Real-time message service subsystem (MQ):

The consumer listener receives the MQ message. After processing it successfully, the listener calls the message service subsystem's interface to confirm that the message has been consumed and can be deleted.
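A listener along these lines is sketched below, with the MQ client left out entirely. `make_listener`, `handle_business` and `confirm_consumed` are hypothetical names. The duplicate check is my own addition rather than something the article describes: since timed-out messages are resent, the same message may arrive twice, and the in-memory `processed` set would have to be persisted in a real system.

```python
def make_listener(handle_business, confirm_consumed):
    """Build a message handler that confirms consumption after success.

    handle_business: callable(body) that performs the passive
        application's work and raises on failure.
    confirm_consumed: callable(msg_id) on the message service subsystem.
    """
    processed = set()  # ids already handled (assumption: kept in memory)

    def on_message(msg_id, body):
        if msg_id not in processed:
            # If this raises, no confirmation is sent, so the recovery
            # subsystem will resend the message later.
            handle_business(body)
            processed.add(msg_id)
        # Confirm even for duplicates so the message can be deleted.
        confirm_consumed(msg_id)

    return on_message
```

With this shape, a failed business call leaves the message in the “sending” state, and a duplicate delivery is confirmed without repeating the business work.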

Overall process

  1. When the user places an order, the active application pre-sends the message to the message service subsystem.
  2. The message service subsystem stores the pre-sent message.
  3. The message service subsystem returns the result of storing the pre-sent message.
  4. If the result returned in step 3 is success, the active application executes the business operation; otherwise it does not.
  5. After the business operation succeeds, the active application calls the message service subsystem to confirm and send the message.
  6. The message service subsystem sends the pre-sent message stored in the message service database and updates its status to sent (but not yet consumed).
  7. The message middleware delivers the message to the consumer application.
  8. The consumer application calls the passive application service.
  9. The passive application returns the result to the consumer application.
  10. The consumer application acks the message to the message middleware and confirms the successful consumption of the message to the message service subsystem.
  11. The message service subsystem deletes the message or sets its status to consumed successfully.
  12. The message status subsystem regularly checks the message data for timed-out messages in the sent status, that is, messages that have not been successfully consumed. The active application system should provide a query interface for checking whether the business data corresponding to a message was processed successfully.
  13. If the business data was processed successfully, confirm-and-send is called again, that is, step 6.
  14. If the business data processing failed, the message service subsystem is called to delete the message data.

Thanks for reading

This article is a set of notes organized while learning. If anything here is wrong, please contact me so that I can correct it promptly and avoid misleading other readers. Thank you for reading patiently; I hope this article helps you. Later, I plan to write a demo in Hyperf. If you are interested, please follow my GitHub.