A key move for “scalability” in distributed systems: “elastic architecture” explained



If development work really were like building with blocks, life would be good: every block has a clear outline, the blocks are separate from each other, and a damaged one can simply be swapped out.

In practice, though, we often face a single block, and a large one at that. Any change or repair has to be made to the whole thing at once.

In the earlier article “Distributed system focus (13): ‘high cohesion and low coupling’ explained”, brother Z mentioned that we can deliberately split a system up, but how hard replacement and repair are depends on the granularity of the split.

Is there a better way? Of course there is.

Event driven architecture

Let’s look at the problem from a different angle.

Whether it is a routine system upgrade, a bug fix, or a capacity expansion, each is in effect a surgical “operation” performed to solve the current problem.

A layered architecture is then like a person’s hands, feet, mouth and nose: distinct parts, but tightly coupled as a whole.

How are they coupled? They are connected by the flow of “blood”, much as the nodes of a distributed system are connected through an RPC framework.

But software differs from people in that there are two ways to connect: besides the “synchronous” way there is also an “asynchronous” way. Sometimes you don’t need to know the result of another system’s execution; you only need to make sure the data it needs has been handed over.

One architecture is the typical embodiment of this pattern: event driven architecture.

The MQ and the local message table commonly used for passing data are both expressions of the event driven idea.
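To make the asynchronous hand-off concrete, here is a minimal sketch in Python. It assumes an in-process `queue.Queue` standing in for a real MQ; the names `submit_order` and `downstream_worker` are hypothetical and only illustrate that the upstream side returns as soon as the data has been handed over.

```python
import queue
import threading

events = queue.Queue()   # stands in for an MQ topic (assumption)
handled = []             # what the downstream system has processed

def downstream_worker():
    """Consume events until a None sentinel arrives."""
    while True:
        event = events.get()
        if event is None:
            break
        handled.append(event["type"])

def submit_order(order_id):
    """Upstream hands the data over and returns without waiting for a result."""
    events.put({"type": "order_created", "order_id": order_id})
    return "accepted"

worker = threading.Thread(target=downstream_worker)
worker.start()
status = submit_order("A-1001")
events.put(None)         # tell the worker to stop
worker.join()
print(status, handled)
```

The caller only learns that the event was accepted. Whether and when the downstream work succeeds has to be handled separately, which is the consistency cost discussed later.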

Event driven architecture has two typical implementations, similar to the two implementations of the Saga pattern mentioned in “Distributed system focus (3): the ‘transaction’, brother of ‘consensus’”. One is centralized and the other is decentralized.

Brother Z will use an example to make this easier to understand. (The example only explains how things work. A real implementation also has to deal with data consistency and other issues. For that, refer to the earlier articles in this series; links are at the end of this article.)

In a traditional e-commerce scenario, after a user clicks the “submit” button in the shopping cart, at least these things need to happen: generate an order, generate a payment record, and match the order with a delivery company.

What’s the difference between centralization and decentralization in this scenario?


Centralization

This model has a “God”.


But God neither handles nor knows any business logic; it only orchestrates events.

Besides being centralized, what else characterizes it? Brother Z calls it the “3 + 2 structure”.

There are three kinds of participants in this model: the event producer, “God” (the mediator), and the event handler. Between them sit two layers of queues, which decouple them.

Like this: event producer -> queue -> “God” (mediator) -> queue -> event handler

In the example above, the event producer cartservice issues an “order creation” event, which reaches the mediator through a queue. The mediator then converts the event according to predefined rules and distributes it a second time, again through a queue, to the event handlers.
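The flow just described can be sketched as follows. This is only an illustration under stated assumptions: two in-memory `deque` objects stand in for the two queue layers, and `RULES`, `HANDLERS` and the event names are hypothetical. Consistency handling is deliberately left out.

```python
from collections import deque

inbound = deque()    # queue 1: producer -> mediator
outbound = deque()   # queue 2: mediator -> handlers

# Hypothetical orchestration rules: source event -> events to emit
RULES = {
    "order_creation": ["generate_order", "generate_payment", "match_delivery"],
}

def cart_service_submit(order_id):
    """Event producer: only publishes, knows nothing about the handlers."""
    inbound.append({"type": "order_creation", "order_id": order_id})

def mediator_step():
    """Take one event, convert it per the rules, redistribute. No business logic."""
    event = inbound.popleft()
    for target in RULES[event["type"]]:
        outbound.append({"type": target, "order_id": event["order_id"]})

# Event handlers: the only place where business logic lives
HANDLERS = {
    "generate_order": lambda e: f"order {e['order_id']} generated",
    "generate_payment": lambda e: f"payment for {e['order_id']} generated",
    "match_delivery": lambda e: f"delivery matched for {e['order_id']}",
}

def run_handlers():
    results = []
    while outbound:
        event = outbound.popleft()
        results.append(HANDLERS[event["type"]](event))
    return results

cart_service_submit("A-1001")
mediator_step()
results = run_handlers()
print(results)
```

Changing the process means editing `RULES`, not the producer or the handlers, which is where the “visibility” of the centralized mode comes from.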

You may say this is all easy to understand, but you keep hearing the word “orchestration”: what is it, and how is it done?

In fact, the orchestrator mainly does two things: “event conversion” and “event sending” (the latter corresponds to the calls made in a service orchestration framework).

The essence of “event conversion” is assigning values to the parameters of the event object to be sent. Where does the assigned data come from?

Besides the parameters brought in by the source event, there is also a continuously accumulated “context”, kept for example in a shared storage space.


You may ask: how do the events from multiple event handlers get merged into one context?

Use a globally unique identifier. Every time an event is handed to God, the identifier is carried along with it.
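A minimal sketch of this accumulation, assuming a plain dictionary as the shared storage and a hypothetical `trace_id` as the globally unique identifier. The function names `record` and `convert` are made up for illustration.

```python
context_store = {}   # hypothetical shared storage: trace_id -> accumulated fields

def record(trace_id, fields):
    """Merge the fields carried by an incoming event into that trace's context."""
    context_store.setdefault(trace_id, {}).update(fields)

def convert(trace_id, incoming, wanted):
    """Event conversion: fill the outgoing parameters from the event plus context."""
    record(trace_id, incoming)
    ctx = context_store[trace_id]
    return {name: ctx[name] for name in wanted}

trace = "T-42"
record(trace, {"user_id": "u1", "cart_total": 99})   # from an earlier event
payment_event = convert(trace, {"order_id": "A-1001"}, ["order_id", "cart_total"])
print(payment_event)
```

Here `cart_total` comes from an earlier event and `order_id` from the current one; the identifier is what lets the mediator find both.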

Aside: under each globally unique identifier there is usually also an internally unique “sub sequence number”, which works together with the “event sending” described next. It serves two purposes.

The first is that, when troubleshooting, you can tell exactly which upstream system an exception in this call came from.

The second is that it makes it easy to check whether the whole call flow matches what the orchestration expects.

How is it done? Pass a sequence number in the format x.x.x.x. For example, serial steps are 1, 2, 3. Branches and parallel steps are 2.1, 2.2. A mix of branching and serial steps is 1, 2, 2.1, 2.2, 3.
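As a small illustration of why the dotted format is convenient, the sketch below restores the expected order from events that arrived out of order and picks out the branches of one step. The helper names `parse` and `parent` are hypothetical.

```python
def parse(seq):
    """Turn "2.1" into (2, 1) so sequence numbers compare numerically."""
    return tuple(int(part) for part in seq.split("."))

def parent(seq):
    """Return the step a branch belongs to, or None for a top-level step."""
    parts = seq.split(".")
    return ".".join(parts[:-1]) if len(parts) > 1 else None

received = ["3", "2.2", "1", "2.1", "2"]   # arrival order, not call order
ordered = sorted(received, key=parse)
branches_of_2 = [s for s in ordered if parent(s) == "2"]
print(ordered, branches_of_2)
```

Comparing the sorted list with the orchestration rules shows at a glance whether a step was skipped or a branch is missing.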

The essence of “event sending” is controlling the logic of the event flow and then delivering each event to an “event handler” for processing. It decides whether execution proceeds in sequence or in branches, serially or in parallel.


We won’t go any deeper here, or we would drift off topic. This part will be covered in detail another time.

Once more: “event conversion” and “event sending” are the two basic functions you must provide when implementing “God” (the mediator).

The biggest advantage of centralization is that it makes the process more “visible” and monitoring easier to add. The larger the system, the more this advantage pays off.

But even a basic “God” (mediator) implementation has to take data consistency into account, which greatly increases its complexity.

So if your business is not particularly large and is relatively stable, decentralization may be the better choice.

Decentralization

Since there is no God in this mode, each event handler needs to know what its next event handler is, which parameters that handler requires, and which queue to use.

In return the overall structure becomes much simpler, going from the “3 + 2 structure” to a “2 + 1 structure”.


The complexity removed from the structure moves into the business code written by the developers of the event handlers, because they are now responsible for “event conversion” and “event sending”.
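A minimal sketch of the “2 + 1 structure”, assuming a single in-memory `deque` as the one queue layer. The handler and event names are hypothetical. Note that each handler now builds and sends the next event itself.

```python
from collections import deque

bus = deque()   # the single queue layer
log = []

def order_handler(event):
    log.append("order generated")
    # The handler must know its successor and the parameters it needs.
    bus.append({
        "type": "generate_payment",
        "order_id": event["order_id"],
        "amount": event["amount"],
    })

def payment_handler(event):
    log.append(f"payment of {event['amount']} generated")
    bus.append({"type": "match_delivery", "order_id": event["order_id"]})

def delivery_handler(event):
    log.append("delivery matched")   # end of the chain, nothing to send

HANDLERS = {
    "generate_order": order_handler,
    "generate_payment": payment_handler,
    "match_delivery": delivery_handler,
}

# The producer publishes the first event; the handlers drive the rest.
bus.append({"type": "generate_order", "order_id": "A-1001", "amount": 99})
while bus:
    event = bus.popleft()
    HANDLERS[event["type"]](event)
print(log)
```

There is no mediator to edit here: changing the order of steps means changing the handlers’ own code.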

Once a system has been converted to an event driven architecture, the decoupling provided by “queues” and the asynchronous event flow do make it run more smoothly.

But sometimes you want finer-grained control, because a service usually handles many business flows, not just one external interface and one piece of business logic.

In that case, most of the time you only need to modify one of the interfaces. Can you change just that part of the code and “hot update” it?

Microkernel architecture (also called plug-in architecture) is well suited to this problem.

Microkernel architecture

As the name suggests, the key to microkernel architecture is the kernel. So the first step is to identify what the kernel is. Everything else is then treated as a “detachable” part.

For a person, for example, the brain is the core and everything else can be replaced. After a replacement you are still you, but replace the brain and you are no longer you.

Microkernel architecture consists of two parts: the core system and plug-in modules.


The core system in turn contains the microkernel, a plug-in module, and some built-in default functions shipped as plug-ins.

The microkernel is mainly responsible for managing plug-in life cycles and controlling the plug-in module.

The plug-in module is responsible for loading, replacing and unloading plug-ins.

For an external plug-in to be connected and run smoothly, it must provide an implementation that conforms to a standard interface specification.

The standard interface of a plug-in has at least two methods that each concrete plug-in must implement:

using System.Collections.Generic;

public interface IPlugin
{
    /// <summary>
    /// Initialize configuration
    /// </summary>
    void InitializeConfig(Dictionary<string, string> configs);

    /// <summary>
    /// Run the plug-in
    /// </summary>
    void Run();
}

Finally, plug-ins are independent of each other, but the core system knows where to find them and how to run them.
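The division of labour above can be sketched in a few lines of Python. This is an illustration, not a reference design: the `Kernel` class and the two sample plug-ins are hypothetical, the method names mirror `InitializeConfig` and `Run`, and `run` returns a value here only so the effect of a replacement is visible.

```python
class Kernel:
    """Knows where plug-ins are and how to run them, nothing about what they do."""

    def __init__(self):
        self._plugins = {}

    def load(self, name, plugin, configs):
        plugin.initialize_config(configs)
        self._plugins[name] = plugin   # loading under an existing name replaces it

    def unload(self, name):
        self._plugins.pop(name, None)

    def is_loaded(self, name):
        return name in self._plugins

    def run(self, name):
        return self._plugins[name].run()


class GreetingPlugin:
    def initialize_config(self, configs):
        self.greeting = configs.get("greeting", "hello")

    def run(self):
        return self.greeting


class ShoutingPlugin(GreetingPlugin):
    def run(self):
        return self.greeting.upper() + "!"


kernel = Kernel()
kernel.load("greet", GreetingPlugin(), {"greeting": "hi"})
first = kernel.run("greet")
kernel.load("greet", ShoutingPlugin(), {"greeting": "hi"})   # "hot" replacement
second = kernel.run("greet")
kernel.unload("greet")
print(first, second, kernel.is_loaded("greet"))
```

The kernel code did not change between the two runs; only the plug-in registered under the name `greet` did.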

Best practices

Now that you know these two “elastic” architecture patterns, how do you decide when to bring them out?

Brother Z will walk you through the advantages and disadvantages of each, from which the suitable scenarios follow.

Event driven architecture

Its advantages are:

  1. Decoupling through “queues” lets the system respond to fast-changing requirements and go live quickly without affecting upstream systems.
  2. Because an “event” is an independent, “standardized” communication carrier, it can connect programs across platforms and languages. If events are additionally persisted, later troubleshooting becomes easier. Events can also be replayed repeatedly, which allows more realistic throughput testing of the handlers.
  3. More “dynamic” and more fault tolerant. New and existing event handlers can be integrated, re-integrated, reconfigured or removed at low cost, so scaling out and in is easy.
  4. In “God” mode the business flow is under “visible” control, which makes unreasonable or overlooked steps in the process easier to spot. It also lets you standardize some technical details, such as how “data consistency” is implemented.
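The persistence and replay mentioned in point 2 can be sketched as follows, assuming a plain list of JSON lines standing in for durable storage. The names `persist` and `replay` are hypothetical.

```python
import json

def persist(log_lines, event):
    """Append the event to the log in a language-neutral format."""
    log_lines.append(json.dumps(event, sort_keys=True))

def replay(log_lines, handler):
    """Feed every persisted event to a handler again, in the original order."""
    return [handler(json.loads(line)) for line in log_lines]

event_log = []   # stands in for durable storage (assumption)
persist(event_log, {"type": "order_creation", "order_id": "A-1"})
persist(event_log, {"type": "order_creation", "order_id": "A-2"})

seen = replay(event_log, lambda e: e["order_id"])
seen_again = replay(event_log, lambda e: e["order_id"])   # replay is repeatable
print(seen, seen_again)
```

Because the log holds real events in a standard format, the same replay can drive a handler written in any language during a load test.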

Its disadvantages are:

  1. Unstable networks and all kinds of exceptions mean that guaranteeing consistency takes considerable effort and cost.
  2. Unlike a synchronous call, you cannot see the latest data immediately after an operation succeeds. You have to tolerate the delay or do extra work on the user experience.

It is therefore suited to the following scenarios:

  • Scenarios with low real-time requirements.
  • Systems with many cross-platform, multi-language heterogeneous environments.
  • Scenarios that aim to maximize program reuse.
  • Business scenarios that change flexibly.
  • Scenarios that need to scale out and in frequently.

Microkernel architecture

Its advantages are:

  1. It is convenient for progressive design and incremental development: implement a solid core system first, then add functions and features step by step.
  2. Like event driven architecture, it keeps a single component failure from crashing the whole system, so fault tolerance is good. The kernel only needs to restart the failed component, without affecting other functions.
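A minimal sketch of the fault isolation in point 2, under the assumption that plug-ins are plain callables and that “restarting” a plug-in means rebuilding it from a factory. `run_all`, `broken` and the plug-in names are hypothetical.

```python
def run_all(plugins, factories):
    """Run each plug-in; on failure, rebuild only that plug-in and carry on."""
    results = {}
    for name in list(plugins):
        try:
            results[name] = plugins[name]()
        except Exception:
            plugins[name] = factories[name]()   # "restart" just this component
            results[name] = "restarted"
    return results

def broken():
    raise RuntimeError("plugin crashed")

plugins = {"report": lambda: "ok", "export": broken}
factories = {
    "report": lambda: (lambda: "ok"),
    "export": lambda: (lambda: "ok"),   # a fresh, working instance
}

first_pass = run_all(plugins, factories)
second_pass = run_all(plugins, factories)
print(first_pass, second_pass)
```

The failure of `export` neither stops `report` from running nor takes the kernel loop down, and the next pass finds a fresh instance in place.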

Its disadvantages are:

  1. Because the microkernel itself is small, it cannot optimize the system as a whole. Each plug-in manages itself and may even be maintained by a different team.
  2. To avoid a complexity explosion inside a single application, plug-ins nesting other plug-ins is usually not allowed, so code reuse across plug-ins is poorer.

It is therefore suited to the following scenarios:

  • It can be embedded in, or form part of, other architecture patterns. For example, in event driven architecture, God’s “event conversion” can be implemented with a microkernel architecture.
  • The business logic differs but the running process is the same. For example, scheduled tasks and job scheduling applications.
  • Scenarios with clear expectations of incremental development.


OK, let’s summarize.

This time, brother Z introduced two implementation modes of “event driven architecture” and the ideas behind them, as well as the implementation idea of “microkernel architecture”.

He also analyzed the advantages and disadvantages of the two architecture patterns and the scenarios each one suits.

I hope this gives you some inspiration.

Related articles:

  • Distributed system focus (1): a first look at “data consistency”
  • Distributed system focus (2): achieving data consistency through “consensus”
  • Distributed system focus (3): the “transaction”, brother of “consensus”

By Zachary



About the author: Zachary (personal WeChat ID: Zachary ZF), committed to polishing every piece of original writing.

He regularly publishes original content on architecture design, distributed systems, product operations, and some personal reflections.
