The actor model is so excellent under distributed high concurrency

Time: 2021-1-26

Preface

In general, there are two strategies for communication between concurrent threads: shared data and message passing. The biggest problem with shared-data concurrency is the data race, and dealing with all kinds of locks to prevent it is a headache.

Concurrency in most popular traditional languages is based on memory shared between threads, with synchronization used to prevent write contention. Actors use a message-passing model instead. Each actor processes at most one message at a time and can send messages to other actors, so every piece of state has a single writer and write contention between threads is avoided. Compared with data sharing, the biggest advantage of message passing is that there are no data races. There are two common styles of message passing: channel-based (represented by Go) and actor-based (represented by Erlang).

About actors

The actor model was first defined by Carl Hewitt in 1973 and was later popularized by Erlang/OTP. Its message passing is closer to the original intent of object orientation. The actor model is a component-based concurrency model: it raises the level of the concurrent programming paradigm by letting users work with components instead of dealing directly with low-level concepts such as multithreading or thread pools.

Actor model = data + behavior + message.

The actor model is a general concurrent programming model that does not belong to any particular language or framework, and it can be used in almost any programming language. Erlang is the most typical example: it supports the actor model at the language level, and the well-known message broker RabbitMQ is written in Erlang.

More object-oriented

Actors are similar to objects in object-oriented (OO) programming. Each actor instance encapsulates its own state and is isolated from other actors. Take a game as an example: each player is an instance of a player actor in the actor system and has its own attributes, such as ID, nickname, and attack power. At the code level this is not much different from ordinary OO code, and in memory the system simply holds many such instances:

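    // Each player is one actor instance that owns its own state.
    class PlayerActor
    {
        public int Id { get; set; }           // unique player ID
        public string Name { get; set; }      // nickname
        public int AttackPower { get; set; }  // attack power
    }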

No locks

When using Java or C#, we need to pay special attention to a range of threading problems such as locking and memory atomicity. In the actor model, each actor maintains its own internal state, and that state can only be modified by the actor itself, in response to messages. Using actors for concurrent programming therefore avoids these problems. An actor processes its messages one at a time, much like the single-threaded command execution in Redis, so an actor that owns a resource can play the role that a distributed lock would otherwise play.
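The single-writer idea can be illustrated with a minimal sketch in Python. The class name `CounterActor`, the string messages, and the helper `run_demo` are all hypothetical, chosen only for this illustration. Several threads send increments, but only the actor's own thread touches the counter, so the business code takes no lock. The only synchronization is the one hidden inside the thread-safe queue that serves as the mailbox.

```python
import queue
import threading


class CounterActor:
    """Owns a counter; only the actor's own thread ever reads or writes it."""

    def __init__(self):
        self._mailbox = queue.Queue()   # thread-safe FIFO used as the mailbox
        self._count = 0                 # private state, never shared
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1        # no lock needed: single consumer

    def stop_and_get(self):
        """Ask the actor to stop, wait for it, then return the final count."""
        self.send("stop")
        self._thread.join()
        return self._count


def run_demo(senders=8, per_sender=1000):
    actor = CounterActor()

    def worker():
        for _ in range(per_sender):
            actor.send("increment")

    threads = [threading.Thread(target=worker) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return actor.stop_and_get()


if __name__ == "__main__":
    print(run_demo())
```

With a plain shared integer and no lock, the same workload could lose updates; here every increment is applied because the mailbox serializes them.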

Asynchronous

Each actor has a dedicated mailbox for receiving messages, which is the basis of the actor model’s asynchrony. When an actor sends a message to another actor, it does not call a method on that actor directly; it delivers the message to the recipient’s mailbox. This works like a postman who, instead of handing each letter to the recipient in person, drops it into the recipient’s mailbox and moves straight on to the next delivery. As a result, sending a message in an actor system is very fast.

The main advantage of this design is that it decouples actors from one another. Tens of thousands of actors can run concurrently, each at its own pace, sending and receiving messages without blocking.
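A rough sketch of the mailbox idea in Python follows. The `Actor` base class, its `tell` and `receive` methods, and the `SlowRecorder` subclass are names invented for this example, not any framework's API. The receiver is deliberately held up by a gate, yet every `tell` returns at once; when the gate opens, the messages are handled in the order they were posted.

```python
import queue
import threading

_STOP = object()  # private sentinel that ends the actor's loop


class Actor:
    """Minimal actor: one mailbox, one thread that drains it in order."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget: drop the message in the mailbox and return."""
        self._mailbox.put(message)

    def stop(self):
        """Stop after everything already in the mailbox has been handled."""
        self._mailbox.put(_STOP)
        self._thread.join()

    def _loop(self):
        while True:
            message = self._mailbox.get()
            if message is _STOP:
                return
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError


class SlowRecorder(Actor):
    """Receiver that cannot make progress until `gate` is opened."""

    def __init__(self, gate):
        self.gate = gate
        self.seen = []
        super().__init__()  # start the thread only after state is set up

    def receive(self, message):
        self.gate.wait()
        self.seen.append(message)


def demo():
    gate = threading.Event()
    recorder = SlowRecorder(gate)
    for i in range(5):
        recorder.tell(i)                 # returns even though the receiver is stuck
    handled_while_blocked = len(recorder.seen)
    gate.set()                           # let the receiver catch up
    recorder.stop()
    return handled_while_blocked, recorder.seen


if __name__ == "__main__":
    print(demo())
```

The sender finishes all five deliveries while the receiver has handled none, which is the postman behavior described above.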

Isolation

Each actor instance maintains its own state and is isolated from other actor instances; it does not rely on shared data the way the multithreading-plus-locks approach does. Actors communicate only through messages, and this differs from “message passing” in OO, where sending a message is really a method call: between actors, a real message is delivered.

Naturally distributed

The location of each actor instance is transparent: whether an actor’s address is local or remote, the calling code is the same. Each actor instance is very small, only a few hundred bytes, so it is easy to create hundreds of thousands of actor instances on a single machine. If you have written Go code, you will find that an actor is comparable in weight to a goroutine. Because of location transparency, an actor system can scale out horizontally to absorb more concurrency. To the caller, the called actor always appears to be local. This also relies on the routing layer of the actor system.
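Location transparency can be sketched as a reference object that hides the transport. Everything in the Python example below is hypothetical (`Node`, `ActorRef`, `LocalTransport`, `SimulatedRemoteTransport`, `add_to_cart`), and the "remote" hop is only simulated by round-tripping the message through JSON bytes in the same process. Delivery is synchronous to keep the sketch short; a real system would deliver to a mailbox.

```python
import json


class Node:
    """A host that owns some actors, keyed by name."""

    def __init__(self):
        self._handlers = {}

    def spawn(self, name, handler):
        self._handlers[name] = handler

    def deliver(self, name, message):
        self._handlers[name](message)


class LocalTransport:
    """Hands the message to a node in the same process."""

    def __init__(self, node):
        self._node = node

    def send(self, name, message):
        self._node.deliver(name, message)


class SimulatedRemoteTransport:
    """Stands in for a network hop: the message travels as JSON bytes."""

    def __init__(self, node):
        self._node = node

    def send(self, name, message):
        wire = json.dumps({"to": name, "body": message}).encode("utf-8")
        packet = json.loads(wire.decode("utf-8"))
        self._node.deliver(packet["to"], packet["body"])


class ActorRef:
    """What callers hold: a name plus whatever transport reaches it."""

    def __init__(self, name, transport):
        self._name = name
        self._transport = transport

    def tell(self, message):
        self._transport.send(self._name, message)


def add_to_cart(product_ref, quantity):
    """Caller code: identical whether the product actor is local or remote."""
    product_ref.tell({"type": "reserve", "quantity": quantity})


def run_demo():
    local_node, remote_node = Node(), Node()
    local_inbox, remote_inbox = [], []
    local_node.spawn("product-1", local_inbox.append)
    remote_node.spawn("product-2", remote_inbox.append)
    refs = [
        ActorRef("product-1", LocalTransport(local_node)),
        ActorRef("product-2", SimulatedRemoteTransport(remote_node)),
    ]
    for ref in refs:
        add_to_cart(ref, 2)
    return local_inbox, remote_inbox
```

The design point is that `add_to_cart` never learns where the actor lives; only the routing layer that built the `ActorRef` does.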

Life cycle

Each actor instance has its own life cycle. Much like garbage collection in C# or Java, the system destroys actors that are no longer needed and releases their memory and other resources so that the system can keep running. In an actor system, an actor can be destroyed either manually or automatically by the system.
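Both ways of ending an actor's life can be sketched in a few lines of Python. The example assumes one possible automatic policy, an idle timeout, which the text above does not prescribe; `IdleStoppingActor`, its `"stop"` message, and the `_on_stop` hook are hypothetical names.

```python
import queue
import threading


class IdleStoppingActor:
    """Stops on an explicit "stop" message or after `idle_timeout` seconds of silence."""

    def __init__(self, idle_timeout=0.1):
        self.idle_timeout = idle_timeout
        self.handled = []                  # private state owned by this actor
        self.stopped = threading.Event()   # set once resources are released
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def tell(self, message):
        self._mailbox.put(message)

    def join(self):
        self._thread.join()

    def _run(self):
        try:
            while True:
                try:
                    message = self._mailbox.get(timeout=self.idle_timeout)
                except queue.Empty:
                    return                 # automatic: idle for too long
                if message == "stop":
                    return                 # manual: someone asked us to stop
                self.handled.append(message)
        finally:
            self._on_stop()

    def _on_stop(self):
        # A real actor would persist state and release resources here.
        self.stopped.set()


if __name__ == "__main__":
    actor = IdleStoppingActor(idle_timeout=0.05)
    actor.tell("hello")
    actor.start()
    actor.join()                           # returns once the actor has gone idle
    print(actor.handled, actor.stopped.is_set())
```

Messages that arrive after the actor has stopped are simply never handled in this sketch; a real system would typically reactivate the actor or route them elsewhere.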

Fault tolerance

The actor model’s approach to fault tolerance is quite surprising. The traditional approach is to anticipate exceptions and catch them in advance to keep the system stable, which is known as defensive programming. But defensive programming has its own shortcoming: as in real life, the defender can never guard against 100% of possible future defects. For example, Java code is often full of checks for whether a variable is null, which is the most typical case of defensive coding. Programs built on the actor model do not program defensively; they follow the “let it crash” philosophy and leave crashes to the actor’s supervisor. For example, after an actor crashes, the supervisor can choose to create a new instance or just log the failure. The crash or exception information of every actor can be reported to its supervisor, which gives the actor system flexibility in managing each actor instance.
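A minimal sketch of “let it crash” in Python is given below. The `Divider` child contains no defensive checks at all, and a hypothetical `Supervisor` decides what to do when it fails: record the crash and replace the child with a fresh instance. The restart limit and all names are assumptions made for this illustration; real supervisors such as those in Erlang/OTP offer richer strategies.

```python
import queue
import threading

_STOP = object()


class Divider:
    """Child actor: no defensive checks, a bad message simply raises."""

    def __init__(self):
        self.results = []

    def receive(self, message):
        a, b = message
        self.results.append(a / b)   # raises ZeroDivisionError when b == 0


class Supervisor:
    """Feeds messages to a child and restarts it when it crashes."""

    def __init__(self, child_factory, max_restarts=3):
        self.child_factory = child_factory
        self.max_restarts = max_restarts
        self.child = child_factory()
        self.restarts = 0
        self.crash_log = []
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        self._mailbox.put(message)

    def stop(self):
        self._mailbox.put(_STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is _STOP:
                return
            try:
                self.child.receive(message)
            except Exception as exc:
                self.crash_log.append(repr(exc))      # the "log" option
                if self.restarts >= self.max_restarts:
                    return                            # give up on this child
                self.restarts += 1
                self.child = self.child_factory()     # the "new instance" option


if __name__ == "__main__":
    supervisor = Supervisor(Divider)
    for job in [(6, 3), (1, 0), (8, 2)]:
        supervisor.tell(job)
    supervisor.stop()
    print(supervisor.restarts, supervisor.child.results)
```

Note that the replacement child starts with empty state, so results computed before the crash are gone; whether state should survive a restart is a design decision left to the supervisor.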

Disadvantages

No language is perfect, and the same is true of frameworks and models. As a concurrency model for distributed environments, the actor model has its own disadvantages.

  1. Because actors of the same type are scattered across multiple hosts, operating on a set of actors is a weak point. For example, in an e-commerce system where each product is an actor, querying a list of products usually works like this: first select a series of product IDs according to the query criteria (this usually calls for a product search service, whether built on Elasticsearch or another search engine), then fetch the product actors by ID. If the volume is very large, there is a risk of a network storm, although the probability is small. When the real-time requirement is not strict, the product list can instead be maintained separately, with an MQ delivering product-change events to keep the data consistent.
  2. In many distributed systems based on the actor model, the cache is an in-process cache: each actor keeps its own state in the process, which the industry usually calls a stateful service. But each actor has its own life cycle. Can that cause problems? It can. Take products as an example again. In a non-actor concurrency model, the product cache can use an LRU policy to evict inactive entries so that memory usage stays bounded. With an in-process cache based on the actor model, each actor is itself the cache entry, so it is hard to apply an LRU policy to bound memory usage, because you do not know how active each actor is.
  3. Distributed transactions are a problem for every distributed model, not just for actors. Take the product actor as an example: when a product is added, the product actor and the product-statistics actor (in many cases they are designed as two separate actor services) must preserve transactional integrity and data consistency. In many cases, real-time consistency can be sacrificed in favor of eventual consistency.
  4. The mailbox of each actor may back up or fill. When that happens, should new messages be discarded or made to wait? The mailbox therefore needs careful attention when designing an actor system.
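The mailbox question in the last item can be made concrete with a small Python sketch. `BoundedMailbox` and its two policies, `"drop"` and `"block"`, are hypothetical names for the two choices mentioned above; the sketch builds on the standard library's `queue.Queue`, whose `put_nowait` and timed `put` raise `queue.Full` when there is no room.

```python
import queue


class BoundedMailbox:
    """Mailbox with a fixed capacity and an explicit overflow policy."""

    def __init__(self, capacity, policy="drop"):
        if policy not in ("drop", "block"):
            raise ValueError("policy must be 'drop' or 'block'")
        self._queue = queue.Queue(maxsize=capacity)
        self.policy = policy
        self.dropped = 0

    def offer(self, message, timeout=None):
        """Return True if the message was enqueued, False if it was discarded.

        With policy "block" and timeout=None the sender waits until there is
        room, which pushes back on fast senders.
        """
        try:
            if self.policy == "drop":
                self._queue.put_nowait(message)
            else:
                self._queue.put(message, timeout=timeout)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def take(self):
        """Called by the owning actor to fetch the next message."""
        return self._queue.get()


if __name__ == "__main__":
    mailbox = BoundedMailbox(capacity=2, policy="drop")
    print([mailbox.offer(i) for i in range(3)], mailbox.dropped)
```

Dropping keeps senders fast but loses messages; blocking keeps every message but lets a slow actor stall its senders. Which trade-off is acceptable depends on the business scenario.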

Further thoughts

  1. As noted above, because actors are location transparent, any actor appears local to every other actor. This enables a lot. In a traditional distributed system, if server A wants to communicate with server B, it uses either an RPC call (plain HTTP calls are less common) or an MQ system. In an actor system, communication between servers becomes very elegant: although it is still essentially an RPC call, to the programmer it looks like calling a local function. That said, streaming is more popular now.
  2. Because the execution model of an actor is single-threaded and asynchronous, any feature that involves contention for a resource is a good fit for the actor model, such as flash sales (seckill).
  3. Given the above, the actor model supports load balancing at the design level and scales horizontally very well. A distributed actor system still needs a service registry, of course.
  4. Although an actor has a single-threaded execution model, that does not mean every actor occupies a thread of its own. The work executed by an actor can be lightweight, like a goroutine in Go, and all actors on a host can share one thread pool, so the business code gets the most work done with the fewest thread resources.
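The last point, many actors on a few threads, can be sketched in Python as follows. This is one possible scheduling scheme and not the design of any particular framework; `PooledActor` and `run_demo` are hypothetical names. Each actor keeps a `scheduled` flag so that at most one drain task per actor is ever in the pool, which preserves one-message-at-a-time processing even though the threads are shared.

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class PooledActor:
    """Actor without a thread of its own; it borrows one from a shared pool."""

    def __init__(self, pool, handler):
        self._pool = pool
        self._handler = handler
        self._mailbox = deque()
        self._lock = threading.Lock()   # guards the mailbox and flag, not business state
        self._scheduled = False

    def tell(self, message):
        with self._lock:
            self._mailbox.append(message)
            if self._scheduled:
                return                  # a drain task is already queued or running
            self._scheduled = True
        self._pool.submit(self._drain)

    def _drain(self):
        while True:
            with self._lock:
                if not self._mailbox:
                    self._scheduled = False
                    return
                message = self._mailbox.popleft()
            self._handler(message)      # runs outside the lock, one message at a time


def run_demo(actors=100, messages=50, threads=4):
    counts = [0] * actors               # counts[i] is touched only by actor i
    used_threads = set()                # demo bookkeeping only

    def make_handler(index):
        def handler(_message):
            counts[index] += 1
            used_threads.add(threading.current_thread().name)
        return handler

    with ThreadPoolExecutor(max_workers=threads) as pool:
        group = [PooledActor(pool, make_handler(i)) for i in range(actors)]
        for _ in range(messages):
            for actor in group:
                actor.tell("tick")
    # Leaving the with-block waits for all queued drain tasks to finish.
    return counts, len(used_threads)


if __name__ == "__main__":
    counts, thread_count = run_demo()
    print(set(counts), thread_count)
```

One simplification to be aware of: a drain task runs until its mailbox is empty, so a very busy actor can hold a pool thread for a long time. Processing a limited batch per task and then resubmitting would be fairer.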
