SOFAStack (Scalable Open Financial Architecture Stack) is a financial-grade cloud native architecture developed independently by Ant Financial. It contains the components needed to build a financial-grade cloud native architecture and reflects best practices refined in financial scenarios.
SOFARegistry is Ant Financial's open source service registry, offering high availability and high-capacity registration and subscription. Driven by the business growth of Alipay and Ant Financial, it has evolved to its fifth generation over the past ten years.
This article is the seventh part of the SOFARegistry framework analysis series. The series is produced by the SOFA team and source code enthusiasts as part of <SOFA:RegistryLab/>. Earlier articles in the series are listed at the end of this article.
GitHub address: https://github.com/sofastack/sofa-registry
As described in the previous articles, SOFARegistry differs from other registries mainly in that it supports massive data and massive numbers of clients, delivers service online and offline notifications within seconds, and is highly available. This article describes the consistency schemes of SOFARegistry from the following aspects:
- MetaServer data consistency
MetaServer stores the metadata of SOFARegistry. To keep it highly available and to keep the MetaServer cluster consistent, the Raft protocol is used for leader election and replication.
- Session server data consistency
To support connections from massive numbers of clients, SOFARegistry adds a SessionServer layer between the client and the DataServer. Clients connect to the SessionServer, which keeps the number of connections to the DataServer under control. When a client talks to the DataServer through the SessionServer, the publisher data is also cached in the SessionServer, so data consistency between the DataServer and the SessionServer has to be addressed.
- Dataserver data consistency
To support massive data, SOFARegistry shards the stored data across DataServers so that no single server has to hold the full data set. In this model, each data shard has multiple replicas. When DataServers are added or removed, the MetaServer notifies the DataServers and SessionServers of the change, and the data shards are migrated and synchronized within the cluster. This raises the problem of data consistency within the DataServer cluster.
MetaServer data consistency
MetaServer manages cluster metadata in SOFARegistry and maintains the list of cluster members. It can be regarded as the registry of SOFARegistry itself. When SessionServers and DataServers need the cluster list, or when the cluster is scaled out or in, MetaServer provides the corresponding data.
Figure 1: Internal structure of MetaServer
The figure is from "Service registry MetaServer function introduction and implementation analysis | SOFARegistry analysis".
Because the SOFARegistry cluster node list does not contain much data, MetaServer does not need to shard it. As shown in Figure 1, the cluster node list is stored in the Repository. Node registration, renewal and list query are exposed as Bolt requests and go through the Raft strong consistency protocol, so that the data obtained by the cluster is strongly consistent.
For the Raft algorithm itself, see "The Raft Consensus Algorithm". Within the SOFA ecosystem, SOFAJRaft is an implementation of the Raft protocol. The principle of the Raft algorithm is briefly introduced below.
Raft protocol consists of three parts: leader election, log replication and safety.
- Leader election
An election algorithm selects a leader, which accepts client requests and appends the commands to its log.
Figure 2: State transition diagram of the Raft state machine
The figure is from "Understanding the Raft Consensus Algorithm: An Academic Article Summary".
- Log replication
After receiving a client request, the leader appends the operation to its log, replicates the entry to the followers, commits the entry, and then returns the result to the client.
Figure 3: Replicated state machine
The figure is from "Raft consistent algorithm notes".
- Safety
The safety rules ensure data consistency.
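The leader-side log replication flow can be illustrated with a toy, single-process sketch. It is not SOFAJRaft: it leaves out terms, elections, timeouts and follower catch-up, and the names ToyRaftLeader, Follower and propose are hypothetical. It only shows the rule that an entry is committed once a majority of the cluster has stored it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy leader: hypothetical names, no terms, no elections, no catch-up. */
final class ToyRaftLeader {

    /** A follower that may or may not be reachable. */
    static final class Follower {
        final List<String> log = new ArrayList<>();
        boolean reachable = true;

        /** Accepts the entry only if reachable and its log matches the leader's log up to prevIndex. */
        boolean append(int prevIndex, String entry) {
            if (!reachable || log.size() != prevIndex + 1) {
                return false;
            }
            log.add(entry);
            return true;
        }
    }

    final List<String> log = new ArrayList<>();
    final List<Follower> followers;
    int commitIndex = -1; // index of the last committed entry, -1 if none

    ToyRaftLeader(List<Follower> followers) {
        this.followers = followers;
    }

    /** Appends locally, replicates, and commits only when a majority holds the entry. */
    boolean propose(String entry) {
        int prevIndex = log.size() - 1;
        log.add(entry);
        int acks = 1; // the leader counts itself
        for (Follower f : followers) {
            if (f.append(prevIndex, entry)) {
                acks++;
            }
        }
        int clusterSize = followers.size() + 1;
        if (acks > clusterSize / 2) {
            commitIndex = log.size() - 1;
            return true; // only now would the result be returned to the client
        }
        return false;
    }
}
```

With two followers, propose succeeds while at least one follower acknowledges; if both are unreachable the entry stays in the leader's log uncommitted.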
Data consistency guarantee based on raft protocol
Figure 4: Raft storage process in SOFARegistry
The figure is from "Service registry MetaServer function introduction and implementation analysis | SOFARegistry analysis".
As shown in Figure 4, data storage through the Raft protocol in SOFARegistry goes through the stages above. When a client initiates a Raft call such as data registration, renewal or query, the call is intercepted by the dynamic proxy class ProxyHandler, which uses RaftClient to send the data to RaftServer. The internal state machine, StateMachine, finally applies the data operation, which ensures data consistency within the MetaServer cluster.
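The path from a dynamic proxy to a state machine can be sketched with the JDK's java.lang.reflect.Proxy. This is a minimal illustration, not SOFARegistry's ProxyHandler: the NodeRepository interface, MapStateMachine and proxyFor are hypothetical names, and the in-memory list only stands in for the replicated Raft log that a real RaftClient and RaftServer would maintain.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RaftProxySketch {

    /** Hypothetical business interface whose calls must pass through the log. */
    interface NodeRepository {
        void register(String nodeId, String address);
        String query(String nodeId);
    }

    /** Plays the role of the state machine: applies operations to the actual store. */
    static final class MapStateMachine implements NodeRepository {
        private final Map<String, String> nodes = new TreeMap<>();

        @Override
        public void register(String nodeId, String address) {
            nodes.put(nodeId, address);
        }

        @Override
        public String query(String nodeId) {
            return nodes.get(nodeId);
        }
    }

    /**
     * Wraps the state machine in a dynamic proxy. Every call is first recorded
     * in operationLog (a stand-in for replicating it via Raft), then applied.
     */
    static NodeRepository proxyFor(NodeRepository stateMachine, List<String> operationLog) {
        InvocationHandler handler = (proxy, method, args) -> {
            operationLog.add(method.getName());
            return method.invoke(stateMachine, args);
        };
        return (NodeRepository) Proxy.newProxyInstance(
                NodeRepository.class.getClassLoader(),
                new Class<?>[] {NodeRepository.class},
                handler);
    }
}
```

Callers only see the NodeRepository interface, so routing calls through the log requires no change in the calling code, which is the main appeal of the proxy approach.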
Session server data consistency
The SessionServer is responsible for session management and client connections in SOFARegistry. Subscribers subscribe to the service data of the DataServer through the SessionServer, and publishers publish service data to the DataServer through the SessionServer.
As an intermediate proxy layer, the SessionServer caches the data obtained from the DataServer, and data from the DataServer is pushed to subscribers through the SessionServer. Two scenarios trigger a push from the SessionServer: the data published to the DataServer changes, or a new subscriber is added.
In practice, new subscribers are the more common case. Here the SessionServer can push its cached data directly to the subscriber, which greatly reduces the load that fetching from the DataServer would otherwise put on it. This is the main reason for caching data in the SessionServer.
Figure 5: Comparison of data push in the two scenarios
Data comparison mechanism between session server and dataserver
When a service publisher goes online or offline, or is disconnected, the corresponding data is registered to the DataServer through the SessionServer. At that moment the data of the DataServer and the SessionServer is inconsistent for a short time. To restore consistency, the DataServer and the SessionServer synchronize data in two ways: push and pull.
- Data push mode
When service data changes, the DataServer actively notifies the SessionServer. The SessionServer compares the version numbers (version) of the two copies of the data and, if its own copy needs updating, fetches the corresponding data from the DataServer.
- Data pull mode
At a fixed interval (30s by default), the SessionServer queries the DataServer for the version of its data. If the version number has changed, the corresponding synchronization is performed.
- SessionServer synchronizes data from DataServer: under normal circumstances, the data on the DataServer is newer than that on the SessionServer. When the SessionServer finds that the version number has changed, it pulls the data from the DataServer. Note that the cache only covers the services subscribed to by the clients that the current SessionServer manages. The full data set is not cached, because capacity does not allow it.
- DataServer synchronizes data from SessionServer: in exceptional circumstances, the DataServer data is missing and the replica data is also damaged. When the SessionServer compares version numbers with the DataServer and detects this, a data recovery operation is triggered that restores the full data held in SessionServer memory to the DataServer. This provides reverse compensation on top of normal data synchronization.
- How to cache data
SOFARegistry uses LoadingCache<Key, Value> to cache, in the SessionServer, the data synchronized from the DataServer. Each cache entry has an expiration time, which can be set when pulling data (30s by default). On a regular basis, the SessionServer queries the DataServer for the versions of the dataInfoIds of all subscribers in the current session and compares them with the latest pushed version recorded by the session (see com.alipay.sofa.registry.server.session.store.SessionInterests#interestVersions). If the recorded version is lower than the one on the DataServer, a push is needed: the SessionServer fetches the data for that dataInfoId from the DataServer (caching it at the same time) and pushes it to the client.
In addition, when data is updated on the DataServer, it sends a request to the SessionServer to invalidate the corresponding entry, which prompts the SessionServer to reload it.
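The interaction of expiration, push notification and periodic version checks described in this section can be sketched with plain JDK collections. This is an illustration only and does not use Guava's LoadingCache or any SOFARegistry class: SessionCacheSketch, DataSource, onVersionNotified and checkVersion are hypothetical names, and the clock is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

final class SessionCacheSketch {

    /** Hypothetical view of the DataServer: one version and one payload per dataInfoId. */
    interface DataSource {
        long version(String dataInfoId);
        String fetch(String dataInfoId);
    }

    private static final class Entry {
        final String value;
        final long version;
        final long loadedAtMillis;

        Entry(String value, long version, long loadedAtMillis) {
            this.value = value;
            this.version = version;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final DataSource dataServer;
    private final long ttlMillis;
    int loads = 0; // number of fetches from the DataServer, exposed for illustration

    SessionCacheSketch(DataSource dataServer, long ttlMillis) {
        this.dataServer = dataServer;
        this.ttlMillis = ttlMillis;
    }

    /** Returns cached data, reloading when the entry is missing or has expired. */
    String get(String dataInfoId, long nowMillis) {
        Entry e = cache.get(dataInfoId);
        if (e == null || nowMillis - e.loadedAtMillis >= ttlMillis) {
            e = new Entry(dataServer.fetch(dataInfoId), dataServer.version(dataInfoId), nowMillis);
            cache.put(dataInfoId, e);
            loads++;
        }
        return e.value;
    }

    /** Push mode: the DataServer announces a version; drop our entry if it is older. */
    void onVersionNotified(String dataInfoId, long newVersion) {
        Entry e = cache.get(dataInfoId);
        if (e != null && e.version < newVersion) {
            cache.remove(dataInfoId);
        }
    }

    /** Pull mode: ask the DataServer for its version and apply the same comparison. */
    void checkVersion(String dataInfoId) {
        onVersionNotified(dataInfoId, dataServer.version(dataInfoId));
    }
}
```

In both modes only a version number crosses the wire until a difference is found, so the DataServer is asked for the full data only when the cached copy is actually stale.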
Data consistency synchronization between session server and subscriber
When the data in the SessionServer changes, it synchronizes with the subscribers: the data for the changed dataInfoId is pushed to the subscribers, so that the data cached locally by the client stays consistent with that in the SessionServer.
Dataserver data consistency
In SOFARegistry, the DataServer carries the core function of data storage. Data is stored by dataInfoId, with multi-replica backup to keep the data highly available. This layer can be scaled out as the volume of service data grows.
If a DataServer goes down, the MetaServer detects it and notifies all DataServers and SessionServers, so that the data shards can fail over to other replicas. At the same time, the shard data is migrated within the DataServer cluster.
Data server request receiving process
Before discussing consistency, let us look at what the DataServer does for data synchronization after startup. When the DataServer starts, it opens a data synchronization Bolt service, openDataSyncServer, which handles data synchronization between DataServers.
When the DataSyncServer starts, the following handlers are registered to handle Bolt requests:
Figure 5: Handlers registered by DataSyncServer
This handler is mainly used for data retrieval. When a request arrives, it uses the data center and dataInfoId in the request to look up the corresponding data stored on the current DataServer node.
- publishDataProcessor unPublishDataHandler
When a data publisher (Publisher) goes online or offline, publishDataProcessor or unPublishDataHandler is triggered, respectively. The handler adds a data change event to the DataChangeEventCenter, notifying it of the change asynchronously. On receiving the event, the DataChangeEventCenter puts it into a queue and then processes the online or offline data asynchronously according to the event type.
At the same time, the DataChangeHandler publishes the change through ChangeNotifier to tell other nodes to synchronize the data.
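The enqueue-then-process pattern used for publish and unpublish events can be sketched as follows. This is not the actual DataChangeEventCenter: ChangeEventCenterSketch, ChangeEvent, onChange and drain are hypothetical names, and the queue is drained by an explicit call rather than by a worker thread so that the example stays deterministic.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

final class ChangeEventCenterSketch {

    enum Type { PUBLISH, UNPUBLISH }

    static final class ChangeEvent {
        final Type type;
        final String dataInfoId;
        final String publisher;

        ChangeEvent(Type type, String dataInfoId, String publisher) {
            this.type = type;
            this.dataInfoId = dataInfoId;
            this.publisher = publisher;
        }
    }

    private final BlockingQueue<ChangeEvent> queue = new LinkedBlockingQueue<>();

    /** dataInfoId -> publishers currently online. */
    final Map<String, Set<String>> store = new ConcurrentHashMap<>();

    /** Called by the request handlers: enqueue the event and return immediately. */
    void onChange(ChangeEvent event) {
        queue.add(event);
    }

    /**
     * Applies all queued events in arrival order and returns how many were applied.
     * A real system would run this loop on a background worker thread.
     */
    int drain() {
        int applied = 0;
        ChangeEvent e;
        while ((e = queue.poll()) != null) {
            Set<String> publishers =
                    store.computeIfAbsent(e.dataInfoId, k -> ConcurrentHashMap.newKeySet());
            if (e.type == Type.PUBLISH) {
                publishers.add(e.publisher);
            } else {
                publishers.remove(e.publisher);
            }
            applied++;
        }
        return applied;
    }
}
```

Because handlers only enqueue, a burst of online and offline requests does not block the request threads; the stored data catches up when the queue is processed.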
This handler serves data pull requests. When triggered, it has the current DataServer node compare version numbers. If the version number of the data in the request is higher than the version number in the current node's cache, data synchronization is performed so that the node's data is up to date.
This handler serves DataServer online notifications. It is triggered when another node comes online, and the current node stores the new node's information in its cache. It is used to manage node status, that is, whether a node is in the initial or the working state.
When this handler is triggered, it compares version numbers. If the data stored on the current DataServer contains the version number in the request, all data with a version number greater than the requested one is returned, which lets nodes synchronize data with each other.
Connection management handler: when another DataServer node connects to the current DataServer node, the connect method is triggered and the connection information is registered in the local cache. When another DataServer node disconnects from the current node, the disconnect method is triggered and the cached information is deleted. This ensures that the current DataServer node keeps a record of all DataServer nodes connected to it.
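The version-based synchronization rule described above (return everything newer than the requester's version, provided that version is known locally) can be sketched with a sorted map per dataInfoId. VersionedSyncSketch, record and changesSince are hypothetical names. Returning an empty Optional when the requested version is unknown is an assumption of this sketch; the article does not say what the real handler does in that case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

final class VersionedSyncSketch {

    /** dataInfoId -> (version -> operation), kept sorted by version. */
    private final Map<String, TreeMap<Long, String>> logs = new HashMap<>();

    /** Records one operation for a dataInfoId under the given version number. */
    void record(String dataInfoId, long version, String operation) {
        logs.computeIfAbsent(dataInfoId, k -> new TreeMap<>()).put(version, operation);
    }

    /**
     * If the requester's version is present in the local log, returns all
     * operations with a greater version, in version order. Otherwise returns
     * Optional.empty(), which this sketch uses to mean "incremental sync is
     * not possible" (an assumption, see the lead-in).
     */
    Optional<List<String>> changesSince(String dataInfoId, long requesterVersion) {
        TreeMap<Long, String> log = logs.get(dataInfoId);
        if (log == null || !log.containsKey(requesterVersion)) {
            return Optional.empty();
        }
        return Optional.of(new ArrayList<>(log.tailMap(requesterVersion, false).values()));
    }
}
```

A node that is one version behind receives a single operation, while a node that is already up to date receives an empty list.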
At the data storage level, SOFARegistry adopts an eventual consistency approach similar to Eureka's, but what each node stores differs from Eureka. Each node holds only the shards assigned to it by consistent hashing, which lets data capacity scale out without a fixed limit.
SOFARegistry is an AP distributed system: given that partition tolerance (P) is required, availability (A) is chosen over consistency. During data synchronization, the data a client obtains may differ from the actual data. However, because the stored information is the set of registered service nodes, a client will most likely still find available nodes in the data it obtains, so a short-term inconsistency does not cause fatal damage to the business system.
Data migration process in cluster
The DataServer of SOFARegistry stores data using consistent hash sharding. Because with consistent hash sharding the data belonging to a shard is not fixed, SOFARegistry records operation logs in DataServer memory at the granularity of dataInfoId, and data synchronization between DataServers is also done at the granularity of dataInfoId.
Figure 6 asynchronous data synchronization between dataservers
Data and its replicas are distributed across different nodes by consistent hash sharding. After a write to the primary replica, the primary replica asynchronously propagates the update to the other replicas, which is how data is migrated between replicas in the cluster.
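How consistent hashing places a dataInfoId on a primary node and its replicas can be sketched with a TreeMap used as a hash ring. This is a generic illustration, not SOFARegistry's implementation: HashRingSketch, the MD5-based hash function, and the value of 100 virtual nodes per server are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class HashRingSketch {

    private static final int VIRTUAL_NODES = 100; // assumed value, smooths the distribution
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    /**
     * Walks the ring clockwise from the key's position and returns up to
     * `copies` distinct nodes. The first is the primary, the rest are replicas.
     */
    List<String> nodesFor(String dataInfoId, int copies) {
        List<String> result = new ArrayList<>();
        if (ring.isEmpty() || copies <= 0) {
            return result;
        }
        long h = hash(dataInfoId);
        for (String node : ring.tailMap(h, true).values()) {
            if (collect(result, node, copies)) return result;
        }
        for (String node : ring.headMap(h, false).values()) { // wrap around the ring
            if (collect(result, node, copies)) return result;
        }
        return result;
    }

    private static boolean collect(List<String> result, String node, int copies) {
        if (!result.contains(node)) {
            result.add(node);
        }
        return result.size() >= copies;
    }

    /** First 8 bytes of the MD5 digest, interpreted as a long. */
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xff);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

When the primary node of a key is removed, the next node clockwise, which already held a replica, becomes the new primary. Keys whose nodes were not removed keep their placement, so only a fraction of the data has to move when the cluster changes.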
In distributed system design, availability, partition tolerance and consistency are properties that must be weighed against each other. The CAP theorem tells us that at most two of the three can be fully satisfied at the same time. How to make this trade-off is a hard problem for every system designer.
SOFARegistry is divided into three clusters: the metadata cluster (MetaServer), the session cluster (SessionServer) and the data cluster (DataServer). A complex system has many places where consistency must be considered, and SOFARegistry adopts different solutions for the consistency requirements of different modules. For the MetaServer module, the strongly consistent Raft protocol is used to keep cluster information consistent. For the data module, SOFARegistry chooses AP to guarantee availability together with eventual consistency.
The design of SOFARegistry is instructive. When designing a multi-module distributed system, different consistency schemes can be chosen according to the requirements of each module, weighing the CAP trade-off against each module's goals.
SOFARegistry Lab series
- How the service registry delivers service online and offline notifications within seconds | SOFARegistry analysis
- Analysis of the session storage policy in the service registry | SOFARegistry analysis
- Detailed explanation of the data sharding and synchronization scheme in the service registry
- Service registry MetaServer function introduction and implementation analysis | SOFARegistry analysis
- Service discovery optimization in the service registry | SOFARegistry analysis
- Introduction to the architecture of SOFARegistry, a registry for massive data