An open source distributed transaction solution that combines rigid and flexible transactions



Zhang Liang, head of data research and development at Jingdong Digital Technology Co., Ltd., founder and PPMC member of Apache ShardingSphere.

He loves open source and currently leads the open source projects ShardingSphere (formerly Sharding-JDBC) and Elastic-Job. He specializes in Java-based distributed architecture and in cloud platforms based on Kubernetes and Mesos. He advocates elegant code and has studied in depth how to write expressive code.

His current focus is building ShardingSphere into a first-class financial-grade data solution. ShardingSphere has entered the Apache Incubator; it is the first open source project from the Jingdong group to enter the Apache Software Foundation and the foundation's first distributed database middleware.

Jiang Ning, technical expert at the Huawei Open Source Capability Center and project leader of Apache ServiceComb. Formerly a chief software engineer at Red Hat, he has more than 10 years of experience in enterprise open source middleware development and extensive experience in Java development, and he is a fan of functional programming. Since 2006 he has worked on Apache open source middleware projects, contributing successively to Apache CXF, Apache Camel and Apache ServiceMix. He has studied microservice architecture, web services, enterprise integration patterns, SOA and OSGi in depth.

Feng Zheng, software engineer at Red Hat. He joined Red Hat in 2009 and works mainly on the transaction manager. As a core developer he has taken part in the Narayana and BlackTie projects and contributed transaction processing integrations for several application servers (WildFly, Karaf, Tomcat) and frameworks (Commons DBCP, Spring Boot). He has been involved in the Apache ServiceComb project since 2017 and is currently a PMC member. He has studied distributed transaction processing, and transaction processing in microservice environments, in depth.

Reading guide

Data sharding schemes have gradually matured, but distributed transaction solutions that combine performance, transparency, automation and strong consistency, and that fit a wide range of application scenarios, remain rare. The performance bottleneck of distributed transactions based on two-phase (or three-phase) commit, and the business changes that flexible transactions require, mean that distributed transactions are still a headache for architects.

In early 2019, Apache ShardingSphere (incubating) will provide an integrated distributed transaction solution that combines rigid and flexible transactions. If your application suffers from this problem, pour a cup of coffee and spend ten minutes reading this article; you may find something useful.


Database transactions must satisfy the four ACID properties: atomicity, consistency, isolation and durability.

  • Atomicity means that a transaction executes as a whole: either all of it takes effect or none of it does.
  • Consistency means that a transaction moves the data from one consistent state to another.
  • Isolation means that when multiple transactions execute concurrently, the execution of one transaction should not affect the execution of the others.
  • Durability means that the changes made by a committed transaction are persisted.

On a single data node, a transaction is limited to controlling access to a single database resource; this is called a local transaction. Almost all mature relational databases support local transactions natively. In distributed application environments based on microservices, however, more and more scenarios require access to multiple services, and to the multiple database resources behind them, to be included in the same transaction. Distributed transactions emerged to meet this need.

Relational databases provide complete native ACID support for local transactions, but in distributed scenarios that support becomes a constraint on system performance. The central task of distributed transactions is to make the database satisfy ACID in a distributed scenario, or to find a suitable alternative.

Local transactions

When no distributed transaction manager is enabled, each data node manages its own transactions. The nodes cannot coordinate or communicate with each other, and none of them knows whether the transactions on the other nodes succeeded. Local transactions incur no performance loss, but they can guarantee neither strong consistency nor eventual consistency.

Two-phase commit

The earliest distributed transaction model is the X/Open Distributed Transaction Processing (DTP) model proposed by the X/Open consortium, usually referred to simply as the XA protocol.

Distributed transactions based on the XA protocol intrude very little on business code. Their biggest advantage is transparency: users can work with XA-based distributed transactions just as they work with local transactions. The XA protocol strictly guarantees the ACID properties of a transaction.

Strictly guaranteeing ACID is a double-edged sword. All required resources must be locked while the transaction executes, so XA suits short transactions whose execution time is bounded. In a long transaction, holding the data exclusively for the whole duration significantly reduces the concurrency of any business system that depends on hot data. XA-based distributed transactions are therefore not the best choice for scenarios where high concurrency is the priority.
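The commit flow described above can be sketched in a few lines of Java. This is a minimal in-memory illustration, not the XA API: the Participant interface, TwoPhaseCoordinator and InMemoryParticipant are hypothetical names introduced only for this example. It shows why resources stay locked between the two phases: a participant that has voted "yes" must wait for the coordinator's decision.

```java
import java.util.ArrayList;
import java.util.List;

/** A resource that can take part in a two-phase commit (hypothetical interface). */
interface Participant {
    boolean prepare();   // phase 1: lock resources and vote yes/no
    void commit();       // phase 2: make the change durable, release locks
    void rollback();     // phase 2: undo the change, release locks
}

/** Coordinator: commits only if every participant votes yes in phase 1. */
final class TwoPhaseCoordinator {
    static boolean execute(List<? extends Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        for (Participant participant : participants) {
            if (participant.prepare()) {
                prepared.add(participant);
            } else {
                // A single "no" vote aborts the whole transaction.
                for (Participant done : prepared) {
                    done.rollback();
                }
                return false;
            }
        }
        for (Participant participant : prepared) {
            participant.commit();
        }
        return true;
    }
}

/** In-memory participant used only for demonstration. */
final class InMemoryParticipant implements Participant {
    private final boolean vote;
    String state = "INIT";

    InMemoryParticipant(boolean vote) {
        this.vote = vote;
    }

    @Override
    public boolean prepare() {
        state = vote ? "PREPARED" : "ABORTED";
        return vote;
    }

    @Override
    public void commit() {
        state = "COMMITTED";
    }

    @Override
    public void rollback() {
        state = "ROLLED_BACK";
    }
}
```

A real XA transaction manager also has to persist its decision and handle participants that crash between the phases; the sketch leaves that out.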

Flexible transactions

If transactions that implement the ACID properties are called rigid transactions, then transactions built on the BASE properties are called flexible transactions. BASE stands for Basically Available, Soft state and Eventually consistent.

  • Basically available means that the participants in a distributed transaction do not all have to be online at the same time.
  • Soft state allows a certain delay in updating the system state, a delay that customers may not notice.
  • Eventually consistent means that the system reaches a consistent state in the end, usually by means of message passing.

ACID transactions place very high demands on consistency and isolation: all resources must be held for the duration of the transaction. The idea behind flexible transactions is to use business logic to move mutual exclusion from the resource level up to the business level. The requirements for strong consistency and isolation are relaxed; the data only has to be consistent when the whole transaction finishes. While the transaction is executing, data returned by any read may still change. This weak-consistency design trades consistency for higher system throughput.

Saga is a typical flexible transaction manager. The concept of sagas comes from a database paper published more than 30 years ago [… ]. A saga is a long-lived transaction made up of several short transactions. In the distributed transaction scenario, we treat a saga as a transaction composed of multiple local transactions, each of which has a corresponding compensating transaction. If a step fails while the saga is executing, the saga is aborted and the compensating transactions are invoked to perform recovery. This ensures that the local transactions belonging to the saga either all succeed or are restored, through compensation, to the state before the transaction started.
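The saga pattern just described can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the Apache ServiceComb implementation: SagaStep and SagaExecutor are hypothetical names, and each step is assumed to commit locally as soon as its action returns.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** One local transaction plus the compensation that undoes it (hypothetical type). */
final class SagaStep {
    final String name;
    final Runnable action;
    final Runnable compensation;

    SagaStep(String name, Runnable action, Runnable compensation) {
        this.name = name;
        this.action = action;
        this.compensation = compensation;
    }
}

final class SagaExecutor {
    /** Runs the steps in order; on failure, compensates the completed steps in reverse order. */
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action.run();       // assumed to commit locally right away
                completed.push(step);    // remember it so it can be compensated later
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

The failed step itself is not compensated here, on the assumption that its own local transaction rolled back when it threw.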

TCC (Try-Confirm/Cancel) is another way of coordinating flexible transactions. TCC provides a fairly complete recovery mechanism by borrowing the structure of the two-phase commit protocol. In TCC, the cancel compensation is explicitly business logic executed in the second phase to undo the result of the first phase. Try performs the relevant business operations in the first phase and reserves the business resources involved, for example by pre-allocating tickets or by checking and updating the credit limit of a user account. Cancel releases those business resources, for example by releasing the pre-allocated tickets or restoring the credit limit that was reserved. Why is Confirm needed as well? The answer lies in the life cycle of business resources. During Try the business resources are only reserved, and the related operations remain pending. Only after Confirm completes is the use of the business resources final.

Neither ACID-based strong-consistency transactions nor BASE-based eventual-consistency transactions are a silver bullet. Each delivers its greatest advantage only in the scenario that suits it best. The following table compares the differences between them in detail to help developers choose a technology.
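The ticket example above can be written down as a small Java sketch. TicketInventory is a hypothetical in-memory class introduced for illustration only; it is not taken from any TCC framework. Try moves tickets from "available" to "reserved", Confirm turns the reservation into a sale, and Cancel returns the reservation to the pool.

```java
/** In-memory ticket inventory illustrating Try / Confirm / Cancel (hypothetical class). */
final class TicketInventory {
    private int available;
    private int reserved;
    private int sold;

    TicketInventory(int available) {
        this.available = available;
    }

    /** Try: reserve tickets so that no other transaction can take them. */
    synchronized boolean tryReserve(int count) {
        if (count <= 0 || count > available) {
            return false;
        }
        available -= count;
        reserved += count;
        return true;
    }

    /** Confirm: turn a reservation into a final sale. */
    synchronized void confirm(int count) {
        requireReserved(count);
        reserved -= count;
        sold += count;
    }

    /** Cancel: release a reservation back to the pool. */
    synchronized void cancel(int count) {
        requireReserved(count);
        reserved -= count;
        available += count;
    }

    private void requireReserved(int count) {
        if (count <= 0 || count > reserved) {
            throw new IllegalStateException("no reservation covers " + count + " tickets");
        }
    }

    synchronized int available() {
        return available;
    }

    synchronized int reserved() {
        return reserved;
    }

    synchronized int sold() {
        return sold;
    }
}
```

Note that the mutual exclusion lives in business state (the reserved counter) rather than in a database lock held for the whole transaction, which is the trade flexible transactions make.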

[Table: comparison of ACID-based strong-consistency transactions and BASE-based eventual-consistency transactions]


Because application scenarios differ, developers need to be able to balance performance against functionality sensibly when choosing among the kinds of distributed transaction.

The APIs and features of two-phase commit and flexible transactions are not the same, and it is not possible to switch between them freely and transparently. At the design stage a team has to choose between two-phase commit transactions and flexible transactions, which greatly increases design and development costs.

XA-based two-phase commit transactions are relatively simple to use, but they cope poorly with the high concurrency of Internet applications or with the long transactions of complex systems. Flexible transactions require developers to modify the application, so the cost of adoption is very high, and developers must implement resource reservation and reverse compensation themselves.

Distributed transactions in ShardingSphere

The main design goal of the distributed transaction module of Apache ShardingSphere (incubating) is to provide a one-stop distributed transaction solution: it integrates existing mature transaction solutions, offers a unified distributed transaction interface for local transactions, two-phase commit and flexible transactions, and makes up for the shortcomings of current solutions. The module is named sharding-transaction. Its design philosophy and features can be summarized in three phrases: combining rigid and flexible, automation, and transparency.

1. Combining rigid and flexible

It provides both XA-based two-phase commit transactions and Saga-based flexible transactions, and the two can be used together.

2. Automation

Both XA transactions and Saga transactions are carried out automatically, without the user being aware of them. XA transactions do not require the XADataSource interface or a JTA transaction manager; Saga transactions do not require users to implement a compensation interface.

3. Transparency

Both access ends of Apache ShardingSphere, Sharding-JDBC and Sharding-Proxy, wrap the functionality behind the local transaction interface. Users can treat the multiple horizontally sharded data sources managed by ShardingSphere as a single database and get full distributed transaction capability through the local transaction API. Users can also switch transaction types transparently and at will within the application.

The sharding-transaction module consists of sharding-transaction-core, sharding-transaction-2pc and sharding-transaction-base.

  • sharding-transaction-core:

It provides the user-facing API and the developer-facing SPI.

  • sharding-transaction-2pc:

The parent module for two-phase commit transactions. At present it contains only the sharding-transaction-xa module, which supports the XA protocol. More transaction types based on two-phase commit, such as Percolator, will be introduced in the future.


  • sharding-transaction-base:

The parent module for flexible transactions. At present it contains only the sharding-transaction-saga module, which uses the Saga executor provided by Apache ServiceComb Saga Actuator to support flexible transactions and adds reverse SQL and snapshot capabilities on top of it, so that reverse compensation is automatic.

The highlights of ShardingSphere's XA and Saga transaction modules are described below.

XA transactions: backed by three XA transaction managers

Many mature XA transaction managers already exist. Apache ShardingSphere (incubating) chooses not to reinvent the wheel. Instead, it aims to build an ecosystem that brings the appropriate wheels together and provides mature, stable distributed transaction processing. Its main features are as follows:

1. Reuse mature engines and switch the underlying implementation automatically

The sharding-transaction-xa module defines a further SPI for developers of XA transaction managers. A developer only needs to implement the interface defined by the SPI for the implementation to join the Apache ShardingSphere (incubating) ecosystem automatically as its XA transaction manager.

Apache ShardingSphere (incubating) officially provides SPI implementations based on Atomikos and Bitronix, and has invited the development team of Narayana, the XA transaction engine of Red Hat JBoss [… ], to implement the SPI for JBoss. Users can choose their preferred XA transaction manager among Atomikos, Bitronix and Narayana.

Because of the licensing constraints on Apache Software Foundation projects, Apache ShardingSphere (incubating) uses Atomikos, which is released under the Apache license, as its default implementation. To use Bitronix or Narayana, both of which are under the LGPL, users add the corresponding JAR to the project's classpath.

If none of these three XA transaction managers meets a user's needs, developers can implement the SPI to plug in a custom XA transaction manager.
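The general mechanism behind this kind of plug-in switching can be sketched with the JDK's java.util.ServiceLoader. The sketch below only illustrates the pattern: XaManagerProvider, DefaultXaManagerProvider and XaManagerLoader are hypothetical names, and the article does not state which interface or loading logic ShardingSphere actually uses.

```java
import java.util.ServiceLoader;

/** Hypothetical SPI: one implementation per XA transaction manager. */
interface XaManagerProvider {
    String name();
}

/** Fallback used when no provider is registered on the classpath. */
final class DefaultXaManagerProvider implements XaManagerProvider {
    @Override
    public String name() {
        return "default";
    }
}

final class XaManagerLoader {
    /** Returns the first provider that ServiceLoader finds, or the default one. */
    static XaManagerProvider load() {
        for (XaManagerProvider provider : ServiceLoader.load(XaManagerProvider.class)) {
            return provider;
        }
        return new DefaultXaManagerProvider();
    }
}
```

With ServiceLoader, an implementation is discovered when its JAR contains a META-INF/services file named after the interface, so adding a JAR to the classpath is enough to switch implementations without touching application code.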

2. Transparent, automatic data source access

Apache ShardingSphere (incubating) can automatically register a data source whose database driver is an XADataSource with the XA transaction manager. For applications whose database driver is a plain DataSource, users do not need to change code or configuration either: ShardingSphere adapts the data source automatically into an XADataSource and XAConnection that support the XA protocol, and registers it as an XA resource with the underlying XA transaction manager.

The architecture of the XA module is as follows:

[Figure: architecture of the XA module]

Saga transactions: automatic compensation that overcomes the limits of flexible transactions

In a flexible transaction, every update is committed to the database immediately, which releases resources as early as possible in a high-concurrency system. When something goes wrong and the data must be rolled back, the flexible transaction manager maintains the eventual consistency and isolation behavior of the data. Apache ShardingSphere (incubating) adopts Apache ServiceComb Saga Actuator [… ] as its Saga transaction manager. Its main features are as follows:

1. Automatic reverse compensation

Saga specifies that each sub-transaction in a transaction has a corresponding reverse compensation operation. The Saga transaction manager builds a directed acyclic graph from the results of program execution and, when a rollback is needed, invokes the compensation operations in reverse order along that graph. The Saga transaction manager only controls when to retry; it is not responsible for what the compensation does. The compensation itself has to be supplied by the developer.

TCC, the other flexible transaction manager, is conceptually similar to Saga and also requires application developers to supply the compensation. Beyond compensation, TCC also offers resource reservation, but the reservation operation has to be supplied by application developers as well. TCC is functionally richer than Saga, but it also costs more to use.

Because application developers must supply the reservation and compensation operations, flexible transaction solutions are hard to roll out across large business systems. And because the business system has to be involved, flexible transaction frameworks have always been positioned at the service level rather than the database level. Few mature flexible transaction managers can be used directly by a database.

Apache ShardingSphere (incubating) uses reverse SQL: for each SQL statement that updates the database, it automatically generates a data snapshot and the reverse SQL, which Apache ServiceComb Saga Actuator then executes. Users no longer need to care how compensation is implemented, and the scope of the flexible transaction manager returns to where transactions originate: the database level.

The SQL parsing engine of Apache ShardingSphere (incubating) can already handle complex queries, so parsing INSERT, UPDATE and DELETE statements is far less difficult. ShardingSphere shards data by intercepting the SQL that users execute, so every SQL statement is directly under its control. Combining reverse SQL and compensation with Apache ServiceComb Saga Actuator therefore makes flexible transactions automatic, and serves as a model for combining data sharding with flexible transactions.
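To make the idea of reverse SQL concrete, here is a minimal sketch of how a compensating statement could be derived for each kind of update. It is not ShardingSphere's implementation: ReverseSqlSketch and its methods are hypothetical, the row snapshot is assumed to be a column-to-value map taken before the statement ran, and each table is assumed to have a single-column primary key. Values would be bound as statement parameters, so only placeholders appear in the generated text.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that builds compensating SQL from a row snapshot. */
final class ReverseSqlSketch {

    /** Reverse of an INSERT: delete the inserted row by primary key. */
    static String revertInsert(String table, String keyColumn) {
        return "DELETE FROM " + table + " WHERE " + keyColumn + " = ?";
    }

    /** Reverse of an UPDATE: write the snapshot values back. */
    static String revertUpdate(String table, String keyColumn, Map<String, Object> snapshot) {
        StringJoiner assignments = new StringJoiner(", ");
        for (String column : snapshot.keySet()) {
            if (!column.equals(keyColumn)) {
                assignments.add(column + " = ?");
            }
        }
        return "UPDATE " + table + " SET " + assignments + " WHERE " + keyColumn + " = ?";
    }

    /** Reverse of a DELETE: re-insert the row captured in the snapshot. */
    static String revertDelete(String table, Map<String, Object> snapshot) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner placeholders = new StringJoiner(", ");
        for (String column : snapshot.keySet()) {
            columns.add(column);
            placeholders.add("?");
        }
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + placeholders + ")";
    }
}
```

Because the snapshot must be read before the forward statement executes, the middleware has to see every statement first, which is why intercepting all SQL matters for this approach.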

The architecture of the Saga module is as follows:

[Figure: architecture of the Saga module]

Access end: distributed transactions through the native transaction interface

The goal of Apache ShardingSphere (incubating) is to let users work with multiple sharded databases as if they were a single database. The same goal applies to the transaction module. However the databases managed by ShardingSphere are sharded, developers see only one logical database. The transaction interface of ShardingSphere is therefore still the native local transaction interface: the setAutoCommit, commit and rollback methods of JDBC's java.sql.Connection, and the BEGIN, COMMIT and ROLLBACK statements of the database's transaction manager. When the user calls the native local transaction interface, ShardingSphere uses the sharding-transaction module to guarantee the distributed transaction across the sharded back-end databases.

Because the native transaction interface has no notion of transaction type, ShardingSphere provides three ways for users to switch transaction types.
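From the application's point of view, the code is ordinary JDBC. The sketch below uses only the standard java.sql and javax.sql interfaces; the class name, the t_account table and its columns are made up for illustration. If the DataSource passed in were a sharded data source, the same calls would, according to the text above, drive a distributed transaction behind the scenes.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Plain JDBC transaction; nothing here is specific to distributed transactions. */
final class NativeTransactionExample {

    static void transfer(DataSource dataSource, long fromId, long toId, int amount) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            connection.setAutoCommit(false);   // begin the transaction
            try {
                update(connection, "UPDATE t_account SET balance = balance - ? WHERE account_id = ?", amount, fromId);
                update(connection, "UPDATE t_account SET balance = balance + ? WHERE account_id = ?", amount, toId);
                connection.commit();
            } catch (SQLException e) {
                connection.rollback();         // undo both updates on any failure
                throw e;
            }
        }
    }

    private static void update(Connection connection, String sql, int amount, long accountId) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setInt(1, amount);
            statement.setLong(2, accountId);
            statement.executeUpdate();
        }
    }
}
```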

1. Switch the current transaction type through SCTL (Sharding-CTL, the database management command provided by ShardingSphere). It is entered as an SQL statement and works with both Sharding-JDBC and Sharding-Proxy. For example: SCTL:SET TRANSACTION_TYPE=BASE

2. Switch the current transaction type through a ThreadLocal, which works with Sharding-JDBC. For example: TransactionTypeHolder.set(TransactionType.XA)

3. Switch the current transaction type through a meta-annotation used together with Spring, which works with Sharding-JDBC and Sharding-Proxy. For example: @ShardingTransactionType(TransactionType.BASE)
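The ThreadLocal approach in the second option can be illustrated with a simplified stand-in. The two types below reuse the names from the example above so that the call reads the same, but they are written from scratch for this sketch and are not the library's source; in particular, the LOCAL default and the clear method are assumptions.

```java
/** Simplified stand-in for the transaction types named in the text. */
enum TransactionType {
    LOCAL, XA, BASE
}

/** Simplified stand-in: keeps the chosen transaction type per thread. */
final class TransactionTypeHolder {
    // Assumption for this sketch: threads start with LOCAL until told otherwise.
    private static final ThreadLocal<TransactionType> CONTEXT =
            ThreadLocal.withInitial(() -> TransactionType.LOCAL);

    static TransactionType get() {
        return CONTEXT.get();
    }

    static void set(TransactionType type) {
        CONTEXT.set(type);
    }

    /** Resets the current thread to the default; useful with pooled threads. */
    static void clear() {
        CONTEXT.remove();
    }
}
```

Because the value is bound to the calling thread, setting XA on one request thread does not affect transactions running on other threads, which is why this option fits Sharding-JDBC running inside the application process.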

Roadmap

The development branch of the distributed transaction module on GitHub [… ] is basically usable and will ship with 4.0.0.M1, the first release of ShardingSphere since it entered the Apache Incubator. Distributed transactions are an important part of data sharding and microservice architecture, and they are a focus of Apache ShardingSphere (incubating). The module will continue to be improved after its release. The roadmap is as follows.

Transaction isolation engine

Once the reverse SQL engine is stable, work on flexible transactions will focus on transaction isolation. Because isolation is outside the scope of Saga, Apache ShardingSphere (incubating) will build it outside Saga and make it, together with the reverse SQL engine, part of the overall flexible transaction support.

Apache ShardingSphere (incubating) will use strategies such as optimistic locking, pessimistic locking and no isolation to support, one for one, isolation levels such as read committed, read uncommitted, repeatable read and serializable. Multi-version snapshots will further improve the concurrency of the system.

External XA transaction interface

Once their own internal transaction support is in place, Sharding-JDBC and Sharding-Proxy, the two access ends of Apache ShardingSphere (incubating), will be able to join other data sources in a transaction managed by JTA or another distributed transaction manager.

After the external XA transaction interface is implemented, the Sharding-JDBC data source will implement the XADataSource interface, making it possible for it to join an XA transaction together with other data sources. The database protocol of Sharding-Proxy will also implement the XA-based two-phase commit protocol, so that Sharding-Proxy can act as a resource manager loaded by XA.

In addition, ShardingSphere also implements the recovery part of the XA protocol: it can report in-doubt transactions so that transactions can be recovered after the transaction processor crashes.


The distributed transaction capabilities provided by Apache ShardingSphere (incubating) are summarized in the following table. Readers can compare it with the table at the beginning of this article to see what the distributed transaction module of ShardingSphere changes.

[Table: distributed transaction capabilities provided by Apache ShardingSphere (incubating)]

Apache ShardingSphere (incubating) is developing rapidly, and its distributed transaction support has taken shape. We will build it into a usable product as soon as possible and continue to provide high-quality solutions to the community. If this short article has sparked your interest in the field, try it out and see whether it meets your expectations, or simply join our community and help create a better distributed transaction solution.