Deep analysis of distributed transaction performance


With the large-scale adoption of microservices, distributed transactions that span microservices are becoming more and more common. What is the performance of a distributed transaction? How much throughput is lost? Can it meet business needs? These questions matter greatly to whether distributed transactions can be successfully introduced into production applications.

This article attempts to analyze in depth the extra overhead introduced by distributed transactions: which factors in an application affect the final performance, where the bottleneck lies, and how performance can be improved. It uses DTM, a distributed transaction manager that supports multiple languages, as the sample for a performance test, and analyzes the test results in depth.

Testing environment

| Model | CPU / memory | Storage | System | MySQL |
|-------|--------------|---------|--------|-------|
| Alibaba Cloud ecs.c7.xlarge | 4-core 8G | 500G ESSD, IOPS 26800 | Ubuntu 20.04 | Docker mysql:5.7 |

Test process

# under the DTM directory
docker-compose -f helper/compose.mysql.yml up -d  # start MySQL

# run sysbench to benchmark MySQL
sysbench oltp_write_only.lua --time=60 --mysql-host= --mysql-port=3306 --mysql-user=root --mysql-password= --mysql-db=sbtest --table-size=1000000 --tables=10 --threads=10 --events=999999999 --report-interval=10 prepare
sysbench oltp_write_only.lua --time=60 --mysql-host= --mysql-port=3306 --mysql-user=root --mysql-password= --mysql-db=sbtest --table-size=1000000 --tables=10 --threads=10 --events=999999999 --report-interval=10 run

go run app/main.go bench > /dev/null  # start DTM's bench service; it logs heavily, so redirect output to the null device
# run the command-line scripts under bench/ to run the DTM-related tests

PS: if you want to do hands-on testing, it is recommended to buy a host in Hong Kong or overseas, so that GitHub and Docker are much faster to access and the environment can be set up quickly. The host I bought in mainland China accessed GitHub and Docker very slowly, sometimes failing to connect at all, which prevented the tests from running smoothly.

Test metrics

We will compare the following metrics:

  • Global-TPS: the number of global transactions completed per second, from the user's perspective.
  • DB-TPS: the number of transactions completed per second at the database level in each test.
  • OPS: the number of SQL statements executed per second in each test.

Comparison of results

|            | MySQL | No DTM-2SQL | DTM-2SQL | DTM-2SQL-Barrier | No DTM-10SQL | DTM-10SQL | DTM-10SQL-Barrier |
|------------|-------|-------------|----------|------------------|--------------|-----------|-------------------|
| Global-TPS | –     | 1232        | 575      | 531              | 551          | 357       | 341               |
| DB-TPS     | 2006  | 2464        | 2300     | 2124             | 1102         | 1428      | 1364              |
| OPS        | 12039 | 4928        | 5750     | 6372             | 10620        | 9282      | 9548              |

MySQL performance

We first tested the performance of MySQL on its own. Since DTM's performance test involves many write operations, we mainly tested MySQL's write performance.

We use sysbench's oltp_write_only benchmark, in which each transaction contains 6 write SQL statements (including insert/update/delete).

Under this benchmark, about 2006 transactions and about 12039 SQL statements are completed per second. These two results will serve as the baseline for the subsequent DTM tests.

DTM test

Distributed transactions involve many transaction modes. We select the simple and representative saga mode to analyze the performance of DTM.

The saga transaction we selected contains two sub-transactions: TransOut, which transfers the balance out, and TransIn, which transfers it in. Each of them contains two SQL statements: update the balance and record the journal entry.
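As an illustration only (not DTM's actual implementation — the table and column names here are made up), a sub-transaction like TransOut boils down to two SQL statements executed in one local database transaction, sketched with sqlite:

```python
import sqlite3

# Illustrative schema; the table and column names are assumptions, not DTM's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_account (uid INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE journal (uid INTEGER, amount INTEGER)")
conn.execute("INSERT INTO user_account VALUES (1, 10000)")

def trans_out(uid, amount):
    """Transfer-out sub-transaction: update the balance and record the journal
    entry — the two SQL statements per branch — in one local transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("UPDATE user_account SET balance = balance - ? WHERE uid = ?",
                     (amount, uid))
        conn.execute("INSERT INTO journal (uid, amount) VALUES (?, ?)",
                     (uid, -amount))

trans_out(1, 30)
print(conn.execute("SELECT balance FROM user_account WHERE uid = 1").fetchone()[0])  # 9970
```

TransIn would be symmetric, adding to the balance instead of subtracting.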

No DTM-2SQL

We first test the case without DTM, i.e., we call TransOut and TransIn directly. The result is 1232 global transactions completed per second. Each global transaction contains two sub-transactions (transfer out and transfer in), so the DB-TPS is 2464; each sub-transaction contains two SQL statements, so the total OPS is 4928.
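The relationship between the three metrics in this scenario can be checked with a quick calculation:

```python
global_tps = 1232      # global transactions per second (no DTM, 2-SQL case)
sub_txns = 2           # TransOut + TransIn
sql_per_sub_txn = 2    # update balance + record journal

db_tps = global_tps * sub_txns    # each sub-transaction is one DB transaction
ops = db_tps * sql_per_sub_txn    # each sub-transaction runs 2 SQL statements
print(db_tps, ops)                # 2464 4928
```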

Compared with the MySQL baseline, the DB-TPS here is higher, while the OPS is less than half. The main reason is that each transaction needs to flush data to disk, which costs extra performance. At this point the bottleneck is mainly the transaction capacity of the database.


DTM-2SQL

We then test the case with DTM. After adopting DTM, the sequence diagram of a saga transaction is as follows:

[Sequence diagram of a saga transaction under DTM]

A global transaction here involves four database transactions: TransIn, TransOut, saving the global transaction together with its branches, and marking the global transaction as completed. Marking each sub-transaction branch as completed would also require one transaction each, but DTM batches these writes asynchronously, reducing the number of transactions.

The SQL statements in each global transaction are: 1 to save the global transaction, 1 to save the branches, 1 to read all branches, 2 to mark the branches as completed, and 1 to mark the global transaction as completed — 6 extra SQL statements in total, plus the 4 SQL statements of the original sub-transactions, for a total of 10.
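The per-global-transaction SQL accounting above can be written as a small formula (the `barrier` option reflects the extra insert per branch discussed in the barrier test):

```python
def dtm_sql_per_global(branches, sql_per_branch, barrier=False):
    """SQL statements per global transaction, per the accounting above:
    1 save global + 1 save branches + 1 read branches
    + one update per branch to mark it complete + 1 mark global complete,
    plus the business SQL of the branches themselves."""
    extra = 1 + 1 + 1 + branches + 1       # = 4 + branches
    if barrier:
        extra += branches                  # one barrier insert per branch
    return extra + branches * sql_per_branch

print(dtm_sql_per_global(2, 2))                # 10
print(dtm_sql_per_global(2, 2, barrier=True))  # 12
```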

In the test results, 575 global transactions are completed per second, so the DB-TPS is 2300 and the OPS is 5750. Compared with the scheme without DTM, the DB-TPS drops slightly and the OPS rises to some extent. The bottleneck is still the database.


DTM-2SQL-Barrier

After the sub-transaction barrier is added, each sub-transaction branch executes one more insert statement, so each global transaction corresponds to 12 SQL statements.
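That extra insert is how a sub-transaction barrier achieves idempotency: each branch first inserts a uniquely-keyed row, and if the row already exists the branch has run before and is skipped. A minimal sketch with sqlite (the table layout is an assumption, not DTM's actual barrier schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A unique key on (gid, branch_id, op) turns the insert into a duplicate-call detector.
conn.execute("""CREATE TABLE barrier (
    gid TEXT, branch_id TEXT, op TEXT,
    UNIQUE (gid, branch_id, op))""")

def call_once(gid, branch_id, op, action):
    """Run `action` only if this (gid, branch, op) has not executed before."""
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO barrier (gid, branch_id, op) VALUES (?, ?, ?)",
            (gid, branch_id, op))
        if cur.rowcount == 0:   # row already present: duplicate call, skip
            return False
        action()
        return True

runs = []
print(call_once("gid1", "01", "action", lambda: runs.append(1)))  # True: first call runs
print(call_once("gid1", "01", "action", lambda: runs.append(1)))  # False: duplicate skipped
```

In DTM the barrier insert and the branch's business SQL share one local transaction, so the duplicate check and the business write commit or roll back together.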

In the test results, 531 global transactions are completed per second, so the DB-TPS is 2124 and the OPS is 6372. Compared with the DTM scheme without the barrier, the DB-TPS drops slightly and the OPS rises slightly, which matches expectations.

No DTM-10SQL

We then adjust the benchmark, increasing the number of SQL statements in each sub-transaction from 2 to 10 by executing the sub-transaction's SQL in a loop 5 times.

In the benchmark results without DTM, 551 global transactions are completed per second, the DB-TPS is 1102, and the OPS is 10620. Here the OPS is close to the MySQL baseline, and the bottleneck is mainly the database's OPS capacity.


DTM-10SQL

In this benchmark, 357 global transactions are completed per second, the DB-TPS is 1428, and the OPS is 9282. The OPS is more than 10% lower than without DTM, mainly because DTM's tables have more fields and indexes, so each SQL statement costs more to execute and the total OPS is lower.


DTM-10SQL-Barrier

In the test results, 341 global transactions are completed per second, so the DB-TPS is 1364 and the OPS is 9548. Compared with the DTM scheme without the barrier, the DB-TPS drops slightly and the OPS rises slightly, which matches expectations.


Conclusion

Because a distributed transaction needs to save the state of the global transaction and its branch transactions, it generates extra writes: roughly 4 + n SQL operations (where n is the number of sub-transactions) and 2 extra database transactions per global transaction. When the business is simple and contains few SQL statements, using distributed transactions cuts transaction throughput by about 50%; when the business is complex and contains many SQL statements, the drop is about 35%. The loss comes mainly from the extra SQL needed to save the global/branch transaction state.
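The quoted reductions follow directly from the Global-TPS figures in the result table:

```python
# Global-TPS values from the result comparison table
no_dtm_2sql, dtm_2sql = 1232, 575
no_dtm_10sql, dtm_10sql = 551, 357

drop_2sql = 1 - dtm_2sql / no_dtm_2sql      # ~0.53, i.e. roughly 50%
drop_10sql = 1 - dtm_10sql / no_dtm_10sql   # ~0.35
print(round(drop_2sql, 2), round(drop_10sql, 2))  # 0.53 0.35
```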

Comparing DTM's benchmark results with MySQL's baseline data, DTM adds little extra overhead and makes close to full use of the database's capacity.

An ecs.c7.xlarge + 500G Alibaba Cloud server with MySQL installed can deliver 300–600 global TPS, at a monthly cost of about 900 yuan (October 2021 pricing). This cost is very low relative to the business capacity it provides.

If you need stronger performance, you can buy a higher configuration or deploy multiple groups of DTM at the application layer. Neither option is costly, and together they can meet the needs of most companies.

You are welcome to visit DTM and give it a star to support our work!