Seata Client Explained Through Source Code: Principles and Workflow of Distributed Transactions

Time:2019-10-9

Summary: This article builds a demo distributed system on Spring Cloud + Spring JPA + spring-cloud-alibaba-fescar + MySQL + Seata, and uses Seata's debug logs and source code to analyze its working process and principles from the client side (RM, TM).

Preface

Distributed transactions are a problem that every distributed system has to solve, and most current solutions settle for eventual consistency. Since Alibaba open-sourced Fescar at the beginning of this year (it was renamed Seata in early April), the project has attracted great attention and now has close to 8,000 stars. Seata aims to solve distributed transactions for microservices with high performance and zero intrusion into business code. It is iterating rapidly, and its short-term goal is a production-ready version for MySQL.

This article uses a demo system built on Spring Cloud + Spring JPA + spring-cloud-alibaba-fescar + MySQL + Seata. Using Seata's debug logs and source code, it analyzes the working process and principles from the client side (RM, TM). (Example project: https://github.com/fescar-group/fescar-samples/tree/master/springcloud-jpa-seata)

To better understand the full text, let’s familiarize ourselves with the relevant concepts:

  • XID: the unique identifier of a global transaction, in the form ip:port:sequence.
  • Transaction Coordinator (TC): maintains the running state of global transactions, and coordinates and drives their commit or rollback.
  • Transaction Manager (TM): controls the boundaries of a global transaction; it begins the global transaction and ultimately issues the global commit or rollback decision.
  • Resource Manager (RM): controls branch transactions; it handles branch registration and status reporting, and receives instructions from the TC to drive the commit or rollback of branch (local) transactions.
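
As a small illustration of the XID format, the sketch below splits an XID of the form ip:port:sequence into its parts. The class XidParts is hypothetical and not part of fescar; it assumes an IPv4 address, and the sample value is taken from the logs later in this article.

```java
// Hypothetical helper, not part of fescar: splits an XID "ip:port:sequence".
final class XidParts {
    final String ip;
    final int port;
    final long sequence;

    private XidParts(String ip, int port, long sequence) {
        this.ip = ip;
        this.port = port;
        this.sequence = sequence;
    }

    // Assumes an IPv4 address, so a well-formed XID contains exactly two colons.
    static XidParts parse(String xid) {
        String[] parts = xid.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("unexpected XID format: " + xid);
        }
        return new XidParts(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    public static void main(String[] args) {
        XidParts p = parse("192.168.224.93:8091:2008502699");
        System.out.println(p.ip + " / " + p.port + " / " + p.sequence);
    }
}
```

The ip and port identify the TC instance that created the transaction, which is why the same prefix appears in every XID in the logs of one run.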

Tip: The code in this article is based on fescar-0.4.1. Because the project has only just been renamed Seata, some package names, class names and jar names have not yet been changed consistently, so the name fescar is still used below.

Distributed Framework Support

Fescar uses the XID to represent a distributed transaction. The XID must be passed to every system involved in a distributed transaction request, so that each system can report its branch transaction to fescar-server and receive fescar-server's commit and rollback instructions. Fescar officially supports all versions of the Dubbo protocol, and the community has provided an implementation for Spring Cloud (Spring Boot) distributed projects.

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-alibaba-fescar</artifactId>
    <version>2.1.0.BUILD-SNAPSHOT</version>
</dependency>

This component propagates the XID across RestTemplate and Feign calls.
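
A minimal sketch of the idea behind XID propagation, assuming the XID travels in an HTTP header and is held in a thread-local on each side. The class XidPropagationSketch, its methods and the header name TX_XID are assumptions for illustration, not the actual spring-cloud-alibaba-fescar code; plain maps stand in for HTTP headers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch, not the spring-cloud-alibaba-fescar implementation.
final class XidPropagationSketch {
    // Header name assumed for illustration.
    static final String XID_HEADER = "TX_XID";

    // Per-thread holder for the XID of the current global transaction.
    private static final ThreadLocal<String> CURRENT_XID = new ThreadLocal<>();

    static void bind(String xid) { CURRENT_XID.set(xid); }
    static String currentXid() { return CURRENT_XID.get(); }
    static void unbind() { CURRENT_XID.remove(); }

    // Caller side: copy the XID from the context into the outgoing headers.
    static Map<String, String> outgoingHeaders() {
        Map<String, String> headers = new HashMap<>();
        String xid = currentXid();
        if (xid != null) {
            headers.put(XID_HEADER, xid);
        }
        return headers;
    }

    // Callee side: bind the XID before the handler runs, unbind it afterwards.
    static <T> T handleRequest(Map<String, String> headers, Supplier<T> handler) {
        String xid = headers.get(XID_HEADER);
        if (xid != null) {
            bind(xid);
        }
        try {
            return handler.get();
        } finally {
            if (xid != null) {
                unbind();
            }
        }
    }

    public static void main(String[] args) {
        bind("192.168.0.2:8091:2008546211");   // the caller is inside a global transaction
        Map<String, String> headers = outgoingHeaders();
        unbind();                               // pretend the callee runs in another process
        String seen = handleRequest(headers, XidPropagationSketch::currentXid);
        System.out.println("callee saw XID " + seen);
    }
}
```

Unbinding in a finally block matters because request threads are pooled: a leftover XID would make an unrelated later request look like part of the transaction.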

Business logic

The business logic is the classic flow of placing an order, deducting the balance and reducing inventory. It is split by module into three independent services, each connected to its own database:

  • Order: order-server
  • Account: account-server
  • Inventory: storage-server

There is also a business system that initiates the distributed transaction:

  • Business: business-server

The project structure is as follows

Normal flow:

  1. Business initiates a purchase request
  2. Storage deducts the inventory
  3. Order creates the order
  4. Account deducts the balance

Abnormal flow:

  1. Business initiates a purchase request
  2. Storage deducts the inventory
  3. Order creates the order
  4. Account throws an exception while deducting the balance

In the normal flow, the data updates in steps 2, 3 and 4 are committed globally. In the abnormal flow, the exception in step 4 causes all of the data to be rolled back globally.

Configuration File

Fescar’s configuration entry file is registry.conf. The ConfigurationFactory code shows that the configuration file cannot yet be specified, so its name must be registry.conf.

private static final String REGISTRY_CONF = "registry.conf";
public static final Configuration FILE_INSTANCE = new FileConfiguration(REGISTRY_CONF);

registry.conf specifies which form of configuration to use; the default is the file type. The file.conf file contains three sections:

  1. transport: this section corresponds to the NettyServerConfig class and defines the Netty-related parameters. TM and RM communicate with fescar-server over Netty.
  2. service

    service {
     #vgroup->rgroup
     vgroup_mapping.my_test_tx_group = "default"
     # Configure the address of Client to connect TC
     default.grouplist = "127.0.0.1:8091"
     #degrade current not support
     enableDegrade = false
     # disable: set to true to turn off Seata's distributed transactions
     disableGlobalTransaction = false
    }
  3. client

    client {
      # upper limit of the RM's buffer for commit notifications received from TC
      async.commit.buffer.limit = 10000
      lock {
        retry.internal = 10
        retry.times = 30
      }
    }
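
One way to read the service section: the transaction service group configured on the client is first mapped to a cluster name through vgroup_mapping, and the cluster name is then used to look up the TC address list. The sketch below mimics that two-step lookup with a plain map. The class and method names are hypothetical, and the flattened key names are an assumption of this sketch rather than fescar's actual configuration API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the two-step lookup; not fescar's configuration code.
final class GroupLookupSketch {

    // Step 1: service group -> cluster name. Step 2: cluster name -> TC addresses.
    static String resolveTcAddresses(Map<String, String> conf, String txServiceGroup) {
        String cluster = conf.get("service.vgroup_mapping." + txServiceGroup);
        if (cluster == null) {
            throw new IllegalStateException("no vgroup_mapping for " + txServiceGroup);
        }
        String addresses = conf.get("service." + cluster + ".grouplist");
        if (addresses == null) {
            throw new IllegalStateException("no grouplist for cluster " + cluster);
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Flattened form of the service section shown above.
        Map<String, String> conf = new HashMap<>();
        conf.put("service.vgroup_mapping.my_test_tx_group", "default");
        conf.put("service.default.grouplist", "127.0.0.1:8091");
        System.out.println(resolveTcAddresses(conf, "my_test_tx_group"));
    }
}
```

Under this reading, a client whose service group has no vgroup_mapping entry cannot find a TC address, so the group name on the client and the key in file.conf have to match.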

Data Source Proxy

Besides the configuration files above, AT mode requires only a small amount of code: a proxy for the data source. At present the proxy only supports DruidDataSource. (Note: the latest release, 0.4.2, supports arbitrary data source types.)

@Bean
@ConfigurationProperties(prefix = "spring.datasource")
public DruidDataSource druidDataSource() {
    DruidDataSource druidDataSource = new DruidDataSource();
    return druidDataSource;
}

@Primary
@Bean("dataSource")
public DataSourceProxy dataSource(DruidDataSource druidDataSource) {
    return new DataSourceProxy(druidDataSource);
}

The purpose of using DataSourceProxy is to introduce ConnectionProxy. Fescar's non-intrusiveness is realized in ConnectionProxy: the point at which a branch transaction joins the global transaction is the commit stage of the local transaction. This design ensures that the business data and the undo_log are written in the same local transaction.

undo_log is a table that must be created in each business database. Fescar relies on it to record the status of each branch transaction and the replay data needed for a phase-two rollback. There is no need to worry that the table will grow large enough to become a single-point problem: when a global transaction commits, the corresponding undo_log records are deleted asynchronously.

CREATE TABLE `undo_log` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `branch_id` bigint(20) NOT NULL,
  `xid` varchar(100) NOT NULL,
  `rollback_info` longblob NOT NULL,
  `log_status` int(11) NOT NULL,
  `log_created` datetime NOT NULL,
  `log_modified` datetime NOT NULL,
  `ext` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `ux_undo_log` (`xid`,`branch_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
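
To make the role of the replay data concrete, here is a minimal in-memory sketch of a rollback driven by before and after images (the Flushing UNDO LOG entry in the RM log later in this article contains both). All names are hypothetical and a map stands in for a table row. Checking the row against the after image before restoring it is an assumption of this sketch, not a description of fescar's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of undo-based rollback; not fescar's UndoLogManager.
final class UndoLogSketch {

    // One undo record: the row state before and after the business update.
    static final class UndoRecord {
        final Map<String, Object> beforeImage;
        final Map<String, Object> afterImage;

        UndoRecord(Map<String, Object> before, Map<String, Object> after) {
            this.beforeImage = new HashMap<>(before);
            this.afterImage = new HashMap<>(after);
        }
    }

    // Applies the business update and returns the undo record describing it.
    static UndoRecord updateCount(Map<String, Object> row, int newCount) {
        Map<String, Object> before = new HashMap<>(row);
        row.put("count", newCount);
        return new UndoRecord(before, row);
    }

    // Restores the before image; refuses if the row changed after the update.
    static boolean rollback(Map<String, Object> row, UndoRecord undo) {
        if (!row.equals(undo.afterImage)) {
            return false;
        }
        row.clear();
        row.putAll(undo.beforeImage);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", 1);
        row.put("count", 994);
        UndoRecord undo = updateCount(row, 993);
        System.out.println("after update:   " + row);
        rollback(row, undo);
        System.out.println("after rollback: " + row);
    }
}
```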

Start Server

Download the fescar-server release that matches the client version from https://github.com/seata/seata/releases, so that a version difference does not cause protocol mismatches. After unpacking, enter the bin directory and run:

./fescar-server.sh 8091 ../data

On a successful start, the output is:

2019-04-09 20:27:24.637 INFO [main]c.a.fescar.core.rpc.netty.AbstractRpcRemotingServer.start:152 -Server started ... 

Start Client

Fescar’s loading entry point is GlobalTransactionAutoConfiguration, which is loaded automatically in Spring Boot projects. Alternatively, you can instantiate GlobalTransactionScanner yourself by other means.

@Configuration
@EnableConfigurationProperties({FescarProperties.class})
public class GlobalTransactionAutoConfiguration {
    private final ApplicationContext applicationContext;
    private final FescarProperties fescarProperties;

    public GlobalTransactionAutoConfiguration(ApplicationContext applicationContext, FescarProperties fescarProperties) {
        this.applicationContext = applicationContext;
        this.fescarProperties = fescarProperties;
    }

    /**
     * Instantiates GlobalTransactionScanner.
     * The scanner is the class that kicks off client initialization.
     */
    @Bean
    public GlobalTransactionScanner globalTransactionScanner() {
        String applicationName = this.applicationContext.getEnvironment().getProperty("spring.application.name");
        String txServiceGroup = this.fescarProperties.getTxServiceGroup();
        if (StringUtils.isEmpty(txServiceGroup)) {
            txServiceGroup = applicationName + "-fescar-service-group";
            this.fescarProperties.setTxServiceGroup(txServiceGroup);
        }

        return new GlobalTransactionScanner(applicationName, txServiceGroup);
    }
}

As shown, a configuration class, FescarProperties, is supported for setting the transaction service group name:

spring.cloud.alibaba.fescar.tx-service-group=my_test_tx_group

If no service group is specified, the name defaults to spring.application.name + "-fescar-service-group", so startup fails with an error if spring.application.name is not set.

@ConfigurationProperties("spring.cloud.alibaba.fescar")
public class FescarProperties {
    private String txServiceGroup;

    public FescarProperties() {
    }

    public String getTxServiceGroup() {
        return this.txServiceGroup;
    }

    public void setTxServiceGroup(String txServiceGroup) {
        this.txServiceGroup = txServiceGroup;
    }
}

After applicationId and txServiceGroup are obtained, a GlobalTransactionScanner object is created. The method to focus on in this class is initClient.

private void initClient() {
    if (StringUtils.isNullOrEmpty(applicationId) || StringUtils.isNullOrEmpty(txServiceGroup)) {
        throw new IllegalArgumentException(
            "applicationId: " + applicationId + ", txServiceGroup: " + txServiceGroup);
    }
    //init TM
    TMClient.init(applicationId, txServiceGroup);

    //init RM
    RMClient.init(applicationId, txServiceGroup);

}

The method initializes both TMClient and RMClient. A service can act as either TM or RM; which role it plays in a given global transaction depends on where the @GlobalTransactional annotation is placed. Creating the clients results in Netty connections to TC, so the startup log shows two Netty channels, whose transactionRole values are TMROLE and RMROLE respectively.

2019-04-09 13:42:57.417  INFO 93715 --- [imeoutChecker_1] c.a.f.c.rpc.netty.NettyPoolableFactory   : NettyPool create channel to {"address":"127.0.0.1:8091","message":{"applicationId":"business-service","byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"transactionServiceGroup":"my_test_tx_group","typeCode":101,"version":"0.4.1"},"transactionRole":"TMROLE"}
2019-04-09 13:42:57.505  INFO 93715 --- [imeoutChecker_1] c.a.f.c.rpc.netty.NettyPoolableFactory   : NettyPool create channel to {"address":"127.0.0.1:8091","message":{"applicationId":"business-service","byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"transactionServiceGroup":"my_test_tx_group","typeCode":103,"version":"0.4.1"},"transactionRole":"RMROLE"}
2019-04-09 13:42:57.629 DEBUG 93715 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Send:RegisterTMRequest{applicationId='business-service', transactionServiceGroup='my_test_tx_group'}
2019-04-09 13:42:57.629 DEBUG 93715 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Send:RegisterRMRequest{resourceIds='null', applicationId='business-service', transactionServiceGroup='my_test_tx_group'}
2019-04-09 13:42:57.699 DEBUG 93715 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:1
2019-04-09 13:42:57.699 DEBUG 93715 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:2
2019-04-09 13:42:57.701 DEBUG 93715 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : [email protected] msgId:1, future :[email protected], body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
2019-04-09 13:42:57.701 DEBUG 93715 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : [email protected] msgId:2, future :[email protected], body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
2019-04-09 13:42:57.710  INFO 93715 --- [imeoutChecker_1] c.a.fescar.core.rpc.netty.RmRpcClient    : register RM success. server version:0.4.1,channel:[id: 0xe6468995, L:/127.0.0.1:57397 - R:/127.0.0.1:8091]
2019-04-09 13:42:57.710  INFO 93715 --- [imeoutChecker_1] c.a.f.c.rpc.netty.NettyPoolableFactory   : register success, cost 114 ms, version:0.4.1,role:TMROLE,channel:[id: 0xd22fe0c5, L:/127.0.0.1:57398 - R:/127.0.0.1:8091]
2019-04-09 13:42:57.711  INFO 93715 --- [imeoutChecker_1] c.a.f.c.rpc.netty.NettyPoolableFactory   : register success, cost 125 ms, version:0.4.1,role:RMROLE,channel:[id: 0xe6468995, L:/127.0.0.1:57397 - R:/127.0.0.1:8091]

The log shows the following steps:

  1. Create the Netty connections
  2. Send the registration requests
  3. Receive the responses
  4. RmRpcClient and TmRpcClient are instantiated successfully

TM Processing Flow

In this example the TM role is played by business-service: the purchase method of BusinessService is annotated with @GlobalTransactional:

@Service
public class BusinessService {

    @Autowired
    private StorageFeignClient storageFeignClient;
    @Autowired
    private OrderFeignClient orderFeignClient;

    @GlobalTransactional
    public void purchase(String userId, String commodityCode, int orderCount){
        storageFeignClient.deduct(commodityCode, orderCount);

        orderFeignClient.create(userId, commodityCode, orderCount);
    }
}

Calling this method creates a global transaction. First, look at what the @GlobalTransactional annotation does: annotated calls are intercepted and handled in GlobalTransactionalInterceptor.

/**
 * AOP interception method call
 */
@Override
public Object invoke(final MethodInvocation methodInvocation) throws Throwable {
    Class<?> targetClass = (methodInvocation.getThis() != null ? AopUtils.getTargetClass(methodInvocation.getThis()) : null);
    Method specificMethod = ClassUtils.getMostSpecificMethod(methodInvocation.getMethod(), targetClass);
    final Method method = BridgeMethodResolver.findBridgedMethod(specificMethod);

    // Get the GlobalTransactional annotation for the method
    final GlobalTransactional globalTransactionalAnnotation = getAnnotation(method, GlobalTransactional.class);
    final GlobalLock globalLockAnnotation = getAnnotation(method, GlobalLock.class);

    // If the method has a GlobalTransactional annotation, the corresponding method processing is intercepted
    if (globalTransactionalAnnotation != null) {
        return handleGlobalTransaction(methodInvocation, globalTransactionalAnnotation);
    } else if (globalLockAnnotation != null) {
        return handleGlobalLock(methodInvocation);
    } else {
        return methodInvocation.proceed();
    }
}

The handleGlobalTransaction method calls execute on TransactionalTemplate. As the class name suggests, this is a standard template method: it defines the standard steps the TM follows to process a global transaction, and the comments make them clear.

public Object execute(TransactionalExecutor business) throws TransactionalExecutor.ExecutionException {
    // 1. get or create a transaction
    GlobalTransaction tx = GlobalTransactionContext.getCurrentOrCreate();

    try {
        // 2. begin transaction
        try {
            triggerBeforeBegin();
            tx.begin(business.timeout(), business.name());
            triggerAfterBegin();
        } catch (TransactionException txe) {
            throw new TransactionalExecutor.ExecutionException(tx, txe,
                TransactionalExecutor.Code.BeginFailure);
        }
        Object rs = null;
        try {
            // Do Your Business
            rs = business.execute();
        } catch (Throwable ex) {
            // 3. any business exception, rollback.
            try {
                triggerBeforeRollback();
                tx.rollback();
                triggerAfterRollback();
                // 3.1 Successfully rolled back
                throw new TransactionalExecutor.ExecutionException(tx, TransactionalExecutor.Code.RollbackDone, ex);
            } catch (TransactionException txe) {
                // 3.2 Failed to rollback
                throw new TransactionalExecutor.ExecutionException(tx, txe,
                    TransactionalExecutor.Code.RollbackFailure, ex);
            }
        }
        // 4. everything is fine, commit.
        try {
            triggerBeforeCommit();
            tx.commit();
            triggerAfterCommit();
        } catch (TransactionException txe) {
            // 4.1 Failed to commit
            throw new TransactionalExecutor.ExecutionException(tx, txe,
                TransactionalExecutor.Code.CommitFailure);
        }
        return rs;
    } finally {
        // 5. clear
        triggerAfterCompletion();
        cleanUp();
    }
}

A global transaction is started through the begin method of DefaultGlobalTransaction.

public void begin(int timeout, String name) throws TransactionException {
    if (role != GlobalTransactionRole.Launcher) {
        check();
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("Ignore Begin(): just involved in global transaction [" + xid + "]");
        }
        return;
    }
    if (xid != null) {
        throw new IllegalStateException();
    }
    if (RootContext.getXID() != null) {
        throw new IllegalStateException();
    }
    // Actually begin the transaction: ask TC and take the XID it returns
    xid = transactionManager.begin(null, null, name, timeout);
    status = GlobalStatus.Begin;
    RootContext.bind(xid);
    if (LOGGER.isDebugEnabled()) {
        LOGGER.debug("Begin a NEW global transaction [" + xid + "]");
    }
}

The check at the start of the method, if (role != GlobalTransactionRole.Launcher), is the key: it decides whether the current role is the Launcher or a Participant of the global transaction. If a downstream method in the distributed transaction is also annotated with @GlobalTransactional, its role is Participant, so begin is ignored and returns immediately. Whether the role is Launcher or Participant is decided by whether an XID exists in the current context: no XID means Launcher, an existing XID means Participant. Consequently, only the Launcher can create a global transaction, and a distributed transaction has exactly one Launcher.
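
The role decision can be sketched as follows, assuming the current XID lives in a thread-local context. The names RoleSketch, Role and Tx are hypothetical, and a local counter stands in for the XID that TC would return.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the Launcher/Participant decision; not fescar's classes.
final class RoleSketch {
    enum Role { LAUNCHER, PARTICIPANT }

    private static final ThreadLocal<String> CONTEXT_XID = new ThreadLocal<>();
    private static final AtomicLong SEQUENCE = new AtomicLong(1);

    static final class Tx {
        final Role role;
        String xid;

        Tx(Role role, String xid) {
            this.role = role;
            this.xid = xid;
        }

        // Only the Launcher creates a global transaction; a Participant ignores begin.
        void begin() {
            if (role != Role.LAUNCHER) {
                return;
            }
            xid = "127.0.0.1:8091:" + SEQUENCE.getAndIncrement(); // stand-in for TC's answer
            CONTEXT_XID.set(xid);
        }
    }

    // The role is fixed when the transaction object is obtained:
    // no XID in the context means Launcher, an existing XID means Participant.
    static Tx getCurrentOrCreate() {
        String xid = CONTEXT_XID.get();
        return xid == null ? new Tx(Role.LAUNCHER, null) : new Tx(Role.PARTICIPANT, xid);
    }

    static void clear() {
        CONTEXT_XID.remove();
    }

    public static void main(String[] args) {
        Tx outer = getCurrentOrCreate();
        outer.begin();
        Tx inner = getCurrentOrCreate(); // e.g. a nested annotated method
        inner.begin();                   // ignored, the outer XID is reused
        System.out.println(outer.role + " " + inner.role + " " + inner.xid.equals(outer.xid));
        clear();
    }
}
```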

DefaultTransactionManager handles the communication between TM and TC, sending the begin, commit and rollback requests.

@Override
public String begin(String applicationId, String transactionServiceGroup, String name, int timeout)
    throws TransactionException {
    GlobalBeginRequest request = new GlobalBeginRequest();
    request.setTransactionName(name);
    request.setTimeout(timeout);
    GlobalBeginResponse response = (GlobalBeginResponse)syncCall(request);
    return response.getXid();
}

At this point fescar-server has returned an XID, which means the global transaction was created successfully. The log reflects the process described above.

2019-04-09 13:46:57.417 DEBUG 31326 --- [nio-8084-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : offer message: timeout=60000,transactionName=purchase(java.lang.String,java.lang.String,int)
2019-04-09 13:46:57.417 DEBUG 31326 --- [geSend_TMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : write message:FescarMergeMessage timeout=60000,transactionName=purchase(java.lang.String,java.lang.String,int), channel:[id: 0xa148545e, L:/127.0.0.1:56120 - R:/127.0.0.1:8091],active?true,writable?true,isopen?true
2019-04-09 13:46:57.418 DEBUG 31326 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Send:FescarMergeMessage timeout=60000,transactionName=purchase(java.lang.String,java.lang.String,int)
2019-04-09 13:46:57.421 DEBUG 31326 --- [lector_TMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Receive:MergeResultMessage c[email protected]2dc480dc,messageId:1196
2019-04-09 13:46:57.421 DEBUG 31326 --- [nio-8084-exec-1] c.a.fescar.core.context.RootContext      : bind 192.168.224.93:8091:2008502699
2019-04-09 13:46:57.421 DEBUG 31326 --- [nio-8084-exec-1] c.a.f.tm.api.DefaultGlobalTransaction    : Begin a NEW global transaction [192.168.224.93:8091:2008502699]

After the global transaction is created, business.execute() runs the business code. The call storageFeignClient.deduct(commodityCode, orderCount) enters the RM flow; here the business logic calls the inventory-deduction interface of storage-service.

RM Processing Flow

@GetMapping(path = "/deduct")
public Boolean deduct(String commodityCode, Integer count){
    storageService.deduct(commodityCode,count);
    return true;
}

@Transactional
public void deduct(String commodityCode, int count){
    Storage storage = storageDAO.findByCommodityCode(commodityCode);
    storage.setCount(storage.getCount()-count);

    storageDAO.save(storage);
}

Neither the storage interface nor the service method contains any fescar-related code or annotations, which demonstrates fescar's non-intrusiveness. So how does it join the global transaction? The answer lies in ConnectionProxy, which is why DataSourceProxy must be used: only through this entry point can the branch transaction be registered with TC, and the RM's processing result be reported, when the local transaction of the business code commits.

Because the business code's own transaction commit goes through the ConnectionProxy proxy, committing the local transaction actually executes the commit method of ConnectionProxy.

public void commit() throws SQLException {
    // If we are in a global transaction, take the global-transaction commit path.
    // Whether we are in a global transaction is decided by whether an XID exists in the current context.
    if (context.inGlobalTransaction()) {
        processGlobalTransactionCommit();
    } else if (context.isGlobalLockRequire()) {
        processLocalCommitWithGlobalLocks();
    } else {
        targetConnection.commit();
    }
}

private void processGlobalTransactionCommit() throws SQLException {
    try {
        // First register the branch with TC and obtain the branchId that TC assigns.
        register();
    } catch (TransactionException e) {
        recognizeLockKeyConflictException(e);
    }

    try {
        if (context.hasUndoLog()) {
            // Write undolog
            UndoLogManager.flushUndoLogs(this);
        }

        // Submit local transaction, write undo_log and business data in the same local transaction
        targetConnection.commit();
    } catch (Throwable ex) {
        // Send notification of RM transaction failure to TC
        report(false);
        if (ex instanceof SQLException) {
            throw new SQLException(ex);
        }
    }
    // Send notification of RM transaction success to TC
    report(true);
    context.reset();
}

private void register() throws TransactionException {
    // Register the branch: build the request and send it to TC via Netty
    Long branchId = DefaultResourceManager.get().branchRegister(BranchType.AT, getDataSourceProxy().getResourceId(),
            null, context.getXid(), null, context.buildLockKeys());
    // Store the returned branchId in the context
    context.setBranchId(branchId);
}
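
The same pattern can be reduced to a small decorator: the business code sees only an ordinary commit, while the wrapper adds the extra steps around it. This is a sketch with hypothetical names (LocalTx, ProxyTx), not fescar's classes; it records the order of the steps in a list instead of talking to TC or a database.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decorator sketch of the commit interception; not fescar's ConnectionProxy.
final class CommitProxySketch {

    // Stand-in for the commit part of a JDBC connection.
    interface LocalTx {
        void commit();
    }

    static final class ProxyTx implements LocalTx {
        private final LocalTx target;
        private final boolean inGlobalTransaction;
        private final List<String> steps;

        ProxyTx(LocalTx target, boolean inGlobalTransaction, List<String> steps) {
            this.target = target;
            this.inGlobalTransaction = inGlobalTransaction;
            this.steps = steps;
        }

        @Override
        public void commit() {
            if (!inGlobalTransaction) {
                target.commit();              // plain local commit, nothing added
                return;
            }
            steps.add("register branch");     // would ask TC for a branchId
            steps.add("flush undo log");      // written inside the same local transaction
            try {
                target.commit();
            } catch (RuntimeException ex) {
                steps.add("report failure");
                throw ex;
            }
            steps.add("report success");
        }
    }

    public static void main(String[] args) {
        List<String> steps = new ArrayList<>();
        LocalTx real = () -> steps.add("local commit");
        new ProxyTx(real, true, steps).commit();
        System.out.println(steps);
    }
}
```

Because the caller only depends on the commit interface, swapping the real object for the wrapper requires no change in business code, which is the sense in which the approach is non-intrusive.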

Verify the above process through the log.

2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor   : xid in RootContext null xid in RpcContext 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] c.a.fescar.core.context.RootContext      : bind 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.341 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor   : bind 192.168.0.2:8091:2008546211 to RootContext
2019-04-09 21:57:48.386  INFO 38933 --- [nio-8081-exec-1] o.h.h.i.QueryTranslatorFactoryInitiator  : HHH000397: Using ASTQueryTranslatorFactory
Hibernate: select storage0_.id as id1_0_, storage0_.commodity_code as commodit2_0_, storage0_.count as count3_0_ from storage_tbl storage0_ where storage0_.commodity_code=?
Hibernate: update storage_tbl set count=? where id=?
2019-04-09 21:57:48.673  INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient    : will connect to 192.168.0.2:8091
2019-04-09 21:57:48.673  INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient    : RM will register :jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false
2019-04-09 21:57:48.673  INFO 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.NettyPoolableFactory   : NettyPool create channel to {"address":"192.168.0.2:8091","message":{"applicationId":"storage-service","byteBuffer":{"char":"\u0000","direct":false,"double":0.0,"float":0.0,"int":0,"long":0,"readOnly":false,"short":0},"resourceIds":"jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false","transactionServiceGroup":"hello-service-fescar-service-group","typeCode":103,"version":"0.4.0"},"transactionRole":"RMROLE"}
2019-04-09 21:57:48.677 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Send:RegisterRMRequest{resourceIds='jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false', applicationId='storage-service', transactionServiceGroup='hello-service-fescar-service-group'}
2019-04-09 21:57:48.680 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Receive:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null,messageId:9
2019-04-09 21:57:48.680 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : [email protected] msgId:9, future :[email protected], body:version=0.4.1,extraData=null,identified=true,resultCode=null,msg=null
2019-04-09 21:57:48.680  INFO 38933 --- [nio-8081-exec-1] c.a.fescar.core.rpc.netty.RmRpcClient    : register RM success. server version:0.4.1,channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:48.680  INFO 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.NettyPoolableFactory   : register success, cost 3 ms, version:0.4.1,role:RMROLE,channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:48.680 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : offer message: transactionId=2008546211,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,lockKey=storage_tbl:1
2019-04-09 21:57:48.681 DEBUG 38933 --- [geSend_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : write message:FescarMergeMessage transactionId=2008546211,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,lockKey=storage_tbl:1, channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091],active?true,writable?true,isopen?true
2019-04-09 21:57:48.681 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Send:FescarMergeMessage transactionId=2008546211,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,lockKey=storage_tbl:1
2019-04-09 21:57:48.687 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Receive:MergeResultMessage BranchRegisterResponse: transactionId=2008546211,branchId=2008546212,result code =Success,getMsg =null,messageId:11
2019-04-09 21:57:48.702 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.rm.datasource.undo.UndoLogManager  : Flushing UNDO LOG: {"branchId":2008546212,"sqlUndoLogs":[{"afterImage":{"rows":[{"fields":[{"keyType":"PrimaryKey","name":"id","type":4,"value":1},{"keyType":"NULL","name":"count","type":4,"value":993}]}],"tableName":"storage_tbl"},"beforeImage":{"rows":[{"fields":[{"keyType":"PrimaryKey","name":"id","type":4,"value":1},{"keyType":"NULL","name":"count","type":4,"value":994}]}],"tableName":"storage_tbl"},"sqlType":"UPDATE","tableName":"storage_tbl"}],"xid":"192.168.0.2:8091:2008546211"}
2019-04-09 21:57:48.755 DEBUG 38933 --- [nio-8081-exec-1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : offer message: transactionId=2008546211,branchId=2008546212,resourceId=null,status=PhaseOne_Done,applicationData=null
2019-04-09 21:57:48.755 DEBUG 38933 --- [geSend_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : write message:FescarMergeMessage transactionId=2008546211,branchId=2008546212,resourceId=null,status=PhaseOne_Done,applicationData=null, channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091],active?true,writable?true,isopen?true
2019-04-09 21:57:48.756 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Send:FescarMergeMessage transactionId=2008546211,branchId=2008546212,resourceId=null,status=PhaseOne_Done,applicationData=null
2019-04-09 21:57:48.758 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Receive:MergeResultMessage co[email protected]582a08cf,messageId:13
2019-04-09 21:57:48.799 DEBUG 38933 --- [nio-8081-exec-1] c.a.fescar.core.context.RootContext      : unbind 192.168.0.2:8091:2008546211
2019-04-09 21:57:48.799 DEBUG 38933 --- [nio-8081-exec-1] o.s.c.a.f.web.FescarHandlerInterceptor   : unbind 192.168.0.2:8091:2008546211 from RootContext

The log shows the following steps:

  1. Get the XID from business-service
  2. Bind the XID to the current context
  3. Execute the business SQL
  4. Create this RM's Netty connection to TC
  5. Send the branch transaction information to TC
  6. Get the branchId returned by TC
  7. Record the undo log data
  8. Send the result of this transaction's PhaseOne to TC
  9. Unbind the XID from the current context

Steps 1 and 9 are performed in FescarHandlerInterceptor. This class does not belong to fescar itself but to spring-cloud-alibaba-fescar, mentioned earlier, which binds and unbinds the XID in the current request context for Feign- and REST-based communication. At this point the RM has completed PhaseOne; next, look at the processing logic of PhaseTwo.

Transaction Commit

After every branch transaction has finished executing, TC aggregates the results reported by each RM and sends each RM a commit or rollback command.

2019-04-09 21:57:49.813 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Receive:xid=192.168.0.2:8091:2008546211,branchId=2008546212,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,applicationData=null,messageId:1
2019-04-09 21:57:49.813 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.AbstractRpcRemoting    : [email protected] msgId:1, body:xid=192.168.0.2:8091:2008546211,branchId=2008546212,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,applicationData=null
2019-04-09 21:57:49.814  INFO 38933 --- [atch_RMROLE_1_8] c.a.f.core.rpc.netty.RmMessageListener   : onMessage:xid=192.168.0.2:8091:2008546211,branchId=2008546212,branchType=AT,resourceId=jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false,applicationData=null
2019-04-09 21:57:49.816  INFO 38933 --- [atch_RMROLE_1_8] com.alibaba.fescar.rm.AbstractRMHandler  : Branch committing: 192.168.0.2:8091:2008546211 2008546212 jdbc:mysql://127.0.0.1:3306/db_storage?useSSL=false null
2019-04-09 21:57:49.816  INFO 38933 --- [atch_RMROLE_1_8] com.alibaba.fescar.rm.AbstractRMHandler  : Branch commit result: PhaseTwo_Committed
2019-04-09 21:57:49.817  INFO 38933 --- [atch_RMROLE_1_8] c.a.fescar.core.rpc.netty.RmRpcClient    : RmRpcClient sendResponse branchStatus=PhaseTwo_Committed,result code =Success,getMsg =null
2019-04-09 21:57:49.817 DEBUG 38933 --- [atch_RMROLE_1_8] c.a.f.c.rpc.netty.AbstractRpcRemoting    : send response:branchStatus=PhaseTwo_Committed,result code =Success,getMsg =null,channel:[id: 0xd40718e3, L:/192.168.0.2:62607 - R:/192.168.0.2:8091]
2019-04-09 21:57:49.817 DEBUG 38933 --- [lector_RMROLE_1] c.a.f.c.rpc.netty.MessageCodecHandler    : Send:branchStatus=PhaseTwo_Committed,result code =Success,getMsg =null

As you can see from the log:

  1. RM receives the commit notification for XID = 192.168.0.2:8091:2008546211 and branchId = 2008546212;
  2. RM performs the commit action;
  3. RM sends the commit result to TC, with branchStatus set to PhaseTwo_Committed.

For the implementation of the PhaseTwo commit, look at the doBranchCommit method of the AbstractRMHandler class:

/**
 * Extract the key parameters carried by the notification (xid, branchId, etc.),
 * then call the RM's branchCommit.
 */
protected void doBranchCommit(BranchCommitRequest request, BranchCommitResponse response) throws TransactionException {
    String xid = request.getXid();
    long branchId = request.getBranchId();
    String resourceId = request.getResourceId();
    String applicationData = request.getApplicationData();
    LOGGER.info("Branch committing: " + xid + " " + branchId + " " + resourceId + " " + applicationData);
    BranchStatus status = getResourceManager().branchCommit(request.getBranchType(), xid, branchId, resourceId, applicationData);
    response.setBranchStatus(status);
    LOGGER.info("Branch commit result: " + status);
}

The branchCommit request eventually reaches the branchCommit method of AsyncWorker. The way AsyncWorker handles it is a key part of the fescar architecture: because the vast majority of transactions commit normally, the real work is finished at the end of PhaseOne, and locks can be released as early as possible. Once the commit command arrives in PhaseTwo, the remaining work can be done asynchronously, which removes PhaseTwo's time cost from the distributed transaction.

private static final List<Phase2Context> ASYNC_COMMIT_BUFFER = Collections.synchronizedList( new ArrayList<Phase2Context>());

/**
 * Add the branch context (XID, branchId, etc.) to be committed to the buffer list
 */
@Override
public BranchStatus branchCommit(BranchType branchType, String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
    if (ASYNC_COMMIT_BUFFER.size() < ASYNC_COMMIT_BUFFER_LIMIT) {
        ASYNC_COMMIT_BUFFER.add(new Phase2Context(branchType, xid, branchId, resourceId, applicationData));
    } else {
        LOGGER.warn("Async commit buffer is FULL. Rejected branch [" + branchId + "/" + xid + "] will be handled by housekeeping later.");
    }
    return BranchStatus.PhaseTwo_Committed;
}

/**
 * A scheduled task consumes the contexts buffered in the list
 */
public synchronized void init() {
    LOGGER.info("Async Commit Buffer Limit: " + ASYNC_COMMIT_BUFFER_LIMIT);
    timerExecutor = new ScheduledThreadPoolExecutor(1,
        new NamedThreadFactory("AsyncWorker", 1, true));
    timerExecutor.scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
            try {
                doBranchCommits();
            } catch (Throwable e) {
                LOGGER.info("Failed at async committing ... " + e.getMessage());
            }
        }
    }, 10, 1000 * 1, TimeUnit.MILLISECONDS);
}

private void doBranchCommits() {
    if (ASYNC_COMMIT_BUFFER.size() == 0) {
        return;
    }
    Map<String, List<Phase2Context>> mappedContexts = new HashMap<>();
    Iterator<Phase2Context> iterator = ASYNC_COMMIT_BUFFER.iterator();

    // On each scheduled run, drain all pending entries from ASYNC_COMMIT_BUFFER.
    // Group the commit contexts by resourceId; a resourceId is the connection URL of a database,
    // as seen in the log above. Grouping supports applications that configure multiple data sources.
    while (iterator.hasNext()) {
        Phase2Context commitContext = iterator.next();
        List<Phase2Context> contextsGroupedByResourceId = mappedContexts.get(commitContext.resourceId);
        if (contextsGroupedByResourceId == null) {
            contextsGroupedByResourceId = new ArrayList<>();
            mappedContexts.put(commitContext.resourceId, contextsGroupedByResourceId);
        }
        contextsGroupedByResourceId.add(commitContext);

        iterator.remove();

    }

    for (Map.Entry<String, List<Phase2Context>> entry : mappedContexts.entrySet()) {
        Connection conn = null;
        try {
            try {
                // Obtain data sources and connections based on resourceId
                DataSourceProxy dataSourceProxy = DataSourceManager.get().get(entry.getKey());
                conn = dataSourceProxy.getPlainConnection();
            } catch (SQLException sqle) {
                LOGGER.warn("Failed to get connection for async committing on " + entry.getKey(), sqle);
                continue;
            }
            List<Phase2Context> contextsGroupedByResourceId = entry.getValue();
            for (Phase2Context commitContext : contextsGroupedByResourceId) {
                try {
                    // Perform undolog processing by deleting records corresponding to XID and branchId
                    UndoLogManager.deleteUndoLog(commitContext.xid, commitContext.branchId, conn);
                } catch (Exception ex) {
                    LOGGER.warn(
                        "Failed to delete undo log [" + commitContext.branchId + "/" + commitContext.xid + "]", ex);
                }
            }

        } finally {
            if (conn != null) {
                try {
                    conn.close();
                } catch (SQLException closeEx) {
                    LOGGER.warn("Failed to close JDBC resource while deleting undo_log ", closeEx);
                }
            }
        }
    }
}

So to process a commit, RM only needs to delete the undo_log record that matches the XID and branchId.
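The buffer-then-drain pattern above can be condensed into a small, self-contained sketch. It is a simplification under stated assumptions, not the fescar implementation: `AsyncCommitSketch` and `Ctx` are hypothetical names, the database work is left out, and plain `synchronized` methods replace `Collections.synchronizedList` (whose Javadoc requires the caller to synchronize manually while iterating).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of an async PhaseTwo commit buffer.
final class AsyncCommitSketch {
    // Trimmed-down stand-in for Phase2Context.
    static final class Ctx {
        final String xid;
        final long branchId;
        final String resourceId;

        Ctx(String xid, long branchId, String resourceId) {
            this.xid = xid;
            this.branchId = branchId;
            this.resourceId = resourceId;
        }
    }

    private final List<Ctx> buffer = new ArrayList<>();
    private final int limit;

    AsyncCommitSketch(int limit) {
        this.limit = limit;
    }

    // Called on the commit path: it only records the work and returns at once.
    // Returns false when the buffer is full and the context was not recorded.
    synchronized boolean enqueue(Ctx ctx) {
        if (buffer.size() >= limit) {
            return false;
        }
        buffer.add(ctx);
        return true;
    }

    // Called by the periodic task: empties the buffer and groups the contexts
    // by resourceId so that one connection per database can serve a whole group.
    synchronized Map<String, List<Ctx>> drainGroupedByResource() {
        Map<String, List<Ctx>> grouped = new HashMap<>();
        for (Ctx ctx : buffer) {
            grouped.computeIfAbsent(ctx.resourceId, k -> new ArrayList<>()).add(ctx);
        }
        buffer.clear();
        return grouped;
    }

    public static void main(String[] args) {
        AsyncCommitSketch worker = new AsyncCommitSketch(10);
        worker.enqueue(new Ctx("xid-1", 1L, "jdbc:mysql://127.0.0.1:3306/db_storage"));
        worker.enqueue(new Ctx("xid-1", 2L, "jdbc:mysql://127.0.0.1:3306/db_order"));
        worker.enqueue(new Ctx("xid-2", 3L, "jdbc:mysql://127.0.0.1:3306/db_storage"));

        Map<String, List<Ctx>> grouped = worker.drainGroupedByResource();
        for (Map.Entry<String, List<Ctx>> entry : grouped.entrySet()) {
            // In the real flow this is where the undo_log rows would be deleted.
            System.out.println(entry.getKey() + " -> " + entry.getValue().size() + " branch(es)");
        }
    }
}
```

Because the commit path only appends to a list, its latency does not depend on the database; the cost of deleting undo_log rows is paid later by the scheduled task.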

Transaction rollback

Rollback is triggered in two scenarios:

  1. A branch transaction fails, which is the case where ConnectionProxy calls report(false).
  2. TM catches an exception thrown by a downstream system; that is, the method that initiates the global transaction and is annotated with @GlobalTransactional catches the exception. In the execute template method of the TransactionalTemplate class, the call to business.execute() is wrapped in a try/catch, and rollback is called in the catch block. TM then notifies TC that the transaction identified by the XID needs to be rolled back.

public void rollback() throws TransactionException {
   // Only Launcher can launch the rollback
   if (role == GlobalTransactionRole.Participant) {
       // Participant has no responsibility of rolling back
       if (LOGGER.isDebugEnabled()) {
           LOGGER.debug("Ignore Rollback(): just involved in global transaction [" + xid + "]");
       }
       return;
   }
   if (xid == null) {
       throw new IllegalStateException();
   }

   status = transactionManager.rollback(xid);
   if (RootContext.getXID() != null) {
       if (xid.equals(RootContext.getXID())) {
           RootContext.unbind();
       }
   }
}
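The TM-side control flow described in scenario 2 (run the business logic, roll back on an exception, otherwise commit) can be sketched as a small template method. This is an illustration under assumptions: the `Tx` interface, `RecordingTx`, and `TxTemplateSketch` are hypothetical names and do not reproduce the TransactionalTemplate API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical template that wraps business logic in begin/commit/rollback.
final class TxTemplateSketch {
    // Minimal transaction handle invented for this sketch; NOT the fescar API.
    interface Tx {
        void begin();
        void commit();
        void rollback();
    }

    static <T> T execute(Tx tx, Supplier<T> business) {
        tx.begin();
        T result;
        try {
            result = business.get();
        } catch (RuntimeException ex) {
            // The initiator sees the failure and asks for a global rollback,
            // then rethrows so the caller still learns about the error.
            tx.rollback();
            throw ex;
        }
        tx.commit();
        return result;
    }

    // Test double that records which calls were made, in order.
    static final class RecordingTx implements Tx {
        final List<String> calls = new ArrayList<>();

        @Override public void begin() { calls.add("begin"); }
        @Override public void commit() { calls.add("commit"); }
        @Override public void rollback() { calls.add("rollback"); }
    }

    public static void main(String[] args) {
        RecordingTx ok = new RecordingTx();
        execute(ok, () -> "order created");
        System.out.println("success path: " + ok.calls);

        RecordingTx failed = new RecordingTx();
        Supplier<String> failing = () -> {
            throw new IllegalStateException("downstream service failed");
        };
        try {
            execute(failed, failing);
        } catch (IllegalStateException expected) {
            System.out.println("failure path: " + failed.calls);
        }
    }
}
```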

TC aggregates and sends rollback instructions to participants. RM receives the rollback notification in the doBranchRollback method of the AbstractRMHandler class.

protected void doBranchRollback(BranchRollbackRequest request, BranchRollbackResponse response) throws TransactionException {
    String xid = request.getXid();
    long branchId = request.getBranchId();
    String resourceId = request.getResourceId();
    String applicationData = request.getApplicationData();
    LOGGER.info("Branch rolling back: " + xid + " " + branchId + " " + resourceId);
    BranchStatus status = getResourceManager().branchRollback(request.getBranchType(), xid, branchId, resourceId, applicationData);
    response.setBranchStatus(status);
    LOGGER.info("Branch rollback result: " + status);
}

The rollback request is then passed to the branchRollback method of the DataSourceManager class.

public BranchStatus branchRollback(BranchType branchType, String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
    // Get the corresponding data source according to resourceId
    DataSourceProxy dataSourceProxy = get(resourceId);
    if (dataSourceProxy == null) {
        throw new ShouldNeverHappenException();
    }
    try {
        UndoLogManager.undo(dataSourceProxy, xid, branchId);
    } catch (TransactionException te) {
        if (te.getCode() == TransactionExceptionCode.BranchRollbackFailed_Unretriable) {
            return BranchStatus.PhaseTwo_RollbackFailed_Unretryable;
        } else {
            return BranchStatus.PhaseTwo_RollbackFailed_Retryable;
        }
    }
    return BranchStatus.PhaseTwo_Rollbacked;
}

Ultimately, the undo method of the UndoLogManager class is executed. Because it is fairly long, plain JDBC code, it is not pasted here; you can view the source on GitHub. The undo process is as follows:

  1. Look up the undo_log record written in PhaseOne by XID and branchId;
  2. If it is found, generate and execute rollback SQL from the data recorded in undo_log, restoring the data modified in PhaseOne;
  3. After step 2 completes, delete the undo_log record;
  4. If no matching undo_log record is found in step 1, insert an undo_log record whose state is GlobalFinished. The record may be missing because the local transaction in PhaseOne hit an exception and did not write it normally. Because XID and branchId form a unique index, the record inserted in step 4 prevents a late PhaseOne write from succeeding: that write fails on the unique index, so the business data is not committed and the data ends up rolled back.
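The four steps, and in particular the guard record of step 4, can be modelled with an in-memory sketch. Everything here is an assumption made for illustration: two maps stand in for the business table and the undo_log table, the map key plays the role of the unique index on XID and branchId, and the names `UndoFlowSketch`, `UndoRecord`, and `State` are hypothetical. How a repeated undo request treats an existing guard record is also a choice of this sketch (it leaves the record in place).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the undo flow; NOT the UndoLogManager code.
final class UndoFlowSketch {
    enum State { NORMAL, GLOBAL_FINISHED }

    static final class UndoRecord {
        final State state;
        final String row;      // business row touched in PhaseOne
        final Integer before;  // value before PhaseOne, null if the row did not exist

        UndoRecord(State state, String row, Integer before) {
            this.state = state;
            this.row = row;
            this.before = before;
        }
    }

    final Map<String, Integer> table = new HashMap<>();       // business data
    final Map<String, UndoRecord> undoLog = new HashMap<>();  // keyed like the unique index

    private static String key(String xid, long branchId) {
        return xid + "/" + branchId;
    }

    // PhaseOne: business write and undo record succeed or fail together.
    // Returns false when the "unique index" already holds this key.
    boolean phaseOne(String xid, long branchId, String row, int newValue) {
        String k = key(xid, branchId);
        if (undoLog.containsKey(k)) {
            return false; // duplicate key: the local transaction is rolled back
        }
        undoLog.put(k, new UndoRecord(State.NORMAL, row, table.get(row)));
        table.put(row, newValue);
        return true;
    }

    // PhaseTwo rollback.
    void undo(String xid, long branchId) {
        String k = key(xid, branchId);
        UndoRecord rec = undoLog.get(k);                       // step 1
        if (rec == null) {
            // Step 4: leave a guard record so a late PhaseOne write cannot succeed.
            undoLog.put(k, new UndoRecord(State.GLOBAL_FINISHED, null, null));
            return;
        }
        if (rec.state == State.NORMAL) {
            if (rec.before == null) {                          // step 2: restore the data
                table.remove(rec.row);
            } else {
                table.put(rec.row, rec.before);
            }
            undoLog.remove(k);                                 // step 3: delete the record
        }
        // An existing GLOBAL_FINISHED guard is left untouched (sketch assumption).
    }

    public static void main(String[] args) {
        UndoFlowSketch normal = new UndoFlowSketch();
        normal.table.put("stock", 100);
        normal.phaseOne("xid-1", 1L, "stock", 98);
        normal.undo("xid-1", 1L);
        System.out.println("stock after rollback: " + normal.table.get("stock"));

        UndoFlowSketch late = new UndoFlowSketch();
        late.table.put("stock", 100);
        late.undo("xid-2", 2L); // rollback arrives before PhaseOne wrote anything
        boolean written = late.phaseOne("xid-2", 2L, "stock", 98);
        System.out.println("late PhaseOne write accepted: " + written);
        System.out.println("stock stays at: " + late.table.get("stock"));
    }
}
```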

Summary

Using a local distributed business scenario, this article walked through the main processing flow of the fescar client and analyzed the main source code of the TM and RM roles, in the hope of helping you understand how fescar works.

With fescar iterating rapidly and its roadmap continuing to mature, we believe fescar can become a benchmark open source solution for distributed transactions over time.



Author: Brother Middleware

This article is the original content of Yunqi Community, which can not be reproduced without permission.