Transactional replication explains how to skip a transaction

Time:2021-9-16

In transactional replication, data synchronization delay is often encountered. Sometimes these delays are due to an update performed in the publication, such as update TA set col =? Where ?, This update contains a huge amount of data. On the subscription side, this update will be broken down into multiple commands (one command per data line by default) and applied to the subscription. As a last resort, we need to skip this large transaction and let the replication continue to run.

Now let’s introduce some principles and specific methods of transactional replication

When the article of the publication database is updated, the corresponding log will be generated. The log reader will read these log information and write them to the msrepl of the distribution database_ Transactions and msrepl_ Commands.

Msrepl_ Each record in transactions has a unique identifier Xact_ seqno,xact_ Seqno corresponds to the LSN in the log. So you can use Xact_ Seqno infers their generation order in the publication database. The generation time of large numbers is later and that of small numbers is earlier.

The distribution agent consists of two sub processes, reader and writer. The reader is responsible for reading data from the distribution database, and the writer is responsible for writing the data read by the reader to the subscription database

The reader is through sp_ MSget_ repl_ Commands to read the data in the distribution database (read the msrepl_transactions table and msrepl_commands table)

Here is sp_ MSget_ repl_ Parameter definition of commands


CREATE PROCEDURE sys.sp_MSget_repl_commands 

( 

@agent_id int, 

@last_xact_seqno varbinary(16), 

@get_count tinyint = 0, -- 0 = no count, 1 = cmd and tran (legacy), 2 = cmd only 

@compatibility_level int = 7000000, 

@subdb_version int = 0, 

@read_query_size int = -1 

) 

This stored procedure has six parameters. In transactional replication, only the first four parameters will be used (and the values of the third parameter and the fourth parameter are fixed. They are 0 and 10000000 respectively). Here is an example:

execsp_MSget_repl_commands 46,0x0010630F000002A900EA00000000,0,10000000

@agent_ ID means distribution agent. Each subscription will have a separate distribution agent to process data. Bring in @ agent_ ID, you can find the publication corresponding to the subscription and all articles.

@last_ xact_ Seqno represents the LSN last delivered to the subscription.

The general logic is that the reader reads the msreplication of the subscription database_ Transaction of subscriptions table_ Timestamp column, obtain the last updated LSN number, and then read the data with LSN greater than this number in the distribution database. The writer writes the read data to the subscription and updates msreplication_ Transaction of subscriptions table_ Timestamp column. Then the reader will continue to use the new LSN to read the subsequent data, and then pass it to the writer, and so on.

If we update the transaction manually_ Timestamp column, set this value to the LSN of the currently executing large transaction, and the distribution agent will skip the large transaction instead of reading it.

Here’s an example to illustrate

The environment is as follows

Publisher: SQL108W2K8R21

Distributor: SQL108W2K8R22

Subscriber: SQL108W2K8R23

The highlighted publication in the figure contains three Aritcles, Ta, TB and TC

It TA contains 182 million data, and then we did the following operations

An update statement was made at 11:00,

update ta set c=-11

Subsequently, perform some insertion operations on tables TA, TB and TC

insert tb values(0,0)

insert tc values(0,0)

After that, we started the Replication Monitor and found that there was a big delay. The distribution agent was always delivering the data generated by a) operation

Execute the following statement in the subscription database to get the transaction number of the current latest record


declare @publisher sysname 

declare @publicationDB sysname 

declare @publication sysname 

set @publisher='SQL108W2K8R22' 

set @publicationDB='pubdb' 

set @publication='pubdbtest2'

select transaction_timestamp From MSreplication_subscriptions 

where 

[email protected] and 

[email protected] and 

[email protected] 

In my environment, the transaction number is 0x000001490004e9a0004000000000000

Return to the distribution database and execute the following statement to get the transaction number immediately following the large transaction. Please replace the parameter with the data in your actual environment. (note that if you encounter performance problems when executing the following statements, replace the parameters directly with values)


declare @publisher sysname 

declare @publicationDB sysname 

declare @publication sysname 

declare @transaction_timestamp [varbinary](16) 

set @publisher='SQL108W2K8R21' 

set @publicationDB='publicationdb2' 

set @publication='pubtest' 

set @transaction_timestamp= 0x0000014900004E9A0004000000000000

select top 1 xact_seqno from MSrepl_commands with (nolock) where xact_seqno>@transaction_timestamp and 

article_id in ( 

  select article_id From MSarticles a inner join MSpublications p on a.publication_id=p.publication_id and a.publisher_id=p.publisher_id and a.publisher_db=p.publisher_db 

  inner join sys.servers s on s.server_id=p.publisher_id 

  where [email protected] and [email protected] and [email protected] 

) 

and publisher_database_id =( 

    select id From MSpublisher_databases pd inner join MSpublications p on pd.publisher_id=p.publisher_id 

    inner join sys.servers s on pd.publisher_id=s.server_id and pd.publisher_db=p.publisher_db 

    where [email protected] and [email protected] and [email protected] 

) 

Order by xact_seqno

In my environment, the transaction number is 0x0000018c00000100171

Execute the following statement in the subscription database to skip large transactions. Please replace the parameter with the data in your actual environment


declare @publisher sysname

declare @publicationDB sysname 

declare @publication sysname 

declare @transaction_timestamp [varbinary](16) 

set @publisher='SQL108W2K8R22' 

set @publicationDB='pubdb' 

set @publication='pubdbtest2' 

set @transaction_timestamp= 0x0000018C000001000171

update MSreplication_subscriptions set [email protected]_timestamp 

where [email protected] and [email protected] and [email protected] 

After execution, open the distribution agent job.

Next, you will find that the transaction has been successfully skipped, TA will not be updated at the subscriber, subsequent updates will be gradually delivered to the subscription, and the delay disappears.