Distributed | Global sequences based on distributed timestamps in dble

Time:2021-2-23

Author: Wu Jinling
Member of the Aikesheng (ActionTech) dble project team, mainly responsible for the day-to-day testing of dble and experienced in troubleshooting dble problems. I love testing work and intend to stick with it for the long run.
Source: original contribution
*Original content may not be used without authorization. To reprint, please contact the editor and credit the source.


There are four kinds of global sequence in dble: MySQL offset-step, timestamp, distributed timestamp, and distributed offset-step. From a tester's perspective, this article briefly describes how to set up and use the distributed timestamp global sequence.

1、 A brief introduction to the global sequence based on distributed timestamps

This method provides a distributed ID generator based on zookeeper (hereinafter ZK) that generates a globally unique 63-bit binary ID (the leading bit of the 64-bit value is always 0, which keeps the global sequence positive).

The 63 bits are laid out as follows:

[Figure: layout of the 63-bit ID, divided into fields A to E]

Among them:

  • A to E run from the high bits to the low bits;
  • A is the low 9 bits of the thread ID;
  • B is the 5-bit instance ID;

This value is the INSTANCEID value in the configuration file sequence_distributed_conf.properties, or the value obtained from the zookeeper server;

  • C is the 4-bit data center ID;

This value is the CLUSTERID value in the configuration file sequence_distributed_conf.properties;

  • D is a 6-bit auto-increment value;
  • E is the low 39 bits of the current system timestamp in milliseconds (enough for about 17 years).
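The layout above can be sketched in a few lines of Python. This is only an illustration under the field widths listed here (9, 5, 4, 6 and 39 bits); the function name `decode_sequence_id` and the field names are hypothetical and are not part of dble.

```python
# Field widths from high to low, as listed above: A=9, B=5, C=4, D=6, E=39.
# The names are made up for this sketch; dble does not expose them.
FIELD_WIDTHS = [
    ("thread_id", 9),      # A: low 9 bits of the thread ID
    ("instance_id", 5),    # B: instance ID
    ("cluster_id", 4),     # C: data center ID
    ("increment", 6),      # D: auto-increment value
    ("timestamp_ms", 39),  # E: low 39 bits of the timestamp
]


def decode_sequence_id(seq_id: int) -> dict:
    """Split a 63-bit ID into the five fields A to E."""
    if not 0 <= seq_id < (1 << 63):
        raise ValueError("expected a non-negative 63-bit integer")
    fields = {}
    shift = 63
    for name, width in FIELD_WIDTHS:
        shift -= width  # move down to the lowest bit of this field
        fields[name] = (seq_id >> shift) & ((1 << width) - 1)
    return fields


# How long a 39-bit millisecond counter lasts, in 365-day years.
YEARS_COVERED = (1 << 39) / (1000 * 60 * 60 * 24 * 365)

if __name__ == "__main__":
    # An ID built by hand for illustration: cluster ID 1, all other
    # high fields 0, and a 39-bit timestamp of 301204389720.
    print(decode_sequence_id((1 << 45) | 301204389720))
    print(round(YEARS_COVERED, 1))
```

The last value comes out a little above 17, which is where the "17 years" figure comes from.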

2、 Build a global sequence environment using distributed timestamps

1. Configure the ZK environment

To build the dble and ZK environment, please refer to another community article, "Using zookeeper to manually deploy a dble cluster environment". In this article, one ZK instance manages three dble nodes (dble-1, dble-2, dble-3) that form a cluster. All three dble nodes run version 2.20.04.0.

2. dble configuration requirements

1) Configure sequence_distributed_conf.properties on each dble in the cluster

The sequence_distributed_conf.properties configuration on dble-1 is as follows:

INSTANCEID=zk
CLUSTERID=01
START_TIME=2010-11-04 09:42:54

The sequence_distributed_conf.properties configuration on dble-2 is as follows:

INSTANCEID=zk
CLUSTERID=02
START_TIME=2010-11-04 09:42:54

The sequence_distributed_conf.properties configuration on dble-3 is as follows:

INSTANCEID=zk
CLUSTERID=03
START_TIME=2010-11-04 09:42:54

In sequence_distributed_conf.properties:

INSTANCEID: specifies the instance ID value, which can be 'zk' or n (n is an integer in the interval [0,31]). If 'zk' is configured, the sequence (mainly the INSTANCEID value) is maintained through ephemeral sequential nodes in zookeeper: every time a global sequence is generated, an ephemeral sequential node is requested from ZK, and the instance ID is the node's sequence number % 32. If INSTANCEID is not 'zk', the sequence (mainly the INSTANCEID value) is maintained by the single instance alone; in that case the sequence behaves much like the timestamp method.

CLUSTERID: specifies the group ID value, which can be m (m is an integer in the interval [0,15]).

START_TIME: specifies the start time. The format is fixed and must follow the form 2010-11-04 09:42:54.
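The three constraints above can be checked before the file is distributed to the cluster. The following is a minimal sketch, not a dble tool: `validate_sequence_conf` is a hypothetical helper, the settings are passed in as a plain dict, and matching 'zk' case-insensitively is an assumption of this sketch.

```python
from datetime import datetime


def _int_in_range(text: str, low: int, high: int) -> bool:
    """Return True if text parses as an integer within [low, high]."""
    try:
        return low <= int(text) <= high
    except ValueError:
        return False


def validate_sequence_conf(conf: dict) -> list:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []

    instance = conf.get("INSTANCEID", "").strip()
    # Assumption: 'zk' is accepted in any letter case.
    if instance.lower() != "zk" and not _int_in_range(instance, 0, 31):
        problems.append("INSTANCEID must be 'zk' or an integer in [0,31]")

    cluster = conf.get("CLUSTERID", "").strip()
    if not _int_in_range(cluster, 0, 15):
        problems.append("CLUSTERID must be an integer in [0,15]")

    start_time = conf.get("START_TIME", "").strip()
    try:
        datetime.strptime(start_time, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        problems.append("START_TIME must look like 2010-11-04 09:42:54")

    return problems


if __name__ == "__main__":
    # The dble-1 settings shown earlier in this section.
    dble_1 = {
        "INSTANCEID": "zk",
        "CLUSTERID": "01",
        "START_TIME": "2010-11-04 09:42:54",
    }
    print(validate_sequence_conf(dble_1))
```

For the dble-1 settings the returned list is empty; a CLUSTERID of 16, for example, would produce one message.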

2) Modify schema.xml and server.xml on dble-1 as follows:

The key configuration in schema.xml is as follows:

<schema name="schema1" dataNode="dn1">
<table name="tb_autoIncre" dataNode="dn1,dn2,dn3,dn4" rule="hash-four" cacheKey="id" incrementColumn="id"/>
</schema>

The key configuration in server.xml is as follows:

<system>
    <property name="sequenceHandlerType">3</property>
</system>

3) Log in to the management port of dble-1 and execute the management command reload @@config_all, then restart the three dble nodes in turn so that the configuration takes effect across the cluster.

3、 Operation and result verification

1. Create tables and insert data

To create a table with a bigint auto-increment column, log in to the business port of dble-1 and execute:

mysql -p111111 -utest -P8066 -Dschema1 -e "create table tb_autoIncre(id bigint,time char(120));"

To insert data, execute:

datestr=`date +%Y%m%d`
mysql -p111111 -utest -P8066 -Dschema1 -e "insert tb_autoIncre values('${datestr}');"

To query the inserted data, execute:

mysql -p111111 -utest -P8066 -Dschema1 -e "select * from tb_autoIncre ;"

[Figure: query result showing the inserted id and time values]

2. Verify that the value inserted into the auto-increment id column is correct

The ID in the query result above is a positive number. Convert it to binary, then check whether each component matches the design, following the 63-bit composition rules described in the introduction. The specific steps are as follows:

1) Convert the ID queried above into binary and record it as (a). If the result is shorter than 64 bits, pad it with leading zeros to 64 bits

select conv(id, 10, 2) from  tb_autoIncre;

Therefore, after padding to 64 bits, the binary is: 000000000000000000100000001001011000111101011000

2) Counting from the left, take bits 16 to 19 (inclusive) of binary (a): 0001, which is 1 in decimal and equals the CLUSTERID value in the configuration. Take bits 11 to 15 (inclusive): 00000, which is 0 in decimal. The instance value of the ephemeral sequential node in ZK is 0, and 0 % 32 is also 0, so this matches the expectation. Take the last 39 bits of (a) as a new binary number, 100011000100001001011100011111101011000, recorded as (b).

3) Convert binary (b) to decimal

select conv(100011000100001001011100011111101011000, 2, 10);

So (b) in decimal is 301204389720, recorded as (c).

4) Convert (c) to a date, recorded as (T1)

set @unixtime=301204389720/1000;
select from_unixtime(@unixtime);

5) Compute the number of days between the date (T1) and 1970-01-01, and record it as (T2)

select datediff(from_unixtime(301204389720/1000),'1970-01-01');

6) Add (T2) to the start time (2010-11-04 09:42:54) to get the final time (T3)

select date_add('2010-11-04 09:42:54', interval 3486 day);

(T3) is approximately equal to the value of the time column returned by the query.
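Steps 3) to 6) can also be cross-checked outside MySQL. The sketch below assumes, as the verification steps above imply, that field E counts the milliseconds elapsed since START_TIME; the variable names are chosen for this illustration only. Because it adds the full millisecond offset rather than whole days, the result carries the time of day as well as the date.

```python
from datetime import datetime, timedelta

# START_TIME from sequence_distributed_conf.properties.
START_TIME = datetime(2010, 11, 4, 9, 42, 54)

# Binary (b) from step 3): the low 39 bits of the generated ID.
b = "100011000100001001011100011111101011000"

# Same conversion as conv(b, 2, 10) in step 3).
c = int(b, 2)

# Assumption: (c) is the number of milliseconds since START_TIME.
offset = timedelta(milliseconds=c)
generated_at = START_TIME + offset

if __name__ == "__main__":
    print(c)             # 301204389720
    print(offset.days)   # 3486, the same day count as (T2)
    print(generated_at)  # 2020-05-21 13:36:03.720000
```

The date part matches what step 6) produces by adding 3486 days to the start time.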

Conclusion: the analysis above shows that the value inserted into the auto-increment column is correct.