Micro course lesson 12: introduction to global sequences



In the last issue, we gave a brief introduction and demonstration of hint. That wraps up the basic functions. From here on, we look at some advanced functions.

Global sequence

Let’s first introduce global sequences. Our dble currently supports four global sequences. Nominally there are four kinds, but if we classify them by their core mechanism there are only two. The four differ only in the carrier they run on.

Let’s take a look at the two concepts.

In the upper right corner is the snowflake algorithm, which was first proposed by Twitter and is implemented by splitting a long (64-bit) integer into bit fields. Dble has slightly adjusted the details, but that does not affect the basic concept. It builds a global sequence from a timestamp. First of all, there is a 41-bit timestamp. Can you first calculate the 41st power of 2? We have the conclusion here: at millisecond precision, this number is large enough to last about 69 years. An average system is unlikely to stay in service for 69 years, so 69 years is enough. After all, the computer itself was born only about 70 years ago. Then, within every millisecond, there is a 12-bit serial number. The 12th power of 2 is 4096, which means one millisecond supports 4096 concurrent operations. If we convert that to QPS, we multiply by 1000, which gives a throughput of about 4 million (4,096,000) per second. We think most businesses do not have requirements that high, so this is basically enough.
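The figures quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch in Python. The 41-bit and 12-bit widths come from the lesson, while the 365.25-day year is an assumption made here for the conversion.

```python
# Back-of-the-envelope check of the snowflake numbers quoted in the lesson.
TIMESTAMP_BITS = 41   # from the lesson: 41-bit millisecond timestamp
SEQUENCE_BITS = 12    # from the lesson: 12-bit serial number per millisecond

# Assumption for the conversion: an average year of 365.25 days.
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25

timestamp_values = 2 ** TIMESTAMP_BITS            # distinct millisecond ticks
years_covered = timestamp_values / MS_PER_YEAR    # how long they last

ids_per_ms = 2 ** SEQUENCE_BITS                   # serial numbers per millisecond
ids_per_second = ids_per_ms * 1000                # the "QPS" figure per instance

print(f"2^41 = {timestamp_values} ms, about {years_covered:.1f} years")
print(f"2^12 = {ids_per_ms} IDs per ms, {ids_per_second} IDs per second")
```

The result is a little under 70 years for the timestamp and 4,096,000 IDs per second for one generator, which matches the "69 years" and "about 4 million" in the talk.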

There is also a worker machine ID in the middle. This is designed for clusters. When dble is deployed not as a single node but as a cluster or behind a load balancer, I need to ensure that the sequence values that finally get inserted into the database are unique. I do this by assigning a different ID to each machine or instance. In this way, as long as the clock on a single machine does not go backwards, the timestamp together with the machine ID ensures that the entire sequence is unique. But if your concurrency peak really exceeds 4096 per millisecond, will there be duplicates? In fact, there will be no duplicates. The generator waits instead. The requests that the peak cannot satisfy wait until the current millisecond has passed and get new sequence numbers in the next millisecond. In this case, latency is affected.
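The behaviour described above (timestamp, worker ID, per-millisecond serial number, and waiting when the serial number is exhausted) can be sketched as follows. This is a minimal illustration in Python, not dble's implementation. The class name `SnowflakeSketch`, the 10-bit worker ID width (taken from Twitter's original layout), the `epoch_ms` and `clock` parameters, and the choice to raise an error when the clock goes backwards are all assumptions made for the example.

```python
import threading
import time

TIMESTAMP_BITS = 41   # from the lesson
WORKER_BITS = 10      # assumption: Twitter's original width, dble may differ
SEQUENCE_BITS = 12    # from the lesson

MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1   # 4095
MAX_WORKER = (1 << WORKER_BITS) - 1       # 1023


class SnowflakeSketch:
    """Hypothetical snowflake-style generator, for illustration only."""

    def __init__(self, worker_id, epoch_ms=0, clock=None):
        if not 0 <= worker_id <= MAX_WORKER:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms
        # The clock is injectable so the behaviour can be tested deterministically.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Uniqueness relies on the clock never going backwards.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # All 4096 numbers of this millisecond are used:
                    # wait for the next millisecond instead of repeating.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.epoch_ms
            return ((timestamp << (WORKER_BITS + SEQUENCE_BITS))
                    | (self.worker_id << SEQUENCE_BITS)
                    | self._sequence)


generator = SnowflakeSketch(worker_id=1)
print(generator.next_id())
```

Two instances with different `worker_id` values can never produce the same number, because the worker bits differ even when the timestamp and serial number are equal. The busy-wait loop is where the extra latency mentioned in the talk comes from.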

OK, that is the principle of snowflake. Next is the principle of offset step. What is it? There is a number issuer, which hands out numbers in steps. For example, it issues numbers in steps of 1000. When my dble needs to use a global sequence, it goes to this issuer and applies for one step unit. It applies for 1000 numbers, and all of them belong to it. If dble is a cluster and another dble instance also applies for numbers, the issuer's concurrency control guarantees that the two results are never in the same interval, which ensures that the sequence is unique. Compared with the previous method, this one has clear advantages and disadvantages. The advantage is that the granularity can be refined down to a single table: the global sequences of different tables are independent of one another, so the same numbers can be reused across tables. Snowflake cannot reuse numbers because it is time-dependent, so all auto-increment columns draw from the same set of values. The issuer mode can reuse them. The disadvantage is that an additional carrier is needed: something has to act as the issuer and handle concurrency control, issuing, and persistence. That is why we have the MySQL and ZK modes, which are in fact carriers for the issuer.
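The offset-step idea described above can be sketched as follows. This is a minimal in-memory illustration in Python, not dble's code. The names `SegmentIssuer` and `SequenceClient` are hypothetical, the issuer here is a plain object protected by a lock, and in the lesson's setup that role is played by a carrier such as MySQL or ZK, which would also persist the counters.

```python
import threading
from collections import defaultdict


class SegmentIssuer:
    """Hypothetical stand-in for the carrier that issues number ranges."""

    def __init__(self, step=1000):
        self.step = step
        self._lock = threading.Lock()
        # One independent counter per table, each starting at 1.
        self._next_start = defaultdict(lambda: 1)

    def allocate(self, table):
        # The lock plays the role of the issuer's concurrency control:
        # two callers can never receive overlapping ranges.
        with self._lock:
            start = self._next_start[table]
            self._next_start[table] = start + self.step
            return start, start + self.step   # half-open range [start, end)


class SequenceClient:
    """Hypothetical dble-side consumer that hands out numbers locally."""

    def __init__(self, issuer):
        self._issuer = issuer
        self._lock = threading.Lock()
        self._ranges = {}   # table -> [next value, end of range]

    def next_value(self, table):
        with self._lock:
            current = self._ranges.get(table)
            if current is None or current[0] >= current[1]:
                # Local range is used up: apply for another step unit.
                start, end = self._issuer.allocate(table)
                current = [start, end]
                self._ranges[table] = current
            value = current[0]
            current[0] += 1
            return value


issuer = SegmentIssuer(step=1000)
node_a = SequenceClient(issuer)
node_b = SequenceClient(issuer)
print(node_a.next_value("orders"))   # node_a owns 1..1000 for "orders"
print(node_b.next_value("orders"))   # node_b owns 1001..2000 for "orders"
print(node_a.next_value("users"))    # a different table has its own counter
```

Each client only contacts the issuer once per 1000 numbers, and each table has its own counter, which is the per-table granularity mentioned in the talk. The price is the extra component that has to issue and store the ranges.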

OK, that is all for this introduction.

To make reading easier, some spoken expressions have been edited in ways that do not affect learning. The manuscript and the video are kept as consistent as possible.