Don't miss this distributed ID generator (Leaf): it is a breeze to use!

Time:2021-10-17

This article is included in my personal blog: www.chengxy-nds.top, where I share technical material so we can all improve together.

Readers who are not yet familiar with distributed IDs should first read "Nine distributed ID generation methods in one go, enough to leave the interviewer a little stunned" to review the basics; I won't repeat them here.

Leaf

Leaf is a distributed ID generation service released by Meituan. Its name comes from a saying of the German philosopher and mathematician Leibniz: "There are no two identical leaves in the world." That is quite a meaningful name; the Meituan programmers nailed it!

Leaf's advantages: high reliability, low latency, global uniqueness, and so on.

At present, the mainstream ways of generating distributed IDs are broadly based on the database segment mode and the snowflake algorithm. Leaf happens to support both, and can switch flexibly between them depending on the business scenario.

Next, with some hands-on practice, I will introduce Leaf's two modes in detail: Leaf-segment mode and Leaf-snowflake mode.

1. Leaf-segment mode

Leaf-segment (segment mode) is an optimization over using the database's auto-increment ID directly as the distributed ID: it reduces how often the database is hit. In effect, auto-increment IDs are fetched from the database in batches. Each time, one number segment (a range) is taken from the database. For example, (0, 1000] represents 1000 IDs; the business service then generates the auto-incrementing IDs 1 to 1000 locally from that segment and keeps them in memory.

The general process is shown in the figure below:
[Figure: Leaf-segment number segment allocation flow]
After a number segment is used up, the service goes back to the database for a new one, which greatly reduces the pressure on the database. Fetching a segment is a single update on the max_id field: update max_id = max_id + step. If the update succeeds, the new segment has been acquired, and its range is (max_id, max_id + step].
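
To make the range arithmetic concrete, here is a minimal Java sketch of my own (it is not Leaf's code). FakeAllocRow is a hypothetical in-memory stand-in for one leaf_alloc row, and SegmentIdSketch is an assumed name for a generator that hands out IDs from (max_id, max_id + step] and only "touches the database" when the segment runs out.

```java
/** Hypothetical stand-in for one leaf_alloc row; a real service would run the UPDATE in a transaction. */
final class FakeAllocRow {
    private long maxId;
    private final int step;

    FakeAllocRow(long maxId, int step) {
        this.maxId = maxId;
        this.step = step;
    }

    /** Mimics "update max_id = max_id + step" and returns the new max_id. */
    synchronized long advance() {
        maxId += step;
        return maxId;
    }

    int step() {
        return step;
    }
}

/** Issues IDs from an in-memory segment and refills it only when it is exhausted. */
final class SegmentIdSketch {
    private final FakeAllocRow row;
    private long current; // last ID handed out
    private long max;     // inclusive upper bound of the current segment

    SegmentIdSketch(FakeAllocRow row) {
        this.row = row;
        // current == max means "empty", so the first call loads a segment
    }

    synchronized long nextId() {
        if (current >= max) {
            long newMax = row.advance();  // one "database" round trip per segment
            current = newMax - row.step(); // segment is (newMax - step, newMax]
            max = newMax;
        }
        return ++current;
    }
}
```

With max_id = 0 and step = 10, the first ten calls to nextId() cost one update in total, and the eleventh call triggers the second update.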

Due to the dependence on the database, we first design the following table structure:

CREATE TABLE `leaf_alloc` (
  `biz_tag` varchar(128) NOT NULL DEFAULT '' COMMENT 'business key',
  `max_id` bigint(20) NOT NULL DEFAULT '1' COMMENT 'the maximum ID currently allocated',
  `step` int(11) NOT NULL COMMENT 'initial step size, which is also the minimum step size for dynamic adjustment',
  `description` varchar(256) DEFAULT NULL COMMENT 'description of the business key',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time, maintained by the database',
  PRIMARY KEY (`biz_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Insert a row of test business data in advance:

INSERT INTO `leaf_alloc` (`biz_tag`, `max_id`, `step`, `description`, `update_time`) VALUES ('leaf-segment-test', '0', '10', 'test', '2020-02-28 10:41:03');

  • biz_tag: different business needs are isolated by the biz_tag field. If capacity has to be expanded later, you only need to shard databases and tables by biz_tag.
  • max_id: the maximum value of the current business number segment, used to calculate the next segment.
  • step: the step size, that is, how many IDs are fetched each time.
  • description: a description of the business; nothing more to say about it.

Download the Leaf project locally: https://github.com/Meituan-Dianping/Leaf

Modify the leaf.properties file in the project and add the database configuration:

leaf.name=com.sankuai.leaf.opensource.test
leaf.segment.enable=true
leaf.jdbc.url=jdbc:mysql://127.0.0.1:3306/xin-master?useUnicode=true&characterEncoding=utf8
leaf.jdbc.username=junkang
leaf.jdbc.password=junkang

leaf.snowflake.enable=false

Note that leaf.snowflake.enable and leaf.segment.enable cannot both be enabled at the same time; otherwise the project will not start.

The configuration is quite simple: just start LeafServerApplication and it is ready. Next, let's test it. Leaf is an ID-issuing service based on HTTP requests. LeafController has only two methods, one segment interface and one snowflake interface; key is the business biz_tag inserted into the database in advance.

@RestController
public class LeafController {
    private Logger logger = LoggerFactory.getLogger(LeafController.class);

    @Autowired
    private SegmentService segmentService;
    @Autowired
    private SnowflakeService snowflakeService;

    /**
     * Segment mode
     * @param key
     * @return
     */
    @RequestMapping(value = "/api/segment/get/{key}")
    public String getSegmentId(@PathVariable("key") String key) {
        return get(key, segmentService.getId(key));
    }

    /**
     * Snowflake algorithm mode
     * @param key
     * @return
     */
    @RequestMapping(value = "/api/snowflake/get/{key}")
    public String getSnowflakeId(@PathVariable("key") String key) {
        return get(key, snowflakeService.getId(key));
    }

    private String get(@PathVariable("key") String key, Result id) {
        Result result;
        if (key == null || key.isEmpty()) {
            throw new NoKeyException();
        }
        result = id;
        if (result.getStatus().equals(Status.EXCEPTION)) {
            throw new LeafServerException(result.toString());
        }
        return String.valueOf(result.getId());
    }
}

Visit: http://127.0.0.1:8080/api/segment/get/leaf-segment-test. The result came back normally and nothing seemed wrong, but when I checked the data in the database table, I noticed a problem.

Normally, in segment mode, a new number segment is fetched only when the previous one has been used up. Here, though, only one ID had been taken, yet max_id in the database had already been updated, which means Leaf had already fetched one extra number segment. What is going on?

Why is Leaf designed like this?

Leaf wants fetching a number segment from the DB to be non-blocking!

If the service only goes to the DB for the next segment once the current one is exhausted, then whenever the network jitters or the DB runs a slow query, the business system cannot get a segment and the response time of the whole system slows down. That is intolerable for businesses with huge traffic.

Therefore, once the current segment has been consumed up to a certain point, Leaf asynchronously loads the next segment into memory, instead of waiting until the segment is exhausted before refreshing it. This greatly reduces the risk to the system.

So when exactly is that "certain point"?

I ran an experiment, with the segment length set to step=10 and max_id=1.
When I took the first ID, I saw that the number segment in the database had increased (1/10).
When I took the third ID, I saw that the number segment had increased again (3/10).

Leaf uses a double buffer: internally the service holds two segment caches. When 10% of the current number segment has been consumed and the next segment has not yet been fetched, a separate update thread is started to load the next segment.

In short, Leaf makes sure that two number segments are always cached, so even if the database goes down at some point, the ID-issuing service can keep working normally for a while.
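
The double-buffer idea can be sketched in Java roughly as follows. This is a simplified illustration under my own assumptions, not Leaf's real implementation: DoubleBufferSketch, fetchNewMax (a hook that performs the max_id update and returns the new max_id) and loader (the executor that runs the prefetch) are all hypothetical names.

```java
import java.util.concurrent.Executor;
import java.util.function.LongSupplier;

/** Illustration only: a simplified double buffer, not Leaf's real implementation. */
final class DoubleBufferSketch {
    private final int step;
    private final LongSupplier fetchNewMax; // hypothetical hook: runs the max_id update, returns the new max_id
    private final Executor loader;          // runs the prefetch; a thread pool in a real service

    private long current;       // last ID handed out
    private long max;           // inclusive upper bound of the active segment
    private long nextMax = -1;  // upper bound of the prefetched segment, -1 if none is ready
    private boolean loading;    // true while a prefetch is in flight

    DoubleBufferSketch(int step, LongSupplier fetchNewMax, Executor loader) {
        this.step = step;
        this.fetchNewMax = fetchNewMax;
        this.loader = loader;
        this.max = fetchNewMax.getAsLong(); // the very first segment is loaded synchronously
        this.current = max - step;
    }

    synchronized long nextId() {
        if (current >= max) {               // active segment exhausted: switch buffers
            awaitPrefetch();
            if (nextMax < 0) {
                nextMax = fetchNewMax.getAsLong(); // prefetch failed or never ran: load now
            }
            max = nextMax;
            current = max - step;
            nextMax = -1;
        }
        long id = ++current;
        long used = step - (max - current);
        if (nextMax < 0 && !loading && used * 10 >= step) {
            loading = true;                 // at least 10% consumed and no spare segment yet
            try {
                loader.execute(this::prefetch);
            } catch (RuntimeException e) {
                loading = false;            // could not prefetch; the next call tries again
            }
        }
        return id;
    }

    private void awaitPrefetch() {
        while (loading) {
            try {
                wait();                     // releases the lock until prefetch() signals
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a segment", e);
            }
        }
    }

    private void prefetch() {
        long fetched = -1;
        try {
            fetched = fetchNewMax.getAsLong();
        } finally {
            synchronized (this) {
                nextMax = fetched;          // stays -1 if the fetch threw
                loading = false;
                notifyAll();
            }
        }
    }
}
```

With step = 10, the very first call to nextId() already crosses the 10% mark, so the sketch requests a second segment right away. That matches what I saw in the table: one ID taken, but max_id had already moved on.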

It is generally recommended to set the segment length to 600 times the ID-issuing QPS at peak service time (that is, 10 minutes' worth), so that even if the DB goes down, Leaf can keep issuing IDs for 10 to 20 minutes without being affected.
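
As a quick worked example of that rule of thumb, the small Java sketch below does the arithmetic. The class and method names are my own, and the peak QPS of 2000 used in the note underneath is a made-up figure.

```java
/** Arithmetic helper for the "600 x peak QPS" rule of thumb; names are illustrative. */
final class SegmentSizing {
    /** Recommended step: peak QPS multiplied by 600 seconds (10 minutes). */
    static long recommendedStep(long peakQps) {
        return Math.multiplyExact(peakQps, 600L);
    }

    /** How many seconds the given number of cached IDs lasts at the given QPS. */
    static long secondsOfCover(long cachedIds, long qps) {
        return cachedIds / qps;
    }
}
```

For a hypothetical peak of 2000 QPS, recommendedStep(2000) gives a step of 1,200,000. One cached segment then covers 600 seconds at peak, and two full segments cover 1200 seconds, which is where the 10 to 20 minutes comes from.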

Advantages:

  • The Leaf service scales out linearly with little effort, and its performance is enough for most business scenarios.
  • High disaster tolerance: the Leaf service caches number segments internally, so even if the DB goes down, Leaf can still serve requests normally for a short time.

Disadvantages:

  • The IDs are not random enough and can leak how many IDs have been issued, which is not very secure.
  • DB downtime may make the whole system unavailable (that is a risk for anything that relies on a database).

2. Leaf-snowflake

Leaf-snowflake basically follows the design of snowflake. The ID is composed as follows: sign bit (1 bit) + timestamp (41 bits) + machine ID (5 bits) + data center ID (5 bits) + auto-increment sequence (12 bits), 64 bits in total, forming a long.
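
The bit layout above can be illustrated with a few shifts and masks. The sketch below is my own and makes two assumptions: the two 5-bit fields are treated together as one 10-bit worker ID, and EPOCH_MS is a hypothetical custom epoch, not necessarily the one Leaf uses.

```java
/** Packs and unpacks the 1 + 41 + 10 + 12 bit layout; a sketch, not Leaf's generator. */
final class SnowflakeLayoutSketch {
    static final long EPOCH_MS = 1_600_000_000_000L; // hypothetical custom epoch (assumption)
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 10;               // machine ID (5) + data center ID (5)
    static final int TIMESTAMP_BITS = 41;
    static final int WORKER_SHIFT = SEQUENCE_BITS;                  // 12
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS; // 22
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;     // 4095
    static final long MAX_WORKER = (1L << WORKER_BITS) - 1;         // 1023
    static final long MAX_DELTA = (1L << TIMESTAMP_BITS) - 1;

    /** Builds an ID from a wall-clock timestamp, a worker ID and a per-millisecond sequence. */
    static long compose(long timestampMs, long workerId, long sequence) {
        long delta = timestampMs - EPOCH_MS;
        if (delta < 0 || delta > MAX_DELTA) {
            throw new IllegalArgumentException("timestamp out of range");
        }
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId out of range");
        }
        if (sequence < 0 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("sequence out of range");
        }
        // The top (sign) bit stays 0 because delta fits in 41 bits.
        return (delta << TIMESTAMP_SHIFT) | (workerId << WORKER_SHIFT) | sequence;
    }

    static long timestampOf(long id) {
        return (id >>> TIMESTAMP_SHIFT) + EPOCH_MS;
    }

    static long workerOf(long id) {
        return (id >>> WORKER_SHIFT) & MAX_WORKER;
    }

    static long sequenceOf(long id) {
        return id & MAX_SEQUENCE;
    }
}
```

Because the timestamp sits in the highest bits, IDs generated later compare greater than IDs generated earlier, which is why snowflake IDs trend upward over time.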

Leaf-snowflake differs from the original snowflake algorithm mainly in how the workId is generated. Leaf-snowflake relies on ZooKeeper to generate the workId, that is, the machine ID (5 bits) + data center ID (5 bits) mentioned above. In Leaf, the workId is based on ZooKeeper's sequential node ID: every application that uses Leaf-snowflake creates a sequential node in ZooKeeper at startup, so one machine corresponds to one sequential node, that is, one workId.

The startup process of the Leaf-snowflake service is roughly as follows:

  • Start the Leaf-snowflake service, connect to ZooKeeper, and check under the leaf_forever parent node whether this instance has already registered (that is, whether its sequential child node exists).
  • If it has registered, read back its workerId (the int ID generated by the ZooKeeper sequential node) and start the service.
  • If it has not registered, create a persistent sequential node under that parent node; once it is created, take the sequence number as its workerId and start the service.

However, Leaf-snowflake depends only weakly on ZooKeeper. Besides fetching data from ZooKeeper each time, it also caches a workerID file on the local file system. If ZooKeeper has a problem and the machine happens to need a restart because of a failure, the service can still start normally.
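
The "ask ZooKeeper first, fall back to the local file" behaviour can be sketched as below. This is an assumption-laden illustration rather than Leaf's code: the registry parameter is a hypothetical stand-in for the ZooKeeper lookup (no ZooKeeper client is used here), the node name format "ip:port-sequence" is assumed for the example, and the cache file holds nothing but the worker ID.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;

/** Resolves a worker ID from a registry, with a local file as a fallback; a sketch only. */
final class WorkerIdResolverSketch {
    /** Takes the numeric suffix of a sequential node name such as "10.0.0.1:2181-0000000003". */
    static int workerIdFromNode(String nodeName) {
        return Integer.parseInt(nodeName.substring(nodeName.lastIndexOf('-') + 1));
    }

    /**
     * Asks the registry (hypothetical stand-in for ZooKeeper) for this machine's node name.
     * On success the worker ID is written to cacheFile; if the registry call fails,
     * the previously cached value is returned instead.
     */
    static int resolve(Callable<String> registry, Path cacheFile) {
        String nodeName;
        try {
            nodeName = registry.call();
        } catch (Exception registryFailure) {
            if (!Files.exists(cacheFile)) {
                throw new IllegalStateException("registry unavailable and no cached worker ID", registryFailure);
            }
            return readCache(cacheFile);
        }
        int workerId = workerIdFromNode(nodeName);
        try {
            Files.write(cacheFile, Integer.toString(workerId).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return workerId;
    }

    private static int readCache(Path cacheFile) {
        try {
            String cached = new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
            return Integer.parseInt(cached);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The design point is that the registry is only needed the first time: once a worker ID has been cached locally, a restart during a registry outage still yields the same ID.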

Starting Leaf-snowflake mode is also fairly simple: start a local ZooKeeper, then modify the leaf.properties file in the project to disable leaf.segment mode and enable leaf.snowflake mode.

leaf.segment.enable=false
#leaf.jdbc.url=jdbc:mysql://127.0.0.1:3306/xin-master?useUnicode=true&characterEncoding=utf8
#leaf.jdbc.username=junkang
#leaf.jdbc.password=junkang

leaf.snowflake.enable=true
leaf.snowflake.zk.address=127.0.0.1
leaf.snowflake.port=2181

    /**
     * Snowflake algorithm mode
     * @param key
     * @return
     */
    @RequestMapping(value = "/api/snowflake/get/{key}")
    public String getSnowflakeId(@PathVariable("key") String key) {
        return get(key, snowflakeService.getId(key));
    }

Test it by visiting: http://127.0.0.1:8080/api/snowflake/get/leaf-segment-test

Advantages:

  • The ID is a 64-bit (8-byte) number that trends upward, which satisfies the requirements for a primary key stored in a database.

Disadvantages:

  • It depends on ZooKeeper, so there is a risk of the service becoming unavailable (honestly, I can't think of any other disadvantage).

3. Leaf monitoring

Request address: http://127.0.0.1:8080/cache

To monitor the service itself, Leaf provides a web-layer view of its in-memory data, which shows the issuing status of all number segments in real time. For example, the double-buffer usage of each number segment and the position the current ID has reached can both be viewed on the web page.

Summary

This article does not analyze the Leaf source code in much depth, because the Leaf code base is small and easy to read.

Original writing is not easy; I burn through my hair to produce this content. If you got something out of it, give it a like as encouragement!

Readers who are used to reading technical articles on WeChat (VX) and want more Java resources can follow my official account, "Something inside the programmer", code: [666]