Since Elastic-Job-Lite is decentralized, why does it elect a master node?


Opening remarks

The previous article introduced Elastic-Job-Lite: its architecture, its usage, and some of its processes. It described Elastic-Job-Lite as a decentralized, lightweight task scheduling framework. So why does Elastic-Job-Lite elect a master node at startup? Did I get it wrong? Haha, no. From here on, Elastic-Job-Lite is abbreviated as ejl.

Leader election

Ejl is positioned as lightweight and decentralized. Task scheduling is driven by each machine itself, and the machines coordinate through ZooKeeper (ZK). Ejl creates a JobScheduler for each task and, while initializing the JobScheduler, elects a master node for that job. Remember that there is no single global master node; there is one master node per task. As shown in the figure, two tasks, job1 and job2, run on each node. At startup, each node creates two JobScheduler objects, and the cluster elects a leader for each task.

How is this leader elected, and when does the election happen? First, a leader is elected for each task when the whole cluster starts. Second, when the leader of a task goes offline, a new leader is elected.

Election of leaders at cluster startup

In JobScheduler

In the callback's execute() method, shown below, ejl checks again that no master node exists and then writes the current machine's instance ID.

if (!hasLeader()) {

Leader re-election is handled mainly by LeaderElectionJobListener. When a job is still running but its leader node goes offline, this listener re-elects the leader.

class LeaderElectionJobListener extends AbstractJobListener{

In essence, electing the master node means that every instance competes for a ZK distributed lock. Whoever gets the lock first becomes the master node.
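The "compete for a lock, then check again" idea can be sketched without ZooKeeper. The following is a minimal in-memory simulation under stated assumptions, not ejl's code: a ReentrantLock stands in for the ZK distributed lock, an AtomicReference stands in for the leader instance node, and the class and method names (LeaderElectionSketch, electLeader, removeLeader) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// In-memory stand-in for per-job leader election; all names are hypothetical.
class LeaderElectionSketch {
    // Stands in for the ZK distributed lock.
    private final ReentrantLock latch = new ReentrantLock();
    // Stands in for the ZK node that stores the leader's instance ID.
    private final AtomicReference<String> leaderInstance = new AtomicReference<>();

    boolean hasLeader() {
        return leaderInstance.get() != null;
    }

    String getLeader() {
        return leaderInstance.get();
    }

    // Every instance calls this; only the first one to get the lock becomes leader.
    void electLeader(String instanceId) {
        latch.lock();
        try {
            // Check again inside the lock: another instance may already have won.
            if (!hasLeader()) {
                leaderInstance.set(instanceId);
            }
        } finally {
            latch.unlock();
        }
    }

    // Simulates the leader going offline (its node disappears).
    void removeLeader() {
        leaderInstance.set(null);
    }

    public static void main(String[] args) throws InterruptedException {
        LeaderElectionSketch job1 = new LeaderElectionSketch();
        String[] instances = {"instance-a", "instance-b", "instance-c"};
        List<Thread> threads = new ArrayList<>();
        for (String id : instances) {
            Thread t = new Thread(() -> job1.electLeader(id));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // Which instance wins depends on thread timing.
        System.out.println("leader of job1: " + job1.getLeader());

        // The leader goes offline; the surviving instances compete again.
        String oldLeader = job1.getLeader();
        job1.removeLeader();
        for (String id : instances) {
            if (!id.equals(oldLeader)) {
                job1.electLeader(id);
            }
        }
        System.out.println("new leader of job1: " + job1.getLeader());
    }
}
```

Each job would own one such object, which matches the point that there is a leader per task rather than one for the whole cluster.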

When is the leader used, and what does it do?

In a distributed system, a task runs across multiple machines and multiple shards. How should the shards be allocated? Which machine executes which shards? If every machine took part in the decision, the result would be chaos, so a leader is needed to decide. Ejl needs the leader node in two places:

  1. After the machines start, when the task executes for the first time, the leader must assign the shards.
  2. When nodes join the cluster, the number of shards changes, or nodes go offline, re-sharding is triggered.
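To make the leader's decision concrete, here is a minimal sketch of one way a leader could divide shard items among the online instances: spread them evenly and hand any remainder to the first instances, one each. This allocation rule is an assumption for illustration, not necessarily the strategy a given ejl job is configured with, and the names ShardingSketch and assign are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative even split of shard items across instances; names are hypothetical.
class ShardingSketch {
    static Map<String, List<Integer>> assign(List<String> instances, int shardTotal) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (instances.isEmpty()) {
            return result;
        }
        int perInstance = shardTotal / instances.size();
        int next = 0;
        // First pass: give every instance the same number of consecutive items.
        for (String instance : instances) {
            List<Integer> items = new ArrayList<>();
            for (int i = 0; i < perInstance; i++) {
                items.add(next++);
            }
            result.put(instance, items);
        }
        // Second pass: hand the leftover items to the first instances, one each.
        for (String instance : instances) {
            if (next >= shardTotal) {
                break;
            }
            result.get(instance).add(next++);
        }
        return result;
    }

    public static void main(String[] args) {
        // First execution: three instances share eight shard items.
        System.out.println(assign(Arrays.asList("a", "b", "c"), 8));
        // prints {a=[0, 1, 6], b=[2, 3, 7], c=[4, 5]}

        // Instance "c" goes offline, so the leader re-shards among the rest.
        System.out.println(assign(Arrays.asList("a", "b"), 8));
        // prints {a=[0, 1, 2, 3], b=[4, 5, 6, 7]}
    }
}
```

Because only the leader runs this calculation, every instance sees the same assignment instead of each one computing its own.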

The main code is as follows. To read the source, start from the execute method of the AbstractElasticJobExecutor class.

AbstractElasticJobExecutor class

When to delete a leader node

The leader node is deleted in three situations. First, when the leader's machine process exits, a JVM shutdown hook deletes the node. Second, when the job is disabled. Third, when the master node's process is shut down remotely.

Leader machine process shutdown

In the JobShutdownHookPlugin class
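The shutdown-hook idea can be sketched with the JDK alone. This is a minimal illustration under stated assumptions, not the plugin's code: a plain field stands in for the leader node in ZK, and LeaderRemovalSketch, setLeader, and removeLeaderIfOwnedBy are hypothetical names. Runtime.getRuntime().addShutdownHook is the standard JDK call for running code when the JVM exits.

```java
// In-memory stand-in for removing the leader node at JVM shutdown; names are hypothetical.
class LeaderRemovalSketch {
    // Stands in for the leader instance node in ZK.
    private String leaderInstance;

    synchronized void setLeader(String instanceId) {
        leaderInstance = instanceId;
    }

    synchronized boolean hasLeader() {
        return leaderInstance != null;
    }

    // Deletes the leader node only if the given instance currently holds it.
    synchronized boolean removeLeaderIfOwnedBy(String instanceId) {
        if (instanceId.equals(leaderInstance)) {
            leaderInstance = null;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        LeaderRemovalSketch registry = new LeaderRemovalSketch();
        String self = "instance-a";
        registry.setLeader(self);

        // Runs when the JVM exits normally or receives a termination signal.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            if (registry.removeLeaderIfOwnedBy(self)) {
                System.out.println("leader node removed on shutdown");
            }
        }));
        System.out.println("job running, leader present: " + registry.hasLeader());
    }
}
```

Checking ownership before deleting matters: an instance that is not the leader must not remove another instance's leader node when it shuts down.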

When the job is disabled

In the LeaderAbdicationJobListener class

When the job terminates scheduling

In the InstanceShutdownStatusJobListener class

Data structure of ejl’s leader in ZK

The code is in the LeaderNode class

Root path of leader in ZK

String ROOT = "leader";

Parent path for leader election: /leader/election

String ELECTION_ROOT = ROOT + "/election";

Stores the master node's instance at /leader/election/instance. This is an ephemeral node: when the machine hosting the leader goes offline, the path disappears, which triggers re-election.

String INSTANCE = ELECTION_ROOT + "/instance";

Distributed lock for leader election: /leader/election/latch

String LATCH = ELECTION_ROOT + "/latch";
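Put together, the constants resolve to three relative paths. The sketch below only concatenates them. The four constants are copied from the listing above; the "/" + jobName prefix and the helper fullPath are assumptions for illustration, since the article does not show how each job's root path is prepended.

```java
// Resolves the leader-related paths for one job; fullPath is a hypothetical helper.
class LeaderNodePathsSketch {
    static final String ROOT = "leader";
    static final String ELECTION_ROOT = ROOT + "/election";
    static final String INSTANCE = ELECTION_ROOT + "/instance";
    static final String LATCH = ELECTION_ROOT + "/latch";

    // Assumption: every job keeps its nodes under a root named after the job.
    static String fullPath(String jobName, String node) {
        return "/" + jobName + "/" + node;
    }

    public static void main(String[] args) {
        for (String job : new String[] {"job1", "job2"}) {
            System.out.println(fullPath(job, INSTANCE));
            System.out.println(fullPath(job, LATCH));
        }
        // prints:
        // /job1/leader/election/instance
        // /job1/leader/election/latch
        // /job2/leader/election/instance
        // /job2/leader/election/latch
    }
}
```

Under this layout each job has its own instance and latch nodes, which is what lets job1 and job2 elect different leaders.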


This work is released under a CC license. Reprints must credit the author and link to this article.

That boy Ah Wei