Pride of a big country: domestic distributed scheduling systems you may have heard of but not explored in depth

Time:2021-1-20

Preface:

I advocate systematic learning. Whether or not a technology is currently popular, studying it systematically lets you build a complete body of knowledge.

If you don't believe me, see my recent article:

The first time I saw such a complete knowledge map of Redis, my boss no longer had to worry about my technical skills

Systematic learning helps a great deal, whether at work, in interviews, or for your future development. I put this article together because, a while ago, my company wanted to transform its existing single-node scheduling into distributed task scheduling, so we studied a lot of related documents and other material, and that is how today's article came about.

At the time, to save effort, I studied the mainstream open-source distributed task scheduling frameworks on the market, but I came away with one feeling: trouble! We had previously written many scheduled tasks in a single class, which made the migration even more troublesome. I am also lazy, and I always feel a little uncomfortable when I have to change a lot of code just to fit tools that others have written.

Still, we cannot deny the importance of distributed scheduling systems.

The importance of distributed scheduling systems

Distributed scheduling plays a very important role in Internet companies, especially in e-commerce. Large data volumes and high concurrency place higher demands on data processing: it must be efficient, and it must also be accurate and secure. Relatively time-consuming business logic is therefore often split out and processed asynchronously.

Building my own

So I wanted to write a framework myself. After all, I think distributed task scheduling is the simplest of all distributed systems, because in a typical company it is unlikely that a large number of tasks must be scheduled at the same moment and cause heavy concurrency. The main purpose of the transformation is to distribute tasks across multiple nodes so that more tasks can be handled at the same time.

One day, I went to pick up a parcel at the company's front desk and noticed something: several colleagues (me included) looked through the parcels at the front desk from first to last to see which were their own. If a parcel was theirs, they took it; if not, they ignored it. That inspired me, because the scene is analogous to a distributed scheduling system. One way to see it is that the express company or the courier has already sorted each parcel by name and telephone number, and we only need to collect our own. From another point of view, each of us looks through all the parcels from first to last and, following an agreed rule, takes a parcel if it is ours and otherwise ignores it and moves on to the next. If you think of a parcel as a task, and a group of people can collect a batch of parcels smoothly, then can a group of nodes not pick up their own tasks just as well?

Right after this inspiration I set to work. The key code is attached below.
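The parcel analogy can be sketched in a few lines of Java. This is only an illustration under assumptions that do not come from the framework itself: the "agreed rule" is taken to be the task id modulo the number of nodes, and the names `TaskPicker`, `isMine`, and `pick` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: every node scans the same task list and keeps only
// the tasks that the agreed rule assigns to it (assumed rule: id modulo node count).
class TaskPicker {

    // Returns true if the task with this id belongs to the node at nodeIndex.
    static boolean isMine(long taskId, int nodeIndex, int nodeCount) {
        return Math.floorMod(taskId, (long) nodeCount) == nodeIndex;
    }

    // Scans all task ids from first to last and picks out this node's own.
    static List<Long> pick(List<Long> allTaskIds, int nodeIndex, int nodeCount) {
        List<Long> mine = new ArrayList<>();
        for (long id : allTaskIds) {
            if (isMine(id, nodeIndex, nodeCount)) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Long> tasks = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L);
        for (int node = 0; node < 3; node++) {
            System.out.println("node " + node + " takes " + pick(tasks, node, 3));
        }
    }
}
```

Because every node applies the same rule to the same list, no task is taken twice and none is left behind, as long as all nodes agree on the node count.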

package com.rdpaas.task.scheduler;

import com.rdpaas.task.common.*;
import com.rdpaas.task.config.EasyJobConfig;
import com.rdpaas.task.repository.NodeRepository;
import com.rdpaas.task.repository.TaskRepository;
import com.rdpaas.task.strategy.Strategy;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct;
import java.util.Date;
import java.util.List;
import java.util.concurrent.*;

/**
 * Task scheduler
 *
 * @author rongdi
 * @date 2019-03-13 21:15
 */
@Component
public class TaskExecutor {

    private static final Logger logger = LoggerFactory.getLogger(TaskExecutor.class);

    @Autowired
    private TaskRepository taskRepository;

    @Autowired
    private NodeRepository nodeRepository;

    @Autowired
    private EasyJobConfig config;

    /**
     * Delay queue that holds tasks until they are due
     */
    private DelayQueue<DelayItem<Task>> taskQueue = new DelayQueue<>();

    /**
     * At most two threads ever run in this pool, so the factory method
     * provided by the JDK can be used directly
     */
    private ExecutorService bossPool = Executors.newFixedThreadPool(2);

    /**
     * Worker thread pool
     */
    private ThreadPoolExecutor workerPool;

    @PostConstruct
    public void init() {
        /**
         * Custom thread pool configured with a core pool size, a maximum pool
         * size, and a bounded waiting queue. When all core threads are busy and
         * the waiting queue is full, the pool grows up to the maximum pool size.
         * Threads created by this expansion are reclaimed after being idle for 60s.
         * A custom pool is used because each of the ready-made pools in Executors
         * has its own drawbacks and is not suitable for production use.
         */
        workerPool = new ThreadPoolExecutor(config.getCorePoolSize(), config.getMaxPoolSize(),
                60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(config.getQueueSize()));

        /**
         * Start the thread that loads pending tasks
         */
        bossPool.execute(new Loader());

        /**
         * Start the task scheduling thread
         */
        bossPool.execute(new Boss());
    }

    class Loader implements Runnable
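The listing above declares a `DelayQueue<DelayItem<Task>>`, but `DelayItem` itself is not shown. As a rough sketch of how such a queue hands over items only once they are due, the following stand-alone example may help. It is not the framework's code: `SimpleDelayItem` and `DelayQueueDemo` are hypothetical names, and only the JDK's `DelayQueue` and `Delayed` are real APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a delay item: wraps a payload and a due time.
class SimpleDelayItem<T> implements Delayed {
    private final T item;
    private final long dueAtMillis;

    SimpleDelayItem(T item, long delayMillis) {
        this.item = item;
        this.dueAtMillis = System.currentTimeMillis() + delayMillis;
    }

    T getItem() {
        return item;
    }

    // Remaining delay; the queue releases the item once this is <= 0.
    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    // Orders items so that the one due soonest sits at the head of the queue.
    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }
}

class DelayQueueDemo {

    // Enqueues two items out of order and takes them back in due order.
    static List<String> takeInDueOrder() {
        DelayQueue<SimpleDelayItem<String>> queue = new DelayQueue<>();
        queue.put(new SimpleDelayItem<>("late", 200));
        queue.put(new SimpleDelayItem<>("early", 50));
        List<String> order = new ArrayList<>();
        try {
            while (order.size() < 2) {
                // take() blocks until the head item's delay has expired.
                order.add(queue.take().getItem());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(takeInDueOrder());
    }
}
```

In a design like the one in the listing, one thread would presumably fill the queue while another blocks on `take()` and passes each due task to the worker pool; that division of work is an assumption here, since the rest of the class is not shown.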

OK, that is nearly everything. However, it is like Java design patterns: you rarely use most of them, but can you get by without knowing them? Definitely not. In the same way, some companies let us build our own solution, but what if yours does not give you that freedom? Then you still have to use something ready-made. So next, let's take a look at the commonly used distributed task scheduling systems that I have compiled.

Introduction to domestic platforms

1、opencron

https://gitee.com/benjobs/ope…

Opencron is a full-featured, general-purpose open-source scheduled task system. It offers advanced and reliable automatic task management and scheduling, provides a web-based graphical management interface, handles complex scheduled tasks across a variety of scenarios, and integrates features such as real-time Linux monitoring and WebSSH.

Do you need to run tasks on a schedule, and do you define them one by one in the Linux crontab? If so, you may have run into these problems:

  • Tasks need to be defined one by one in the crontab of each Linux server;
  • It is inconvenient to monitor the execution of tasks;
  • You have to log in to each machine to check the results of the scheduled tasks, which is a disaster once there are many machines;
  • It is very troublesome for multiple machines to cooperate on a task. How do you ensure that tasks on multiple machines are executed in sequence?
  • When a task fails and has to be run again, you have to redefine its execution time to trigger the rerun, then change it back to the normal time afterwards;
  • Killing a running task is troublesome: you have to look up the process first and then kill it.

2、LTS

https://gitee.com/hugui/light…

LTS, light task scheduler, is a distributed task scheduling framework, which supports real-time tasks, timed tasks and cron tasks. It has good scalability and extensibility, provides support for spring (including XML and annotations), and provides business loggers. It supports node monitoring, task execution monitoring and JVM monitoring, and supports dynamic submission, change and stop of tasks.

Complete sample code:

https://github.com/ltsopensou…

3、XXL-JOB

https://gitee.com/xuxueli0323…

http://www.xuxueli.com/xxl-job

XXL-JOB is a lightweight distributed task scheduling framework. Through its web pages you can perform CRUD operations on tasks, modify task status dynamically, pause and resume tasks, terminate running tasks, configure scheduled tasks online, and view scheduling results online.

4、Elastic-Job

https://gitee.com/elasticjob/…

Elastic-Job is a distributed scheduling solution made up of two independent subprojects, Elastic-Job-Lite and Elastic-Job-Cloud. Elastic-Job-Lite is positioned as a lightweight, decentralized solution that provides distributed task coordination in the form of a jar package. It supports distributed scheduling coordination, elastic scale-out and scale-in, failover, re-triggering of missed jobs, parallel scheduling, self-diagnosis and repair, and more.

5、Uncode-Schedule

https://gitee.com/uncode/unco…

Uncode-Schedule is a distributed task scheduling component based on ZooKeeper + Quartz / Spring Task. It ensures that each task is not executed repeatedly on different nodes in the cluster. It supports adding and deleting tasks dynamically, and an IP blacklist filters out nodes that should not run tasks.

Function overview:

  • Distributed task scheduling based on ZooKeeper + Spring Task / Quartz / Uncode Task.
  • Ensures that each task is not executed repeatedly on different nodes in the cluster.
  • When a single task node fails, its tasks are automatically transferred to other task nodes to continue execution.
  • ZooKeeper must be available when a task node starts. If the ZooKeeper cluster becomes unavailable while a task node is running, the node keeps running in the state it had before the outage until the ZooKeeper cluster recovers.
  • Supports adding, modifying, and deleting tasks dynamically, as well as pausing and restarting tasks.
  • An IP blacklist filters out nodes that should not run tasks.
  • Background management and task execution monitoring.
  • Supports Spring Boot, and supports running multiple instances of a single task (using an extension suffix).
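The combination of "no duplicate execution", "failover", and "IP blacklist" in the list above can be illustrated with a small, generic sketch. It does not use ZooKeeper and does not reproduce Uncode-Schedule's implementation: `OwnerSelector` and `chooseOwner` are hypothetical names, and the selection rule (hash of the task name over the sorted eligible nodes) is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: pick exactly one owner node per task among the live,
// non-blacklisted nodes. Recomputing after a node drops out gives failover.
class OwnerSelector {

    static Optional<String> chooseOwner(String taskName,
                                        Collection<String> liveNodeIps,
                                        Set<String> blacklist) {
        List<String> candidates = new ArrayList<>();
        for (String ip : liveNodeIps) {
            if (!blacklist.contains(ip)) {
                candidates.add(ip);
            }
        }
        if (candidates.isEmpty()) {
            return Optional.empty();
        }
        // Sorting makes every node compute the same owner from the same inputs.
        Collections.sort(candidates);
        int index = Math.floorMod(taskName.hashCode(), candidates.size());
        return Optional.of(candidates.get(index));
    }

    public static void main(String[] args) {
        List<String> live = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        Set<String> blacklist = new HashSet<>(Arrays.asList("10.0.0.2"));
        System.out.println("owner: " + chooseOwner("dailyReport", live, blacklist));
    }
}
```

If the current owner disappears from the live list, calling `chooseOwner` again with the remaining nodes yields a new owner, which is the essence of the failover item in the list.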

Module mechanism:

[Figure: Uncode-Schedule module mechanism]

6、Antares

https://github.com/ihaolin/an…

Antares is a distributed task scheduling management platform based on the Quartz mechanism, with the execution logic rewritten internally. A task is scheduled by only one node in the server cluster. Users can improve the efficiency of task execution by pre-sharding a task, and they can perform basic operations on tasks, such as triggering, pausing, and monitoring, through the console, Antares tower.
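As a rough illustration of what pre-sharding a task can mean, the sketch below splits a range of record positions into contiguous slices that could then be processed in parallel. It does not use or reproduce Antares' API; `RangeSharder` and `split` are hypothetical names, and contiguous near-equal ranges are just one possible sharding scheme.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: divide the range [0, total) into `shards` contiguous
// slices whose sizes differ by at most one.
class RangeSharder {

    // Each element is {startInclusive, endExclusive}.
    static List<long[]> split(long total, int shards) {
        if (total < 0 || shards <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and shards > 0");
        }
        List<long[]> result = new ArrayList<>();
        long base = total / shards;
        long remainder = total % shards;
        long start = 0;
        for (int i = 0; i < shards; i++) {
            // The first `remainder` shards take one extra element each.
            long size = base + (i < remainder ? 1 : 0);
            result.add(new long[] {start, start + size});
            start += size;
        }
        return result;
    }

    public static void main(String[] args) {
        for (long[] range : split(10, 3)) {
            System.out.println("[" + range[0] + ", " + range[1] + ")");
        }
    }
}
```

Each slice can be handed to a different worker, so a job over ten records with three shards is processed as three smaller, independent pieces.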

Overall architecture of Antares:

[Figure: overall architecture of Antares]

Task state machine in Antares:

[Figure: task state machine in Antares]

I think this roundup is fairly complete. Likes and follows are welcome. I will publish related articles later. Thank you.

Share this article with more people, then send me a private message with the word "information" to get the materials you need. As long as you ask and I have it, I will not be stingy.
Follow the official account: Java Architects Alliance, with daily technical updates
