What is taskctl
Batch scheduling automation is an indispensable technology behind data integration in the era of big data. Data is gold: it is an important asset for society as a whole and for enterprises, and using it well starts with managing it well. Batch scheduling automation is what makes that management dependable. In data warehouses, data marts and data pools of every size, it keeps large volumes of data loading, extraction, storage, cleaning, filtering, rough processing and fine processing running in an orderly and efficient way. Without batch scheduling automation, ETL work such as data management and data integration is like a large company without leadership: all work becomes disordered, inefficient and out of control.
Batch scheduling automation is as important to data integration and ETL of all kinds as leadership is to a company. Like an excellent professional manager, it is not tied to any industry: it is a purely technical system that is independent of the business. Making this technology independent, systematic, specialized, tool-based and productized will therefore greatly help the whole ETL and data integration field.
Taskctl is a professional product built on batch scheduling automation technology. The product is novel in concept, complete as a system, comprehensive in function, simple to use and smooth in operation, and its design sets it apart in the industry. It has a complete scheduling core, flexible extensibility and a complete application system. Compared with similar products in the industry, it makes a breakthrough in process design and flow chart display, and it is considerably more intuitive graphically, easier to operate and more flexible.
The standard taskctl product adopts a typical client/server (C/S) model. The application layer is the client and the control layer is the server, which in turn carries out scheduling control of the target layer.
By function, the application layer is divided into Admin, Designer and Monitor. By application channel, it is divided into a desktop client and a background character-interface client. For further convenience, the server also provides a rich set of command-line control commands.
The control layer is a multi-level pyramid architecture. At the top is the service control node, which performs scheduling service control and provides operation services to clients. Below it, the agent layer handles control interaction with the target servers (ETL servers and so on). The agent layer can also schedule and control servers deployed in a cluster by cascading master and slave agents, which achieves load balancing.
The target layer is what the whole product controls, such as ETL servers and workstations.
Logical architecture of core components
The core of the product is built on independently developed core technologies: no database is used for storage, components communicate and trigger one another entirely through events (a message queue), and dynamic data is held entirely in memory. In the logical architecture, each component corresponds to one system process, and the core functions are carried out by these processes, each with its own role, working together in an orderly manner.
Ten features and functions
Scheduling support at the 100,000-job scale
Since version 2.0, taskctl has been positioned as basic enterprise-level scheduling software. It can schedule and control tasks at the 100,000 level, which meets the scheduling scale requirements of major enterprises.
Integration with various technology platforms and scheduling of various job types
Taskctl is an open scheduling platform. To support and extend to task programs such as DataStage, Informatica, Kettle, all-in-one machines, big data platforms, stored procedures, Java and various scripts, and to keep the handling of different task types uniform, taskctl controls jobs through a plug-in driving mechanism. This lets it schedule and control different technology platforms and different job types.
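The plug-in driving mechanism described above can be pictured with a minimal Python sketch. This is not taskctl's actual plug-in interface, which this article does not show: the names register_plugin, run_job and PLUGINS, and the job type strings, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

# Hypothetical registry: job type name -> function that runs a job of that type.
# taskctl's real plug-in interface is not described in this article.
PLUGINS: Dict[str, Callable[[str], int]] = {}


def register_plugin(job_type: str):
    """Decorator that registers a runner for one job type."""
    def decorator(runner: Callable[[str], int]) -> Callable[[str], int]:
        PLUGINS[job_type] = runner
        return runner
    return decorator


@register_plugin("shell")
def run_shell(program: str) -> int:
    # A real plug-in would launch the script; here we only simulate it.
    print(f"[shell] would run: {program}")
    return 0


@register_plugin("kettle")
def run_kettle(program: str) -> int:
    # Placeholder for a plug-in that would hand the job to Kettle.
    print(f"[kettle] would run transformation: {program}")
    return 0


def run_job(job_type: str, program: str) -> int:
    """Dispatch a job to the plug-in registered for its type."""
    try:
        runner = PLUGINS[job_type]
    except KeyError:
        raise ValueError(f"no plug-in registered for job type {job_type!r}")
    return runner(program)


if __name__ == "__main__":
    run_job("shell", "load_orders.sh")
    run_job("kettle", "clean_orders.ktr")
```

The point of the design is that the scheduler core only calls run_job; supporting a new job type means registering one more runner, with no change to the dispatching code.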
Multi-level high availability (HA), distributed, load-balanced enterprise-level features
To ensure high availability and scalability, the core of the product uses a hierarchical architecture. The server (the scheduling control center) and the agent coordinate to carry out complex scheduling control. Distributed cluster deployment of servers and agents then provides the enterprise features of high availability and load balancing.
Rich application channels and complete application system
Taskctl divides its application functions by type into Admin, Designer and Monitor. It also divides them by application channel into a C/S desktop client, a C/S character-interface client and a B/S monitoring application. Together they form a complete application system, and users can choose the client channel that suits their working habits and their environment.
Flexible user rights management
To control operations on each workflow resource, taskctl adopts the user management mechanism of the operating system. Taskctl treats each designed process as a file-like object, and read, write and operate permissions on it can be granted separately to the owner, the owner's group and other users. This lets users flexibly grant permissions on different processes in different projects.
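Assuming the permission model mirrors Unix file permissions, as the paragraph above suggests, a permission check might look like the following sketch. The classes, the bit layout and all names here are illustrative assumptions, not taskctl's actual data model.

```python
from dataclasses import dataclass
from enum import IntFlag


class Perm(IntFlag):
    """Hypothetical permission bits, modelled on Unix rwx."""
    OPERATE = 1
    WRITE = 2
    READ = 4


@dataclass(frozen=True)
class User:
    name: str
    group: str


@dataclass(frozen=True)
class Process:
    name: str
    owner: str
    group: str
    owner_perm: Perm
    group_perm: Perm
    other_perm: Perm


def allowed(user: User, process: Process, wanted: Perm) -> bool:
    """Pick the owner, group or other permission set, then test the bits."""
    if user.name == process.owner:
        granted = process.owner_perm
    elif user.group == process.group:
        granted = process.group_perm
    else:
        granted = process.other_perm
    return (granted & wanted) == wanted


daily_load = Process(
    name="daily_load",
    owner="alice",
    group="etl",
    owner_perm=Perm.READ | Perm.WRITE | Perm.OPERATE,
    group_perm=Perm.READ | Perm.OPERATE,
    other_perm=Perm.READ,
)

print(allowed(User("alice", "etl"), daily_load, Perm.WRITE))    # owner may edit
print(allowed(User("bob", "etl"), daily_load, Perm.WRITE))      # group may not edit
print(allowed(User("carol", "report"), daily_load, Perm.READ))  # others may read
```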
Multi-level organization structure of process jobs
Process job information is the core information for scheduling. To manage and control it effectively, taskctl organizes job information in a multi-level hierarchy of theme application, process (sub-process) and module, which makes the job information structure of the whole platform clearer and easier to manage and control.
Powerful core scheduling function
① Flexible process triggering
The start of a workflow can be triggered by file arrival, custom cycle timing (n minutes, n hours, n days, etc.) and custom events.
② Complete scheduling control strategy
Relationship strategy: Jobs and job flows can run in parallel, be mutually exclusive, or depend on one another in arbitrary ways. In particular, through serial execution, single-point dependencies, event dependencies and user-defined conditions, the system can control dependencies between any jobs within a job flow and across different job flows, ETL job servers, business dates and batches.
Scheduling strategy: Jobs can be scheduled on any natural calendar or logical date, and a single process can mix the natural calendar with multiple logical dates.
Fault-tolerance strategy: A job can be rerun automatically after an error, with a specified number of reruns. When the retries are exhausted, the system can also decide automatically whether the task counts as passed or failed.
Powerful custom policy: User-defined conditions determine whether a task runs, is ignored or waits. Conditions can be evaluated with the built-in functions provided by the system or with a custom script.
Flexible parameter transfer: Users can define global variables and process-private variables for macro replacement, job parameter transfer and passing variable information between processes. In addition, the return value of one task can be passed as an input parameter to another task.
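Three of the strategies above (dependency control, automatic rerun and passing one task's return value to another) can be illustrated together in a small Python sketch. It uses the standard library's graphlib.TopologicalSorter (Python 3.9 or later) for dependency ordering. The Job class, the max_retries field and the run_flow function are hypothetical; they do not reflect taskctl's own job definition format or scheduling core, and the sketch runs jobs serially rather than in parallel.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


@dataclass
class Job:
    """Hypothetical job: an action plus dependency and retry settings."""
    name: str
    # Receives the return values of the jobs it depends on, keyed by job name.
    action: Callable[[Dict[str, object]], object]
    depends_on: List[str] = field(default_factory=list)
    max_retries: int = 0  # extra attempts allowed after the first failure


def run_flow(jobs: List[Job]) -> Dict[str, object]:
    """Run jobs in dependency order, retrying failures up to max_retries."""
    by_name = {job.name: job for job in jobs}
    graph = {job.name: set(job.depends_on) for job in jobs}
    results: Dict[str, object] = {}

    # static_order() yields every job after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        job = by_name[name]
        inputs = {dep: results[dep] for dep in job.depends_on}
        for attempt in range(job.max_retries + 1):
            try:
                results[name] = job.action(inputs)
                break
            except Exception as exc:
                print(f"{name}: attempt {attempt + 1} failed: {exc}")
                if attempt == job.max_retries:
                    raise RuntimeError(f"job {name!r} failed") from exc
    return results


attempts = {"count": 0}


def flaky_extract(_inputs: Dict[str, object]) -> int:
    # Fails once, then succeeds, to exercise the rerun logic.
    attempts["count"] += 1
    if attempts["count"] < 2:
        raise IOError("source not ready")
    return 120  # pretend 120 rows were extracted


def load(inputs: Dict[str, object]) -> str:
    # The return value of "extract" arrives as an input parameter.
    return f"loaded {inputs['extract']} rows"


flow = [
    Job("load", load, depends_on=["extract"]),
    Job("extract", flaky_extract, max_retries=2),
]
print(run_flow(flow))
```

Although "load" is listed first, the topological order runs "extract" before it, and the first failed attempt of "extract" is retried before the flow continues.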
All-round real-time monitoring of job operation
To show the status of jobs in real time, taskctl monitors all jobs on the platform through real-time refresh, graphics, statistics from multiple angles and by multiple criteria, SMS and other methods, so that users can promptly see which jobs are running, which have failed or raised warnings, and the reasons for errors.
Flexible manual intervention maintenance
Manual intervention is an essential function of an automated scheduling system. Users can manually pause and reset a process, set breakpoints, rerun a job, force a job to succeed, or ignore a job and let it pass. Users can also start a process in free mode to run any job, or any branch of jobs, by hand.
All of the above can be done through taskctl's graphical user interface. It brings these functions together in one intuitive interface, so users can get started quickly without learning a variety of commands or a job definition language.