Before diving into Task Manager, here are some concepts it relies on.
In the graph database Nebula Graph, some tasks run in the background for a long time; we call these jobs. Operations used by DBAs at the storage layer, such as running a global compaction after data has been imported, all fall into the job category.
As a distributed system, a job in Nebula Graph is completed by multiple storaged hosts. We call the subtask of a job that runs on a single storaged host a task. The Job Manager on metad is responsible for job control, while the Task Manager on storaged is responsible for task control.
In this article, we focus on how these long-running tasks are managed and scheduled to further improve database performance.
Problems to be solved by Task Manager
As mentioned above, the tasks controlled by the Task Manager on storaged are subtasks of the jobs controlled by metad. So what problems does Task Manager specifically solve? In Nebula Graph, it mainly solves the following two:
- Replace the previous HTTP-based transmission with RPC (Thrift)
When building a cluster, most users know that the Thrift protocol is used for communication between storaged hosts, and they open the firewall for the ports Thrift requires. However, they may not realize that Nebula Graph previously also needed an HTTP port; we have seen community users forget to open it many times.
- Give storaged the ability to schedule tasks
This is described in the sections below.
The position of Task Manager in Nebula Graph
Meta in the Task Manager system
In the Task Manager system, the role of metad (the Job Manager) is to select the appropriate storaged hosts for a job request sent from graphd, then assemble the task requests and send them to those hosts. It is not hard to see that the meta-side logic (accepting job requests, assembling task requests, sending them, and collecting task results) is stable. However, how a task request is assembled, and which storaged hosts it is sent to, varies from job to job. The Job Manager therefore uses the template method pattern plus a simple factory to accommodate future expansion: a future job only needs to inherit from MetaJobExecutor and implement the prepare() and execute() methods.
Scheduling control of Task Manager
As mentioned earlier, Task Manager's scheduling control aims to achieve two things:
- When system resources are sufficient, execute tasks with as much concurrency as possible
- When system resources are tight, keep the resources occupied by all running tasks below a set threshold
High-concurrency task execution
Task Manager calls the threads it holds workers. Task Manager has a real-life analogue: the business hall of a bank. Imagine the steps when we go to a bank to do business:
- Scene 1: take a number from the ticket machine at the door
- Scene 2: find a seat in the hall and wait for your number to be called while playing with your phone
- Scene 3: when your number is called, go to the designated window
Meanwhile, you may run into situations like these:
- Scene 4: a VIP can jump the queue
- Scene 5: you may give up on the business for some reason while waiting in line
- Scene 6: the bank may close while you are still in line
Putting this in order, the basic requirements of Task Manager are:
- Tasks are executed in FIFO order; different tasks have different priorities, and high-priority tasks can jump the queue
- A user can cancel a task that is waiting in the queue
- storaged may shut down at any time
- To run with as much concurrency as possible, a task is split into multiple subtasks; the subtask is what each worker actually executes
- Task Manager is a global singleton, so multithreading safety must be considered
So we have the following implementation:
- Implementation 1: a task is identified by the jobId and taskId in the Thrift structure; together they are called the task handle.
- Implementation 2: Task Manager has a blocking queue that queues task handles (the ticket machine); the blocking queue itself is thread safe.
- Implementation 3: the blocking queue supports different priorities; higher priority comes out first (the VIP queue-jumping feature).
- Implementation 4: Task Manager maintains a globally unique map whose key is the task handle and whose value is the task itself (the bank hall). Nebula Graph uses folly's ConcurrentHashMap as the thread-safe map.
- Implementation 5: if a user cancels a task, the corresponding task is found in the map via its handle and marked as cancelled; the handle already in the queue is left alone and simply skipped later.
- Implementation 6: if a task is running, a storaged shutdown waits until the currently running subtasks of that task have finished.
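Implementations 1 through 5 can be sketched with the standard library alone. Note this is a simplified illustration: Nebula Graph itself uses folly's thread-safe queue and ConcurrentHashMap, and the TaskHandle/Task/TaskManager shapes below are assumptions for demonstration, not the real data structures.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <unordered_map>

// Implementation 1: jobId + taskId identify a task (the task handle).
struct TaskHandle {
  int32_t jobId;
  int32_t taskId;
  bool operator==(const TaskHandle& o) const {
    return jobId == o.jobId && taskId == o.taskId;
  }
};

struct HandleHash {
  size_t operator()(const TaskHandle& h) const {
    return std::hash<int64_t>()((int64_t(h.jobId) << 32) |
                                uint32_t(h.taskId));
  }
};

struct Task {
  int priority = 0;        // higher value = VIP, jumps the queue
  bool cancelled = false;  // implementation 5: cancel is just a mark
};

class TaskManager {
 public:
  void addTask(TaskHandle h, Task t) {
    std::lock_guard<std::mutex> g(lock_);
    tasks_[h] = t;                 // implementation 4: the "bank hall" map
    queue_.push({t.priority, h});  // implementations 2/3: priority queue
  }

  // Implementation 5: mark the task cancelled in the map; its handle
  // stays in the queue and is skipped when a worker pops it.
  void cancel(TaskHandle h) {
    std::lock_guard<std::mutex> g(lock_);
    auto it = tasks_.find(h);
    if (it != tasks_.end()) it->second.cancelled = true;
  }

  // Pop the highest-priority handle whose task is not cancelled.
  bool next(TaskHandle& out) {
    std::lock_guard<std::mutex> g(lock_);
    while (!queue_.empty()) {
      TaskHandle h = queue_.top().handle;
      queue_.pop();
      auto it = tasks_.find(h);
      if (it != tasks_.end() && !it->second.cancelled) {
        out = h;
        return true;
      }
      tasks_.erase(h);  // drop cancelled entries lazily
    }
    return false;
  }

 private:
  struct Entry {
    int priority;
    TaskHandle handle;
    bool operator<(const Entry& o) const { return priority < o.priority; }
  };
  std::mutex lock_;
  std::priority_queue<Entry> queue_;  // max-heap: VIPs come out first
  std::unordered_map<TaskHandle, Task, HandleHash> tasks_;
};
```

The key design choice here is that the queue holds only lightweight handles, while the map holds the tasks themselves; cancellation never has to search inside the queue.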
Limiting the resource threshold occupied by tasks
Keeping resource usage below the threshold is simple: workers are threads, so as long as all workers come from a single thread pool, the maximum number of workers is bounded. The tricky part is distributing subtasks evenly among the workers. Let's discuss a few schemes:
Method 1: add tasks by round robin
The simplest way is round robin: after a task is split into subtasks, the subtasks are appended to the available workers in turn.
But this may cause problems. For example, suppose there are three workers and two tasks (task 1 in blue, task 2 in yellow):
(Round robin, diagram 1)
If task 2's subtasks execute much faster than task 1's, a good parallel strategy would look like this:
(Round robin, diagram 2)
Simple, crude round robin makes the completion time of task 2 depend on task 1 (see round robin diagram 1).
Method 2: a dedicated group of workers handles one task
To address the problem in method 1, dedicated workers can be assigned to handle only a specified task, avoiding interdependence between tasks. But this is still not good enough:
It is difficult to ensure that every subtask takes roughly the same time to execute. If subtask 1 runs significantly slower than the others, a good execution strategy would look like this:
This scheme cannot avoid the problem of one worker struggling with a slow subtask while the other workers sit idle.
Method 3: the solution adopted by Nebula Graph
In Nebula Graph, Task Manager hands a task's handle to N workers, where N is determined by the total number of workers, the total number of subtasks, and the concurrency parameter specified by the DBA when submitting the job.
Each task maintains its own blocking queue (the subtask queue in the figure below) that stores its subtasks. During execution, a worker finds the task via the handle it holds, then fetches subtasks from that task's blocking queue.
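The worker-side flow just described (look up the task by handle, then drain its subtask queue) can be sketched as follows. The names TaskWithQueue and runWorker are assumptions for illustration, and a plain mutex-guarded deque stands in for folly's blocking queue.

```cpp
#include <deque>
#include <mutex>
#include <string>
#include <unordered_map>

struct SubTask {
  int id;
};

// Each task owns its own thread-safe subtask queue, so subtasks of
// different tasks never interleave on one worker.
class TaskWithQueue {
 public:
  void addSubTask(SubTask s) {
    std::lock_guard<std::mutex> g(m_);
    q_.push_back(s);
  }
  bool nextSubTask(SubTask& out) {
    std::lock_guard<std::mutex> g(m_);
    if (q_.empty()) return false;
    out = q_.front();
    q_.pop_front();
    return true;
  }

 private:
  std::mutex m_;
  std::deque<SubTask> q_;
};

// Worker loop body: resolve the handle to a task, then drain that
// task's queue. Returns the number of subtasks executed.
int runWorker(std::unordered_map<std::string, TaskWithQueue>& tasks,
              const std::string& handle) {
  int done = 0;
  auto it = tasks.find(handle);
  if (it == tasks.end()) return done;  // unknown or removed task
  SubTask s;
  while (it->second.nextSubTask(s)) {
    ++done;  // the subtask body would execute here
  }
  return done;
}
```

When several workers hold the same handle, they all pull from the same per-task queue, so a fast task is never blocked behind a slow one and idle workers naturally pick up the remaining subtasks.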
Question 1: why not put the tasks directly into the blocking queue, instead of splitting things in two, keeping tasks in the map and queuing only the handles?
Mainly because the C++ multithreading infrastructure does not support that logic well. A task needs to support cancellation: if tasks themselves were placed in the blocking queue, the queue would need the ability to locate a particular task inside it, and folly's blocking queues have no such interface.
Question 2: what kind of job gets VIP treatment?
At present, Task Manager supports compaction and rebuild index, neither of which is sensitive to execution time. Support for query-like operations such as count(*) is still under development. Since users want count(*) to finish in a relatively short time, if storaged happens to be running multiple compactions, you would still want count(*) to run first rather than wait until all the compactions finish.
If you find any mistakes or omissions in this article, feel free to open an issue on GitHub: https://github.com/vesoft-inc/nebula or leave your feedback on the official forum: https://discuss.nebula-graph.com.cn/
To join the Nebula Graph community group, please contact the official Nebula Graph WeChat assistant: nebula graphbot
A word from the author: Hi, I'm Li (lionel.liu), a research and development engineer at Nebula Graph with a strong interest in database query engines. I hope this writeup helps you, and I'd be glad to have any mistakes pointed out. Thanks!