This article is reproduced from: https://ververica.cn/develope…
Author: Zhou Kaibo (baoniu)
Overview of Flink architecture
Overview of Flink Architecture – job
Users write Flink jobs with the DataStream API, the DataSet API, SQL, or the Table API, and each program is translated into a JobGraph. A JobGraph is composed of operators such as source, map(), keyBy() / window() / apply(), and sink. Once the JobGraph is submitted to a Flink cluster, it can run in one of four modes: Local, Standalone, YARN, or Kubernetes.
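To make the operator chain concrete, here is a minimal sketch in plain Python. It does not use the Flink API; the function names, the sample events, and the 10-second tumbling window are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical stand-ins for Flink operators; plain Python, not the Flink API.
def source():
    # Emits (word, timestamp_in_seconds) events.
    return [("a", 1), ("b", 2), ("a", 3), ("a", 11), ("b", 12)]

def map_op(events):
    # map(): attach a count of 1 to every event.
    return [(word, ts, 1) for word, ts in events]

def key_by_window_apply(records, window_size=10):
    # keyBy() groups by word, window() buckets by tumbling window start,
    # apply() sums the counts inside each (key, window) bucket.
    buckets = defaultdict(int)
    for word, ts, count in records:
        window_start = (ts // window_size) * window_size
        buckets[(word, window_start)] += count
    return dict(buckets)

def sink(results):
    # sink: emit (word, window_start, total) tuples in a stable order.
    return sorted((word, start, total) for (word, start), total in results.items())

output = sink(key_by_window_apply(map_op(source())))
print(output)
```

Each function plays the role of one vertex in the graph, and the nested call at the end is the chain source, map, keyBy / window / apply, sink.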
Overview of Flink Architecture – JobManager
The main functions of the JobManager are as follows:
- Convert the JobGraph into an ExecutionGraph, and then run the ExecutionGraph.
- Schedule tasks, through the Scheduler component.
- Coordinate checkpoints for the whole job, including the start and completion of each checkpoint, through the Checkpoint Coordinator component.
- Communicate with the TaskManagers through the Actor System.
- Provide other functions, such as recovery metadata, which is read back when recovering from a failure.
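The first function in the list, converting the JobGraph into an ExecutionGraph, can be pictured with a small sketch. It assumes that the conversion mainly expands each logical operator into one execution vertex per parallel subtask; the `JobVertex` class and `to_execution_graph` function are hypothetical names, not Flink classes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JobVertex:
    # Hypothetical logical operator with a configured parallelism.
    name: str
    parallelism: int

def to_execution_graph(job_graph):
    """Expand each logical vertex into one execution vertex per parallel subtask."""
    return [
        f"{vertex.name} ({index + 1}/{vertex.parallelism})"
        for vertex in job_graph
        for index in range(vertex.parallelism)
    ]

job_graph = [JobVertex("source", 1), JobVertex("map", 2), JobVertex("sink", 1)]
execution_graph = to_execution_graph(job_graph)
print(execution_graph)
```

The resulting list is what the scheduler would place onto TaskManagers, one entry per parallel subtask.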
Overview of Flink Architecture – TaskManager
The TaskManager is responsible for executing the actual tasks. It is started after the JobManager has applied for resources. The main components of a TaskManager are:
- Memory & I/O Manager, which manages memory and I/O
- Network Manager, which manages the network
- Actor System, which is responsible for network communication
A TaskManager is divided into multiple task slots, and each task runs in one task slot. The task slot is the smallest unit of resource scheduling.
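The slot model can be sketched as a simple placement function. This is a simplified illustration that follows the one-task-per-slot description above; the `assign_slots` function and the TaskManager names are assumptions, and real slot assignment in Flink is more involved.

```python
def assign_slots(tasks, task_managers):
    """Place each task into the next free slot; fail if the cluster is full.

    task_managers maps a TaskManager name to its number of task slots.
    """
    free = [(tm, slot) for tm, count in task_managers.items() for slot in range(count)]
    if len(tasks) > len(free):
        raise RuntimeError("not enough free task slots")
    return {task: free[i] for i, task in enumerate(tasks)}

placement = assign_slots(["source", "map", "sink"], {"tm-1": 2, "tm-2": 2})
print(placement)
```

Because the slot is the unit of scheduling, the request fails as soon as there are more tasks than slots, regardless of how many TaskManagers exist.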
Before introducing YARN, let’s briefly look at the Flink Standalone mode, which makes the YARN and Kubernetes architectures easier to understand.
- In Standalone mode, the Master and the TaskManagers can run on the same machine or on different machines.
- In the Master process, the Standalone ResourceManager manages resources. When the user submits a JobGraph to the Master through the Flink cluster client, the JobGraph first passes through the Dispatcher.
- When the Dispatcher receives the client’s request, it creates a JobManager. The JobManager then applies for resources from the Standalone ResourceManager, and finally the TaskManagers are started.
- After a TaskManager starts, it goes through a registration process. Once registration completes, the JobManager distributes the actual tasks to the TaskManager for execution.
The above is how a job runs in Standalone mode.
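The last step of this flow, where tasks are distributed only after TaskManagers have registered, can be sketched as follows. The `JobManager` class here is a hypothetical stand-in, and the round-robin distribution is an assumption, since the article does not say how tasks are spread.

```python
class JobManager:
    """Hypothetical sketch: tracks registered TaskManagers and hands out tasks."""

    def __init__(self):
        self.task_managers = []

    def register(self, name):
        # Called by a TaskManager once it has started.
        self.task_managers.append(name)

    def deploy(self, tasks):
        if not self.task_managers:
            raise RuntimeError("no TaskManager has registered yet")
        # Round-robin is an assumption made for this sketch.
        return {
            task: self.task_managers[i % len(self.task_managers)]
            for i, task in enumerate(tasks)
        }

jm = JobManager()
jm.register("taskmanager-1")
jm.register("taskmanager-2")
deployment = jm.deploy(["source", "map", "sink"])
print(deployment)
```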
Flink runtime related components
Next, we summarize the basic architecture of Flink and its main runtime components:
- Client: Users submit jobs through SQL or the API, and a JobGraph is generated on submission.
- JobManager: After the JobManager receives the user’s request, it schedules the job and requests resources to start the TaskManagers.
- TaskManager: It is responsible for executing the actual tasks. The TaskManager registers with the JobManager, and when it receives tasks assigned by the JobManager, it starts executing them.
Flink on YARN principles and practice
YARN architecture principles – Overview
YARN mode is widely used in China, and most companies run it in production. We first introduce the architecture of YARN, because understanding it well makes it much easier to see how Flink runs on YARN.
The architecture of YARN is shown in the figure above. The most important role is the ResourceManager, which manages the resources of the whole cluster. The client is responsible for submitting jobs to the ResourceManager.
After the user submits a job on the client, it is first sent to the ResourceManager. The ResourceManager starts a container and then starts the ApplicationMaster in it, that is, the master node of the application. Once started, the ApplicationMaster applies to the ResourceManager for further resources. After the ResourceManager allocates them, the ApplicationMaster schedules the actual tasks for execution.
YARN architecture principles – Components
The components in a YARN cluster include:
- ResourceManager (RM): handles client requests, starts and monitors the ApplicationMaster, monitors the NodeManagers, and allocates and schedules resources. It contains the Scheduler and the Applications Manager.
- ApplicationMaster (AM): runs on a slave node and is responsible for data splitting, resource application and allocation, and task monitoring and fault tolerance.
- NodeManager (NM): runs on a slave node and is responsible for single-node resource management, communication with the AM and RM, and status reporting.
- Container: an abstraction of resources, including memory, CPU, disk, network, and other resources.
YARN architecture principles – Interaction
Taking a MapReduce job running on YARN as an example, the interaction within the YARN architecture works as follows:
- First, after the user writes the MapReduce code, the job is submitted through the client.
- After receiving the client’s request, the ResourceManager allocates a container for the ApplicationMaster and tells a NodeManager to start the ApplicationMaster in that container.
- After the ApplicationMaster starts, it sends a registration request to the ResourceManager. It then applies for resources from the ResourceManager and, based on the resources it obtains, contacts the relevant NodeManagers and asks them to start the program.
- One or more NodeManagers start the map / reduce tasks.
- The NodeManagers continuously report map / reduce task status and progress to the ApplicationMaster.
- When all map / reduce tasks are completed, the ApplicationMaster reports the completion to the ResourceManager and unregisters itself.
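The steps above can be traced with a small simulation. The `ResourceManager` class and `run_mapreduce` function below are hypothetical, heavily simplified stand-ins rather than YARN APIs; they only model container counts and the register / unregister handshake.

```python
class ResourceManager:
    """Hypothetical, heavily simplified stand-in for the YARN ResourceManager."""

    def __init__(self, total_containers):
        self.free = total_containers
        self.registered = set()

    def allocate(self, count):
        # Grant as many containers as are still free, up to the request.
        granted = min(count, self.free)
        self.free -= granted
        return granted

    def register(self, app):
        self.registered.add(app)

    def unregister(self, app, containers):
        # The application logs off and its containers return to the pool.
        self.registered.discard(app)
        self.free += containers


def run_mapreduce(rm, app, tasks):
    events = []
    am_containers = rm.allocate(1)  # first container hosts the ApplicationMaster
    if am_containers == 0:
        raise RuntimeError("no container available for the ApplicationMaster")
    events.append("AM started")
    rm.register(app)
    events.append("AM registered")
    workers = rm.allocate(tasks)  # containers for the map / reduce tasks
    events.append(f"{workers} task containers started")
    events.append("all tasks finished")
    rm.unregister(app, am_containers + workers)
    events.append("AM unregistered")
    return events


rm = ResourceManager(total_containers=5)
events = run_mapreduce(rm, "wordcount", tasks=3)
print(events, rm.free)
```

After the run, every container is back in the pool and the application is no longer registered, which mirrors the final step in the list.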
Flink on YARN – Per Job
In Per Job mode, resources are requested each time a job is submitted and released once that job completes. With the YARN principles in mind, the Per Job flow is easy to follow:
- First, the client submits the YARN app, for example the JobGraph or the JARs.
- Next, the YARN ResourceManager allocates the first container. The ApplicationMaster process starts in this container and runs the Flink programs, namely the Flink YARN ResourceManager and the JobManager.
- Finally, the Flink YARN ResourceManager applies for resources from the YARN ResourceManager. When the resources are allocated, the TaskManagers are started. After a TaskManager starts, it registers with the Flink YARN ResourceManager. Once registration succeeds, the JobManager assigns the actual tasks to the TaskManager, which begins execution.
Flink on YARN – Session
In Per Job mode, all resources, including the JobManager and the TaskManagers, are released after the job finishes. Session mode is different: its Dispatcher and ResourceManager are reused. In Session mode, when the Dispatcher receives a request, it starts JobManager (A), which brings up its TaskManagers, and later starts JobManager (B) and the corresponding TaskManagers. When jobs A and B complete, the resources are not released. Session mode is also known as multithreading mode. Its defining characteristic is that resources stay allocated and are not released. Multiple JobManagers share one Dispatcher, and they also share the Flink YARN ResourceManager.
Session mode and Per Job mode suit different scenarios. Per Job mode is better for jobs that run for a long time and are not sensitive to startup time. Session mode suits short-running jobs, typically batch jobs. If short jobs are run in Per Job mode, resources must be requested for every job, released when it finishes, and requested again for the next one. That overhead makes Per Job mode a poor fit for such workloads and Session mode the better choice.
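A back-of-the-envelope calculation shows why startup cost dominates for short jobs. The sketch assumes jobs run one after another and that bringing up resources takes a fixed time; the numbers (30 s startup, 10 s short jobs, a 24-hour long job) are made up for illustration.

```python
def per_job_total(jobs, startup_s, run_s):
    # Every job pays the resource startup cost before it runs.
    return jobs * (startup_s + run_s)

def session_total(jobs, startup_s, run_s):
    # The session cluster starts once and is reused by every job.
    return startup_s + jobs * run_s

# Assumed workload 1: 100 short jobs, 30 s startup, 10 s of work each.
short_jobs = (per_job_total(100, 30, 10), session_total(100, 30, 10))

# Assumed workload 2: one long-running job, 30 s startup, 24 hours of work.
long_job = (per_job_total(1, 30, 86_400), session_total(1, 30, 86_400))

print(short_jobs)  # (4000, 1030) seconds
print(long_job)    # (86430, 86430) seconds
```

Under these assumptions, the short-job workload takes almost four times as long in Per Job mode, while for the single long-running job the startup cost is negligible in either mode.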
Characteristics of YARN mode
The advantages of YARN mode are as follows:
- Unified resource management and scheduling. The resources (memory, CPU, disk, network, etc.) of all nodes in a YARN cluster are abstracted as containers. When a computing framework needs resources for its tasks, it applies to the ResourceManager for containers, and YARN schedules resources and allocates containers according to a specific policy. YARN can improve cluster resource utilization through a variety of scheduling strategies, for example the FIFO Scheduler, Capacity Scheduler, and Fair Scheduler, and it also allows task priorities to be set.
- Resource isolation. YARN uses cgroups, a lightweight resource isolation mechanism, to isolate resources and avoid mutual interference. Once the resources used by a container exceed the predefined upper limit, the container is killed.
- Automatic failover. Examples include YARN NodeManager monitoring and YARN ApplicationManager exception recovery.
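To illustrate how the scheduling policy changes who gets containers, the sketch below contrasts a FIFO-style and a fair-style allocation. Both functions are toy models written for this article, not the actual YARN scheduler implementations, and the application names and container counts are assumptions.

```python
def fifo_allocate(requests, capacity):
    """Serve applications in submission order until containers run out."""
    grants = {}
    for app, wanted in requests:
        granted = min(wanted, capacity)
        grants[app] = granted
        capacity -= granted
    return grants

def fair_allocate(requests, capacity):
    """Hand out one container at a time, round-robin, so apps share capacity."""
    wanted = dict(requests)
    grants = {app: 0 for app in wanted}
    while capacity > 0:
        progressed = False
        for app in grants:
            if capacity > 0 and grants[app] < wanted[app]:
                grants[app] += 1
                capacity -= 1
                progressed = True
        if not progressed:
            break  # every request is satisfied
    return grants

requests = [("app-a", 8), ("app-b", 4)]
fifo_result = fifo_allocate(requests, capacity=8)
fair_result = fair_allocate(requests, capacity=8)
print(fifo_result, fair_result)
```

With 8 containers, the FIFO-style policy gives all of them to the first application and starves the second, while the fair-style policy gives each application 4.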
Although YARN mode has many advantages, it also has disadvantages. For example, the operation, maintenance, and deployment costs are high, and it is not flexible enough.