HA: When an RM (ResourceManager) starts, it tries to write a lock file under the /mrstore path in ZooKeeper (ZK). If the write succeeds, it becomes the active RM; otherwise, it becomes the standby RM. After startup, the active RM writes job information to /mrstore. The ZKFC thread in each RM process monitors the lock file in /mrstore: if the lock file no longer exists, the RM tries to create it and become active; if it still exists, the RM stays standby. After a switchover, the new active RM can read the job information back from /mrstore.
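The lock-file election described above can be sketched with an in-memory stand-in for ZK. This is a minimal illustration, not YARN's actual implementation: `FakeZk`, `ResourceManager`, `try_become_active`, and the lock name `/mrstore/lock` are all hypothetical, and the sketch assumes the lock disappears when its owner goes down.

```python
class FakeZk:
    """In-memory stand-in for a ZooKeeper client (hypothetical, not the real API)."""

    def __init__(self):
        self.nodes = {}

    def create_if_absent(self, path, owner):
        # Mimics an atomic create: succeeds only if the node does not exist yet.
        if path in self.nodes:
            return False
        self.nodes[path] = owner
        return True

    def delete(self, path):
        self.nodes.pop(path, None)


class ResourceManager:
    LOCK_PATH = "/mrstore/lock"  # hypothetical lock-file name under /mrstore

    def __init__(self, name, zk):
        self.name = name
        self.zk = zk
        self.state = "standby"

    def try_become_active(self):
        # Called at startup and whenever the monitor finds the lock missing.
        if self.zk.create_if_absent(self.LOCK_PATH, self.name):
            self.state = "active"
        elif self.zk.nodes.get(self.LOCK_PATH) != self.name:
            self.state = "standby"
        return self.state

    def crash(self):
        # Assumption: the lock is removed when its owner goes down.
        if self.zk.nodes.get(self.LOCK_PATH) == self.name:
            self.zk.delete(self.LOCK_PATH)
        self.state = "down"


zk = FakeZk()
rm1, rm2 = ResourceManager("rm1", zk), ResourceManager("rm2", zk)
print(rm1.try_become_active())  # active
print(rm2.try_become_active())  # standby
rm1.crash()
print(rm2.try_become_active())  # active
```

Only one RM can hold the lock at a time, so at most one RM is active. The standby takes over only after the lock is gone.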
Function: When a client submits a job, it contacts the RM. If it reaches the standby RM, it retries against the active RM. The RM allocates and schedules resources according to the job context and the status information collected from the NMs (NodeManagers), and starts a container to run the AM (ApplicationMaster). Each NM is responsible for starting containers, monitoring resource usage on its node, and reporting to the RM through the heartbeat mechanism. The job's tasks run in containers.
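The submission flow above can be sketched as a toy model. Everything here is an assumption for illustration (`FakeRM`, `StandbyError`, `submit_with_failover`, and memory as the only resource); the real scheduler is far more involved.

```python
class StandbyError(Exception):
    """Raised by a standby RM that cannot accept jobs (hypothetical)."""


class FakeRM:
    def __init__(self, name, active):
        self.name = name
        self.active = active
        self.free_mb = {}  # node name -> free memory from the last heartbeat

    def heartbeat(self, node, free_mb):
        # An NM reports its current free memory to the RM.
        self.free_mb[node] = free_mb

    def submit(self, job, am_mb):
        if not self.active:
            raise StandbyError(self.name)
        # Pick the first node (by name) with enough free memory for the AM container.
        for node in sorted(self.free_mb):
            if self.free_mb[node] >= am_mb:
                self.free_mb[node] -= am_mb
                return (self.name, node)
        raise RuntimeError("no node has enough free memory")


def submit_with_failover(rms, job, am_mb):
    # Try each configured RM in order until the active one accepts the job.
    for rm in rms:
        try:
            return rm.submit(job, am_mb)
        except StandbyError:
            continue
    raise RuntimeError("no active RM found")


standby, active = FakeRM("rm1", False), FakeRM("rm2", True)
active.heartbeat("nm1", 512)
active.heartbeat("nm2", 4096)
result = submit_with_failover([standby, active], "wordcount", 1024)
print(result)  # ('rm2', 'nm2')
```

The client does not need to know in advance which RM is active. It walks the configured list, and the standby rejects the request.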
RM and NM are like the managers of an office building: RM is the building manager and NM is the area manager. Together they provide containers (offices).
AM is like a project manager, who applies to the building managers for offices, executes tasks in the containers they provide, and completes the job submitted by the customer.