Pod affinity in the k8s container scheduling system

Time: 2021-12-30

Earlier, we learned how NetworkPolicy resources work on k8s and how to use them. For a review, please refer to: https://www.cnblogs.com/qiuhom-1874/p/14227660.html. Today, let's talk about pod scheduling strategies.

There is a very important component on k8s called kube-scheduler. Its main job is to watch the apiserver for pod resources whose nodeName field is empty; an empty field means the pod has not yet been scheduled. kube-scheduler then picks, from the many nodes in the cluster, the node best suited to run the pod according to the pod's definition and related attributes, fills that node's host name into the pod's nodeName field, and saves the pod definition back to the apiserver. The apiserver then notifies the kubelet on the node named in the nodeName field; that kubelet reads the pod definition from the apiserver and calls the local docker (container runtime) to run the pod according to the attributes in the manifest. The kubelet then reports the pod status back to the apiserver, and the apiserver persists that status information to etcd. In this whole process, kube-scheduler's role is simply to schedule the pod and report the scheduling decision to the apiserver. The question then becomes: how does kube-scheduler decide which of the many nodes is most suitable to run a given pod?
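You can observe the result of this process on any running pod: the scheduler's decision ends up in the pod's spec.nodeName field. For example (replace the placeholder with a real pod name):

kubectl get pod <pod-name> -o jsonpath='{.spec.nodeName}'   # prints the node chosen by the scheduler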

The scheduler's working logic is to schedule pods according to its scheduling algorithms; different algorithms produce different results because they use different evaluation criteria. When the scheduler finds an unscheduled pod on the apiserver, it runs every node in the cluster through the predicate (pre-selection) functions one by one, filtering out the nodes that cannot run the pod; we call this the predicate phase. The remaining nodes that can run the pod enter the next phase, priority: each priority function scores each of these nodes, and the scores from all priority functions are summed per node. The node with the highest total score is the scheduling result; if several nodes tie for the highest score, the scheduler randomly picks one of them as the node that will run the pod. We call this last step the select phase. In short, scheduling goes through three stages: the first, predicate, filters out and eliminates the nodes that cannot run the pod; the second, priority, scores the remaining nodes with the priority functions and keeps the highest scorers; the third, select, randomly picks one node from the highest-scoring nodes as the node that will finally run the pod. The general process is shown in the figure below.

Tip: the predicate phase is a one-vote veto: as soon as any predicate function rejects a node, that node is eliminated. The nodes that pass pre-selection enter the priority phase, where each priority function scores every node and the per-node scores are summed. Finally, the scheduler picks the node with the highest total score as the scheduling result; if several nodes share the highest score, it randomly selects one of them and reports the final result to the apiserver.

Factors affecting scheduling

nodeName: nodeName is the most direct way to influence pod scheduling. As noted above, the scheduler decides whether a pod still needs scheduling by checking whether its nodeName field is empty. If the user explicitly sets nodeName in the pod manifest, the scheduler is bypassed: because nodeName is not empty, the scheduler considers the pod already scheduled and ignores it. This method manually binds a pod to a specific node.

nodeSelector: nodeSelector is looser than nodeName but is still an important factor in scheduling. If a pod manifest specifies a nodeSelector, only nodes whose labels satisfy that selector can run the pod; if no node satisfies the selector, the pod stays in the Pending state.

Node affinity: node affinity expresses a pod's affinity for nodes, i.e. which nodes the pod prefers, or prefers not, to run on. Compared with nodeName and nodeSelector, its scheduling logic is more fine-grained.

Pod affinity: pod affinity expresses the affinity between pods, i.e. which other pod or pods a given pod prefers to be placed together with. The opposite, a pod preferring not to be placed with certain pods, is pod anti-affinity. "Together" here means in the same location as those pods, where the location can be divided by host name, rack, zone, and so on. Defining whether pods should or should not share a location therefore becomes the criterion for deciding where a pod may run.

Taints and tolerations: a taint is a mark placed on a node, and a toleration is a pod's tolerance of a node's taints. A pod can run on a node only if it tolerates that node's taints; otherwise it cannot. This method schedules pods by combining node taints with the pods' tolerations of them, as shown in the sketch below.
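Taints are not demonstrated later in this post, so here is a minimal sketch of the idea, assuming a hypothetical taint key env with value dev applied to node04.k8s.org; the pod below declares a matching toleration and can therefore still be scheduled onto that node:

kubectl taint node node04.k8s.org env=dev:NoSchedule   # taint the node (hypothetical key/value)

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-toleration
spec:
  # Tolerate the taint above; pods without this toleration
  # will no longer be scheduled onto node04.k8s.org.
  tolerations:
  - key: "env"
    operator: "Equal"
    value: "dev"
    effect: "NoSchedule"
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent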

Example: using nodeName scheduling policy

[[email protected] ~]# cat pod-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
spec:
  nodeName: node01.k8s.org
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
[[email protected] ~]# 

Tip: nodeName directly specifies the node the pod runs on, bypassing the default scheduler; the manifest above says to run the nginx pod on the node node01.k8s.org.

Apply the manifest

[[email protected] ~]# kubectl apply -f pod-demo.yaml
pod/nginx-pod created
[[email protected] ~]# kubectl get pods -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod   1/1     Running   0          10s   10.244.1.28   node01.k8s.org              
[[email protected] ~]#

Tip: you can see that the pod runs on the node we manually specified.

Example: using the nodeSelector scheduling policy

[[email protected] ~]# cat pod-demo-nodeselector.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeselector
spec:
  nodeSelector:
    disktype: ssd
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
[[email protected] ~]# 

Tip: nodeSelector matches node labels. If a node carries the corresponding label, the pod can be scheduled to it; otherwise it cannot. If no node satisfies the selector, the pod stays Pending until some node gets the required label, at which point it is scheduled there.

Apply the manifest

[[email protected] ~]# kubectl apply -f pod-demo-nodeselector.yaml
pod/nginx-pod-nodeselector created
[[email protected] ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          9m38s   10.244.1.28   node01.k8s.org              
nginx-pod-nodeselector   0/1     Pending   0          16s                                   
[[email protected] ~]#

Tip: you can see that the pod stays in the Pending state; the reason is that none of the k8s nodes carries the label required by its nodeSelector.
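If you want to confirm why it is pending, the pod's events record the scheduler's reason; a quick way to check (the exact message wording varies between Kubernetes versions) is:

kubectl describe pod nginx-pod-nodeselector
# In the Events section, look for a FailedScheduling warning roughly like:
#   0/5 nodes are available: 5 node(s) didn't match node selector.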

Verification: label node02 and see whether the pod gets dispatched to node02.

[[email protected] ~]# kubectl get nodes --show-labels
NAME               STATUS   ROLES                  AGE   VERSION   LABELS
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master01.k8s.org,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=
node01.k8s.org     Ready                     29d   v1.20.0   app=nginx-1.14-alpine,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node01.k8s.org,kubernetes.io/os=linux
node02.k8s.org     Ready                     29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node02.k8s.org,kubernetes.io/os=linux
node03.k8s.org     Ready                     29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node03.k8s.org,kubernetes.io/os=linux
node04.k8s.org     Ready                     19d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node04.k8s.org,kubernetes.io/os=linux
[[email protected] ~]# kubectl label node node02.k8s.org disktype=ssd
node/node02.k8s.org labeled
[[email protected] ~]# kubectl get nodes --show-labels               
NAME               STATUS   ROLES                  AGE   VERSION   LABELS
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=master01.k8s.org,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=
node01.k8s.org     Ready                     29d   v1.20.0   app=nginx-1.14-alpine,beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node01.k8s.org,kubernetes.io/os=linux
node02.k8s.org     Ready                     29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node02.k8s.org,kubernetes.io/os=linux
node03.k8s.org     Ready                     29d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node03.k8s.org,kubernetes.io/os=linux
node04.k8s.org     Ready                     19d   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node04.k8s.org,kubernetes.io/os=linux
[[email protected] ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          12m     10.244.1.28   node01.k8s.org              
nginx-pod-nodeselector   1/1     Running   0          3m26s   10.244.2.18   node02.k8s.org              
[[email protected] ~]#

Tip: you can see that once node02 is labeled disktype=ssd, the pod is scheduled to run on node02.

Example: using the nodeAffinity scheduling policy under affinity

[[email protected] ~]# cat pod-demo-affinity-nodeaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: Exists
            values: []
        - matchExpressions:
          - key: disktype
            operator: Exists
            values: []
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 10
          preference:
            matchExpressions:
            - key: foo
              operator: Exists
              values: []
        - weight: 2
          preference:
            matchExpressions:
            - key: disktype
              operator: Exists
              values: []
[[email protected] ~]# 

Tip: nodeAffinity supports two kinds of constraints. The first is the hard limit, defined with the requiredDuringSchedulingIgnoredDuringExecution field. This field is an object that contains only a nodeSelectorTerms field, which is a list: in each term you can use matchExpressions to define expressions matching node labels (the operators allowed in an expression are In, NotIn, Exists, DoesNotExist, Gt and Lt; Gt and Lt compare the label value numerically, Exists and DoesNotExist test whether the label key exists, and In and NotIn test whether the label value is in a given set), and you can use matchFields to match node fields instead. A hard limit means the pod can only be scheduled to a node that satisfies the defined node label expressions or node field selectors; if no node satisfies them, the pod stays suspended. The second kind is the soft limit, defined with the preferredDuringSchedulingIgnoredDuringExecution field. This field is a list; in each entry, weight defines the weight of that soft limit, which the scheduler adds to a node's total score when it calculates node scores, and preference defines the matching condition for that soft limit. In other words, when scheduling, the scheduler adds the corresponding weight to the total score of nodes that match the preference. A soft limit only comes into play when the hard limit is matched by more than one node: it is a second-level restriction on top of the hard limit, meaning that when several nodes satisfy the hard limit, the nodes matching the soft limits are preferred. If the weights and conditions in the soft limits cannot produce a single highest-scoring node, the default scheduling mechanism randomly selects one of the top-scoring nodes as the result; if they do produce a single highest-scoring node, that node is the final result. In short, soft limits are used together with hard limits, helping to pick among the nodes the hard limits allow. If only soft limits are used, the pod is preferentially scheduled to nodes matching the higher-weighted conditions; if the weights are equal, the scheduler picks the node with the highest final score according to its default rules.
The example above says the hard limit for running the pod is that the node must carry a label whose key is foo or a label whose key is disktype; if no node matches the hard limit, the pod is not scheduled and stays Pending.
Once the hard limit is matched, a node matching the soft-limit key foo gets 10 added to its total score, and a node matching the key disktype gets 2 added; that is, among the soft limits the pod prefers nodes labeled with the key foo. Note that nodeAffinity has no node anti-affinity; to express anti-affinity, use the NotIn or DoesNotExist operators in the matching conditions.

Apply the manifest

[[email protected] ~]# kubectl get nodes -L foo,disktype     
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0         
node01.k8s.org     Ready                     29d   v1.20.0         
node02.k8s.org     Ready                     29d   v1.20.0         ssd
node03.k8s.org     Ready                     29d   v1.20.0         
node04.k8s.org     Ready                     19d   v1.20.0         
[[email protected] ~]# kubectl apply -f pod-demo-affinity-nodeaffinity.yaml
pod/nginx-pod-nodeaffinity created
[[email protected] ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          122m   10.244.1.28   node01.k8s.org              
nginx-pod-nodeaffinity   1/1     Running   0          7s     10.244.2.22   node02.k8s.org              
nginx-pod-nodeselector   1/1     Running   0          113m   10.244.2.18   node02.k8s.org              
[[email protected] ~]#

Tip: you can see that after applying the manifest, the pod is scheduled to node02; the reason is that node02 carries a node label with the key disktype, which satisfies the pod's hard limit.

Verification: delete the pod and the disktype label on node02, apply the manifest again, and see how the pod is scheduled.

[[email protected] ~]# kubectl delete -f pod-demo-affinity-nodeaffinity.yaml
pod "nginx-pod-nodeaffinity" deleted
[[email protected] ~]# kubectl label node node02.k8s.org disktype-
node/node02.k8s.org labeled
[[email protected] ~]# kubectl get pods 
NAME                     READY   STATUS    RESTARTS   AGE
nginx-pod                1/1     Running   0          127m
nginx-pod-nodeselector   1/1     Running   0          118m
[[email protected] ~]# kubectl get node -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0         
node01.k8s.org     Ready                     29d   v1.20.0         
node02.k8s.org     Ready                     29d   v1.20.0         
node03.k8s.org     Ready                     29d   v1.20.0         
node04.k8s.org     Ready                     19d   v1.20.0         
[[email protected] ~]# kubectl apply -f pod-demo-affinity-nodeaffinity.yaml
pod/nginx-pod-nodeaffinity created
[[email protected] ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          128m   10.244.1.28   node01.k8s.org              
nginx-pod-nodeaffinity   0/1     Pending   0          9s                                   
nginx-pod-nodeselector   1/1     Running   0          118m   10.244.2.18   node02.k8s.org              
[[email protected] ~]#

Tip: you can see that after deleting the original pod and the label on node02 and applying the manifest again, the pod stays in the Pending state; the reason is that no k8s node satisfies the pod's hard limit, so the pod cannot be scheduled.

Verification: delete the pod, label node01 with the key foo and node03 with the key disktype, then apply the manifest again and see how the pod is scheduled.

[[email protected] ~]# kubectl delete -f pod-demo-affinity-nodeaffinity.yaml
pod "nginx-pod-nodeaffinity" deleted
[[email protected] ~]# kubectl label node node01.k8s.org foo=bar
node/node01.k8s.org labeled
[[email protected] ~]# kubectl label node node03.k8s.org disktype=ssd
node/node03.k8s.org labeled
[[email protected] ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0         
node01.k8s.org     Ready                     29d   v1.20.0   bar   
node02.k8s.org     Ready                     29d   v1.20.0         
node03.k8s.org     Ready                     29d   v1.20.0         ssd
node04.k8s.org     Ready                     19d   v1.20.0         
[[email protected] ~]# kubectl apply -f pod-demo-affinity-nodeaffinity.yaml
pod/nginx-pod-nodeaffinity created
[[email protected] ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          132m   10.244.1.28   node01.k8s.org              
nginx-pod-nodeaffinity   1/1     Running   0          5s     10.244.1.29   node01.k8s.org              
nginx-pod-nodeselector   1/1     Running   0          123m   10.244.2.18   node02.k8s.org              
[[email protected] ~]#

Tip: it can be seen that when the hard-limit conditions are matched by multiple nodes, the pod is preferentially scheduled to the node matching the soft-limit condition with the larger weight; that is, when the hard limit alone cannot single out a node, the higher-weighted soft-limit conditions decide the scheduling.

Verification: delete the foo label on node01 and see whether the pod is evicted or rescheduled to another node.

[[email protected] ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0         
node01.k8s.org     Ready                     29d   v1.20.0   bar   
node02.k8s.org     Ready                     29d   v1.20.0         
node03.k8s.org     Ready                     29d   v1.20.0         ssd
node04.k8s.org     Ready                     19d   v1.20.0         
[[email protected] ~]# kubectl label node node01.k8s.org foo-
node/node01.k8s.org labeled
[[email protected] ~]# kubectl get nodes -L foo,disktype     
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0         
node01.k8s.org     Ready                     29d   v1.20.0         
node02.k8s.org     Ready                     29d   v1.20.0         
node03.k8s.org     Ready                     29d   v1.20.0         ssd
node04.k8s.org     Ready                     19d   v1.20.0         
[[email protected] ~]# kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                1/1     Running   0          145m   10.244.1.28   node01.k8s.org              
nginx-pod-nodeaffinity   1/1     Running   0          12m    10.244.1.29   node01.k8s.org              
nginx-pod-nodeselector   1/1     Running   0          135m   10.244.2.18   node02.k8s.org              
[[email protected] ~]#

Tip: it can be seen that once the pod is running, it is neither evicted nor rescheduled even though the node no longer satisfies the pod's hard limit. This shows that node affinity takes effect only at scheduling time ("IgnoredDuringExecution"): after scheduling completes, the pod stays put even if the node later stops satisfying its node affinity. In short, nodeAffinity decides the initial placement only; it does not trigger rescheduling.
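The manifest above combines hard and soft limits. If you want only a preference with no hard requirement, a minimal sketch of a soft-limit-only nodeAffinity, assuming the same foo and disktype label keys, could look like the following; any node remains eligible, nodes carrying these labels simply score higher:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity-preferred
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      # Soft limits only: no requiredDuringSchedulingIgnoredDuringExecution,
      # so the pod can still be scheduled to a node with neither label.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 10
        preference:
          matchExpressions:
          - key: foo
            operator: Exists
      - weight: 2
        preference:
          matchExpressions:
          - key: disktype
            operator: Exists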

How node affinity rules take effect

1. When nodeAffinity and nodeSelector are used together, the relationship between them is AND: a node must satisfy both conditions at the same time before the pod can be scheduled to it.

Example: define a pod scheduling policy using both nodeAffinity and nodeSelector

[[email protected] ~]# cat pod-demo-affinity-nodesector.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity-nodeselector
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: Exists
            values: []
  nodeSelector:
    disktype: ssd
[[email protected] ~]# 

Tip: the manifest above says the pod should run only on a node that carries both a node label with the key foo and the node label disktype=ssd.

Apply the manifest

[[email protected] ~]# kubectl get nodes -L foo,disktype    
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0         
node01.k8s.org     Ready                     29d   v1.20.0         
node02.k8s.org     Ready                     29d   v1.20.0         
node03.k8s.org     Ready                     29d   v1.20.0         ssd
node04.k8s.org     Ready                     19d   v1.20.0         
[[email protected] ~]# kubectl apply -f pod-demo-affinity-nodesector.yaml
pod/nginx-pod-nodeaffinity-nodeselector created
[[email protected] ~]# kubectl get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                             1/1     Running   0          168m   10.244.1.28   node01.k8s.org              
nginx-pod-nodeaffinity                1/1     Running   0          35m    10.244.1.29   node01.k8s.org              
nginx-pod-nodeaffinity-nodeselector   0/1     Pending   0          7s                                   
nginx-pod-nodeselector                1/1     Running   0          159m   10.244.2.18   node02.k8s.org              
[[email protected] ~]#

Tip: you can see that after the pod is created it stays in the Pending state; the reason is that no node carries both a label with the key foo and the label disktype=ssd, so the pod cannot be scheduled and is left suspended.

2. When nodeAffinity specifies multiple nodeSelectorTerms at the same time, the relationship between them is OR; that is, each matchExpressions list specifies its own matching conditions, and a node needs to satisfy only one of them.

[[email protected] ~]# cat pod-demo-affinity2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity2
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: Exists
            values: []
        - matchExpressions:
          - key: disktype
            operator: Exists
            values: []
[[email protected] ~]# 

Tip: the example above says the pod must run on a node that carries a node label with the key foo or one with the key disktype.

Apply the manifest

[[email protected] ~]# kubectl get nodes -L foo,disktype
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0         
node01.k8s.org     Ready                     29d   v1.20.0         
node02.k8s.org     Ready                     29d   v1.20.0         
node03.k8s.org     Ready                     29d   v1.20.0         ssd
node04.k8s.org     Ready                     19d   v1.20.0         
[[email protected] ~]# kubectl apply -f pod-demo-affinity2.yaml
pod/nginx-pod-nodeaffinity2 created
[[email protected] ~]# kubectl get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE    IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                             1/1     Running   0          179m   10.244.1.28   node01.k8s.org              
nginx-pod-nodeaffinity                1/1     Running   0          46m    10.244.1.29   node01.k8s.org              
nginx-pod-nodeaffinity-nodeselector   0/1     Pending   0          10m                                  
nginx-pod-nodeaffinity2               1/1     Running   0          6s     10.244.3.21   node03.k8s.org              
nginx-pod-nodeselector                1/1     Running   0          169m   10.244.2.18   node02.k8s.org              
[[email protected] ~]#

Tip: you can see that the pod is scheduled to run on node03; it can run there because node03 satisfies the condition of carrying a node label whose key is foo or whose key is disktype (it has disktype).

3. Within a single matchExpressions list, multiple conditions take an AND relationship; that is, each key entry specifies its own matching condition and a node must satisfy all of them.

Example: specify multiple conditions under one matchExpressions

[[email protected] ~]# cat pod-demo-affinity3.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod-nodeaffinity3
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: Exists
            values: []
          - key: disktype
            operator: Exists
            values: []
[[email protected] ~]# 

Tip: the manifest above says the pod must run on a node that carries both a node label with the key foo and a node label with the key disktype.

Apply the manifest

[[email protected] ~]# kubectl get nodes -L foo,disktype                 
NAME               STATUS   ROLES                  AGE   VERSION   FOO   DISKTYPE
master01.k8s.org   Ready    control-plane,master   29d   v1.20.0         
node01.k8s.org     Ready                     29d   v1.20.0         
node02.k8s.org     Ready                     29d   v1.20.0         
node03.k8s.org     Ready                     29d   v1.20.0         ssd
node04.k8s.org     Ready                     19d   v1.20.0         
[[email protected] ~]# kubectl apply -f pod-demo-affinity3.yaml
pod/nginx-pod-nodeaffinity3 created
[[email protected] ~]# kubectl get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES
nginx-pod                             1/1     Running   0          3h8m    10.244.1.28   node01.k8s.org              
nginx-pod-nodeaffinity                1/1     Running   0          56m     10.244.1.29   node01.k8s.org              
nginx-pod-nodeaffinity-nodeselector   0/1     Pending   0          20m                                   
nginx-pod-nodeaffinity2               1/1     Running   0          9m38s   10.244.3.21   node03.k8s.org              
nginx-pod-nodeaffinity3               0/1     Pending   0          7s                                    
nginx-pod-nodeselector                1/1     Running   0          179m    10.244.2.18   node02.k8s.org              
[[email protected] ~]#

Tip: you can see that after the pod is created it stays in the Pending state; the reason is that no node carries node labels with both the keys foo and disktype.

The working logic and usage of pod affinity are similar to those of node affinity. Pod affinity also has hard limits and soft limits, with the same logic: soft affinity rules assist the hard affinity rules in choosing the node the pod will run on. If the hard affinity conditions are not met, the pod can only stay suspended. If only soft affinity rules are used, the pod preferentially runs on the nodes matching the higher-weighted soft rules; if no node matches the soft rules, the default scheduling rules pick the highest-scoring node to run the pod.

Example: using the hard-limit scheduling policy of podAffinity under affinity

[[email protected] ~]# cat require-podaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-1
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["nginx"]}
        topologyKey: kubernetes.io/hostname
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
[[email protected] ~]# 

Tip: the manifest above shows the hard-limit usage of podAffinity. To define pod affinity you use the podAffinity field under spec.affinity. Its requiredDuringSchedulingIgnoredDuringExecution field defines the hard limit; this field is a list in which labelSelector selects, by label, the pods the new pod should be placed together with, and topologyKey defines how locations are divided, using a node label key. The manifest above therefore says the hard limit for running the myapp pod is that the target node (location divided by kubernetes.io/hostname) must already run a pod carrying the label app=nginx; wherever a pod labeled app=nginx runs, that is where myapp runs. If no such pod exists, this pod also stays in the Pending state.

Apply the manifest

[[email protected] ~]# kubectl get pods -L app -o wide
NAME        READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES   APP
nginx-pod   1/1     Running   0          8m25s   10.244.4.25   node04.k8s.org                          nginx
[[email protected] ~]# kubectl apply -f require-podaffinity.yaml
pod/with-pod-affinity-1 created
[[email protected] ~]# kubectl get pods -L app -o wide          
NAME                  READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES   APP
nginx-pod             1/1     Running   0          8m43s   10.244.4.25   node04.k8s.org                          nginx
with-pod-affinity-1   1/1     Running   0          6s      10.244.4.26   node04.k8s.org                          
[[email protected] ~]#

Tip: you can see that the pod runs on node04; the reason is that node04 already runs a pod with the label app=nginx, which satisfies the hard limit in podAffinity.

Verification: delete the two pods above, then apply the manifest again and see whether the pod can still run normally.

[[email protected] ~]# kubectl delete all --all
pod "nginx-pod" deleted
pod "with-pod-affinity-1" deleted
service "kubernetes" deleted
[[email protected] ~]# kubectl apply -f require-podaffinity.yaml
pod/with-pod-affinity-1 created
[[email protected] ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
with-pod-affinity-1   0/1     Pending   0          8s                     
[[email protected] ~]#

Tip: you can see that the pod is in the Pending state; the reason is that no node runs a pod labeled app=nginx, so the hard limit in podAffinity is not satisfied.

Example: using the soft-limit scheduling policy of podAffinity under affinity

[[email protected] ~]# cat prefernece-podaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-2
spec:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: rack
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: zone
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
[[email protected] ~]# 

Tip: the soft limit in podAffinity is defined with the preferredDuringSchedulingIgnoredDuringExecution field; weight defines the weight of the corresponding soft-limit condition, which is added to the final score of nodes that satisfy it. The manifest above says: dividing locations by the node label key rack, if a node's location runs a pod labeled app=db, add 80 to that node's total score; dividing locations by the node label key zone, if a node's location runs a pod labeled app=db, add 20 to its total score. If no node satisfies either condition, the default scheduling rules are used.

Apply the manifest

[[email protected] ~]# kubectl get node -L rack,zone                
NAME               STATUS   ROLES                  AGE   VERSION   RACK   ZONE
master01.k8s.org   Ready    control-plane,master   30d   v1.20.0          
node01.k8s.org     Ready                     30d   v1.20.0          
node02.k8s.org     Ready                     30d   v1.20.0          
node03.k8s.org     Ready                     30d   v1.20.0          
node04.k8s.org     Ready                     20d   v1.20.0          
[[email protected] ~]# kubectl get pods -o wide -L app              
NAME                  READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES   APP
with-pod-affinity-1   0/1     Pending   0          22m                                
[[email protected] ~]# kubectl apply -f prefernece-podaffinity.yaml 
pod/with-pod-affinity-2 created
[[email protected] ~]# kubectl get pods -o wide -L app             
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
with-pod-affinity-1   0/1     Pending   0          22m                                             
with-pod-affinity-2   1/1     Running   0          6s    10.244.4.28   node04.k8s.org                          
[[email protected] ~]#

Tip: you can see that the pod runs normally and is scheduled to node04. In this case the pod was not placed by the soft limits but by the default scheduling rules, because no node satisfies either soft-limit condition.

Verification: delete the pod, label node01 with a rack label and node03 with a zone label, run the pod again, and see how it is scheduled.

[[email protected] ~]# kubectl delete -f prefernece-podaffinity.yaml
pod "with-pod-affinity-2" deleted
[[email protected] ~]# kubectl label node node01.k8s.org rack=group1
node/node01.k8s.org labeled
[[email protected] ~]# kubectl label node node03.k8s.org zone=group2
node/node03.k8s.org labeled
[[email protected] ~]# kubectl get node -L rack,zone
NAME               STATUS   ROLES                  AGE   VERSION   RACK     ZONE
master01.k8s.org   Ready    control-plane,master   30d   v1.20.0            
node01.k8s.org     Ready                     30d   v1.20.0   group1   
node02.k8s.org     Ready                     30d   v1.20.0            
node03.k8s.org     Ready                     30d   v1.20.0            group2
node04.k8s.org     Ready                     20d   v1.20.0            
[[email protected] ~]# kubectl apply -f prefernece-podaffinity.yaml
pod/with-pod-affinity-2 created
[[email protected] ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
with-pod-affinity-1   0/1     Pending   0          27m                                 
with-pod-affinity-2   1/1     Running   0          9s    10.244.4.29   node04.k8s.org              
[[email protected] ~]#

Tip: you can see that the pod is still scheduled to node04, which shows that the location labels on the nodes alone do not affect the result (no pod labeled app=db is running yet).

Verification: delete the pod, create a pod labeled app=db on node01 and on node03, then apply the manifest again and see how the pod is scheduled.

[[email protected] ~]# kubectl apply -f prefernece-podaffinity.yaml
pod/with-pod-affinity-2 created
[[email protected] ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
with-pod-affinity-1   0/1     Pending   0          27m                                 
with-pod-affinity-2   1/1     Running   0          9s    10.244.4.29   node04.k8s.org              
[[email protected] ~]# 
[[email protected] ~]# kubectl delete -f prefernece-podaffinity.yaml
pod "with-pod-affinity-2" deleted
[[email protected] ~]# cat pod-demo.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: redis-pod1
  labels:
    app: db
spec:
  nodeSelector:
    rack: group1
  containers:
  - name: redis
    image: redis:4-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: redis
      containerPort: 6379
---
apiVersion: v1
kind: Pod
metadata:
  name: redis-pod2
  labels:
    app: db
spec:
  nodeSelector:
    zone: group2
  containers:
  - name: redis
    image: redis:4-alpine
    imagePullPolicy: IfNotPresent
    ports:
    - name: redis
      containerPort: 6379
[[email protected] ~]# kubectl apply -f pod-demo.yaml
pod/redis-pod1 created
pod/redis-pod2 created
[[email protected] ~]# kubectl get pods -L app -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-pod1            1/1     Running   0          34s   10.244.1.35   node01.k8s.org                          db
redis-pod2            1/1     Running   0          34s   10.244.3.24   node03.k8s.org                          db
with-pod-affinity-1   0/1     Pending   0          34m                                             
[[email protected] ~]# kubectl apply -f prefernece-podaffinity.yaml
pod/with-pod-affinity-2 created
[[email protected] ~]# kubectl get pods -L app -o wide             
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-pod1            1/1     Running   0          52s   10.244.1.35   node01.k8s.org                          db
redis-pod2            1/1     Running   0          52s   10.244.3.24   node03.k8s.org                          db
with-pod-affinity-1   0/1     Pending   0          35m                                             
with-pod-affinity-2   1/1     Running   0          9s    10.244.1.36   node01.k8s.org                          
[[email protected] ~]#

Tip: you can see that the pod runs on node01. The reason is that node01 runs a pod labeled app=db and also carries a node label with the key rack, so it satisfies the soft-limit condition with weight 80; the pod therefore prefers node01.

Example: using both the hard-limit and soft-limit scheduling policies of podAffinity under affinity

[[email protected] ~]# cat require-preference-podaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-3
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["db"]}
        topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: rack
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: zone
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
[[email protected] ~]# 

Tip: the manifest above says the pod must run on a node that already runs a pod labeled app=db; if no node satisfies this, the pod can only stay pending. If multiple nodes satisfy the hard limit, the soft limits decide among them: a node whose rack location runs an app=db pod gets 80 added to its total score, and a node whose zone location runs an app=db pod gets 20 added.

Apply the manifest

[[email protected] ~]# kubectl get pods -o wide -L app
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-pod1            1/1     Running   0          13m   10.244.1.35   node01.k8s.org                          db
redis-pod2            1/1     Running   0          13m   10.244.3.24   node03.k8s.org                          db
with-pod-affinity-1   0/1     Pending   0          48m                                             
with-pod-affinity-2   1/1     Running   0          13m   10.244.1.36   node01.k8s.org                          
[[email protected] ~]# kubectl apply -f require-preference-podaffinity.yaml
pod/with-pod-affinity-3 created
[[email protected] ~]# kubectl get pods -o wide -L app                     
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-pod1            1/1     Running   0          14m   10.244.1.35   node01.k8s.org                          db
redis-pod2            1/1     Running   0          14m   10.244.3.24   node03.k8s.org                          db
with-pod-affinity-1   0/1     Pending   0          48m                                             
with-pod-affinity-2   1/1     Running   0          13m   10.244.1.36   node01.k8s.org                          
with-pod-affinity-3   1/1     Running   0          6s    10.244.1.37   node01.k8s.org                          
[[email protected] ~]#

Tip: you can see that the pod is scheduled to run on node01; the reason is that node01 satisfies both the hard limit and the soft limit with the largest weight.

Verification: delete the pods above and apply the manifest again to see whether the pod still runs normally.

[[email protected] ~]# kubectl delete all --all
pod "redis-pod1" deleted
pod "redis-pod2" deleted
pod "with-pod-affinity-1" deleted
pod "with-pod-affinity-2" deleted
pod "with-pod-affinity-3" deleted
service "kubernetes" deleted
[[email protected] ~]# kubectl apply -f require-preference-podaffinity.yaml
pod/with-pod-affinity-3 created
[[email protected] ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
with-pod-affinity-3   0/1     Pending   0          5s                     
[[email protected] ~]#

Tip: you can see that after creation the pod is in the Pending state; the reason is that no node satisfies the hard limit for scheduling this pod, so it cannot be scheduled and can only wait.

Example: using the podAntiAffinity scheduling policy under affinity

[[email protected] ~]# cat require-preference-podantiaffinity.yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity-4
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["db"]}
        topologyKey: kubernetes.io/hostname
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: rack
      - weight: 20
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - {key: app, operator: In, values: ["db"]}
          topologyKey: zone
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
[[email protected] ~]# 

Tip: podAntiAffinity is used the same way as podAffinity, but its logic is the opposite: podAntiAffinity keeps the pod off nodes that match the conditions, while podAffinity runs the pod on nodes that match them. The manifest above says the pod must not run on a node that already runs a pod labeled app=db (the hard limit); in addition, the soft limits penalize, with weights 80 and 20, locations divided by the node label keys rack and zone that contain an app=db pod, so such nodes are avoided as well. In other words, the pod prefers nodes that meet none of these three conditions; if every node meets them, the pod can only stay suspended. If only soft limits were used, the pod would still run, landing on the node with the lowest penalized score.

Apply the manifest

[[email protected] ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
with-pod-affinity-3   0/1     Pending   0          22m                    
[[email protected] ~]# kubectl apply -f require-preference-podantiaffinity.yaml
pod/with-pod-affinity-4 created
[[email protected] ~]# kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES
with-pod-affinity-3   0/1     Pending   0          22m                                 
with-pod-affinity-4   1/1     Running   0          6s    10.244.4.30   node04.k8s.org              
[[email protected] ~]# kubectl get node -L rack,zone
NAME               STATUS   ROLES                  AGE   VERSION   RACK     ZONE
master01.k8s.org   Ready    control-plane,master   30d   v1.20.0            
node01.k8s.org     Ready                     30d   v1.20.0   group1   
node02.k8s.org     Ready                     30d   v1.20.0            
node03.k8s.org     Ready                     30d   v1.20.0            group2
node04.k8s.org     Ready                     20d   v1.20.0            
[[email protected] ~]#

Tip: you can see that the pod is scheduled to run on node04; the reason is that none of the three conditions above applies to node04 (no app=db pod, no rack or zone label). Of course, node02 would also be a candidate node for running this pod.

Verification: delete the pods above, run a pod labeled app=db on each of the four nodes, then apply the manifest again and see how the pod is scheduled.

[[email protected] ~]# kubectl delete all --all
pod "with-pod-affinity-3" deleted
pod "with-pod-affinity-4" deleted
service "kubernetes" deleted
[[email protected] ~]# cat pod-demo.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: redis-ds
  labels:
    app: db
spec:
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: redis
        image: redis:4-alpine
        ports:
        - name: redis
          containerPort: 6379
[[email protected] ~]# kubectl apply -f pod-demo.yaml
daemonset.apps/redis-ds created
[[email protected] ~]# kubectl get pods -L app -o wide
NAME             READY   STATUS    RESTARTS   AGE   IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-ds-4bnmv   1/1     Running   0          44s   10.244.2.26   node02.k8s.org                          db
redis-ds-c2h77   1/1     Running   0          44s   10.244.1.38   node01.k8s.org                          db
redis-ds-mbxcd   1/1     Running   0          44s   10.244.4.32   node04.k8s.org                          db
redis-ds-r2kxv   1/1     Running   0          44s   10.244.3.25   node03.k8s.org                          db
[[email protected] ~]# kubectl apply -f require-preference-podantiaffinity.yaml
pod/with-pod-affinity-5 created
[[email protected] ~]# kubectl get pods -o wide -L app
NAME                  READY   STATUS    RESTARTS   AGE     IP            NODE             NOMINATED NODE   READINESS GATES   APP
redis-ds-4bnmv        1/1     Running   0          2m29s   10.244.2.26   node02.k8s.org                          db
redis-ds-c2h77        1/1     Running   0          2m29s   10.244.1.38   node01.k8s.org                          db
redis-ds-mbxcd        1/1     Running   0          2m29s   10.244.4.32   node04.k8s.org                          db
redis-ds-r2kxv        1/1     Running   0          2m29s   10.244.3.25   node03.k8s.org                          db
with-pod-affinity-5   0/1     Pending   0          9s                                                
[[email protected] ~]#

Tip: you can see that the pod has no node to run on and stays in the Pending state; the reason is that every node now runs a pod labeled app=db, so every node is excluded by the hard anti-affinity limit.
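As a contrast to the hard anti-affinity above, a soft-limit-only podAntiAffinity is commonly used to spread replicas of one application across nodes. A minimal sketch, assuming a hypothetical Deployment whose pods carry the label app=myapp, might look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-spread
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      affinity:
        podAntiAffinity:
          # Soft limit only: prefer not to place two app=myapp pods on the
          # same node, but still schedule them together if there is no choice.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - {key: app, operator: In, values: ["myapp"]}
              topologyKey: kubernetes.io/hostname
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1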

From the verification above we can conclude: whether for affinity between pods and nodes or between pods and pods, as long as a hard affinity rule is defined, the pod runs only on a node satisfying that hard rule, and if no node satisfies it the pod stays suspended. If only soft affinity is defined, the pod preferentially runs on the node matching the higher-weighted soft condition; if no node matches the soft conditions, scheduling falls back to the default strategy and picks the highest-scoring node. The same logic applies to anti-affinity, except that when a node satisfies the hard or soft anti-affinity conditions, the pod avoids that node. One more note: when using pod-to-pod affinity in a cluster with many nodes, the rules should not be too fine-grained; overly fine topology rules make the scheduler consume far more resources filtering nodes for each pod and degrade the performance of the whole cluster. In large clusters, node affinity is recommended instead.