explaining the interactions of topology spread constraints and node affinity/selector (#29632)
* explaining the interactions of topology spread constraints and node affinity/selector
  Signed-off-by: RinkiyaKeDad <arshsharma461@gmail.com>
* updates from code review
  Signed-off-by: RinkiyaKeDad <arshsharma461@gmail.com>
* more updates from code reviews
  Signed-off-by: RinkiyaKeDad <arshsharma461@gmail.com>
parent c2f0ae3f05
commit 69be6060ca
@ -230,20 +230,9 @@ If you apply "two-constraints.yaml" to this cluster, you will notice "mypod" sta
To overcome this situation, you can either increase the `maxSkew` or modify one of the constraints to use `whenUnsatisfiable: ScheduleAnyway`.

### Conventions
### Interaction With Node Affinity and Node Selectors

There are some implicit conventions worth noting here:

- Only the Pods holding the same namespace as the incoming Pod can be matching candidates.

- Nodes without `topologySpreadConstraints[*].topologyKey` present will be bypassed. It implies that:

  1. the Pods located on those nodes do not affect the `maxSkew` calculation - in the above example, if "node1" does not have the label "zone", its 2 Pods are disregarded, so the incoming Pod is scheduled into "zoneA".
  2. the incoming Pod has no chance of being scheduled onto such nodes - in the above example, if a "node5" carrying the label `{zone-typo: zoneC}` joins the cluster, it is bypassed because it lacks the label key "zone".

- Be aware of what will happen if the incoming Pod's `topologySpreadConstraints[*].labelSelector` doesn't match its own labels. In the above example, if we remove the incoming Pod's labels, it can still be placed onto "zoneB" since the constraints are still satisfied. However, after the placement, the degree of imbalance of the cluster remains unchanged - zoneA still has 2 Pods which hold label {foo:bar}, and zoneB still has 1 Pod which holds label {foo:bar}. So if this is not what you expect, we recommend that the workload's `topologySpreadConstraints[*].labelSelector` match its own labels.

- If the incoming Pod has `spec.nodeSelector` or `spec.affinity.nodeAffinity` defined, nodes not matching them will be bypassed.
The scheduler will exclude the non-matching nodes from the skew calculations if the incoming Pod has `spec.nodeSelector` or `spec.affinity.nodeAffinity` defined.

Suppose you have a 5-node cluster ranging from zoneA to zoneC:

@ -283,6 +272,21 @@ There are some implicit conventions worth noting here:
{{< codenew file="pods/topology-spread-constraints/one-constraint-with-nodeaffinity.yaml" >}}

The scheduler doesn't have prior knowledge of all the zones or other topology domains that a cluster has. They are determined from the existing nodes in the cluster. This could lead to a problem in autoscaled clusters when a node pool (or node group) is scaled to zero nodes and the user is expecting it to scale up, because in this case those topology domains won't be considered until there is at least one node in them.

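The node filtering and domain discovery described above can be illustrated with a small simulation. This is a minimal sketch, not the scheduler's implementation: the Pod counts per node, the helper names `domain_counts` and `allowed_domains`, and the stand-in affinity rule `not_in_zone_c` are all assumptions made for illustration, and the skew check assumes skew is the number of matching Pods in a domain minus the minimum across eligible domains.

```python
# Hypothetical 5-node cluster: node name -> node labels.
NODES = {
    "node1": {"zone": "zoneA"},
    "node2": {"zone": "zoneA"},
    "node3": {"zone": "zoneB"},
    "node4": {"zone": "zoneB"},
    "node5": {"zone": "zoneC"},
}
# Assumed number of Pods matching the constraint's labelSelector on each node.
MATCHING_PODS = {"node1": 1, "node2": 1, "node3": 1}


def domain_counts(nodes, matching_pods, topology_key, node_filter=None):
    """Count matching Pods per topology domain, using only eligible nodes."""
    counts = {}
    for name, labels in nodes.items():
        if topology_key not in labels:
            continue  # node lacks the topology key: bypassed
        if node_filter is not None and not node_filter(labels):
            continue  # node rejected by nodeSelector / nodeAffinity: bypassed
        domain = labels[topology_key]
        counts[domain] = counts.get(domain, 0) + matching_pods.get(name, 0)
    return counts


def allowed_domains(counts, max_skew):
    """Domains where adding one matching Pod keeps the skew within max_skew."""
    if not counts:
        return set()
    minimum = min(counts.values())
    return {d for d, n in counts.items() if (n + 1) - minimum <= max_skew}


def not_in_zone_c(labels):
    """Stand-in for an assumed nodeAffinity rule: zone NotIn [zoneC]."""
    return labels.get("zone") != "zoneC"


# Without any node filter, the empty zoneC is the only domain within maxSkew 1.
print(allowed_domains(domain_counts(NODES, MATCHING_PODS, "zone"), 1))
# With the filter, zoneC is left out of the calculation entirely.
print(allowed_domains(domain_counts(NODES, MATCHING_PODS, "zone", not_in_zone_c), 1))
```

In this sketch a domain exists only if some node currently carries it, so deleting "node5" from `NODES` makes "zoneC" disappear from the counts, which mirrors the scaled-to-zero node pool situation.
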
### Other Noticeable Semantics

There are some implicit conventions worth noting here:

- Only the Pods holding the same namespace as the incoming Pod can be matching candidates.

- The scheduler will bypass the nodes without `topologySpreadConstraints[*].topologyKey` present. This implies that:

  1. the Pods located on those nodes do not affect the `maxSkew` calculation - in the above example, if "node1" does not have the label "zone", its 2 Pods are disregarded, so the incoming Pod is scheduled into "zoneA".
  2. the incoming Pod has no chance of being scheduled onto such nodes - in the above example, if a "node5" carrying the label `{zone-typo: zoneC}` joins the cluster, it is bypassed because it lacks the label key "zone".

- Be aware of what will happen if the incoming Pod's `topologySpreadConstraints[*].labelSelector` doesn't match its own labels. In the above example, if we remove the incoming Pod's labels, it can still be placed onto "zoneB" since the constraints are still satisfied. However, after the placement, the degree of imbalance of the cluster remains unchanged - zoneA still has 2 Pods which hold label {foo:bar}, and zoneB still has 1 Pod which holds label {foo:bar}. So if this is not what you expect, we recommend that the workload's `topologySpreadConstraints[*].labelSelector` match its own labels.
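
The conventions in this list can be sketched as a counting function. This is an illustrative model only: the Pod records, the `zone` field (set to `None` when the node has no "zone" label), and the helper names `matches` and `count_by_zone` are hypothetical, and only `matchLabels`-style selectors are modeled.

```python
def matches(selector, labels):
    """True if every key/value of a matchLabels-style selector is in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def count_by_zone(pods, namespace, selector):
    """Count Pods per zone that are candidates for the skew calculation."""
    counts = {}
    for pod in pods:
        if pod["namespace"] != namespace:
            continue  # other namespaces are never matching candidates
        if pod["zone"] is None:
            continue  # the node lacks the topology key, so its Pods are ignored
        if matches(selector, pod["labels"]):
            counts[pod["zone"]] = counts.get(pod["zone"], 0) + 1
    return counts


SELECTOR = {"foo": "bar"}
# Hypothetical existing Pods; "zone" is the zone label of the hosting node.
EXISTING = [
    {"namespace": "default", "labels": {"foo": "bar"}, "zone": "zoneA"},
    {"namespace": "default", "labels": {"foo": "bar"}, "zone": "zoneA"},
    {"namespace": "default", "labels": {"foo": "bar"}, "zone": "zoneB"},
    {"namespace": "other", "labels": {"foo": "bar"}, "zone": "zoneB"},
    {"namespace": "default", "labels": {"foo": "bar"}, "zone": None},
]

# An incoming Pod whose own labels do not match its labelSelector.
unlabeled = {"namespace": "default", "labels": {}, "zone": "zoneB"}
# An incoming Pod whose labels do match its labelSelector.
labeled = {"namespace": "default", "labels": {"foo": "bar"}, "zone": "zoneB"}

print(count_by_zone(EXISTING, "default", SELECTOR))
print(count_by_zone(EXISTING + [unlabeled], "default", SELECTOR))
print(count_by_zone(EXISTING + [labeled], "default", SELECTOR))
```

Placing the unlabeled Pod leaves the per-zone counts unchanged, while placing the labeled Pod raises the count for "zoneB", which is why matching the workload's own labels is recommended.
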

### Cluster-level default constraints

It is possible to set default topology spread constraints for a cluster. Default