Merge pull request #26820 from ahg-g/nss
Document Pod affinity namespaceSelector
commit 0b32f897b2
@@ -189,6 +189,7 @@ Resources specified on the quota outside of the allowed set results in a validat
| `BestEffort` | Match pods that have best effort quality of service. |
| `NotBestEffort` | Match pods that do not have best effort quality of service. |
| `PriorityClass` | Match pods that reference the specified [priority class](/docs/concepts/configuration/pod-priority-preemption). |
| `CrossNamespacePodAffinity` | Match pods that have cross-namespace pod [(anti)affinity terms](/docs/concepts/scheduling-eviction/assign-pod-node). |

The `BestEffort` scope restricts a quota to tracking the following resource:

@@ -429,6 +430,63 @@ memory 0 20Gi
pods 0 10
```

### Cross-namespace Pod Affinity Quota

{{< feature-state for_k8s_version="v1.21" state="alpha" >}}

Operators can use the `CrossNamespacePodAffinity` quota scope to limit which namespaces are allowed to
have pods with affinity terms that cross namespaces. Specifically, it controls which pods are allowed
to set the `namespaces` or `namespaceSelector` fields in pod affinity terms.

Preventing users from using cross-namespace affinity terms might be desired, since a pod
with anti-affinity constraints can block pods from all other namespaces
from getting scheduled in a failure domain.

Using this scope, operators can prevent certain namespaces (`foo-ns` in the example below)
from having pods that use cross-namespace pod affinity by creating a resource quota object in
that namespace with the `CrossNamespacePodAffinity` scope and a hard limit of 0:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: disable-cross-namespace-affinity
  namespace: foo-ns
spec:
  hard:
    # No pod in foo-ns may use cross-namespace affinity terms.
    pods: "0"
  scopeSelector:
    matchExpressions:
    - scopeName: CrossNamespacePodAffinity
      operator: Exists
```

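For illustration, a pod like the following sketch would be matched by the `CrossNamespacePodAffinity` scope, because its anti-affinity term sets `namespaceSelector`. With a zero-limit quota such as the one above in place, creating it in `foo-ns` would be expected to fail quota admission. The pod name, labels, and image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cross-ns-anti-affinity        # hypothetical name
  namespace: foo-ns
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: database             # hypothetical label
        # An empty selector matches all namespaces, so this term crosses namespaces.
        namespaceSelector: {}
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: app
    image: registry.example/app:1.0   # hypothetical image
```
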
If operators want to disallow using `namespaces` and `namespaceSelector` by default, and
only allow it for specific namespaces, they could configure `CrossNamespacePodAffinity`
as a limited resource by setting the kube-apiserver flag `--admission-control-config-file`
to the path of the following configuration file:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: apiserver.config.k8s.io/v1
    kind: ResourceQuotaConfiguration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: CrossNamespacePodAffinity
        operator: Exists
```

With the above configuration, pods can use `namespaces` and `namespaceSelector` in pod affinity only
if the namespace where they are created has a resource quota object with the
`CrossNamespacePodAffinity` scope and a hard limit greater than or equal to the number of pods using those fields.

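As a sketch of the allow-list case, a namespace that should be permitted to run such pods could carry a quota like the one below. The namespace `bar-ns`, the quota name, and the limit of 10 are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: allow-cross-namespace-affinity   # hypothetical name
  namespace: bar-ns                      # hypothetical namespace
spec:
  hard:
    # Up to 10 pods in bar-ns may set namespaces or namespaceSelector.
    pods: "10"
  scopeSelector:
    matchExpressions:
    - scopeName: CrossNamespacePodAffinity
      operator: Exists
```
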
This feature is alpha and disabled by default. You can enable it by setting the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`PodAffinityNamespaceSelector` in both kube-apiserver and kube-scheduler.

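One possible way to set the gate on both components, assuming the cluster is bootstrapped with kubeadm (an assumption; other installers pass the `--feature-gates` flag differently), is a `ClusterConfiguration` along these lines:

```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Passed to kube-apiserver as --feature-gates.
    feature-gates: "PodAffinityNamespaceSelector=true"
scheduler:
  extraArgs:
    # Passed to kube-scheduler as --feature-gates.
    feature-gates: "PodAffinityNamespaceSelector=true"
```
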
## Requests compared to Limits {#requests-vs-limits}

When allocating compute resources, each container may specify a request and a limit value for either CPU or memory.

@@ -271,6 +271,18 @@ If omitted or empty, it defaults to the namespace of the pod where the affinity/
All `matchExpressions` associated with `requiredDuringSchedulingIgnoredDuringExecution` affinity and anti-affinity
must be satisfied for the pod to be scheduled onto a node.

#### Namespace selector
{{< feature-state for_k8s_version="v1.21" state="alpha" >}}

Users can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
The affinity term is applied to the union of the namespaces selected by `namespaceSelector` and the ones listed in the `namespaces` field.
Note that an empty `namespaceSelector` (`{}`) matches all namespaces, while a null or empty `namespaces` list combined with a
null `namespaceSelector` means "this pod's namespace".

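The union semantics can be sketched with a pod like the one below. Its affinity term applies to every namespace labelled `team: payments` plus the explicitly listed `shared-services` namespace. All names, labels, and the image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-namespace-selector       # hypothetical name
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: cache                # hypothetical pod label
        # Namespaces selected by label ...
        namespaceSelector:
          matchLabels:
            team: payments            # hypothetical namespace label
        # ... plus the namespace listed explicitly.
        namespaces:
        - shared-services             # hypothetical namespace
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: registry.example/app:1.0   # hypothetical image
```
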
This feature is alpha and disabled by default. You can enable it by setting the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`PodAffinityNamespaceSelector` in both kube-apiserver and kube-scheduler.

#### More Practical Use-cases

Interpod Affinity and AntiAffinity can be even more useful when they are used with higher

@@ -140,6 +140,7 @@ different Kubernetes components.
| `NonPreemptingPriority` | `true` | Beta | 1.19 | |
| `PodDisruptionBudget` | `false` | Alpha | 1.3 | 1.4 |
| `PodDisruptionBudget` | `true` | Beta | 1.5 | |
| `PodAffinityNamespaceSelector` | `false` | Alpha | 1.21 | |
| `PodOverhead` | `false` | Alpha | 1.16 | 1.17 |
| `PodOverhead` | `true` | Beta | 1.18 | |
| `ProcMountType` | `false` | Alpha | 1.12 | |

@@ -671,6 +672,8 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `PersistentLocalVolumes`: Enable the usage of `local` volume type in Pods.
  Pod affinity has to be specified if requesting a `local` volume.
- `PodDisruptionBudget`: Enable the [PodDisruptionBudget](/docs/tasks/run-application/configure-pdb/) feature.
- `PodAffinityNamespaceSelector`: Enable the [Pod Affinity Namespace Selector](/docs/concepts/scheduling-eviction/assign-pod-node/#namespace-selector)
  and [CrossNamespacePodAffinity](/docs/concepts/policy/resource-quotas/#cross-namespace-pod-affinity-quota) quota scope features.
- `PodOverhead`: Enable the [PodOverhead](/docs/concepts/scheduling-eviction/pod-overhead/)
  feature to account for pod overheads.
- `PodPriority`: Enable the descheduling and preemption of Pods based on their