Merge pull request #26820 from ahg-g/nss

Document Pod affinity namespaceSelector
Kubernetes Prow Robot 2021-03-08 11:09:00 -08:00 committed by GitHub
commit 0b32f897b2
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 73 additions and 0 deletions

@ -189,6 +189,7 @@ Resources specified on the quota outside of the allowed set results in a validat
| `BestEffort` | Match pods that have best effort quality of service. |
| `NotBestEffort` | Match pods that do not have best effort quality of service. |
| `PriorityClass` | Match pods that reference the specified [priority class](/docs/concepts/configuration/pod-priority-preemption). |
| `CrossNamespacePodAffinity` | Match pods that have cross-namespace pod [(anti)affinity terms](/docs/concepts/scheduling-eviction/assign-pod-node). |
The `BestEffort` scope restricts a quota to tracking the following resource:
@ -429,6 +430,63 @@ memory 0 20Gi
pods 0 10 pods 0 10
``` ```
### Cross-namespace Pod Affinity Quota
{{< feature-state for_k8s_version="v1.21" state="alpha" >}}
Operators can use the `CrossNamespacePodAffinity` quota scope to limit which namespaces are allowed to
have pods with affinity terms that cross namespaces. Specifically, it controls which pods are allowed
to set the `namespaces` or `namespaceSelector` fields in pod affinity terms.
Preventing users from using cross-namespace affinity terms might be desirable, since a pod
with anti-affinity constraints can block pods from all other namespaces
from getting scheduled in a failure domain.
Using this scope, operators can prevent certain namespaces (`foo-ns` in the example below)
from having pods that use cross-namespace pod affinity by creating a resource quota object in
that namespace with the `CrossNamespacePodAffinity` scope and a hard limit of 0:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: disable-cross-namespace-affinity
  namespace: foo-ns
spec:
  hard:
    # No pod in foo-ns may use cross-namespace affinity terms.
    pods: "0"
  scopeSelector:
    matchExpressions:
    - scopeName: CrossNamespacePodAffinity
      operator: Exists
```
If operators want to disallow using `namespaces` and `namespaceSelector` by default, and
only allow it for specific namespaces, they could configure `CrossNamespacePodAffinity`
as a limited resource by setting the kube-apiserver flag `--admission-control-config-file`
to the path of the following configuration file:
```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: apiserver.config.k8s.io/v1
    kind: ResourceQuotaConfiguration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: CrossNamespacePodAffinity
        operator: Exists
```
With the above configuration, pods can use `namespaces` and `namespaceSelector` in pod affinity only
if the namespace where they are created has a resource quota object with the
`CrossNamespacePodAffinity` scope and a hard limit greater than or equal to the number of pods using those fields.
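As an illustration of the allow-list case, the sketch below shows a quota that would permit a limited number of pods in one namespace to use cross-namespace affinity terms once the limited-resource configuration is in place. The namespace `bar-ns`, the object name, and the limit of 10 are hypothetical values chosen for the example; the scope name is `CrossNamespacePodAffinity`, as listed in the scope table.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  # Hypothetical name and namespace, chosen for illustration only.
  name: allow-cross-namespace-affinity
  namespace: bar-ns
spec:
  hard:
    # At most 10 pods in bar-ns may set `namespaces` or
    # `namespaceSelector` in their pod (anti)affinity terms.
    pods: "10"
  scopeSelector:
    matchExpressions:
    - scopeName: CrossNamespacePodAffinity
      operator: Exists
```

Namespaces without such a quota object would be denied any pods that set those fields, because the scope is treated as a limited resource.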
This feature is alpha and disabled by default. You can enable it by setting the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`PodAffinityNamespaceSelector` in both kube-apiserver and kube-scheduler.
## Requests compared to Limits {#requests-vs-limits}
When allocating compute resources, each container may specify a request and a limit value for either CPU or memory.

@ -271,6 +271,18 @@ If omitted or empty, it defaults to the namespace of the pod where the affinity/
All `matchExpressions` associated with `requiredDuringSchedulingIgnoredDuringExecution` affinity and anti-affinity
must be satisfied for the pod to be scheduled onto a node.
#### Namespace selector
{{< feature-state for_k8s_version="v1.21" state="alpha" >}}
Users can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
The affinity term is applied to the union of the namespaces selected by `namespaceSelector` and the ones listed in the `namespaces` field.
Note that an empty `namespaceSelector` (`{}`) matches all namespaces, while a null or empty `namespaces` list
combined with a null `namespaceSelector` means "this pod's namespace".
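A minimal sketch of an anti-affinity term that combines both fields follows. The pod name, the labels (`app: web`, `team: frontend`), the namespace `shared-services`, and the container image are assumptions made for the example; only the field names come from the API. It only takes effect when the `PodAffinityNamespaceSelector` feature gate is enabled.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-namespace-selector   # hypothetical name
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web              # pods to keep away from (assumed label)
        # Namespaces carrying this label are selected...
        namespaceSelector:
          matchLabels:
            team: frontend        # assumed namespace label
        # ...in addition to the namespaces listed explicitly.
        namespaces:
        - shared-services         # assumed namespace
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
```

Here the term applies to `app: web` pods in the union of `shared-services` and every namespace labeled `team: frontend`, so the pod avoids zones where such pods already run.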
This feature is alpha and disabled by default. You can enable it by setting the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`PodAffinityNamespaceSelector` in both kube-apiserver and kube-scheduler.
#### More Practical Use-cases
Interpod Affinity and AntiAffinity can be even more useful when they are used with higher

@ -140,6 +140,7 @@ different Kubernetes components.
| `NonPreemptingPriority` | `true` | Beta | 1.19 | |
| `PodDisruptionBudget` | `false` | Alpha | 1.3 | 1.4 |
| `PodDisruptionBudget` | `true` | Beta | 1.5 | |
| `PodAffinityNamespaceSelector` | `false` | Alpha | 1.21 | |
| `PodOverhead` | `false` | Alpha | 1.16 | 1.17 |
| `PodOverhead` | `true` | Beta | 1.18 | |
| `ProcMountType` | `false` | Alpha | 1.12 | |
@ -671,6 +672,8 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `PersistentLocalVolumes`: Enable the usage of `local` volume type in Pods.
  Pod affinity has to be specified if requesting a `local` volume.
- `PodDisruptionBudget`: Enable the [PodDisruptionBudget](/docs/tasks/run-application/configure-pdb/) feature.
- `PodAffinityNamespaceSelector`: Enable the [Pod Affinity Namespace Selector](/docs/concepts/scheduling-eviction/assign-pod-node/#namespace-selector)
  and [CrossNamespacePodAffinity](/docs/concepts/policy/resource-quotas/#cross-namespace-pod-affinity-quota) quota scope features.
- `PodOverhead`: Enable the [PodOverhead](/docs/concepts/scheduling-eviction/pod-overhead/)
  feature to account for pod overheads.
- `PodPriority`: Enable the descheduling and preemption of Pods based on their