Merge pull request #49108 from windsonsea/nodeger

Clean up policy/node-resource-managers.md
Kubernetes Prow Robot 2025-01-17 01:52:06 -08:00 committed by GitHub
commit 641ea80f3c
1 changed file with 60 additions and 47 deletions


@@ -9,16 +9,18 @@ weight: 50

<!-- overview -->

In order to support latency-critical and high-throughput workloads, Kubernetes offers a suite of
Resource Managers. The managers aim to co-ordinate and optimise the alignment of a node's resources
for pods configured with a specific requirement for CPUs, devices, and memory (hugepages) resources.

<!-- body -->

## Hardware topology alignment policies

_Topology Manager_ is a kubelet component that aims to coordinate the set of components that are
responsible for these optimizations. The overall resource management process is governed using
the policy you specify. To learn more, read
[Control Topology Management Policies on a Node](/docs/tasks/administer-cluster/topology-manager/).
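
For example, a minimal kubelet configuration fragment selecting a topology policy might look
like the following sketch (`single-numa-node` is just one of the documented policy values):

```yaml
# Illustrative KubeletConfiguration fragment, not a complete configuration
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# One of: none, best-effort, restricted, single-numa-node
topologyManagerPolicy: single-numa-node
# Align resources per container (the default) or for the whole pod
topologyManagerScope: pod
```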
## Policies for assigning CPUs to Pods
@@ -29,27 +31,30 @@ hardware (for example, sharing CPUs across multiple Pods) or allocate hardware b
resource (for example, assigning one or more CPUs for a Pod's exclusive use).

By default, the kubelet uses [CFS quota](https://en.wikipedia.org/wiki/Completely_Fair_Scheduler)
to enforce pod CPU limits. When the node runs many CPU-bound pods, the workload can move to
different CPU cores depending on whether the pod is throttled and which CPU cores are available
at scheduling time. Many workloads are not sensitive to this migration and thus
work fine without any intervention.

However, in workloads where CPU cache affinity and scheduling latency significantly affect
workload performance, the kubelet allows alternative CPU
management policies to determine some placement preferences on the node.
This is implemented using the _CPU Manager_ and its policy.

There are two available policies:
- `none`: the `none` policy explicitly enables the existing default CPU
  affinity scheme, providing no affinity beyond what the OS scheduler does
  automatically. Limits on CPU usage for
  [Guaranteed pods](/docs/concepts/workloads/pods/pod-qos/) and
  [Burstable pods](/docs/concepts/workloads/pods/pod-qos/)
  are enforced using CFS quota.
- `static`: the `static` policy allows containers in `Guaranteed` pods with integer CPU
  `requests` access to exclusive CPUs on the node. This exclusivity is enforced
  using the [cpuset cgroup controller](https://www.kernel.org/doc/Documentation/cgroup-v2.txt);
  see the configuration sketch after the note below.
{{< note >}}
System services such as the container runtime and the kubelet itself can continue to run on
these exclusive CPUs. The exclusivity only extends to other pods.
{{< /note >}}
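
For example, switching the kubelet to the `static` policy is a one-line change in the kubelet
configuration file; the sketch below is a minimal fragment (a non-zero CPU reservation is also
required, as described later in this section):

```yaml
# Illustrative KubeletConfiguration fragment, not a complete configuration
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static # the default is "none"
```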
CPU Manager doesn't support offlining and onlining of CPUs at runtime.
@@ -64,12 +69,12 @@ CPUs reserved by these options are taken, in integer quantity, from the initial
core ID. This shared pool is the set of CPUs on which any containers in
`BestEffort` and `Burstable` pods run. Containers in `Guaranteed` pods with fractional
CPU `requests` also run on CPUs in the shared pool. Only containers that are
part of a `Guaranteed` pod and have integer CPU `requests` are assigned
exclusive CPUs.
{{< note >}}
The kubelet requires a CPU reservation greater than zero when the static policy is enabled.
This is because a zero CPU reservation would allow the shared pool to become empty.
{{< /note >}}
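
For example, the reservation can be expressed through `kubeReserved` and `systemReserved`;
the quantities below are illustrative assumptions, not recommendations:

```yaml
# Illustrative KubeletConfiguration fragment with example values
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
kubeReserved: # reservation for Kubernetes system daemons
  cpu: 500m
systemReserved: # reservation for OS system daemons
  cpu: 500m
# Together these reserve one whole CPU (reservations are taken in integer
# quantity, starting from the initial core ID) which leaves the shared pool.
```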
As `Guaranteed` pods whose containers fit the requirements for being statically
@@ -144,7 +149,6 @@ The pod above runs in the `Guaranteed` QoS class because `requests` are equal to
And the container's resource limit for the CPU resource is an integer greater than
or equal to one. The `nginx` container is granted 2 exclusive CPUs.

```yaml
spec:
  containers:
  # ...
```
@@ -163,7 +167,6 @@ The pod above runs in the `Guaranteed` QoS class because `requests` are equal to
But the container's resource limit for the CPU resource is a fraction. It runs in
the shared pool.

```yaml
spec:
  containers:
  # ...
```
@@ -182,27 +185,38 @@ equal to one. The `nginx` container is granted 2 exclusive CPUs.

#### Static policy options {#cpu-policy-static--options}

Here are the available policy options for the static CPU management policy,
listed in alphabetical order:

`align-by-socket` (alpha, hidden by default)
: Align CPUs by physical package / socket boundary, rather than logical NUMA boundaries
  (available since Kubernetes v1.25)

`distribute-cpus-across-cores` (alpha, hidden by default)
: Allocate virtual cores, sometimes called hardware threads, across different physical cores
  (available since Kubernetes v1.31)

`distribute-cpus-across-numa` (alpha, hidden by default)
: Spread CPUs across different NUMA domains, aiming for an even balance between the selected domains
  (available since Kubernetes v1.23)

`full-pcpus-only` (beta, visible by default)
: Always allocate full physical cores (available since Kubernetes v1.22)

`prefer-align-cpus-by-uncorecache` (alpha, hidden by default)
: Align CPUs by uncore (Last-Level) cache boundary in a best-effort way
  (available since Kubernetes v1.32)

`strict-cpu-reservation` (alpha, hidden by default)
: Prevent all pods, regardless of their Quality of Service class, from running on reserved CPUs
  (available since Kubernetes v1.32)
You can toggle groups of options on and off based upon their maturity level
using the following feature gates:

* `CPUManagerPolicyBetaOptions` (default enabled). Disable to hide beta-level options.
* `CPUManagerPolicyAlphaOptions` (default disabled). Enable to show alpha-level options.

You will still have to enable each option using the `cpuManagerPolicyOptions` field in the
kubelet configuration file.
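
For example, enabling an alpha-level option alongside a visible beta option might look like
the following sketch (the specific options chosen here are illustrative):

```yaml
# Illustrative KubeletConfiguration fragment, not a complete configuration
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  CPUManagerPolicyAlphaOptions: true # surface alpha-level options
cpuManagerPolicy: static
cpuManagerPolicyOptions:
  full-pcpus-only: "true" # beta option, visible by default
  distribute-cpus-across-numa: "true" # alpha option, needs the gate above
```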
@@ -253,10 +267,10 @@ than number of NUMA nodes.

If the `distribute-cpus-across-cores` policy option is specified, the static policy
will attempt to allocate virtual cores (hardware threads) across different physical cores.
By default, the `CPUManager` tends to pack CPUs onto as few physical cores as possible,
which can lead to contention among CPUs on the same physical core and result
in performance bottlenecks. By enabling the `distribute-cpus-across-cores` policy,
the static policy ensures that CPUs are distributed across as many physical cores
as possible, reducing the contention on the same physical core and thereby
improving overall performance. However, it is important to note that this strategy
might be less effective when the system is heavily loaded. Under such conditions,
@@ -270,9 +284,9 @@ The `reservedSystemCPUs` parameter in [KubeletConfiguration](/docs/reference/con
or the deprecated kubelet command line option `--reserved-cpus`, defines an explicit CPU set for OS system daemons
and Kubernetes system daemons. More details of this parameter can be found on the
[Explicitly Reserved CPU List](/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list) page.

By default, this isolation is implemented only for guaranteed pods with integer CPU requests, not for burstable and best-effort pods
(and guaranteed pods with fractional CPU requests). Admission only compares the CPU requests against the allocatable CPUs.
Since the CPU limit can be higher than the request, the default behavior allows burstable and best-effort pods to use up the capacity
of `reservedSystemCPUs`, causing host OS services to starve in real-life deployments.

If the `strict-cpu-reservation` policy option is enabled, the static policy will not allow
any workload to use the CPU cores specified in `reservedSystemCPUs`.
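
As a sketch, enabling strict reservation could look like this; the CPU IDs are example
values that must match the node's hardware layout:

```yaml
# Illustrative KubeletConfiguration fragment, not a complete configuration
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  CPUManagerPolicyAlphaOptions: true # strict-cpu-reservation is alpha
cpuManagerPolicy: static
cpuManagerPolicyOptions:
  strict-cpu-reservation: "true"
reservedSystemCPUs: "0,1" # example CPU set reserved for system daemons
```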
@@ -294,7 +308,6 @@ of inter-cache latency and noisy neighbors at the cache level. If the `CPUManage
cannot align optimally while the node has sufficient resources, the container will
still be admitted using the default packed behavior.

## Memory Management Policies

{{< feature-state feature_gate_name="MemoryManager" >}}