Add DisableAcceleratorUsageMetrics feature flag
Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>
parent 3841d72380
commit bbe4de82e1
@@ -98,6 +98,14 @@ Take metric `A` as an example, here assumed that `A` is deprecated in 1.n. Accor
If you're upgrading from release `1.12` to `1.13`, but still depend on a metric `A` deprecated in `1.12`, you should set hidden metrics via command line: `--show-hidden-metrics=1.12` and remember to remove this metric dependency before upgrading to `1.14`

## Disable accelerator metrics

The kubelet collects accelerator metrics through cAdvisor. To collect these metrics for accelerators such as NVIDIA GPUs, the kubelet held an open handle on the driver. As a result, a cluster administrator had to stop the kubelet agent before making infrastructure changes such as updating the driver.
The responsibility for collecting accelerator metrics now belongs to the vendor rather than the kubelet. Vendors must provide a container that collects metrics and exposes them to the metrics service (for example, Prometheus).
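
As a rough illustration of what such a vendor container might run, the following Python sketch serves one gauge in the Prometheus text exposition format using only the standard library. The metric name `accelerator_duty_cycle_example`, the `read_accelerator_utilization` helper, and port 9400 are hypothetical placeholders, not part of any vendor's exporter. A real exporter would query the vendor's driver library instead of returning fixed values.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict


def read_accelerator_utilization() -> Dict[str, float]:
    """Hypothetical stand-in for a vendor library call; returns fixed sample values."""
    return {"0": 0.0}


def render_metrics(samples: Dict[str, float]) -> str:
    """Render per-device samples in the Prometheus text exposition format."""
    lines = [
        "# HELP accelerator_duty_cycle_example Example accelerator utilization (percent).",
        "# TYPE accelerator_duty_cycle_example gauge",
    ]
    for device, value in sorted(samples.items()):
        lines.append(f'accelerator_duty_cycle_example{{device="{device}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the conventional /metrics path is served; anything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(read_accelerator_utilization()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9400) -> None:
    """Block forever, serving metrics on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. A Prometheus server configured to scrape that port would then collect the metric without the kubelet holding any handle on the driver.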
The [`DisableAcceleratorUsageMetrics` feature gate](/docs/reference/command-line-tools-reference/feature-gates/) stops the kubelet from collecting accelerator metrics, with a [timeline for enabling this feature by default](https://github.com/kubernetes/enhancements/tree/411e51027db842355bd489691af897afc1a41a5e/keps/sig-node/1867-disable-accelerator-usage-metrics#graduation-criteria).
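
Feature gates are passed to Kubernetes components as comma-separated `key=value` pairs on the `--feature-gates` flag. A minimal sketch of building that argument for the kubelet, assuming a hypothetical helper `feature_gates_flag` that is not part of any Kubernetes tooling:

```python
from typing import Dict


def feature_gates_flag(gates: Dict[str, bool]) -> str:
    """Build a --feature-gates argument from a mapping of gate name to state.

    Hypothetical helper for illustration; gate names are sorted so the
    output is stable across runs.
    """
    pairs = ",".join(
        f"{name}={'true' if enabled else 'false'}"
        for name, enabled in sorted(gates.items())
    )
    return f"--feature-gates={pairs}"


# Turn off kubelet-side accelerator metrics collection.
flag = feature_gates_flag({"DisableAcceleratorUsageMetrics": True})
print(flag)  # --feature-gates=DisableAcceleratorUsageMetrics=true
```

The resulting string would be appended to the kubelet command line. How the kubelet is launched (systemd unit, kubeadm configuration, and so on) depends on the cluster and is not covered here.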

## Component metrics

### kube-controller-manager metrics

@@ -88,6 +88,7 @@ different Kubernetes components.
| `DefaultPodTopologySpread` | `false` | Alpha | 1.19 | |
| `DevicePlugins` | `false` | Alpha | 1.8 | 1.9 |
| `DevicePlugins` | `true` | Beta | 1.10 | |
| `DisableAcceleratorUsageMetrics` | `false` | Alpha | 1.19 | 1.20 |
| `DryRun` | `false` | Alpha | 1.12 | 1.12 |
| `DryRun` | `true` | Beta | 1.13 | |
| `DynamicKubeletConfig` | `false` | Alpha | 1.4 | 1.10 |

@@ -420,6 +421,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `CustomResourceWebhookConversion`: Enable webhook-based conversion
  on resources created from [CustomResourceDefinition](/docs/concepts/api-extension/custom-resources/).
  troubleshoot a running Pod.
- `DisableAcceleratorUsageMetrics`: [Disable accelerator metrics collected by the kubelet](/docs/concepts/cluster-administration/monitoring.md).
- `DevicePlugins`: Enable the [device-plugins](/docs/concepts/cluster-administration/device-plugins/)
  based resource provisioning on nodes.
- `DefaultPodTopologySpread`: Enables the use of `PodTopologySpread` scheduling plugin to do