Merge pull request #49146 from windsonsea/noshut

Clean up cluster-administration/node-shutdown.md
2025-01-06 13:32:17 +01:00 · 2025-01-06 13:32:17 +01:00 · 8c7ba0a478
parent 659d8413e1 664a30e73a
commit 8c7ba0a478
1 changed files with 59 additions and 50 deletions
--- a/content/en/docs/concepts/cluster-administration/node-shutdown.md
+++ b/content/en/docs/concepts/cluster-administration/node-shutdown.md
@@ -5,25 +5,27 @@ weight: 10
 ---

 <!-- overview -->
+
 In a Kubernetes cluster, a {{< glossary_tooltip text="node" term_id="node" >}}
-can be shutdown in a planned graceful way or unexpectedly because of reasons such
+can be shut down in a planned graceful way or unexpectedly because of reasons such
 as a power outage or something else external. A node shutdown could lead to workload
 failure if the node is not drained before the shutdown. A node shutdown can be
 either **graceful** or **non-graceful**.

 <!-- body -->
+
 ## Graceful node shutdown {#graceful-node-shutdown}

 {{< feature-state feature_gate_name="GracefulNodeShutdown" >}}

 The kubelet attempts to detect node system shutdown and terminates pods running on the node.

-Kubelet ensures that pods follow the normal
+The kubelet ensures that pods follow the normal
 [pod termination process](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
 during the node shutdown. During node shutdown, the kubelet does not accept new
 Pods (even if those Pods are already bound to the node).

-The Graceful node shutdown feature depends on systemd since it takes advantage of
+The graceful node shutdown feature depends on systemd since it takes advantage of
 [systemd inhibitor locks](https://www.freedesktop.org/wiki/Software/systemd/inhibit/) to
 delay the node shutdown with a given duration.

@@ -32,12 +34,12 @@ Graceful node shutdown is controlled with the `GracefulNodeShutdown`
 enabled by default in 1.21.

 Note that by default, both configuration options described below,
-`shutdownGracePeriod` and `shutdownGracePeriodCriticalPods` are set to zero,
+`shutdownGracePeriod` and `shutdownGracePeriodCriticalPods`, are set to zero,
 thus not activating the graceful node shutdown functionality.
-To activate the feature, the two kubelet config settings should be configured appropriately and
+To activate the feature, both options should be configured appropriately and
 set to non-zero values.

-Once systemd detects or notifies node shutdown, the kubelet sets a `NotReady` condition on
+Once systemd detects or is notified of a node shutdown, the kubelet sets a `NotReady` condition on
 the Node, with the `reason` set to `"node is shutting down"`. The kube-scheduler honors this condition
 and does not schedule any Pods onto the affected node; other third-party schedulers are
 expected to follow the same logic. This means that new Pods won't be scheduled onto that node
@@ -48,26 +50,29 @@ node shutdown has been detected, so that even Pods with a
 {{< glossary_tooltip text="toleration" term_id="toleration" >}} for
 `node.kubernetes.io/not-ready:NoSchedule` do not start there.

-At the same time when kubelet is setting that condition on its Node via the API,
+While the kubelet sets that condition on its Node via the API,
 the kubelet also begins terminating any Pods that are running locally.

 During a graceful shutdown, kubelet terminates pods in two phases:

 1. Terminate regular pods running on the node.
-2. Terminate [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+1. Terminate [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
   running on the node.

-Graceful node shutdown feature is configured with two
+The graceful node shutdown feature is configured with two
 [`KubeletConfiguration`](/docs/tasks/administer-cluster/kubelet-config-file/) options:

-* `shutdownGracePeriod`:
-  * Specifies the total duration that the node should delay the shutdown by. This is the total
-    grace period for pod termination for both regular and
-    [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical).
-* `shutdownGracePeriodCriticalPods`:
-  * Specifies the duration used to terminate
-    [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
-    during a node shutdown. This value should be less than `shutdownGracePeriod`.
+- `shutdownGracePeriod`:
+
+  Specifies the total duration that the node should delay the shutdown by. This is the total
+  grace period for pod termination for both regular and
+  [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical).
+
+- `shutdownGracePeriodCriticalPods`:
+
+  Specifies the duration used to terminate
+  [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+  during a node shutdown. This value should be less than `shutdownGracePeriod`.
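
As a rough illustration of how the two options relate (a sketch, not the kubelet's implementation): assuming the time left for regular pods is the total grace period minus the portion set aside for critical pods, the split can be computed as below. The function name `split_grace_period` and the sample values are hypothetical.

```python
def split_grace_period(total_s: int, critical_s: int) -> tuple[int, int]:
    """Return (seconds for regular pods, seconds for critical pods).

    Assumption: regular pods get whatever remains of shutdownGracePeriod
    after shutdownGracePeriodCriticalPods has been reserved.
    """
    if total_s < 0 or critical_s < 0:
        raise ValueError("durations must be non-negative")
    if critical_s > total_s:
        # The critical-pod value is expected to be less than the total.
        raise ValueError("critical-pod period exceeds the total grace period")
    return total_s - critical_s, critical_s


# Hypothetical values: 30s in total, 10s of which are for critical pods.
regular, critical = split_grace_period(30, 10)
print(regular, critical)  # 20 10
```

With both values left at their default of zero, the split is `(0, 0)`, which matches the statement that the feature is not activated by default.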

 {{< note >}}

@@ -122,22 +127,22 @@ Assuming the following custom pod
 [priority classes](/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass)
 in a cluster,

-|Pod priority class name|Pod priority class value|
-|-------------------------|------------------------|
-|`custom-class-a`         | 100000                 |
-|`custom-class-b`         | 10000                  |
-|`custom-class-c`         | 1000                   |
-|`regular/unset`          | 0                      |
+| Pod priority class name | Pod priority class value |
+| ----------------------- | ------------------------ |
+| `custom-class-a`        | 100000                   |
+| `custom-class-b`        | 10000                    |
+| `custom-class-c`        | 1000                     |
+| `regular/unset`         | 0                        |

 Within the [kubelet configuration](/docs/reference/config-api/kubelet-config.v1beta1/)
 the settings for `shutdownGracePeriodByPodPriority` could look like:

-|Pod priority class value|Shutdown period|
-|------------------------|---------------|
-| 100000                 |10 seconds     |
-| 10000                  |180 seconds    |
-| 1000                   |120 seconds    |
-| 0                      |60 seconds     |
+| Pod priority class value | Shutdown period |
+| ------------------------ | --------------- |
+| 100000                   | 10 seconds      |
+| 10000                    | 180 seconds     |
+| 1000                     | 120 seconds     |
+| 0                        | 60 seconds      |

 The corresponding kubelet config YAML configuration would be:

@@ -154,18 +159,18 @@ shutdownGracePeriodByPodPriority:
 ```

 The above table implies that any pod with `priority` value >= 100000 will get
-just 10 seconds to stop, any pod with value >= 10000 and < 100000 will get 180
-seconds to stop, any pod with value >= 1000 and < 10000 will get 120 seconds to stop.
-Finally, all other pods will get 60 seconds to stop.
+just 10 seconds to shut down, any pod with value >= 10000 and < 100000 will get 180
+seconds to shut down, any pod with value >= 1000 and < 10000 will get 120 seconds to shut down.
+Finally, all other pods will get 60 seconds to shut down.

 One doesn't have to specify values corresponding to all of the classes. For
 example, you could instead use these settings:

-|Pod priority class value|Shutdown period|
-|------------------------|---------------|
-| 100000                 |300 seconds    |
-| 1000                   |120 seconds    |
-| 0                      |60 seconds     |
+| Pod priority class value | Shutdown period |
+| ------------------------ | --------------- |
+| 100000                   | 300 seconds     |
+| 1000                     | 120 seconds     |
+| 0                        | 60 seconds      |

 In the above case, the pods with `custom-class-b` will go into the same bucket
 as `custom-class-c` for shutdown.
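
The bucket selection described above can be sketched as follows. This is a minimal illustration of the ">= threshold" rule stated in the text, not the kubelet's actual code; the function name `grace_period_for` and the dictionary layout are assumptions made for the example.

```python
def grace_period_for(priority: int, settings: dict[int, int]) -> int:
    """Return the shutdown period (seconds) for a pod priority value.

    Picks the highest configured threshold that is <= the pod's priority.
    """
    eligible = [threshold for threshold in settings if threshold <= priority]
    if not eligible:
        # Priorities below every configured threshold are out of scope here.
        raise ValueError(f"no bucket configured for priority {priority}")
    return settings[max(eligible)]


# Threshold -> shutdown period, taken from the shorter table above.
settings = {100000: 300, 1000: 120, 0: 60}

classes = [
    ("custom-class-a", 100000),
    ("custom-class-b", 10000),
    ("custom-class-c", 1000),
    ("regular/unset", 0),
]
for name, priority in classes:
    print(name, grace_period_for(priority, settings))
```

Because no entry exists for 10000, `custom-class-b` falls through to the 1000 threshold and gets the same 120 seconds as `custom-class-c`.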
@@ -225,14 +230,16 @@ on a different node.
 During a non-graceful shutdown, Pods are terminated in two phases:

 1. Force delete the Pods that do not have matching `out-of-service` tolerations.
-2. Immediately perform detach volume operation for such pods.
+1. Immediately perform the volume detach operation for such pods.

 {{< note >}}
+
 - Before adding the taint `node.kubernetes.io/out-of-service`, it should be verified
  that the node is already in shutdown or power off state (not in the middle of restarting).
 - The user is required to manually remove the out-of-service taint after the pods are
  moved to a new node and the user has checked that the shutdown node has been
  recovered since the user was the one who originally added the taint.
+
 {{< /note >}}

 ### Forced storage detach on timeout {#storage-force-detach-on-timeout}
@@ -256,39 +263,41 @@ its associated
 [VolumeAttachment](/docs/reference/kubernetes-api/config-and-storage-resources/volume-attachment-v1/)
 deleted.

-After this setting has been applied, unhealthy pods still attached to a volumes must be recovered
+After this setting has been applied, unhealthy pods still attached to volumes must be recovered
 via the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure mentioned above.

 {{< note >}}
+
 - Caution must be taken while using the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure.
 - Deviation from the steps documented above can result in data corruption.
-{{< /note >}}

+{{< /note >}}

 ## Windows Graceful node shutdown {#windows-graceful-node-shutdown}

 {{< feature-state feature_gate_name="WindowsGracefulNodeShutdown" >}}

-The Windows graceful node shutdown feature depends on kubelet running as a Windows service, 
-it will then have a registered [service control handler](https://learn.microsoft.com/en-us/windows/win32/services/service-control-handler-function) 
-to delay the  presshutdown event with a given duration.
+The Windows graceful node shutdown feature depends on the kubelet running as a Windows service;
+it then has a registered [service control handler](https://learn.microsoft.com/en-us/windows/win32/services/service-control-handler-function)
+to delay the preshutdown event by a given duration.

-Windows graceful node shutdown is controlled with the `WindowsGracefulNodeShutdown` 
-[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) 
+Windows graceful node shutdown is controlled with the `WindowsGracefulNodeShutdown`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
 which was introduced in 1.32 as an alpha feature.

 Windows graceful node shutdown cannot be canceled.

-If Kubelet is not running as a Windows service, it will not be able to set and monitor 
+If kubelet is not running as a Windows service, it will not be able to set and monitor
 the [Preshutdown](https://learn.microsoft.com/en-us/windows/win32/api/winsvc/ns-winsvc-service_preshutdown_info) event,
 and the node will have to go through the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure mentioned above.

-In the case where the Windows graceful node shutdown feature is enabled, but the kubelet is not 
-running as a Windows service, the kubelet will continue running instead of failing. However, 
+In the case where the Windows graceful node shutdown feature is enabled, but the kubelet is not
+running as a Windows service, the kubelet will continue running instead of failing. However,
 it will log an error indicating that it needs to be run as a Windows service.

 ## {{% heading "whatsnext" %}}

 Learn more about the following:
-* Blog: [Non-Graceful Node Shutdown](/blog/2023/08/16/kubernetes-1-28-non-graceful-node-shutdown-ga/).
-* Cluster Architecture: [Nodes](/docs/concepts/architecture/nodes/).
+
+- Blog: [Non-Graceful Node Shutdown](/blog/2023/08/16/kubernetes-1-28-non-graceful-node-shutdown-ga/).
+- Cluster Architecture: [Nodes](/docs/concepts/architecture/nodes/).