Refactoring node status to new page (#42378)

* Refactoring node status to new page
  - Created a new page in architecture in node for the status
  - Removed the current node status from concepts and moved it there
* Update content/en/docs/reference/node/_index.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/reference/node/node-status.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update content/en/docs/concepts/architecture/nodes.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>
* Update node-status.md
* Update node-status.md
* Update content/en/docs/reference/node/node-status.md
  Co-authored-by: Tim Bannister <tim@scalefactory.com>

---------

Co-authored-by: Tim Bannister <tim@scalefactory.com>
2023-08-18 07:12:28 -04:00 · 2023-08-18 07:12:28 -04:00 · 096a657f56
parent fc9493acf3
commit 096a657f56
3 changed files with 147 additions and 107 deletions
--- a/content/en/docs/concepts/architecture/nodes.md
+++ b/content/en/docs/concepts/architecture/nodes.md
@@ -163,10 +163,10 @@ that should run on the Node even if it is being drained of workload applications

 A Node's status contains the following information:

-* [Addresses](#addresses)
-* [Conditions](#condition)
-* [Capacity and Allocatable](#capacity)
-* [Info](#info)
+* [Addresses](/docs/reference/node/node-status/#addresses)
+* [Conditions](/docs/reference/node/node-status/#condition)
+* [Capacity and Allocatable](/docs/reference/node/node-status/#capacity)
+* [Info](/docs/reference/node/node-status/#info)

 You can use `kubectl` to view a Node's status and other details:

@@ -174,121 +174,21 @@ You can use `kubectl` to view a Node's status and other details:
 kubectl describe node <insert-node-name-here>
 ```

-Each section of the output is described below.
+See [Node Status](/docs/reference/node/node-status/) for more details.

-### Addresses
-
-The usage of these fields varies depending on your cloud provider or bare metal configuration.
-
-* HostName: The hostname as reported by the node's kernel. Can be overridden via the kubelet
-  `--hostname-override` parameter.
-* ExternalIP: Typically the IP address of the node that is externally routable (available from
-  outside the cluster).
-* InternalIP: Typically the IP address of the node that is routable only within the cluster.
-
-
-### Conditions {#condition}
-
-The `conditions` field describes the status of all `Running` nodes. Examples of conditions include:
-
-{{< table caption = "Node conditions, and a description of when each condition applies." >}}
-| Node Condition       | Description |
-|----------------------|-------------|
-| `Ready`              | `True` if the node is healthy and ready to accept pods, `False` if the node is not healthy and is not accepting pods, and `Unknown` if the node controller has not heard from the node in the last `node-monitor-grace-period` (default is 40 seconds) |
-| `DiskPressure`       | `True` if pressure exists on the disk size—that is, if the disk capacity is low; otherwise `False` |
-| `MemoryPressure`     | `True` if pressure exists on the node memory—that is, if the node memory is low; otherwise `False` |
-| `PIDPressure`        | `True` if pressure exists on the processes—that is, if there are too many processes on the node; otherwise `False` |
-| `NetworkUnavailable` | `True` if the network for the node is not correctly configured, otherwise `False` |
-{{< /table >}}
-
-{{< note >}}
-If you use command-line tools to print details of a cordoned Node, the Condition includes
-`SchedulingDisabled`. `SchedulingDisabled` is not a Condition in the Kubernetes API; instead,
-cordoned nodes are marked Unschedulable in their spec.
-{{< /note >}}
-
-In the Kubernetes API, a node's condition is represented as part of the `.status`
-of the Node resource. For example, the following JSON structure describes a healthy node:
-
-```json
-"conditions": [
-  {
-    "type": "Ready",
-    "status": "True",
-    "reason": "KubeletReady",
-    "message": "kubelet is posting ready status",
-    "lastHeartbeatTime": "2019-06-05T18:38:35Z",
-    "lastTransitionTime": "2019-06-05T11:41:27Z"
-  }
-]
-```
-
-When problems occur on nodes, the Kubernetes control plane automatically creates
-[taints](/docs/concepts/scheduling-eviction/taint-and-toleration/) that match the conditions
-affecting the node. An example of this is when the `status` of the Ready condition
-remains `Unknown` or `False` for longer than the kube-controller-manager's `NodeMonitorGracePeriod`,
-which defaults to 40 seconds. This will cause either an `node.kubernetes.io/unreachable` taint, for an `Unknown` status,
-or a `node.kubernetes.io/not-ready` taint, for a `False` status, to be added to the Node.
-
-These taints affect pending pods as the scheduler takes the Node's taints into consideration when
-assigning a pod to a Node. Existing pods scheduled to the node may be evicted due to the application
-of `NoExecute` taints. Pods may also have {{< glossary_tooltip text="tolerations" term_id="toleration" >}} that let
-them schedule to and continue running on a Node even though it has a specific taint.
-
-See [Taint Based Evictions](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions) and
-[Taint Nodes by Condition](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-nodes-by-condition)
-for more details.
-
-### Capacity and Allocatable {#capacity}
-
-Describes the resources available on the node: CPU, memory, and the maximum
-number of pods that can be scheduled onto the node.
-
-The fields in the capacity block indicate the total amount of resources that a
-Node has. The allocatable block indicates the amount of resources on a
-Node that is available to be consumed by normal Pods.
-
-You may read more about capacity and allocatable resources while learning how
-to [reserve compute resources](/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable)
-on a Node.
-
-### Info
-
-Describes general information about the node, such as kernel version, Kubernetes
-version (kubelet and kube-proxy version), container runtime details, and which
-operating system the node uses.
-The kubelet gathers this information from the node and publishes it into
-the Kubernetes API.
-
-## Heartbeats
+## Node heartbeats

 Heartbeats, sent by Kubernetes nodes, help your cluster determine the
 availability of each node, and to take action when failures are detected.

 For nodes there are two forms of heartbeats:

-* updates to the `.status` of a Node
+* Updates to the [`.status`](/docs/reference/node/node-status/) of a Node
 * [Lease](/docs/concepts/architecture/leases/) objects
  within the `kube-node-lease`
  {{< glossary_tooltip term_id="namespace" text="namespace">}}.
  Each Node has an associated Lease object.

-Compared to updates to `.status` of a Node, a Lease is a lightweight resource.
-Using Leases for heartbeats reduces the performance impact of these updates
-for large clusters.
-
-The kubelet is responsible for creating and updating the `.status` of Nodes,
-and for updating their related Leases.
-
-- The kubelet updates the node's `.status` either when there is change in status
-  or if there has been no update for a configured interval. The default interval
-  for `.status` updates to Nodes is 5 minutes, which is much longer than the 40
-  second default timeout for unreachable nodes.
-- The kubelet creates and then updates its Lease object every 10 seconds
-  (the default update interval). Lease updates occur independently from
-  updates to the Node's `.status`. If the Lease update fails, the kubelet retries,
-  using exponential backoff that starts at 200 milliseconds and capped at 7 seconds.
-
 ## Node controller

 The node {{< glossary_tooltip text="controller" term_id="controller" >}} is a
--- a/content/en/docs/reference/node/_index.md
+++ b/content/en/docs/reference/node/_index.md
@@ -9,6 +9,8 @@ This section contains the following reference topics about nodes:
 * the kubelet's [checkpoint API](/docs/reference/node/kubelet-checkpoint-api/)
 * a list of [Articles on dockershim Removal and on Using CRI-compatible Runtimes](/docs/reference/node/topics-on-dockershim-and-cri-compatible-runtimes/)

+* [Node `.status` information](/docs/reference/node/node-status/)
+
 You can also read node reference details from elsewhere in the
 Kubernetes documentation, including:

--- a/content/en/docs/reference/node/node-status.md
+++ b/content/en/docs/reference/node/node-status.md
@@ -0,0 +1,138 @@
+---
+content_type: reference
+title: Node Status
+weight: 80
+---
+<!-- overview -->
+
+The status of a [node](/docs/concepts/architecture/nodes/) in Kubernetes is a critical aspect of managing a Kubernetes cluster. In this article, we'll cover the basics of monitoring and maintaining node status to ensure a healthy and stable cluster.
+
+## Node status fields
+
+A Node's status contains the following information:
+
+* [Addresses](#addresses)
+* [Conditions](#condition)
+* [Capacity and Allocatable](#capacity)
+* [Info](#info)
+
+You can use `kubectl` to view a Node's status and other details:
+
+```shell
+kubectl describe node <insert-node-name-here>
+```
+
+Each section of the output is described below.
+
+## Addresses
+
+The usage of these fields varies depending on your cloud provider or bare metal configuration.
+
+* HostName: The hostname as reported by the node's kernel. Can be overridden via the kubelet
+  `--hostname-override` parameter.
+* ExternalIP: Typically the IP address of the node that is externally routable (available from
+  outside the cluster).
+* InternalIP: Typically the IP address of the node that is routable only within the cluster.
+
+
+## Conditions {#condition}
+
+The `conditions` field describes the status of all `Running` nodes. Examples of conditions include:
+
+{{< table caption = "Node conditions, and a description of when each condition applies." >}}
+| Node Condition       | Description |
+|----------------------|-------------|
+| `Ready`              | `True` if the node is healthy and ready to accept pods, `False` if the node is not healthy and is not accepting pods, and `Unknown` if the node controller has not heard from the node in the last `node-monitor-grace-period` (default is 40 seconds) |
+| `DiskPressure`       | `True` if pressure exists on the disk size—that is, if the disk capacity is low; otherwise `False` |
+| `MemoryPressure`     | `True` if pressure exists on the node memory—that is, if the node memory is low; otherwise `False` |
+| `PIDPressure`        | `True` if pressure exists on the processes—that is, if there are too many processes on the node; otherwise `False` |
+| `NetworkUnavailable` | `True` if the network for the node is not correctly configured, otherwise `False` |
+{{< /table >}}
+
+{{< note >}}
+If you use command-line tools to print details of a cordoned Node, the Condition includes
+`SchedulingDisabled`. `SchedulingDisabled` is not a Condition in the Kubernetes API; instead,
+cordoned nodes are marked Unschedulable in their spec.
+{{< /note >}}
+
+In the Kubernetes API, a node's condition is represented as part of the `.status`
+of the Node resource. For example, the following JSON structure describes a healthy node:
+
+```json
+"conditions": [
+  {
+    "type": "Ready",
+    "status": "True",
+    "reason": "KubeletReady",
+    "message": "kubelet is posting ready status",
+    "lastHeartbeatTime": "2019-06-05T18:38:35Z",
+    "lastTransitionTime": "2019-06-05T11:41:27Z"
+  }
+]
+```
+
+When problems occur on nodes, the Kubernetes control plane automatically creates
+[taints](/docs/concepts/scheduling-eviction/taint-and-toleration/) that match the conditions
+affecting the node. An example of this is when the `status` of the Ready condition
+remains `Unknown` or `False` for longer than the kube-controller-manager's `NodeMonitorGracePeriod`,
+which defaults to 40 seconds. This causes either a `node.kubernetes.io/unreachable` taint, for an `Unknown` status,
+or a `node.kubernetes.io/not-ready` taint, for a `False` status, to be added to the Node.
+
+These taints affect pending pods as the scheduler takes the Node's taints into consideration when
+assigning a pod to a Node. Existing pods scheduled to the node may be evicted due to the application
+of `NoExecute` taints. Pods may also have {{< glossary_tooltip text="tolerations" term_id="toleration" >}} that let
+them schedule to and continue running on a Node even though it has a specific taint.
+
+See [Taint Based Evictions](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions) and
+[Taint Nodes by Condition](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-nodes-by-condition)
+for more details.
+
+## Capacity and Allocatable {#capacity}
+
+Describes the resources available on the node: CPU, memory, and the maximum
+number of pods that can be scheduled onto the node.
+
+The fields in the capacity block indicate the total amount of resources that a
+Node has. The allocatable block indicates the amount of resources on a
+Node that is available to be consumed by normal Pods.
+
+You may read more about capacity and allocatable resources while learning how
+to [reserve compute resources](/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable)
+on a Node.
+
+## Info
+
+Describes general information about the node, such as kernel version, Kubernetes
+version (kubelet and kube-proxy version), container runtime details, and which
+operating system the node uses.
+The kubelet gathers this information from the node and publishes it into
+the Kubernetes API.
+
+## Heartbeats
+
+Heartbeats, sent by Kubernetes nodes, help your cluster determine the
+availability of each node, and to take action when failures are detected.
+
+For nodes there are two forms of heartbeats:
+
+* updates to the `.status` of a Node
+* [Lease](/docs/concepts/architecture/leases/) objects
+  within the `kube-node-lease`
+  {{< glossary_tooltip term_id="namespace" text="namespace">}}.
+  Each Node has an associated Lease object.
+
+Compared to updates to `.status` of a Node, a Lease is a lightweight resource.
+Using Leases for heartbeats reduces the performance impact of these updates
+for large clusters.
+
+The kubelet is responsible for creating and updating the `.status` of Nodes,
+and for updating their related Leases.
+
+- The kubelet updates the node's `.status` either when there is a change in status
+  or if there has been no update for a configured interval. The default interval
+  for `.status` updates to Nodes is 5 minutes, which is much longer than the
+  40-second default timeout for unreachable nodes.
+- The kubelet creates and then updates its Lease object every 10 seconds
+  (the default update interval). Lease updates occur independently from
+  updates to the Node's `.status`. If the Lease update fails, the kubelet retries,
+  using exponential backoff that starts at 200 milliseconds and is capped at 7 seconds.
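
Note on the Lease retry behavior described in the last bullet of the new `node-status.md` page: the backoff schedule can be sketched as below. This is a minimal illustration and not the kubelet's actual implementation. Only the 200 millisecond starting delay and the 7 second cap come from the page text; the doubling factor and the function name `lease_retry_delays` are assumptions made for the example.

```python
from itertools import islice


def lease_retry_delays(initial=0.2, cap=7.0, factor=2.0):
    """Yield successive retry delays in seconds.

    initial and cap mirror the values stated in the docs (200 ms, 7 s).
    factor=2.0 is an assumption; the docs only say "exponential".
    """
    delay = initial
    while True:
        # Never wait longer than the cap.
        yield min(delay, cap)
        # Grow the delay for the next attempt, clamped to the cap.
        delay = min(delay * factor, cap)


# First eight delays a failing Lease update would wait through
# under the assumed doubling factor.
delays = list(islice(lease_retry_delays(), 8))
print(delays)
```

Under these assumptions the delay reaches the 7 second cap on the seventh retry and stays there, which is still shorter than the 10 second default Lease update interval mentioned in the same bullet.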