Merge pull request #23755 from sftim/20200908_revise_task_safely_drain_node
Revise node draining task page
commit 720812a0ce
@@ -6,23 +6,21 @@ reviewers:
 - kow3ns
 title: Safely Drain a Node while Respecting the PodDisruptionBudget
 content_type: task
 min-kubernetes-server-version: 1.5
 ---

 <!-- overview -->
-This page shows how to safely drain a node, respecting the PodDisruptionBudget you have defined.
+This page shows how to safely drain a {{< glossary_tooltip text="node" term_id="node" >}},
+respecting the PodDisruptionBudget you have defined.

 ## {{% heading "prerequisites" %}}


-This task assumes that you have met the following prerequisites:

-* You are using Kubernetes release >= 1.5.
-* Either:
+{{% version-check %}}
+This task also assumes that you have met the following prerequisites:
   1. You do not require your applications to be highly available during the
      node drain, or
-  1. You have read about the [PodDisruptionBudget concept](/docs/concepts/workloads/pods/disruptions/)
-     and [Configured PodDisruptionBudgets](/docs/tasks/run-application/configure-pdb/) for
+  1. You have read about the [PodDisruptionBudget](/docs/concepts/workloads/pods/disruptions/) concept,
+     and have [configured PodDisruptionBudgets](/docs/tasks/run-application/configure-pdb/) for
      applications that need them.
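As a quick check against these prerequisites (this sketch is not part of the diff above, and the namespace name is a placeholder), you can confirm your kubectl and cluster versions and list any PodDisruptionBudgets that already exist:

```bash
# Confirm client and server versions meet the minimum noted in the front matter.
kubectl version

# List existing PodDisruptionBudgets and how much disruption each currently allows.
# "my-namespace" is a placeholder; use the namespace your application runs in.
kubectl get poddisruptionbudgets --namespace my-namespace
```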
@@ -35,10 +33,10 @@ You can use `kubectl drain` to safely evict all of your pods from a
 node before you perform maintenance on the node (e.g. kernel upgrade,
 hardware maintenance, etc.). Safe evictions allow the pod's containers
 to [gracefully terminate](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
-and will respect the `PodDisruptionBudgets` you have specified.
+and will respect the PodDisruptionBudgets you have specified.

 {{< note >}}
-By default `kubectl drain` will ignore certain system pods on the node
+By default `kubectl drain` ignores certain system pods on the node
 that cannot be killed; see
 the [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain)
 documentation for more details.
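For context, a minimal drain-and-restore sequence looks roughly like the sketch below; `node-1` is a placeholder node name, and only the most commonly needed flag is shown:

```bash
# Mark the node unschedulable and evict its Pods, respecting PodDisruptionBudgets.
# --ignore-daemonsets is usually required because DaemonSet-managed Pods cannot be
# drained away (their controller would immediately recreate them on the node).
kubectl drain node-1 --ignore-daemonsets

# ...perform the maintenance (kernel upgrade, hardware work, and so on)...

# Allow new Pods to be scheduled onto the node again.
kubectl uncordon node-1
```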
@@ -78,29 +76,29 @@ The `kubectl drain` command should only be issued to a single node at a
 time. However, you can run multiple `kubectl drain` commands for
 different nodes in parallel, in different terminals or in the
 background. Multiple drain commands running concurrently will still
-respect the `PodDisruptionBudget` you specify.
+respect the PodDisruptionBudget you specify.

 For example, if you have a StatefulSet with three replicas and have
-set a `PodDisruptionBudget` for that set specifying `minAvailable:
-2`. `kubectl drain` will only evict a pod from the StatefulSet if all
-three pods are ready, and if you issue multiple drain commands in
-parallel, Kubernetes will respect the PodDisruptionBudget and ensure
-that only one pod is unavailable at any given time. Any drains that
-would cause the number of ready replicas to fall below the specified
-budget are blocked.
+set a PodDisruptionBudget for that set specifying `minAvailable: 2`,
+`kubectl drain` only evicts a pod from the StatefulSet if all three
+replica Pods are ready; if you then issue multiple drain commands in
+parallel, Kubernetes respects the PodDisruptionBudget and ensures
+that only 1 (calculated as `replicas - minAvailable`) Pod is unavailable
+at any given time. Any drains that would cause the number of ready
+replicas to fall below the specified budget are blocked.

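To make the StatefulSet example above concrete, here is an illustrative sketch; the label, budget name, and node names are placeholders rather than anything defined on this page:

```bash
# Hypothetical StatefulSet whose three Pods carry the label app=zk.
# Create a budget that keeps at least 2 of those Pods available.
kubectl create poddisruptionbudget zk-pdb \
  --selector=app=zk \
  --min-available=2

# Drain two nodes concurrently; the evictions are still gated by the budget,
# so at most one of the three Pods is unavailable at any given time.
kubectl drain node-1 --ignore-daemonsets &
kubectl drain node-2 --ignore-daemonsets &
wait
```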
-## The Eviction API
+## The Eviction API {#eviction-api}

 If you prefer not to use [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain) (such as
 to avoid calling to an external command, or to get finer control over the pod
 eviction process), you can also programmatically cause evictions using the eviction API.

-You should first be familiar with using [Kubernetes language clients](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api).
+You should first be familiar with using [Kubernetes language clients](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api) to access the API.

 The eviction subresource of a
-pod can be thought of as a kind of policy-controlled DELETE operation on the pod
-itself. To attempt an eviction (perhaps more REST-precisely, to attempt to
-*create* an eviction), you POST an attempted operation. Here's an example:
+Pod can be thought of as a kind of policy-controlled DELETE operation on the Pod
+itself. To attempt an eviction (more precisely: to attempt to
+*create* an Eviction), you POST an attempted operation. Here's an example:

 ```json
 {
@@ -116,21 +114,19 @@ itself. To attempt an eviction (perhaps more REST-precisely, to attempt to
 You can attempt an eviction using `curl`:

 ```bash
-curl -v -H 'Content-type: application/json' http://127.0.0.1:8080/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
+curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
 ```

 The API can respond in one of three ways:

-- If the eviction is granted, then the pod is deleted just as if you had sent
-  a `DELETE` request to the pod's URL and you get back `200 OK`.
+- If the eviction is granted, then the Pod is deleted just as if you had sent
+  a `DELETE` request to the Pod's URL and you get back `200 OK`.
 - If the current state of affairs wouldn't allow an eviction by the rules set
   forth in the budget, you get back `429 Too Many Requests`. This is
   typically used for generic rate limiting of *any* requests, but here we mean
   that this request isn't allowed *right now* but it may be allowed later.
   Currently, callers do not get any `Retry-After` advice, but they may in
   future versions.
-- If there is some kind of misconfiguration, like multiple budgets pointing at
-  the same pod, you will get `500 Internal Server Error`.
+- If there is some kind of misconfiguration; for example, multiple PodDisruptionBudgets
+  that refer to the same Pod, you get a `500 Internal Server Error` response.

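A hedged sketch of how a script might branch on those three responses; the endpoint, token, CA file, and `eviction.json` body are placeholders in the same spirit as the curl example above:

```bash
# Placeholders: TOKEN holds a bearer token, ca.crt is the cluster CA certificate,
# and eviction.json contains an Eviction body like the one shown earlier.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' \
  --cacert ca.crt \
  -H "Authorization: Bearer ${TOKEN}" \
  -H 'Content-Type: application/json' \
  -d @eviction.json \
  https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction)

case "$STATUS" in
  200|201) echo "eviction granted; the Pod is being deleted" ;;
  429)     echo "blocked by a PodDisruptionBudget right now; retry later" ;;
  500)     echo "misconfiguration (for example, overlapping budgets); investigate" ;;
  *)       echo "unexpected response: $STATUS" ;;
esac
```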
 For a given eviction request, there are two cases:

@@ -139,21 +135,25 @@ For a given eviction request, there are two cases:
 - There is at least one budget. In this case, any of the three above responses may
  apply.

-In some cases, an application may reach a broken state where it will never return anything
-other than 429 or 500. This can happen, for example, if the replacement pod created by the
-application's controller does not become ready, or if the last pod evicted has a very long
-termination grace period.
+## Stuck evictions
+
+In some cases, an application may reach a broken state, one where unless you intervene the
+eviction API will never return anything other than 429 or 500.
+
+For example: this can happen if a ReplicaSet is creating Pods for your application but
+the replacement Pods do not become `Ready`. You can also see similar symptoms if the
+last Pod evicted has a very long termination grace period.

 In this case, there are two potential solutions:

-- Abort or pause the automated operation. Investigate the reason for the stuck application, and restart the automation.
-- After a suitably long wait, `DELETE` the pod instead of using the eviction API.
+- Abort or pause the automated operation. Investigate the reason for the stuck application,
+  and restart the automation.
+- After a suitably long wait, `DELETE` the Pod from your cluster's control plane, instead
+  of using the eviction API.

 Kubernetes does not specify what the behavior should be in this case; it is up to the
 application owners and cluster owners to establish an agreement on behavior in these cases.

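If you do fall back to the second option, the last-resort step is an ordinary delete against the API server; the Pod and namespace names below are placeholders:

```bash
# Last resort only, after a suitably long wait and once you understand why the
# eviction keeps failing: delete the Pod directly, bypassing the eviction API.
# This does not consult the PodDisruptionBudget.
kubectl delete pod my-app-2 --namespace my-namespace
```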

 ## {{% heading "whatsnext" %}}


@@ -162,4 +162,3 @@ application owners and cluster owners to establish an agreement on behavior in these cases.



-