Merge pull request #23755 from sftim/20200908_revise_task_safely_drain_node

Revise node draining task page
This commit is contained in:
Kubernetes Prow Robot 2020-10-12 22:14:26 -07:00 committed by GitHub
commit 720812a0ce
1 changed file with 39 additions and 40 deletions


reviewers:
- kow3ns
title: Safely Drain a Node while Respecting the PodDisruptionBudget
content_type: task
min-kubernetes-server-version: 1.5
---
<!-- overview -->
This page shows how to safely drain a {{< glossary_tooltip text="node" term_id="node" >}},
respecting the PodDisruptionBudget you have defined.
## {{% heading "prerequisites" %}}
{{% version-check %}}

This task also assumes that you have met the following prerequisites:
1. You do not require your applications to be highly available during the
node drain, or
1. You have read about the [PodDisruptionBudget](/docs/concepts/workloads/pods/disruptions/) concept,
and have [configured PodDisruptionBudgets](/docs/tasks/run-application/configure-pdb/) for
applications that need them.
You can use `kubectl drain` to safely evict all of your pods from a
node before you perform maintenance on the node (e.g. kernel upgrade,
hardware maintenance, etc.). Safe evictions allow the pod's containers
to [gracefully terminate](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
and will respect the PodDisruptionBudgets you have specified.
{{< note >}}
By default `kubectl drain` ignores certain system pods on the node
that cannot be killed; see
the [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain)
documentation for more details.
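{{< /note >}}

As a sketch, a typical drain workflow looks like this (`<node-name>` is a placeholder; the exact flags you need depend on your cluster and your kubectl version):

```shell
# Mark the node unschedulable and evict its pods, skipping
# DaemonSet-managed pods (which cannot be evicted)
kubectl drain <node-name> --ignore-daemonsets

# ...perform maintenance on the node, e.g. a kernel upgrade...

# Allow pods to be scheduled onto the node again
kubectl uncordon <node-name>
```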
The `kubectl drain` command should only be issued to a single node at a
time. However, you can run multiple `kubectl drain` commands for
different nodes in parallel, in different terminals or in the
background. Multiple drain commands running concurrently will still
respect the PodDisruptionBudget you specify.
For example, if you have a StatefulSet with three replicas and have
set a PodDisruptionBudget for that set specifying `minAvailable: 2`,
`kubectl drain` only evicts a pod from the StatefulSet if all three
replicas are ready; if you then issue multiple drain commands in
parallel, Kubernetes respects the PodDisruptionBudget and ensures
that only 1 (calculated as `replicas - minAvailable`) Pod is unavailable
at any given time. Any drains that would cause the number of ready
replicas to fall below the specified budget are blocked.
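For illustration, a PodDisruptionBudget matching that scenario might look like the following sketch (the object name and label selector are assumptions for this example):

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  # At most replicas - minAvailable = 1 Pod may be evicted at a time
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
```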
## The Eviction API {#eviction-api}
If you prefer not to use [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain) (such as
to avoid calling out to an external command, or to get finer control over the pod
eviction process), you can also programmatically cause evictions using the eviction API.
You should first be familiar with using [Kubernetes language clients](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api) to access the API.
The eviction subresource of a
Pod can be thought of as a kind of policy-controlled DELETE operation on the Pod
itself. To attempt an eviction (more precisely: to attempt to
*create* an Eviction), you POST an attempted operation. Here's an example:
```json
{
  "apiVersion": "policy/v1beta1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}
```

You can attempt an eviction using `curl`:
```bash
curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
```
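If you are scripting evictions, you can generate the request body rather than writing it by hand. Here is a minimal Python sketch (the function name is illustrative; `policy/v1beta1` is the Eviction API version used in the example above):

```python
import json

def eviction_body(pod_name: str, namespace: str) -> str:
    """Build the JSON body for an Eviction request, matching the example above."""
    return json.dumps({
        "apiVersion": "policy/v1beta1",
        "kind": "Eviction",
        "metadata": {"name": pod_name, "namespace": namespace},
    })

print(eviction_body("quux", "default"))
```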
The API can respond in one of three ways:
- If the eviction is granted, then the Pod is deleted just as if you had sent
  a `DELETE` request to the Pod's URL and you get back `200 OK`.
- If the current state of affairs wouldn't allow an eviction by the rules set
forth in the budget, you get back `429 Too Many Requests`. This is
typically used for generic rate limiting of *any* requests, but here we mean
that this request isn't allowed *right now* but it may be allowed later.
Currently, callers do not get any `Retry-After` advice, but they may in
future versions.
- If there is some kind of misconfiguration, for example multiple PodDisruptionBudgets
  that refer to the same Pod, you get a `500 Internal Server Error` response.
For a given eviction request, there are two cases:
- There is no budget that matches this pod. In this case, the server always
  returns `200 OK`.
- There is at least one budget. In this case, any of the three above responses may
apply.
## Stuck evictions

In some cases, an application may reach a broken state, one where unless you intervene the
eviction API will never return anything other than 429 or 500.

For example: this can happen if a ReplicaSet is creating Pods for your application but
the replacement Pods do not become `Ready`. You can also see similar symptoms if the
last Pod evicted has a very long termination grace period.
In this case, there are two potential solutions:
- Abort or pause the automated operation. Investigate the reason for the stuck application,
  and restart the automation.
- After a suitably long wait, `DELETE` the Pod from your cluster's control plane, instead
  of using the eviction API.
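As an illustration of the second option, deleting the Pod from the earlier eviction example could look like this (the Pod name and namespace follow that example and are placeholders):

```shell
# Bypass the eviction API and delete the stuck Pod directly
kubectl delete pod quux --namespace default
```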
Kubernetes does not specify what the behavior should be in this case; it is up to the
application owners and cluster owners to establish an agreement on behavior in these cases.
## {{% heading "whatsnext" %}}