From c2f55492bd269a2b128d2972f1613a43e1b87e64 Mon Sep 17 00:00:00 2001 From: Shannon Kularathna Date: Fri, 4 Jun 2021 18:40:39 +0000 Subject: [PATCH 1/2] Add information to API evictions --- .../scheduling-eviction/api-eviction.md | 57 +++++++++++++++++-- 1 file changed, 52 insertions(+), 5 deletions(-) diff --git a/content/en/docs/concepts/scheduling-eviction/api-eviction.md b/content/en/docs/concepts/scheduling-eviction/api-eviction.md index e7f1942df2..d51821557a 100644 --- a/content/en/docs/concepts/scheduling-eviction/api-eviction.md +++ b/content/en/docs/concepts/scheduling-eviction/api-eviction.md @@ -6,14 +6,61 @@ weight: 70 {{< glossary_definition term_id="api-eviction" length="short" >}}
-You can request eviction by directly calling the Eviction API -using a client of the kube-apiserver, like the `kubectl drain` command. -This creates an `Eviction` object, which causes the API server to terminate the Pod. +You can request eviction by calling the Eviction API directly, or programmatically +using a client of the kube-apiserver, like the `kubectl drain` command. This +creates an `Eviction` object, which causes the API server to terminate the Pod. API-initiated evictions respect your configured [`PodDisruptionBudgets`](/docs/tasks/run-application/configure-pdb/) and [`terminationGracePeriodSeconds`](/docs/concepts/workloads/pods/pod-lifecycle#pod-termination). +Using the API to create an Eviction object for a Pod is like performing a +policy-controlled DELETE operation on the Pod. + +## Calling the Eviction API + +You can use a [Kubernetes language client](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api) +to access the Kubernetes API and create an `Eviction` object. To do this, you +POST the attempted operation. + +Alternatively, you can attempt an eviction operation by accessing the API using +`curl` or `wget`. + +## How API-initiated eviction works + +When you attempt to create an `Eviction` object, the API responds in one of the +following ways: + +* `200 OK`: the eviction is allowed and the Pod is deleted, similar to sending a + `DELETE` request to the Pod URL. +* `429 Too Many Requests`: the eviction is not currently allowed because of the + configured PodDisruptionBudget. You may be able to attempt the eviction again + later. +* `500 Internal Server Error`: the eviction is not allowed because there is a + misconfiguration, like if multiple PodDisruptionBudgets reference the same Pod. + +If the Pod you want to evict doesn't have a PodDisruptionBudget, the server always +returns `200 OK` and allows the eviction. + +[[Need more information about the eviction object. 
Once it's created, what happens +to cause the Pod to shut down? What control plane components work to get the job done?]] + +## Troubleshooting stuck evictions + +In some cases, your applications may enter a broken state, where the Eviction +API will only return `429` or `500` responses until you intervene. This can +happen if, for example, a ReplicaSet creates pods for your application but new +pods do not enter a `Ready` state. You may also notice this behavior in cases +where the last evicted Pod had a long termination grace period. + +If you notice stuck evictions, try one of the following solutions: + +* Abort or pause the automated operation causing the issue. Investigate the stuck + application before you restart the operation. +* Directly delete the Pod from your cluster control plane instead of using the + Eviction API. + ## {{% heading "whatsnext" %}} -* Learn about [Node-pressure Eviction](/docs/concepts/scheduling-eviction/node-pressure-eviction/) -* Learn about [Pod Priority and Preemption](/docs/concepts/scheduling-eviction/pod-priority-preemption/) +* Learn how to protect your applications with a [Pod Disruption Budget](/docs/tasks/run-application/configure-pdb/). +* Learn about [Node-pressure Eviction](/docs/concepts/scheduling-eviction/node-pressure-eviction/). +* Learn about [Pod Priority and Preemption](/docs/concepts/scheduling-eviction/pod-priority-preemption/). 
From d6345808f45b8634d890c8bba2937fac3628b4a5 Mon Sep 17 00:00:00 2001 From: Shannon Kularathna Date: Fri, 4 Jun 2021 18:40:39 +0000 Subject: [PATCH 2/2] Add information to API evictions --- .../scheduling-eviction/api-eviction.md | 88 ++++++++++++++---- .../administer-cluster/safely-drain-node.md | 91 +------------------ 2 files changed, 75 insertions(+), 104 deletions(-) diff --git a/content/en/docs/concepts/scheduling-eviction/api-eviction.md b/content/en/docs/concepts/scheduling-eviction/api-eviction.md index d51821557a..b4e92c40bf 100644 --- a/content/en/docs/concepts/scheduling-eviction/api-eviction.md +++ b/content/en/docs/concepts/scheduling-eviction/api-eviction.md @@ -7,42 +7,98 @@ weight: 70 {{< glossary_definition term_id="api-eviction" length="short" >}}
You can request eviction by calling the Eviction API directly, or programmatically -using a client of the kube-apiserver, like the `kubectl drain` command. This +using a client of the kube-apiserver, like the `kubectl drain` command. This creates an `Eviction` object, which causes the API server to terminate the Pod. API-initiated evictions respect your configured [`PodDisruptionBudgets`](/docs/tasks/run-application/configure-pdb/) and [`terminationGracePeriodSeconds`](/docs/concepts/workloads/pods/pod-lifecycle#pod-termination). Using the API to create an Eviction object for a Pod is like performing a -policy-controlled DELETE operation on the Pod. +policy-controlled [`DELETE` operation](/docs/reference/kubernetes-api/workload-resources/pod-v1/#delete-delete-a-pod) +on the Pod. ## Calling the Eviction API You can use a [Kubernetes language client](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api) to access the Kubernetes API and create an `Eviction` object. To do this, you -POST the attempted operation. +POST the attempted operation, similar to the following example: + +{{< tabs name="Eviction_example" >}} +{{% tab name="policy/v1" %}} +{{< note >}} +`policy/v1` Eviction is available in v1.22+. Use `policy/v1beta1` with prior releases. +{{< /note >}} + +```json +{ + "apiVersion": "policy/v1", + "kind": "Eviction", + "metadata": { + "name": "quux", + "namespace": "default" + } +} +``` +{{% /tab %}} +{{% tab name="policy/v1beta1" %}} +{{< note >}} +Deprecated in v1.22 in favor of `policy/v1` +{{< /note >}} + +```json +{ + "apiVersion": "policy/v1beta1", + "kind": "Eviction", + "metadata": { + "name": "quux", + "namespace": "default" + } +} +``` +{{% /tab %}} +{{< /tabs >}} Alternatively, you can attempt an eviction operation by accessing the API using -`curl` or `wget`. 
+`curl` or `wget`, similar to the following example: + +```bash +curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json +``` ## How API-initiated eviction works -When you attempt to create an `Eviction` object, the API responds in one of the -following ways: +When you request an eviction using the API, the API server performs admission +checks and responds in one of the following ways: -* `200 OK`: the eviction is allowed and the Pod is deleted, similar to sending a - `DELETE` request to the Pod URL. +* `200 OK`: the eviction is allowed, the `Eviction` subresource is created, and + the Pod is deleted, similar to sending a `DELETE` request to the Pod URL. * `429 Too Many Requests`: the eviction is not currently allowed because of the - configured PodDisruptionBudget. You may be able to attempt the eviction again - later. + configured PodDisruptionBudget. + You may be able to attempt the eviction again later. You might also see this + response because of API rate limiting. * `500 Internal Server Error`: the eviction is not allowed because there is a misconfiguration, like if multiple PodDisruptionBudgets reference the same Pod. -If the Pod you want to evict doesn't have a PodDisruptionBudget, the server always -returns `200 OK` and allows the eviction. +If the Pod you want to evict isn't part of a workload that has a +PodDisruptionBudget, the API server always returns `200 OK` and allows the +eviction. -[[Need more information about the eviction object. Once it's created, what happens -to cause the Pod to shut down? What control plane components work to get the job done?]] +If the API server allows the eviction, the Pod is deleted as follows: + +1. The `Pod` resource in the API server is updated with a deletion timestamp, + after which the API server considers the `Pod` resource to be terminated. The + `Pod` resource is also marked with the configured grace period. +1. 
The kubelet on the node where the local Pod is running notices that the `Pod` + resource is marked for termination and starts to gracefully shut down the + local Pod. +1. While the kubelet is shutting the Pod down, the control plane removes the Pod + from Endpoint and + EndpointSlice + objects. As a result, controllers no longer consider the Pod as a valid object. +1. After the grace period for the Pod expires, the kubelet forcefully terminates + the local Pod. +1. The kubelet tells the API server to remove the `Pod` resource. +1. The API server deletes the `Pod` resource. ## Troubleshooting stuck evictions @@ -56,8 +112,8 @@ If you notice stuck evictions, try one of the following solutions: * Abort or pause the automated operation causing the issue. Investigate the stuck application before you restart the operation. -* Directly delete the Pod from your cluster control plane instead of using the - Eviction API. +* Wait a while, then directly delete the Pod from your cluster control plane + instead of using the Eviction API. ## {{% heading "whatsnext" %}} diff --git a/content/en/docs/tasks/administer-cluster/safely-drain-node.md b/content/en/docs/tasks/administer-cluster/safely-drain-node.md index 04c908c592..74d1694b08 100644 --- a/content/en/docs/tasks/administer-cluster/safely-drain-node.md +++ b/content/en/docs/tasks/administer-cluster/safely-drain-node.md @@ -23,8 +23,6 @@ This task also assumes that you have met the following prerequisites: and have [configured PodDisruptionBudgets](/docs/tasks/run-application/configure-pdb/) for applications that need them. - - ## (Optional) Configure a disruption budget {#configure-poddisruptionbudget} @@ -100,95 +98,12 @@ replicas to fall below the specified budget are blocked. 
If you prefer not to use [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain) (such as to avoid calling to an external command, or to get finer control over the pod -eviction process), you can also programmatically cause evictions using the eviction API. +eviction process), you can also programmatically cause evictions using the +eviction API. -You should first be familiar with using [Kubernetes language clients](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api) to access the API. - -The eviction subresource of a -Pod can be thought of as a kind of policy-controlled DELETE operation on the Pod -itself. To attempt an eviction (more precisely: to attempt to -*create* an Eviction), you POST an attempted operation. Here's an example: - -{{< tabs name="Eviction_example" >}} -{{% tab name="policy/v1" %}} -{{< note >}} -`policy/v1` Eviction is available in v1.22+. Use `policy/v1beta1` with prior releases. -{{< /note >}} - -```json -{ - "apiVersion": "policy/v1", - "kind": "Eviction", - "metadata": { - "name": "quux", - "namespace": "default" - } -} -``` -{{% /tab %}} -{{% tab name="policy/v1beta1" %}} -{{< note >}} -Deprecated in v1.22 in favor of `policy/v1` -{{< /note >}} - -```json -{ - "apiVersion": "policy/v1beta1", - "kind": "Eviction", - "metadata": { - "name": "quux", - "namespace": "default" - } -} -``` -{{% /tab %}} -{{< /tabs >}} - -You can attempt an eviction using `curl`: - -```bash -curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json -``` - -The API can respond in one of three ways: - -- If the eviction is granted, then the Pod is deleted as if you sent - a `DELETE` request to the Pod's URL and received back `200 OK`. -- If the current state of affairs wouldn't allow an eviction by the rules set - forth in the budget, you get back `429 Too Many Requests`. 
This is - typically used for generic rate limiting of *any* requests, but here we mean - that this request isn't allowed *right now* but it may be allowed later. -- If there is some kind of misconfiguration; for example multiple PodDisruptionBudgets - that refer the same Pod, you get a `500 Internal Server Error` response. - -For a given eviction request, there are two cases: - -- There is no budget that matches this pod. In this case, the server always - returns `200 OK`. -- There is at least one budget. In this case, any of the three above responses may - apply. - -## Stuck evictions - -In some cases, an application may reach a broken state, one where unless you intervene the -eviction API will never return anything other than 429 or 500. - -For example: this can happen if ReplicaSet is creating Pods for your application but -the replacement Pods do not become `Ready`. You can also see similar symptoms if the -last Pod evicted has a very long termination grace period. - -In this case, there are two potential solutions: - -- Abort or pause the automated operation. Investigate the reason for the stuck application, - and restart the automation. -- After a suitably long wait, `DELETE` the Pod from your cluster's control plane, instead - of using the eviction API. - -Kubernetes does not specify what the behavior should be in this case; it is up to the -application owners and cluster owners to establish an agreement on behavior in these cases. +For more information, see [API-initiated eviction](/docs/concepts/scheduling-eviction/api-eviction/). ## {{% heading "whatsnext" %}} - * Follow steps to protect your application by [configuring a Pod Disruption Budget](/docs/tasks/run-application/configure-pdb/).
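
The request and response handling that this patch documents for the Eviction API could be sketched in a client as follows. This is a minimal, illustrative Python sketch and is not part of the patch. It builds the `Eviction` body and request path from the documented examples and maps the `200`, `429`, and `500` responses to actions. The function names, the `api_server` and `token` arguments, and the bearer-token authentication are assumptions made for this example; a real cluster may need different authentication and TLS settings.

```python
import json
import urllib.error
import urllib.request


def build_eviction(pod_name, namespace="default", api_version="policy/v1"):
    """Build an Eviction body shaped like the documented JSON example."""
    return {
        "apiVersion": api_version,
        "kind": "Eviction",
        "metadata": {"name": pod_name, "namespace": namespace},
    }


def eviction_path(pod_name, namespace="default"):
    """Return the eviction subresource path used in the documented curl call."""
    return f"/api/v1/namespaces/{namespace}/pods/{pod_name}/eviction"


def interpret_status(status):
    """Map the documented HTTP status codes to a hypothetical action label."""
    if status == 200:
        return "evicted"        # eviction allowed, Pod is being deleted
    if status == 429:
        return "retry-later"    # blocked by a PodDisruptionBudget or rate limiting
    if status == 500:
        return "misconfigured"  # for example, several budgets match the same Pod
    return "unexpected"         # any status the documentation does not describe


def evict(api_server, token, pod_name, namespace="default"):
    """POST an Eviction and return the action label (assumes bearer-token auth)."""
    request = urllib.request.Request(
        api_server.rstrip("/") + eviction_path(pod_name, namespace),
        data=json.dumps(build_eviction(pod_name, namespace)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return interpret_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses such as 429 and 500.
        return interpret_status(err.code)
```

A caller might retry after a delay on `retry-later` and stop to investigate on `misconfigured`, in line with the guidance on stuck evictions.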