GA TTLAfterFinish

Sahil Vazirani 2021-10-10 23:09:52 -07:00
parent 5ad633bf55
commit 11117310a5
GPG Key ID: F4A02D44E1745387
3 changed files with 27 additions and 31 deletions

@@ -142,3 +142,5 @@ documents the format of CronJob `schedule` fields.
 For instructions on creating and working with cron jobs, and for an example of CronJob
 manifest, see [Running automated tasks with cron jobs](/docs/tasks/job/automated-tasks-with-cron-jobs).
+For instructions to clean up failed or completed jobs automatically, see
+[Clean up Jobs automatically](/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically).

@@ -1,75 +1,68 @@
 ---
 reviewers:
 - janetkuo
-title: TTL Controller for Finished Resources
+title: Automatic Clean-up for Finished Jobs
 content_type: concept
 weight: 70
 ---
 <!-- overview -->
-{{< feature-state for_k8s_version="v1.21" state="beta" >}}
+{{< feature-state for_k8s_version="v1.23" state="stable" >}}
-The TTL controller provides a TTL (time to live) mechanism to limit the lifetime of resource
-objects that have finished execution. TTL controller only handles
-{{< glossary_tooltip text="Jobs" term_id="job" >}} for now,
-and may be expanded to handle other resources that will finish execution,
-such as Pods and custom resources.
-This feature is currently beta and enabled by default, and can be disabled via
-[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
-`TTLAfterFinished` in both kube-apiserver and kube-controller-manager.
+TTL-after-finished {{<glossary_tooltip text="controller" term_id="controller">}} provides a
+TTL (time to live) mechanism to limit the lifetime of resource objects that
+have finished execution. TTL controller only handles
+{{< glossary_tooltip text="Jobs" term_id="job" >}}.
 <!-- body -->
-## TTL Controller
+## TTL-after-finished Controller
-The TTL controller only supports Jobs for now. A cluster operator can use this feature to clean
+The TTL-after-finished controller is only supported for Jobs. A cluster operator can use this feature to clean
 up finished Jobs (either `Complete` or `Failed`) automatically by specifying the
 `.spec.ttlSecondsAfterFinished` field of a Job, as in this
 [example](/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically).
-The TTL controller will assume that a resource is eligible to be cleaned up
-TTL seconds after the resource has finished, in other words, when the TTL has expired. When the
-TTL controller cleans up a resource, it will delete it cascadingly, that is to say it will delete
-its dependent objects together with it. Note that when the resource is deleted,
+The TTL-after-finished controller will assume that a job is eligible to be cleaned up
+TTL seconds after the job has finished, in other words, when the TTL has expired. When the
+TTL-after-finished controller cleans up a job, it will delete it cascadingly, that is to say it will delete
+its dependent objects together with it. Note that when the job is deleted,
 its lifecycle guarantees, such as finalizers, will be honored.
 The TTL seconds can be set at any time. Here are some examples for setting the
 `.spec.ttlSecondsAfterFinished` field of a Job:
-* Specify this field in the resource manifest, so that a Job can be cleaned up
+* Specify this field in the job manifest, so that a Job can be cleaned up
 automatically some time after it finishes.
-* Set this field of existing, already finished resources, to adopt this new
+* Set this field of existing, already finished jobs, to adopt this new
 feature.
 * Use a
 [mutating admission webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks)
-to set this field dynamically at resource creation time. Cluster administrators can
-use this to enforce a TTL policy for finished resources.
+to set this field dynamically at job creation time. Cluster administrators can
+use this to enforce a TTL policy for finished jobs.
 * Use a
 [mutating admission webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks)
-to set this field dynamically after the resource has finished, and choose
-different TTL values based on resource status, labels, etc.
+to set this field dynamically after the job has finished, and choose
+different TTL values based on job status, labels, etc.
 ## Caveat
 ### Updating TTL Seconds
 Note that the TTL period, e.g. `.spec.ttlSecondsAfterFinished` field of Jobs,
-can be modified after the resource is created or has finished. However, once the
+can be modified after the job is created or has finished. However, once the
 Job becomes eligible to be deleted (when the TTL has expired), the system won't
 guarantee that the Jobs will be kept, even if an update to extend the TTL
 returns a successful API response.
 ### Time Skew
-Because TTL controller uses timestamps stored in the Kubernetes resources to
+Because TTL-after-finished controller uses timestamps stored in the Kubernetes jobs to
 determine whether the TTL has expired or not, this feature is sensitive to time
-skew in the cluster, which may cause TTL controller to clean up resource objects
+skew in the cluster, which may cause TTL-after-finished controller to clean up job objects
 at the wrong time.
-In Kubernetes, it's required to run NTP on all nodes
-(see [#6159](https://github.com/kubernetes/kubernetes/issues/6159#issuecomment-93844058))
-to avoid time skew. Clocks aren't always correct, but the difference should be
+Clocks aren't always correct, but the difference should be
 very small. Please be aware of this risk when setting a non-zero TTL.
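
As an illustration of the `.spec.ttlSecondsAfterFinished` field that the page changed above documents, a Job manifest might look like the following sketch. The Job name, image, command, and the 100-second TTL are assumptions chosen for illustration; they are not part of this commit.

```yaml
# Hypothetical Job that becomes eligible for automatic clean-up
# 100 seconds after it finishes (either Complete or Failed).
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl            # illustrative name
spec:
  ttlSecondsAfterFinished: 100 # TTL in seconds, counted from the time the Job finishes
  template:
    spec:
      containers:
      - name: pi
        image: perl            # illustrative workload
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

With a manifest like this, the Job and its dependent objects (such as its Pods) become eligible for cascading deletion once the TTL expires. Setting the field to `0` makes the Job eligible for deletion immediately after it finishes, and leaving it unset means the Job is not cleaned up by this mechanism.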

@@ -190,8 +190,6 @@ different Kubernetes components.
 | `StorageVersionHash` | `true` | Beta | 1.15 | |
 | `SuspendJob` | `false` | Alpha | 1.21 | 1.21 |
 | `SuspendJob` | `true` | Beta | 1.22 | |
-| `TTLAfterFinished` | `false` | Alpha | 1.12 | 1.20 |
-| `TTLAfterFinished` | `true` | Beta | 1.21 | |
 | `TopologyAwareHints` | `false` | Alpha | 1.21 | |
 | `TopologyManager` | `false` | Alpha | 1.16 | 1.17 |
 | `TopologyManager` | `true` | Beta | 1.18 | |
@@ -439,6 +437,9 @@ different Kubernetes components.
 | `SupportPodPidsLimit` | `true` | GA | 1.20 | - |
 | `Sysctls` | `true` | Beta | 1.11 | 1.20 |
 | `Sysctls` | `true` | GA | 1.21 | |
+| `TTLAfterFinished` | `false` | Alpha | 1.12 | 1.20 |
+| `TTLAfterFinished` | `true` | Beta | 1.21 | 1.22 |
+| `TTLAfterFinished` | `true` | GA | 1.23 | - |
 | `TaintBasedEvictions` | `false` | Alpha | 1.6 | 1.12 |
 | `TaintBasedEvictions` | `true` | Beta | 1.13 | 1.17 |
 | `TaintBasedEvictions` | `true` | GA | 1.18 | - |