From 7b92c4650324f298ef99e296be1cd686e705abce Mon Sep 17 00:00:00 2001
From: Leon Barrett
Date: Fri, 10 Jul 2020 14:09:29 -0700
Subject: [PATCH] Fix description of back-off count reset

By carefully reading the code in `job_controller.go`, I finally
understood that the back-off count is reset when `forget` is true,
which happens when `active` or `successful` changes without any new
failures at that moment. That happens in this code:

https://github.com/kubernetes/kubernetes/blob/dd649bb7ef4788bfe65c93ebc974962d64476b39/pkg/controller/job/job_controller.go#L588

That behavior does not match what this document says. My change fixes
the doc to match the code.

It might be better to fix the behavior to match the doc, since the
actual behavior is awkward to describe. But I imagine the Kubernetes
team will need to weigh factors I'm not aware of before deciding to
change Job back-off behavior, so I am not going to the effort of
proposing a change like that.
---
 content/en/docs/concepts/workloads/controllers/job.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/en/docs/concepts/workloads/controllers/job.md b/content/en/docs/concepts/workloads/controllers/job.md
index a55759c300..8f6a8c11fe 100644
--- a/content/en/docs/concepts/workloads/controllers/job.md
+++ b/content/en/docs/concepts/workloads/controllers/job.md
@@ -215,8 +215,8 @@ To do so, set `.spec.backoffLimit` to specify the number of retries before
 considering a Job as failed. The back-off limit is set by default to 6. Failed
 Pods associated with the Job are recreated by the Job controller with an
 exponential back-off delay (10s, 20s, 40s ...) capped at six minutes. The
-back-off count is reset if no new failed Pods appear before the Job's next
-status check.
+back-off count is reset when a Job's Pod is deleted or successful without any
+other Pods for the Job failing around that time.
 
 {{< note >}}
 If your job has `restartPolicy = "OnFailure"`, keep in mind that your container running the Job