Fix description of back-off count reset
By carefully reading the code in `job_controller.go`, I finally understood that
the back-off count is reset when `forget` is true, which happens when `active`
or `succeeded` changes without any new failures in the same sync. That happens
in this code:
dd649bb7ef/pkg/controller/job/job_controller.go (L588)
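To make the condition easier to see, here is a condensed paraphrase in Go of my reading of that logic. It is a sketch, not the controller's actual code: the function name `syncOutcome` and the trimmed-down error handling are mine, and the real `syncJob` also tracks Job conditions and whether the Job is finished.

```go
package jobbackoff

import (
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
)

// syncOutcome paraphrases how syncJob decides whether to "forget" a Job's
// key. active, succeeded, and failed are the freshly counted pod totals;
// job.Status still holds the counts recorded by the previous sync.
func syncOutcome(job *batchv1.Job, active, succeeded, failed int32) (bool, error) {
	forget := false

	// A new failure means more failed pods were counted than the last
	// recorded status reflects.
	jobHaveNewFailure := failed > job.Status.Failed

	// More successes since the last sync also clear the back-off delay.
	if job.Status.Succeeded < succeeded {
		forget = true
	}

	statusChanged := job.Status.Active != active ||
		job.Status.Succeeded != succeeded ||
		job.Status.Failed != failed
	if statusChanged {
		// (the real code persists the updated status here)
		if jobHaveNewFailure {
			// Returning an error re-enqueues the Job after the
			// back-off period without resetting the back-off count.
			return forget, fmt.Errorf("failed pod(s) detected for job %q", job.Name)
		}
		// active or succeeded changed with no new failure: reset.
		forget = true
	}
	return forget, nil
}
```

As far as I can tell, when `syncJob` reports `forget == true`, the caller invokes `Forget` on the controller's rate-limited workqueue, and that is what actually resets the exponential back-off for the Job's key.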
That behavior does not match what this document says. My change fixes the doc
to match the code.
It might be better to fix the behavior to match the doc, since the behavior is
awkward to describe. But I imagine the Kubernetes team would need to weigh
factors I'm not aware of before changing the Job back-off behavior, so I won't
go to the effort of proposing a change like that.
parent 6bf2feba74
commit 7b92c46503
@@ -215,8 +215,8 @@ To do so, set `.spec.backoffLimit` to specify the number of retries before
 considering a Job as failed. The back-off limit is set by default to 6. Failed
 Pods associated with the Job are recreated by the Job controller with an
 exponential back-off delay (10s, 20s, 40s ...) capped at six minutes. The
-back-off count is reset if no new failed Pods appear before the Job's next
-status check.
+back-off count is reset when a job pod is deleted or successful without any
+other pods failing around that time.
 
 {{< note >}}
 If your job has `restartPolicy = "OnFailure"`, keep in mind that your container running the Job
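For context on the "exponential back-off delay (10s, 20s, 40s ...) capped at six minutes" sentence kept in the diff above, the schedule amounts to something like the sketch below. This is only an illustration of the arithmetic; the real implementation lives in the rate limiter of the controller's workqueue, not in a helper like this.

```go
package jobbackoff

import "time"

// backoffDelay sketches the documented schedule: the base delay doubles
// with each retry and is capped at six minutes.
func backoffDelay(retries int) time.Duration {
	const base = 10 * time.Second
	const maxDelay = 6 * time.Minute
	delay := base << uint(retries) // 10s, 20s, 40s, ...
	if delay > maxDelay || delay <= 0 { // <= 0 guards shift overflow
		return maxDelay
	}
	return delay
}
```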
|
Loading…
Reference in New Issue