Merge pull request #26958 from alculquicondor/indexed-job
Documentation for Indexed completion mode
commit bfb10f4d5e
|
@ -145,8 +145,8 @@ There are three main types of task suitable to run as a Job:
|
|||
- the Job is complete as soon as its Pod terminates successfully.
|
||||
1. Parallel Jobs with a *fixed completion count*:
|
||||
- specify a non-zero positive value for `.spec.completions`.
|
||||
- the Job represents the overall task, and is complete when there is one successful Pod for each value in the range 1 to `.spec.completions`.
|
||||
- **not implemented yet:** Each Pod is passed a different index in the range 1 to `.spec.completions`.
|
||||
- the Job represents the overall task, and is complete when there are `.spec.completions` successful Pods.
|
||||
- when using `.spec.completionMode="Indexed"`, each Pod gets a different index in the range 0 to `.spec.completions-1`.
|
||||
1. Parallel Jobs with a *work queue*:
|
||||
- do not specify `.spec.completions`, default to `.spec.parallelism`.
|
||||
- the Pods must coordinate amongst themselves or an external service to determine what each should work on. For example, a Pod might fetch a batch of up to N items from the work queue.
|
||||
|
@ -166,7 +166,6 @@ a non-negative integer.
|
|||
|
||||
For more information about how to make use of the different types of job, see the [job patterns](#job-patterns) section.
|
||||
|
||||
|
||||
#### Controlling parallelism
|
||||
|
||||
The requested parallelism (`.spec.parallelism`) can be set to any non-negative value.
|
||||
|
@ -185,6 +184,33 @@ parallelism, for a variety of reasons:
|
|||
- The Job controller may throttle new Pod creation due to excessive previous pod failures in the same Job.
|
||||
- When a Pod is gracefully shut down, it takes time to stop.
|
||||
|
||||
### Completion mode
|
||||
|
||||
{{< feature-state for_k8s_version="v1.21" state="alpha" >}}
|
||||
|
||||
{{< note >}}
|
||||
To be able to create Indexed Jobs, make sure to enable the `IndexedJob`
|
||||
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
|
||||
on the [API server](/docs/reference/command-line-tools-reference/kube-apiserver/)
|
||||
and the [controller manager](/docs/reference/command-line-tools-reference/kube-controller-manager/).
|
||||
{{< /note >}}
|
||||
|
||||
Jobs with _fixed completion count_ - that is, jobs that have non-null
|
||||
`.spec.completions` - can have a completion mode that is specified in `.spec.completionMode`:
|
||||
|
||||
- `NonIndexed` (default): the Job is considered complete when there have been
|
||||
`.spec.completions` successfully completed Pods. In other words, each Pod
|
||||
completion is interchangeable with any other. Note that Jobs that have null
|
||||
`.spec.completions` are implicitly `NonIndexed`.
|
||||
- `Indexed`: the Pods of a Job get an associated completion index from 0 to
|
||||
`.spec.completions-1`, available in the annotation `batch.kubernetes.io/job-completion-index`.
|
||||
The Job is considered complete when there is one successfully completed Pod
|
||||
for each index. For more information about how to use this mode, see
|
||||
[Indexed Job for Parallel Processing with Static Work Assignment](/docs/tasks/job/indexed-parallel-processing-static/).
|
||||
Note that, although rare, more than one Pod could be started for the same
|
||||
index, but only one of them will count towards the completion count.
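
The completion rule for `Indexed` mode can be sketched in shell. This is only an illustration under simple assumptions, not how the Job controller is implemented; the function name `indexed_job_complete` and its arguments are hypothetical.

```shell
# Hypothetical illustration of the Indexed completion rule.
# Usage: indexed_job_complete COMPLETIONS SUCCEEDED_INDEX...
# Returns 0 when every index from 0 to COMPLETIONS-1 appears at least once.
indexed_job_complete() {
  completions="$1"
  shift
  i=0
  while [ "$i" -lt "$completions" ]; do
    found=no
    for idx in "$@"; do
      if [ "$idx" -eq "$i" ]; then
        found=yes
        break
      fi
    done
    # A missing index means the Job is not complete yet.
    [ "$found" = yes ] || return 1
    i=$((i + 1))
  done
  return 0
}

# Index 1 succeeded twice, but it only counts once; all of 0..2 are covered.
if indexed_job_complete 3 0 1 1 2; then
  echo "complete"
else
  echo "not complete"
fi
```

Two successful Pods for index 1 do not make up for a missing index 2: `indexed_job_complete 3 0 1 1` returns a non-zero status.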
|
||||
|
||||
|
||||
## Handling Pod and container failures
|
||||
|
||||
A container in a Pod may fail for a number of reasons, such as because the process in it exited with
|
||||
|
@ -348,12 +374,12 @@ The tradeoffs are:
|
|||
The tradeoffs are summarized here, with columns 2 to 4 corresponding to the above tradeoffs.
|
||||
The pattern names are also links to examples and more detailed description.
|
||||
|
||||
| Pattern | Single Job object | Fewer pods than work items? | Use app unmodified? | Works in Kube 1.1? |
| -------------------------------------------------------------------- |:-----------------:|:---------------------------:|:-------------------:|:-------------------:|
| [Job Template Expansion](/docs/tasks/job/parallel-processing-expansion/) | | | ✓ | ✓ |
| [Queue with Pod Per Work Item](/docs/tasks/job/coarse-parallel-processing-work-queue/) | ✓ | | sometimes | ✓ |
| [Queue with Variable Pod Count](/docs/tasks/job/fine-parallel-processing-work-queue/) | ✓ | ✓ | | ✓ |
| Single Job with Static Work Assignment | ✓ | | ✓ | |
|
||||
| Pattern | Single Job object | Fewer pods than work items? | Use app unmodified? |
| ----------------------------------------- |:-----------------:|:---------------------------:|:-------------------:|
| [Queue with Pod Per Work Item] | ✓ | | sometimes |
| [Queue with Variable Pod Count] | ✓ | ✓ | |
| [Indexed Job with Static Work Assignment] | ✓ | | ✓ |
| [Job Template Expansion] | | | ✓ |
|
||||
|
||||
When you specify completions with `.spec.completions`, each Pod created by the Job controller
|
||||
has an identical [`spec`](https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status). This means that
|
||||
|
@ -364,13 +390,17 @@ are different ways to arrange for pods to work on different things.
|
|||
This table shows the required settings for `.spec.parallelism` and `.spec.completions` for each of the patterns.
|
||||
Here, `W` is the number of work items.
|
||||
|
||||
| Pattern | `.spec.completions` | `.spec.parallelism` |
| -------------------------------------------------------------------- |:-------------------:|:--------------------:|
| [Job Template Expansion](/docs/tasks/job/parallel-processing-expansion/) | 1 | should be 1 |
| [Queue with Pod Per Work Item](/docs/tasks/job/coarse-parallel-processing-work-queue/) | W | any |
| [Queue with Variable Pod Count](/docs/tasks/job/fine-parallel-processing-work-queue/) | 1 | any |
| Single Job with Static Work Assignment | W | any |
|
||||
| Pattern | `.spec.completions` | `.spec.parallelism` |
| ----------------------------------------- |:-------------------:|:--------------------:|
| [Queue with Pod Per Work Item] | W | any |
| [Queue with Variable Pod Count] | null | any |
| [Indexed Job with Static Work Assignment] | W | any |
| [Job Template Expansion] | 1 | should be 1 |
|
||||
|
||||
[Queue with Pod Per Work Item]: /docs/tasks/job/coarse-parallel-processing-work-queue/
|
||||
[Queue with Variable Pod Count]: /docs/tasks/job/fine-parallel-processing-work-queue/
|
||||
[Indexed Job with Static Work Assignment]: /docs/tasks/job/indexed-parallel-processing-static/
|
||||
[Job Template Expansion]: /docs/tasks/job/parallel-processing-expansion/
|
||||
|
||||
## Advanced usage
|
||||
|
||||
|
|
|
@ -261,6 +261,7 @@ different Kubernetes components.
|
|||
| `ImmutableEphemeralVolumes` | `false` | Alpha | 1.18 | 1.18 |
|
||||
| `ImmutableEphemeralVolumes` | `true` | Beta | 1.19 | 1.20 |
|
||||
| `ImmutableEphemeralVolumes` | `true` | GA | 1.21 | |
|
||||
| `IndexedJob` | `false` | Alpha | 1.21 | |
|
||||
| `Initializers` | `false` | Alpha | 1.7 | 1.13 |
|
||||
| `Initializers` | - | Deprecated | 1.14 | - |
|
||||
| `KubeletConfigFile` | `false` | Alpha | 1.8 | 1.9 |
|
||||
|
@ -630,10 +631,12 @@ Each feature gate is designed for enabling/disabling a specific feature:
|
|||
- `HyperVContainer`: Enable
|
||||
[Hyper-V isolation](https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/hyperv-container)
|
||||
for Windows containers.
|
||||
- `IPv6DualStack`: Enable [dual stack](/docs/concepts/services-networking/dual-stack/)
|
||||
support for IPv6.
|
||||
- `ImmutableEphemeralVolumes`: Allows for marking individual Secrets and ConfigMaps as
|
||||
immutable for better safety and performance.
|
||||
- `IndexedJob`: Allows the [Job](/docs/concepts/workloads/controllers/job/)
|
||||
controller to manage Pod completions per completion index.
|
||||
- `IPv6DualStack`: Enable [dual stack](/docs/concepts/services-networking/dual-stack/)
|
||||
support for IPv6.
|
||||
- `KubeletConfigFile` (*deprecated*): Enable loading kubelet configuration from
|
||||
a file specified using a config file.
|
||||
See [setting kubelet parameters via a config file](/docs/tasks/administer-cluster/kubelet-config-file/)
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
title: Coarse Parallel Processing Using a Work Queue
|
||||
min-kubernetes-server-version: v1.8
|
||||
content_type: task
|
||||
weight: 30
|
||||
weight: 20
|
||||
---
|
||||
|
||||
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
title: Fine Parallel Processing Using a Work Queue
|
||||
content_type: task
|
||||
min-kubernetes-server-version: v1.8
|
||||
weight: 40
|
||||
weight: 30
|
||||
---
|
||||
|
||||
<!-- overview -->
|
||||
|
|
|
@ -0,0 +1,176 @@
|
|||
---
|
||||
title: Indexed Job for Parallel Processing with Static Work Assignment
|
||||
content_type: task
|
||||
min-kubernetes-server-version: v1.21
|
||||
weight: 30
|
||||
---
|
||||
|
||||
{{< feature-state for_k8s_version="v1.21" state="alpha" >}}
|
||||
|
||||
<!-- overview -->
|
||||
|
||||
|
||||
In this example, you will run a Kubernetes Job that uses multiple parallel
|
||||
worker processes.
|
||||
Each worker is a different container running in its own Pod. The Pods have an
|
||||
_index number_ that the control plane sets automatically, which allows each Pod
|
||||
to identify which part of the overall task to work on.
|
||||
|
||||
The pod index is available in the {{< glossary_tooltip text="annotation" term_id="annotation" >}}
|
||||
`batch.kubernetes.io/job-completion-index` as a string representing its
|
||||
decimal value. In order for the containerized task process to obtain this index,
|
||||
you can publish the value of the annotation using the [downward API](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#the-downward-api)
|
||||
mechanism.
|
||||
For convenience, the control plane automatically sets the downward API to
|
||||
expose the index in the `JOB_COMPLETION_INDEX` environment variable.
|
||||
|
||||
Here is an overview of the steps in this example:
|
||||
|
||||
1. **Create an image that can read the pod index**. You might modify the worker
|
||||
program or add a script wrapper.
|
||||
2. **Start an Indexed Job**. The downward API allows you to pass the annotation
|
||||
as an environment variable or file to the container.
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
You should already be familiar with the basic,
non-parallel use of a [Job](/docs/concepts/workloads/controllers/job/).
|
||||
|
||||
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
|
||||
|
||||
To be able to create Indexed Jobs, make sure to enable the `IndexedJob`
|
||||
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
|
||||
on the [API server](/docs/reference/command-line-tools-reference/kube-apiserver/)
|
||||
and the [controller manager](/docs/reference/command-line-tools-reference/kube-controller-manager/).
|
||||
|
||||
<!-- steps -->
|
||||
|
||||
## Choose an approach
|
||||
|
||||
To access the work item from the worker program, you have a few options:
|
||||
|
||||
1. Read the `JOB_COMPLETION_INDEX` environment variable. The Job
|
||||
{{< glossary_tooltip text="controller" term_id="controller" >}}
|
||||
automatically links this variable to the annotation containing the completion
|
||||
index.
|
||||
1. Read a file that contains the completion index.
|
||||
1. Assuming that you can't modify the program, you can wrap it with a script
|
||||
that reads the index using any of the methods above and converts it into
|
||||
something that the program can use as input.
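
As a rough sketch of option 3, a wrapper could translate the index into a work item before calling the real program. The helper name `item_for_index` is hypothetical, the item list is the one used later in this example, and the fallback to index 0 is an assumption that only exists so the sketch runs outside a cluster.

```shell
#!/bin/sh
# Hypothetical helper: map a 0-based completion index to a work item.
item_for_index() {
  index="$1"
  # Positional parameters act as a small POSIX-compatible list.
  set -- foo bar baz qux xyz
  if [ "$index" -lt 0 ] || [ "$index" -ge "$#" ]; then
    echo "index out of range: $index" >&2
    return 1
  fi
  # After shifting by the index, the wanted item is the first parameter.
  shift "$index"
  printf '%s\n' "$1"
}

# In a Pod of an Indexed Job, JOB_COMPLETION_INDEX is set for the container.
# The default of 0 is an assumption for running this sketch locally.
item_for_index "${JOB_COMPLETION_INDEX:-0}"
```

A real wrapper would pass the selected item to the unmodified program, for example by writing it to the file that the program reads.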
|
||||
|
||||
For this example, imagine that you chose option 3 and you want to run the
|
||||
[rev](https://man7.org/linux/man-pages/man1/rev.1.html) utility. This
|
||||
program accepts a file as an argument and prints each line of it with the characters reversed.
|
||||
|
||||
```shell
|
||||
rev data.txt
|
||||
```
|
||||
|
||||
For this example, you'll use the `rev` tool from the
|
||||
[`busybox`](https://hub.docker.com/_/busybox) container image.
|
||||
|
||||
## Define an Indexed Job
|
||||
|
||||
Here is a job definition. You'll need to edit the container image to match your
|
||||
preferred registry.
|
||||
|
||||
{{< codenew language="yaml" file="application/job/indexed-job.yaml" >}}
|
||||
|
||||
In the example above, you use the built-in `JOB_COMPLETION_INDEX` environment
|
||||
variable set by the Job controller for all containers. An [init container](/docs/concepts/workloads/pods/init-containers/)
|
||||
maps the index to a static value and writes it to a file that is shared with the
|
||||
container running the worker through an [emptyDir volume](/docs/concepts/storage/volumes/#emptydir).
|
||||
Optionally, you can [define your own environment variable through the downward
|
||||
API](/docs/tasks/inject-data-application/environment-variable-expose-pod-information/)
|
||||
to publish the index to containers. You can also choose to load a list of values
|
||||
from a [ConfigMap as an environment variable or file](/docs/tasks/configure-pod-container/configure-pod-configmap/).
|
||||
|
||||
Alternatively, you can directly [use the downward API to pass the annotation
|
||||
value as a volume file](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/#store-pod-fields),
|
||||
as shown in the following example:
|
||||
|
||||
{{< codenew language="yaml" file="application/job/indexed-job-vol.yaml" >}}
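
If your worker reads the index from such a file itself, it may want to validate the content first. The following is a minimal sketch, assuming the file holds only the decimal index; `read_index_file` is a hypothetical helper name, and `/input/data.txt` is the path used by the manifest in this example.

```shell
# Hypothetical helper: print the completion index stored in a file,
# or fail if the file is unreadable or does not hold a decimal number.
read_index_file() {
  path="$1"
  if [ ! -r "$path" ]; then
    echo "cannot read $path" >&2
    return 1
  fi
  index=$(cat "$path")
  case "$index" in
    ''|*[!0-9]*)
      echo "not a decimal index: $index" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$index"
}

# Only attempt the read when the file exists, so the sketch also runs
# outside a Pod.
if [ -r /input/data.txt ]; then
  read_index_file /input/data.txt
fi
```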
|
||||
|
||||
## Running the Job
|
||||
|
||||
Now run the Job:
|
||||
|
||||
```shell
|
||||
kubectl apply -f ./indexed-job.yaml
|
||||
```
|
||||
|
||||
Wait a bit, then check on the job:
|
||||
|
||||
```shell
|
||||
kubectl describe jobs/indexed-job
|
||||
```
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
Name:              indexed-job
Namespace:         default
Selector:          controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
Labels:            controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
                   job-name=indexed-job
Annotations:       <none>
Parallelism:       3
Completions:       5
Start Time:        Thu, 11 Mar 2021 15:47:34 +0000
Pods Statuses:     2 Running / 3 Succeeded / 0 Failed
Completed Indexes: 0-2
Pod Template:
  Labels:  controller-uid=bf865e04-0b67-483b-9a90-74cfc4c3e756
           job-name=indexed-job
  Init Containers:
   input:
    Image:      docker.io/library/bash
    Port:       <none>
    Host Port:  <none>
    Command:
      bash
      -c
      items=(foo bar baz qux xyz)
      echo ${items[$JOB_COMPLETION_INDEX]} > /input/data.txt

    Environment:  <none>
    Mounts:
      /input from input (rw)
  Containers:
   worker:
    Image:      docker.io/library/busybox
    Port:       <none>
    Host Port:  <none>
    Command:
      rev
      /input/data.txt
    Environment:  <none>
    Mounts:
      /input from input (rw)
  Volumes:
   input:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-njkjj
  Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-9kd4h
  Normal  SuccessfulCreate  4s    job-controller  Created pod: indexed-job-qjwsz
  Normal  SuccessfulCreate  1s    job-controller  Created pod: indexed-job-fdhq5
  Normal  SuccessfulCreate  1s    job-controller  Created pod: indexed-job-ncslj
```
|
||||
|
||||
In this example, you run the Job with custom values for each index. You can
|
||||
inspect the output of the pods:
|
||||
|
||||
```shell
|
||||
kubectl logs indexed-job-fdhq5 # Change this to match the name of a Pod in your cluster.
|
||||
```
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
xuq
|
||||
```
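
To see which output belongs to which index without a cluster, you can simulate the mapping locally. This is a sketch only: `reverse` is a hypothetical pure-shell stand-in for `rev` on a single word, and the item list is copied from the Job definition.

```shell
# Local simulation only: nothing here talks to a cluster.
reverse() {
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"            # everything after the first character
    out="${s%"$rest"}$out"   # prepend the first character to the result
    s="$rest"
  done
  printf '%s\n' "$out"
}

# Walk the same item list as the init container, in index order.
index=0
for item in foo bar baz qux xyz; do
  printf 'index %s: %s\n' "$index" "$(reverse "$item")"
  index=$((index + 1))
done
```

The line for index 3 reads `index 3: xuq`, which matches the sample log output above: that log came from a Pod whose work item was `qux`.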
|
|
@ -2,7 +2,7 @@
|
|||
title: Parallel Processing using Expansions
|
||||
content_type: task
|
||||
min-kubernetes-server-version: v1.8
|
||||
weight: 20
|
||||
weight: 50
|
||||
---
|
||||
|
||||
<!-- overview -->
|
||||
|
|
|
@ -0,0 +1,27 @@
|
|||
apiVersion: batch/v1
kind: Job
metadata:
  name: 'indexed-job'
spec:
  completions: 5
  parallelism: 3
  completionMode: Indexed
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: 'worker'
        image: 'docker.io/library/busybox'
        command:
        - "rev"
        - "/input/data.txt"
        volumeMounts:
        - mountPath: /input
          name: input
      volumes:
      - name: input
        downwardAPI:
          items:
          - path: "data.txt"
            fieldRef:
              fieldPath: metadata.annotations['batch.kubernetes.io/job-completion-index']
|
|
@ -0,0 +1,35 @@
|
|||
apiVersion: batch/v1
kind: Job
metadata:
  name: 'indexed-job'
spec:
  completions: 5
  parallelism: 3
  completionMode: Indexed
  template:
    spec:
      restartPolicy: Never
      initContainers:
      - name: 'input'
        image: 'docker.io/library/bash'
        command:
        - "bash"
        - "-c"
        - |
          items=(foo bar baz qux xyz)
          echo ${items[$JOB_COMPLETION_INDEX]} > /input/data.txt
        volumeMounts:
        - mountPath: /input
          name: input
      containers:
      - name: 'worker'
        image: 'docker.io/library/busybox'
        command:
        - "rev"
        - "/input/data.txt"
        volumeMounts:
        - mountPath: /input
          name: input
      volumes:
      - name: input
        emptyDir: {}
|