website/content/en/docs/tasks/manage-gpus/scheduling-gpus.md

129 lines
5.0 KiB
Markdown

---
reviewers:
- vishh
content_type: concept
title: Schedule GPUs
description: Configure and schedule GPUs for use as a resource by nodes in a cluster.
---
<!-- overview -->
{{< feature-state state="stable" for_k8s_version="v1.26" >}}
Kubernetes includes **stable** support for managing AMD and NVIDIA GPUs
(graphical processing units) across different nodes in your cluster, using
{{< glossary_tooltip text="device plugins" term_id="device-plugin" >}}.
This page describes how users can consume GPUs, and outlines
some of the limitations in the implementation.
<!-- body -->
## Using device plugins
Kubernetes implements device plugins to let Pods access specialized hardware features such as GPUs.
{{% thirdparty-content %}}
As an administrator, you have to install GPU drivers from the corresponding
hardware vendor on the nodes and run the corresponding device plugin from the
GPU vendor. Here are some links to vendors' instructions:
* [AMD](https://github.com/ROCm/k8s-device-plugin#deployment)
* [Intel](https://intel.github.io/intel-device-plugins-for-kubernetes/cmd/gpu_plugin/README.html)
* [NVIDIA](https://github.com/NVIDIA/k8s-device-plugin#quick-start)
Once you have installed the plugin, your cluster exposes a custom schedulable resource such as `amd.com/gpu` or `nvidia.com/gpu`.
You can consume these GPUs from your containers by requesting
the custom GPU resource, the same way you request `cpu` or `memory`.
However, there are some limitations in how you specify the resource
requirements for custom devices.
GPUs are only supposed to be specified in the `limits` section, which means:
* You can specify GPU `limits` without specifying `requests`, because
Kubernetes will use the limit as the request value by default.
* You can specify GPU in both `limits` and `requests` but these two values
must be equal.
* You cannot specify GPU `requests` without specifying `limits`.
Here's an example manifest for a Pod that requests a GPU:
```yaml
apiVersion: v1
kind: Pod
metadata:
name: example-vector-add
spec:
restartPolicy: OnFailure
containers:
- name: example-vector-add
image: "registry.example/example-vector-add:v42"
resources:
limits:
gpu-vendor.example/example-gpu: 1 # requesting 1 GPU
```
## Manage clusters with different types of GPUs
If different nodes in your cluster have different types of GPUs, then you
can use [Node Labels and Node Selectors](/docs/tasks/configure-pod-container/assign-pods-nodes/)
to schedule pods to appropriate nodes.
For example:
```shell
# Label your nodes with the accelerator type they have.
kubectl label nodes node1 accelerator=example-gpu-x100
kubectl label nodes node2 accelerator=other-gpu-k915
```
That label key `accelerator` is just an example; you can use
a different label key if you prefer.
## Automatic node labelling {#node-labeller}
As an administrator, you can automatically discover and label all your GPU enabled nodes
by deploying Kubernetes [Node Feature Discovery](https://github.com/kubernetes-sigs/node-feature-discovery) (NFD).
NFD detects the hardware features that are available on each node in a Kubernetes cluster.
Typically, NFD is configured to advertise those features as node labels, but NFD can also add extended resources, annotations, and node taints.
NFD is compatible with all [supported versions](/releases/version-skew-policy/#supported-versions) of Kubernetes.
By default NFD create the [feature labels](https://kubernetes-sigs.github.io/node-feature-discovery/master/usage/features.html) for the detected features.
Administrators can leverage NFD to also taint nodes with specific features, so that only pods that request those features can be scheduled on those nodes.
You also need a plugin for NFD that adds appropriate labels to your nodes; these might be generic
labels or they could be vendor specific. Your GPU vendor may provide a third party
plugin for NFD; check their documentation for more details.
{{< highlight yaml "linenos=false,hl_lines=7-18" >}}
apiVersion: v1
kind: Pod
metadata:
name: example-vector-add
spec:
restartPolicy: OnFailure
# You can use Kubernetes node affinity to schedule this Pod onto a node
# that provides the kind of GPU that its container needs in order to work
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: "gpu.gpu-vendor.example/installed-memory"
operator: Gt # (greater than)
values: ["40535"]
- key: "feature.node.kubernetes.io/pci-10.present" # NFD Feature label
values: ["true"] # (optional) only schedule on nodes with PCI device 10
containers:
- name: example-vector-add
image: "registry.example/example-vector-add:v42"
resources:
limits:
gpu-vendor.example/example-gpu: 1 # requesting 1 GPU
{{< /highlight >}}
#### GPU vendor implementations
- [Intel](https://intel.github.io/intel-device-plugins-for-kubernetes/cmd/gpu_plugin/README.html)
- [NVIDIA](https://github.com/NVIDIA/k8s-device-plugin)