Update document for GPU support

The current guide was created two years ago, and its content is out of date.
2018-12-21 15:08:07 +08:00 · 3b21a193d0
parent 7af77bb79c
commit 3b21a193d0
1 changed files with 3 additions and 72 deletions
--- a/docs/gpu.md
+++ b/docs/gpu.md
@@ -1,74 +1,5 @@
-# GPU support
+# GPU Support

-```
-kops create cluster gpu.example.com --zones us-east-1c --node-size p2.xlarge --node-count 1 --kubernetes-version 1.6.1
-```
+You can use [kops hooks](./cluster_spec.md#hooks) to install [Nvidia kubernetes device plugin](https://github.com/NVIDIA/k8s-device-plugin) and enable GPU support in cluster.

-(Note that the p2.xlarge instance type is not cheap, but no GPU instances are)
-
-You can use the experimental hooks feature to install the nvidia drivers:
-
-`> kops edit cluster gpu.example.com`
-```
-spec:
-...
-  hooks:
-  - execContainer:
-      image: kopeio/nvidia-bootstrap:1.6
-```
-
-(TODO: Only on instance groups, or have nvidia-bootstrap detect if GPUs are present..)
-
-In addition, you will likely want to set the `Accelerators=true` feature-flag to kubelet:
-
-`> kops edit cluster gpu.example.com`
-```
-spec:
-...
-  kubelet:
-    featureGates:
-      Accelerators: "true"
-```
-
-`> kops update cluster gpu.example.com --yes`
-
-
-Here is an example pod that runs tensorflow; note that it mounts libcuda from the host:
-
-(TODO: Is there some way to have a well-known volume or similar?)
-
-```
-apiVersion: v1
-kind: Pod
-metadata:
-  name: tf
-spec:
-  containers:
-  - image: gcr.io/tensorflow/tensorflow:1.0.1-gpu
-    imagePullPolicy: IfNotPresent
-    name: gpu
-    command:
-    - /bin/bash
-    - -c
-    - "cp -d /rootfs/usr/lib/x86_64-linux-gnu/libcuda.* /usr/lib/x86_64-linux-gnu/ && cp -d /rootfs/usr/lib/x86_64-linux-gnu/libnvidia* /usr/lib/x86_64-linux-gnu/ &&/run_jupyter.sh"
-    resources:
-      limits:
-        cpu: 2000m
-        alpha.kubernetes.io/nvidia-gpu: 1
-    volumeMounts:
-    - name: rootfs-usr-lib
-      mountPath: /rootfs/usr/lib
-  volumes:
-    - name: rootfs-usr-lib
-      hostPath:
-        path: /usr/lib
-```
-
-To use this particular tensorflow image, you should port-forward and get the URL from the log:
-
-```
-kubectl port-forward tf 8888 &
-kubectl logs tf
-```
-
-And browse to the URL printed
+See instructions in [kops hooks for nvidia-device-plugin](../hooks/nvidia-device-plugin).