From 2836e48c3f0b067fa80941a1fb637d5926a3f1ac Mon Sep 17 00:00:00 2001
From: Piotr
Date: Fri, 9 Feb 2024 12:02:53 +0100
Subject: [PATCH] update wsl2 nvidia doc (#19365)

* Update WSL2 + Nvidia doc

The Nvidia drivers have been generally available for a long while now,
so change our doc to reflect that. Also add a usage example which
better illustrates what can be done with the feature.

Signed-off-by: Piotr Stankiewicz

* Add a top-level GPU support doc

Currently the WSL2 + nvidia doc is not easily discoverable for people
browsing our docs. So add a top-level GPU support page and move the
WSL2 + nvidia doc there.

Signed-off-by: Piotr Stankiewicz

---------

Signed-off-by: Piotr Stankiewicz
---
 content/desktop/gpu.md         | 62 ++++++++++++++++++++++++++++++++++
 content/desktop/wsl/use-wsl.md | 51 +---------------------------
 data/toc.yaml                  |  2 ++
 3 files changed, 65 insertions(+), 50 deletions(-)
 create mode 100644 content/desktop/gpu.md

diff --git a/content/desktop/gpu.md b/content/desktop/gpu.md
new file mode 100644
index 0000000000..8ca2ffc3c2
--- /dev/null
+++ b/content/desktop/gpu.md
@@ -0,0 +1,62 @@
+---
+title: GPU support in Docker Desktop
+description: How to use a GPU in Docker Desktop
+keywords: gpu, gpu support, nvidia, wsl2, docker desktop, windows
+toc_max: 3
+---
+
+> **Note**
+>
+> Currently, GPU support in Docker Desktop is only available on Windows with the WSL2 backend.
+
+## Using NVIDIA GPUs with WSL2
+
+Docker Desktop for Windows supports WSL 2 GPU Paravirtualization (GPU-PV) on NVIDIA GPUs. To enable WSL 2 GPU Paravirtualization, you need:
+
+- A machine with an NVIDIA GPU
+- An up-to-date Windows 10 or Windows 11 installation
+- [Up-to-date drivers](https://developer.nvidia.com/cuda/wsl) from NVIDIA supporting WSL 2 GPU Paravirtualization
+- The latest version of the WSL 2 Linux kernel.
+  Use `wsl --update` on the command line
+- Make sure the [WSL 2 backend is turned on](wsl/index.md#turn-on-docker-desktop-wsl-2) in Docker Desktop
+
+To validate that everything works as expected, execute a `docker run` command with the `--gpus=all` flag. For example, the following runs a short benchmark on your GPU:
+
+```console
+$ docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
+```
+
+The output will be similar to:
+
+```console
+Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
+	-fullscreen       (run n-body simulation in fullscreen mode)
+	-fp64             (use double precision floating point values for simulation)
+	-hostmem          (stores simulation data in host memory)
+	-benchmark        (run benchmark to measure performance)
+	-numbodies=<N>    (number of bodies (>= 1) to run in simulation)
+	-device=<d>       (where d=0,1,2.... for the CUDA device to use)
+	-numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
+	-compare          (compares simulation results running once on the default GPU and once on the CPU)
+	-cpu              (run n-body simulation on the CPU)
+	-tipsy=<file.bin> (load a tipsy model file for simulation)
+
+> NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
+
+> Windowed mode
+> Simulation data stored in video memory
+> Single precision floating point simulation
+> 1 Devices used for simulation
+MapSMtoCores for SM 7.5 is undefined. Default to use 64 Cores/SM
+GPU Device 0: "GeForce RTX 2060 with Max-Q Design" with compute capability 7.5
+
+> Compute 7.5 CUDA device: [GeForce RTX 2060 with Max-Q Design]
+30720 bodies, total time for 10 iterations: 69.280 ms
+= 136.219 billion interactions per second
+= 2724.379 single-precision GFLOP/s at 20 flops per interaction
+```
+
+Alternatively, for something more useful, you can use the official [Ollama image](https://hub.docker.com/r/ollama/ollama) to run the Llama 2 large language model.
+
+```console
+$ docker run --gpus=all -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
+$ docker exec -it ollama ollama run llama2
+```
diff --git a/content/desktop/wsl/use-wsl.md b/content/desktop/wsl/use-wsl.md
index f35f74779f..f82692ca65 100644
--- a/content/desktop/wsl/use-wsl.md
+++ b/content/desktop/wsl/use-wsl.md
@@ -1,7 +1,7 @@
 ---
 title: Use WSL
 description: How to develop with Docker and WSL 2 and understand GPU support for WSL
-keywords: wsl, wsl 2, gpu support, develop, docker desktop, windows
+keywords: wsl, wsl 2, develop, docker desktop, windows
 ---
 
 ## Develop with Docker and WSL 2
@@ -23,52 +23,3 @@ The following section describes how to start developing your applications using
 
    Alternatively, you can type the name of your default Linux distro in your Start menu, open it, and then run `code` .
 
 3. When you are in VS Code, you can use the terminal in VS Code to pull your code and start working natively from your Windows machine.
-
-## GPU support
-
-> **Note**
->
-> GPU support is only available in Docker Desktop for Windows with the WSL2 backend.
-
-With Docker Desktop version 3.1.0 and later, WSL 2 GPU Paravirtualization (GPU-PV) on NVIDIA GPUs is supported. To enable WSL 2 GPU Paravirtualization, you need:
-
-- A machine with an NVIDIA GPU
-- The latest Windows Insider version from the Dev Preview ring
-- [Beta drivers](https://developer.nvidia.com/cuda/wsl) from NVIDIA supporting WSL 2 GPU Paravirtualization
-- Update WSL 2 Linux kernel to the latest version using `wsl --update` from an elevated command prompt
-- Make sure the WSL 2 backend is turned on in Docker Desktop
-
-To validate that everything works as expected, run the following command to run a short benchmark on your GPU:
-
-```console
-$ docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
-```
-The following displays:
-
-```console
-Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-	-fullscreen       (run n-body simulation in fullscreen mode)
-	-fp64             (use double precision floating point values for simulation)
-	-hostmem          (stores simulation data in host memory)
-	-benchmark        (run benchmark to measure performance)
-	-numbodies=<N>    (number of bodies (>= 1) to run in simulation)
-	-device=<d>       (where d=0,1,2.... for the CUDA device to use)
-	-numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
-	-compare          (compares simulation results running once on the default GPU and once on the CPU)
-	-cpu              (run n-body simulation on the CPU)
-	-tipsy=<file.bin> (load a tipsy model file for simulation)
-
-> NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
-
-> Windowed mode
-> Simulation data stored in video memory
-> Single precision floating point simulation
-> 1 Devices used for simulation
-MapSMtoCores for SM 7.5 is undefined. Default to use 64 Cores/SM
-GPU Device 0: "GeForce RTX 2060 with Max-Q Design" with compute capability 7.5
-
-> Compute 7.5 CUDA device: [GeForce RTX 2060 with Max-Q Design]
-30720 bodies, total time for 10 iterations: 69.280 ms
-= 136.219 billion interactions per second
-= 2724.379 single-precision GFLOP/s at 20 flops per interaction
-```
\ No newline at end of file
diff --git a/data/toc.yaml b/data/toc.yaml
index b9cd142270..8c57d899ce 100644
--- a/data/toc.yaml
+++ b/data/toc.yaml
@@ -1160,6 +1160,8 @@ Manuals:
           title: Use WSL
         - path: /desktop/wsl/best-practices/
           title: Best practices
+    - path: /desktop/gpu/
+      title: GPU support
     - sectiontitle: Additional resources
       section:
       - path: /desktop/kubernetes/
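As a cross-check on the benchmark figures quoted in the new doc (30720 bodies, 10 iterations, 69.280 ms, 20 flops per interaction), the derived throughput lines follow directly from the raw numbers. A minimal sketch, not part of the patch, that recomputes them (to within rounding of the reported elapsed time):

```python
# Recompute the nbody benchmark's derived throughput from its raw figures.
# The sample reports n^2 * iterations body-body interactions over the
# elapsed time, at 20 flops per interaction (as stated in its own output).
numbodies = 30720
iterations = 10
elapsed_s = 69.280 / 1000  # reported total time, converted to seconds

interactions_per_s = numbodies**2 * iterations / elapsed_s
gflops = interactions_per_s * 20 / 1e9

print(f"= {interactions_per_s / 1e9:.3f} billion interactions per second")
print(f"= {gflops:.3f} single-precision GFLOP/s at 20 flops per interaction")
```

This reproduces the ~136.2 billion interactions/s and ~2724 GFLOP/s lines shown in the sample output, which is a handy sanity check when comparing runs across GPUs.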