Building a new GPU image (e.g. update CUDA)
Install Packer first: https://developer.hashicorp.com/packer/tutorials/docker-get-started/get-started-install-cli
Then install the Amazon plugin for Packer: `packer plugins install github.com/hashicorp/amazon`
- Maybe adjust the script in `gpu/scripts/install-nvidia-docker.sh`
- Check GPU instance compatibility with the CUDA version
- The instance type (the GPUs) in `gpu/buildkite-gpu-ami.pkr.hcl` should match the ones defined in `../ci-stack-module/main.tf`
- Build using packer: `cd gpu && packer build buildkite-gpu-ami.pkr.hcl`
Note that this builds the new image on top of the latest Buildkite agent AMI (not our own - theirs!). This means we may have to update the Buildkite CI stack version as well, as the scripts might otherwise mismatch.
For this, go to `../ci-stack-module-variables.tf` and change `elastic_ci_stack_version` to the latest version.
- Set the resulting image ID for the GPU queues in `ci-stack-module/main.tf` (search for `ImageId`)
- Make sure to annotate the image ID.
- If you had any failing builds, consider removing the failed AMI or snapshots, if necessary.

Then deploy using `terraform apply`
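
The instance-type check from the steps above can be partly automated. The following is only a minimal sketch: the function names `gpu_instance_types` and `check_instance_types` are hypothetical, and it assumes that both files spell out AWS GPU instance types (the `g*` and `p*` families) as literal names such as `g4dn.xlarge`. If the types come from variables or locals, the check will not see them.

```shell
# Print the unique GPU instance types mentioned in a file, one per line.
# Assumption: types are literal names like g4dn.xlarge or p3.2xlarge.
gpu_instance_types() {
  grep -owE '[gp][0-9][a-z0-9]*\.([0-9]*xlarge|metal)' "$1" | sort -u
}

# Succeed only if every GPU instance type found in the Packer template ($1)
# also appears in the Terraform file ($2).
check_instance_types() {
  types=$(gpu_instance_types "$1")
  if [ -z "$types" ]; then
    echo "No GPU instance types found in $1" >&2
    return 2
  fi
  missing=""
  for t in $types; do
    if ! gpu_instance_types "$2" | grep -qxF "$t"; then
      missing="$missing $t"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Not found in $2:$missing" >&2
    return 1
  fi
  echo "Instance types match"
}
```

Run from the directory containing this README, a call could look like `check_instance_types gpu/buildkite-gpu-ami.pkr.hcl ../ci-stack-module/main.tf`. It only compares names; whether the GPUs are compatible with the CUDA version still has to be checked by hand.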