8 Commits

Author SHA1 Message Date
Ciprian Hacman 61708eae6b Rename kops to kOps in the docs 2020-10-29 19:40:53 +02:00
Pratik raj e9bafc6785 Some Debian package manager tweaks
By default, "apt" and "apt-get" on Ubuntu and Debian-based systems install recommended, but not suggested, packages.

Passing the "--no-install-recommends" option tells apt-get not to treat recommended packages as dependencies to install.

This results in smaller downloads and smaller installations.

Refer to the post on the [Ubuntu Blog](https://ubuntu.com/blog/we-reduced-our-docker-images-by-60-with-no-install-recommends).

Signed-off-by: Pratik Raj <rajpratik71@gmail.com>
2020-08-18 23:36:18 +05:30
Robert Everson 89359ace7e Support g3s for GPU driver installation 2019-02-27 10:13:51 -08:00
Steve Winslow 4f99988e2a Clarify license statement for nvidia-bootstrap hook
Signed-off-by: Steve Winslow <swinslow@gmail.com>
2018-10-25 15:08:13 -04:00
David C Wang ef958a7f87 Fixed missing early exit upon unhandled file 2018-04-12 15:18:57 +00:00
David C Wang 69ab306eac Updated Kops GPU Setup Hook
* Changed Dockerfile base image to Debian for systemctl and bash.
* Added autodetection of the AWS EC2 instance classes p2, p3, and g3.
* For each detected instance class, added installation of the proper driver
  according to the specific NVIDIA hardware.
  - G3 instance types require NVIDIA Grid Series/Grid K520 drivers
  - P2 instance types require NVIDIA Tesla K-Series drivers
  - P3 instance types require NVIDIA Tesla V-Series drivers
* Set custom nvidia-smi configurations according to the NVIDIA hardware per EC2
  instance class, following the AWS GPU optimization document.
* Added installation of the latest CUDA 9.1 libraries and their patches.
* Added a restart of kubelet on the kube node at the end of a successful hook
  run, fixing a race condition where kubelet would start before the NVIDIA
  drivers were loaded, which prevented Kubernetes from detecting GPUs on the
  kube node.
* Ensured the NVIDIA driver build uses the same gcc version as the one that
  built the default kops kernel.
* Fixed an issue where *every* run of this container would download all the
  NVIDIA drivers + CUDA libs (1GB+), by caching the files on the kube node.
* Fixed an issue where, after a reboot, subsequent runs of this script would
  fail because mknod tried to create a device node that already existed.
  This previously caused a download loop, as systemd perpetually restarted the
  unit upon failure.
* Tested with p2.xlarge, p3.2xlarge, and g3.4xlarge.
2018-04-11 19:45:57 +00:00
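The instance-class detection and file caching described in commit 69ab306eac can be illustrated with a minimal sketch. This is not the hook's actual code: the names `DRIVER_FAMILY_BY_CLASS`, `driver_family_for`, and `ensure_cached` are hypothetical. Only the class-to-driver pairing is taken from the commit message.

```python
from pathlib import Path
from typing import Callable, Optional

# Driver families as named in the commit message. The dictionary and the
# function names below are hypothetical, not taken from the hook script.
DRIVER_FAMILY_BY_CLASS = {
    "g3": "Grid Series/Grid K520",
    "p2": "Tesla K-Series",
    "p3": "Tesla V-Series",
}


def driver_family_for(instance_type: str) -> Optional[str]:
    """Return the driver family for an EC2 instance type such as 'p3.2xlarge'.

    Returns None for unhandled instance classes so the caller can exit early.
    """
    # The instance class is the part before the first dot, e.g. 'p3'.
    instance_class = instance_type.split(".", 1)[0].lower()
    return DRIVER_FAMILY_BY_CLASS.get(instance_class)


def ensure_cached(path: Path, fetch: Callable[[], bytes]) -> bool:
    """Write fetch() to path only if path does not exist yet.

    Returns True when a download happened, False on a cache hit.
    """
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(fetch())
    return True
```

Checking for an existing file before fetching is the same idea as the mknod fix in that commit: repeated runs become no-ops instead of failing or downloading again.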
Justin Santa Barbara 5a056a3872 Bump all our base docker images 2017-11-28 02:41:03 -05:00
Justin Santa Barbara 9294102139 Experimental nvidia driver installation via a hook
With sha3sum validation technique thanks to @bskaggs
2017-04-19 00:43:59 -04:00