Commit Graph

22 Commits

Author SHA1 Message Date
Ciprian Hacman 61708eae6b Rename kops to kOps in the docs 2020-10-29 19:40:53 +02:00
Pratik raj e9bafc6785 Some debian package manager tweaks
By default, "apt" and "apt-get" on Ubuntu or Debian based systems install recommended but not suggested packages.

By passing "--no-install-recommends" option, the user lets apt-get know not to consider recommended packages as a dependency to install.

This results in smaller downloads and smaller package installations.

Refer to the post on the [Ubuntu Blog](https://ubuntu.com/blog/we-reduced-our-docker-images-by-60-with-no-install-recommends).

Signed-off-by: Pratik Raj <rajpratik71@gmail.com>
2020-08-18 23:36:18 +05:30
Huihua Zhang 9b0055ecbe HTTPS for NVIDIA drivers 2020-04-17 23:20:37 +08:00
Huihua Zhang 14e5330e19 Minor changes 2020-04-17 23:14:09 +08:00
Huihua Zhang 7d9ad2b14b Docker image explanation 2020-04-17 22:07:35 +08:00
Huihua Zhang 350d8e6f66 Addressed comments 2020-04-17 21:31:48 +08:00
Huihua Zhang ac457e9460 Updated README and Makefile 2020-04-17 15:57:55 +08:00
Huihua Zhang ffbd7d7988 Upgrade CUDA from 9.1 to 10.0 2020-04-16 09:02:54 +00:00
Hanfei Shen 7eb054b988 retry nvidia-device-plugin.sh when failed 2019-11-22 15:09:35 +08:00
tanjunchen 7f64de4c34 fix-up some spelling mistakes 2019-09-29 21:45:47 +08:00
Hanfei Shen bd3c3cd4b6 skip verification when file installed 2019-08-08 14:05:57 +08:00
Hanfei Shen 7b8c520e20 stop kubelet to prevent orphan containers
xref #6728

When installing the nvidia docker runtime, kubelet will try to restart
stopped containers via old docker. This will leak those containers as
orphan containers.
2019-07-30 22:04:40 +08:00
Adrian Lyjak da0e2cee17 set specific versions to avoid https://github.com/kubernetes/kops/issues/6767 2019-04-11 20:04:47 -04:00
Robert Everson 89359ace7e Support g3s for gpu driver installation 2019-02-27 10:13:51 -08:00
k8s-ci-robot a0fcf95064 Merge pull request #5502 from dcwangmit01/gpu-device-plugins-3
Implemented Nvidia DevicePlugin GPU Support on AWS
2018-11-19 04:51:16 -08:00
Steve Winslow 4f99988e2a Clarify license statement for nvidia-bootstrap hook
Signed-off-by: Steve Winslow <swinslow@gmail.com>
2018-10-25 15:08:13 -04:00
David C Wang 13a89f22ae Implemented Nvidia DevicePlugin GPU Support
* Supports DevicePlugin GPU Mode AND Legacy Accelerators GPU Mode
2018-07-23 23:31:43 +00:00
David C Wang ef958a7f87 Fixed missing early exit upon unhandled file 2018-04-12 15:18:57 +00:00
David C Wang 69ab306eac Updated Kops GPU Setup Hook
* Changed Dockerfile base image to debian for systemctl and bash.
* Added autodetect of AWS ec2 instanceclass p2, p3, g3.
* For each detected instance class, added the installation of the proper driver
  according to the specific NVIDIA hardware.
  - G3 instance types require Nvidia Grid Series/Grid K520 drivers
  - P2 instance types require Nvidia Tesla K-Series drivers
  - P3 instance types require Nvidia Tesla V-Series drivers
* Set custom nvidia-smi configurations according to nvidia hardware per ec2
  instanceclass, according to the AWS GPU optimization document.
* Added the installation and patches of the latest cuda 9.1 libraries.
* Added restart of kubelet on kube node at end of successful hook run, thereby
  fixing a race condition where kubelet would start before the Nvidia drivers
  were loaded, thus not allowing kubernetes to detect GPUs on the kube node.
* Ensured build of nvidia drivers used the same gcc version as that which built
  the default kops kernel.
* Fixed issue where *every* run of this container would download all the NVIDIA
  drivers + cuda libs (1GB+), by caching the files on the kube node.
* Fixed issue where after reboot, subsequent runs of this script would fail
  because mknod would try to create a previously-created device node and fail.
  This previously caused a download loop as systemd perpetually restarted the
  unit upon failure.
* Tested with p2.xlarge, p3.2xlarge, and g3.4xlarge
2018-04-11 19:45:57 +00:00
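The instance-class autodetection described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the hook's actual code: the function names and the lookup table are hypothetical, and only the class-to-driver pairing (G3, P2, P3) is taken from the commit message.

```python
from typing import Optional

# Hypothetical lookup table; the driver family names are copied from the
# commit message, everything else here is an assumption for illustration.
DRIVER_FAMILY_BY_CLASS = {
    "g3": "Nvidia Grid Series/Grid K520",
    "p2": "Nvidia Tesla K-Series",
    "p3": "Nvidia Tesla V-Series",
}


def instance_class(instance_type: str) -> str:
    """Return the class prefix of an EC2 instance type, e.g. 'p3' for 'p3.2xlarge'."""
    return instance_type.split(".", 1)[0]


def driver_family(instance_type: str) -> Optional[str]:
    """Return the driver family for a known GPU class, or None if unhandled."""
    return DRIVER_FAMILY_BY_CLASS.get(instance_class(instance_type))


if __name__ == "__main__":
    # The three instance types the commit reports as tested.
    for tested in ("p2.xlarge", "p3.2xlarge", "g3.4xlarge"):
        print(tested, "->", driver_family(tested))
```

Returning None for an unrecognised class leaves the caller free to exit early instead of installing a mismatched driver.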
Justin Santa Barbara 5a056a3872 Bump all our base docker images 2017-11-28 02:41:03 -05:00
Justin Santa Barbara 456a4635d5 hook for prepulling images 2017-04-19 20:43:56 -04:00
Justin Santa Barbara 9294102139 Experimental nvidia driver installation via a hook
With sha3sum validation technique thanks to @bskaggs
2017-04-19 00:43:59 -04:00