12 Commits

Author SHA1 Message Date
Aldo Culquicondor 8f5bbd8203
Mount SSH Secret directly on main container (#416)
Remove the init container for faster startup.

This is possible by disabling StrictModes in sshd_config.
2021-08-26 15:42:06 -07:00
Aldo Culquicondor c73ef6b0b1
Use fully-qualified label names from common (#409)
2021-08-19 19:01:54 -07:00
Aldo Culquicondor 24bbfe7c27
Increase unit coverage of v2 controller (#406)
2021-08-17 19:13:37 -07:00
Aldo Culquicondor 85aefc60c8
Remove ability to run ranks in launcher (#398)
2021-08-16 13:42:42 -07:00
Aldo Culquicondor b4b62cc302
Pass runPolicy fields to the launcher Job (#392)
* Add runPolicy to MPIJob.spec

* Pass runPolicy fields to the launcher Job
2021-08-13 07:50:54 -07:00
Aldo Culquicondor 3ba33750b5
Manage launcher through k8s Job (#391)
* Ensure restart policy is Never or OnFailure

A restart policy of Always doesn't make sense for Jobs.

* Manage launcher through k8s Job

Still tracking Running status of the job pods.

* Add launcher Pod failed reason
2021-08-12 20:38:54 -07:00
Aldo Culquicondor 990bf1c39d
Add support for Intel MPI (#389)
* Add support for Intel MPI

Adds the field .spec.mpiImplementation, defaults to OpenMPI

The Intel implementation requires a Service fronting the launcher.

* Add an example image that uses Intel MPI
2021-08-03 11:23:41 -07:00
Aldo Culquicondor 9ce646773a
Allow running MPI applications as non-root (#383)
* Allow running MPI applications as non-root

Adds the spec field sshAuthMountPath for MPIJob.
The init script sets the permissions and ownership based on the securityContext of the launcherPod.

* Add pure MPI sample that runs as non-root
2021-07-26 22:35:11 -07:00
Aldo Culquicondor 7b6c1bfe22
Upgrade to apiextensions.k8s.io/v1 (#379)
2021-07-23 14:06:33 -04:00
Aldo Culquicondor 70a866ee52
Downgrade v2 API to v2beta1 (#378)
To leave the path open for improving the API without having to release a v3.
2021-07-16 11:29:46 -04:00
Aldo Culquicondor 6afa62ca0b
Add integration tests for v2 controller (#375)
* Do inter-pod communication through SSH

The controller generates keys and mounts them to the containers. The container images must know how to place the credentials and set file permissions.

* Use init-container instead of entrypoint

* Fix scheme for recorder and defaults

* Add integration tests for v2 controller
2021-07-15 06:43:51 -07:00
Aldo Culquicondor 3d4a4bdb51
Fork v2 controller and API in a new module (#366)
2021-06-22 08:58:51 -04:00