* Implement a culling controller for notebooks
Changes:
* Move the idleness/culling logic into a separate controller
as part of the Notebooks Controller/Operator.
* Introduce a "notebooks.kubeflow.org/last_activity_check_timestamp"
annotation in each Notebook CR to keep the timestamp of the last
performed idleness check.
The controller can then compare this timestamp with the current time to
ensure that notebooks will get reconciled every IDLENESS_CHECK_PERIOD
minutes.
The culling-controller will:
* reconcile only Notebook CRs
* set/update culling annotations
- 'notebooks.kubeflow.org/last_activity'
- 'notebooks.kubeflow.org/last_activity_check_timestamp'
* perform idleness checks every 'IDLENESS_CHECK_PERIOD' minutes
and set the 'kubeflow-resource-stopped' annotation, if a notebook
needs to be culled.
Refs: kubeflow/kubeflow#6767
Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
* review: Remove culling annotations when Pod is not found
Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
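As an illustration of the cleanup described in this review item, the sketch below drops both culling annotations from an annotation map. The helper name removeCullingAnnotations and its boolean return value are assumptions for the example; the real controller would additionally need to persist the change on the Notebook CR.

```go
package main

import "fmt"

// Culling annotation keys named in the commit message.
var cullingAnnotations = []string{
	"notebooks.kubeflow.org/last_activity",
	"notebooks.kubeflow.org/last_activity_check_timestamp",
}

// removeCullingAnnotations deletes the culling annotations from the map and
// reports whether anything was removed, so a caller can skip a needless
// update. (Hypothetical helper, not the controller's actual function.)
func removeCullingAnnotations(annotations map[string]string) bool {
	removed := false
	for _, key := range cullingAnnotations {
		if _, ok := annotations[key]; ok {
			delete(annotations, key)
			removed = true
		}
	}
	return removed
}

func main() {
	ann := map[string]string{
		"notebooks.kubeflow.org/last_activity": "2022-01-01T09:00:00Z",
		"example.com/unrelated":                "kept",
	}
	fmt.Println(removeCullingAnnotations(ann)) // true
	fmt.Println(len(ann))                      // 1
	fmt.Println(removeCullingAnnotations(ann)) // false
}
```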
* review: Improve logs
Add a log message at the beginning of the reconciliation loop
to make it clear that a Reconcile was called for a notebook.
Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
* Run the controller locally
* Introduce make rule for running the controller locally with
culling enabled
* Introduce a dev_culling_authorization_policy which must be
applied when testing the culling-controller locally
Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
* Update README instructions
Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
Fix https://github.com/kubeflow/kubeflow/issues/6366
Migrating to Kubebuilder v3 leads to the following changes:
- Add .dockerignore file.
- Upgrade Go version from v1.15 to v1.17.
- Adapt Makefile.
- Add image (build + push) target to Makefile.
- Upgrade EnvTest to use K8s v1.22.
- Update PROJECT template.
- Migrate CRD apiVersion from v1beta1 to v1.
- Add livenessProbe and readinessProbe to controller manager.
- Upgrade controller-runtime from v0.2.0 to v0.11.0.
Other changes:
- Build image using public.ecr.aws registry instead of gcr.io.
- Update README.md documentation.
- Update 3rd party licences.
- Fix notebook.spec description.
- Add 3 sample notebooks (v1, v1alpha1 and v1beta1).
Signed-off-by: Samuel Veloso <svelosol@redhat.com>
The default leader election ID is controller-leader-election-helper, which could conflict when multiple controllers run within the same namespace. The leader election ID is also a required field in later versions of controller-runtime.