Fix https://github.com/kubeflow/kubeflow/issues/6366
Migrating to Kubebuilder v3 leads to the following changes:
- Add .dockerignore file.
- Upgrade Go version from v1.15 to v1.17.
- Adapt Makefile.
- Add image (build + push) target to makefile.
- Upgrade EnvTest to use K8s v1.22.
- Update PROJECT template.
- Migrate CRD apiVersion from v1beta to v1.
- Add livenessProbe and readinessProbe to controller manager.
- Upgrade controller-runtime from v0.2.0 to v0.11.0.
Other changes:
- Build image using public.ecr.aws registry instead of gcr.io.
- Update README.md documentation.
- Update 3rd party licences.
- Fix notebook.spec description.
- Add 3 sample notebooks (v1, v1alpha1 and v1beta1).
Signed-off-by: Samuel Veloso <svelosol@redhat.com>
* notebooks: Update notebook if timestamp changed
We don't want to be updating the spec of the notebook if the timestamp
hasn't changed, since this will lead to constant updates and
reconciliation loops.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Use a deep-copy of the notebook spec
The controller should use a deep-copy of the notebook spec when
calculating the spec for the StatefulSet. If not then we could
update the notebook object without wanting it, since the spec could have
been changed when calculating the STS spec.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Add prefix env var only if missing
The controller should be setting OR updating the NB_PREFIX env var.
Previously it would always blindly append it to the spec, which could
result in double entries for the same env var.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Update image's tag in make
Modify Makefile to update properly the TAG
based on the git TAG.
Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Expose last-activity
Extend the notebook-controller to:
* cull idle Notebook Servers based on their new `last-activity`
annotation
* expose the last activity of each Notebook Server as an annotation
on the metadata of the corresponding CR object
Modify notebook_controller.go to:
* update the Last Activity of each Notebook Server that has a
Running pod
* delete the Last Activity Annotation for every Notebook Server
that does not have a Running pod
Extend culler.go to:
* perform culling based on the new `last-activity` annotation and
not based on the `/api/status` endpoint.
* update the last activity of a Notebook Server, based on the
kernels' execution states.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Athanasios Markou <athamark@arrikto.com>
* notebooks: Introduce a DEV env var
We introduce a DEV ENV var to allow admins
develop and test on their local machine their
custom Notebook Controller.
We provide information and instructions inside
the components/notebook-controller/README.md.
Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Add unit tests for last-activity
* Introduce new tests for allKernelsAreIdle()
* Extend the tests for NotebookIsIdle() and for
NotebookNeedsCulling().
Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* review: UpdateNotebookLastActivityAnnotation()
Ensure that UpdateNotebookLastActivityAnnotation() does not return
"true". This function should not return any value.
Signed-off-by: Athanasios Markou <athamark@arrikto.com>
* Create a culler as a package
Helper functions for culling resources. Takes for granted that ISTIO is
installed to the system and queries Prometheus to get metrics.
Specifically, requests/{configurable time}.
If the resource should be culled, then it should be done by setting an
annotation. This way the UIs can also show that the Resource is stopping
and also easily stop a resource by making a PATCH request.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Culling logic enhancements
Add necessary ENV Vars. Culling won't happen by default. To enable it
the user will need to set the ENABLE_CULLING=true
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Misc fixes in logging and comment cleanup
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Fix typo
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Add Notebooks specific culling
Query the /api/status endpoint of each Server
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Remove the generic culling logic
We need to discuss if it would make sense to have this logic as a go
library, or use knative.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Add unit tests
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Remove unused code
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Review changes #1
* rename `getEnvDef` to `getEnvDefault`
* Add a comment to describe how the STOP_ANNOTATION gets used
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Make cluster domain configurable
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* added detailes into NotebookCondition to keep track of notebook container status change
* update notebook controller image
* fix conitions update
* small fix
* temporary changes to debug
* temporary remove delete step from workflow for debugging
* temoraray merging kfctl-test and kfctl-go-test fir debugging
* debugging
* undo the mistake
* debugging
* debugging tests
* merged kfctl-test and kfctl-go-test
* remove wait-for-kubeflow
* merged with master
* remove test delete step for debugging
* small fix
* update jupyter test component
* update condition test for jupyter component
* revert back deleting step
* revert back change in kfctl.sh
* added some temporary change to debug jupyter-test
* revert back temp changes
* profile and Istio integration
* make profile manage Istio gateway
* add README.md
* make notebooks use gateway in kubeflow namespace
* gateway format to ns/name; add watch for istio ServiceRoleBinding
* Support setting auth header format via parameter
* update README
* update README
* update readme; resolve comments
* added ReadyReplicas status to notebook-controller
* fixed issues related to updating the notebook status
* fixed a problem in updating Notebook's status
* applied cr comments
* small change
* small formating change