27 Commits

Author SHA1 Message Date
Samu 0215857aa9 Support K8s 1.22 in notebook controller (kubeflow/kubeflow#6374)
Fix https://github.com/kubeflow/kubeflow/issues/6366

Migrating to Kubebuilder v3 leads to the following changes:
- Add .dockerignore file.
- Upgrade Go version from v1.15 to v1.17.
- Adapt Makefile.
- Add image (build + push) target to Makefile.
- Upgrade EnvTest to use K8s v1.22.
- Update PROJECT template.
- Migrate CRD apiVersion from v1beta1 to v1.
- Add livenessProbe and readinessProbe to controller manager.
- Upgrade controller-runtime from v0.2.0 to v0.11.0.

Other changes:
- Build image using public.ecr.aws registry instead of gcr.io.
- Update README.md documentation.
- Update 3rd party licences.
- Fix notebook.spec description.
- Add 3 sample notebooks (v1, v1alpha1 and v1beta1).

Signed-off-by: Samuel Veloso <svelosol@redhat.com>
2022-05-03 15:49:01 +00:00
Kimonas Sotirchos 1d24b75f57 notebooks: Fix endless restarts (kubeflow/kubeflow#6341)
* notebooks: Update notebook if timestamp changed

We don't want to be updating the spec of the notebook if the timestamp
hasn't changed, since this will lead to constant updates and
reconciliation loops.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
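
A minimal sketch of the guard described above, assuming the timestamp is stored as a string annotation on the object. The annotation key and the `setTimestamp` helper are hypothetical names used only for illustration, not the controller's actual code:

```go
package main

import "fmt"

// annotationKey is a hypothetical annotation name used only in this sketch.
const annotationKey = "example.org/last-activity"

// setTimestamp writes ts into annotations and reports whether the stored
// value actually changed. A caller would issue an update only when it
// returns true, which avoids triggering a new reconciliation for no reason.
func setTimestamp(annotations map[string]string, ts string) bool {
	if old, ok := annotations[annotationKey]; ok && old == ts {
		return false
	}
	annotations[annotationKey] = ts
	return true
}

func main() {
	meta := map[string]string{}
	for _, ts := range []string{"2022-02-09T17:00:00Z", "2022-02-09T17:00:00Z", "2022-02-09T17:05:00Z"} {
		if setTimestamp(meta, ts) {
			fmt.Println("update object:", ts)
		} else {
			fmt.Println("skip update:", ts)
		}
	}
}
```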

* notebooks: Use a deep-copy of the notebook spec

The controller should use a deep-copy of the notebook spec when
calculating the spec for the StatefulSet. Otherwise we could
unintentionally update the notebook object, since the spec could have
been changed while calculating the STS spec.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
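
The aliasing problem described above can be sketched with simplified stand-in types. `EnvVar`, `Spec`, and `buildDerivedSpec` are hypothetical and only illustrate why a shallow copy is not enough when the spec contains slices:

```go
package main

import "fmt"

// EnvVar and Spec are simplified stand-ins for the real Kubernetes types.
type EnvVar struct{ Name, Value string }

type Spec struct{ Env []EnvVar }

// DeepCopy returns a copy that shares no slice storage with the receiver.
func (s Spec) DeepCopy() Spec {
	out := Spec{Env: make([]EnvVar, len(s.Env))}
	copy(out.Env, s.Env)
	return out
}

// buildDerivedSpec mutates the spec it receives, as a spec generator might.
// Passing Spec by value copies the struct, but the Env slice still points
// at the caller's backing array.
func buildDerivedSpec(s Spec) Spec {
	if len(s.Env) > 0 {
		s.Env[0].Value = "changed"
	}
	return s
}

func main() {
	nb := Spec{Env: []EnvVar{{Name: "A", Value: "original"}}}

	buildDerivedSpec(nb.DeepCopy())
	fmt.Println(nb.Env[0].Value) // the original is untouched

	buildDerivedSpec(nb)
	fmt.Println(nb.Env[0].Value) // the shared slice was mutated
}
```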

* notebooks: Add prefix env var only if missing

The controller should be setting OR updating the NB_PREFIX env var.
Previously it would always blindly append it to the spec, which could
result in double entries for the same env var.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
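
A minimal sketch of the set-or-update behaviour, assuming a simplified `EnvVar` type. The helper name `setOrUpdateEnv` and the example prefix value are assumptions made for illustration:

```go
package main

import "fmt"

// EnvVar is a simplified stand-in for a container environment variable.
type EnvVar struct{ Name, Value string }

// setOrUpdateEnv returns env with name set to value. An existing entry is
// overwritten in place; a new entry is appended only when none exists, so
// the result never contains two entries with the same name.
func setOrUpdateEnv(env []EnvVar, name, value string) []EnvVar {
	for i := range env {
		if env[i].Name == name {
			env[i].Value = value
			return env
		}
	}
	return append(env, EnvVar{Name: name, Value: value})
}

func main() {
	env := []EnvVar{{Name: "NB_PREFIX", Value: "/old"}}
	// Calling the helper repeatedly is idempotent.
	env = setOrUpdateEnv(env, "NB_PREFIX", "/notebook/ns/name")
	env = setOrUpdateEnv(env, "NB_PREFIX", "/notebook/ns/name")
	fmt.Println(len(env), env[0].Value)
}
```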
2022-02-09 17:32:07 +00:00
Athanasios Markou fbf5110f01 notebooks: Extend Notebook Controller to expose idleness for Jupyter (kubeflow/kubeflow#6297)
* notebooks: Update image's tag in make

Modify the Makefile to properly update the TAG
based on the git tag.

Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* notebooks: Expose last-activity

Extend the notebook-controller to:
* cull idle Notebook Servers based on their new `last-activity`
  annotation
* expose the last activity of each Notebook Server as an annotation
  on the metadata of the corresponding CR object

Modify notebook_controller.go to:
* update the Last Activity of each Notebook Server that has a
  Running pod
* delete the Last Activity Annotation for every Notebook Server
  that does not have a Running pod

Extend culler.go to:
* perform culling based on the new `last-activity` annotation and
  not based on the `/api/status` endpoint.
* update the last activity of a Notebook Server, based on the
  kernels' execution states.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Athanasios Markou <athamark@arrikto.com>

* notebooks: Introduce a DEV env var

We introduce a DEV env var to allow admins to
develop and test their custom Notebook Controller
on their local machine.
Information and instructions are provided in
components/notebook-controller/README.md.

Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* notebooks: Add unit tests for last-activity

* Introduce new tests for allKernelsAreIdle()
* Extend the tests for NotebookIsIdle() and for
  NotebookNeedsCulling().

Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* review: UpdateNotebookLastActivityAnnotation()

Ensure that UpdateNotebookLastActivityAnnotation() does not return
"true". This function should not return any value.

Signed-off-by: Athanasios Markou <athamark@arrikto.com>
2022-02-07 15:19:17 +00:00
Quanjie Lin 1236c5e6d7 initial checkin of tensorboard controller (kubeflow/kubeflow#4312)
* initial checkin of tensorboard controller

* initial checkin of tensorboard controller

* typo

* typo

* fix typo

* support local path

* add status

* conflict

* remove binary
2019-10-29 09:12:44 -07:00
Ben Ye 2e7dc7ec06 add culling metrics (kubeflow/kubeflow#4336)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-10-17 21:37:57 -07:00
Ben Ye d14f6ac07f support metrics in notebook-controller (kubeflow/kubeflow#4123)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-10-16 00:15:40 -07:00
Ben Ye 807843ec2a cleanup some codes in notebook controller (kubeflow/kubeflow#4098)
* cleanup some codes in notebook controller

Signed-off-by: yeya24 <yb532204897@gmail.com>

* remove ambassador in notebook controller

Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-10-14 12:06:52 -07:00
Lun-Kai Hsu 8cad496a13 Migrate notebook CR to kubebuilder V2 (kubeflow/kubeflow#4013)
* wip

* can build

* tested: able to control notebook

* fix
2019-09-04 17:06:22 -07:00
Kimonas Sotirchos 08f43598c2 Culling of Idle Jupyter Notebooks (kubeflow/kubeflow#3856)
* Create a culler as a package

Helper functions for culling resources. Assumes that Istio is
installed on the system and queries Prometheus to get metrics,
specifically requests/{configurable time}.

If the resource should be culled, then it should be done by setting an
annotation. This way the UIs can also show that the Resource is stopping
and also easily stop a resource by making a PATCH request.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Culling logic enhancements

Add necessary env vars. Culling won't happen by default. To enable it,
the user needs to set ENABLE_CULLING=true.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Misc fixes in logging and comment cleanup

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Fix typo

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Add Notebooks specific culling

Query the /api/status endpoint of each Server

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Remove the generic culling logic

We need to discuss if it would make sense to have this logic as a go
library, or use knative.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Add unit tests

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Remove unused code

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Review changes #1

* rename `getEnvDef` to `getEnvDefault`
* Add a comment to describe how the STOP_ANNOTATION gets used

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* Make cluster domain configurable

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
2019-08-26 04:40:21 -07:00
Gabriel Wen 70bd7acdf5 Merge branch 'master' into fix-notebook-controller 2019-06-03 14:33:05 -07:00
zabbasi daa4768f96 Add details to "conditions" in notebook status (kubeflow/kubeflow#3319)
* added details to NotebookCondition to keep track of notebook container status changes

* update notebook controller image

* fix conditions update

* small fix

* temporary changes to debug

* temporary remove delete step from workflow for debugging

* temporarily merging kfctl-test and kfctl-go-test for debugging

* debugging

* undo the mistake

* debugging

* debugging tests

* merged kfctl-test and kfctl-go-test

* remove wait-for-kubeflow

* merged with master

* remove test delete step for debugging

* small fix

* update jupyter test component

* update condition test for jupyter component

* revert back deleting step

* revert back change in kfctl.sh

* added some temporary change to debug jupyter-test

* revert back temp changes
2019-06-03 14:19:30 -07:00
Gabriel Wen c22959f0ac check env when setting watch 2019-06-03 13:54:10 -07:00
Gabriel Wen 525eee5ed8 update notebook_controller to use env 2019-06-03 13:15:56 -07:00
Kunming Qu 42bbb0cdbf profile and Istio integration (kubeflow/kubeflow#3234)
* profile and Istio integration

* make profile manage Istio gateway

* add README.md

* make notebooks use gateway in kubeflow namespace

* gateway format to ns/name; add watch for istio ServiceRoleBinding

* Support setting auth header format via parameter

* update README

* update README

* update readme; resolve comments
2019-05-29 18:36:19 -07:00
zabbasi a7e7d75be9 Renamed PodPreset CRD to PodDefault (kubeflow/kubeflow#3320)
* renamed PodPreset CRD to PodDefault

* typos

* update jupyter-web-app image
2019-05-21 11:22:10 -07:00
zabbasi 5ae44fbdb4 Integrates notebook-controller and jupyter-web-app with admission-webhook (kubeflow/kubeflow#3245)
* integrate jupyter-web-app and notebook-controller with webhook

* merged podpreset component into admission-webhook

* applied cr comments

* undo notebook image for testing

* update notebook controller image

* temporarily disabling kubeflow delete to debug presubmit failure

* temporary remove cluster delete in kfctl workflow test

* typo

* typo

* undo debugging changes
2019-05-20 12:39:13 -07:00
Kunming Qu a80025787b enable Istio Injection in user-created namespace; notebook and Istio integration (kubeflow/kubeflow#3235)
* enable Istio Injection in user-created namespace; notebook service and Istio rbac integration

* update README
2019-05-09 16:59:58 -07:00
Hung-Ting Wen 58c977c8e9 ISTIO support for notebook controller (kubeflow/kubeflow#3104)
* virtual service func init

* create virtualservice

* fix

* fix

* add cluster role

* fix unstructured format

* updates

* fix

* reconcile virtual service

* fix

* revert quote changes

* add virtualservice update

* comment

* copy if spec is not found in toSpec

* add watch event
2019-04-29 11:43:19 -07:00
Lun-Kai Hsu 9f70ca7f10 add labels for notebook so that gcp credentials will be injected by webhook (kubeflow/kubeflow#2853)
* add labels for gcp cred

* kfctl set flag

* review comment

* review comment
2019-03-30 20:36:33 -07:00
Lun-Kai Hsu dc69b63667 notebook CR shows container status (kubeflow/kubeflow#2787)
* wip

* fix

* fix format
2019-03-26 17:08:47 -07:00
zabbasi 2500faee10 added ReadyReplicas status to notebook-controller (kubeflow/kubeflow#2743)
* added ReadyReplicas status to notebook-controller

* fixed issues related to updating the notebook status

* fixed a problem in updating Notebook's status

* applied cr comments

* small change

* small formatting change
2019-03-21 21:46:18 -07:00
Lun-Kai Hsu bfa59d7769 fix (kubeflow/kubeflow#2620) 2019-03-04 16:48:17 -08:00
Lun-Kai Hsu 931e8e32aa Add status to notebook (kubeflow/kubeflow#2558)
* wip

* wip

* update test to check status condition

* fix
2019-03-04 14:36:17 -08:00
Lun-Kai Hsu a9b8f4e8a0 fix (kubeflow/kubeflow#2506) 2019-02-26 11:21:53 -08:00
Lun-Kai Hsu e377455ce4 Notebook controller fixes (kubeflow/kubeflow#2463)
* fix

* enable e2e test

* fix

* fix

* fix logging for pytest

* fix

* fix

* fix

* fix

* fix

* fix

* address review

* review comment
2019-02-15 00:09:02 -08:00
Lun-Kai Hsu b7555c6727 NB controller fix (kubeflow/kubeflow#2439)
* fix

* fix
2019-02-10 17:37:51 -08:00
Lun-Kai Hsu fa3b0b3b0b Golang notebook controller (kubeflow/kubeflow#2336)
* kubebuilder init

* replace dep with modules

* add notebook api

* notebook controller impl

* remove test

* fix dockerfile

* fix svc reconcile

* notebook controller ksonnet

* update generated crd

* add sample

* remove TODO

* make golang version an arg

* rename

* fix path

* add README

* Add todo in readme

* remove arg default
2019-02-05 16:43:39 -08:00