37 Commits

Author SHA1 Message Date
Sven Nobis df961073d3
feat: Add support to run notebooks on their own subdomains in Istio
This new feature makes it possible to host notebooks on their own subdomains when Istio is used. This isolates the notebook's origin from the dashboard / Kubeflow API origin in the browser and addresses a security problem that allows session hijacking through a malicious notebook.

Signed-off-by: Lorin Lehawany <llehawany@ernw.de>
Signed-off-by: Sven Nobis <mail@sven.to>
2025-09-04 21:50:40 +02:00
Gleb 80211fb468 feat: allow setting `ISTIO_HOST` in notebook/tensorboard controller for Istio VirtualServices (kubeflow/kubeflow#6902)
* Added ISTIO_HOSTS env variables for tensorboard and notebook controllers

* ISTIO_HOSTS -> ISTIO_HOST rename

* istioHosts -> istioHost rename
2024-05-24 00:11:27 +00:00
LiaoSirui f9af6f9c18 fix(notebook-controller): fix typo (kubeflow/kubeflow#7305)
namesace -> namespace in file components/notebook-controller/controllers/notebook_controller.go
2023-10-26 14:46:10 +00:00
Narayanamurthi Mari b9507167d8 feat(notebooks): propagate annotations from notebook cr to pods (kubeflow/kubeflow#7076)
Co-authored-by: osd530 <narayanamurthi.mari@capitalone.com>
2023-07-31 16:43:28 +00:00
apoger d2c76a0739 Implement a culling controller for Notebooks (kubeflow/kubeflow#6807)
* Implement a culling controller for notebooks

Changes:

 * Move the idleness/culling logic into a separate controller
   as part of the Notebooks Controller/Operator.

 * Introduce a "notebooks.kubeflow.org/last_activity_check_timestamp"
   annotation in each Notebook CR to keep the timestamp of the last
   performed idleness check.

The controller can then compare this timestamp with the current time to
ensure that notebooks will get reconciled every IDLENESS_CHECK_PERIOD
minutes.

The culling-controller will:

* reconcile only Notebook CRs
* set/update culling annotations
  - 'notebooks.kubeflow.org/last_activity'
  - 'notebooks.kubeflow.org/last_activity_check_timestamp'
* perform idleness checks every 'IDLENESS_CHECK_PERIOD' minutes
  and set the 'kubeflow-resource-stopped' annotation, if a notebook
  needs to be culled.

Refs: kubeflow/kubeflow#6767

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>

* review: Remove culling annotations when Pod is not found

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>

* review: Improve logs

Add a log message at the beginning of the reconciliation loop
to make it clear that a Reconcile was called for a notebook.

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>

* Run the controller locally

* Introduce make rule for running the controller locally with
  culling enabled

* Introduce a dev_culling_authorization_policy which must be
  applied when testing the culling-controller locally

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>

* Update README instructions

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
2023-01-26 13:32:10 +00:00
apoger b46583df63 Fix notebook culling (kubeflow/kubeflow#6659)
The notebook controller writes the last-activity annotation
before culling the Notebook but doesn't remove this
annotation before the Notebook starts again. This causes the
Notebook to be culled again before it has a chance to start.

Fix:
* correctly calculate the podFound variable and ensure its value
  is true only if the Pod is actually found. This way the culling
  annotation will be removed when there is no Pod.

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>

Signed-off-by: Apostolos Gerakaris <apoger@arrikto.com>
2022-09-22 14:07:38 +00:00
apoger 02e2fa4c65 Fix #6056: Update Notebook status properly (kubeflow/kubeflow#6628)
* Fix #6056: Update Notebook status properly

Signed-off-by: Apostolos Gerakaris apoger@arrikto.com

* Added suggested code changes

Signed-off-by: Apostolos Gerakaris apoger@arrikto.com

* notebook-controller: Add unit tests

* Introduce basic unit tests for "createNotebookStatus" function
* Add GH action for unit tests

Signed-off-by: Apostolos Gerakaris apoger@arrikto.com

* Fix PodCoditionsMirroringToNotebook & Unit-tests

We encountered an error during testing. It seems that
pod.status.conditions.condition.LastProbeTime always remains
null, so the controller ends up applying a Notebook
CR instance with null condition values.

Relevant Issues:
* https://github.com/kubernetes/kubernetes/issues/109958
* https://github.com/kubernetes/kubernetes/issues/79402
* https://github.com/kubernetes/kubernetes/issues/14393

Fix: Check if the Pod's condition.LastProbeTime
and condition.LastTransitionTime timestamp fields are null.
If so, initialize them so we don't end up applying
a Notebook instance with null condition values.

Other changes:
* Fix basic unit tests
* Introduced a unit test for the case where Notebook's Pod
  is unschedulable

Signed-off-by: Apostolos Gerakaris apoger@arrikto.com

Signed-off-by: Apostolos Gerakaris apoger@arrikto.com
2022-08-30 13:47:55 +00:00
Midhun Nair 40ef0ffe74 Fix #6528: Mirroring Pod conditions to Notebook (kubeflow/kubeflow#6619)
* Fix #6528: Mirroring Pod conditions to Notebook

* Added missing fields which are part of PodConditions into NotebookConditions

* Added suggested changes
2022-08-26 10:25:49 +00:00
mofanke 79df9e86b2 notebooks: Fix notebook endless restarts (kubeflow/kubeflow#6337) (kubeflow/kubeflow#6603) 2022-07-28 15:33:54 +00:00
Hyunwoo Kim 958df81ff7 Fix typo in notebook_controller.go (kubeflow/kubeflow#6577) 2022-07-18 18:37:08 +00:00
Jeongwook Park bba1bf22ee notebooks: Allow notebook controller to patch events (kubeflow/kubeflow#6523) 2022-06-20 12:25:37 +00:00
Samu 0215857aa9 Support K8s 1.22 in notebook controller (kubeflow/kubeflow#6374)
Fix https://github.com/kubeflow/kubeflow/issues/6366

Migrating to Kubebuilder v3 leads to the following changes:
- Add .dockerignore file.
- Upgrade Go version from v1.15 to v1.17.
- Adapt Makefile.
- Add image (build + push) target to makefile.
- Upgrade EnvTest to use K8s v1.22.
- Update PROJECT template.
- Migrate CRD apiVersion from v1beta to v1.
- Add livenessProbe and readinessProbe to controller manager.
- Upgrade controller-runtime from v0.2.0 to v0.11.0.

Other changes:
- Build image using public.ecr.aws registry instead of gcr.io.
- Update README.md documentation.
- Update 3rd party licences.
- Fix notebook.spec description.
- Add 3 sample notebooks (v1, v1alpha1 and v1beta1).

Signed-off-by: Samuel Veloso <svelosol@redhat.com>
2022-05-03 15:49:01 +00:00
Kimonas Sotirchos 5530e00467 notebooks: Don't reconcile on Events deletion (kubeflow/kubeflow#6391)
The controller should not trigger the reconcile loop when an Event is
deleted. Previously the controller would run the reconciliation loop on
any event deletion.

This commit updates it to not run the loop for ANY event.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
2022-03-04 17:09:59 +00:00
Kimonas Sotirchos 1d24b75f57 notebooks: Fix endless restarts (kubeflow/kubeflow#6341)
* notebooks: Update notebook if timestamp changed

We don't want to be updating the spec of the notebook if the timestamp
hasn't changed, since this will lead to constant updates and
reconciliation loops.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* notebooks: Use a deep-copy of the notebook spec

The controller should use a deep-copy of the notebook spec when
calculating the spec for the StatefulSet. If not then we could
update the notebook object without wanting it, since the spec could have
been changed when calculating the STS spec.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* notebooks: Add prefix env var only if missing

The controller should be setting OR updating the NB_PREFIX env var.
Previously it would always blindly append it to the spec, which could
result in double entries for the same env var.
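
A set-or-update helper for environment variables can be sketched as follows. `EnvVar` is a simplified stand-in for the Kubernetes env var type, and `setOrUpdateEnv` is a hypothetical name rather than the function used in the controller.

```go
package main

import "fmt"

// EnvVar is a simplified stand-in for the Kubernetes corev1.EnvVar type.
type EnvVar struct {
	Name  string
	Value string
}

// setOrUpdateEnv overwrites the value of an existing entry called name
// (modifying the slice in place), or appends a new entry if none exists,
// so the same name never appears twice.
func setOrUpdateEnv(envs []EnvVar, name, value string) []EnvVar {
	for i := range envs {
		if envs[i].Name == name {
			envs[i].Value = value
			return envs
		}
	}
	return append(envs, EnvVar{Name: name, Value: value})
}

func main() {
	envs := []EnvVar{{Name: "NB_PREFIX", Value: "/old"}}
	envs = setOrUpdateEnv(envs, "NB_PREFIX", "/notebook/ns/nb")
	fmt.Println(len(envs), envs[0].Value) // 1 /notebook/ns/nb
}
```

Calling the helper repeatedly with the same name is idempotent, which is the property that blind appending lacked.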

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
2022-02-09 17:32:07 +00:00
Kimonas Sotirchos 9510a2b913 notebooks: Graceful handling of events (kubeflow/kubeflow#6338)
* notebooks: Handle events gracefully

The controller is not exiting the reconciliation loop after it has
re-emitted a Pod/STS Event as a Notebook Event. As a result, the
controller later tries to GET a Notebook with the name of the Event
that triggered the reconciliation loop.

The controller should exit the reconciliation function once it has
emitted the event.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* notebooks: Don't reconcile on deleted events

We don't want to trigger the reconciliation function when an event gets
deleted.

If a Notebook is deleted, the underlying events are
deleted as well, which triggers the reconcile function and
makes it try to GET Events and Notebooks with the name of the
deleted event.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
2022-02-09 15:01:07 +00:00
Athanasios Markou fbf5110f01 notebooks: Extend Notebook Controller to expose idleness for Jupyter (kubeflow/kubeflow#6297)
* notebooks: Update image's tag in make

Modify Makefile to properly update the TAG
based on the git TAG.

Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* notebooks: Expose last-activity

Extend the notebook-controller to:
* cull idle Notebook Servers based on their new `last-activity`
  annotation
* expose the last activity of each Notebook Server as an annotation
  on the metadata of the corresponding CR object

Modify notebook_controller.go to:
* update the Last Activity of each Notebook Server that has a
  Running pod
* delete the Last Activity Annotation for every Notebook Server
  that does not have a Running pod

Extend culler.go to:
* perform culling based on the new `last-activity` annotation and
  not based on the `/api/status` endpoint.
* update the last activity of a Notebook Server, based on the
  kernels' execution states.

Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Athanasios Markou <athamark@arrikto.com>

* notebooks: Introduce a DEV env var

We introduce a DEV env var to allow admins
to develop and test their custom Notebook
Controller on their local machine.
We provide information and instructions inside
the components/notebook-controller/README.md.

Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* notebooks: Add unit tests for last-activity

* Introduce new tests for allKernelsAreIdle()
* Extend the tests for NotebookIsIdle() and for
  NotebookNeedsCulling().

Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>

* review: UpdateNotebookLastActivityAnnotation()

Ensure that UpdateNotebookLastActivityAnnotation() does not return
"true". This function should not return any value.

Signed-off-by: Athanasios Markou <athamark@arrikto.com>
2022-02-07 15:19:17 +00:00
Abe Sharp 5e960331fd Remove virtualservice timeout to prevent websocket disconnect (kubeflow/kubeflow#6126)
In the existing version, the 'timeout: 300s' added to the notebook's virtual service would cause websockets to disconnect at the 5 minute mark, causing the Jupyter Notebook web terminal function to hang. This is described in https://github.com/kubeflow/kubeflow/issues/6124.
2021-09-09 03:07:01 -07:00
Filinto Duran 5ae1de4dcc Correct missing predicates in controller watches. Fixes #5326 (kubeflow/kubeflow#5873)
Co-authored-by: Filinto Duran <fduran@d2iq.com>
2021-08-11 09:17:26 -07:00
Yannis Zarkadas ae3b53f8d2 Notebook Controller: Consolidate manifests (kubeflow/kubeflow#5723)
* notebook-controller: Modify kubebuilder manifests

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>

* notebook-controller: Set storageVersion to v1

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>

* notebook-controller: Fix RBAC

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>

* notebook-controller: Regenerate manifests

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>

* notebook-controller: Remove unused kubebuilder manifests

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
2021-03-19 10:22:16 -07:00
DavidSpek a1e52c2b9e (Notebook-controller): Add `http-rewrite-uri` and `http-headers-request-set` annotations (kubeflow/kubeflow#5660)
Co-authored-by: Mathew Wicks <thesuperzapper@users.noreply.github.com>
2021-03-12 04:14:24 -08:00
gilbeckers 82c04b3be1 Correct ContainerStatus of Notebook CR (kubeflow/kubeflow#5314)
* Correct ContainerStatus of Notebook CR

The Notebook Controller doesn't set the State of the CR correctly. In some cases
the first container is the istio-sidecar, which results in an incorrect state being
shown in the Notebook CR. This is fixed now by copying the Notebook container's
ContainerState to the Notebook CR's ContainerState.
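
Selecting the notebook container by name instead of by position can be sketched like this. `ContainerStatus` is a simplified stand-in for the Kubernetes type, `notebookContainerState` is a hypothetical helper, and both the container names in the example and the idea of matching on the container name are assumptions made for illustration.

```go
package main

import "fmt"

// ContainerStatus is a simplified stand-in for corev1.ContainerStatus.
type ContainerStatus struct {
	Name  string
	State string // e.g. "running", "waiting", "terminated"
}

// notebookContainerState returns the state of the container named
// containerName. The boolean is false when no such container is reported,
// so callers never fall back to whichever container happens to be first.
func notebookContainerState(statuses []ContainerStatus, containerName string) (string, bool) {
	for _, s := range statuses {
		if s.Name == containerName {
			return s.State, true
		}
	}
	return "", false
}

func main() {
	// The sidecar is listed first, as in the failure case described above.
	statuses := []ContainerStatus{
		{Name: "istio-proxy", State: "running"},
		{Name: "my-notebook", State: "waiting"},
	}
	state, ok := notebookContainerState(statuses, "my-notebook")
	fmt.Println(state, ok) // waiting true
}
```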

* Changed log statement and added a comment
Implemented remarks of @yanniszark and @kimwnasptd

* Small reorganization of some if statements
2021-01-04 01:27:55 -08:00
Naveen bc8df5407e Implemented functional tests using ginkgo for notebook controller (kubeflow/kubeflow#5378)
* Implemented functional tests using ginkgo

The notebook controller can be tested using sigs.k8s.io/controller-runtime/pkg/envtest which comes as part of kubebuilder. With this we should be able to measure test coverage.

* Fixed the incorrect test condition and included a fix to download the envtest binaries.

* Some tweaks based on review.

* Removed the check-license as it was blocking the test.
Included some of the tweaked YAML files that were being generated.
2020-11-11 05:57:49 -08:00
Nihir Patel b13382b558 notebook_controller.go: make clusterDomain an option (kubeflow/kubeflow#4468) 2020-07-03 19:42:48 -07:00
Humair 8470751a58 Fix notebook controller rbac gen (kubeflow/kubeflow#5083) 2020-06-22 07:18:39 -07:00
Ali Soume'e 6942bf5f87 Remove duplicate import (kubeflow/kubeflow#5058)
"k8s.io/api/core/v1" was imported with names "corev1" and "v1"
2020-06-08 20:47:19 -07:00
Chad Roberts 25bf002c34 Adding env var to suppress automatic addition of fsGroup in notebook pod (kubeflow/kubeflow#4713) (kubeflow/kubeflow#4782)
* Allowing for an env var ADD_FSGROUP to be set to false to suppress the automatic addition of fsGroup: 100 in the pod's security context.
This addresses issue #4617.

* Adding note in README regarding ADD_FSGROUP.
2020-02-19 09:08:25 -08:00
Yannis Zarkadas e02a82fbcc notebook-controller: Fix event filtering (kubeflow/kubeflow#4777)
This commit fixes the event filtering check, so it doesn't crash when
the Pod name doesn't contain a dash ("-").
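
A dash-safe way to derive an owner name from a Pod name can be sketched as below. The helper name `ownerNameFromPod` is hypothetical, and the assumption that the name of interest is everything before the last dash (as with StatefulSet Pods named `<name>-<ordinal>`) is not confirmed by the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// ownerNameFromPod strips the trailing "-<suffix>" from a Pod name.
// It returns false instead of panicking when the name contains no dash
// or when nothing precedes the dash.
func ownerNameFromPod(podName string) (string, bool) {
	i := strings.LastIndex(podName, "-")
	if i <= 0 {
		return "", false
	}
	return podName[:i], true
}

func main() {
	fmt.Println(ownerNameFromPod("my-notebook-0")) // my-notebook true
	fmt.Println(ownerNameFromPod("standalone"))    // (empty string) false
}
```

Checking the index before slicing is what avoids the out-of-range panic that an unconditional `podName[:i]` would hit when `i` is -1.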

Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
2020-02-19 08:44:25 -08:00
Jeremy Lewi d25a14aea2 Fix notebook controller and tensorboard controller docker image build. (kubeflow/kubeflow#4631)
* The jupyter docker image isn't building because it now depends on code
  in components/common

* To make this work we need to configure it as a multi module package
  and modify go.mod to redirect to a local path.

* Ref: https://github.com/golang/go/wiki/Modules#when-should-i-use-the-replace-directive

* Replaces PR #4583

Related to #4582 - Jupyter image doesn't build.
2020-01-07 16:25:41 -08:00
Fernando Diaz 1ff2f7a880 Reissue pod and sts events as notebook events (kubeflow/kubeflow#4139) 2019-11-21 12:07:29 -08:00
Quanjie Lin 1236c5e6d7 initial checkin of tensorboard controller (kubeflow/kubeflow#4312)
* initial checkin of tensorboard controller

* initial checkin of tensorboard controller

* typo

* typo

* fix typo

* support local path

* add status

* conflict

* remove binary
2019-10-29 09:12:44 -07:00
Lun-Kai Hsu 2fe3108347 fix notebook route (kubeflow/kubeflow#4402) 2019-10-24 16:01:39 -07:00
Ben Ye 2e7dc7ec06 add culling metrics (kubeflow/kubeflow#4336)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-10-17 21:37:57 -07:00
Ben Ye d14f6ac07f support metrics in notebook-controller (kubeflow/kubeflow#4123)
Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-10-16 00:15:40 -07:00
Jeremie Vallee c88e721fc7 [3945] Configurable Istio Gateway for Notebook Controller (kubeflow/kubeflow#4216) 2019-10-14 12:06:59 -07:00
Ben Ye 807843ec2a cleanup some codes in notebook controller (kubeflow/kubeflow#4098)
* cleanup some codes in notebook controller

Signed-off-by: yeya24 <yb532204897@gmail.com>

* remove ambassador in notebook controller

Signed-off-by: yeya24 <yb532204897@gmail.com>
2019-10-14 12:06:52 -07:00
Lun-Kai Hsu 2f2938bead Notebook v1beta1 (kubeflow/kubeflow#4105)
* add v1beta1

* add storage version

* wip

* add conversion

* setup webhook

* fix

* fix manifest

* webhook wip

* no webhook
2019-09-13 07:04:29 -07:00
Lun-Kai Hsu 8cad496a13 Migrate notebook CR to kubebuilder V2 (kubeflow/kubeflow#4013)
* wip

* can build

* tested: able to control notebook

* fix
2019-09-04 17:06:22 -07:00