The controller should not trigger the reconcile loop when an Event is
deleted. Previously the controller would run the reconciliation loop on
any event deletion.
This commit updates it to not run the loop for ANY event.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Update notebook if timestamp changed
We don't want to be updating the spec of the notebook if the timestamp
hasn't changed, since this will lead to constant updates and
reconciliation loops.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Use a deep-copy of the notebook spec
The controller should use a deep-copy of the notebook spec when
calculating the spec for the StatefulSet. Otherwise we could
unintentionally update the notebook object, since its spec could have
been changed while calculating the STS spec.
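The aliasing problem is not specific to Go. A minimal Python sketch of the idea, assuming a dict-shaped spec and a hypothetical `build_sts_spec` helper (neither is taken from the controller code):

```python
import copy


def build_sts_spec(notebook_spec):
    """Derive a pod spec for the StatefulSet without mutating the input.

    Hypothetical helper: the dict layout and the defaulting are illustrative.
    """
    pod_spec = copy.deepcopy(notebook_spec["template"]["spec"])
    # Any defaulting done here only touches the copy, never the Notebook.
    pod_spec["containers"][0].setdefault("workingDir", "/home/jovyan")
    return pod_spec


notebook_spec = {"template": {"spec": {"containers": [{"name": "nb"}]}}}
sts_spec = build_sts_spec(notebook_spec)
```

Without the deep copy, the `setdefault` call would also change `notebook_spec`, which is the kind of unwanted update this commit avoids.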
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Add prefix env var only if missing
The controller should be setting OR updating the NB_PREFIX env var.
Previously it would always blindly append it to the spec, which could
result in double entries for the same env var.
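A rough sketch of the set-or-update behaviour, written in Python for brevity (the controller is Go; `set_env_var` and the example prefix values are hypothetical):

```python
def set_env_var(env, name, value):
    """Set `name` to `value` in a list of {"name": ..., "value": ...} entries.

    Updates the existing entry if one is present, appends otherwise, so the
    list never ends up with two entries for the same variable.
    """
    for entry in env:
        if entry.get("name") == name:
            entry["value"] = value
            return env
    env.append({"name": name, "value": value})
    return env


env = [{"name": "NB_PREFIX", "value": "/old"}]
set_env_var(env, "NB_PREFIX", "/notebook/ns/nb")
```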
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Handle events gracefully
The controller is not exiting the reconciliation loop after it has
re-emitted a Pod/STS Event as a Notebook Event. As a result, the
controller later tries to GET a Notebook with the name of the Event
that triggered the reconciliation loop.
The controller should exit the reconciliation function once it has
emitted the event.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Don't reconcile on deleted events
We don't want to trigger the reconciliation function when an event gets
deleted.
If a Notebook is deleted then the underlying events are
deleted as well, which causes the reconcile function to get
triggered and try to GET Events and Notebooks with the name of the
deleted event.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* wa(back): Add helper for deserializing JSON obj
In some cases we might need to construct Python k8s lib objects from the
JSONs that are provided by clients. E.g. the UI will be sending a PVC
object in json format, so the backend will need to create the
corresponding client.V1PersistentVolumeClaim object and submit it.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* wa(back): Serialization helper
Add helper function for converting a k8s-client object into a dict that
can be sent as an HTTP response.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* wa(back): Add dry run to Notebooks and PVCs
The backend will need to be able to create objects with dry-run, in
order to ensure they are valid. The backend will need to check that both
the Notebook and the PVCs can be created beforehand.
This way we avoid the scenario where we create PVCs but the Notebook
fails to be created, and the PVCs are never garbage collected.
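The two-phase flow can be sketched as follows. This is only an illustration: `api` is a hypothetical wrapper with `create_pvc()` and `create_notebook()` methods that take a `dry_run` flag and raise on invalid objects. With the Python Kubernetes client, a dry-run is typically requested by passing `dry_run="All"` to the create call.

```python
def create_notebook_with_pvcs(api, namespace, notebook, pvcs):
    """Validate every object with dry-run first, then create them for real.

    `api` is a hypothetical wrapper, not the actual backend module.
    """
    # Phase 1: server-side validation only, nothing is persisted.
    for pvc in pvcs:
        api.create_pvc(namespace, pvc, dry_run=True)
    api.create_notebook(namespace, notebook, dry_run=True)

    # Phase 2: everything validated, so create the objects for real.
    for pvc in pvcs:
        api.create_pvc(namespace, pvc, dry_run=False)
    api.create_notebook(namespace, notebook, dry_run=False)
```

If the Notebook fails validation in phase 1, the function raises before any PVC has been created, so nothing is left behind.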
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* wa(back): Update kubernetes to 0.17
In order to support dry-run we must use the 0.17 version of the Python
k8s client.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* wa(back): Extend api module to patch pvcs
The backend will need to be able to PATCH PVCs in order to set the
ownerReference to the Notebook that mounts the PVCs.
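A minimal sketch of the PATCH body, assuming the Notebook is available as a dict (`owner_reference_patch` is a hypothetical helper, not the actual api module function):

```python
def owner_reference_patch(notebook):
    """Build a PATCH body that makes `notebook` the owner of a PVC."""
    return {
        "metadata": {
            "ownerReferences": [
                {
                    # These four fields are required in an ownerReference.
                    "apiVersion": notebook["apiVersion"],
                    "kind": notebook["kind"],
                    "name": notebook["metadata"]["name"],
                    "uid": notebook["metadata"]["uid"],
                }
            ]
        }
    }
```

The `uid` has to come from the Notebook object that was actually created, which is why the PVC is patched after the Notebook exists.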
Ref: arrikto/dev/issues/386#issuecomment-856700392
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* jwa(back): Work with new Volumes API
The backend API should not add any more layers of abstractions on top of
the K8s API. The backend should expect the client/UI to be sending the
entire PVC spec of a new PVC.
Refs: arrikto/dev/issues/386
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* jwa(back): Add unittests for new volumes API
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* jwa(back): Extend the PVC info returned
We want to show both the access mode and size of the existing PVCs, when
a user clicks on the dropdown to select which PVC to mount.
The backend will need to provide this information to the frontend. We
don't want to send the K8s list of PVCs since this would result in a lot
of unnecessary data being sent.
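As an illustration, the backend could trim each PVC down to the fields the dropdown needs. The response keys below (`name`, `size`, `modes`) are assumptions made for this sketch:

```python
def pvc_summary(pvc):
    """Return only the PVC fields the UI dropdown needs (dict in, dict out)."""
    spec = pvc.get("spec", {})
    requests = spec.get("resources", {}).get("requests", {})
    return {
        "name": pvc["metadata"]["name"],
        "size": requests.get("storage"),
        "modes": spec.get("accessModes", []),
    }


pvc = {
    "metadata": {"name": "data", "namespace": "kubeflow-user"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "10Gi"}},
    },
    "status": {"phase": "Bound"},
}
summary = pvc_summary(pvc)
```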
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* jwa(front): Add proxy config for Rok
When developing the Rok flavor locally we will need to be able to open
the Rok chooser. This can be done by using Angular/webpack proxy to
bring the exposed rok service and the app under the same domain.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* jwa(front): Remove card from form
The form of the app should not be a big card, but a normal form.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* jwa(front): Install AceModule for yaml editing
Install AceModule to allow users to edit yamls of objects.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* wa(front): Change the styling of form sections
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* jwa(front): Create common volume components
Component for:
* New PVC and configuring its spec
* Attaching an existing PVC in a Notebook
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* jwa(front): Update Rok form for new Volume API
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* jwa(front): Mark inputs as dirty when restoring Lab
When the UI autofills the form with values from a JupyterLab snapshot
then it should mark the touched fields as dirty. This way if a field has
errors the UI will make that input red.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* jwa: Update ConfigMap in manifests
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* jwa(front): Fix format errors
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Update image's tag in make
Modify the Makefile to properly update the TAG
based on the git tag.
Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Expose last-activity
Extend the notebook-controller to:
* cull idle Notebook Servers based on their new `last-activity`
annotation
* expose the last activity of each Notebook Server as an annotation
on the metadata of the corresponding CR object
Modify notebook_controller.go to:
* update the Last Activity of each Notebook Server that has a
Running pod
* delete the Last Activity Annotation for every Notebook Server
that does not have a Running pod
Extend culler.go to:
* perform culling based on the new `last-activity` annotation and
not based on the `/api/status` endpoint.
* update the last activity of a Notebook Server, based on the
kernels' execution states.
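The culling logic described above can be sketched roughly as follows. This is a Python illustration, not the Go code in culler.go: the annotation key, the function names and the exact update rule are assumptions. The kernel dicts mimic the `execution_state` and `last_activity` fields that Jupyter reports per kernel.

```python
from datetime import datetime, timedelta, timezone

# Assumed annotation key for illustration; the real key may be namespaced.
LAST_ACTIVITY = "last-activity"


def all_kernels_are_idle(kernels):
    """True if every kernel reports the "idle" execution state."""
    return all(k.get("execution_state") == "idle" for k in kernels)


def updated_last_activity(kernels, now, current=None):
    """Return the new last-activity time (datetime) for a Notebook Server."""
    if not kernels:
        return current  # nothing to learn from, keep what we have
    if not all_kernels_are_idle(kernels):
        return now  # at least one kernel is busy right now
    latest = max(k["last_activity"] for k in kernels)
    return latest if current is None else max(current, latest)


def needs_culling(annotations, now, idle_time):
    """Cull only if the annotation exists and is older than `idle_time`."""
    last = annotations.get(LAST_ACTIVITY)
    if last is None:
        return False  # e.g. no Running pod, so the annotation was deleted
    return now - datetime.fromisoformat(last) > idle_time


now = datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc)
annotations = {LAST_ACTIVITY: (now - timedelta(hours=2)).isoformat()}
print(needs_culling(annotations, now, idle_time=timedelta(hours=1)))
```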
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Athanasios Markou <athamark@arrikto.com>
* notebooks: Introduce a DEV env var
We introduce a DEV env var to allow admins
to develop and test their custom Notebook
Controller on their local machine.
We provide information and instructions inside
the components/notebook-controller/README.md.
Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* notebooks: Add unit tests for last-activity
* Introduce new tests for allKernelsAreIdle()
* Extend the tests for NotebookIsIdle() and for
NotebookNeedsCulling().
Signed-off-by: Athanasios Markou <athamark@arrikto.com>
Reviewed-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* review: UpdateNotebookLastActivityAnnotation()
Ensure that UpdateNotebookLastActivityAnnotation() does not return
"true". This function should not return any value.
Signed-off-by: Athanasios Markou <athamark@arrikto.com>
* Update the releasing version tag
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Run automated script for updating versions
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* wa(docker): Don't copy node_modules
To ensure Kaniko is not copying stale node_modules folders, even though
we have a dockerignore file, we are explicitly only copying the source
code.
We have seen the build system with Kaniko fail due to NFS stale
instances with files in node_modules, and we expect that this is the root
cause.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* twa: Update makefile to use global dockerignore
The dockerfile for TWA was copying over the local dockerignore. This was
overriding the global one we had for all the web apps.
This commit updates the Makefile of the app to use the global
dockerignore that all the apps should use.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* jwa(front): Make ng lint work by ignoring e2e/tsconfig.json
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* web-apps(front): Make the table responsive
We introduced all of our changes in the underlying TableComponent. This
component can then be either used independently or inside a material
card.
The changes we did in that component are:
1. Create Output() emitters, since this component can be used directly
2. The config object for a Table row now supports a `style` prop for
defining the list of CSS styles to be applied
3. Remove the truncate classes (small, medium, large) and only have
a boolean value. The user can define the width directly now via the
`style` property in the row's config
4. Modify the classes for aligning contents right and left
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* web-apps(front): Add table paginator
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* web-apps(front): Add padding to titlebar text
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* web-apps(front): Use bigger font for toolbar title
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* jwa(front): Use responsive table and toolbar
We refactor both the index and form pages of the app to:
1. Add a top row toolbar with the title of the app and the button to
create a new Notebook
2. Replace the card with a responsive table that shows the items. The
component also has a paginator
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* vwa(front): Use responsive table and toolbar
This commit:
1. Adds a toolbar at the top of the index page with the title of the app
and the button to create a new volume
2. Replaces the card with a responsive table
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* twa(front): Use responsive table and toolbar
This commit:
1. Adds a toolbar at the top of the index page with the title of the app
and a button to create a new TensorBoard instance
2. Replaces the card with a responsive table
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Tasos Alexiou <tasos@arrikto.com>
* fix the format
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* wa(front): Add npm script for running unit tests in docker
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* fix unit tests failing
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* kfam: Upgrade go to 1.17
Update to a more recent docker image that has a newer version of
openssl.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* poddefaults: Upgrade go to 1.17
Update to a more recent docker image that has a newer version of
openssl.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* tensorboards: Upgrade go to 1.17
Update to a more recent docker image that has a newer version of
openssl.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
JWA should not block users from selecting GPUs if the current cluster
nodes do not have any GPUs attached to them. We've seen users that have
autoscaled nodegroups for GPUs, so a GPU node will be added to the
cluster once a Pod has requested it.
Refs: arrikto/dev#1484
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
In the existing version, the 'timeout: 300s' added to the notebook's virtual service would cause websockets to disconnect at the 5 minute mark, causing the Jupyter Notebook web terminal function to hang. This is described in https://github.com/kubeflow/kubeflow/issues/6124.
* jwa(front): Don't allow NaN values in limits
The UI should always catch a NaN value and not add it to the form.
Currently this is the case for the cpu/memory limits.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* jwa(front): Limits should not be changed if dirty
If the user has manually edited the limits fields then the UI should not
try to automatically calculate them again, using the limitFactors.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Update dockerfiles and make compatible with Renovate
* Set memory for jupyter pytorch to the same as jupyter tensorflow
* Update protobuf
* Remove conda version and use substring expansion
* Update SQLAlchemy
* Update dill
* [fix]: Make jupyter-web-app parse workspace volume MountPath
- the workspace volume path was fixed to "/home/jovyan"
- it should be parsed from jupyter-web-app-config's data
* change parsing key correctly
* cwa(front): Ignore font files in assets
* feat(jupyter): add fonts as assets to service
* CRUD: fonts in common
* CWA: Remove link to css file
* jwa(front): Remove font assets from jupyter
Co-authored-by: Wendy Gaultier <wvgaultier@gmail.com>
* jwa(front): Add static logos in the app
The app does not contain the logos' svgs in its source code/static
files. As a result, the icons do not show when developing locally.
This commit adds the svgs found in the logos ConfigMap to the static
files of the app as well.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* jwa(front): Change logos fetch url
Change the URLs of the logos from `static/assets/*` to
`static/assets/logos`.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* jwa(manifests): Don't override assets with logos
Mount the ConfigMap under the `static/assets/logos` directory to not
override the contents of the entire assets dir.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* vwa(front): Add npm script to check the formatting
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* vwa(front): Update the package-lock.json
Run `npm install` to bring the package-lock.json up to date
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* vwa(front): Fix formatting
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* vwa(back): Fix formatting
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* vwa(docker): Remove unused dockerignore file
We have created a global dockerignore file for all the web apps in the
parent dir.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* vwa(docker): Don't copy node_modules in dockerfile
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* vwa(make): Don't include dockerignore
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* ci(vwa): Add format check tasks
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* jwa(docker): Copy only necessary files for build
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* review: Use docker context instead of cd ..
Don't use a `cd ..` and copy dockerignore files back and forth. Instead
we should use the Docker context and the global dockerignore file we
have for all the web apps.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Make notebook limits configurable with a multiplication factor
* Make limits configurable under advanced section
* run prettier to format frontend code
* fix formatting and add rounding in backend
* Return error if limit is smaller than request
* Allow disabling limitFactor by setting it to none
* review: remove camelCase in python backend
* fix: update spawner_ui_config.yaml in manifests directory
* review: fix setting limits backend
* review: remove unnecessary check from backend
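A rough sketch of the limit-factor behaviour from the commits above, assuming numeric requests (e.g. CPU cores). The function name and the one-decimal rounding are assumptions made for illustration:

```python
def compute_limit(request, limit_factor):
    """Return the limit for `request`, or None if the factor is disabled.

    `limit_factor` may be a number, a numeric string, or "none".
    """
    if str(limit_factor).lower() == "none":
        return None  # factor disabled, so no limit is derived
    limit = request * float(limit_factor)
    if limit < request:
        raise ValueError("limit must not be smaller than the request")
    return round(limit, 1)
```

For example, a request of 0.5 CPUs with a factor of "1.2" yields a limit of 0.6, while a factor of "none" leaves the limit unset.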
* rebase: Make logos configurable in configmap and remove trademark references
Rebased to remove the changes to the package-lock.json
* review: add suggested changes and add image group section to README
When the TB controller attempts to schedule a RWO PVC it checks its
accessModes in the PVC status. The controller panics if the list is
empty.
This commit adds a check to ensure the list is not empty.
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* web-apps(back): Introduce APP_SECURE_COOKIES var
Expose a new APP_SECURE_COOKIES env variable that will configure whether
the web apps should set Secure cookies or not.
This will allow the admins to configure the web apps to work when
Kubeflow is exposed over localhost/http.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(back): Switch CSRF checks order
The order the backend makes the CSRF checks should be the following:
1. check if the CSRF cookie is present
2. check if the CSRF header is present
3. check if the CSRF cookie and header have the same value
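The ordering above can be sketched as a single function. This is a simplified illustration over plain dicts; the cookie and header names are the `XSRF-TOKEN` / `X-XSRF-TOKEN` pair used by the web apps, while `check_csrf` and the error messages are hypothetical:

```python
import hmac


def check_csrf(cookies, headers):
    """Return an error message, or None if the request passes all checks."""
    cookie = cookies.get("XSRF-TOKEN")
    if cookie is None:
        return "CSRF cookie is missing"
    header = headers.get("X-XSRF-TOKEN")
    if header is None:
        return "CSRF header is missing"
    # Constant-time comparison of the two values.
    if not hmac.compare_digest(cookie.encode(), header.encode()):
        return "CSRF cookie and header do not match"
    return None
```

Checking the cookie first means a request with neither value reports the missing cookie, which is usually the more actionable error.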
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps: Set APP_SECURE_COOKIES to false in dev
When running the web apps via the makefiles in dev mode we will need to
explicitly set the APP_SECURE_COOKIES env var to False, since the app
will be served over http.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Add RStudio trademark statement
* move tooltip position to the right of the RStudio button
* fix labels of icons
* RStudio trademark tooltip on index page, remove title jupyter-icon
* jwa(back): ability to set annotations on NB resource
* jwa(back): update spawner yaml, dump logo from yaml if file doesn't exist
* jwa(front): add annotations and VSCode/RStudio image types/config
* jwa(front): add server type toggle to UI
* jwa(front): set annotations in notebook request based on server-type
* jwa(front): add server type column to index page
* review: improve button toggle formatting
* jwa(back): set rstudio-tidyverse image in spawner_ui_config
* review: move rewrite and headers to backend
* review: add logo SVGs and set them in environment*.ts
* review: fix how allowing custom images works
* review: add server type logo to index
* Add base dockerfile for all jupyter based images
* cleanup jupyter notebook image
Co-authored-by: Mathew Wicks <thesuperzapper@users.noreply.github.com>
* Add base dockerfile for all Web-IDE images (jupyter, r-studio, vs-code)
remove sudo from image
Add S6-overlay
Change naming and add CD
Add ci build test
change naming in prow config to avoid character limit
Add OWNERS file
rename folder
rename folder (again)
remove labels
Rename to the final folder
* cleanup base notebook image Dockerfile
Co-authored-by: Mathew Wicks <thesuperzapper@users.noreply.github.com>
* web-apps(back): Introduce an APP_NO_AUTHNZ env var
The admin can use the APP_NO_AUTHNZ={True,False} to configure if the
application should perform authnz checks or not.
In case of False, then the app will not be expecting a logged in user, in the
`kubeflow-userid` header, and will not perform authorization checks using
SubjectAccessReviews.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Expose if Dashboard is connected
The NamespaceService will also provide an observable that informs
different parts of the app if the CentralDashboard is present.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* review: Use enumeration for Dashboard state
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* review: Move common vars to a settings module
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
As part of the work of wg-manifests for 1.3
(https://github.com/kubeflow/manifests/issues/1735), we are moving manifests
development in upstream repos. This gives the application developers full
ownership of their manifests, tracked in a single place.
This commit copies the manifests for application `Jupyter Web App`
from path `apps/jupyter/jupyter-web-app/upstream` of kubeflow/manifests to path
`components/crud-web-apps/jupyter/manifests` of the upstream repo (https://github.com/kubeflow/kubeflow).
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
As part of the work of wg-manifests for 1.3
(https://github.com/kubeflow/manifests/issues/1735), we are moving manifests
development in upstream repos. This gives the application developers full
ownership of their manifests, tracked in a single place.
This commit copies the manifests for application `Notebook Controller`
from path `apps/jupyter/notebook-controller/upstream` of kubeflow/manifests to path
`components/notebook-controller/config` of the upstream repo (https://github.com/kubeflow/kubeflow).
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
In order for the frontend tests to run in a CI system, which will be
using a container to run them, we will need to make some adjustments.
Namely, we will need to:
* Run a headless version of Chrome
* Run the `ng test` in non-watch mode, in order for the testing process
to terminate and not watch for changes to the codebase.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(back): Export package code for wheel
When building the wheel we need to define Python packages
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Don't include init in root dir
We don't want to have an init file in the setup directory
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(back): Fetch the correct default SC
The backend would only check for the annotation of the default
StorageClass but not whether its value was true or false.
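A minimal sketch of the corrected check over dict-shaped StorageClasses. The helper name is hypothetical; the two keys are the standard and the older beta default-class annotations:

```python
DEFAULT_SC_ANNOTATIONS = (
    "storageclass.kubernetes.io/is-default-class",
    "storageclass.beta.kubernetes.io/is-default-class",
)


def default_storage_class(storage_classes):
    """Return the name of the first StorageClass marked as default, or None."""
    for sc in storage_classes:
        annotations = sc.get("metadata", {}).get("annotations") or {}
        # The annotation must be present AND set to the string "true".
        if any(annotations.get(key) == "true" for key in DEFAULT_SC_ANNOTATIONS):
            return sc["metadata"]["name"]
    return None
```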
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(back): Fetch Pod logs
Extend the common backend libraries to fetch pod logs
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Add global variables.scss
Since we will need the page padding in multiple places we'll create a
scss file to store these values.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Export the Dialog module
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Rework the Toolbar component
* Add option to make a button stroked
* Don't emit an event when a button is clicked. Expect a function which
will be executed from this component.
* Put all the buttons on the right to mimic Pipelines UI
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Make the app's background white
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Export commonly used modules
The mat-divider and mat-icon modules are often used so we could
include them in the exports of the KubeflowModule.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Key-val should be on top in list
In the list that shows key-value pairs in lines we want the key to remain
on top if the value has a big height. Previously the key-title would be
in the middle height of the value, which looked weird.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Add a Comment component for forms
This is used to explain different sections in the form.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Bar at the bottom for every form
We want our forms to have a bar on the bottom with CREATE CANCEL buttons
and the ability to view the yaml contents of a CR.
This component is only used for visualization. The logic component
should be handling this component via this component's inputs/outputs
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): General purpose css
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Make k8s fields optional
We might need to instantiate a k8s object with only some of the
subfields, like only the spec and not metadata.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* remove redundant link class
Co-authored-by: Tasos Alexiou <tasos@arrikto.com>
* support fit-content for firefox as well
Co-authored-by: Tasos Alexiou <tasos@arrikto.com>
* firefox fit-content support
Co-authored-by: Tasos Alexiou <tasos@arrikto.com>
* use button instead of span
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Co-authored-by: Tasos Alexiou <tasos@arrikto.com>
* Use node:12 as specified in the docs
* Move to buster-slim
Move images that build the kubeflow library and frontend to buster-slim to reduce image size. Tested with the Jupyter Web App, assumed to also work for Tensorboards Web App once https://github.com/kubeflow/kubeflow/issues/5529 is solved.
Upgrade go version of the notebook-controller to 1.15, across the
Dockerfile, Makefile and README. We used the same Golang version as our Kubernetes
dependency, after @Jeffwan's suggestion.
* web-apps(back): Helper config functions
Introduce helper function for creating the config object for an app.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* twa: Update the build process
Update both the frontend and the backend of the Tensorboards web app to
follow the build/run process of the other web apps as well.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* twa: Update the README
Restructure the README to look like the JWA one. Also update the
instructions with the latest process for running the web app.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Correct ContainerStatus of Notebook CR
The Notebook Controller doesn't set the State of the CR correctly. In some cases
the first container is the istio-sidecar which results in an incorrect state being
shown in the Notebook CR. This is fixed now by copying the Notebook container's
ContainerState to the Notebook CR's ContainerState
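The selection can be illustrated with a short Python sketch (the controller is Go). It assumes the Notebook container is named after the Notebook CR, and `notebook_container_state` is a hypothetical helper:

```python
def notebook_container_state(pod, notebook_name):
    """Return the state of the Notebook container, not of the first container.

    Assumes the Notebook container carries the same name as the Notebook CR.
    """
    for status in pod.get("status", {}).get("containerStatuses", []):
        if status.get("name") == notebook_name:
            return status.get("state")
    return None  # Notebook container not reported yet


pod = {
    "status": {
        "containerStatuses": [
            {"name": "istio-proxy", "state": {"running": {}}},
            {"name": "my-nb", "state": {"waiting": {"reason": "ImagePullBackOff"}}},
        ]
    }
}
state = notebook_container_state(pod, "my-nb")
```

Taking `containerStatuses[0]` here would report the sidecar as running even though the Notebook container is still waiting.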
* Changed log statement and added a comment
Implemented remarks of @yanniszark and @kimwnasptd
* Small reorganization of some if statements
* web-apps(back): Add CSRF protection to the backend
The server of each crud-web-app will be setting an XSRF-TOKEN cookie to
the frontend. On each unsafe method (POST, PATCH etc) the backend will
check to make sure that the request:
* Contains an XSRF-TOKEN cookie
* Contains an X-XSRF-TOKEN header
* The two values above are the same
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(back): Document the CSRF_SAMESITE env var
Add a new table in the README of the common code to include the ENV vars
that a user can set in any web app. In the future we should also extend
the README of every app with the supported ENV vars.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Update the README
Update the readme with detailed commands on how to consume the library
as well as developer guidelines.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Fix typo in README
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Update the common library
Add new components to the library. These components will enhance
* The current common table for visualizing objects
* The components we can use for a details-page for each object
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* web-apps(front): Add unit tests to common lib
Fix and introduce new unit tests for most of the components in the
library. We expect the developers to always run `ng test` before any PR
to ensure that the existing functionality is not broken.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* jwa(front): Add required packages for common lib
The common library will expect extra npm modules to be installed in each
app that consumes it.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
We use the local `../common` module to build `notebook-controller`. We
also need to specify a valid pseudo-version for `common` to support
importing the Notebook API in other modules. This is because according
to the `go.mod` docs [1]:
> exclude and replace directives only operate on the current (“main”)
> module. exclude and replace directives in modules other than the main
> module are ignored when building the main module.
If we don't replace the default "zero version" for `common` that is
generated in our require directive, then builds fail for modules
that require the Notebook API. They will encounter an "invalid
version" error for `common` at commit hash "000000000000".
[1]: https://github.com/golang/go/wiki/Modules#gomod
* Implemented functional tests using ginkgo
The notebook controller can be tested using sigs.k8s.io/controller-runtime/pkg/envtest which comes as part of kubebuilder. With this we should be able to get measurable test coverage.
* Fixed the incorrect test condition and included fix to download the envtest binaries.
* Some tweaks based on review.
* Removed the check-license as it was blocking the test.
Included some of the tweaked yaml's files that were being generated.
The default leader election ID is controller-leader-election-helper which could conflict when multiple controllers run within the same namespace. This is a required field in later versions of controller-runtime.
* Update the backend
For the frontend to work properly we will need to add the following
changes to jupyter web app's backend as well as to the common backend
code:
* rename the references from `flask_rest_backend` to `crud_backend` in
the web app's backend code
* add a route for exposing GPU info. This way the UI will block users
from creating Notebooks with a GPU type that is not installed at all
in the cluster
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Update the common frontend library
New functionality added:
* An `Advanced Settings` button that can expand and shrink to
expose/hide more options in the form
* All validators will have a debounce time to make the input of
characters smoother
* Extend the Status types to allow start/stopped resources
* Extend the main table config to support a button [ ex CONNECT for
jupyter web app ]
* The http services should use relative URLs
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Update the frontend to utilize the common code
The bulk of the new frontend code. The folder structure is changed to
make it clearer which pages the app uses.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* {Make,Docker}files
Add Makefile and Dockerfiles. Note that GCB build process needs to be
updated.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* README.md
Add a readme that explains how to build the app and have a development
environment.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* WA: Backend common: update the library
Update the common python wheel wrt:
* How to distinguish between dev and prod mode
* Extra routes for handling Notebooks
* Serving the index.html for every non api route (SPA)
* Add a STOPPED state to the possible Status values
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* JWA: Add the refactored backend
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* JWA: Backend: Add support for Affinity/Tolerations
* Extend the configuration yaml with default form values for the
affinity/tolerations
* Set them accordingly when the user submits a notebook
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
Reviewed-by: Ilias Katsakioris <elikatsis@arrikto.com>
* Add frontend for the Tensorboard Web-app
This commit contains the code for the frontend of the Tensorboard
web-app. It completes the GSoC 2020 project for building the
standalone TWA for Kubeflow.
The app is not yet fully integrated to the Kubeflow dashboard, so
the README.md file contains documentation on how to build, run and
use the web-app locally.
Also, a Dockerfile was added in order to build a playground image
of the web-app. The 'deploy' folder contains manifests that will
enable the TWA to run properly on the cluster in the future.
* Update README.md
* Add RWO_PVC_SCHEDULING env var to Tensorboard controller deployment
The value of the 'RWO_PVC_SCHEDULING' env var is set to "false" by
default. The user will be able to change the value of the env var
manually by modifying the 'config/manager/manager.yaml' file.
* Update README.md
* Add Tensorboard controller permissions for managing resources
The pod running the Tensorboard controller didn't have permissions
to manage the deployments, services, and VirtualServices needed
so that the Tensorboard servers would function properly.
In order for the deployed Tensorboard controller to run properly,
permissions to 'get', 'list', 'watch', 'create' and 'update'
are given to the Tensorboard controller pod so that the necessary
deployments, services and VirtualServices are created and managed
as expected. Also, permissions to 'get', 'list', 'watch' PVCs and
pods were added.
* Add namespace of Tensorboard CR to VirtualService prefix
In order to avoid creating 2 virtual services that have the same
prefix in different namespaces, the namespace of the corresponding
Tensorboard CR was added in the prefix of the generated Virtual
Service.
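A minimal sketch of the idea, in Python for brevity. The exact prefix layout (`/tensorboard/<namespace>/<name>/`) and the helper name are assumptions, not necessarily what the controller emits.

```python
# Hypothetical helper: build a VirtualService URI prefix that is unique
# across namespaces by including the namespace of the Tensorboard CR.
def virtual_service_prefix(namespace, name):
    return f"/tensorboard/{namespace}/{name}/"


# Two Tensorboards with the same name in different namespaces no longer clash.
prefix_a = virtual_service_prefix("team-a", "my-tb")
prefix_b = virtual_service_prefix("team-b", "my-tb")
```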
* Fix directory bug in Makefile
* Add README.md
* Extend Tensorboard CRD with status.readyReplicas field
The Tensorboard CRD didn't contain any information about the
Tensorboard server being ready or not. So, the status of the
Tensorboard resource is extended so that it contains a
readyReplicas field, similar to the status.readyReplicas of
the deployment of the Tensorboard server.
* Extend Tensorboard controller to update status of Tensorboard CR
The frontend of the Tensorboard web-app will need information
about whether the Tensorboard servers are ready to connect or not.
As a result, the Tensorboard controller now copies the value of the
status.readyReplicas field of the Tensorboard deployment to the
status.readyReplicas of the Tensorboard CR.
Also, a Deployment() function was added for applying and updating
Tensorboard server deployments.
* Update tensorboard.status.phase of TWA backend response
The frontend of the TWA will need information about the status
of the Tensorboard server, so that it can inform the user about
the server being ready to connect or not.
As a result, the backend sets the status.phase field of the response
to "ready", if tensorboard.status.readyReplicas == 1. Otherwise, the
status.phase field of the response is set to "unavailable".
Also, the getPVCName() function was added, which extracts the name
of a given PVC object.
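The phase rule described above can be sketched as follows. The function name and the dict-shaped CR are assumptions for illustration; only the "ready"/"unavailable" values and the `readyReplicas == 1` condition come from the commit message.

```python
# Sketch of the phase rule: "ready" only when exactly one replica is ready.
def tensorboard_phase(tensorboard):
    # A CR without a status yet is treated as having zero ready replicas.
    ready = tensorboard.get("status", {}).get("readyReplicas", 0)
    return "ready" if ready == 1 else "unavailable"


phase_ready = tensorboard_phase({"status": {"readyReplicas": 1}})
phase_new = tensorboard_phase({"metadata": {"name": "tb"}})
```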
* Add GET route for PVCs
The Tensorboard web-app frontend will be using an autocomplete
drop-bar to show the user the PVCs that live in a specific namespace.
These PVCs could be used as log storages for the Tensorboard server.
So, a PVC GET route was added to the Tensorboard web-app backend.
* Add message to Tensorboard response object in TWA backend
The frontend of the TWA will need to output a response message for
every Tensorboard object. This response message will inform the
user about the current state of the Tensorboard server.
* Use status.STATUS_PHASE for backend response
* Add requirements.txt to TWA backend
* Use status.create_status() for backend response
Create an Angular Library with common frontend code. Our crud web apps
should use this library to share common functionality like:
* Talking to Central Dashboard for the Namespace selection
* Making http calls
* Surfacing and showing error messages and warnings
* Form utilities
* Showing a table with entries and actions
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Add indexers as custom field selectors for list requests to cache
The tensorboard controller must be able to list pods that have
mounted a PVC with a specific ClaimName.
In order for this list request to cache to work properly, custom
field selectors are added. These selectors are used to index the
"pod.spec.volumes.persistentvolumeclaim.claimname" field so that
unneeded pods can be filtered out.
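To illustrate what such an index does, here is a language-neutral sketch in Python. It is not the controller-runtime API; `claim_names` and `build_index` are hypothetical names, and pods are modelled as plain dicts.

```python
from collections import defaultdict


# Extract the values the index is keyed on: every PVC claim name a pod mounts.
def claim_names(pod):
    names = []
    for volume in pod.get("spec", {}).get("volumes", []):
        pvc = volume.get("persistentVolumeClaim")
        if pvc:
            names.append(pvc["claimName"])
    return names


# Build claimName -> [pod names], so a lookup skips all unrelated pods.
def build_index(pods):
    index = defaultdict(list)
    for pod in pods:
        for name in claim_names(pod):
            index[name].append(pod["metadata"]["name"])
    return index


pods = [
    {"metadata": {"name": "trainer"},
     "spec": {"volumes": [{"persistentVolumeClaim": {"claimName": "logs"}}]}},
    {"metadata": {"name": "web"},
     "spec": {"volumes": [{"emptyDir": {}}]}},
]
index = build_index(pods)
```

Listing by claim name then becomes a dictionary lookup instead of a scan over every pod.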
* Set pod's nodeAffinity if log files exist in a PVC
In the case of using a PVC as a logdir for Tensorboard Server, if
the PVC had a ReadWriteOnce access mode and was already mounted by
another running pod X, then the Tensorboard Server pod would not
always be scheduled on the same node as X. As a result, the
Tensorboard Server pod would be blocked since multi-node access
is prohibited on ReadWriteOnce volumes.
In order for the Tensorboard Server pod to run successfully,
nodeAffinity was added to the spec.template.spec.affinity field
of the returned deployment.
As a result, both X and the Tensorboard
Server pod are now scheduled on the same node.
Resolves kubernetes/kubernetes#26567
* Set Tensorboard Server scheduling feature to 'off' by default
In the case that the Tensorboard Server used a RWO PVC (as a log
storage) that was already mounted by another pod, nodeAffinity
was used so that the Tensorboard Server would be scheduled
(if possible) on the same node as that pod.
Now, this added functionality is used only if the
'RWO_PVC_SCHEDULING' environment variable is set to "true"
when running the Tensorboard controller.
This scheduling functionality is disabled by default.
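A hedged sketch of the gating logic, in Python. The helper names are hypothetical, and the use of a preferred (rather than required) affinity term with weight 100 is an assumption based on the "if possible" wording; only the env var name and its "false" default come from the commit messages.

```python
# Scheduling is opt-in: anything other than "true" keeps it disabled.
def rwo_scheduling_enabled(env):
    return env.get("RWO_PVC_SCHEDULING", "false").lower() == "true"


# Build the affinity stanza that nudges the pod onto `node_name`, the node
# where the RWO PVC is already mounted. Returns None when nothing is needed.
def affinity_for(node_name, env):
    if not rwo_scheduling_enabled(env) or not node_name:
        return None
    return {
        "nodeAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [{
                "weight": 100,
                "preference": {
                    "matchExpressions": [{
                        "key": "kubernetes.io/hostname",
                        "operator": "In",
                        "values": [node_name],
                    }]
                },
            }]
        }
    }


disabled = affinity_for("node-1", {})
enabled = affinity_for("node-1", {"RWO_PVC_SCHEDULING": "true"})
```

In a real process the `env` argument would typically be `os.environ`.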
* Create Tensorboard web-app backend
Create the code for the Tensorboard web-app backend which
includes routes for GET, POST and DELETE requests.
The backend is created with Python/Flask, so it also uses
the common code from 'kubeflow.kubeflow.crud_backend'.
* Add 'get_age(k8s_object)' function to 'crud_backend' common code
It would be useful for all web apps of the 'crud-web-apps' folder
to return age information to their frontends.
As a result, 'get_age(k8s_object)' was added to the common code,
so that all web apps can use it.
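One way such a helper could look is sketched below. The return type (a `timedelta`), the optional `now` parameter and the dict-shaped object are assumptions; the real function in `crud_backend` may return a different structure.

```python
from datetime import datetime, timezone


# Sketch: compute how long ago a Kubernetes object was created.
def get_age(k8s_object, now=None):
    raw = k8s_object["metadata"]["creationTimestamp"]
    # Kubernetes serializes creation timestamps as RFC 3339 in UTC.
    created = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ")
    created = created.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - created


obj = {"metadata": {"creationTimestamp": "2020-08-01T12:00:00Z"}}
age = get_age(obj, now=datetime(2020, 8, 2, 12, 0, 0, tzinfo=timezone.utc))
```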
Create a python module under the kubeflow.kubeflow package that will
be exposing common code and a base app that takes care of:
* Exceptions handling
* Common routes for serving static files and their cache control policy
* Authorization checks with SubjectAccessReview
* Authentication checks on the Kubeflow headers
* Common helper functions for dates, yaml parsing etc
* health/liveness probes
Backends that are written with Python/Flask should use this common code
in order for us to reduce code duplication and have our backends align
with our accepted practices.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Create a new directory in components for web apps
Since we want to also have some common code between our web apps we
should create a parent dir for any future web app we want to develop.
The code for the web apps, common or not, should be organized under this
directory.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* remove the reviewers
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Remove duplicate package import
Package "k8s.io/api/core/v1" was imported twice with names "v1"
and "corev1".
* Mount GCP secret only when accessing Google storage
The Tensorboard controller used to create pods (running the Tensorboard
server) that would always mount user-gcp-sa secret, regardless of the
logs storage being a Google cloud bucket or not. This would lead to pods
never starting properly in the case of using other cloud services (or
PVCs) as log storages, if the user-gcp-sa secret didn't exist on the
cluster.
In order for the Tensorboard server pods to run properly, user-gcp-sa
secret is now mounted only when Google cloud buckets are used as log
storages.
Fixes kubeflow/kubeflow#5065
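The conditional mount can be sketched as below, in Python. The `gs://` prefix check, the helper names and the volume name `gcp-creds` are assumptions for illustration; only the secret name `user-gcp-sa` comes from the commit message.

```python
# Assumption: Google cloud bucket paths are recognised by their gs:// scheme.
def is_google_cloud_path(logspath):
    return logspath.startswith("gs://")


# Return the extra volumes for the Tensorboard server pod. The secret is
# only referenced when the logs actually live in a Google cloud bucket.
def secret_volumes_for(logspath):
    if not is_google_cloud_path(logspath):
        return []
    return [{"name": "gcp-creds", "secret": {"secretName": "user-gcp-sa"}}]


gcs_volumes = secret_volumes_for("gs://my-bucket/logs")
pvc_volumes = secret_volumes_for("pvc://my-claim/logs")
```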
* Allowing for an env var ADD_FSGROUP to be set to false to suppress the automatic addition of fsGroup: 100 in the pod's security context.
This addresses issue #4617.
* Adding note in README regarding ADD_FSGROUP.
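A small sketch of the toggle, in Python. The helper name and the exact parsing of the value are assumptions; the env var name, the opt-out semantics and `fsGroup: 100` come from the commit message.

```python
# fsGroup: 100 is added unless ADD_FSGROUP is explicitly set to "false".
def pod_security_context(env):
    if env.get("ADD_FSGROUP", "true").lower() == "false":
        return {}
    return {"fsGroup": 100}


default_ctx = pod_security_context({})
suppressed_ctx = pod_security_context({"ADD_FSGROUP": "false"})
```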
This commit fixes the event filtering check, so it doesn't crash when
the Pod name doesn't contain a dash ("-").
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
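The kind of guard this fix implies can be sketched as follows, in Python. It assumes the check derives the owning Notebook name by stripping the trailing `-<ordinal>` that a StatefulSet appends to Pod names; the function name is hypothetical.

```python
# Derive the owner name from a StatefulSet-style Pod name ("<name>-<ordinal>").
# Returns None instead of failing when the name has no dash at all.
def owner_name_from_pod(pod_name):
    idx = pod_name.rfind("-")
    if idx == -1:
        return None
    return pod_name[:idx]


with_dash = owner_name_from_pod("my-notebook-0")
without_dash = owner_name_from_pod("standalone")
```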
* Fix docker builds of notebook and tensorboard controller
* The notebook-controllers and tensorboard-controllers now depend on
the go package components/common
* We need to rewrite the Dockerfiles so that the context is now
${KUBEFLOW_REPO}/common
* so that components/common can be included in the context and copied
to the Dockerfile
* Create skaffold configs to make it easier to do remote builds with Kaniko
* The skaffold configs are currently written assuming the kubeflow-ci cluster
is used to build the images. This could be generalized in the future.
* Remove the code to build the notebook-controller with GCB; we can just
use skaffold and kaniko to do efficient remote builds.
* Related to #4582 - Jupyter image doesn't build.
* Fix docker build rule.
* The jupyter docker image isn't building because it now depends on code
in components/common
* To make this work we need to configure it as a multi module package
and modify go.mod to redirect to a local path.
* Ref: https://github.com/golang/go/wiki/Modules#when-should-i-use-the-replace-directive
* Replaces PR #4583
Related to #4582 - Jupyter image doesn't build.
* Delete all the Tekton pipelines and scripts for continuous delivery
of Kubeflow applications because they are moving into kubeflow/testing
* kubeflow/testing#551 is the PR moving the code into kubeflow/testing
Related to: kubeflow/testing#544 redo how we use kustomize and Tekton
to parameterize the pipelines
* Migrate to kustomize3: Phase 1. Update kustomization.yaml
* Migrate to kustomize3: Phase 2: Update kustomize.go
- Update kustomize.go to match new package structure.
- Update module dependencies.
* Migrate to kustomize3: Phase 3: Implements code review
- As per request, revert kustomization.yaml back to deprecated syntax.
- As per request, revert kustomize.go to use deprecated .Bases field.
- Note: patchesStrategicMerge: will be turned into a deprecated field pretty soon.
- Rerun go mod tidy
* Migrate to kustomize3: Phase 4: Activate legacy order transformer
* Create a culler as a package
Helper functions for culling resources. Takes for granted that ISTIO is
installed to the system and queries Prometheus to get metrics.
Specifically, requests/{configurable time}.
If the resource should be culled, then it should be done by setting an
annotation. This way the UIs can also show that the Resource is stopping
and also easily stop a resource by making a PATCH request.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Culling logic enhancements
Add necessary ENV Vars. Culling won't happen by default. To enable it
the user will need to set ENABLE_CULLING=true.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
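A minimal sketch of annotation-based culling, in Python. The annotation key `kubeflow-resource-stopped`, the helper names and the timestamp value are assumptions for illustration; only the ENABLE_CULLING variable and the "stop by setting an annotation" approach come from the commit messages.

```python
# Assumed annotation key; the controller defines its own STOP_ANNOTATION.
STOP_ANNOTATION = "kubeflow-resource-stopped"


# Culling is opt-in via the ENABLE_CULLING env var.
def culling_enabled(env):
    return env.get("ENABLE_CULLING", "false").lower() == "true"


# Mark a resource as stopped by annotating it. A UI can do the same with a
# PATCH request, and can show the resource as stopping when the key exists.
def mark_stopped(resource, timestamp):
    metadata = resource.setdefault("metadata", {})
    annotations = metadata.setdefault("annotations", {})
    annotations[STOP_ANNOTATION] = timestamp
    return resource


def is_stopped(resource):
    annotations = resource.get("metadata", {}).get("annotations") or {}
    return STOP_ANNOTATION in annotations


notebook = {"metadata": {"name": "nb"}}
if culling_enabled({"ENABLE_CULLING": "true"}):
    mark_stopped(notebook, "2020-05-01T10:00:00Z")
```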
* Misc fixes in logging and comment cleanup
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Fix typo
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Add Notebooks specific culling
Query the /api/status endpoint of each Server
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
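The idle check that this endpoint enables can be sketched as below, in Python. It assumes the response body is JSON with a `last_activity` timestamp, as Jupyter's status endpoint reports; the helper names and the idle threshold are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


# Parse the `last_activity` field out of an /api/status response body.
def last_activity(status_body):
    raw = json.loads(status_body)["last_activity"]
    parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


# A server counts as idle when its last activity is older than `idle_time`.
def is_idle(status_body, now, idle_time):
    return now - last_activity(status_body) > idle_time


body = '{"last_activity": "2020-05-01T10:00:00.000000Z"}'
now = datetime(2020, 5, 2, 10, 0, 0, tzinfo=timezone.utc)
idle_after_12h = is_idle(body, now, timedelta(hours=12))
idle_after_2d = is_idle(body, now, timedelta(days=2))
```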
* Remove the generic culling logic
We need to discuss if it would make sense to have this logic as a go
library, or use knative.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Add unit tests
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Remove unused code
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Review changes #1
* rename `getEnvDef` to `getEnvDefault`
* Add a comment to describe how the STOP_ANNOTATION gets used
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Make cluster domain configurable
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* added details into NotebookCondition to keep track of notebook container status change
* update notebook controller image
* fix conditions update
* small fix
* temporary changes to debug
* temporary remove delete step from workflow for debugging
* temporary merging kfctl-test and kfctl-go-test for debugging
* debugging
* undo the mistake
* debugging
* debugging tests
* merged kfctl-test and kfctl-go-test
* remove wait-for-kubeflow
* merged with master
* remove test delete step for debugging
* small fix
* update jupyter test component
* update condition test for jupyter component
* revert back deleting step
* revert back change in kfctl.sh
* added some temporary change to debug jupyter-test
* revert back temp changes
* profile and Istio integration
* make profile manage Istio gateway
* add README.md
* make notebooks use gateway in kubeflow namespace
* gateway format to ns/name; add watch for istio ServiceRoleBinding
* Support setting auth header format via parameter
* update README
* update README
* update readme; resolve comments
* added ReadyReplicas status to notebook-controller
* fixed issues related to updating the notebook status
* fixed a problem in updating Notebook's status
* applied cr comments
* small change
* small formatting change
* Fix Python code styles based on Pep8 and flake8
* More style fixes to Python code
* Update python code styles based on what's provided in .style.yapf
* Sync with master and update styles
* Sync with master
* More Python style fixes
* Changes per code review
* Sync with master and update the remaining files
* Add a .flake8 config file for future reference