* Create Tensorboard web-app backend
Create the code for the Tensorboard web-app backend which
includes routes for GET, POST and DELETE requests.
The backend is created with Python/Flask, so it also uses
the common code from 'kubeflow.kubeflow.crud_backend'.
* Add 'get_age(k8s_object)' function to 'crud_backend' common code
It would be useful for all web apps of the 'crud-web-apps' folder
to return age information to their frontends.
As a result, 'get_age(k8s_object)' was added to the common code,
so that all web apps can use it.
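For illustration only, a minimal sketch of what such a helper could look like. The name `get_age_sketch`, the dict-shaped input, and the `kubectl`-like output format are assumptions, not the actual `crud_backend` implementation.

```python
from datetime import datetime, timezone


def get_age_sketch(k8s_object, now=None):
    """Return a short human-readable age for a Kubernetes object dict.

    Hypothetical helper: assumes the object is a plain dict carrying
    metadata.creationTimestamp in the usual RFC 3339 "Z" form.
    """
    created = datetime.strptime(
        k8s_object["metadata"]["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    seconds = int((now - created).total_seconds())
    # Pick the largest unit that fits, similar to `kubectl get` output.
    for unit, size in (("d", 86400), ("h", 3600), ("m", 60)):
        if seconds >= size:
            return f"{seconds // size}{unit}"
    return f"{max(seconds, 0)}s"
```

A frontend could then render the returned string directly in an "Age" column.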
Create a Python module under the kubeflow.kubeflow package that
exposes common code and a base app that takes care of:
* Exception handling
* Common routes for serving static files and their cache control policy
* Authorization checks with SubjectAccessReview
* Authentication checks on the Kubeflow headers
* Common helper functions for dates, YAML parsing, etc.
* Health/liveness probes
Backends that are written with Python/Flask should use this common code
in order for us to reduce code duplication and have our backends align
with our accepted practices.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Create a new directory in components for web apps
Since we want to also have some common code between our web apps we
should create a parent dir for any future web app we want to develop.
The code for the web apps, common or not, should be organized under this
directory.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* remove the reviewers
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Remove duplicate package import
Package "k8s.io/api/core/v1" was imported twice with names "v1"
and "corev1".
* Mount GCP secret only when accessing Google storage
The Tensorboard controller used to create pods (running the Tensorboard
server) that would always mount user-gcp-sa secret, regardless of the
logs storage being a Google cloud bucket or not. This would lead to pods
never starting properly in the case of using other cloud services (or
PVCs) as log storages, if the user-gcp-sa secret didn't exist on the
cluster.
In order for the Tensorboard server pods to run properly, user-gcp-sa
secret is now mounted only when Google cloud buckets are used as log
storages.
Fixes kubeflow/kubeflow#5065
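The conditional mount can be sketched as follows. This is Python for illustration only (the controller itself is written in Go). The `gs://` prefix check, the function name, the volume name and the mount path are all assumptions; only the `user-gcp-sa` secret name comes from the description above.

```python
def build_tensorboard_volumes(logs_path):
    """Return (volumes, volume_mounts) for a Tensorboard pod spec.

    Hypothetical sketch: the secret is attached only for Google Cloud
    Storage paths, so other storages do not require it to exist.
    """
    volumes, mounts = [], []
    # Assumption: Google cloud buckets are recognised by the "gs://" scheme.
    if logs_path.startswith("gs://"):
        volumes.append(
            {"name": "gcp-creds", "secret": {"secretName": "user-gcp-sa"}}
        )
        mounts.append(
            {"name": "gcp-creds", "mountPath": "/secret/gcp", "readOnly": True}
        )
    return volumes, mounts
```

With this shape, a PVC or non-Google bucket path yields empty lists, so the pod no longer waits on a secret that may not exist.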
* Allowing for an env var ADD_FSGROUP to be set to false to suppress the automatic addition of fsGroup: 100 in the pod's security context.
This addresses issue #4617.
* Adding note in README regarding ADD_FSGROUP.
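A minimal sketch of the ADD_FSGROUP toggle, in Python for illustration only (the controller is written in Go). The function name and the case-insensitive comparison are assumptions; the default of adding `fsGroup: 100` unless the variable is set to false follows the description above.

```python
import os


def build_security_context(environ=os.environ):
    """Return the pod securityContext dict, honouring ADD_FSGROUP.

    Hypothetical sketch: fsGroup 100 is added unless ADD_FSGROUP is
    explicitly "false" (compared case-insensitively, an assumption).
    """
    add_fsgroup = environ.get("ADD_FSGROUP", "true").strip().lower() != "false"
    return {"fsGroup": 100} if add_fsgroup else {}
```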
This commit fixes the event filtering check, so it doesn't crash when
the Pod name doesn't contain a dash ("-").
Signed-off-by: Yannis Zarkadas <yanniszark@arrikto.com>
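A sketch of the kind of guard this implies, in Python for illustration only (the controller is written in Go). It assumes the check derives the owning resource name by cutting the Pod name at its last dash, as in StatefulSet-style names like `<name>-<ordinal>`; the function name is hypothetical.

```python
def owner_name_from_pod(pod_name):
    """Strip the trailing "-<suffix>" from a Pod name.

    Returns None instead of failing when the name has no dash, so the
    caller can simply skip events for such Pods.
    """
    idx = pod_name.rfind("-")
    if idx == -1:
        return None
    return pod_name[:idx]
```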
* Fix docker builds of notebook and tensorboard controller
* The notebook-controller and tensorboard-controller now depend on
the go package components/common
* We need to rewrite the Dockerfiles so that the context is now
${KUBEFLOW_REPO}/common, so that components/common can be included
in the context and copied in the Dockerfile
* Create skaffold configs to make it easier to do remote builds with Kaniko
* The skaffold configs are currently written assuming the kubeflow-ci cluster
is used to build the images. This could be generalized in the future.
* Remove the code to build the notebook-controller with GCB; we can just
use skaffold and kaniko to do efficient remote builds.
* Related to #4582 - Jupyter image doesn't build.
* Fix docker build rule.
* The jupyter docker image isn't building because it now depends on code
in components/common
* To make this work we need to configure it as a multi module package
and modify go.mod to redirect to a local path.
* Ref: https://github.com/golang/go/wiki/Modules#when-should-i-use-the-replace-directive
* Replaces PR #4583
Related to #4582 - Jupyter image doesn't build.
* Delete all the Tekton pipelines and scripts for continuous delivery
of Kubeflow applications because they are moving into kubeflow/testing
* kubeflow/testing#551 is the PR moving the code into kubeflow/testing
Related to: kubeflow/testing#544 redo how we use kustomize and Tekton
to parameterize the pipelines
* Migrate to kustomize3: Phase 1. Update kustomization.yaml
* Migrate to kustomize3: Phase 2: Update kustomize.go
- Update kustomize.go to match new package structure.
- Update module dependencies.
* Migrate to kustomize3: Phase 3: Implement code review feedback
- As per request, revert kustomization.yaml back to deprecated syntax.
- As per request, revert kustomize.go to use deprecated .Bases field.
- Note: patchesStrategicMerge: will be turned into a deprecated field pretty soon.
- Rerun go mod tidy
* Migrate to kustomize3: Phase 4: Activate legacy order transformer
* Create a culler as a package
Helper functions for culling resources. Assumes that Istio is
installed on the system and queries Prometheus to get metrics,
specifically requests/{configurable time}.
If the resource should be culled, then it should be done by setting an
annotation. This way the UIs can also show that the Resource is stopping
and also easily stop a resource by making a PATCH request.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Culling logic enhancements
Add the necessary env vars. Culling won't happen by default. To enable
it, the user will need to set ENABLE_CULLING=true
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Misc fixes in logging and comment cleanup
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Fix typo
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Add Notebooks specific culling
Query the /api/status endpoint of each Server
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
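The idle check can be sketched as below, in Python for illustration only (the controller is written in Go). It assumes the `/api/status` response carries a `last_activity` ISO 8601 timestamp; the annotation key, function names and threshold parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical annotation key; the real STOP_ANNOTATION value is not shown here.
STOP_ANNOTATION = "example.com/resource-stopped"


def should_cull(status, idle_minutes, now=None):
    """Return True when the server has been idle longer than idle_minutes.

    `status` is the parsed JSON body of /api/status (assumed shape).
    """
    last_activity = datetime.fromisoformat(
        status["last_activity"].replace("Z", "+00:00")
    )
    now = now or datetime.now(timezone.utc)
    return now - last_activity > timedelta(minutes=idle_minutes)


def stop_patch(now):
    """Build a PATCH body that marks the resource as stopped via annotation."""
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"metadata": {"annotations": {STOP_ANNOTATION: stamp}}}
```

Marking the resource with an annotation, rather than deleting it, lets a UI show the stopping state and resume or stop the resource with a single PATCH request.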
* Remove the generic culling logic
We need to discuss if it would make sense to have this logic as a go
library, or use knative.
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Add unit tests
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Remove unused code
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Review changes #1
* rename `getEnvDef` to `getEnvDefault`
* Add a comment to describe how the STOP_ANNOTATION gets used
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* Make cluster domain configurable
Signed-off-by: Kimonas Sotirchos <kimwnasptd@arrikto.com>
* added details to NotebookCondition to keep track of notebook container status changes
* update notebook controller image
* fix conditions update
* small fix
* temporary changes to debug
* temporarily remove delete step from workflow for debugging
* temporarily merging kfctl-test and kfctl-go-test for debugging
* debugging
* undo the mistake
* debugging
* debugging tests
* merged kfctl-test and kfctl-go-test
* remove wait-for-kubeflow
* merged with master
* remove test delete step for debugging
* small fix
* update jupyter test component
* update condition test for jupyter component
* revert back deleting step
* revert back change in kfctl.sh
* added some temporary change to debug jupyter-test
* revert back temp changes
* profile and Istio integration
* make profile manage Istio gateway
* add README.md
* make notebooks use gateway in kubeflow namespace
* gateway format to ns/name; add watch for istio ServiceRoleBinding
* Support setting auth header format via parameter
* update README
* update README
* update readme; resolve comments
* added ReadyReplicas status to notebook-controller
* fixed issues related to updating the notebook status
* fixed a problem in updating Notebook's status
* applied cr comments
* small change
* small formatting change
* Fix Python code styles based on PEP 8 and flake8
* More style fixes to Python code
* Update python code styles based on what's provided in .style.yapf
* Sync with master and update styles
* Sync with master
* More Python style fixes
* Changes per code review
* Sync with master and update the remaining files
* Add a .flake8 config file for future reference