* checkpointing
* more updates to keep gh summ pipelines example current
cleanup & update; remove obsolete pipelines
create 'preemptible' version of hosted kfp pipeline
notebook update, readme update
* in notebook, add kernel restart after pip install
minor pipeline cleanup
add archive version of pipeline
* fixed namespace glitch, cleaned up css positioning issue
* Add text classification example in TF 2.0
* Add neural machine translation example in TF 2.0
* Add README for tensorflow_2 directory
* Update README.md
Update README to make it more verbose and formal.
* Update directory name
This is important because this is an E2E tutorial. Moreover, those docs mention the catch that the GCP Free Tier and the 12-month trial period with $300 credit do not offer enough resources to run the default GCP installation of Kubeflow.
* Changes pulled in from kubeflow/examples#764
* Notebook tests should print a link to the stackdriver logs for
the actual notebook job.
* Related to kubeflow/testing#613
Co-authored-by: Gabriel Wen <gabrielwen@google.com>
Base image `FROM tensorflow/tensorflow:1.15.2-py3` uses Python 3, so the python binary location is `/usr/bin/python3`. However, [the tensorflow base image creates a symlink](e5bf8de410/tensorflow/tools/dockerfiles/dockerfiles/cpu.Dockerfile (L45)) to the current python binary at `/usr/local/bin/python`, regardless of whether that is Python 2 or Python 3. That location should therefore be used in the *ENTRYPOINT* of `Dockerfile.model` instead of `/usr/bin/python`, which is customary for Python 2.x installations.
* some mods to accommodate (perhaps temporary) changes in how the kfp sdk works
* Use gcs client libs rather than gsutil for a gcs copy; required due to changes in node service account permissions.
* more mods to address kfp syntax changes
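A minimal sketch of what a GCS copy through the client library (rather than `gsutil`) can look like. This is not the code from this change: the function names `split_gcs_uri` and `copy_gcs_object` are hypothetical, and the sketch assumes the `google-cloud-storage` package is installed and that application default credentials are available to the pod.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/obj' into (bucket, object_path)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"Not a GCS URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def copy_gcs_object(src_uri, dst_uri):
    """Copy one object between GCS locations without shelling out to gsutil."""
    # Imported lazily so split_gcs_uri stays usable without the library installed.
    from google.cloud import storage

    src_bucket_name, src_path = split_gcs_uri(src_uri)
    dst_bucket_name, dst_path = split_gcs_uri(dst_uri)

    # Assumption: application default credentials are configured.
    client = storage.Client()
    src_bucket = client.bucket(src_bucket_name)
    dst_bucket = client.bucket(dst_bucket_name)

    # Server-side copy; the object data is not downloaded locally.
    src_bucket.copy_blob(src_bucket.blob(src_path), dst_bucket, new_name=dst_path)
```

Calling `copy_gcs_object("gs://src-bucket/a.txt", "gs://dst-bucket/b.txt")` (placeholder bucket names) would then use whatever identity the client library resolves, instead of the credentials `gsutil` picks up.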
* Fix the mnist_gcp_test.py
* The job spec was invalid; we were missing container name
* There were a bunch of other issues as well.
* Pull in the changes from xgboost_synthetic to upload an HTML version
of the notebook output to GCS.
* Add exceptoin
* Revert "Add exceptoin"
This reverts commit 44f34d9d74.
* Remove kustomize from mnist example.
* The mnist E2E guide has been updated to use notebooks and get rid
of kustomize
* We have notebooks for AWS, GCP, and Vanilla K8s.
* As such we no longer need the old, outdated kustomization files or
Docker containers anymore
* The notebooks handle parameterizing the K8s resources using Python
f style string.
* Update the README to remove the old instructions.
* Cleanup more references.
* Add method to get ALB hostname for aws users
* Revoke setup based on the platform
* Add AWS notebook for mnist e2e example
* Remove legacy kustomize manifests for mnist example
* Address feedback from reviewers
* Updated the azurepipeline example.
I believe there is a small bug in the script; use a tmp variable to work around it.
* updated with github actions example
* Update README.md
Updated the readme further.
* Update README.md
* Update README.md
* Update data.py
* specifying version of ubuntu and updating text
* fixing spelling mistake
* update the linting
* updated yaml
* Update data.py
Co-authored-by: JohanWork <39947546+JohanWork@users.noreply.github.com>
* A notebook to run the mnist E2E example on GCP.
This fixes a number of issues with the example
* Use ISTIO instead of Ambassador to add reverse proxy routes
* The training job needs to be updated to run in a profile created namespace in order to have the required service accounts
* See kubeflow/examples#713
* Running inside a notebook running on Kubeflow should ensure user
is running inside an appropriately setup namespace
* With ISTIO the default RBAC rules prevent the web UI from sending requests to the model server
* A short term fix was to not include the ISTIO side car
* In the future we can add an appropriate ISTIO rbac policy
* Using a notebook allows us to eliminate the use of kustomize
* This resolves kubeflow/examples#713 which required people to use
an old version of kustomize
* Rather than using kustomize we can use python f style strings to
write the YAML specs and then easily substitute in user specific values
* This should be more informative; it avoids introducing kustomize and
users can see the resource specs.
* I've opted to make the notebook GCP specific. I think it's less confusing
to users to have separate notebooks focused on specific platforms rather
than having one notebook with a lot of caveats about what to do under
different conditions
* I've deleted the kustomize overlays for GCS since we don't want users to
use them anymore
* I used fairing and kaniko to eliminate the use of docker to build the images
so that everything can run from a notebook running inside the cluster.
* k8s_utils.py has some reusable functions to add some details from users
(e.g. low level calls to K8s APIs.)
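The f-string approach described above can be sketched as follows. This is an illustration, not the notebook's actual code: the function name `render_tfjob_spec`, its parameters, and the example values are hypothetical, and the spec is trimmed to the fields needed to show the substitution.

```python
def render_tfjob_spec(name, namespace, image, num_workers=1):
    """Return a TFJob YAML spec with user-specific values substituted in."""
    # A plain f-string keeps the full resource spec visible to the reader,
    # with no overlay or patch files involved.
    return f"""apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: {name}
  namespace: {namespace}
spec:
  tfReplicaSpecs:
    Worker:
      replicas: {num_workers}
      template:
        spec:
          containers:
          - name: tensorflow
            image: {image}
"""


# Placeholder values; in a notebook these would come from the user's environment.
spec = render_tfjob_spec(
    name="mnist-train",
    namespace="kubeflow-user",
    image="gcr.io/my-project/mnist:latest",
    num_workers=2,
)
print(spec)
```

The rendered string can then be parsed and submitted with whichever Kubernetes client the notebook uses. One trade-off of this design is that values are substituted as raw text, so anything containing YAML-special characters needs quoting by the caller.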
* Change the mnist test to just run the notebook
* Copy the notebook test infra for xgboost_synthetic to py/kubeflow/examples/notebook_test to make it more reusable
* Fix lint.
* Update for lint.
* A notebook to run the mnist E2E example.
Related to: kubeflow/website#1553
* 1. Use fairing to build the model. 2. Construct the YAML spec directly in the notebook. 3. Use the TFJob python SDK.
* Fix the ISTIO rule.
* Fix UI and serving; need to update TF serving to match version trained on.
* Get the IAP endpoint.
* Start writing some helper python functions for K8s.
* Commit before switching from replace to delete.
* Create a library to bulk create objects.
* Cleanup.
* Add back k8s_util.py
* Delete train.yaml; this shouldn't have been added.
* update the notebook image.
* Refactor code into k8s_util; print out links.
* Clean up the notebook. Should be working E2E.
* Added section to get logs from stackdriver.
* Add comment about profile.
* Latest.
* Override mnist_gcp.ipynb with mnist.ipynb
I accidentally put my latest changes in mnist.ipynb even though that file
was deleted.
* More fixes.
* Resolve some conflicts from the rebase; override with changes on remote branch.
* Lint is failing because we are still running python2 for lint
* kubeflow/testing#560 is related to building an updated image with python3.8
compatible version of lint so we can support f style strings.
* However, the unittests for kubeflow examples are still written in
ksonnet. It's not worth trying to update that so we just
remove that test for now. The test was just running lint
* We should really see about using Tekton to write the workflows
see kubeflow/testing#425