* A notebook to run the mnist E2E example on GCP.
This fixes a number of issues with the example
* Use Istio instead of Ambassador to add reverse-proxy routes
* The training job needs to be updated to run in a profile created namespace in order to have the required service accounts
* See kubeflow/examples#713
* Running inside a notebook on Kubeflow should ensure the user
is working in an appropriately set up namespace
* With Istio the default RBAC rules prevent the web UI from sending requests to the model server
* A short-term fix was to not include the Istio sidecar
* In the future we can add an appropriate Istio RBAC policy
* Using a notebook allows us to eliminate the use of kustomize
* This resolves kubeflow/examples#713, which required people to use
an old version of kustomize
* Rather than using kustomize we can use python f style strings to
write the YAML specs and then easily substitute in user specific values
* This should be more informative; it avoids introducing kustomize and
users can see the resource specs.
* I've opted to make the notebook GCP specific. I think it's less confusing
to users to have separate notebooks focused on specific platforms rather
than having one notebook with a lot of caveats about what to do under
different conditions
* I've deleted the kustomize overlays for GCS since we don't want users to
use them anymore
* I used fairing and kaniko so that the images can be built without docker,
allowing everything to run from a notebook inside the cluster.
* k8s_utils.py has some reusable functions that hide some details from users
(e.g. low-level calls to the K8s APIs).
* Change the mnist test to just run the notebook
* Copy the notebook test infra for xgboost_synthetic to py/kubeflow/examples/notebook_test to make it more reusable
* Fix lint.
* Update for lint.
* A notebook to run the mnist E2E example.
Related to: kubeflow/website#1553
* 1. Use fairing to build the model. 2. Construct the YAML spec directly in the notebook. 3. Use the TFJob python SDK.
* Fix the Istio rule.
* Fix UI and serving; need to update TF serving to match version trained on.
* Get the IAP endpoint.
* Start writing some helper python functions for K8s.
* Commit before switching from replace to delete.
* Create a library to bulk create objects.
* Cleanup.
* Add back k8s_util.py
* Delete train.yaml; this shouldn't have been added.
* Update the notebook image.
* Refactor code into k8s_util; print out links.
* Clean up the notebook. Should be working E2E.
* Added section to get logs from stackdriver.
* Add comment about profile.
* Latest.
* Override mnist_gcp.ipynb with mnist.ipynb
I accidentally put my latest changes in mnist.ipynb even though that file
was deleted.
* More fixes.
* Resolve some conflicts from the rebase; override with changes on remote branch.
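The f-string approach mentioned above (writing YAML specs directly in the notebook and substituting user-specific values) can be sketched as follows. The namespace, job name, and image are illustrative placeholders, not the example's actual values.

```python
# A minimal sketch of the f-string approach: write the YAML spec as a
# Python f-string and substitute user-specific values directly, with no
# need for kustomize. All values below are illustrative placeholders.
namespace = "my-profile"        # the user's profile-created namespace
train_name = "mnist-train"
image = "gcr.io/my-project/mnist:latest"

train_spec = f"""apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: {train_name}
  namespace: {namespace}
spec:
  tfReplicaSpecs:
    Chief:
      replicas: 1
      template:
        spec:
          containers:
          - name: tensorflow
            image: {image}
"""
print(train_spec)
```

Because the spec is just a string, the user can read exactly what will be submitted to the cluster before creating it.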
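The kind of reusable helper described for k8s_utils.py can be illustrated with a hypothetical dispatch table that routes parsed resource specs to the right low-level API call. The real notebook would invoke the official `kubernetes` Python client (shown only in comments here so the sketch runs without a cluster); the mapping and function names are assumptions, not the file's actual contents.

```python
# Hypothetical sketch of a k8s_utils-style helper: route a parsed resource
# spec (a dict) to the right low-level API based on its "kind". In a real
# helper these would be calls on the kubernetes Python client, e.g.
#   client.CoreV1Api().create_namespaced_service(namespace, body)
# Here the dispatch just returns the method name so the sketch is runnable
# without a cluster.
KIND_TO_API = {
    "Service": "CoreV1Api.create_namespaced_service",
    "Deployment": "AppsV1Api.create_namespaced_deployment",
    "TFJob": "CustomObjectsApi.create_namespaced_custom_object",
}

def apply_specs(specs):
    """Return (kind, api_method) pairs for a list of resource dicts."""
    calls = []
    for spec in specs:
        kind = spec["kind"]
        method = KIND_TO_API.get(kind)
        if method is None:
            raise ValueError(f"No handler registered for kind {kind}")
        calls.append((kind, method))
    return calls

print(apply_specs([{"kind": "Service"}, {"kind": "TFJob"}]))
```

A dispatch table like this lets the notebook bulk-create heterogeneous objects from one list of specs instead of hand-writing a separate call per resource type.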
# kubeflow-examples
A repository to share extended Kubeflow examples and tutorials to demonstrate machine learning concepts, data science workflows, and Kubeflow deployments. The examples illustrate the happy path, acting as a starting point for new users and a reference guide for experienced users.
This repository is home to the following types of examples and demos:
## End-to-end

### Named Entity Recognition
Author: Sascha Heyer
This example covers the following concepts:
- Build reusable pipeline components
- Run Kubeflow Pipelines with Jupyter notebooks
- Train a Named Entity Recognition model on a Kubernetes cluster
- Deploy a Keras model to AI Platform
- Use Kubeflow metrics
- Use Kubeflow visualizations
### GitHub issue summarization
Author: Hamel Husain
This example covers the following concepts:
- Natural Language Processing (NLP) with Keras and TensorFlow
- Connecting to JupyterHub
- Shared persistent storage
- Training a TensorFlow model
  - CPU
  - GPU
- Serving with Seldon Core
- Flask front-end
### Pachyderm Example - GitHub issue summarization
Author: Nick Harvey & Daniel Whitenack
This example covers the following concepts:
- A production pipeline for pre-processing, training, and model export
- CI/CD for model binaries, building and deploying a docker image for serving in Seldon
- Full tracking of what data produced which model, and what model is being used for inference
- Automatic updates of models based on changes to training data or code
- Training with single-node TensorFlow and distributed TFJobs
### PyTorch MNIST
Author: David Sabater
This example covers the following concepts:
- Distributed Data Parallel (DDP) training with PyTorch on CPU and GPU
- Shared persistent storage
- Training a PyTorch model
  - CPU
  - GPU
- Serving with Seldon Core
- Flask front-end
### MNIST
Author: Elson Rodriguez
This example covers the following concepts:
- Image recognition of handwritten digits
- S3 storage
- Training automation with Argo
- Monitoring with Argo UI and TensorBoard
- Serving with TensorFlow
### Distributed Object Detection
Author: Daniel Castellanos
This example covers the following concepts:
- Gathering and preparing the data for model training using K8s jobs
- Using Kubeflow tf-job and tf-operator to launch a distributed object detection training job
- Serving the model through Kubeflow's tf-serving
### Financial Time Series
Author: Sven Degroote
This example covers the following concepts:
- Deploying Kubeflow to a GKE cluster
- Exploration via JupyterHub (prospect data, preprocess data, develop ML model)
- Training several TensorFlow models at scale with TFJobs
- Deploying and serving with TF-serving
- Iterating on training and serving
- Training on GPU
- Using Kubeflow Pipelines to automate the ML workflow
## Pipelines

### Simple notebook pipeline
Author: Zane Durante
This example covers the following concepts:
- How to create pipeline components from Python functions in a Jupyter notebook
- How to compile and run a pipeline from a Jupyter notebook
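As a sketch of the first concept: a pipeline component starts life as an ordinary, self-contained Python function. The KFP SDK calls that would convert and run it are shown only in comments, since they require the `kfp` package and a Pipelines endpoint; the `add` function and pipeline name are illustrative.

```python
# A pipeline component starts as a self-contained Python function with
# type annotations. With the KFP SDK (v1) it could then be converted and
# composed into a pipeline, e.g.:
#
#   import kfp
#   add_op = kfp.components.func_to_container_op(add)
#
#   @kfp.dsl.pipeline(name="add-pipeline")
#   def pipeline(a: float = 1.0, b: float = 2.0):
#       add_op(a, b)
#
#   kfp.Client().create_run_from_pipeline_func(pipeline, arguments={})
#
# Those calls are commented out so this sketch runs without kfp installed.
def add(a: float, b: float) -> float:
    """Return the sum of two numbers; self-contained so it can be containerized."""
    return a + b

print(add(1.0, 2.0))
```

Keeping the function self-contained (no references to globals or outer imports) is what allows the SDK to package it into its own container image.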
### MNIST Pipelines
Author: Dan Sanche and Jin Chi He
This example covers the following concepts:
- Run the MNIST Pipelines sample on Google Cloud Platform (GCP).
- Run the MNIST Pipelines sample on an on-premises cluster.
## Component-focused

### XGBoost - Ames housing price prediction
Author: Puneith Kaul
This example covers the following concepts:
- Training an XGBoost model
- Shared persistent storage
- GCS and GKE
- Serving with Seldon Core
## Demos
Demos are for showing Kubeflow or one of its components publicly, with the intent of highlighting product vision, not necessarily teaching. In contrast, the goal of the examples is to provide a self-guided walkthrough of Kubeflow or one of its components, for the purpose of teaching you how to install and use the product.
In an example, all commands should be embedded in the process and explained. In a demo, most details should be done behind the scenes, to optimize for on-stage rhythm and limited timing.
You can find the demos in the /demos directory.
## Third-party hosted
| Source | Example | Description |
|---|---|---|
## Get Involved
In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
The Kubeflow community is guided by our Code of Conduct, which we encourage everybody to read before participating.