5 Commits

Author SHA1 Message Date
David Sabater Dinter a9c6e69f0e Lint fixes mnist (#581)
* Remove modules from .pylintrc

* Add lint inline exceptions

* Add lint inline exceptions as "all", as the specific exception is not available for Pylint 1.8

* Fix string formatting in logging message and remove unnecessary Pylint exception

* Update app.yaml with correct environment details
2019-07-24 19:23:52 -07:00
David Sabater Dinter a1f0d6dfec Fixed some outdated comments to trigger pushing web-ui and model serve images to gcr.io/kubeflow-examples (#444) 2018-12-26 15:05:42 -08:00
David Sabater Dinter f9a707ee85 [pytorch_mnist] Point images back to gcr.io/kubeflow-examples (#360)
* Point images back to gcr.io/kubeflow-images-public

* Point images back to gcr.io/kubeflow-examples

* Point images back to gcr.io/kubeflow-examples
2018-11-28 22:48:16 -08:00
David Sabater Dinter a630fcea34 [mnist_pytorch] fix train image (#342)
* Default to model trained with CPUs
TODO: Enable A/B testing with Seldon to load GPU and CPU models

* Checkout 1.0rc1 release as latest Pytorch master seems to have MPI backend detection broken

* Track changes in pytorch_mnist/training/ddp/mnist folder to trigger test jobs

* Repoint to pull images from gcr.io/kubeflow-ci built during pre-submit

* Fix image webui name

* Fix logging

* Add GCFS to CPU train

* Fix logging

* Add GCFS to CPU train

* Default to model trained with GPUs
TODO: Enable A/B testing with Seldon to load GPU and CPU models

* Fix Predict() method as Seldon expects 3 arguments

* Fix x reference
2018-11-24 13:22:28 -08:00
David Sabater Dinter a402db1ccc E2E Pytorch mnist example (#274)
* Add Pytorch MNIST example

* Fix link to Pytorch MNIST example

* Fix indentation in README

* Fix lint errors

* Fix lint errors
Add prediction proto files

* Add build_image.sh script to build image and push to gcr.io

* Add pytorch-mnist-webui-release release through automatic ksonnet package

* Fix lint errors

* Add pytorch-mnist-webui-release release through automatic ksonnet package

* Add PB2 autogenerated files to ignore with Pylint

* Fix lint errors

* Add official Pytorch DDP examples to ignore with Pylint

* Fix lint errors

* Update component to web-ui release

* Update mount point to kubeflow-gcfs as the example is GCP specific

* 01_setup_a_kubeflow_cluster document complete

* Test release job while PR is WIP

* Reduce workflow name to avoid Argo error:
"must be no more than 63 characters"

* Fix extra_repos to pull worker image

* Fix testing_image using kubeflow-ci rather than kubeflow-releasing

* Fix extra_repo, only needs kubeflow/testing

* Set build_image.sh executable

* Update build_image.sh from CentralDashboard component

* Remove old reference to centraldashboard in echo message

* Build Pytorch serving image using Python Docker Seldon wrapper rather than s2i:
https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python-docker.md

* Build Pytorch serving image using Python Docker Seldon wrapper rather than s2i:
https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python-docker.md

* Add releases for the training and serving images

* Add releases for the training and serving images

* Fix testing_image using kubeflow-ci rather than kubeflow-releasing

* Fix path to Seldon-wrapper build_image.sh

* Fix image name in ksonnet parameter

* Add 02 distributed training documentation

* Add 03 serving the model documentation
Update shared persistent reference in 02 distributed training documentation

* Add 05 teardown documentation

* Add section to test the model is deployed correctly in 03 serving the model

* Add 04 querying the model documentation

* Fix ks-app to ks_app

* Set prow jobs back to postsubmit

* Set prow jobs to trigger presubmit to kubeflow-ci and postsubmit to
kubeflow-images-public

* Change to kubeflow-ci project

* Increase timeout limit during image build to compile Pytorch

* Increase timeout limit during image build to compile Pytorch

* Change build machine type to compile Pytorch for training image

* Change build machine type to compile Pytorch for training image

* Add OWNERS file to Pytorch example

* Fix typo in documentation

* Remove checking docker daemon as we are using gcloud build instead

* Use logging module rather than print()

* Remove empty file, replace with .gitignore to keep tmp folder

* Add ksonnet application to deploy model server and web-ui
Delete model server JSON manifest

* Refactor ks-app to ks_app

* Parametrise serving_model ksonnet component
Default web-ui to use ambassador route to seldon
Remove form section in web-ui

* Remove default environment from ksonnet application

* Update documentation to use ksonnet application

* Fix component name in documentation

* Consolidate Pytorch train module and build_image.sh script

* Consolidate Pytorch train module

* Consolidate Pytorch train module

* Consolidate Pytorch train module and build_image.sh script

* Revert back build_image.sh scripts

* Remove duplicates

* Consolidate train Dockerfiles and build_image.sh script using docker build rather than gcloud

* Fix docker build command

* Fix docker build command

* Fix image name for cpu and gpu train

* Consolidate Pytorch train module

* Consolidate train Dockerfiles and build_image.sh script using docker build rather than gcloud
2018-11-18 14:24:43 -08:00