* added named entity recognition example
https://github.com/kubeflow/website/issues/853
* added previous and next steps
* changed all absolute links to relative links
* changed headline for better understanding
* moved dataset description section to top
* fixed style
* added missing Jupyter notebook
* changed headline
* added link to documentation
* fixed meaning of images and components
* adapted documentation to https://www.kubeflow.org/docs/about/style-guide/#address-the-audience-directly
* added link to ai platform models
* make it clear these are optional extensions
* changed summary and goals
* added kubeflow version
* fixed s/an/a/ also checked the rest of the documentation
* added #!/bin/sh
* added environment variables for build scripts and adapted documentation
* changed PROJECT to PROJECT_ID
* added link to Kaggle dataset and removed the unneeded copy script (the data is directly available at a public gs:// location). Adapted Jupyter notebook input data path
* added hint to make clear no further steps are required
* fixed s/Run/RUN/
* grammar fix
* optimized text
* added prev link to index
* removed model description due to lack of information
* added significance and congrats =)
* added example
* guided the user's attention to specific screens/metrics/graphs
* explanation of pieces
* updated main readme
* updated parts
* fixed typo
* adapted dataset path
* made scripts executable
chmod +x
* Update step-1-setup.md
swapped sections and added env variables to gsutil command
* added information regarding public access
* fixed lint error
* fixed lint issues
* fixed lint issues
* figured kubeflow examples are using 2 rather than 4 spaces (due to tensorflow standards)
* lint fixes
* reverted changes
* removed unused import
* removed object inherit
* fixed lint issues
* added kwargs to ignored-argument-name (due to best practice in Google custom prediction routine)
* fix lint issues
* set pylintrc back to default and removed unused argument
* Add Pytorch MNIST example
* Fix link to Pytorch MNIST example
* Fix indentation in README
* Fix lint errors
* Fix lint errors
Add prediction proto files
* Add build_image.sh script to build image and push to gcr.io
* Add pytorch-mnist-webui-release release through automatic ksonnet package
* Fix lint errors
* Add pytorch-mnist-webui-release release through automatic ksonnet package
* Add PB2 autogenerated files to ignore with Pylint
* Fix lint errors
* Add official Pytorch DDP examples to ignore with Pylint
* Fix lint errors
* Update component to web-ui release
* Update mount point to kubeflow-gcfs as the example is GCP specific
* 01_setup_a_kubeflow_cluster document complete
* Test release job while PR is WIP
* Reduce workflow name to avoid Argo error:
"must be no more than 63 characters"
* Fix extra_repos to pull worker image
* Fix testing_image using kubeflow-ci rather than kubeflow-releasing
* Fix extra_repo, only needs kubeflow/testing
* Set build_image.sh executable
* Update build_image.sh from CentralDashboard component
* Remove old reference to centraldashboard in echo message
* Build Pytorch serving image using Python Docker Seldon wrapper rather than s2i:
https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python-docker.md
* Build Pytorch serving image using Python Docker Seldon wrapper rather than s2i:
https://github.com/SeldonIO/seldon-core/blob/master/docs/wrappers/python-docker.md
* Add releases for the training and serving images
* Add releases for the training and serving images
* Fix testing_image using kubeflow-ci rather than kubeflow-releasing
* Fix path to Seldon-wrapper build_image.sh
* Fix image name in ksonnet parameter
* Add 02 distributed training documentation
* Add 03 serving the model documentation
Update shared persistent reference in 02 distributed training documentation
* Add 05 teardown documentation
* Add section to test the model is deployed correctly in 03 serving the model
* Add 04 querying the model documentation
* Fix ks-app to ks_app
* Set prow jobs back to postsubmit
* Set prow jobs to trigger presubmit to kubeflow-ci and postsubmit to
kubeflow-images-public
* Change to kubeflow-ci project
* Increase timeout limit during image build to compile Pytorch
* Increase timeout limit during image build to compile Pytorch
* Change build machine type to compile Pytorch for training image
* Change build machine type to compile Pytorch for training image
* Add OWNERS file to Pytorch example
* Fix typo in documentation
* Remove checking docker daemon as we are using gcloud build instead
* Use logging module rather than print()
* Remove empty file, replace with .gitignore to keep tmp folder
* Add ksonnet application to deploy model server and web-ui
Delete model server JSON manifest
* Refactor ks-app to ks_app
* Parametrise serving_model ksonnet component
Default web-ui to use ambassador route to seldon
Remove form section in web-ui
* Remove default environment from ksonnet application
* Update documentation to use ksonnet application
* Fix component name in documentation
* Consolidate Pytorch train module and build_image.sh script
* Consolidate Pytorch train module
* Consolidate Pytorch train module
* Consolidate Pytorch train module and build_image.sh script
* Revert back build_image.sh scripts
* Remove duplicates
* Consolidate train Dockerfiles and build_image.sh script using docker build rather than gcloud
* Fix docker build command
* Fix docker build command
* Fix image name for cpu and gpu train
* Consolidate Pytorch train module
* Consolidate train Dockerfiles and build_image.sh script using docker build rather than gcloud
* add financial time series example
* fix ReadMe comments
* fix PyLint remarks
* clean up based on PR remarks
* Completing docstrings and fixing PR remarks
* Added tutorial for object detection distributed training
Added steps on how to leverage kubeflow tooling to
submit a distributed object detection training job
in a small kubernetes cluster (minikube, 2-4 node cluster)
* Added Jobs to prepare the training data and model
* Updated instructions
* fixed typos and added export tf graph job
* Fixed paths in jobs and instructions
* Enhanced instructions and re-arranged folder structure
* Updated links to kubeflow user guide documentation
* Proposed repo strategy
Define and describe example types (end-to-end, component-based, third-party hosted)
Define requirements for housing examples
Update list of ideas for additional examples
* Add get involved
* Move descriptions into CONTRIBUTING
Add application-specific category
Add clarifying details
* edit TF example readme
* prefix tutorial steps with a number for nicer display in repo
* fix typo
* edit steps 4 and 5
* edit docs
* add navigation and formatting edits to example
* Add awscli tools container.
* Add initial readme.
* Add argo skeleton.
* Run an argo job.
* Artifact support and argo test
* Use built container (#3)
* Fix artifacts and secrets
* Add work in progress tfflow (#14)
* Add kvc deployment to workflow.
* Switch aws repo.
* wip.
* Add working tfflow job.
* Add sidecar that waits for MASTER completion
* Pass in job-name
* Add volumemanager info step
* Add input parameters to step
* Adds nodeaffinity and hostpath
* Add fixes for workflow (#17)
- Use correct images for worker and ps
- Use correct aws keys
- Change volumemanager to mnist
- Comment unused steps
- Fix volume mount to correct containers
* Fix hostpath for tfjob
* Download all mnist files
* added GCS stored artifacts compatibility to Argo
* Add initial inference workflow. (#30)
* Initial serving step (#31)
* Adds fixes to initial serving step
* Ready for rough demo: Workflow in working state
* Move conflicting readme.
* Initial commit, everything boots without crashing.
* Working, with some python errors.
* Adding explicit flags
* Working with ins-outs
* Letting training job exit on success
* Adding documentation skeleton
* trying to properly save model
* Almost working
* Working
* Adding export script, refactored to allow model more reusability
* Starting documentation
* little further on docs
* More doc updates, fixing sleep logic
* adding urls for mnist data
* Removing download logic, it's too tied in with built-in tf examples.
* Added argo workflow instructions, minor cleanups.
* Adding mnist client.
* Fixing typos
* Adding instructions for installing components.
* Added ksonnet container
* Adding new entrypoint.
* Added helm install instructions for kvc
* doing things with variables
* Typos.
* Added better namespace support
* S3 refactor.
* Added missing region variables.
* Adding tensorboard support.
* Adding Container for Tensorboard.
* Added temporary flag, added install instructions for CLI.
* Removing invalid ksonnet environment.
* Updating readme
* Cleanup currently unused pieces
* Add missing cluster-role
* Minor cleanup.
* Adding more parameters.
* added changes to allow model to train on multiple workers and fixed some doc typos
* Adding flag to enable/disable model serving. Adding s3 urls as outputs for future querying, renaming info step.
* Adding separate deployer workflow.
* Split serving working.
* Adding split workflow.
* More parameters.
* updates as to elson comments
* Revert "added changes to allow model to train on multiple workers and fixed s…"
* Initial working pure-s3 workflow.
* Removed wait sidecars.
* Remove unused flag.
* Added part two, minor doc fixes
* Inverted links...
* Adding diff.
* Fix url syntax
* Documentation updates.
* Added AWS Cli
* Parameterized export.
* Fixing image in s3 version.
* Fixed documentation issues.
* KVC snippet changes, need to find last working helm chart.
* Temporarily pinning kvc version.
* working master model and some doc typos fixes (#13)
* added changes to allow model to train on multiple workers and fixed some doc typos
* Adding flag to enable/disable model serving. Adding s3 urls as outputs for future querying, renaming info step.
* Adding separate deployer workflow.
* Split serving working.
* Adding split workflow.
* More parameters.
* updates as to elson comments
* working master model and some doc typos
* fixes as to Elson
* Removing whitespace differences
* updating diff
* Changing parameters.
* Undoing whitespace.
* Changing termination policy on s3 version due to unknown issue.
* Updating mnist diff.
* Changing train steps.
* Syncing Demo changes.
* Update README.md
* Going S3-native for initial example. Getting rid of Master.
* Minor documentation tweaks, adding params, swapping aws cli for minio.
* Updating KVC version.
* Switching ksonnet repo, removing model name from client.
* Updating git url.
* Adding certificate hack to avoid RBAC errors.
* Pinning KVC to commit while working on PR.
* Updating version.
* Updates README with additional details (#14)
* Updates README with additional details
* Adding clarity to kubectl config commands
* Fixed comma placement
* Refactoring notes for github and kubernetes credentials.
* Forgot to add an overview of the argo template.
* Updating example based on feedback.
- Removed superfluous images
- Clarified use of KVC
- Added unaltered model
- Variable cleanup
* Refactored grpc image into generic base image.
* minor cleanup of resubmitting section.
* Switching Argo deployment to ksonnet, consolidating install instructions.
* Removing old cruft, clarifying cluster requirements.
* [WIP] Switching out model (#15)
* Switching to new mnist example.
* Parameterized model, testing export.
* Got CNN model exporting.
* Attempting to do distributed training with Estimator, removed separate export.
* Adding master back, otherwise Estimator complains about not having a chief.
* Switching to tf.estimator.train_and_evaluate.
* Minor path/var name refactor.
* Adding test data and new client.
* Fixed documentation to reflect new client.
* Getting rid of tf job shim.
* Removing KVC from example, renaming directory
* Modifying parent README
* Removed reference to export.
* Adding reference to export.
* Removing unused Dockerfile.
* Removing unneeded files, simplifying how to get status, refactor model serving workflow step.
* Renaming directory
* Minor doc improvements, removed extra clis.
* Making SSL configurable for clusters without secured s3 endpoints.
* Added a tf-user account for workflow. Fixed serving bug.
* Updating gke version.
* Re-ran through instructions, fixed errata.
* Fixing lint issues
* Pylint errors
* Pylint errors
* Adding parenthesis back.
* pylint Hacks
* Disabling argument filter, model bombs without empty arg.
* Removing unneeded lambdas
* Fix folder link
* Add detail to cluster setup instructions
Add a link to the image for this example.
In Tutorial.ipynb, move mounted directory into a variable to help avoid collisions on shared clusters.