examples/github_issue_summarization
Jeremy Lewi 1cc4550b7d GIS E2E test verify the TFJob runs successfully (#456)
* Create a test for submitting the TFJob for the GitHub issue summarization example.

* This test needs to be run manually right now. In a follow-on PR we will
  integrate it into CI.

* We use the image built from Dockerfile.estimator because that is the image
  we are running train_test.py in.

  * Note: The current version of the code requires Python 3 (likely due to
    an earlier PR which refactored the code into a shared implementation
    covering both the TF estimator and non-estimator paths).

* Create a TFJob component for TFJob v1beta1; this is the version
  in KF 0.4.

TFJob component
  * Upgrade to v1beta1 to work with 0.4
  * Update command line arguments to match the versions in the current code
      * input & output are now single parameters rather than separate parameters
        for bucket and name

  * Change the default input to a CSV file because the current version of the
    code doesn't handle unzipping it.

* Use ks_util from kubeflow/testing

* Address comments.
2019-01-08 15:06:49 -08:00
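The v1beta1 TFJob resource that the upgraded ksonnet component generates has roughly the following shape. This is a hedged sketch: the metadata name, image, entrypoint, and parameter values are placeholders, not the component's actual values; only the `kubeflow.org/v1beta1` API version and the `tfReplicaSpecs`/`tensorflow` container conventions come from the TFJob API itself.

```yaml
apiVersion: kubeflow.org/v1beta1     # TFJob version shipped with KF 0.4
kind: TFJob
metadata:
  name: tfjob-issue-summarization    # hypothetical name
spec:
  tfReplicaSpecs:
    Master:
      replicas: 1
      template:
        spec:
          containers:
          - name: tensorflow         # tf-operator expects this container name
            image: <training-image>  # placeholder; built from Dockerfile.estimator
            command:
            - python3                # the current code requires Python 3
            - <train-script>         # placeholder entrypoint
            - --input=<csv-path>     # single input parameter (see above)
            - --output=<model-path>  # single output parameter (see above)
```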
demo [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
docker Setup continuous building of Docker images for GH Issue Summarization Example (#449) 2019-01-04 17:02:24 -08:00
ks_app GIS E2E test verify the TFJob runs successfully (#456) 2019-01-08 15:06:49 -08:00
notebooks GIS E2E test verify the TFJob runs successfully (#456) 2019-01-08 15:06:49 -08:00
sql Remove third_party folder & MIT license file 2018-02-27 13:17:42 -05:00
testing GIS E2E test verify the TFJob runs successfully (#456) 2019-01-08 15:06:49 -08:00
workflow Add .pylintrc (#61) 2018-03-29 08:25:02 -07:00
.gitignore Setup continuous building of Docker images for GH Issue Summarization Example (#449) 2019-01-04 17:02:24 -08:00
01_setup_a_kubeflow_cluster.md [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
02_distributed_training.md [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
02_training_the_model.md [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
02_training_the_model_tfjob.md [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
03_serving_the_model.md [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
04_querying_the_model.md [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
05_teardown.md [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
Makefile Setup continuous building of Docker images for GH Issue Summarization Example (#449) 2019-01-04 17:02:24 -08:00
README.md [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450) 2018-12-30 20:05:29 -08:00
image_build.jsonnet Setup continuous building of Docker images for GH Issue Summarization Example (#449) 2019-01-04 17:02:24 -08:00
requirements.txt Remove third_party folder & MIT license file 2018-02-27 13:17:42 -05:00

README.md

End-to-End Kubeflow tutorial using a Sequence-to-Sequence model

This example demonstrates how you can use Kubeflow end-to-end to train and serve a Sequence-to-Sequence model on an existing Kubernetes cluster. This tutorial is based on @hamelsmu's article "How To Create Data Products That Are Magical Using Sequence-to-Sequence Models".

Goals

There are two primary goals for this tutorial:

  • Demonstrate an End-to-End Kubeflow example
  • Present an End-to-End Sequence-to-Sequence model

By the end of this tutorial, you should learn how to:

  • Set up a Kubeflow cluster on an existing Kubernetes deployment
  • Spawn a Jupyter Notebook on the cluster
  • Provision shared persistent storage across the cluster to store large datasets
  • Train a Sequence-to-Sequence model using TensorFlow and GPUs on the cluster
  • Serve the model using Seldon Core
  • Query the model from a simple front-end application

Steps:

  1. Set up a Kubeflow cluster
  2. Training the model, using either a Jupyter Notebook or a TFJob
  3. Serving the model
  4. Querying the model
  5. Teardown
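Step 4 queries the Seldon-served model over REST. A minimal sketch of building such a prediction request in Python, assuming Seldon Core's generic `ndarray` payload format; the endpoint URL and the exact input layout this particular model expects are assumptions:

```python
import json
import urllib.request


def build_seldon_request(issue_text):
    """Wrap raw issue text in Seldon Core's generic prediction payload.

    The single-element ndarray shape is an assumption; the deployed
    model may expect a different input layout.
    """
    return {"data": {"ndarray": [[issue_text]]}}


def query_model(endpoint, issue_text):
    """POST the payload to a Seldon prediction endpoint (hypothetical URL)."""
    body = json.dumps(build_seldon_request(issue_text)).encode("utf-8")
    req = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


payload = build_seldon_request("Training job crashes when the input CSV is empty")
```

The same payload could be sent with `query_model("http://<cluster-ip>/seldon/<deployment>/api/v0.1/predictions", ...)` once the serving step is complete; the route depends on how the cluster exposes Seldon.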