examples/github_issue_summarization/demo
Michelle Casbon 70a22d6d7b [GH Issue Summarization] Upgrade to kf v0.4.0-rc.2 (#450)
* Update tfjob components to v1beta1

Remove old version of tensor2tensor component

* Combine UI into a single jsonnet file

* Upgrade GH issue summarization to kf v0.4.0-rc.2

Use latest ksonnet v0.13.1
Use latest seldon v1alpha2
Remove ksonnet app with full kubeflow platform & replace with components specific to this example.
Remove outdated scripts
Add cluster creation links to Click-to-deploy & kfctl
Add warning not to use the Training with an Estimator guide
Replace commandline with bash for better syntax highlighting
Replace messy port-forwarding commands with svc/ambassador
Add modelUrl param to ui component
Modify teardown instructions to remove the deployment
Fix grammatical mistakes

* Rearrange tfjob instructions
2018-12-30 20:05:29 -08:00

Demo

This folder contains the resources the Kubeflow DevRel team needs to set up a public demo of the GitHub Issue Summarization Example.

Public gh-demo.kubeflow.org

We currently run a public instance of the UI at gh-demo.kubeflow.org.

The current setup is as follows:

PROJECT=kubecon-gh-demo-1
CLUSTER=gh-demo-1003
ZONE=us-east1-d
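
Before running any of the `kubectl` or `ks` commands in this guide, your local kubeconfig has to point at this cluster. A minimal sketch of how the three values above could be used for that follows. The helper name `use_demo_cluster` is hypothetical and not part of this repository; it assumes `gcloud` is installed and authenticated.

```shell
# Values copied from the current setup listed above.
PROJECT=kubecon-gh-demo-1
CLUSTER=gh-demo-1003
ZONE=us-east1-d

# Hypothetical helper: fetch credentials so kubectl talks to the demo cluster.
use_demo_cluster() {
  gcloud container clusters get-credentials "${CLUSTER}" \
    --zone="${ZONE}" \
    --project="${PROJECT}"
}
```

Defining the helper has no side effects; call `use_demo_cluster` once per shell session when you actually need cluster access.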

Directory contents

  • gh-app - Contains the ksonnet app for deploying the public instance of the model and UI.
  • gh-demo-1003 - The app created by kfctl.

Setting up the demo

Here are the instructions for setting up the demo.

  1. Follow the GKE instructions for deploying Kubeflow

    • If you are using PROJECT kubecon-gh-demo-1 you can reuse the existing OAuth client
      • Use the Cloud console to look up the Client ID and secret, then set the corresponding environment variables

      • You will also need to add an authorized redirect URI for the new Kubeflow deployment

  2. Follow the instructions to set up an NFS share

    • This is needed to do distributed training with the TF estimator example
  3. Create a static IP for serving gh-demo.kubeflow.org

    gcloud --project=${PROJECT} deployment-manager deployments create --config=gh-demo-dm-config.yaml gh-public-ui
    
  4. Update the Cloud DNS record gh-demo.kubeflow.org in project kubeflow-dns to use the new static IP.

  5. Create a namespace for serving the UI and model

    kubectl create namespace gh-public
    
  6. Deploy the Seldon controller in the namespace that will serve the public model

    cd gh-demo-1003/ks_app
    ks env add gh-public --namespace=gh-public
    ks generate seldon seldon
    ks apply gh-public -c seldon
    
  7. Create a secret with a GitHub token

    • Follow GitHub's instructions to create a token

    • Then run the following command to create the secret

      kubectl -n gh-public create secret generic github-token --from-literal=github-token="${GITHUB_TOKEN}"
      
  8. Deploy the public UI and model

    cd gh-app
    ks env add gh-public --namespace=gh-public
    ks apply gh-public
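
Several of the steps above depend on environment variables being set, and a missing one tends to fail late (for example, an empty `GITHUB_TOKEN` still creates a secret). The sketch below is one way to check up front. `PROJECT` and `GITHUB_TOKEN` appear in the commands above; `CLIENT_ID` and `CLIENT_SECRET` are assumed names for the OAuth variables mentioned in step 1, so adjust them to whatever the GKE instructions you follow actually use. The function names are hypothetical.

```shell
# Print the name of every listed variable that is unset or empty.
missing_vars() {
  for name in "$@"; do
    # Indirect lookup through eval keeps this portable to plain sh.
    eval "value=\${${name}:-}"
    [ -n "${value}" ] || echo "${name}"
  done
}

# Hypothetical pre-flight check for the setup steps.
# CLIENT_ID and CLIENT_SECRET are assumed variable names.
preflight() {
  missing=$(missing_vars PROJECT GITHUB_TOKEN CLIENT_ID CLIENT_SECRET)
  if [ -n "${missing}" ]; then
    echo "Set these variables before continuing:" >&2
    echo "${missing}" >&2
    return 1
  fi
  echo "all required variables are set"
}
```

Run `preflight` before step 1; it returns a non-zero status and lists the missing names on stderr if anything is unset.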
    

Training and deploying the model

We use the ksonnet app in github/kubeflow/examples/github_issue_summarization/ks_app.

The current environment is:

export ENV=gh-demo-1003

Set a bucket for the job output:

DAY=$(date +%Y%m%d)
ks param set --env=${ENV} tfjob output_model_gcs_bucket kubecon-gh-demo
ks param set --env=${ENV} tfjob output_model_gcs_path gh-demo/${DAY}/output
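
Because the output path embeds the current date, each day's run writes to a different prefix. The sketch below only prints where today's output would land. It assumes the job joins the bucket and path parameters into a single `gs://` URI, which this README does not state explicitly; `BUCKET` and `OUTPUT_PATH` are illustrative variable names.

```shell
# Same values as the ks param set commands above.
BUCKET=kubecon-gh-demo
DAY=$(date +%Y%m%d)
OUTPUT_PATH="gh-demo/${DAY}/output"

# Assumed composition of the two parameters into one GCS location.
echo "gs://${BUCKET}/${OUTPUT_PATH}"
```

Note that `DAY` is evaluated when the parameter is set, not when the job runs, so re-run the `ks param set` commands if you submit the job on a later day.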

Run the job:

ks apply ${ENV} -c tfjob
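
It can be useful to review the parameters and the rendered manifest before submitting the job. The sketch below wraps that sequence in a dry-run helper that prints each command by default and only executes when `DRY_RUN=0`. The `run` and `review_and_apply` names are hypothetical; the `ks param list`, `ks show`, and `ks apply` subcommands are assumed to behave as in the ksonnet version used by this example.

```shell
ENV=${ENV:-gh-demo-1003}

# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: inspect the tfjob component, then submit it.
review_and_apply() {
  run ks param list tfjob --env="${ENV}"  # confirm the bucket and path parameters
  run ks show "${ENV}" -c tfjob           # render the manifest without applying it
  run ks apply "${ENV}" -c tfjob          # submit the job
}
```

Calling `review_and_apply` with the default settings prints the three commands; `DRY_RUN=0 review_and_apply` runs them from inside the ksonnet app directory.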

Using TF Estimator with Keras

  1. Copy the data to the GCFS mount by launching a notebook and then running the following commands:

    !mkdir -p /mnt/kubeflow-gcfs/gh-demo/data
    !gcloud auth activate-service-account --key-file=${GOOGLE_APPLICATION_CREDENTIALS}
    !gsutil cp gs://kubeflow-examples/github-issue-summarization-data/github-issues.zip /mnt/kubeflow-gcfs/gh-demo/data
    !unzip /mnt/kubeflow-gcfs/gh-demo/data/github-issues.zip
    !cp github_issues.csv /mnt/kubeflow-gcfs/gh-demo/data/
    
    • TODO(jlewi): Can we modify the existing job that downloads data to a PVC to do this?
  2. Run the estimator job

    ks apply ${ENV} -c tfjob-estimator
    
  3. Run TensorBoard

    ks apply ${ENV} -c tensorboard-pvc-tb
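
The estimator job reads its input from the shared mount, so it is worth confirming that the copy in step 1 succeeded before applying `tfjob-estimator`. The sketch below is a hypothetical check, not part of this repository, that looks for a non-empty `github_issues.csv` in the data directory used above. `DATA_DIR` and `check_data` are illustrative names.

```shell
# Data directory from step 1; override DATA_DIR to check another location.
DATA_DIR=${DATA_DIR:-/mnt/kubeflow-gcfs/gh-demo/data}

# Succeed only if the CSV exists and is non-empty.
check_data() {
  if [ -s "${DATA_DIR}/github_issues.csv" ]; then
    echo "found ${DATA_DIR}/github_issues.csv"
  else
    echo "missing ${DATA_DIR}/github_issues.csv" >&2
    return 1
  fi
}
```

Run `check_data` from a terminal in the same notebook environment that has the GCFS volume mounted; a non-zero exit status means the data should be copied again before starting the job.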