Merge remote-tracking branch 'upstream/master' into third-party

Michelle Casbon 2018-03-01 13:41:18 -05:00
commit 8e3ddb2eec
3 changed files with 109 additions and 7 deletions

README.md

@@ -1,13 +1,29 @@
-## Sequence-to-Sequence Tutorial with Github Issues Data
-Code For Medium Article: ["How To Create Data Products That Are Magical Using Sequence-to-Sequence Models"](https://medium.com/@hamelhusain/how-to-create-data-products-that-are-magical-using-sequence-to-sequence-models-703f86a231f8)
-## Resources:
-1. [Tutorial Notebook](https://nbviewer.jupyter.org/github/hamelsmu/Seq2Seq_Tutorial/blob/master/notebooks/Tutorial.ipynb): The Jupyter notebook that coincides with the Medium post.
-2. [seq2seq_utils.py](./notebooks/seq2seq_utils.py): convenience functions that are used in the tutorial notebook to make predictions.
-3. [ktext](https://github.com/hamelsmu/ktext): this library is used in the tutorial to clean data. This library can be installed with `pip`.
-4. [Nvidia Docker Container](https://hub.docker.com/r/hamelsmu/seq2seq_tutorial/): contains all libraries that are required to run the tutorial. This container is built with Nvidia-Docker v1.0. You can run this container by executing `nvidia-docker run hamelsmu/seq2seq_tutorial` after installing **Nvidia-Docker v1.0**. Note: I have not tested this on Nvidia-Docker v2.0.
+# [WIP] End-to-End kubeflow tutorial using a Sequence-to-Sequence model
+This example demonstrates how you can use `kubeflow` end-to-end to train and
+serve a Sequence-to-Sequence model on an existing kubernetes cluster. This
+tutorial is based on @hamelsmu's article ["How To Create Data Products That
+Are Magical Using Sequence-to-Sequence
+Models"](https://medium.com/@hamelhusain/how-to-create-data-products-that-are-magical-using-sequence-to-sequence-models-703f86a231f8).
+## Goals
+There are two primary goals for this tutorial:
+* An end-to-end kubeflow example
+* An end-to-end Sequence-to-Sequence model
+By the end of this tutorial, you should know how to:
+* Set up a Kubeflow cluster on an existing Kubernetes deployment
+* Spin up a Jupyter Notebook on the cluster
+* Spin up shared persistent storage across the cluster to store large
+datasets
+* Train a Sequence-to-Sequence model on the cluster with TensorFlow, using
+GPUs
+* Serve the model using TensorFlow Serving
+## Steps:
+1. [Setup a Kubeflow cluster](setup_a_kubeflow_cluster.md)
+1. [Teardown](teardown.md)

setup_a_kubeflow_cluster.md

@@ -0,0 +1,66 @@
# Setup Kubeflow
In this part, you will set up Kubeflow on an existing Kubernetes cluster.
## Requirements
* A Kubernetes cluster
* The `kubectl` CLI pointing to the Kubernetes cluster
  * Make sure that you can run `kubectl get nodes` from your terminal
    successfully (see the check after this list)
* The ksonnet CLI: [ks](https://ksonnet.io/#get-started)
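As a quick sanity check of these requirements, you can run something like the
following (a minimal sketch; `ks version` assumes the ksonnet binary is
installed as `ks` on your `PATH`):
```
# Confirm that kubectl is configured and can reach the cluster
kubectl get nodes

# Confirm that the ksonnet CLI is installed
ks version
```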
Refer to the [user
guide](https://github.com/kubeflow/kubeflow/blob/master/user_guide.md) for
instructions on how to set up Kubeflow on your Kubernetes cluster. Specifically,
complete the [Deploy
Kubeflow](https://github.com/kubeflow/kubeflow/blob/master/user_guide.md#deploy-kubeflow)
section and [Bringing up a
Notebook](https://github.com/kubeflow/kubeflow/blob/master/user_guide.md#bringing-up-a-notebook)
section.
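Condensed, the deploy flow from the user guide looks roughly like this (a
sketch based on the user guide at the time of writing; the registry path and
parameter names may have changed, so treat the guide as authoritative, and
`${NAMESPACE}` is whatever namespace you choose):
```
# Create a ksonnet app and install the Kubeflow core package
ks init my-kubeflow
cd my-kubeflow
ks registry add kubeflow github.com/kubeflow/kubeflow/tree/master/kubeflow
ks pkg install kubeflow/core

# Generate the kubeflow-core component and deploy it into a namespace
ks generate core kubeflow-core --name=kubeflow-core
kubectl create namespace ${NAMESPACE}
ks param set kubeflow-core namespace ${NAMESPACE}
ks apply default -c kubeflow-core
```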
After completing that, you should have the following ready:
* A ksonnet app in a directory named `my-kubeflow`
* Output similar to this from `kubectl get pods`:
```
NAME READY STATUS RESTARTS AGE
ambassador-7987df44b9-4pht8 2/2 Running 0 1m
ambassador-7987df44b9-dh5h6 2/2 Running 0 1m
ambassador-7987df44b9-qrgsm 2/2 Running 0 1m
tf-hub-0 1/1 Running 0 1m
tf-job-operator-78757955b-qkg7s 1/1 Running 0 1m
```
* A Jupyter Notebook accessible at `http://127.0.0.1:8000`
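JupyterHub is typically reached by port-forwarding to the `tf-hub-0` pod (a
sketch, assuming JupyterHub listens on port 8000 inside the pod):
```
# Forward local port 8000 to JupyterHub in the tf-hub-0 pod,
# then open http://127.0.0.1:8000 in a browser
kubectl port-forward tf-hub-0 8000:8000 -n=${NAMESPACE}
```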
## Provision storage for training data
We need a shared persistent disk to store our training data since containers'
filesystems are ephemeral and don't have a lot of storage space.
The [Advanced
Customization](https://github.com/kubeflow/kubeflow/blob/master/user_guide.md#advanced-customization)
section of the [user
guide](https://github.com/kubeflow/kubeflow/blob/master/user_guide.md) has
instructions on how to provision a cluster-wide shared NFS.
For this example, provision a `10GB` NFS mount with the name
`github-issues-data`.
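On GKE, one way this could look is the following (a sketch; the `disks`
parameter on `kubeflow-core` is an assumption based on the user guide's
Advanced Customization section):
```
# Create a 10GB GCE persistent disk to back the NFS mount
gcloud --project=${PROJECT} compute disks create --zone=${ZONE} \
  github-issues-data --size=10GB

# Ask the kubeflow-core component to expose the disk as a
# cluster-wide NFS mount, then re-apply the component
ks param set kubeflow-core disks github-issues-data
ks apply default -c kubeflow-core
```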
After the NFS is ready, delete the `tf-hub-0` pod so that it gets recreated and
picks up the NFS mount. You can delete it by running `kubectl delete pod
tf-hub-0 -n=${NAMESPACE}`.
At this point you should have a 10GB mount `/mnt/github-issues-data` in your
Jupyter Notebook pod. Check this by running `!df` in your Jupyter Notebook.
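For example, a notebook cell like this (a sketch) should show the mount:
```
# Run in a Jupyter notebook cell: the ! prefix executes a shell command
!df -h | grep github-issues-data
```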
## Summary
* We created a ksonnet app for our kubeflow deployment
* We created a disk for storing our training data
* We deployed the kubeflow-core component to our kubernetes cluster
* We connected to JupyterHub and spawned a new Jupyter notebook
Next: [Training the model using our cluster](training_the_model.md)

teardown.md

@@ -0,0 +1,20 @@
# Teardown
Delete the Kubernetes namespace:
```
kubectl delete namespace ${NAMESPACE}
```
Delete the GCE persistent disk (PD) backing the NFS mount:
```
gcloud --project=${PROJECT} compute disks delete --zone=${ZONE} ${PD_DISK_NAME}
```
Delete the ksonnet app directory:
```
rm -rf my-kubeflow
```