Mnist pipelines (#524)

* added mnist pipelines sample

* fixed lint issues
Daniel Sanche 2019-03-16 14:14:55 -07:00 committed by Kubernetes Prow Robot
parent 7924e0fe21
commit 895e88bf67
12 changed files with 461 additions and 0 deletions

pipelines/mnist-pipelines/.gitignore vendored Normal file
@@ -0,0 +1,2 @@
venv
*.tar.gz

pipelines/mnist-pipelines/README.md Normal file

@@ -0,0 +1,187 @@
# MNIST Pipelines GCP
This document describes how to run the [MNIST example](https://github.com/kubeflow/examples/tree/master/mnist) on Kubeflow Pipelines, on a Google Cloud Platform cluster.
## Setup
#### Create a GCS bucket
This pipeline requires a [Google Cloud Storage bucket](https://cloud.google.com/storage/) to hold your trained model. You can create one with the following command:
```
BUCKET_NAME=kubeflow-pipeline-demo-$(date +%s)
gsutil mb gs://$BUCKET_NAME/
```
#### Deploy Kubeflow
Follow the [Getting Started Guide](https://www.kubeflow.org/docs/started/getting-started-gke) to deploy a Kubeflow cluster to GKE.
#### Open the Kubeflow Pipelines UI
![Kubeflow UI](./img/kubeflow.png "Kubeflow UI")
##### IAP enabled
If you set up your cluster with IAP enabled as described in the [GKE Getting Started guide](https://www.kubeflow.org/docs/started/getting-started-gke),
you can access the Kubeflow Pipelines UI at `https://<deployment_name>.endpoints.<project>.cloud.goog/pipeline`.
##### IAP disabled
If you opted to skip IAP, you can instead open a connection to the UI using *kubectl port-forward* and browse to http://localhost:8085/pipeline:
```
kubectl port-forward -n kubeflow $(kubectl get pods -n kubeflow --selector=service=ambassador \
-o jsonpath='{.items[0].metadata.name}') 8085:80
```
#### Install Python Dependencies
Set up a [virtual environment](https://docs.python.org/3/tutorial/venv.html) for your Kubeflow Pipelines work:
```
python3 -m venv $(pwd)/venv
source ./venv/bin/activate
```
Install the Kubeflow Pipelines SDK, along with the other Python dependencies listed in the [requirements.txt](./requirements.txt) file:
```
pip install -r requirements.txt --upgrade
```
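Before installing, you can optionally confirm that the active interpreter is the one inside your virtual environment (a small stdlib-only check, not part of the sample itself):

```python
import sys

# In a venv, sys.prefix points at the environment while sys.base_prefix
# points at the system installation; they differ only inside a venv.
in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
print("running inside a virtualenv:", in_venv)
```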
## Running the Pipeline
#### Compile Pipeline
Pipelines are written in Python, but they must be compiled into a [domain-specific language (DSL)](https://en.wikipedia.org/wiki/Domain-specific_language)
representation before they can be used. Most pipelines are designed so that simply running the script performs the compilation step:
```
python3 mnist-pipeline.py
```
Running this command should produce a compiled *mnist-pipeline.py.tar.gz* file (the script names the package after itself).
Alternatively, you can compile manually using the *dsl-compile* script:
```
python venv/bin/dsl-compile --py mnist-pipeline.py --output mnist-pipeline.py.tar.gz
```
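The compiled package is just a gzipped tarball holding the workflow definition as YAML. As a sketch of how you could peek inside your compiled file with nothing but the standard library, the snippet below builds a stand-in archive in memory (since your real *mnist-pipeline.py.tar.gz* lives on disk) and lists its contents; the member name `pipeline.yaml` is illustrative:

```python
import io
import tarfile

# Stand-in for a compiled pipeline package: a gzipped tar containing a
# single workflow YAML file (assumed name: pipeline.yaml).
yaml_bytes = b"apiVersion: argoproj.io/v1alpha1\nkind: Workflow\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    info = tarfile.TarInfo("pipeline.yaml")
    info.size = len(yaml_bytes)
    tar.addfile(info, io.BytesIO(yaml_bytes))

# Inspect it the same way you would inspect the real archive
# (pass the path to tarfile.open instead of a BytesIO object).
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    names = tar.getnames()
print(names)
```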
#### Upload through the UI
Now that you have the compiled pipeline file, you can upload it through the Kubeflow Pipelines UI.
Simply select the "Upload pipeline" button:
![Upload Button](./img/upload_btn.png "Upload Button")
Upload your file and give it a name
![Upload Form](./img/upload_form.png "Upload Form")
#### Run the Pipeline
After clicking on the newly created pipeline, you should be presented with an overview of the pipeline graph.
When you're ready, select the "Create Run" button to launch the pipeline
![Pipeline](./img/pipeline.png "Pipeline")
Fill out the information required for the run, including the GCP `$BUCKET_NAME` you created earlier. Press "Start" when you are ready.
![Run Form](./img/run_form.png "Run Form")
After clicking on the newly created Run, you should see the pipeline run through the 'train', 'serve', and 'web-ui' components. Click on any component to see its logs.
When the pipeline is complete, look at the logs for the web-ui component to find the IP address created for the MNIST web interface.
![Logs](./img/logs.png "Logs")
## Pipeline Breakdown
Now that we've run a pipeline, let's break down how it works.
#### Decorator
```
@dsl.pipeline(
    name='MNIST',
    description='A pipeline to train and serve the MNIST example.'
)
```
Pipelines are expected to include a `@dsl.pipeline` decorator that provides metadata about the pipeline.
#### Function Header
```
def mnist_pipeline(model_export_dir='gs://your-bucket/export',
                   train_steps='200',
                   learning_rate='0.01',
                   batch_size='100'):
```
The pipeline is defined in the `mnist_pipeline` function. It includes a number of arguments, which are exposed in the Kubeflow Pipelines UI when creating a new Run.
Although passed as strings, these arguments are of type [`kfp.dsl.PipelineParam`](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/dsl/_pipeline_param.py).
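For intuition on why the defaults are strings: every argument ultimately crosses a container boundary as a command-line string. The sketch below shows how the train component's argument list is assembled (`build_args` is a hypothetical helper for illustration, not part of the kfp API):

```python
# Hypothetical helper (not part of kfp): pipeline arguments end up as
# plain argv strings handed to the training container.
def build_args(model_export_dir, train_steps, learning_rate, batch_size):
    return [
        "/opt/model.py",
        "--tf-export-dir", str(model_export_dir),
        "--tf-train-steps", str(train_steps),
        "--tf-batch-size", str(batch_size),
        "--tf-learning-rate", str(learning_rate),
    ]

args = build_args("gs://your-bucket/export", "200", "0.01", "100")
print(args)
```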
#### Train
```
train = dsl.ContainerOp(
    name='train',
    image='gcr.io/kubeflow-examples/mnist/model:v20190304-v0.2-176-g15d997b',
    arguments=[
        "/opt/model.py",
        "--tf-export-dir", model_export_dir,
        "--tf-train-steps", train_steps,
        "--tf-batch-size", batch_size,
        "--tf-learning-rate", learning_rate
    ]
).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
This block defines the 'train' component. A component is made up of a [`kfp.dsl.ContainerOp`](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/dsl/_container_op.py)
object, given a name and a container image path. The image used here is built from the [Dockerfile.model in the MNIST example](https://github.com/kubeflow/examples/blob/master/mnist/Dockerfile.model).
Because the training component needs access to our GCS bucket, it is run with the 'user-gcp-sa' secret applied, which grants
read/write access to GCS resources.
The remaining arguments configure the training script itself: the directory to export the trained model to, the number of training steps, the batch size, and the learning rate.
#### Serve
```
serve = dsl.ContainerOp(
    name='serve',
    image='gcr.io/ml-pipeline/ml-pipeline-kubeflow-deployer:\
7775692adf28d6f79098e76e839986c9ee55dd61',
    arguments=[
        '--model-export-path', model_export_dir,
        '--server-name', "mnist-service"
    ]
).apply(gcp.use_gcp_secret('user-gcp-sa'))
serve.after(train)
```
The 'serve' component is slightly different from 'train'. While 'train' runs a single container and then exits, 'serve' runs a container that launches long-lived
resources in the cluster. The ContainerOp takes two arguments: the path we exported our trained model to, and a server name. Using these, this pipeline component
creates a Kubeflow [`tf-serving`](https://github.com/kubeflow/kubeflow/tree/master/kubeflow/tf-serving) service within the cluster. This service lives on after the
pipeline completes, and can be seen using `kubectl get all -n kubeflow`. The Dockerfile used to build this container [can be found here](https://github.com/kubeflow/pipelines/blob/master/components/kubeflow/deployer/Dockerfile).
Like the 'train' component, 'serve' requires access to the 'user-gcp-sa' secret, which the 'kubectl' command within the container relies on.
The `serve.after(train)` line specifies that this component must run sequentially after 'train' is complete.
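Conceptually, `.after()` just records an edge in a dependency graph, which is resolved into an execution order before any containers are scheduled. A toy sketch of that idea (not the kfp implementation):

```python
from collections import defaultdict

# step -> list of prerequisite steps, mirroring serve.after(train) etc.
deps = defaultdict(list)

def after(step, prerequisite):
    """Record that `step` may only run once `prerequisite` has finished."""
    deps[step].append(prerequisite)

after("serve", "train")
after("web-ui", "serve")

def run_order(steps):
    """Topologically sort the steps so prerequisites come first."""
    order, seen = [], set()
    def visit(step):
        if step in seen:
            return
        for dep in deps[step]:
            visit(dep)
        seen.add(step)
        order.append(step)
    for step in steps:
        visit(step)
    return order

print(run_order(["web-ui", "serve", "train"]))  # ['train', 'serve', 'web-ui']
```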
#### Web UI
```
web_ui = dsl.ContainerOp(
    name='web-ui',
    image='gcr.io/kubeflow-examples/mnist/deploy-service:latest',
    arguments=[
        '--image', 'gcr.io/kubeflow-examples/mnist/web-ui:\
v20190304-v0.2-176-g15d997b-pipelines',
        '--name', 'web-ui',
        '--container-port', '5000',
        '--service-port', '80',
        '--service-type', "LoadBalancer"
    ]
).apply(gcp.use_gcp_secret('user-gcp-sa'))
web_ui.after(serve)
```
Like 'serve', the 'web-ui' component launches a service that persists after the pipeline is complete. Instead of launching a Kubeflow resource, 'web-ui' launches
a standard Kubernetes Deployment/Service pair. The Dockerfile that builds the deployment image [can be found here.](./deploy-service/Dockerfile) This image is used
to deploy the web UI, which was itself built from the [Dockerfile found in the MNIST example](https://github.com/kubeflow/examples/blob/master/mnist/web-ui/Dockerfile).
After this component runs, a new LoadBalancer is provisioned that gives external access to the 'web-ui' deployment launched in the cluster.
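Under the hood, the deploy-service container waits for that LoadBalancer address by polling the Service until an ingress IP appears (this is the wait loop in deploy.sh). A stdlib Python sketch of the same pattern, where `get_ip` is a stand-in for the kubectl call the script actually makes:

```python
import time

def wait_for_ip(get_ip, timeout_s=1000, poll_s=5,
                sleep=time.sleep, clock=time.monotonic):
    """Poll get_ip() until it returns a non-empty IP or the timeout passes."""
    deadline = clock() + timeout_s
    while clock() < deadline:
        ip = get_ip()
        if ip:
            return ip
        sleep(poll_s)
    raise TimeoutError("no external IP after %s seconds" % timeout_s)

# Simulated Service that reports an IP on the third poll. In deploy.sh the
# equivalent of get_ip shells out to:
#   kubectl get svc -n kubeflow web-ui \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
responses = iter(["", "", "104.154.10.10"])
ip = wait_for_ip(lambda: next(responses), sleep=lambda s: None)
print(ip)
```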
#### Main Function
```
if __name__ == '__main__':
    import kfp.compiler as compiler
    compiler.Compiler().compile(mnist_pipeline, __file__ + '.tar.gz')
```
At the bottom of the script is a main function, which compiles the pipeline into a *.tar.gz* package when the script is run directly.

pipelines/mnist-pipelines/deploy-service/Dockerfile Normal file

@@ -0,0 +1,62 @@
# Copyright 2018 The Kubeflow Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
FROM debian
RUN apt-get update -q && apt-get upgrade -y && \
    apt-get install -y -qq --no-install-recommends \
        apt-transport-https \
        ca-certificates \
        git \
        gnupg \
        lsb-release \
        unzip \
        wget && \
    wget -O /opt/ks_0.12.0_linux_amd64.tar.gz \
        https://github.com/ksonnet/ksonnet/releases/download/v0.12.0/ks_0.12.0_linux_amd64.tar.gz && \
    tar -C /opt -xzf /opt/ks_0.12.0_linux_amd64.tar.gz && \
    cp /opt/ks_0.12.0_linux_amd64/ks /bin/. && \
    rm -f /opt/ks_0.12.0_linux_amd64.tar.gz && \
    wget -O /bin/kubectl \
        https://storage.googleapis.com/kubernetes-release/release/v1.11.2/bin/linux/amd64/kubectl && \
    chmod u+x /bin/kubectl && \
    wget -O /opt/kubernetes_v1.11.2 \
        https://github.com/kubernetes/kubernetes/archive/v1.11.2.tar.gz && \
    mkdir -p /src && \
    tar -C /src -xzf /opt/kubernetes_v1.11.2 && \
    rm -rf /opt/kubernetes_v1.11.2 && \
    wget -O /opt/google-apt-key.gpg \
        https://packages.cloud.google.com/apt/doc/apt-key.gpg && \
    apt-key add /opt/google-apt-key.gpg && \
    export CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)" && \
    echo "deb https://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" >> \
        /etc/apt/sources.list.d/google-cloud-sdk.list && \
    apt-get update -q && \
    apt-get install -y -qq --no-install-recommends google-cloud-sdk && \
    gcloud config set component_manager/disable_update_check true
ENV KUBEFLOW_VERSION v0.2.5
# Checkout the kubeflow packages at image build time so that we do not
# require calling in to the GitHub API at run time.
RUN cd /src && \
    mkdir -p github.com/kubeflow && \
    cd github.com/kubeflow && \
    git clone https://github.com/kubeflow/kubeflow && \
    cd kubeflow && \
    git checkout ${KUBEFLOW_VERSION}
ADD ./src/deploy.sh /bin/.
ENTRYPOINT ["/bin/deploy.sh"]

pipelines/mnist-pipelines/deploy-service/src/deploy.sh Normal file

@@ -0,0 +1,127 @@
#!/bin/bash -e
# Copyright 2018 The Kubeflow Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -x
KUBERNETES_NAMESPACE="${KUBERNETES_NAMESPACE:-kubeflow}"
NAME="my-deployment"
while (($#)); do
  case $1 in
    "--image")
      shift
      IMAGE_PATH="$1"
      shift
      ;;
    "--service-type")
      shift
      SERVICE_TYPE="$1"
      shift
      ;;
    "--container-port")
      shift
      CONTAINER_PORT="--containerPort=$1"
      shift
      ;;
    "--service-port")
      shift
      SERVICE_PORT="--servicePort=$1"
      shift
      ;;
    "--cluster-name")
      shift
      CLUSTER_NAME="$1"
      shift
      ;;
    "--namespace")
      shift
      KUBERNETES_NAMESPACE="$1"
      shift
      ;;
    "--name")
      shift
      NAME="$1"
      shift
      ;;
    *)
      echo "Unknown argument: '$1'"
      exit 1
      ;;
  esac
done
if [ -z "${IMAGE_PATH}" ]; then
  echo "You must specify an image to deploy"
  exit 1
fi
if [ -z "$SERVICE_TYPE" ]; then
  SERVICE_TYPE=ClusterIP
fi
echo "Deploying the image '${IMAGE_PATH}'"
if [ -z "${CLUSTER_NAME}" ]; then
  CLUSTER_NAME=$(wget -q -O- --header="Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/cluster-name)
fi
# Ensure the name is not more than 63 characters.
NAME="${NAME:0:63}"
# Trim any trailing hyphens from the server name.
while [[ "${NAME:(-1)}" == "-" ]]; do NAME="${NAME::-1}"; done
echo "Deploying ${NAME} to the cluster ${CLUSTER_NAME}"
# Connect kubectl to the local cluster
kubectl config set-cluster "${CLUSTER_NAME}" --server=https://kubernetes.default --certificate-authority=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
kubectl config set-credentials pipeline --token "$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"
kubectl config set-context kubeflow --cluster "${CLUSTER_NAME}" --user pipeline
kubectl config use-context kubeflow
# Configure and deploy the app
cd /src/github.com/kubeflow/kubeflow
git checkout ${KUBEFLOW_VERSION}
cd /opt
echo "Initializing KSonnet app..."
ks init tf-serving-app
cd tf-serving-app/
if [ -n "${KUBERNETES_NAMESPACE}" ]; then
  echo "Setting Kubernetes namespace: ${KUBERNETES_NAMESPACE} ..."
  ks env set default --namespace "${KUBERNETES_NAMESPACE}"
fi
ks generate deployed-service $NAME --name=$NAME --image=$IMAGE_PATH --type=$SERVICE_TYPE $CONTAINER_PORT $SERVICE_PORT
echo "Deploying the service..."
ks apply default -c $NAME
# Wait for the ip address
timeout="1000"
start_time=`date +%s`
PUBLIC_IP=""
while [ -z "$PUBLIC_IP" ]; do
  PUBLIC_IP=$(kubectl get svc -n $KUBERNETES_NAMESPACE $NAME -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2> /dev/null)
  current_time=`date +%s`
  elapsed_time=$(expr $current_time + 1 - $start_time)
  if [[ $elapsed_time -gt $timeout ]]; then
    echo "timeout"
    exit 1
  fi
  sleep 5
done
echo "service active: $PUBLIC_IP"

6 binary image files added under pipelines/mnist-pipelines/img/ (the screenshots referenced by the README; 3.4 KiB to 199 KiB each, not shown).

pipelines/mnist-pipelines/mnist-pipeline.py Normal file

@@ -0,0 +1,80 @@
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Kubeflow Pipelines MNIST example
Run this script to compile pipeline
"""
import kfp.dsl as dsl
import kfp.gcp as gcp


@dsl.pipeline(
    name='MNIST',
    description='A pipeline to train and serve the MNIST example.'
)
def mnist_pipeline(model_export_dir='gs://your-bucket/export',
                   train_steps='200',
                   learning_rate='0.01',
                   batch_size='100'):
    """
    Pipeline with three stages:
      1. train an MNIST classifier
      2. deploy a tf-serving instance to the cluster
      3. deploy a web-ui to interact with it
    """
    train = dsl.ContainerOp(
        name='train',
        image='gcr.io/kubeflow-examples/mnist/model:v20190304-v0.2-176-g15d997b',
        arguments=[
            "/opt/model.py",
            "--tf-export-dir", model_export_dir,
            "--tf-train-steps", train_steps,
            "--tf-batch-size", batch_size,
            "--tf-learning-rate", learning_rate
        ]
    ).apply(gcp.use_gcp_secret('user-gcp-sa'))

    serve = dsl.ContainerOp(
        name='serve',
        image='gcr.io/ml-pipeline/ml-pipeline-kubeflow-deployer:\
7775692adf28d6f79098e76e839986c9ee55dd61',
        arguments=[
            '--model-export-path', model_export_dir,
            '--server-name', "mnist-service"
        ]
    ).apply(gcp.use_gcp_secret('user-gcp-sa'))
    serve.after(train)

    web_ui = dsl.ContainerOp(
        name='web-ui',
        image='gcr.io/kubeflow-examples/mnist/deploy-service:latest',
        arguments=[
            '--image', 'gcr.io/kubeflow-examples/mnist/web-ui:\
v20190304-v0.2-176-g15d997b-pipelines',
            '--name', 'web-ui',
            '--container-port', '5000',
            '--service-port', '80',
            '--service-type', "LoadBalancer"
        ]
    ).apply(gcp.use_gcp_secret('user-gcp-sa'))
    web_ui.after(serve)


if __name__ == '__main__':
    import kfp.compiler as compiler
    compiler.Compiler().compile(mnist_pipeline, __file__ + '.tar.gz')

pipelines/mnist-pipelines/requirements.txt Normal file

@@ -0,0 +1,3 @@
python-dateutil
https://storage.googleapis.com/ml-pipeline/release/0.1.9/kfp.tar.gz
kubernetes==8.0.0