Examples for pipelines running on-premise (#559)

This commit is contained in:
Jin Chi He 2019-06-05 11:49:52 +08:00 committed by Kubernetes Prow Robot
parent 767ecd240d
commit 73d3a00c8b
4 changed files with 62 additions and 23 deletions


@ -1,9 +1,11 @@
# MNIST Pipelines GCP
This document describes how to run the [MNIST example](https://github.com/kubeflow/examples/tree/master/mnist) on Kubeflow Pipelines on a Google Cloud Platform cluster or an on-premise cluster.
## Setup
### GCP
#### Create a GCS bucket
This pipeline requires a [Google Cloud Storage bucket](https://cloud.google.com/storage/) to hold your trained model. You can create one with the following command:
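For example, using the `gsutil` CLI from the Cloud SDK (the bucket name below is illustrative; bucket names must be globally unique):
```bash
# Illustrative value; pick your own globally unique bucket name
BUCKET_ID=my-mnist-pipeline-bucket
gsutil mb gs://${BUCKET_ID}/
```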
@ -32,7 +34,11 @@ kubectl port-forward -n kubeflow $(kubectl get pods -n kubeflow --selector=servi
-o jsonpath='{.items[0].metadata.name}') 8085:80
```
### On Premise Cluster
For an on-premise cluster, besides the Kubeflow deployment you also need to create a [Persistent Volume (PV)](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) and a [Persistent Volume Claim (PVC)](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) to store the trained results.
Note that the `accessModes` of the PVC should be `ReadWriteMany` so that the PVC can be mounted by the containers of multiple steps in parallel.
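As a sketch (the names, capacity, and NFS backend below are illustrative assumptions; any storage that supports `ReadWriteMany` works), the PV and PVC could be created like this:
```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mnist-pv                  # illustrative name
spec:
  capacity:
    storage: 10Gi                 # illustrative size
  accessModes:
    - ReadWriteMany
  nfs:                            # assumes an NFS backend; any RWX-capable storage works
    server: 10.0.0.1              # illustrative NFS server address
    path: /exports/mnist
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mnist-pvc                 # pass this as pvc_name when starting the run
  namespace: kubeflow
spec:
  storageClassName: ""            # bind to the statically provisioned PV above
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
EOF
```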
### Install Python Dependencies
Set up a [virtual environment](https://docs.python.org/3/tutorial/venv.html) for your Kubeflow Pipelines work:
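For example (assuming `python3` is available; the `venv` directory name matches the `dsl-compile` path used later in this document):
```bash
python3 -m venv venv
source venv/bin/activate
```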
@ -50,17 +56,24 @@ pip install -r requirements.txt --upgrade
## Running the Pipeline
#### Compile Pipeline
Pipelines are written in Python, but they must be compiled into a [domain-specific language (DSL)](https://en.wikipedia.org/wiki/Domain-specific_language) before they can be used.
For an on-premise cluster, update `platform` to `'onprem'` in `mnist_pipeline.py`:
```bash
sed -i.sedbak "s/platform = 'GCP'/platform = 'onprem'/" mnist_pipeline.py
```
Most pipelines are designed so that simply running the script will perform the compilation step:
```bash
python3 mnist_pipeline.py
```
Running this command should produce a compiled *mnist_pipeline.py.tar.gz* file.
Additionally, you can compile manually using the *dsl-compile* script:
```bash
python venv/bin/dsl-compile --py mnist_pipeline.py --output mnist_pipeline.py.tar.gz
```
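Alternatively, the compiled package can be uploaded with the KFP SDK rather than through the UI. The following is a sketch: the host URL assumes the port-forward from the setup section and a `/pipeline` route, both of which may differ per deployment.
```python
import kfp

# Host assumes the ambassador port-forward from the setup section is running;
# the '/pipeline' route is an assumption and may differ per deployment.
client = kfp.Client(host='http://localhost:8085/pipeline')
client.upload_pipeline('mnist_pipeline.py.tar.gz')
```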
#### Upload through the UI
@ -81,7 +94,9 @@ When you're ready, select the "Create Run" button to launch the pipeline
![Pipeline](./img/pipeline.png "Pipeline")
Fill out the information required for the run, and press "Start" when you are ready.
- GCP: Fill out the GCP `$BUCKET_ID` you created earlier, and ignore the `pvc_name` option.
- On premise cluster: Set `pvc_name` to the name of the PVC you created earlier. The PVC is mounted at `/mnt`, so `model-export-dir` can be `/mnt/export`.
![Run Form](./img/run_form.png "Run Form")
@ -108,7 +123,8 @@ Pipelines are expected to include a `@dsl.pipeline` decorator to provide metadat
def mnist_pipeline(model_export_dir='gs://your-bucket/export',
                   train_steps='200',
                   learning_rate='0.01',
                   batch_size='100',
                   pvc_name=''):
```
The pipeline is defined in the `mnist_pipeline` function. It includes a number of arguments, which are exposed in the Kubeflow Pipelines UI when creating a new Run. Although passed as strings, these arguments are of type [`kfp.dsl.PipelineParam`](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/dsl/_pipeline_param.py).
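To see this in action (a standalone sketch, not part of the example code), compiling a pipeline runs the function body with placeholder objects instead of concrete values:
```python
import kfp.dsl as dsl
import kfp.compiler as compiler

@dsl.pipeline(name='param-demo')
def param_demo(train_steps='200'):
    # At compile time the argument is a PipelineParam placeholder, not '200';
    # it resolves to a concrete value only when a Run is created.
    assert isinstance(train_steps, dsl.PipelineParam)
    dsl.ContainerOp(
        name='echo',
        image='alpine',
        command=['sh', '-c', 'echo train_steps=%s' % train_steps])

if __name__ == '__main__':
    compiler.Compiler().compile(param_demo, 'param_demo.tar.gz')
```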
@ -170,7 +186,7 @@ web_ui = dsl.ContainerOp(
)
web_ui.after(serve)
```
Like `serve`, the web-ui component launches a service that persists after the pipeline is complete. Instead of launching a Kubeflow resource, the web-ui launches a standard Kubernetes Deployment/Service pair. The Dockerfile that builds the deployment image [can be found here](./deploy-service/Dockerfile). This image is used to deploy the web UI, which was built from the [Dockerfile found in the MNIST example](https://github.com/kubeflow/examples/blob/master/mnist/web-ui/Dockerfile).
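After a run completes, you can check the pair from the command line (a sketch assuming the `web-ui` name used above and the `kubeflow` namespace):
```bash
# Both the Deployment and the Service are named after the --name argument
kubectl get deployment,service web-ui -n kubeflow
```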

Two binary image files updated (not shown).

`mnist_pipeline.py`

@ -20,7 +20,9 @@ Run this script to compile pipeline
import kfp.dsl as dsl
import kfp.gcp as gcp
import kfp.onprem as onprem

platform = 'GCP'

@dsl.pipeline(
    name='MNIST',
@ -29,7 +31,8 @@ import kfp.gcp as gcp
def mnist_pipeline(model_export_dir='gs://your-bucket/export',
                   train_steps='200',
                   learning_rate='0.01',
                   batch_size='100',
                   pvc_name=''):
  """
  Pipeline with three stages:
    1. train an MNIST classifier
@ -46,34 +49,54 @@ def mnist_pipeline(model_export_dir='gs://your-bucket/export',
"--tf-batch-size", batch_size,
"--tf-learning-rate", learning_rate
]
).apply(gcp.use_gcp_secret('user-gcp-sa'))
)
serve_args = [
'--model-export-path', model_export_dir,
'--server-name', "mnist-service"
]
if platform != 'GCP':
serve_args.extend([
'--cluster-name', "mnist-pipeline",
'--pvc-name', pvc_name
])
serve = dsl.ContainerOp(
name='serve',
image='gcr.io/ml-pipeline/ml-pipeline-kubeflow-deployer:'
'7775692adf28d6f79098e76e839986c9ee55dd61',
arguments=[
'--model-export-path', model_export_dir,
'--server-name', "mnist-service"
]
).apply(gcp.use_gcp_secret('user-gcp-sa'))
arguments=serve_args
)
serve.after(train)
web_ui = dsl.ContainerOp(
name='web-ui',
image='gcr.io/kubeflow-examples/mnist/deploy-service:latest',
arguments=[
webui_args = [
'--image', 'gcr.io/kubeflow-examples/mnist/web-ui:'
'v20190304-v0.2-176-g15d997b-pipelines',
'--name', 'web-ui',
'--container-port', '5000',
'--service-port', '80',
'--service-type', "LoadBalancer"
]
).apply(gcp.use_gcp_secret('user-gcp-sa'))
]
if platform != 'GCP':
webui_args.extend([
'--cluster-name', "mnist-pipeline"
])
web_ui = dsl.ContainerOp(
name='web-ui',
image='gcr.io/kubeflow-examples/mnist/deploy-service:latest',
arguments=webui_args
)
web_ui.after(serve)
steps = [train, serve, web_ui]
for step in steps:
if platform == 'GCP':
step.apply(gcp.use_gcp_secret('user-gcp-sa'))
else:
step.apply(onprem.mount_pvc(pvc_name, 'local-storage', '/mnt'))
if __name__ == '__main__':
import kfp.compiler as compiler
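For reference, `onprem.mount_pvc(pvc_name, 'local-storage', '/mnt')` attaches the PVC to a step as a volume named `local-storage` mounted at `/mnt`. A rough hand-rolled equivalent using the Kubernetes client objects (a sketch of the idea, not the library's exact internals) would be:
```python
from kubernetes import client as k8s_client

def mount_pvc_manually(step, pvc_name, volume_name='local-storage', mount_path='/mnt'):
  """Roughly what onprem.mount_pvc does: back a volume with the PVC and mount it."""
  step.add_volume(k8s_client.V1Volume(
      name=volume_name,
      persistent_volume_claim=k8s_client.V1PersistentVolumeClaimVolumeSource(
          claim_name=pvc_name)))
  step.add_volume_mount(k8s_client.V1VolumeMount(
      name=volume_name, mount_path=mount_path))
  return step
```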