Use latest kubeflow release branch v0.3.4-rc.1 (#365)

Remove separate pipelines installation
Update kfp version to 0.1.3-rc.2
Clarify difference in installation paths (click-to-deploy vs CLI)
Use set_gpu_limit() and remove generated yaml with resource limits
This commit is contained in:
Michelle Casbon 2018-11-28 04:27:34 +11:00 committed by k8s-ci-robot
parent 31390d39a0
commit 6fcb28bc26
5 changed files with 27 additions and 164 deletions


@@ -6,17 +6,13 @@ presentation to public audiences.
The base demo includes the following steps:
1. [Setup your environment](#1-setup-your-environment)
1. [Create a GKE cluster and install Kubeflow](#2-create-a-gke-cluster-and-install-kubeflow)
1. [Install pipelines on GKE](#3-install-pipelines-on-gke)
1. [Create a GKE cluster and install Kubeflow with pipelines](#2-create-a-gke-cluster-and-install-kubeflow-with-pipelines)
## 1. Setup your environment
Clone the [kubeflow/kubeflow](https://github.com/kubeflow/kubeflow) repo and
checkout the
[`v0.3.3`](https://github.com/kubeflow/kubeflow/releases/tag/v0.3.3) branch.
Clone the [kubeflow/pipelines](https://github.com/kubeflow/pipelines) repo and
checkout the
[`0.1.2`](https://github.com/kubeflow/pipelines/releases/tag/0.1.2) branch.
[`v0.3.4-rc.1`](https://github.com/kubeflow/kubeflow/releases/tag/v0.3.4-rc.1) branch.
Ensure that the repo paths, project name, and other variables are set correctly.
When all overrides are set, source the environment file:
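Sourcing the file only helps if every override is actually set; a minimal stdlib sketch of a pre-flight check (variable names taken from the demo env files in this change, the helper itself is hypothetical):

```python
import os

# Overrides the demo env files expect (see the variables files in this change).
REQUIRED = ["CLUSTER", "PROJECT", "KUBEFLOW_REPO", "DEMO_REPO"]

def missing_overrides(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]

# Example: a partially configured environment
example = {"CLUSTER": "kubeflow", "PROJECT": "my-demo"}
print(missing_overrides(example))  # ['KUBEFLOW_REPO', 'DEMO_REPO']
```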
@@ -35,16 +31,28 @@ source activate kfp
Install the Kubeflow Pipelines SDK:
```
pip install https://storage.googleapis.com/ml-pipeline/release/0.1.2/kfp.tar.gz --upgrade
pip install https://storage.googleapis.com/ml-pipeline/release/0.1.3-rc.2/kfp.tar.gz --upgrade
```
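The pin moves from a final release (`0.1.2`) to a release candidate. A hedged stdlib sketch (simplified ordering rules, not a full version parser) of why `0.1.3-rc.2` sorts before the eventual `0.1.3`:

```python
def version_key(version):
    """Sortable key for versions shaped like 'X.Y.Z' or 'X.Y.Z-rc.N'.

    Release candidates sort before the final release of the same number.
    """
    release, _, rc = version.partition("-rc.")
    nums = tuple(int(part) for part in release.split("."))
    # 0 marks a pre-release, 1 a final release, so rc ordering comes first.
    return (nums, 0, int(rc)) if rc else (nums, 1, 0)

versions = ["0.1.3", "0.1.2", "0.1.3-rc.2"]
print(sorted(versions, key=version_key))  # ['0.1.2', '0.1.3-rc.2', '0.1.3']
```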
## 2. Create a GKE cluster and install Kubeflow
If you encounter any errors, run this before repeating the previous command:
Creating a cluster with click-to-deploy does not yet support installing
pipelines, so it is not useful for demonstrating them, but is still worth showing.
```
pip uninstall kfp
```
## 2. Create a GKE cluster and install Kubeflow with pipelines
Choose one of the following options for creating a cluster and installing
Kubeflow with pipelines:
* Click-to-deploy
* CLI (kfctl)
### Click-to-deploy
This is the recommended path if you do not require access to GKE beta features
such as node auto-provisioning (NAP).
Generate a web app Client ID and Client Secret by following the instructions
[here](https://www.kubeflow.org/docs/started/getting-started-gke/#create-oauth-client-credentials).
Save these as environment variables for easy access.
@@ -58,11 +66,13 @@ In the [GCP Console](https://console.cloud.google.com/kubernetes), navigate to the
Kubernetes Engine panel to watch the cluster creation process. This results in a
full cluster with Kubeflow installed.
### kfctl
### CLI (kfctl)
While node autoprovisioning is in beta, it must be enabled manually. To create
a cluster with autoprovisioning, run the following commands, which will take
around 30 minutes:
If you require GKE beta features such as node autoprovisioning (NAP), these
instructions describe manual cluster creation.
To create a cluster with autoprovisioning, run the following commands
(estimated: 30 minutes):
```
gcloud container clusters create ${CLUSTER} \
@@ -162,18 +172,6 @@ kfctl generate k8s
kfctl apply k8s
```
## 3. Install pipelines on GKE
```
kubectl create clusterrolebinding sa-admin --clusterrole=cluster-admin --serviceaccount=kubeflow:pipeline-runner
cd ks_app
ks registry add ml-pipeline "${PIPELINES_REPO}/ml-pipeline"
ks pkg install ml-pipeline/ml-pipeline
ks generate ml-pipeline ml-pipeline
ks param set ml-pipeline namespace kubeflow
ks apply default -c ml-pipeline
```
View the installed components in the GCP Console. In the
[Kubernetes Engine](https://console.cloud.google.com/kubernetes)
section, you will see a new cluster ${CLUSTER}. Under


@@ -38,7 +38,7 @@ def kubeflow_training(
num_layers: kfp.PipelineParam = kfp.PipelineParam(name='numlayers', value='2'),
optimizer: kfp.PipelineParam = kfp.PipelineParam(name='optimizer', value='ftrl')):
training = training_op(learning_rate, num_layers, optimizer)
training = training_op(learning_rate, num_layers, optimizer).set_gpu_limit(1)
postprocessing = postprocessing_op(training.output) # pylint: disable=unused-variable
if __name__ == '__main__':
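The one-line `.set_gpu_limit(1)` call above is what allows the hand-maintained YAML with resource limits to be deleted: the SDK injects the GPU limit into the generated container spec. A minimal stdlib sketch of that translation (hypothetical `ContainerSpec` class; the real kfp implementation differs):

```python
class ContainerSpec:
    """Toy stand-in for a pipeline step's container definition."""

    def __init__(self, image, command):
        self.image = image
        self.command = command
        self.resource_limits = {}

    def set_gpu_limit(self, gpus, vendor="nvidia"):
        # Mirrors the generated Argo YAML: resources.limits['nvidia.com/gpu'] = 1
        self.resource_limits[vendor + ".com/gpu"] = gpus
        return self  # allow chaining, as in training_op(...).set_gpu_limit(1)

step = ContainerSpec("katib/mxnet-mnist-example",
                     ["python", "/mxnet/example/image-classification/train_mnist.py"])
step.set_gpu_limit(1)
print(step.resource_limits)  # {'nvidia.com/gpu': 1}
```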


@ -1,136 +0,0 @@
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: pipeline-gpu-example-
spec:
arguments:
parameters:
- name: learning-rate
value: '0.1'
- name: num-layers
value: '2'
- name: optimizer
value: ftrl
entrypoint: pipeline-gpu-example
serviceAccountName: pipeline-runner
templates:
- dag:
tasks:
- arguments:
parameters:
- name: training-output
value: '{{tasks.training.outputs.parameters.training-output}}'
dependencies:
- training
name: postprocessing
template: postprocessing
- arguments:
parameters:
- name: learning-rate
value: '{{inputs.parameters.learning-rate}}'
- name: num-layers
value: '{{inputs.parameters.num-layers}}'
- name: optimizer
value: '{{inputs.parameters.optimizer}}'
name: training
template: training
inputs:
parameters:
- name: learning-rate
- name: num-layers
- name: optimizer
name: pipeline-gpu-example
- container:
args:
- echo "{{inputs.parameters.training-output}}"
command:
- sh
- -c
image: library/bash:4.4.23
inputs:
parameters:
- name: training-output
name: postprocessing
outputs:
artifacts:
- name: mlpipeline-ui-metadata
path: /mlpipeline-ui-metadata.json
s3:
accessKeySecret:
key: accesskey
name: mlpipeline-minio-artifact
bucket: mlpipeline
endpoint: minio-service.kubeflow:9000
insecure: true
key: runs/{{workflow.uid}}/{{pod.name}}/mlpipeline-ui-metadata.tgz
secretKeySecret:
key: secretkey
name: mlpipeline-minio-artifact
- name: mlpipeline-metrics
path: /mlpipeline-metrics.json
s3:
accessKeySecret:
key: accesskey
name: mlpipeline-minio-artifact
bucket: mlpipeline
endpoint: minio-service.kubeflow:9000
insecure: true
key: runs/{{workflow.uid}}/{{pod.name}}/mlpipeline-metrics.tgz
secretKeySecret:
key: secretkey
name: mlpipeline-minio-artifact
- container:
args:
- --batch-size
- '64'
- --lr
- '{{inputs.parameters.learning-rate}}'
- --num-layers
- '{{inputs.parameters.num-layers}}'
- --optimizer
- '{{inputs.parameters.optimizer}}'
command:
- python
- /mxnet/example/image-classification/train_mnist.py
image: katib/mxnet-mnist-example
resources:
limits:
nvidia.com/gpu: 1
inputs:
parameters:
- name: learning-rate
- name: num-layers
- name: optimizer
name: training
outputs:
artifacts:
- name: mlpipeline-ui-metadata
path: /mlpipeline-ui-metadata.json
s3:
accessKeySecret:
key: accesskey
name: mlpipeline-minio-artifact
bucket: mlpipeline
endpoint: minio-service.kubeflow:9000
insecure: true
key: runs/{{workflow.uid}}/{{pod.name}}/mlpipeline-ui-metadata.tgz
secretKeySecret:
key: secretkey
name: mlpipeline-minio-artifact
- name: mlpipeline-metrics
path: /mlpipeline-metrics.json
s3:
accessKeySecret:
key: accesskey
name: mlpipeline-minio-artifact
bucket: mlpipeline
endpoint: minio-service.kubeflow:9000
insecure: true
key: runs/{{workflow.uid}}/{{pod.name}}/mlpipeline-metrics.tgz
secretKeySecret:
key: secretkey
name: mlpipeline-minio-artifact
parameters:
- name: training-output
valueFrom:
path: /etc/timezone


@@ -6,7 +6,7 @@ export CLUSTER=kfdemo
export PROJECT=${DEMO_PROJECT}
export ENV=gke
export SVC_ACCT=minikube
export KUBEFLOW_TAG=v0.3.3
export KUBEFLOW_TAG=v0.3.4-rc.1
export KUBEFLOW_REPO=${HOME}/repos/kubeflow/kubeflow
export DEMO_REPO=${HOME}/repos/kubeflow/examples/demos/yelp_demo


@@ -7,4 +7,5 @@ export CLUSTER=kubeflow
export KUBEFLOW_REPO=${HOME}/repos/kubeflow/kubeflow
export DEMO_REPO=${HOME}/repos/kubeflow/examples/demos/simple_pipeline
export PIPELINES_REPO=${HOME}/repos/kubeflow/pipelines
export PIPELINE_VERSION=0.1.3-rc.2
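The new `PIPELINE_VERSION` override feeds the same release path used by the SDK install step earlier; a small sketch composing that URL (pattern copied from the `pip install` command above, helper name hypothetical):

```python
def kfp_sdk_url(version):
    """Build the release tarball URL for a given pipelines SDK version."""
    return ("https://storage.googleapis.com/ml-pipeline/release/"
            + version + "/kfp.tar.gz")

print(kfp_sdk_url("0.1.3-rc.2"))
# https://storage.googleapis.com/ml-pipeline/release/0.1.3-rc.2/kfp.tar.gz
```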