Applied changes to README and Kustomize files to handle training, monitoring, and serving the mnist model in S3 using Kustomize (#543)

This commit is contained in:
Mike Mainguy 2019-08-19 17:41:33 -07:00 committed by Kubernetes Prow Robot
parent 5b3016fae9
commit 0b33b536b7
6 changed files with 299 additions and 31 deletions

View File

@ -351,7 +351,7 @@ kustomize edit add configmap mnist-map-training --from-literal=name=mnist-train-
Optionally, if you want to use a custom training image, configure it as below.
```
kustomize edit set image training-image=$DOCKER_URL:$TAG
kustomize edit set image training-image=$DOCKER_URL
```
Next we configure the job to run distributed training by setting the number of parameter servers and workers to use: `numPs` is the number of parameter servers and `numWorkers` is the number of workers.
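For example, assuming the training overlay reads `numPs` and `numWorkers` from the `mnist-map-training` ConfigMap like the other parameters in this section, a minimal sketch would be:
```
kustomize edit add configmap mnist-map-training --from-literal=numPs=1
kustomize edit add configmap mnist-map-training --from-literal=numWorkers=2
```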
@ -368,13 +368,6 @@ kustomize edit add configmap mnist-map-training --from-literal=batchSize=100
kustomize edit add configmap mnist-map-training --from-literal=learningRate=0.01
```
Now we need to configure parameters telling the code to save the model to S3; replace `${S3_MODEL_PATH_URI}` and `${S3_MODEL_EXPORT_URI}` below with real values.
```
kustomize edit add configmap mnist-map-training --from-literal=modelDir=${S3_MODEL_PATH_URI}
kustomize edit add configmap mnist-map-training --from-literal=exportDir=${S3_MODEL_EXPORT_URI}
```
In order to write to S3 we need to supply the TensorFlow code with AWS credentials, and we also need to set various environment variables configuring access to S3.
1. Define a bunch of environment variables corresponding to your S3 settings; these will be used in subsequent steps
@ -388,24 +381,25 @@ In order to write to S3 we need to supply the TensorFlow code with AWS credentia
export BUCKET_NAME=mybucket
export S3_USE_HTTPS=1 #set to 0 for default minio installs
export S3_VERIFY_SSL=1 #set to 0 for default minio installs (see the MinIO sketch below)
export S3_MODEL_PATH_URI=s3://${BUCKET_NAME}/model
export S3_MODEL_EXPORT_URI=s3://${BUCKET_NAME}/export
```
2. Create a K8s secret containing your AWS credentials
1. Create a K8s secret containing your AWS credentials
```
kustomize edit add secret aws-creds --from-literal=awsAccessKeyID=${AWS_ACCESS_KEY_ID} \
--from-literal=awsSecretAccessKey=${AWS_SECRET_ACCESS_KEY}
```
3. Pass secrets as environment variables into pod
1. Pass secrets as environment variables into pod
```
kustomize edit add configmap mnist-map-training --from-literal=awsSecretName=aws-creds
kustomize edit add configmap mnist-map-training --from-literal=awsAccessKeyIDName=awsAccessKeyID
kustomize edit add configmap mnist-map-training --from-literal=awsSecretAccessKeyName=awsSecretAccessKey
```
4. Next we need to set several S3-related environment variables so that TensorFlow knows how to talk to S3
1. Next we need to set several S3-related environment variables so that TensorFlow knows how to talk to S3
```
kustomize edit add configmap mnist-map-training --from-literal=S3_ENDPOINT=${S3_ENDPOINT}
@ -414,6 +408,8 @@ In order to write to S3 we need to supply the TensorFlow code with AWS credentia
kustomize edit add configmap mnist-map-training --from-literal=BUCKET_NAME=${BUCKET_NAME}
kustomize edit add configmap mnist-map-training --from-literal=S3_USE_HTTPS=${S3_USE_HTTPS}
kustomize edit add configmap mnist-map-training --from-literal=S3_VERIFY_SSL=${S3_VERIFY_SSL}
kustomize edit add configmap mnist-map-training --from-literal=modelDir=${S3_MODEL_PATH_URI}
kustomize edit add configmap mnist-map-training --from-literal=exportDir=${S3_MODEL_EXPORT_URI}
```
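For reference, if you are pointing at a default in-cluster MinIO install rather than AWS S3, the same settings might look like the sketch below; the endpoint here is hypothetical and depends on how MinIO was deployed.
```
export S3_ENDPOINT=minio-service.kubeflow:9000  #hypothetical in-cluster MinIO endpoint
export AWS_REGION=us-east-1                     #still required by the S3 client; the value is usually ignored by MinIO
export S3_USE_HTTPS=0
export S3_VERIFY_SSL=0
```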
* If we look at the spec for our job we can see that the environment variables related to S3 are set.
@ -436,10 +432,28 @@ In order to write to S3 we need to supply the TensorFlow code with AWS credentia
..
env:
...
- name: S3_ENDPOINT
value: s3.us-west-2.amazonaws.com
- name: AWS_ENDPOINT_URL
value: https://s3.us-west-2.amazonaws.com
- name: AWS_REGION
value: us-west-2
- name: BUCKET_NAME
value: somebucket
value: mybucket
- name: S3_USE_HTTPS
value: "1"
- name: S3_VERIFY_SSL
value: "1"
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
key: awsAccessKeyID
name: aws-creds-somevalue
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
key: awsSecretAccessKey
name: aws-creds-somevalue
...
...
...
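You can also verify this non-interactively; a minimal sketch that greps the rendered manifests for the S3-related settings:
```
kustomize build . | grep -A 1 -E "S3_|AWS_|BUCKET_NAME"
```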
@ -543,29 +557,30 @@ Enter the `monitoring/S3` from the `mnist` application directory.
cd monitoring/S3
```
Configure TensorBoard to point to your model location
```
kustomize edit add configmap mnist-map-monitoring --from-literal=logDir=${LOGDIR}
```
Assuming you followed the directions above, if you used S3 you can use the following value:
```
LOGDIR=s3://${BUCKET}/${MODEL_PATH}
LOGDIR=${S3_MODEL_PATH_URI}
kustomize edit add configmap mnist-map-monitoring --from-literal=logDir=${LOGDIR}
```
You need to point TensorBoard at your AWS credentials so it can access the S3 bucket containing the model.
1. Create a K8s secret containing your AWS credentials
```
kustomize edit add secret aws-creds --from-literal=awsAccessKeyID=${AWS_ACCESS_KEY_ID} \
--from-literal=awsSecretAccessKey=${AWS_SECRET_ACCESS_KEY}
```
1. Pass secrets as environment variables into pod
```
kustomize edit add configmap mnist-map-monitoring --from-literal=awsSecretName=aws-creds
kustomize edit add configmap mnist-map-monitoring --from-literal=awsAccessKeyIDName=awsAccessKeyID
kustomize edit add configmap mnist-map-monitoring --from-literal=awsSecretAccessKeyName=awsSecretAccessKey
```
2. Next we need to set several S3-related environment variables so that TensorBoard knows how to talk to S3
1. Next we need to set several S3-related environment variables so that TensorBoard knows how to talk to S3
```
kustomize edit add configmap mnist-map-monitoring --from-literal=S3_ENDPOINT=${S3_ENDPOINT}
@ -590,10 +605,28 @@ You need to point TensorBoard to AWS credentials to access S3 bucket with model.
..
env:
...
- name: S3_ENDPOINT
value: s3.us-west-2.amazonaws.com
- name: AWS_ENDPOINT_URL
value: https://s3.us-west-2.amazonaws.com
- name: AWS_REGION
value: us-west-2
- name: BUCKET_NAME
value: somebucket
value: mybucket
- name: S3_USE_HTTPS
value: "1"
- name: S3_VERIFY_SSL
value: "1"
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
key: awsAccessKeyID
name: aws-creds-somevalue
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
key: awsSecretAccessKey
name: aws-creds-somevalue
...
```
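To actually view TensorBoard you would apply these manifests and port-forward to the pod; a minimal sketch, assuming the container exposes TensorBoard's default port 6006 (the pod name comes from the overlay, so list the pods first):
```
kustomize build . | kubectl apply -f -
kubectl get pods   #find the TensorBoard pod created by this overlay
kubectl port-forward <tensorboard-pod-name> 6006:6006   #then open http://localhost:6006
```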
@ -680,7 +713,122 @@ kubectl describe service mnist-gcs-dist
### S3
TODO: Add instructions
We can also serve the model when it is stored on S3. This assumes that when you trained the model you set `exportDir` to an S3
URI; if not, you can always copy it to S3 using the AWS CLI.
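For example, if the export ended up on local disk, a copy along these lines works (the local path here is hypothetical):
```
aws s3 cp --recursive /tmp/export ${S3_MODEL_EXPORT_URI}
```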
Assuming you followed the directions above, you should have set the following environment variables that will be used in this section:
```
echo ${S3_MODEL_EXPORT_URI}
echo ${AWS_REGION}
echo ${S3_ENDPOINT}
echo ${S3_USE_HTTPS}
echo ${S3_VERIFY_SSL}
```
Check that a model was exported to S3:
```
aws s3 ls ${S3_MODEL_EXPORT_URI} --recursive
```
The output should look something like
```
${S3_MODEL_EXPORT_URI}/1547100373/saved_model.pb
${S3_MODEL_EXPORT_URI}/1547100373/variables/
${S3_MODEL_EXPORT_URI}/1547100373/variables/variables.data-00000-of-00001
${S3_MODEL_EXPORT_URI}/1547100373/variables/variables.index
```
The number `1547100373` is a version number auto-generated by TensorFlow; it will vary on each run but should be monotonically increasing if you save models to the same location as previous runs.
Enter the `serving/S3` folder from the `mnist` application directory.
```
cd serving/S3
```
Set a different name for the TF Serving deployment:
```
kustomize edit add configmap mnist-map-serving --from-literal=name=mnist-s3-serving
```
Create a K8s secret containing your AWS credentials
```
kustomize edit add secret aws-creds --from-literal=awsAccessKeyID=${AWS_ACCESS_KEY_ID} \
--from-literal=awsSecretAccessKey=${AWS_SECRET_ACCESS_KEY}
```
Enable serving from S3 by configuring the following kustomize parameters using the environment variables from above:
```
kustomize edit add configmap mnist-map-serving --from-literal=s3Enable=1 #This needs to be set for the S3 connection to work
kustomize edit add configmap mnist-map-serving --from-literal=modelBasePath=${S3_MODEL_EXPORT_URI}/
kustomize edit add configmap mnist-map-serving --from-literal=S3_ENDPOINT=${S3_ENDPOINT}
kustomize edit add configmap mnist-map-serving --from-literal=AWS_REGION=${AWS_REGION}
kustomize edit add configmap mnist-map-serving --from-literal=S3_USE_HTTPS=${S3_USE_HTTPS}
kustomize edit add configmap mnist-map-serving --from-literal=S3_VERIFY_SSL=${S3_VERIFY_SSL}
kustomize edit add configmap mnist-map-serving --from-literal=AWS_ACCESS_KEY_ID=awsAccessKeyID
kustomize edit add configmap mnist-map-serving --from-literal=AWS_SECRET_ACCESS_KEY=awsSecretAccessKey
```
If we look at the spec for the TensorFlow Serving deployment we can see that the environment variables related to S3 are set.
```
kustomize build .
```
```
...
spec:
containers:
- command:
..
env:
...
- name: modelBasePath
value: s3://mybucket/export/
- name: s3Enable
value: "1"
- name: S3_ENDPOINT
value: s3.us-west-2.amazonaws.com
- name: AWS_REGION
value: us-west-2
- name: S3_USE_HTTPS
value: "1"
- name: S3_VERIFY_SSL
value: "1"
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
key: awsAccessKeyID
name: aws-creds-somevalue
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
key: awsSecretAccessKey
name: aws-creds-somevalue
...
```
Deploy it, and run a service to make the deployment accessible to other pods in the cluster
```
kustomize build . |kubectl apply -f -
```
You can check the deployment by running
```
kubectl describe deployments mnist-s3-serving
```
The service should make the `mnist-s3-serving` deployment accessible over port 9000
```
kubectl describe service mnist-s3-serving
```
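To reach the server from outside the cluster you can port-forward to the service; a minimal sketch using the gRPC port 9000 mentioned above:
```
kubectl port-forward service/mnist-s3-serving 9000:9000
```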
### Local storage
@ -753,7 +901,7 @@ POD_NAME=$(kubectl get pods --selector=app=web-ui --template '{{range .items}}{{
kubectl port-forward ${POD_NAME} 8080:5000
```
You should now be able to open up the web app at your localhost. [Local Storage](http://localhost:8080) or [GCS](http://localhost:8080/?addr=mnist-gcs-dist).
You should now be able to open the web app on localhost: [Local Storage](http://localhost:8080), [GCS](http://localhost:8080/?addr=mnist-gcs-dist), or [S3](http://localhost:8080/?addr=mnist-s3-serving).
### Using IAP on GCP

View File

@ -51,12 +51,12 @@ vars:
kind: ConfigMap
name: mnist-map-monitoring
- fieldref:
fieldPath: data.awsSecretName
fieldPath: metadata.name
name: awsSecretName
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-monitoring
kind: Secret
name: aws-creds
- fieldref:
fieldPath: data.awsAccessKeyIDName
name: awsAccessKeyIDName

View File

@ -0,0 +1,41 @@
- op: add
path: /spec/template/spec/containers/0/env/-
value:
name: s3Enable
value: $(s3Enable)
- op: add
path: /spec/template/spec/containers/0/env/-
value:
name: S3_ENDPOINT
value: $(S3_ENDPOINT)
- op: add
path: /spec/template/spec/containers/0/env/-
value:
name: AWS_REGION
value: $(AWS_REGION)
- op: add
path: /spec/template/spec/containers/0/env/-
value:
name: S3_USE_HTTPS
value: $(S3_USE_HTTPS)
- op: add
path: /spec/template/spec/containers/0/env/-
value:
name: S3_VERIFY_SSL
value: $(S3_VERIFY_SSL)
- op: add
path: /spec/template/spec/containers/0/env/-
value:
name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
key: $(AWS_ACCESS_KEY_ID)
name: $(awsSecretName)
- op: add
path: /spec/template/spec/containers/0/env/-
value:
name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
key: $(AWS_SECRET_ACCESS_KEY)
name: $(awsSecretName)

View File

@ -0,0 +1,74 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../base
configurations:
- params.yaml
vars:
- fieldref:
fieldPath: data.s3Enable
name: s3Enable
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-serving
- fieldref:
fieldPath: data.S3_ENDPOINT
name: S3_ENDPOINT
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-serving
- fieldref:
fieldPath: data.AWS_REGION
name: AWS_REGION
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-serving
- fieldref:
fieldPath: data.S3_USE_HTTPS
name: S3_USE_HTTPS
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-serving
- fieldref:
fieldPath: data.S3_VERIFY_SSL
name: S3_VERIFY_SSL
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-serving
- fieldref:
fieldPath: metadata.name
name: awsSecretName
objref:
apiVersion: v1
kind: Secret
name: aws-creds
- fieldref:
fieldPath: data.AWS_ACCESS_KEY_ID
name: AWS_ACCESS_KEY_ID
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-serving
- fieldref:
fieldPath: data.AWS_SECRET_ACCESS_KEY
name: AWS_SECRET_ACCESS_KEY
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-serving
patchesJson6902:
- path: deployment_patch.yaml
target:
group: extensions
kind: Deployment
name: $(svcName)
version: v1beta1

View File

@ -0,0 +1,5 @@
varReference:
- path: spec/template/spec/containers/env/valueFrom/secretKeyRef/name
kind: Deployment
- path: spec/template/spec/containers/env/valueFrom/secretKeyRef/key
kind: Deployment

View File

@ -60,12 +60,12 @@ vars:
kind: ConfigMap
name: mnist-map-training
- fieldref:
fieldPath: data.awsSecretName
fieldPath: metadata.name
name: awsSecretName
objref:
apiVersion: v1
kind: ConfigMap
name: mnist-map-training
kind: Secret
name: aws-creds
- fieldref:
fieldPath: data.awsAccessKeyIDName
name: awsAccessKeyIDName