mirror of https://github.com/kubeflow/examples.git
Add the web-ui for the mnist example (#473)
* Add the web-ui for the mnist example. Copy the mnist web app from https://github.com/googlecodelabs/kubeflow-introduction.
* Update the web app.
* Change the "server-name" argument to "model-name", because that is what it is.
* Update the prediction client code; the prediction code was copied from https://github.com/googlecodelabs/kubeflow-introduction, and that model used slightly different values for the input names and outputs.
* Add a test for the mnist_client code; currently it needs to be run manually.
* Fix the label selector for the mnist service so that it matches the TFServing deployment.
* Delete the old copy of mnist_client.py; we will go with the copy in web-ui from https://github.com/googlecodelabs/kubeflow-introduction.
* Delete model-deploy.yaml, model-train.yaml, and tf-user.yaml. The K8s resources for training and deploying the model are now in ks_app.
* Fix tensorboard; tensorboard only partially works behind Ambassador. It seems like some requests don't work behind a reverse proxy.
* Fix lint.
This commit is contained in:
parent b3f06c204d
commit 6770b4adcc
@ -1 +1,2 @@
build/**
web-ui/static/tmp/**

120	mnist/README.md
@ -464,14 +464,32 @@ There are various ways to monitor workflow/training job. In addition to using `k

TODO: This section needs to be updated

Configure TensorBoard to point to your model location

```
ks param set tensorboard --env=${KSENV} logDir ${LOGDIR}
```

Assuming you followed the directions above and used GCS, you can use the following value

```
LOGDIR=gs://${BUCKET}/${MODEL_PATH}
```

Then you can deploy TensorBoard

```
ks apply ${KSENV} -c tensorboard
```

To access TensorBoard using port-forwarding

```
kubectl -n jlewi port-forward service/tensorboard-tb 8090:80
```

TensorBoard can now be accessed at [http://127.0.0.1:8090](http://127.0.0.1:8090).
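Taken together, the TensorBoard steps above amount to the following. This is a minimal sketch, assuming `${KSENV}`, `${BUCKET}`, `${MODEL_PATH}`, and `${NAMESPACE}` have been set earlier in the guide (the command above hard-codes the `jlewi` namespace):

```
# Point the tensorboard component at the training output on GCS
LOGDIR=gs://${BUCKET}/${MODEL_PATH}
ks param set tensorboard --env=${KSENV} logDir ${LOGDIR}

# Deploy TensorBoard and forward a local port to its service
ks apply ${KSENV} -c tensorboard
kubectl -n ${NAMESPACE} port-forward service/tensorboard-tb 8090:80
```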

## Serving the model
@ -534,96 +552,34 @@ TODO: Add instructions

TODO: Add instructions

## Web Front End

The example comes with a simple web front end that can be used with your model.

To deploy the web front end

```
ks apply ${ENV} -c web-ui
```

### Connecting via port forwarding

To connect to the web app via port-forwarding

```
kubectl -n ${NAMESPACE} port-forward svc/web-ui 8080:80
```

You should now be able to open up the web app at [http://localhost:8080](http://localhost:8080).
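As a quick smoke test that the front end is actually serving, a sketch assuming `${ENV}` and `${NAMESPACE}` are set as above:

```
ks apply ${ENV} -c web-ui
kubectl -n ${NAMESPACE} port-forward svc/web-ui 8080:80 &

# The Flask app should answer with the landing page on the forwarded port
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/
```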
### Using IAP on GCP

If you are using GCP and have set up IAP then you can access the web UI at

```
https://${DEPLOYMENT}.endpoints.${PROJECT}.cloud.goog/${NAMESPACE}/mnist/
```

## Conclusion and Next Steps

This is an example of what your machine learning workflow can look like. Feel free to play with the tunables and see if you can increase your model's accuracy (increasing `model-train-steps` can go a long way).

@ -88,6 +88,12 @@

      contextDir: ".",
    },

    local uiSteps = subGraphTemplate {
      name: "web-ui",
      dockerFile: "./web-ui/Dockerfile",
      contextDir: "./web-ui"
    },

    steps: modelSteps.steps + ksonnetSteps.steps + uiSteps.steps,
    images: modelSteps.images + ksonnetSteps.images + uiSteps.images,
}
@ -52,10 +52,23 @@

    "mnist-service": {
      enablePrometheus: 'true',
      injectIstio: 'false',
      modelName: 'mnist',
      name: 'mnist-service',
      serviceType: 'ClusterIP',
      trafficRule: 'v1:100',
    },
    "tensorboard": {
      image: "tensorflow/tensorflow:1.11.0",
      logDir: "gs://example/to/model/logdir",
      name: "tensorboard",
    },
    "web-ui": {
      containerPort: 5000,
      image: "gcr.io/kubeflow-examples/mnist/web-ui:v20190112-v0.2-142-g3b38225",
      name: "web-ui",
      replicas: 1,
      servicePort: 80,
      type: "ClusterIP",
    },
  },
}
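These defaults can be overridden per environment with the usual ksonnet mechanism. A sketch, where the image tag and bucket are placeholders rather than values from this commit:

```
ks param set web-ui image gcr.io/<your-project>/mnist/web-ui:<tag> --env=${KSENV}
ks param set tensorboard logDir gs://<your-bucket>/<model-path> --env=${KSENV}
```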
@ -0,0 +1,114 @@
// TODO: Generalize to use S3. We can follow the pattern of training that
// takes parameters to specify environment variables and secret which can be customized
// for GCS, S3 as needed.
local env = std.extVar("__ksonnet/environments");
local params = std.extVar("__ksonnet/params").components.tensorboard;

local k = import "k.libsonnet";

local name = params.name;
local namespace = env.namespace;
local service = {
  apiVersion: "v1",
  kind: "Service",
  metadata: {
    name: name + "-tb",
    namespace: env.namespace,
    annotations: {
      "getambassador.io/config":
        std.join("\n", [
          "---",
          "apiVersion: ambassador/v0",
          "kind: Mapping",
          "name: " + name + "_mapping",
          "prefix: /" + env.namespace + "/tensorboard/mnist",
          "rewrite: /",
          "service: " + name + "-tb." + namespace,
          "---",
          "apiVersion: ambassador/v0",
          "kind: Mapping",
          "name: " + name + "_mapping_data",
          "prefix: /" + env.namespace + "/tensorboard/mnist/data/",
          "rewrite: /data/",
          "service: " + name + "-tb." + namespace,
        ]),
    },  //annotations
  },
  spec: {
    ports: [
      {
        name: "http",
        port: 80,
        targetPort: 80,
      },
    ],
    selector: {
      app: "tensorboard",
      "tb-job": name,
    },
  },
};

local deployment = {
  apiVersion: "apps/v1beta1",
  kind: "Deployment",
  metadata: {
    name: name + "-tb",
    namespace: env.namespace,
  },
  spec: {
    replicas: 1,
    template: {
      metadata: {
        labels: {
          app: "tensorboard",
          "tb-job": name,
        },
        name: name,
        namespace: namespace,
      },
      spec: {
        containers: [
          {
            command: [
              "/usr/local/bin/tensorboard",
              "--logdir=" + params.logDir,
              "--port=80",
            ],
            image: params.image,
            name: "tensorboard",
            ports: [
              {
                containerPort: 80,
              },
            ],
            env: [
              {
                name: "GOOGLE_APPLICATION_CREDENTIALS",
                value: "/secret/gcp-credentials/user-gcp-sa.json",
              },
            ],
            volumeMounts: [
              {
                mountPath: "/secret/gcp-credentials",
                name: "gcp-credentials",
              },
            ],
          },
        ],

        volumes: [
          {
            name: "gcp-credentials",
            secret: {
              secretName: "user-gcp-sa",
            },
          },
        ],
      },
    },
  },
};

std.prune(k.core.v1.list.new([service, deployment]))
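To see what this component renders before applying it, a sketch assuming a ksonnet environment named `${KSENV}`:

```
ks show ${KSENV} -c tensorboard   # print the generated Service and Deployment manifests
ks apply ${KSENV} -c tensorboard
```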
@ -0,0 +1,72 @@
local env = std.extVar("__ksonnet/environments");
local params = std.extVar("__ksonnet/params").components["web-ui"];
[
  {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
      "name": params.name,
      "namespace": env.namespace,
      annotations: {
        "getambassador.io/config":
          std.join("\n", [
            "---",
            "apiVersion: ambassador/v0",
            "kind: Mapping",
            "name: " + params.name + "_mapping",
            "prefix: /" + env.namespace + "/mnist/",
            "rewrite: /",
            "service: " + params.name + "." + env.namespace,
          ]),
      },  //annotations
    },
    "spec": {
      "ports": [
        {
          "port": params.servicePort,
          "targetPort": params.containerPort
        }
      ],
      "selector": {
        "app": params.name
      },
      "type": params.type
    }
  },
  {
    "apiVersion": "apps/v1beta2",
    "kind": "Deployment",
    "metadata": {
      "name": params.name,
      "namespace": env.namespace,
    },
    "spec": {
      "replicas": params.replicas,
      "selector": {
        "matchLabels": {
          "app": params.name
        },
      },
      "template": {
        "metadata": {
          "labels": {
            "app": params.name
          }
        },
        "spec": {
          "containers": [
            {
              "image": params.image,
              "name": params.name,
              "ports": [
                {
                  "containerPort": params.containerPort
                }
              ]
            }
          ]
        }
      }
    }
  }
]
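After applying this component, a quick check that the Service and Deployment exist; `${KSENV}` and `${NAMESPACE}` are placeholders for your environment and namespace:

```
ks apply ${KSENV} -c web-ui
kubectl -n ${NAMESPACE} get deployment,service web-ui
```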
@ -22,9 +22,10 @@ local envParams = params + {

      namespace: 'jlewi',
    },
    "mnist-service"+: {
      namespace: 'jlewi',
    },
    tensorboard+: {
      logDir: 'gs://kubeflow-ci_temp/mnist-jlewi/',
    },
  },
};
@ -1,50 +0,0 @@
#!/usr/bin/env python2.7

import os
import random
import numpy

from PIL import Image

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

from grpc.beta import implementations

from mnist import MNIST  # pylint: disable=no-name-in-module

TF_MODEL_SERVER_HOST = os.getenv("TF_MODEL_SERVER_HOST", "127.0.0.1")
TF_MODEL_SERVER_PORT = int(os.getenv("TF_MODEL_SERVER_PORT", 9000))
TF_DATA_DIR = os.getenv("TF_DATA_DIR", "/tmp/data/")
TF_MNIST_IMAGE_PATH = os.getenv("TF_MNIST_IMAGE_PATH", None)
TF_MNIST_TEST_IMAGE_NUMBER = int(os.getenv("TF_MNIST_TEST_IMAGE_NUMBER", -1))

if TF_MNIST_IMAGE_PATH != None:
  raw_image = Image.open(TF_MNIST_IMAGE_PATH)
  int_image = numpy.array(raw_image)
  image = numpy.reshape(int_image, 784).astype(numpy.float32)
elif TF_MNIST_TEST_IMAGE_NUMBER > -1:
  test_data_set = input_data.read_data_sets(TF_DATA_DIR, one_hot=True).test
  image = test_data_set.images[TF_MNIST_TEST_IMAGE_NUMBER]
else:
  test_data_set = input_data.read_data_sets(TF_DATA_DIR, one_hot=True).test
  image = random.choice(test_data_set.images)

channel = implementations.insecure_channel(
    TF_MODEL_SERVER_HOST, TF_MODEL_SERVER_PORT)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "mnist"
request.model_spec.signature_name = "serving_default"
request.inputs['x'].CopyFrom(
    tf.contrib.util.make_tensor_proto(image, shape=[1, 28, 28]))

result = stub.Predict(request, 10.0)  # 10 secs timeout

print(result)
print(MNIST.display(image, threshold=0))
print("Your model says the above number is... %d!" %
      result.outputs["classes"].int_val[0])
@ -1,144 +0,0 @@
(model-deploy.yaml deleted, 144 lines: the Argo workflow whose deploy-model entrypoint read the training workflow's output parameters — S3 model/export URLs, AWS secret, region, endpoint, TFServing image, service type, model name — and then used the kubeflow/tf-serving ksonnet prototype to generate and apply the serving resources. These K8s resources now live in ks_app.)
@ -1,364 +0,0 @@
(model-train.yaml deleted, 364 lines: the Argo workflow that launched a TFJob with Master/Worker/Ps replicas built from the tf-model-image, passing S3/AWS credentials and TF_* hyperparameters as environment variables, deployed a per-job TensorBoard Deployment and Service, optionally served the exported model with the kubeflow/tf-serving ksonnet prototype, and deleted the TFJob in an exit handler. Training and deployment are now handled by the resources in ks_app.)
@ -1,97 +0,0 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tf-user
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - pods/exec
  verbs:
  - create
  - get
  - list
  - watch
  - update
  - patch
- apiGroups:
  - ""
  resources:
  - configmaps
  - serviceaccounts
  - secrets
  verbs:
  - get
  - watch
  - list
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  verbs:
  - create
  - delete
- apiGroups:
  - ""
  resources:
  - services
  verbs:
  - create
  - get
  - list
  - watch
  - update
  - patch
- apiGroups:
  - apps
  - extensions
  resources:
  - deployments
  verbs:
  - create
  - get
  - list
  - watch
  - update
  - patch
  - delete
- apiGroups:
  - argoproj.io
  resources:
  - workflows
  verbs:
  - get
  - list
  - watch
  - update
  - patch
- apiGroups:
  - kubeflow.org
  resources:
  - tfjobs
  verbs:
  - create
  - get
  - list
  - watch
  - update
  - patch
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tf-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tf-user
subjects:
- kind: ServiceAccount
  name: tf-user
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tf-user
@ -0,0 +1,50 @@
FROM ubuntu:16.04
MAINTAINER "Daniel Sanche"

# add TF dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    curl \
    libfreetype6-dev \
    libpng12-dev \
    libzmq3-dev \
    pkg-config \
    python3 \
    python-dev \
    rsync \
    software-properties-common \
    unzip \
    && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# add python dependencies
RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python get-pip.py && \
    rm get-pip.py

RUN pip --no-cache-dir install \
    Pillow \
    h5py \
    ipykernel \
    numpy \
    tensorflow \
    tensorflow-serving-api \
    flask \
    && \
    python -m ipykernel.kernelspec

# show python logs as they occur
ENV PYTHONUNBUFFERED=0

# add project files
ADD *.py /home/
ADD templates/* /home/templates/
ADD static/styles /home/static/styles/
RUN mkdir /home/static/tmp/
ADD static/scripts/ /home/static/scripts/

# start server on port 5000
WORKDIR /home/
EXPOSE 5000
ENTRYPOINT python flask_server.py
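To build and run the image locally, a sketch run from the `mnist` directory; the tag is illustrative (the CI pipeline builds and tags this image via the web-ui step added above):

```
docker build -t mnist-web-ui:local ./web-ui
docker run -p 5000:5000 mnist-web-ui:local
```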
Binary files (4) not shown.
@ -0,0 +1,14 @@
# web-ui

The files in this folder define a web interface that can be used to interact with a TensorFlow server

- flask_server.py
  - main server code. Handles incoming requests, and renders HTML from template
- mnist_client.py
  - code to interact with TensorFlow model server
  - takes in an image and server details, and returns the server's response
- Dockerfile
  - builds a runnable container out of the files in this directory

---
This is not an officially supported Google product
@ -0,0 +1,93 @@
'''
Copyright 2018 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
'''

import logging
import os
from threading import Timer
import uuid

from flask import Flask, render_template, request
from mnist_client import get_prediction, random_mnist

app = Flask(__name__)


# handle requests to the server
@app.route("/")
def main():
  # get url parameters for HTML template
  name_arg = request.args.get('name', 'mnist')
  addr_arg = request.args.get('addr', 'mnist-service')
  port_arg = request.args.get('port', '9000')
  args = {"name": name_arg, "addr": addr_arg, "port": port_arg}
  logging.info("Request args: %s", args)

  output = None
  connection = {"text": "", "success": False}
  img_id = str(uuid.uuid4())
  img_path = "static/tmp/" + img_id + ".png"
  try:
    # get a random test MNIST image
    x, y, _ = random_mnist(img_path)
    # get prediction from TensorFlow server
    pred, scores, ver = get_prediction(x,
                                       server_host=addr_arg,
                                       server_port=int(port_arg),
                                       server_name=name_arg,
                                       timeout=10)
    # if no exceptions thrown, server connection was a success
    connection["text"] = "Connected (model version: " + str(ver) + ")"
    connection["success"] = True
    # parse class confidence scores from server prediction
    scores_dict = []
    for i in range(0, 10):
      scores_dict += [{"index": str(i), "val": scores[i]}]
    output = {"truth": y, "prediction": pred,
              "img_path": img_path, "scores": scores_dict}
  except Exception as e:  # pylint: disable=broad-except
    logging.info("Exception occured: %s", e)
    # server connection failed
    connection["text"] = "Exception making request: {0}".format(e)
  # after 10 seconds, delete cached image file from server
  t = Timer(10.0, remove_resource, [img_path])
  t.start()
  # render results using HTML template
  return render_template('index.html', output=output,
                         connection=connection, args=args)


def remove_resource(path):
  """
  attempt to delete file from path. Used to clean up MNIST testing images

  :param path: the path of the file to delete
  """
  try:
    os.remove(path)
    print("removed " + path)
  except OSError:
    print("no file at " + path)


if __name__ == '__main__':
  logging.basicConfig(level=logging.INFO,
                      format=('%(levelname)s|%(asctime)s'
                              '|%(pathname)s|%(lineno)d| %(message)s'),
                      datefmt='%Y-%m-%dT%H:%M:%S',
                      )
  logging.getLogger().setLevel(logging.INFO)
  logging.info("Starting flask.")
  app.run(debug=True, host='0.0.0.0')
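To exercise this server outside the cluster, a sketch: port-forward the TFServing endpoint and start Flask locally. The `name`, `addr`, and `port` query parameters correspond to the defaults read in `main()` above; `${NAMESPACE}` is a placeholder:

```
kubectl -n ${NAMESPACE} port-forward service/mnist-service 9000:9000 &
python flask_server.py
# then open http://localhost:5000/?name=mnist&addr=localhost&port=9000
```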
@ -0,0 +1,92 @@
#!/usr/bin/env python2.7
'''
Copyright 2018 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
'''

from __future__ import print_function

import logging

from grpc.beta import implementations
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

from PIL import Image


def get_prediction(image, server_host='127.0.0.1', server_port=9000,
                   server_name="server", timeout=10.0):
  """
  Retrieve a prediction from a TensorFlow model server

  :param image: a MNIST image represented as a 1x784 array
  :param server_host: the address of the TensorFlow server
  :param server_port: the port used by the server
  :param server_name: the name of the server
  :param timeout: the amount of time to wait for a prediction to complete
  :return 0: the integer predicted in the MNIST image
  :return 1: the confidence scores for all classes
  :return 2: the version number of the model handling the request
  """

  print("connecting to:%s:%i" % (server_host, server_port))
  # initialize to server connection
  channel = implementations.insecure_channel(server_host, server_port)
  stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

  # build request
  request = predict_pb2.PredictRequest()
  request.model_spec.name = server_name
  request.model_spec.signature_name = 'serving_default'
  request.inputs['x'].CopyFrom(
      tf.contrib.util.make_tensor_proto(image, shape=image.shape))

  # retrieve results
  result = stub.Predict(request, timeout)
  resultVal = result.outputs["classes"].int_val[0]
  scores = result.outputs['predictions'].float_val
  version = result.outputs["classes"].int_val[0]
  return resultVal, scores, version


def random_mnist(save_path=None):
  """
  Pull a random image out of the MNIST test dataset
  Optionally save the selected image as a file to disk

  :param savePath: the path to save the file to. If None, file is not saved
  :return 0: a 1x784 representation of the MNIST image
  :return 1: the ground truth label associated with the image
  :return 2: a bool representing whether the image file was saved to disk
  """

  mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
  batch_size = 1
  batch_x, batch_y = mnist.test.next_batch(batch_size)
  saved = False
  if save_path is not None:
    # save image file to disk
    try:
      data = (batch_x * 255).astype(np.uint8).reshape(28, 28)
      img = Image.fromarray(data, 'L')
      img.save(save_path)
      saved = True
    except Exception as e:  # pylint: disable=broad-except
      logging.error("There was a problem saving the image; %s", e)
  return batch_x, np.argmax(batch_y), saved
@ -0,0 +1,57 @@
"""Test mnist_client.

This file tests that we can send predictions to the model.

It is an integration test as it depends on having access to
a deployed model.

Python Path Requirements:
  kubeflow/testing/py - https://github.com/kubeflow/testing/tree/master/py
    * Provides utilities for testing

Manually running the test
  1. Configure your KUBECONFIG file to point to the desired cluster
  2. Use kubectl port-forward to forward a local port
     to the gRPC port of TFServing; e.g.
     kubectl -n ${NAMESPACE} port-forward service/mnist-service 9000:9000
"""

import os

import mnist_client

from py import test_runner

from kubeflow.testing import test_util

class MnistClientTest(test_util.TestCase):
  def __init__(self, args):
    self.args = args
    super(MnistClientTest, self).__init__(class_name="MnistClientTest",
                                          name="MnistClientTest")

  def test_predict(self):  # pylint: disable=no-self-use
    this_dir = os.path.dirname(__file__)
    data_dir = os.path.join(this_dir, "..", "data")
    img_path = os.path.abspath(data_dir)

    x, _, _ = mnist_client.random_mnist(img_path)

    server_host = "localhost"
    server_port = 9000
    model_name = "mnist"
    # get prediction from TensorFlow server
    pred, scores, _ = mnist_client.get_prediction(
        x, server_host=server_host, server_port=server_port,
        server_name=model_name, timeout=10)

    if pred < 0 or pred >= 10:
      raise ValueError("Prediction {0} is not in the range [0, 9]".format(pred))

    if len(scores) != 10:
      raise ValueError("Scores should have dimension 10. Got {0}".format(
          scores))
    # TODO(jlewi): Should we do any additional validation?

if __name__ == "__main__":
  test_runner.main(module=__name__)
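A sketch of running this test by hand per the docstring; the test file name and the checkout path for kubeflow/testing are assumptions, not shown in this view:

```
# (assumes the port-forward from step 2 of the docstring is already running)
export PYTHONPATH=${PYTHONPATH}:<kubeflow-testing-checkout>/py
python mnist_client_test.py
```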
File diff suppressed because one or more lines are too long
@@ -0,0 +1,238 @@
/**
 * Copyright 2015 Google Inc. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

html, body {
  font-family: 'Roboto', 'Helvetica', sans-serif;
  margin: 0;
  padding: 0;
}
.mdl-demo .mdl-layout__header-row {
  padding-left: 40px;
}
.mdl-demo .mdl-layout.is-small-screen .mdl-layout__header-row h3 {
  font-size: inherit;
}
.mdl-demo .mdl-layout__tab-bar-button {
  display: none;
}
.mdl-demo .mdl-layout.is-small-screen .mdl-layout__tab-bar .mdl-button {
  display: none;
}
.mdl-demo .mdl-layout:not(.is-small-screen) .mdl-layout__tab-bar,
.mdl-demo .mdl-layout:not(.is-small-screen) .mdl-layout__tab-bar-container {
  overflow: visible;
}
.mdl-demo .mdl-layout__tab-bar-container {
  height: 64px;
}
.mdl-demo .mdl-layout__tab-bar {
  padding: 0;
  padding-left: 16px;
  box-sizing: border-box;
  height: 100%;
  width: 100%;
}
.mdl-demo .mdl-layout__tab-bar .mdl-layout__tab {
  height: 64px;
  line-height: 64px;
}
.mdl-demo .mdl-layout__tab-bar .mdl-layout__tab.is-active::after {
  background-color: white;
  height: 4px;
}
.mdl-demo main > .mdl-layout__tab-panel {
  padding: 8px;
  padding-top: 48px;
}
.mdl-demo .mdl-card {
  height: auto;
  display: -webkit-flex;
  display: -ms-flexbox;
  display: flex;
  -webkit-flex-direction: column;
  -ms-flex-direction: column;
  flex-direction: column;
}
.mdl-demo .mdl-card > * {
  height: auto;
}
.mdl-demo .mdl-card .mdl-card__supporting-text {
  margin: 40px;
  -webkit-flex-grow: 1;
  -ms-flex-positive: 1;
  flex-grow: 1;
  padding: 0;
  color: inherit;
  width: calc(100% - 80px);
}
.mdl-demo.mdl-demo .mdl-card__supporting-text h4 {
  margin-top: 0;
  margin-bottom: 20px;
}
.mdl-demo .mdl-card__actions {
  margin: 0;
  padding: 4px 40px;
  color: inherit;
}
.mdl-demo .mdl-card__actions a {
  color: #00BCD4;
  margin: 0;
}
.mdl-demo .mdl-card__actions a:hover,
.mdl-demo .mdl-card__actions a:active {
  color: inherit;
  background-color: transparent;
}
.mdl-demo .mdl-card__supporting-text + .mdl-card__actions {
  border-top: 1px solid rgba(0, 0, 0, 0.12);
}
.mdl-demo #add {
  position: absolute;
  right: 40px;
  top: 36px;
  z-index: 999;
}

.mdl-demo .mdl-layout__content section:not(:last-of-type) {
  position: relative;
  margin-bottom: 48px;
}
.mdl-demo section.section--center {
  max-width: 860px;
}
.mdl-demo #features section.section--center {
  max-width: 620px;
}
.mdl-demo section > header{
  display: -webkit-flex;
  display: -ms-flexbox;
  display: flex;
  -webkit-align-items: center;
  -ms-flex-align: center;
  align-items: center;
  -webkit-justify-content: center;
  -ms-flex-pack: center;
  justify-content: center;
}
.mdl-demo section > .section__play-btn {
  min-height: 200px;
}
.mdl-demo section > header > .material-icons {
  font-size: 3rem;
}
.mdl-demo section > button {
  position: absolute;
  z-index: 99;
  top: 8px;
  right: 8px;
}
.mdl-demo section .section__circle {
  display: -webkit-flex;
  display: -ms-flexbox;
  display: flex;
  -webkit-align-items: center;
  -ms-flex-align: center;
  align-items: center;
  -webkit-justify-content: flex-start;
  -ms-flex-pack: start;
  justify-content: flex-start;
  -webkit-flex-grow: 0;
  -ms-flex-positive: 0;
  flex-grow: 0;
  -webkit-flex-shrink: 1;
  -ms-flex-negative: 1;
  flex-shrink: 1;
}
.mdl-demo section .section__text {
  -webkit-flex-grow: 1;
  -ms-flex-positive: 1;
  flex-grow: 1;
  -webkit-flex-shrink: 0;
  -ms-flex-negative: 0;
  flex-shrink: 0;
  padding-top: 8px;
}
.mdl-demo section .section__text h5 {
  font-size: inherit;
  margin: 0;
  margin-bottom: 0.5em;
}
.mdl-demo section .section__text a {
  text-decoration: none;
}
.mdl-demo section .section__circle-container > .section__circle-container__circle {
  width: 64px;
  height: 64px;
  border-radius: 32px;
  margin: 8px 0;
}
.mdl-demo section.section--footer .section__circle--big {
  width: 100px;
  height: 100px;
  border-radius: 50px;
  margin: 8px 32px;
}
.mdl-demo .is-small-screen section.section--footer .section__circle--big {
  width: 50px;
  height: 50px;
  border-radius: 25px;
  margin: 8px 16px;
}
.mdl-demo section.section--footer {
  padding: 64px 0;
  margin: 0 -8px -8px -8px;
}
.mdl-demo section.section--center .section__text:not(:last-child) {
  border-bottom: 1px solid rgba(0,0,0,.13);
}
.mdl-demo .mdl-card .mdl-card__supporting-text > h3:first-child {
  margin-bottom: 24px;
}
.mdl-demo .mdl-layout__tab-panel:not(#overview) {
  background-color: white;
}
.mdl-demo #features section {
  margin-bottom: 72px;
}
.mdl-demo #features h4, #features h5 {
  margin-bottom: 16px;
}
.mdl-demo .toc {
  border-left: 4px solid #C1EEF4;
  margin: 24px;
  padding: 0;
  padding-left: 8px;
  display: -webkit-flex;
  display: -ms-flexbox;
  display: flex;
  -webkit-flex-direction: column;
  -ms-flex-direction: column;
  flex-direction: column;
}
.mdl-demo .toc h4 {
  font-size: 0.9rem;
  margin-top: 0;
}
.mdl-demo .toc a {
  color: #4DD0E1;
  text-decoration: none;
  font-size: 16px;
  line-height: 28px;
  display: block;
}
.mdl-demo .mdl-menu__container {
  z-index: 99;
}
File diff suppressed because one or more lines are too long
@@ -0,0 +1,115 @@
<!--
Copyright 2018 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, minimum-scale=1.0">
    <title>Kubeflow UI</title>
    <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:regular,bold,italic,thin,light,bolditalic,black,medium&lang=en">
    <link rel="stylesheet" href="https://fonts.googleapis.com/icon?family=Material+Icons">
    <link rel="stylesheet" href="static/styles/material.deep_purple-pink.min.css">
    <link rel="stylesheet" href="static/styles/demo.css">
    <script src="static/scripts/material.min.js"></script>
  </head>
  <body class="mdl-demo mdl-color--grey-100 mdl-color-text--grey-700 mdl-base">
    <div class="mdl-layout mdl-js-layout mdl-layout--fixed-header">
      <header class="mdl-layout__header mdl-layout__header--scroll mdl-color--primary">
        <div class="mdl-layout--large-screen-only mdl-layout__header-row"></div>
        <div class="mdl-layout__header-row">
          <h3>Kubeflow Codelab UI</h3>
        </div>
      </header>
      <main class="mdl-layout__content">
        <!-- render server connection status -->
        <section class="section--center mdl-grid mdl-grid--no-spacing mdl-shadow--2dp">
          <div class="mdl-card mdl-cell mdl-cell--12-col">
            <div class="mdl-card__supporting-text">
              <h4>MNIST Model Server</h4>
              <form action="/">
                <div class="mdl-textfield mdl-js-textfield mdl-textfield--floating-label" style="width:250px">
                  <input class="mdl-textfield__input" type="text" id="server-name" name="name" value="{{ args.name }}">
                  <label class="mdl-textfield__label" for="sample1">Model Name</label>
                </div>
                <div class="mdl-textfield mdl-js-textfield mdl-textfield--floating-label" style="width:250px">
                  <input class="mdl-textfield__input" type="text" id="server-address" name="addr" value="{{ args.addr }}">
                  <label class="mdl-textfield__label" for="sample1">Server Address</label>
                </div>
                <div class="mdl-textfield mdl-js-textfield mdl-textfield--floating-label" style="width:100px">
                  <input class="mdl-textfield__input" type="text" name="port"
                         pattern="^([0-9]{1,4}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$"
                         id="server-port" value="{{ args.port }}">
                  <label class="mdl-textfield__label" for="sample2">Port</label>
                  <span class="mdl-textfield__error">Input is not a valid port</span>
                </div>
                <button class="mdl-button mdl-js-button">Connect</button>
              </form>
              {% if connection.success %}
              <h6><font color="#388E3C">✓ {{ connection.text }}</font></h6>
              {% else %}
              <h6><font color="#C62828">❗ {{ connection.text }}</font></h6>
              {% endif %}
            </div>
          </div>
        </section>
        <!-- if connected to server, render testing results -->
        {% if output %}
        <section class="section--center mdl-grid mdl-grid--no-spacing mdl-shadow--2dp">
          <div class="mdl-card mdl-cell mdl-cell--12-col">
            <div class="mdl-card__supporting-text">
              <h4>Test Results</h4>
              <img src={{ output.img_path }}
                   style="width:128px;height:128px;display:block;margin:auto;">
              <br><br>
              <table class="mdl-data-table mdl-js-data-table mdl-shadow--2dp" style="margin:auto;width:40%">
                <tbody>
                  <tr>
                    <td class="mdl-data-table__cell--non-numeric"><b>Truth</b></td>
                    <td><b>{{ output.truth }}</b></td>
                  </tr>
                  <tr>
                    <td class="mdl-data-table__cell--non-numeric"><b>Prediction</b></td>
                    <td><b> {{ output.prediction }}</b></td>
                  </tr>
                  {% for score in output.scores %}
                  <tr>
                    <td class="mdl-data-table__cell--non-numeric">Probability {{ score.index }}:</td>
                    <td>
                      <div id="progressbar{{ score.index }}"
                           class="mdl-progress mdl-js-progress"
                           style="width:120;"></div>
                      <script language="javascript">
                        document.querySelector('#progressbar{{ score.index }}').addEventListener('mdl-componentupgraded',
                          function() { this.MaterialProgress.setProgress({{ score.val * 100 }}); })
                      </script>
                    </td>
                  </tr>
                  {% endfor %}
                </tbody>
              </table>
              <br><br>
              <button type="button"
                      class="mdl-button mdl-js-button mdl-button--raised mdl-js-ripple-effect mdl-color--accent mdl-color-text--accent-contrast"
                      onClick="window.location.reload()" style="margin:auto;display:block">Test Random Image</button>
            </div>
          </div>
        </section>
        {% endif %}
      </main>
    </div>
  </body>
</html>
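The page above is a Jinja template: the `args`, `connection`, and `output` values it references are supplied by the Flask app that serves the web UI and queries the model server. Below is a minimal sketch of a handler that could render it; the route, default values, and canned prediction data are assumptions for illustration, not the example's actual client code, which fills in `connection` and `output` from a TensorFlow Serving prediction call.

```python
# Hypothetical sketch: render the MNIST web UI template with placeholder data.
# The real web-ui app replaces the hard-coded `connection`/`output` values
# with the results of a TensorFlow Serving request.
from flask import Flask, render_template, request

app = Flask(__name__)

@app.route("/")
def index():
    # The connection form submits name/addr/port as query parameters.
    args = {
        "name": request.args.get("name", "mnist"),          # assumed defaults
        "addr": request.args.get("addr", "mnist-service"),
        "port": request.args.get("port", "9000"),
    }
    connection = {"success": True, "text": "Connected to model server"}
    output = {
        "img_path": "static/tmp/sample.png",                 # hypothetical path
        "truth": 7,
        "prediction": 7,
        "scores": [{"index": i, "val": 1.0 if i == 7 else 0.0} for i in range(10)],
    }
    return render_template("index.html",
                           args=args, connection=connection, output=output)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```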