{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MNIST E2E on Kubeflow on IBM Cloud Kubernetes Service\n", "\n", "This example guides you through:\n", " \n", " 1. Taking an example TensorFlow model and modifying it to support distributed training\n", " 1. Serving the resulting model using TFServing\n", " 1. Deploying and using a web app that uses the model\n", " \n", "## Requirements\n", "\n", " * You must be [running Kubeflow 1.0 on IBM Cloud Kubernetes Service](https://www.kubeflow.org/docs/ibm/install-kubeflow/).\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Required Services and Credentials\n", "\n", "Before proceeding, provision the required IBM services and enter their credentials below.\n", "\n", "IBM Cloud Object Storage (COS): https://cloud.ibm.com/catalog/services/cloud-object-storage" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Tip**: follow the steps below to access your COS instance dashboard. From the [IBM Cloud dashboard](https://cloud.ibm.com/resources):\n", "\n", "- Click the **Storage** tab\n", "- Select and click your target object storage (COS)\n", "\n", "\n", "**Create new credentials with HMAC**:\n", "\n", " - Go to your COS dashboard (see the above **Tip**).\n", " - In the **Service credentials** tab, click **New Credential+**.\n", " - In the **Add Inline Configuration Parameters (Optional)** box, add `{\"HMAC\":true}`.\n", " - Click **Add**. (For more information, see HMAC.)\n", " \n", "**Replace** the information in the following cell with your COS credentials.\n", "\n", "You can find these credentials in your COS instance dashboard under the **Service credentials** tab." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cos_credentials = {\n", "  \"apikey\": \"-------\",\n", "  \"cos_hmac_keys\": {\n", "    \"access_key_id\": \"------\",\n", "    \"secret_access_key\": \"------\"\n", "  },\n", "  \"endpoints\": \"https://cos-service.bluemix.net/endpoints\",\n", "  \"iam_apikey_description\": \"------\",\n", "  \"iam_apikey_name\": \"------\",\n", "  \"iam_role_crn\": \"------\",\n", "  \"iam_serviceid_crn\": \"------\",\n", "  \"resource_instance_id\": \"-------\"\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define the endpoint.\n", "\n", "Go to the **Endpoint** tab in the COS instance's dashboard to get the endpoint information, then assign it to `service_endpoint` in the cell below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "service_endpoint = 's3.us.cloud-object-storage.appdomain.cloud'\n", "service_endpoint_with_https = \"https://\" + service_endpoint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare model\n", "\n", "Existing distributed MNIST examples need a few changes before they run well as a TFJob.\n", "\n", "Specifically, we must:\n", "\n", "1. Add options in order to make the model configurable.\n", "1. Use `tf.estimator.train_and_evaluate` to enable model exporting and serving.\n", "1. Define serving signatures for model serving.\n", "\n", "The resulting model is [model.py](model.py)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Install Required Libraries\n", "\n", "Import the libraries required to train this model." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import logging\n", "import os\n", "import uuid\n", "from importlib import reload\n", "import notebook_setup\n", "reload(notebook_setup)\n", "notebook_setup.notebook_setup(platform=None)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import k8s_util\n", "# Force a reload of kubeflow; since kubeflow is a multi namespace module\n", "# it looks like doing this in notebook_setup may not be sufficient\n", "import kubeflow\n", "reload(kubeflow)\n", "from kubernetes import client as k8s_client\n", "from kubernetes import config as k8s_config\n", "from kubeflow.tfjob.api import tf_job_client as tf_job_client_module\n", "from IPython.core.display import display, HTML\n", "import yaml" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configure docker credentials\n", "\n", "Get your docker registry user and password encoded in base64
\n", "\n", "`echo -n USER:PASSWORD | base64`
 \n", "\n", "Update the `auths` section of the config below with your Docker registry URL and the base64 string generated above
\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "config={\n", " \"auths\": {\n", " \"https://index.docker.io/v1/\": {\n", " \"auth\": \"xxxxxxxxxxxxxxx\"\n", " }\n", " }\n", "}\n", "with open('config.json', 'w') as outfile:\n", " json.dump(config, outfile)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a config-map in the namespace you're using with the docker config\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# !kubectl delete configmap docker-config\n", "!kubectl create configmap docker-config --from-file=config.json\n", "!rm config.json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Update the `DOCKER_REGISTRY` and build the training image using Kaniko" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from kubernetes import client as k8s_client\n", "from kubernetes.client import rest as k8s_rest\n", "from kubeflow import fairing \n", "from kubeflow.fairing import utils as fairing_utils\n", "from kubeflow.fairing.builders import append\n", "from kubeflow.fairing.deployers import job\n", "from kubeflow.fairing.preprocessors import base as base_preprocessor\n", "\n", "# Update the DOCKER_REGISTRY to your docker registry!!\n", "DOCKER_REGISTRY = \"dockerregistry\"\n", "namespace = fairing_utils.get_current_k8s_namespace()\n", "\n", "cos_username = cos_credentials['cos_hmac_keys']['access_key_id']\n", "cos_key = cos_credentials['cos_hmac_keys']['secret_access_key']\n", "cos_region = \"us-east-1\"\n", "\n", "logging.info(f\"Running in namespace {namespace}\")\n", "logging.info(f\"Using docker registry {DOCKER_REGISTRY}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# TODO(https://github.com/kubeflow/fairing/issues/426): We should get rid of this once the default \n", "# Kaniko image is updated to a 
newer image than 0.7.0.\n", "from kubeflow.fairing import constants\n", "constants.constants.KANIKO_IMAGE = \"gcr.io/kaniko-project/executor:v0.14.0\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from kubeflow.fairing.builders import cluster\n", "\n", "# output_map is a map of extra files to add to the Docker build context.\n", "# It is a map from source location to the location inside the context.\n", "output_map = {\n", "    \"Dockerfile.model\": \"Dockerfile\",\n", "    \"model.py\": \"model.py\"\n", "}\n", "\n", "\n", "preprocessor = base_preprocessor.BasePreProcessor(\n", "    command=[\"python\"],  # The base class will set this.\n", "    input_files=[],\n", "    path_prefix=\"/app\",  # irrelevant since we aren't preprocessing any files\n", "    output_map=output_map)\n", "\n", "preprocessor.preprocess()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Use a TensorFlow image as the base image.\n", "# The base image is set in a custom Dockerfile.\n", "from kubeflow.fairing.cloud.k8s import MinioUploader\n", "from kubeflow.fairing.builders.cluster.minio_context import MinioContextSource\n", "minio_uploader = MinioUploader(endpoint_url=service_endpoint_with_https, minio_secret=cos_username, minio_secret_key=cos_key, region_name=cos_region)\n", "minio_context_source = MinioContextSource(endpoint_url=service_endpoint_with_https, minio_secret=cos_username, minio_secret_key=cos_key, region_name=cos_region)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# TODO: Add IBM Container registry as part of the fairing SDK.\n", "cluster_builder = cluster.cluster.ClusterBuilder(\n", "    registry=DOCKER_REGISTRY,\n", "    base_image=\"\",  # base_image is set in the Dockerfile\n", "    preprocessor=preprocessor,\n", "    image_name=\"mnist\",\n", "    dockerfile_path=\"Dockerfile\",\n", "    context_source=minio_context_source)\n", "cluster_builder.build()\n", "logging.info(f\"Built image 
{cluster_builder.image_tag}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create an Object Storage Bucket\n", "\n", "* Create an object storage bucket to store our models and other results." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mnist_bucket = f\"{DOCKER_REGISTRY}-mnist\"\n", "minio_uploader.create_bucket(mnist_bucket)\n", "logging.info(f\"Bucket {mnist_bucket} created or already exists\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Distributed training\n", "\n", "* We will train the model by using TFJob to run a distributed training job" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training job parameters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_name = f\"mnist-train-{uuid.uuid4().hex[:4]}\"\n", "num_ps = 1\n", "num_workers = 2\n", "model_dir = f\"s3://{mnist_bucket}/mnist\"\n", "export_path = f\"s3://{mnist_bucket}/mnist/export\"\n", "train_steps = 200\n", "batch_size = 100\n", "learning_rate = .01\n", "image = cluster_builder.image_tag" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "train_spec = f\"\"\"apiVersion: kubeflow.org/v1\n", "kind: TFJob\n", "metadata:\n", "  name: {train_name}\n", "spec:\n", "  tfReplicaSpecs:\n", "    Ps:\n", "      replicas: {num_ps}\n", "      template:\n", "        metadata:\n", "          annotations:\n", "            sidecar.istio.io/inject: \"false\"\n", "        spec:\n", "          serviceAccount: default-editor\n", "          containers:\n", "          - name: tensorflow\n", "            command:\n", "            - python\n", "            - /opt/model.py\n", "            - --tf-model-dir={model_dir}\n", "            - --tf-export-dir={export_path}\n", "            - --tf-train-steps={train_steps}\n", "            - --tf-batch-size={batch_size}\n", "            - --tf-learning-rate={learning_rate}\n", "            env:\n", "            - name: S3_ENDPOINT\n", "              value: {service_endpoint}\n", "            - name: AWS_REGION\n", "              value: {cos_region}\n", "            - name: BUCKET_NAME\n", "              value: {mnist_bucket}\n", "            - name: S3_USE_HTTPS\n", "              value: \"1\"\n", "            - name: S3_VERIFY_SSL\n", "              value: \"1\"\n", "            - name: AWS_ACCESS_KEY_ID\n", "              value: {cos_username}\n", "            - name: AWS_SECRET_ACCESS_KEY\n", "              value: {cos_key}\n", "            image: {image}\n", "            workingDir: /opt\n", "          restartPolicy: OnFailure\n", "    Chief:\n", "      replicas: 1\n", "      template:\n", "        metadata:\n", "          annotations:\n", "            sidecar.istio.io/inject: \"false\"\n", "        spec:\n", "          serviceAccount: default-editor\n", "          containers:\n", "          - name: tensorflow\n", "            command:\n", "            - python\n", "            - /opt/model.py\n", "            - --tf-model-dir={model_dir}\n", "            - --tf-export-dir={export_path}\n", "            - --tf-train-steps={train_steps}\n", "            - --tf-batch-size={batch_size}\n", "            - --tf-learning-rate={learning_rate}\n", "            env:\n", "            - name: S3_ENDPOINT\n", "              value: {service_endpoint}\n", "            - name: AWS_REGION\n", "              value: {cos_region}\n", "            - name: BUCKET_NAME\n", "              value: {mnist_bucket}\n", "            - name: S3_USE_HTTPS\n", "              value: \"1\"\n", "            - name: S3_VERIFY_SSL\n", "              value: \"1\"\n", "            - name: AWS_ACCESS_KEY_ID\n", "              value: {cos_username}\n", "            - name: AWS_SECRET_ACCESS_KEY\n", "              value: {cos_key}\n", "            image: {image}\n", "            workingDir: /opt\n", "          restartPolicy: OnFailure\n", "    Worker:\n", "      replicas: {num_workers}\n", "      template:\n", "        metadata:\n", "          annotations:\n", "            sidecar.istio.io/inject: \"false\"\n", "        spec:\n", "          serviceAccount: default-editor\n", "          containers:\n", "          - name: tensorflow\n", "            command:\n", "            - python\n", "            - /opt/model.py\n", "            - --tf-model-dir={model_dir}\n", "            - --tf-export-dir={export_path}\n", "            - --tf-train-steps={train_steps}\n", "            - --tf-batch-size={batch_size}\n", "            - --tf-learning-rate={learning_rate}\n", "            env:\n", "            - name: S3_ENDPOINT\n", "              value: {service_endpoint}\n", "            - name: AWS_REGION\n", "              value: {cos_region}\n", "            - name: BUCKET_NAME\n", "              value: {mnist_bucket}\n", "            - name: S3_USE_HTTPS\n", "              value: \"1\"\n", "            - name: S3_VERIFY_SSL\n", "              value: \"1\"\n", "            - name: AWS_ACCESS_KEY_ID\n", "              value: {cos_username}\n", "            - name: AWS_SECRET_ACCESS_KEY\n", "              value: {cos_key}\n", "            image: {image}\n", "            workingDir: /opt\n", "          restartPolicy: OnFailure\n", "\"\"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create the training job\n", "\n", "* You could write the spec to a YAML file and then do `kubectl apply -f {FILE}`\n", "* Since you are running in Jupyter, you will use the TFJob client instead\n", "* You will run the TFJob in a namespace created by a Kubeflow profile\n", "  * The namespace will be the same namespace you are running the notebook in\n", "  * Creating a profile ensures the namespace is provisioned with service accounts and other resources needed for Kubeflow" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tf_job_client = tf_job_client_module.TFJobClient()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tf_job_body = yaml.safe_load(train_spec)\n", "tf_job = tf_job_client.create(tf_job_body, namespace=namespace)\n", "\n", "logging.info(f\"Created job {namespace}.{train_name}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from kubeflow.tfjob import TFJobClient\n", "tfjob_client = TFJobClient()\n", "tfjob_client.wait_for_job(train_name, namespace=namespace, watch=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get TF Job logs" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tfjob_client.get_logs(train_name, namespace=namespace)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy Tensorboard" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tb_name = \"mnist-tensorboard\"\n", "tb_deploy = f\"\"\"apiVersion: apps/v1\n", "kind: Deployment\n", "metadata:\n", "  labels:\n", "    app: mnist-tensorboard\n", "  name: {tb_name}\n", "  namespace: {namespace}\n", "spec:\n", "  selector:\n", "    matchLabels:\n", "      app: mnist-tensorboard\n", "  template:\n", "    metadata:\n", "      labels:\n", "        app: mnist-tensorboard\n", "        version: v1\n", "    spec:\n", "      serviceAccount: default-editor\n", "      containers:\n", "      - command:\n", "        - /usr/local/bin/tensorboard\n", "        - --logdir={model_dir}\n", "        - --port=80\n", "        image: tensorflow/tensorflow:1.15.2-py3\n", "        env:\n", "        - name: S3_ENDPOINT\n", "          value: {service_endpoint}\n", "        - name: AWS_REGION\n", "          value: {cos_region}\n", "        - name: BUCKET_NAME\n", "          value: {mnist_bucket}\n", "        - name: S3_USE_HTTPS\n", "          value: \"1\"\n", "        - name: S3_VERIFY_SSL\n", "          value: \"1\"\n", "        - name: AWS_ACCESS_KEY_ID\n", "          value: {cos_username}\n", "        - name: AWS_SECRET_ACCESS_KEY\n", "          value: {cos_key}\n", "        name: tensorboard\n", "        ports:\n", "        - containerPort: 80\n", "\"\"\"\n", "tb_service = f\"\"\"apiVersion: v1\n", "kind: Service\n", "metadata:\n", "  labels:\n", "    app: mnist-tensorboard\n", "  name: {tb_name}\n", "  namespace: {namespace}\n", "spec:\n", "  ports:\n", "  - name: http-tb\n", "    port: 80\n", "    targetPort: 80\n", "  selector:\n", "    app: mnist-tensorboard\n", "  type: ClusterIP\n", "\"\"\"\n", "\n", "tb_virtual_service = f\"\"\"apiVersion: networking.istio.io/v1alpha3\n", "kind: VirtualService\n", "metadata:\n", "  name: {tb_name}\n", "  namespace: {namespace}\n", "spec:\n", "  gateways:\n", "  - kubeflow/kubeflow-gateway\n", "  hosts:\n", "  - '*'\n", "  http:\n", "  - match:\n", "    - uri:\n", "        prefix: /mnist/{namespace}/tensorboard/\n", "    rewrite:\n", "      uri: /\n", "    route:\n", "    - destination:\n", "        host: {tb_name}.{namespace}.svc.cluster.local\n", "        port:\n", "          number: 80\n", "    timeout: 300s\n", "\"\"\"\n", "\n", "tb_specs = [tb_deploy, tb_service, tb_virtual_service]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "k8s_util.apply_k8s_specs(tb_specs, k8s_util.K8S_CREATE_OR_REPLACE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get Tensorboard 
URL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the following in a terminal, using an account with RBAC permission to read services in the `istio-system` namespace.
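 \n", "**Note:** You can get the worker node IP from `kubectl get nodes -o wide`; assign it to `INGRESS_HOST`. If your namespace is not `anonymous`, replace `anonymous` in the URL with your namespace.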
 \n", "```bash\n", "export INGRESS_HOST=\n", "export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name==\"http2\")].nodePort}')\n", "printf \"Tensorboard URL: \\n${INGRESS_HOST}:${INGRESS_PORT}/mnist/anonymous/tensorboard/\\n\"\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Serve the model\n", "\n", "* Deploy the model using TensorFlow Serving\n", "* We need to create\n", "   1. A Kubernetes Deployment\n", "   1. A Kubernetes Service\n", "   1. (Optional) A ConfigMap containing the Prometheus monitoring config" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "deploy_name = \"mnist-model\"\n", "model_base_path = export_path\n", "\n", "# The web UI defaults to mnist-service, so if you change it you will\n", "# need to change it in the UI as well to send predictions to the model\n", "model_service = \"mnist-service\"\n", "\n", "deploy_spec = f\"\"\"apiVersion: apps/v1\n", "kind: Deployment\n", "metadata:\n", "  labels:\n", "    app: mnist\n", "  name: {deploy_name}\n", "  namespace: {namespace}\n", "spec:\n", "  selector:\n", "    matchLabels:\n", "      app: mnist-model\n", "  template:\n", "    metadata:\n", "      # TODO(jlewi): Right now we disable the istio side car because otherwise ISTIO rbac will prevent the\n", "      # UI from sending RPCs to the server. We should create an appropriate ISTIO rbac authorization\n", "      # policy to allow traffic from the UI to the model server.\n", "      # https://istio.io/docs/concepts/security/#target-selectors\n", "      annotations:\n", "        sidecar.istio.io/inject: \"false\"\n", "      labels:\n", "        app: mnist-model\n", "        version: v1\n", "    spec:\n", "      serviceAccount: default-editor\n", "      containers:\n", "      - args:\n", "        - --port=9000\n", "        - --rest_api_port=8500\n", "        - --model_name=mnist\n", "        - --model_base_path={model_base_path}\n", "        command:\n", "        - /usr/bin/tensorflow_model_server\n", "        env:\n", "        - name: modelBasePath\n", "          value: {model_base_path}\n", "        - name: S3_ENDPOINT\n", "          value: {service_endpoint}\n", "        - name: AWS_REGION\n", "          value: {cos_region}\n", "        - name: BUCKET_NAME\n", "          value: {mnist_bucket}\n", "        - name: S3_USE_HTTPS\n", "          value: \"1\"\n", "        - name: S3_VERIFY_SSL\n", "          value: \"1\"\n", "        - name: AWS_ACCESS_KEY_ID\n", "          value: {cos_username}\n", "        - name: AWS_SECRET_ACCESS_KEY\n", "          value: {cos_key}\n", "        image: tensorflow/serving:1.15.0\n", "        imagePullPolicy: IfNotPresent\n", "        livenessProbe:\n", "          initialDelaySeconds: 30\n", "          periodSeconds: 30\n", "          tcpSocket:\n", "            port: 9000\n", "        name: mnist\n", "        ports:\n", "        - containerPort: 9000\n", "        - containerPort: 8500\n", "        resources:\n", "          limits:\n", "            cpu: \"4\"\n", "            memory: 4Gi\n", "          requests:\n", "            cpu: \"1\"\n", "            memory: 1Gi\n", "        volumeMounts:\n", "        - mountPath: /var/config/\n", "          name: model-config\n", "      volumes:\n", "      - configMap:\n", "          name: {deploy_name}\n", "        name: model-config\n", "\"\"\"\n", "\n", "service_spec = f\"\"\"apiVersion: v1\n", "kind: Service\n", "metadata:\n", "  annotations:\n", "    prometheus.io/path: /monitoring/prometheus/metrics\n", "    prometheus.io/port: \"8500\"\n", "    prometheus.io/scrape: \"true\"\n", "  labels:\n", "    app: mnist-model\n", "  name: {model_service}\n", "  namespace: {namespace}\n", "spec:\n", "  ports:\n", "  - name: grpc-tf-serving\n", "    port: 9000\n", "    targetPort: 9000\n", "  - name: http-tf-serving\n", "    port: 8500\n", "    targetPort: 8500\n", "  selector:\n", "    app: mnist-model\n", "  type: ClusterIP\n", "\"\"\"\n", "\n", "monitoring_config = f\"\"\"kind: ConfigMap\n", "apiVersion: v1\n", "metadata:\n", "  name: {deploy_name}\n", "  namespace: {namespace}\n", "data:\n", "  monitoring_config.txt: |-\n", "    prometheus_config: {{\n", "      enable: true,\n", "      path: \"/monitoring/prometheus/metrics\"\n", "    }}\n", "\"\"\"\n", "\n", "model_specs = [deploy_spec, service_spec, monitoring_config]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "k8s_util.apply_k8s_specs(model_specs, k8s_util.K8S_CREATE_OR_REPLACE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy the mnist UI\n", "\n", "* We will now deploy the UI to visualize the mnist results\n", "* Note: This uses a prebuilt, public Docker image for the UI" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ui_name = \"mnist-ui\"\n", "ui_deploy = f\"\"\"apiVersion: apps/v1\n", "kind: Deployment\n", "metadata:\n", "  name: {ui_name}\n", "  namespace: {namespace}\n", "spec:\n", "  replicas: 1\n", "  selector:\n", "    matchLabels:\n", "      app: mnist-web-ui\n", "  template:\n", "    metadata:\n", "      labels:\n", "        app: mnist-web-ui\n", "    spec:\n", "      containers:\n", "      - image: gcr.io/kubeflow-examples/mnist/web-ui:v20190112-v0.2-142-g3b38225\n", "        name: web-ui\n", "        ports:\n", "        - containerPort: 5000\n", "      serviceAccount: default-editor\n", "\"\"\"\n", "\n", "ui_service = f\"\"\"apiVersion: v1\n", "kind: Service\n", "metadata:\n", "  annotations:\n", "  name: {ui_name}\n", "  namespace: {namespace}\n", "spec:\n", "  ports:\n", "  - name: http-mnist-ui\n", "    port: 80\n", "    targetPort: 5000\n", "  selector:\n", "    app: mnist-web-ui\n", "  type: ClusterIP\n", "\"\"\"\n", "\n", "ui_virtual_service = f\"\"\"apiVersion: networking.istio.io/v1alpha3\n", "kind: VirtualService\n", "metadata:\n", "  name: {ui_name}\n", "  namespace: {namespace}\n", "spec:\n", "  gateways:\n", "  - kubeflow/kubeflow-gateway\n", "  hosts:\n", "  - '*'\n", "  http:\n", "  - match:\n", "    - uri:\n", "        prefix: /mnist/{namespace}/ui/\n", "    rewrite:\n", "      uri: /\n", "    route:\n", "    - destination:\n", "        host: {ui_name}.{namespace}.svc.cluster.local\n", "        port:\n", "          number: 80\n", "    timeout: 300s\n", "\"\"\"\n", "\n", "ui_specs = [ui_deploy, ui_service, ui_virtual_service]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "k8s_util.apply_k8s_specs(ui_specs, k8s_util.K8S_CREATE_OR_REPLACE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Access the web UI\n", "\n", "* The commands below print the endpoint of the web UI.\n", "\n", "Run the following in a terminal, using an account with RBAC permission to read services in the `istio-system` namespace.
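 \n", "**Note:** You can get the worker node IP from `kubectl get nodes -o wide`; assign it to `INGRESS_HOST`. If your namespace is not `anonymous`, replace `anonymous` in the URL with your namespace.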
\n", "```bash\n", "export INGRESS_HOST=\n", "export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name==\"http2\")].nodePort}')\n", "printf \"mnist-web-app URL: \\n${INGRESS_HOST}:${INGRESS_PORT}/mnist/anonymous/ui/\\n\"\n", "```" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 4 }