mirror of https://github.com/kubeflow/examples.git
| {
 | |
|  "cells": [
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "# Train and deploy on Kubeflow from Notebooks\n",
 | |
|     "\n",
 | |
|     "This notebook shows you how to use Kubeflow to build, train, and deploy models on Kubernetes.\n",
 | |
|     "It walks you through the following steps:\n",
 | |
|     " \n",
 | |
|     "* Building an XGBoost model inside a notebook\n",
 | |
|     "* Training the model inside the notebook\n",
 | |
|     "* Performing inference using the model inside the notebook\n",
 | |
|     "* Using Kubeflow Fairing to launch training jobs on Kubernetes\n",
 | |
|     "* Using Kubeflow Fairing to build and deploy a model using [Seldon Core](https://www.seldon.io/)\n",
 | |
|     "* Using [Kubeflow metadata](https://github.com/kubeflow/metadata) to record metadata about your models\n",
 | |
|     "* Using [Kubeflow Pipelines](https://www.kubeflow.org/docs/pipelines/) to build a pipeline to train your model\n",
 | |
|     "\n",
 | |
|     "## Prerequisites \n",
 | |
|     "\n",
 | |
|     "* This notebook assumes you are running inside a Kubeflow 0.6 deployment on GKE, following the [GKE instructions](https://www.kubeflow.org/docs/gke/deploy/)\n",
 | |
|     "* If you are running somewhere other than GKE, you will need to modify the notebook to use a different Docker registry or configure Kubeflow to work with GCR."
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "### Verify we have a GCP account\n",
 | |
|     "\n",
 | |
|     "* The cell below checks that this notebook was spawned with credentials to access GCP"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 1,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "import os\n",
 | |
|     "from oauth2client.client import GoogleCredentials\n",
 | |
|     "credentials = GoogleCredentials.get_application_default()"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Install Required Libraries\n",
 | |
|     "\n",
 | |
|     "Import the libraries required to train this model."
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 2,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "pip installing requirements.txt\n",
 | |
|       "pip installing KFP https://storage.googleapis.com/ml-pipeline/release/0.1.32/kfp.tar.gz\n",
 | |
|       "pip installing fairing git+git://github.com/kubeflow/fairing.git@9b0d4ed4796ba349ac6067bbd802ff1d6454d015\n",
 | |
|       "Configure docker credentials\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "import notebook_setup\n",
 | |
|     "notebook_setup.notebook_setup()"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "* Import the Python libraries we will use\n",
 | |
|     "* We add a comment \"fairing:include-cell\" to tell the Kubeflow Fairing preprocessor to keep this cell when converting to Python code later"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 3,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# fairing:include-cell\n",
 | |
|     "import fire\n",
 | |
|     "import joblib\n",
 | |
|     "import logging\n",
 | |
|     "import nbconvert\n",
 | |
|     "import os\n",
 | |
|     "import pathlib\n",
 | |
|     "import sys\n",
 | |
|     "from pathlib import Path\n",
 | |
|     "import pandas as pd\n",
 | |
|     "import pprint\n",
 | |
|     "from sklearn.metrics import mean_absolute_error\n",
 | |
|     "from sklearn.model_selection import train_test_split\n",
 | |
|     "from sklearn.impute import SimpleImputer\n",
 | |
|     "from xgboost import XGBRegressor\n",
 | |
|     "from importlib import reload\n",
 | |
|     "from sklearn.datasets import make_regression\n",
 | |
|     "from kubeflow.metadata import metadata\n",
 | |
|     "from datetime import datetime\n",
 | |
|     "import retrying\n",
 | |
|     "import urllib3"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 4,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# Imports not to be included in the built docker image\n",
 | |
|     "import util\n",
 | |
|     "import kfp\n",
 | |
|     "import kfp.components as comp\n",
 | |
|     "import kfp.gcp as gcp\n",
 | |
|     "import kfp.dsl as dsl\n",
 | |
|     "import kfp.compiler as compiler\n",
 | |
|     "from kubernetes import client as k8s_client\n",
 | |
|     "from kubeflow import fairing   \n",
 | |
|     "from kubeflow.fairing.builders import append\n",
 | |
|     "from kubeflow.fairing.deployers import job\n",
 | |
|     "from kubeflow.fairing.preprocessors.converted_notebook import ConvertNotebookPreprocessorWithFire\n"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Code to train and predict \n",
 | |
|     "\n",
 | |
|     "* In the cells below we define some functions to generate data and train a model\n",
 | |
|     "* These functions could just as easily be defined in a separate python module"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 5,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# fairing:include-cell\n",
 | |
|     "def read_synthetic_input(test_size=0.25):\n",
 | |
|     "    \"\"\"generate synthetic data and split it into train and test.\"\"\"\n",
 | |
|     "    # generate regression dataset\n",
 | |
|     "    X, y = make_regression(n_samples=200, n_features=5, noise=0.1)\n",
 | |
|     "    train_X, test_X, train_y, test_y = train_test_split(X,\n",
 | |
|     "                                                      y,\n",
 | |
|     "                                                      test_size=test_size,\n",
 | |
|     "                                                      shuffle=False)\n",
 | |
|     "\n",
 | |
|     "    imputer = SimpleImputer()\n",
 | |
|     "    train_X = imputer.fit_transform(train_X)\n",
 | |
|     "    test_X = imputer.transform(test_X)\n",
 | |
|     "\n",
 | |
|     "    return (train_X, train_y), (test_X, test_y)\n"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 6,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# fairing:include-cell\n",
 | |
|     "def train_model(train_X,\n",
 | |
|     "                train_y,\n",
 | |
|     "                test_X,\n",
 | |
|     "                test_y,\n",
 | |
|     "                n_estimators,\n",
 | |
|     "                learning_rate):\n",
 | |
|     "    \"\"\"Train the model using XGBRegressor.\"\"\"\n",
 | |
|     "    model = XGBRegressor(n_estimators=n_estimators, learning_rate=learning_rate)\n",
 | |
|     "\n",
 | |
|     "    model.fit(train_X,\n",
 | |
|     "            train_y,\n",
 | |
|     "            early_stopping_rounds=40,\n",
 | |
|     "            eval_set=[(test_X, test_y)])\n",
 | |
|     "\n",
 | |
|     "    logging.info(\"Best RMSE on eval: %.2f with %d rounds\",\n",
 | |
|     "                 model.best_score,\n",
 | |
|     "                 model.best_iteration+1)\n",
 | |
|     "    return model\n",
 | |
|     "\n",
 | |
|     "def eval_model(model, test_X, test_y):\n",
 | |
|     "    \"\"\"Evaluate the model performance.\"\"\"\n",
 | |
|     "    predictions = model.predict(test_X)\n",
 | |
|     "    mae=mean_absolute_error(predictions, test_y)\n",
 | |
|     "    logging.info(\"mean_absolute_error=%.2f\", mae)\n",
 | |
|     "    return mae\n",
 | |
|     "\n",
 | |
|     "def save_model(model, model_file):\n",
 | |
|     "    \"\"\"Save XGBoost model for serving.\"\"\"\n",
 | |
|     "    joblib.dump(model, model_file)\n",
 | |
|     "    logging.info(\"Model export success: %s\", model_file)\n",
 | |
|     "\n",
 | |
|     "def create_workspace():\n",
 | |
|     "    METADATA_STORE_HOST = \"metadata-grpc-service.kubeflow\" # default DNS of the Kubeflow Metadata gRPC service.\n",
 | |
|     "    METADATA_STORE_PORT = 8080\n",
 | |
|     "    return metadata.Workspace(\n",
 | |
|     "        store=metadata.Store(grpc_host=METADATA_STORE_HOST, grpc_port=METADATA_STORE_PORT),\n",
 | |
|     "        name=\"xgboost-synthetic\",\n",
 | |
|     "        description=\"workspace for xgboost-synthetic artifacts and executions\")"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Wrap Training and Prediction in a class\n",
 | |
|     "\n",
 | |
|     "* In the cell below we wrap training and prediction in a class\n",
 | |
|     "* A class provides the structure we will need to eventually use kubeflow fairing to launch separate training jobs and/or deploy the model on Kubernetes"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 7,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# fairing:include-cell\n",
 | |
|     "class ModelServe(object):    \n",
 | |
|     "    def __init__(self, model_file=None):\n",
 | |
|     "        self.n_estimators = 50\n",
 | |
|     "        self.learning_rate = 0.1\n",
 | |
|     "        if not model_file:\n",
 | |
|     "            if \"MODEL_FILE\" in os.environ:\n",
 | |
|     "                print(\"model_file not supplied; checking environment variable\")\n",
 | |
|     "                model_file = os.getenv(\"MODEL_FILE\")\n",
 | |
|     "            else:\n",
 | |
|     "                print(\"model_file not supplied; using the default\")\n",
 | |
|     "                model_file = \"mockup-model.dat\"\n",
 | |
|     "        \n",
 | |
|     "        self.model_file = model_file\n",
 | |
|     "        print(\"model_file={0}\".format(self.model_file))\n",
 | |
|     "        \n",
 | |
|     "        self.model = None\n",
 | |
|     "        self._workspace = None\n",
 | |
|     "        self.exec = self.create_execution()\n",
 | |
|     "\n",
 | |
|     "    def train(self):\n",
 | |
|     "        (train_X, train_y), (test_X, test_y) = read_synthetic_input()\n",
 | |
|     "        \n",
 | |
|     "        # Here we use Kubeflow's metadata library to record information\n",
 | |
|     "        # about the training run to Kubeflow's metadata store.\n",
 | |
|     "        self.exec.log_input(metadata.DataSet(\n",
 | |
|     "            description=\"xgboost synthetic data\",\n",
 | |
|     "            name=\"synthetic-data\",\n",
 | |
|     "            owner=\"someone@kubeflow.org\",\n",
 | |
|     "            uri=\"file://path/to/dataset\",\n",
 | |
|     "            version=\"v1.0.0\"))\n",
 | |
|     "        \n",
 | |
|     "        model = train_model(train_X,\n",
 | |
|     "                          train_y,\n",
 | |
|     "                          test_X,\n",
 | |
|     "                          test_y,\n",
 | |
|     "                          self.n_estimators,\n",
 | |
|     "                          self.learning_rate)\n",
 | |
|     "\n",
 | |
|     "        mae = eval_model(model, test_X, test_y)\n",
 | |
|     "        \n",
 | |
|     "        # Here we log metrics about the model to Kubeflow's metadata store.\n",
 | |
|     "        self.exec.log_output(metadata.Metrics(\n",
 | |
|     "            name=\"xgboost-synthetic-training-eval\",\n",
 | |
|     "            owner=\"someone@kubeflow.org\",\n",
 | |
|     "            description=\"training evaluation for xgboost synthetic\",\n",
 | |
|     "            uri=\"gcs://path/to/metrics\",\n",
 | |
|     "            metrics_type=metadata.Metrics.VALIDATION,\n",
 | |
|     "            values={\"mean_absolute_error\": mae}))\n",
 | |
|     "        \n",
 | |
|     "        save_model(model, self.model_file)\n",
 | |
|     "        self.exec.log_output(metadata.Model(\n",
 | |
|     "            name=\"housing-price-model\",\n",
 | |
|     "            description=\"housing price prediction model using synthetic data\",\n",
 | |
|     "            owner=\"someone@kubeflow.org\",\n",
 | |
|     "            uri=self.model_file,\n",
 | |
|     "            model_type=\"linear_regression\",\n",
 | |
|     "            training_framework={\n",
 | |
|     "                \"name\": \"xgboost\",\n",
 | |
|     "                \"version\": \"0.9.0\"\n",
 | |
|     "            },\n",
 | |
|     "            hyperparameters={\n",
 | |
|     "                \"learning_rate\": self.learning_rate,\n",
 | |
|     "                \"n_estimators\": self.n_estimators\n",
 | |
|     "            },\n",
 | |
|     "            version=datetime.utcnow().isoformat(\"T\")))\n",
 | |
|     "        \n",
 | |
|     "    def predict(self, X, feature_names):\n",
 | |
|     "        \"\"\"Predict using the model for given ndarray.\n",
 | |
|     "        \n",
 | |
|     "        The predict signature should match the syntax expected by Seldon Core\n",
 | |
|     "        https://github.com/SeldonIO/seldon-core so that we can use\n",
 | |
|     "        Seldon Core to wrap it in a model server and deploy it on Kubernetes.\n",
 | |
|     "        \"\"\"\n",
 | |
|     "        if not self.model:\n",
 | |
|     "            self.model = joblib.load(self.model_file)\n",
 | |
|     "        # Do any preprocessing\n",
 | |
|     "        prediction = self.model.predict(data=X)\n",
 | |
|     "        # Do any postprocessing\n",
 | |
|     "        return [[prediction.item(0), prediction.item(1)]]\n",
 | |
|     "\n",
 | |
|     "    @property\n",
 | |
|     "    def workspace(self):\n",
 | |
|     "        if not self._workspace:\n",
 | |
|     "            self._workspace = create_workspace()\n",
 | |
|     "        return self._workspace\n",
 | |
|     "    \n",
 | |
|     "    def create_execution(self):                \n",
 | |
|     "        r = metadata.Run(\n",
 | |
|     "            workspace=self.workspace,\n",
 | |
|     "            name=\"xgboost-synthetic-fairing-run\" + datetime.utcnow().isoformat(\"T\"),\n",
 | |
|     "            description=\"a notebook run\")\n",
 | |
|     "\n",
 | |
|     "        return metadata.Execution(\n",
 | |
|     "            name = \"execution\" + datetime.utcnow().isoformat(\"T\"),\n",
 | |
|     "            workspace=self.workspace,\n",
 | |
|     "            run=r,\n",
 | |
|     "            description=\"execution for training xgboost-synthetic\")"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Train your Model Locally\n",
 | |
|     "\n",
 | |
|     "* Train your model locally inside your notebook\n",
 | |
|     "* To train locally we just instantiate the ModelServe class and then call train"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 8,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "MetadataStore with gRPC connection initialized\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "model_file=mockup-model.dat\n",
 | |
|       "[23:44:47] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
 | |
|       "[0]\tvalidation_0-rmse:134.005\n",
 | |
|       "Will train until validation_0-rmse hasn't improved in 40 rounds.\n",
 | |
|       "[1]\tvalidation_0-rmse:129.102\n",
 | |
|       "[2]\tvalidation_0-rmse:124.39\n",
 | |
|       "[3]\tvalidation_0-rmse:119.218\n",
 | |
|       "[4]\tvalidation_0-rmse:114.096\n",
 | |
|       "[5]\tvalidation_0-rmse:109.494\n",
 | |
|       "[6]\tvalidation_0-rmse:107.101\n",
 | |
|       "[7]\tvalidation_0-rmse:103.463\n",
 | |
|       "[8]\tvalidation_0-rmse:100.657\n",
 | |
|       "[9]\tvalidation_0-rmse:96.576\n",
 | |
|       "[10]\tvalidation_0-rmse:94.8884\n",
 | |
|       "[11]\tvalidation_0-rmse:91.7095\n",
 | |
|       "[12]\tvalidation_0-rmse:90.7389\n",
 | |
|       "[13]\tvalidation_0-rmse:88.1934\n",
 | |
|       "[14]\tvalidation_0-rmse:86.1535\n",
 | |
|       "[15]\tvalidation_0-rmse:84.8222\n",
 | |
|       "[16]\tvalidation_0-rmse:83.5818\n",
 | |
|       "[17]\tvalidation_0-rmse:81.6697\n",
 | |
|       "[18]\tvalidation_0-rmse:80.2789\n",
 | |
|       "[19]\tvalidation_0-rmse:79.4583\n",
 | |
|       "[20]\tvalidation_0-rmse:78.4213\n",
 | |
|       "[21]\tvalidation_0-rmse:77.0478\n",
 | |
|       "[22]\tvalidation_0-rmse:75.3792\n",
 | |
|       "[23]\tvalidation_0-rmse:73.9913\n",
 | |
|       "[24]\tvalidation_0-rmse:73.2026\n",
 | |
|       "[25]\tvalidation_0-rmse:72.2079\n",
 | |
|       "[26]\tvalidation_0-rmse:70.9489\n",
 | |
|       "[27]\tvalidation_0-rmse:70.5206\n",
 | |
|       "[28]\tvalidation_0-rmse:69.8641\n",
 | |
|       "[29]\tvalidation_0-rmse:69.0409\n",
 | |
|       "[30]\tvalidation_0-rmse:68.3776\n",
 | |
|       "[31]\tvalidation_0-rmse:67.2776\n",
 | |
|       "[32]\tvalidation_0-rmse:66.7612\n",
 | |
|       "[33]\tvalidation_0-rmse:65.9548\n",
 | |
|       "[34]\tvalidation_0-rmse:65.5048\n",
 | |
|       "[35]\tvalidation_0-rmse:64.8582\n",
 | |
|       "[36]\tvalidation_0-rmse:64.118\n",
 | |
|       "[37]\tvalidation_0-rmse:63.5615\n",
 | |
|       "[38]\tvalidation_0-rmse:63.2716\n",
 | |
|       "[39]\tvalidation_0-rmse:62.9765\n",
 | |
|       "[40]\tvalidation_0-rmse:62.3468\n",
 | |
|       "[41]\tvalidation_0-rmse:62.0579\n",
 | |
|       "[42]\tvalidation_0-rmse:61.9598\n",
 | |
|       "[43]\tvalidation_0-rmse:61.6452\n",
 | |
|       "[44]\tvalidation_0-rmse:61.2468\n",
 | |
|       "[45]\tvalidation_0-rmse:60.7332\n",
 | |
|       "[46]\tvalidation_0-rmse:60.6493\n",
 | |
|       "[47]\tvalidation_0-rmse:60.2032\n",
 | |
|       "[48]\tvalidation_0-rmse:59.9972\n",
 | |
|       "[49]\tvalidation_0-rmse:59.5956\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "mean_absolute_error=47.22\n",
 | |
|       "Model export success: mockup-model.dat\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "Best RMSE on eval: %.2f with %d rounds 59.595573 50\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "model = ModelServe(model_file=\"mockup-model.dat\")\n",
 | |
|     "model.train()"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Predict locally\n",
 | |
|     "\n",
 | |
|     "* Run prediction inside the notebook using the newly created model\n",
 | |
|     "* To run prediction we just invoke predict"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 9,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "MetadataStore with gRPC connection initialized\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "model_file not supplied; using the default\n",
 | |
|       "model_file=mockup-model.dat\n",
 | |
|       "[23:44:47] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "data": {
 | |
|       "text/plain": [
 | |
|        "[[-30.6968994140625, 45.884098052978516]]"
 | |
|       ]
 | |
|      },
 | |
|      "execution_count": 9,
 | |
|      "metadata": {},
 | |
|      "output_type": "execute_result"
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "(train_X, train_y), (test_X, test_y) = read_synthetic_input()\n",
 | |
|     "\n",
 | |
|     "ModelServe().predict(test_X, None)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Use Kubeflow Fairing to Launch a K8s Job to train your model\n",
 | |
|     "\n",
 | |
|     "* Now that we have trained a model locally we can use Kubeflow fairing to\n",
 | |
|     "  1. Launch a Kubernetes job to train the model\n",
 | |
|     "  1. Deploy the model on Kubernetes\n",
 | |
|     "* Launching a separate Kubernetes job to train the model has the following advantages\n",
 | |
|     "\n",
 | |
|     "  * You can leverage Kubernetes to run multiple training jobs in parallel \n",
 | |
|     "  * You can run long running jobs without blocking your kernel"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "### Configure The Docker Registry For Kubeflow Fairing\n",
 | |
|     "\n",
 | |
|     "* In order to build docker images from your notebook we need a docker registry where the images will be stored\n",
 | |
|     "* Below you set some variables specifying a [GCR container registry](https://cloud.google.com/container-registry/docs/)\n",
 | |
|     "* Kubeflow Fairing provides a utility function to guess the name of your GCP project"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 10,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# Set up Google Container Registry (GCR) for storing output containers\n",
 | |
|     "# You can use any Docker container registry instead of GCR\n",
 | |
|     "GCP_PROJECT = fairing.cloud.gcp.guess_project_name()\n",
 | |
|     "DOCKER_REGISTRY = 'gcr.io/{}/fairing-job'.format(GCP_PROJECT)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Use Kubeflow fairing to build the docker image\n",
 | |
|     "\n",
 | |
|     "* First you will use Kubeflow Fairing's Kaniko builder to build a Docker image that includes all your dependencies\n",
 | |
|     "  * You use Kaniko because you want to be able to run `pip` to install dependencies\n",
 | |
|     "  * Kaniko gives you the flexibility to build images from Dockerfiles\n",
 | |
|     "* Kaniko, however, can be slow\n",
 | |
|     "* So you will build a base image using Kaniko once, and then every time your code changes you will just build an image\n",
 | |
|     "  starting from your base image and adding your code to it\n",
 | |
|     "* You use Kubeflow Fairing's append builder to enable these fast rebuilds"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 11,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# TODO(https://github.com/kubeflow/fairing/issues/426): We should get rid of this once the default \n",
 | |
|     "# Kaniko image is updated to a newer image than 0.7.0.\n",
 | |
|     "from kubeflow.fairing import constants\n",
 | |
|     "constants.constants.KANIKO_IMAGE = \"gcr.io/kaniko-project/executor:v0.14.0\""
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 12,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "Converting build-train-deploy.ipynb to build-train-deploy.py\n",
 | |
|       "Creating entry point for the class name ModelServe\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "data": {
 | |
|       "text/plain": [
 | |
|        "[PosixPath('build-train-deploy.py'), 'xgboost_util.py', 'mockup-model.dat']"
 | |
|       ]
 | |
|      },
 | |
|      "execution_count": 12,
 | |
|      "metadata": {},
 | |
|      "output_type": "execute_result"
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "from kubeflow.fairing.builders import cluster\n",
 | |
|     "\n",
 | |
|     "# output_map is a map of extra files to add to the build context.\n",
 | |
|     "# It is a map from source location to the location inside the context.\n",
 | |
|     "output_map =  {\n",
 | |
|     "    \"Dockerfile\": \"Dockerfile\",\n",
 | |
|     "    \"requirements.txt\": \"requirements.txt\",\n",
 | |
|     "}\n",
 | |
|     "\n",
 | |
|     "\n",
 | |
|     "preprocessor = ConvertNotebookPreprocessorWithFire(class_name='ModelServe', notebook_file='build-train-deploy.ipynb',\n",
 | |
|     "                                                   output_map=output_map)\n",
 | |
|     "\n",
 | |
|     "if not preprocessor.input_files:\n",
 | |
|     "    preprocessor.input_files = set()\n",
 | |
|     "input_files=[\"xgboost_util.py\", \"mockup-model.dat\"]\n",
 | |
|     "preprocessor.input_files =  set([os.path.normpath(f) for f in input_files])\n",
 | |
|     "preprocessor.preprocess()"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "### Build the base image\n",
 | |
|     "\n",
 | |
|     "* You use cluster_builder to build the base image\n",
 | |
|     "* You only need to perform this again if you change your Docker image or the dependencies you need to install\n",
 | |
|     "* ClusterBuilder takes as input the Docker image to use as a base image\n",
 | |
|     "* You should use the same Jupyter image that you are using for your notebook server so that your environment will be\n",
 | |
|     "  the same when you launch Kubernetes jobs"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 13,
 | |
|    "metadata": {
 | |
|     "scrolled": true
 | |
|    },
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "Building image using cluster builder.\n",
 | |
|       "Creating docker context: /tmp/fairing_context_n34sz0lr\n",
 | |
|       "Converting build-train-deploy.ipynb to build-train-deploy.py\n",
 | |
|       "Creating entry point for the class name ModelServe\n",
 | |
|       "Not able to find gcp credentials secret: user-gcp-sa\n",
 | |
|       "Trying workload identity service account: default-editor\n",
 | |
|       "Waiting for fairing-builder-dcbz2-lqzjg to start...\n",
 | |
|       "Waiting for fairing-builder-dcbz2-lqzjg to start...\n",
 | |
|       "Waiting for fairing-builder-dcbz2-lqzjg to start...\n",
 | |
|       "Pod started running True\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "ERROR: logging before flag.Parse: E0226 23:44:52.505936       1 metadata.go:241] Failed to unmarshal scopes: invalid character 'h' looking for beginning of value\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0002] Resolved base name gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0 to gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0002] Resolved base name gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0 to gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0002] Downloading base image gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0002] Error while retrieving image from cache: getting file info: stat /cache/sha256:fe174faf7c477bc3dae796b067d98ac3f0d31e8075007a1146f86d13f2c98e13: no such file or directory\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0002] Downloading base image gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] Built cross stage deps: map[]\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] Downloading base image gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] Error while retrieving image from cache: getting file info: stat /cache/sha256:fe174faf7c477bc3dae796b067d98ac3f0d31e8075007a1146f86d13f2c98e13: no such file or directory\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] Downloading base image gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] Using files from context: [/kaniko/buildcontext/requirements.txt]\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] Checking for cached layer gcr.io/kubeflow-ci/fairing-job/fairing-job/cache:233bc2f24de09b29aa4c12d0f5adcc3098286c3c35eb0b4864fa00f73d8b9d2c...\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] Using caching version of cmd: COPY requirements.txt .\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] cmd: USER\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0003] Checking for cached layer gcr.io/kubeflow-ci/fairing-job/fairing-job/cache:1acac4c9cb73d1b18003ae6076fd264e37af3983927234122784b04452b9b44e...\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] Using caching version of cmd: RUN pip3 --no-cache-dir install -r requirements.txt\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] cmd: USER\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] Skipping unpacking as no commands require it.\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] Taking snapshot of full filesystem...\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] COPY requirements.txt .\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] Found cached layer, extracting to filesystem\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] extractedFiles: [/tf/requirements.txt / /tf]\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] Taking snapshot of files...\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] USER root\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] cmd: USER\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] No files changed in this command, skipping snapshotting.\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] RUN pip3 --no-cache-dir install -r requirements.txt\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0004] Found cached layer, extracting to filesystem\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0032] Taking snapshot of files...\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0070] USER jovyan\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0070] cmd: USER\n",
 | |
|       "\u001b[36mINFO\u001b[0m[0070] No files changed in this command, skipping snapshotting.\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "# Use a stock jupyter image as our base image\n",
 | |
|     "# TODO(jlewi): Should we try to use the downward API to default to the image we are running in?\n",
 | |
|     "base_image = \"gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0\"\n",
 | |
|     "# We use a custom Dockerfile \n",
 | |
|     "cluster_builder = cluster.cluster.ClusterBuilder(registry=DOCKER_REGISTRY,\n",
 | |
|     "                                                 base_image=base_image,\n",
 | |
|     "                                                 preprocessor=preprocessor,\n",
 | |
|     "                                                 dockerfile_path=\"Dockerfile\",\n",
 | |
|     "                                                 pod_spec_mutators=[fairing.cloud.gcp.add_gcp_credentials_if_exists],\n",
 | |
|     "                                                 context_source=cluster.gcs_context.GCSContextSource())\n",
 | |
|     "cluster_builder.build()"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "### Build the actual image\n",
 | |
|     "\n",
 | |
|     "Here you use the append builder to add your code to the base image\n",
 | |
|     "\n",
 | |
|     "* Calling preprocessor.preprocess() converts your notebook file to a python file\n",
 | |
|     "\n",
 | |
|     "  * You are using the [ConvertNotebookPreprocessorWithFire](https://github.com/kubeflow/fairing/blob/master/fairing/preprocessors/converted_notebook.py#L85) \n",
 | |
|     "  * This preprocessor converts ipynb files to py files by doing the following\n",
 | |
|     "    1. Removing all cells which don't have a comment `# fairing:include-cell`\n",
 | |
|     "    1. Using [python-fire](https://github.com/google/python-fire) to add entry points for the class specified in the constructor (see the sketch below)\n",
 | |
|     "    \n",
 | |
|     "  * Calling preprocess() will create the file build-train-deploy.py\n",
 | |
|     "  \n",
 | |
|     "* You use the AppendBuilder to rapidly build a new Docker image by quickly adding some files to an existing Docker image\n",
 | |
|     "  * The AppendBuilder is super fast so it's very convenient for rebuilding your images as you iterate on your code\n",
 | |
|     "  * The AppendBuilder will add the converted notebook, build-train-deploy.py, along with any files specified in `preprocessor.input_files` to `/app` in the newly created image"
 | |
|    ]
 | |
|   },
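 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "As a rough illustration, the entry point that the preprocessor appends to `build-train-deploy.py` via [python-fire](https://github.com/google/python-fire) looks something like the hypothetical sketch below (the generated file may differ in its details):\n",
 | |
|     "\n",
 | |
|     "```python\n",
 | |
|     "# Hypothetical sketch of the python-fire entry point appended by the preprocessor\n",
 | |
|     "import fire\n",
 | |
|     "\n",
 | |
|     "if __name__ == \"__main__\":\n",
 | |
|     "    fire.Fire(ModelServe)\n",
 | |
|     "```\n",
 | |
|     "\n",
 | |
|     "With an entry point like this, `python /app/build-train-deploy.py train` constructs a `ModelServe` instance and calls its `train` method, which is the command you will see in the Kubernetes job spec later on."
 | |
|    ]
 | |
|   },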
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 14,
 | |
|    "metadata": {
 | |
|     "scrolled": true
 | |
|    },
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "Converting build-train-deploy.ipynb to build-train-deploy.py\n",
 | |
|       "Creating entry point for the class name ModelServe\n",
 | |
|       "Building image using Append builder...\n",
 | |
|       "Creating docker context: /tmp/fairing_context_x4g0orab\n",
 | |
|       "Converting build-train-deploy.ipynb to build-train-deploy.py\n",
 | |
|       "Creating entry point for the class name ModelServe\n",
 | |
|       "build-train-deploy.py already exists in Fairing context, skipping...\n",
 | |
|       "Loading Docker credentials for repository 'gcr.io/kubeflow-ci/fairing-job/fairing-job:F47EE88D'\n",
 | |
|       "Invoking 'docker-credential-gcloud' to obtain Docker credentials.\n",
 | |
|       "Successfully obtained Docker credentials.\n",
 | |
|       "Image successfully built in 2.249176573008299s.\n",
 | |
|       "Pushing image gcr.io/kubeflow-ci/fairing-job/fairing-job:BDE79D77...\n",
 | |
|       "Loading Docker credentials for repository 'gcr.io/kubeflow-ci/fairing-job/fairing-job:BDE79D77'\n",
 | |
|       "Invoking 'docker-credential-gcloud' to obtain Docker credentials.\n",
 | |
|       "Successfully obtained Docker credentials.\n",
 | |
|       "Uploading gcr.io/kubeflow-ci/fairing-job/fairing-job:BDE79D77\n",
 | |
|       "Layer sha256:8832e37735788665026956430021c6d1919980288c66c4526502965aeb5ac006 exists, skipping\n",
 | |
|       "Layer sha256:b4ecb6928817c974946ba93ffc5ce60de886457eb57955dae9d7bc8facfb690a exists, skipping\n",
 | |
|       "Layer sha256:9269cef1ab8b202433fe1dfbfbdf4649926d70d7a8b94f0324421bda79b917fa exists, skipping\n",
 | |
|       "Layer sha256:d77b634303a107b22366d05d1071bf79e7d2a27b3d6db1fb726fcfd5dd5f9831 exists, skipping\n",
 | |
|       "Layer sha256:5bd1cb59702536c10e96bb14e54846922c9b257580d4e2c733076a922525240b exists, skipping\n",
 | |
|       "Layer sha256:7babe47a4c402afbe26f10dffceb85be7bfd2072a96b816814503f41ce9c5273 exists, skipping\n",
 | |
|       "Layer sha256:21a7832aeb8625dc8228ceb115a28222f87e0fbce61b2588c42a2cce7a3a63d6 exists, skipping\n",
 | |
|       "Layer sha256:107cba84ef3d72ed995c76c7a4f60ba5613f58b029ab7e42ac20ece99bec88b1 exists, skipping\n",
 | |
|       "Layer sha256:a31c3b1caad473a474d574283741f880e37c708cc06ee620d3e93fa602125ee0 exists, skipping\n",
 | |
|       "Layer sha256:92d24c89f5bc70958385728755b042a5a45bddf2f997de80e84d1161f43ba316 exists, skipping\n",
 | |
|       "Layer sha256:e590ee7edf442435692956d6fed54190416a217147a50c63e73a6a78d15bec84 exists, skipping\n",
 | |
|       "Layer sha256:96685dce34a0d24bf69741972441398cffbed89aed4f40e3c063176c59a3c81c exists, skipping\n",
 | |
|       "Layer sha256:daa5c419d33d51d1730ea530f4f7335640f5bb42856f319c63a1a521aee368c1 exists, skipping\n",
 | |
|       "Layer sha256:016724bbd2c9643f24eff7c1e86d9202d7c04caddd7fdd4375a77e3998ce8203 exists, skipping\n",
 | |
|       "Layer sha256:b5494e32d0131350be270a54399cee65934e90d3c2df87a83757903e627813b2 exists, skipping\n",
 | |
|       "Layer sha256:823f4685c03b26a545ca41dcdca1e782ad5e52cf85bac03113edaa6aebdca1b3 exists, skipping\n",
 | |
|       "Layer sha256:777cec03b3e23c21f8cf78f07812cc83dd7f352719226f27f361c5b706f6a93f exists, skipping\n",
 | |
|       "Layer sha256:dc2840b4417186d66a29d64a039ac164be95929211d808294d36acae9301fc6b exists, skipping\n",
 | |
|       "Layer sha256:5e671b828b2af02924968841e5d12084fa78e8722e9510402aaee80dc5d7a6db exists, skipping\n",
 | |
|       "Layer sha256:5bac0c144f6e0b7082e3691da95d3f057ee0be0735e9efca76096da59cfd1786 exists, skipping\n",
 | |
|       "Layer sha256:5b7339215d1d5f8e68622d584a224f60339f5bef41dbd74330d081e912f0cddd exists, skipping\n",
 | |
|       "Layer sha256:35daced67e5901b8de4a92bca9fdc67c8593d400aae483591987442f54c87d0a exists, skipping\n",
 | |
|       "Layer sha256:330a9002e0b4aa1e27d3628dd3f02ff9a39d25745b8f2f219b06e3725153ffc0 exists, skipping\n",
 | |
|       "Layer sha256:4e8a6b90828e0d339f646d723df8720ffa17c0ffb905f8f009faf1be320ab5d9 exists, skipping\n",
 | |
|       "Layer sha256:2b940936f9933b7737cf407f2149dd7393998d7a0bee5acf1c4a57b0487cef79 exists, skipping\n",
 | |
|       "Layer sha256:d684674aa1a4d080be26286fd9356f573b80d2448599392e3dcf3c61ce98a0f0 exists, skipping\n",
 | |
|       "Layer sha256:68543864d6442a851eaff0500161b92e4a151051cf7ed2649b3790a3f876bada exists, skipping\n",
 | |
|       "Layer sha256:21640f54008ccbfc0d100246633f8e6f18f918a0566561f61aebbda785321e56 exists, skipping\n",
 | |
|       "Layer sha256:f44c204b040238da05a21af1fd8543ea95f1e9249fac34b3b65217e38815568d exists, skipping\n",
 | |
|       "Layer sha256:b054a26005b7f3b032577f811421fab5ec3b42ce45a4012dfa00cf6ed6191b0f exists, skipping\n",
 | |
|       "Layer sha256:14ca88e9f6723ce82bc14b241cda8634f6d19677184691d086662641ab96fe68 exists, skipping\n",
 | |
|       "Layer sha256:e3ab47ad84d9e11c5fad45791ce00ec5b5f3b7f1ae61a5fab17eb44c399d910f exists, skipping\n",
 | |
|       "Layer sha256:01ad04a655b291ed8502f23f5b8c73d94475763e9b3cdbf6d1107f7879aadac6 exists, skipping\n",
 | |
|       "Layer sha256:4b9d9f2fa2a2b168f0a49fcd3074c885ab1ca2c507848f7b2e3cee8104f1f7c3 exists, skipping\n",
 | |
|       "Layer sha256:5bd2e6f0de430cd3936eec59afb6cf466b052344fe4348ac33a48ac903b661e2 exists, skipping\n",
 | |
|       "Layer sha256:15bca5bd6fdc1b1ac156de08ce3b0f57760b345556b64017d1be5cc7c95e5e5b pushed.\n",
 | |
|       "Layer sha256:d23f2bdcc84f066126b083288f75d140d58fc252618bda5bf05cb9696a183958 pushed.\n",
 | |
|       "Finished upload of: gcr.io/kubeflow-ci/fairing-job/fairing-job:BDE79D77\n",
 | |
|       "Pushed image gcr.io/kubeflow-ci/fairing-job/fairing-job:BDE79D77 in 2.74867532402277s.\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "preprocessor.preprocess()\n",
 | |
|     "\n",
 | |
|     "builder = append.append.AppendBuilder(registry=DOCKER_REGISTRY,\n",
 | |
|     "                                      base_image=cluster_builder.image_tag, preprocessor=preprocessor)\n",
 | |
|     "builder.build()\n"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Launch the K8s Job\n",
 | |
|     "\n",
 | |
|     "* You can use kubeflow fairing to easily launch a [Kubernetes job](https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/) to invoke code\n",
 | |
|     "* You use fairing's Kubernetes job library to build a Kubernetes job\n",
 | |
|     "  * You use pod mutators to attach GCP credentials to the pod\n",
 | |
|     "  * You can also use pod mutators to attach PVCs\n",
 | |
|     "* Since the [ConvertNotebookPreprocessorWithFire](https://github.com/kubeflow/fairing/blob/master/fairing/preprocessors/converted_notebook.py#L85) is using [python-fire](https://github.com/google/python-fire) you can easily invoke any method inside the ModelServe class just by configuring the command invoked by the Kubernetes job\n",
 | |
|     "   * In the cell below you extend the command to include `train` as an argument because you want to invoke the train\n",
 | |
|     "     function\n",
 | |
|     "     \n",
 | |
|     "**Note**: When you invoke train_deployer.deploy, Kubeflow Fairing will stream the logs from the Kubernetes job. The job will initially show some connection errors because it tries to connect to the metadata server. You can ignore these errors; the job will retry until it is able to connect and then continue."
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 15,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "Not able to find gcp credentials secret: user-gcp-sa\n",
 | |
|       "Trying workload identity service account: default-editor\n",
 | |
|       "The job fairing-job-qwdlb launched.\n",
 | |
|       "Waiting for fairing-job-qwdlb-67ddb to start...\n",
 | |
|       "Waiting for fairing-job-qwdlb-67ddb to start...\n",
 | |
|       "Waiting for fairing-job-qwdlb-67ddb to start...\n",
 | |
|       "Pod started running True\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "2020-02-26 23:48:04.056153: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory\n",
 | |
|       "2020-02-26 23:48:04.056318: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory\n",
 | |
|       "2020-02-26 23:48:04.056332: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
 | |
|       "WARNING: Logging before flag parsing goes to stderr.\n",
 | |
|       "I0226 23:48:06.277673 140238089848640 metadata_store.py:80] MetadataStore with gRPC connection initialized\n",
 | |
|       "model_file not supplied; using the default\n",
 | |
|       "model_file=mockup-model.dat\n",
 | |
|       "[23:48:06] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
 | |
|       "[0]\tvalidation_0-rmse:106.201\n",
 | |
|       "Will train until validation_0-rmse hasn't improved in 40 rounds.\n",
 | |
|       "[1]\tvalidation_0-rmse:102.289\n",
 | |
|       "[2]\tvalidation_0-rmse:99.0904\n",
 | |
|       "[3]\tvalidation_0-rmse:95.5223\n",
 | |
|       "[4]\tvalidation_0-rmse:92.2357\n",
 | |
|       "[5]\tvalidation_0-rmse:90.1649\n",
 | |
|       "[6]\tvalidation_0-rmse:87.6004\n",
 | |
|       "[7]\tvalidation_0-rmse:85.4127\n",
 | |
|       "[8]\tvalidation_0-rmse:82.7163\n",
 | |
|       "[9]\tvalidation_0-rmse:81.1641\n",
 | |
|       "[10]\tvalidation_0-rmse:79.1006\n",
 | |
|       "[11]\tvalidation_0-rmse:77.2564\n",
 | |
|       "[12]\tvalidation_0-rmse:75.3755\n",
 | |
|       "[13]\tvalidation_0-rmse:74.3393\n",
 | |
|       "[14]\tvalidation_0-rmse:72.0505\n",
 | |
|       "[15]\tvalidation_0-rmse:70.8315\n",
 | |
|       "[16]\tvalidation_0-rmse:69.1124\n",
 | |
|       "[17]\tvalidation_0-rmse:67.9681\n",
 | |
|       "[18]\tvalidation_0-rmse:66.2094\n",
 | |
|       "[19]\tvalidation_0-rmse:64.6999\n",
 | |
|       "[20]\tvalidation_0-rmse:63.6925\n",
 | |
|       "[21]\tvalidation_0-rmse:62.261\n",
 | |
|       "[22]\tvalidation_0-rmse:60.887\n",
 | |
|       "[23]\tvalidation_0-rmse:59.5543\n",
 | |
|       "[24]\tvalidation_0-rmse:58.3673\n",
 | |
|       "[25]\tvalidation_0-rmse:57.0439\n",
 | |
|       "[26]\tvalidation_0-rmse:55.7172\n",
 | |
|       "[27]\tvalidation_0-rmse:54.7011\n",
 | |
|       "[28]\tvalidation_0-rmse:53.8976\n",
 | |
|       "[29]\tvalidation_0-rmse:53.3325\n",
 | |
|       "[30]\tvalidation_0-rmse:52.81\n",
 | |
|       "[31]\tvalidation_0-rmse:51.8806\n",
 | |
|       "[32]\tvalidation_0-rmse:50.9026\n",
 | |
|       "[33]\tvalidation_0-rmse:50.0451\n",
 | |
|       "[34]\tvalidation_0-rmse:49.2711\n",
 | |
|       "[35]\tvalidation_0-rmse:48.6533\n",
 | |
|       "[36]\tvalidation_0-rmse:47.8613\n",
 | |
|       "[37]\tvalidation_0-rmse:47.5519\n",
 | |
|       "[38]\tvalidation_0-rmse:46.9383\n",
 | |
|       "[39]\tvalidation_0-rmse:46.7275\n",
 | |
|       "[40]\tvalidation_0-rmse:46.1317\n",
 | |
|       "[41]\tvalidation_0-rmse:45.7704\n",
 | |
|       "[42]\tvalidation_0-rmse:45.4888\n",
 | |
|       "[43]\tvalidation_0-rmse:44.8847\n",
 | |
|       "[44]\tvalidation_0-rmse:44.5583\n",
 | |
|       "[45]\tvalidation_0-rmse:43.9202\n",
 | |
|       "[46]\tvalidation_0-rmse:43.7332\n",
 | |
|       "[47]\tvalidation_0-rmse:43.2122\n",
 | |
|       "[48]\tvalidation_0-rmse:43.0383\n",
 | |
|       "[49]\tvalidation_0-rmse:42.7427\n",
 | |
|       "I0226 23:48:06.457567 140238089848640 build-train-deploy.py:100] mean_absolute_error=33.15\n",
 | |
|       "I0226 23:48:06.494030 140238089848640 build-train-deploy.py:106] Model export success: mockup-model.dat\n",
 | |
|       "Best RMSE on eval: %.2f with %d rounds 42.742691 50\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "pod_spec = builder.generate_pod_spec()\n",
 | |
|     "train_deployer = job.job.Job(cleanup=False,\n",
 | |
|     "                             pod_spec_mutators=[\n",
 | |
|     "                             fairing.cloud.gcp.add_gcp_credentials_if_exists])\n",
 | |
|     "\n",
 | |
|     "# Add command line arguments\n",
 | |
|     "pod_spec.containers[0].command.extend([\"train\"])\n",
 | |
|     "result = train_deployer.deploy(pod_spec)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "* You can use kubectl to inspect the job that fairing created"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 16,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "apiVersion: v1\n",
 | |
|       "items:\n",
 | |
|       "- apiVersion: batch/v1\n",
 | |
|       "  kind: Job\n",
 | |
|       "  metadata:\n",
 | |
|       "    creationTimestamp: \"2020-02-26T23:47:21Z\"\n",
 | |
|       "    generateName: fairing-job-\n",
 | |
|       "    labels:\n",
 | |
|       "      fairing-deployer: job\n",
 | |
|       "      fairing-id: 54d568cc-58f2-11ea-964d-46fd3ccc57c5\n",
 | |
|       "    name: fairing-job-qwdlb\n",
 | |
|       "    namespace: zhenghui\n",
 | |
|       "    resourceVersion: \"11375571\"\n",
 | |
|       "    selfLink: /apis/batch/v1/namespaces/zhenghui/jobs/fairing-job-qwdlb\n",
 | |
|       "    uid: 54d8d81b-58f2-11ea-a99d-42010a8000ac\n",
 | |
|       "  spec:\n",
 | |
|       "    backoffLimit: 0\n",
 | |
|       "    completions: 1\n",
 | |
|       "    parallelism: 1\n",
 | |
|       "    selector:\n",
 | |
|       "      matchLabels:\n",
 | |
|       "        controller-uid: 54d8d81b-58f2-11ea-a99d-42010a8000ac\n",
 | |
|       "    template:\n",
 | |
|       "      metadata:\n",
 | |
|       "        annotations:\n",
 | |
|       "          sidecar.istio.io/inject: \"false\"\n",
 | |
|       "        creationTimestamp: null\n",
 | |
|       "        labels:\n",
 | |
|       "          controller-uid: 54d8d81b-58f2-11ea-a99d-42010a8000ac\n",
 | |
|       "          fairing-deployer: job\n",
 | |
|       "          fairing-id: 54d568cc-58f2-11ea-964d-46fd3ccc57c5\n",
 | |
|       "          job-name: fairing-job-qwdlb\n",
 | |
|       "        name: fairing-deployer\n",
 | |
|       "      spec:\n",
 | |
|       "        containers:\n",
 | |
|       "        - command:\n",
 | |
|       "          - python\n",
 | |
|       "          - /app/build-train-deploy.py\n",
 | |
|       "          - train\n",
 | |
|       "          env:\n",
 | |
|       "          - name: FAIRING_RUNTIME\n",
 | |
|       "            value: \"1\"\n",
 | |
|       "          image: gcr.io/kubeflow-ci/fairing-job/fairing-job:BDE79D77\n",
 | |
|       "          imagePullPolicy: IfNotPresent\n",
 | |
|       "          name: fairing-job\n",
 | |
|       "          resources: {}\n",
 | |
|       "          securityContext:\n",
 | |
|       "            runAsUser: 0\n",
 | |
|       "          terminationMessagePath: /dev/termination-log\n",
 | |
|       "          terminationMessagePolicy: File\n",
 | |
|       "          workingDir: /app/\n",
 | |
|       "        dnsPolicy: ClusterFirst\n",
 | |
|       "        restartPolicy: Never\n",
 | |
|       "        schedulerName: default-scheduler\n",
 | |
|       "        securityContext: {}\n",
 | |
|       "        serviceAccount: default-editor\n",
 | |
|       "        serviceAccountName: default-editor\n",
 | |
|       "        terminationGracePeriodSeconds: 30\n",
 | |
|       "  status:\n",
 | |
|       "    completionTime: \"2020-02-26T23:48:08Z\"\n",
 | |
|       "    conditions:\n",
 | |
|       "    - lastProbeTime: \"2020-02-26T23:48:08Z\"\n",
 | |
|       "      lastTransitionTime: \"2020-02-26T23:48:08Z\"\n",
 | |
|       "      status: \"True\"\n",
 | |
|       "      type: Complete\n",
 | |
|       "    startTime: \"2020-02-26T23:47:21Z\"\n",
 | |
|       "    succeeded: 1\n",
 | |
|       "kind: List\n",
 | |
|       "metadata:\n",
 | |
|       "  resourceVersion: \"\"\n",
 | |
|       "  selfLink: \"\"\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "!kubectl get jobs -l fairing-id={train_deployer.job_id} -o yaml"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Deploy the trained model to Kubeflow for predictions"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "* Now that you have trained a model you can use kubeflow fairing to deploy it on Kubernetes\n",
 | |
|     "* When you call deployer.deploy fairing will create a [Kubernetes Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) to serve your model\n",
 | |
|     "* Kubeflow fairing uses the docker image you created earlier\n",
 | |
|     "* The docker image you created contains your code and [Seldon core](https://www.seldon.io/)\n",
 | |
|     "* Kubeflow fairing uses Seldon to wrap your prediction code, ModelServe.predict, in a REST and gRPC server"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 17,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "Cluster endpoint: http://fairing-service-kkbtm.zhenghui.svc.cluster.local:5000/predict\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "from kubeflow.fairing.deployers import serving\n",
 | |
|     "pod_spec = builder.generate_pod_spec()\n",
 | |
|     "\n",
 | |
|     "module_name = os.path.splitext(preprocessor.executable.name)[0]\n",
 | |
|     "deployer = serving.serving.Serving(module_name + \".ModelServe\",\n",
 | |
|     "                                   service_type=\"ClusterIP\",\n",
 | |
|     "                                   labels={\"app\": \"mockup\"})\n",
 | |
|     "    \n",
 | |
|     "url = deployer.deploy(pod_spec)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "* You can use kubectl to inspect the deployment that fairing created"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 18,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "apiVersion: extensions/v1beta1\r\n",
 | |
|       "kind: Deployment\r\n",
 | |
|       "metadata:\r\n",
 | |
|       "  annotations:\r\n",
 | |
|       "    deployment.kubernetes.io/revision: \"1\"\r\n",
 | |
|       "  creationTimestamp: \"2020-02-26T23:48:12Z\"\r\n",
 | |
|       "  generateName: fairing-deployer-\r\n",
 | |
|       "  generation: 1\r\n",
 | |
|       "  labels:\r\n",
 | |
|       "    app: mockup\r\n",
 | |
|       "    fairing-deployer: serving\r\n",
 | |
|       "    fairing-id: 73532514-58f2-11ea-964d-46fd3ccc57c5\r\n",
 | |
|       "  name: fairing-deployer-p8xc9\r\n",
 | |
|       "  namespace: zhenghui\r\n",
 | |
|       "  resourceVersion: \"11375642\"\r\n",
 | |
|       "  selfLink: /apis/extensions/v1beta1/namespaces/zhenghui/deployments/fairing-deployer-p8xc9\r\n",
 | |
|       "  uid: 7354b5ec-58f2-11ea-a99d-42010a8000ac\r\n",
 | |
|       "spec:\r\n",
 | |
|       "  progressDeadlineSeconds: 600\r\n",
 | |
|       "  replicas: 1\r\n",
 | |
|       "  revisionHistoryLimit: 10\r\n",
 | |
|       "  selector:\r\n",
 | |
|       "    matchLabels:\r\n",
 | |
|       "      app: mockup\r\n",
 | |
|       "      fairing-deployer: serving\r\n",
 | |
|       "      fairing-id: 73532514-58f2-11ea-964d-46fd3ccc57c5\r\n",
 | |
|       "  strategy:\r\n",
 | |
|       "    rollingUpdate:\r\n",
 | |
|       "      maxSurge: 25%\r\n",
 | |
|       "      maxUnavailable: 25%\r\n",
 | |
|       "    type: RollingUpdate\r\n",
 | |
|       "  template:\r\n",
 | |
|       "    metadata:\r\n",
 | |
|       "      annotations:\r\n",
 | |
|       "        sidecar.istio.io/inject: \"false\"\r\n",
 | |
|       "      creationTimestamp: null\r\n",
 | |
|       "      labels:\r\n",
 | |
|       "        app: mockup\r\n",
 | |
|       "        fairing-deployer: serving\r\n",
 | |
|       "        fairing-id: 73532514-58f2-11ea-964d-46fd3ccc57c5\r\n",
 | |
|       "      name: fairing-deployer\r\n",
 | |
|       "    spec:\r\n",
 | |
|       "      containers:\r\n",
 | |
|       "      - command:\r\n",
 | |
|       "        - seldon-core-microservice\r\n",
 | |
|       "        - build-train-deploy.ModelServe\r\n",
 | |
|       "        - REST\r\n",
 | |
|       "        - --service-type=MODEL\r\n",
 | |
|       "        - --persistence=0\r\n",
 | |
|       "        env:\r\n",
 | |
|       "        - name: FAIRING_RUNTIME\r\n",
 | |
|       "          value: \"1\"\r\n",
 | |
|       "        image: gcr.io/kubeflow-ci/fairing-job/fairing-job:BDE79D77\r\n",
 | |
|       "        imagePullPolicy: IfNotPresent\r\n",
 | |
|       "        name: model\r\n",
 | |
|       "        resources: {}\r\n",
 | |
|       "        securityContext:\r\n",
 | |
|       "          runAsUser: 0\r\n",
 | |
|       "        terminationMessagePath: /dev/termination-log\r\n",
 | |
|       "        terminationMessagePolicy: File\r\n",
 | |
|       "        workingDir: /app/\r\n",
 | |
|       "      dnsPolicy: ClusterFirst\r\n",
 | |
|       "      restartPolicy: Always\r\n",
 | |
|       "      schedulerName: default-scheduler\r\n",
 | |
|       "      securityContext: {}\r\n",
 | |
|       "      terminationGracePeriodSeconds: 30\r\n",
 | |
|       "status:\r\n",
 | |
|       "  conditions:\r\n",
 | |
|       "  - lastTransitionTime: \"2020-02-26T23:48:12Z\"\r\n",
 | |
|       "    lastUpdateTime: \"2020-02-26T23:48:12Z\"\r\n",
 | |
|       "    message: Deployment does not have minimum availability.\r\n",
 | |
|       "    reason: MinimumReplicasUnavailable\r\n",
 | |
|       "    status: \"False\"\r\n",
 | |
|       "    type: Available\r\n",
 | |
|       "  - lastTransitionTime: \"2020-02-26T23:48:12Z\"\r\n",
 | |
|       "    lastUpdateTime: \"2020-02-26T23:48:13Z\"\r\n",
 | |
|       "    message: ReplicaSet \"fairing-deployer-p8xc9-854c699677\" is progressing.\r\n",
 | |
|       "    reason: ReplicaSetUpdated\r\n",
 | |
|       "    status: \"True\"\r\n",
 | |
|       "    type: Progressing\r\n",
 | |
|       "  observedGeneration: 1\r\n",
 | |
|       "  replicas: 1\r\n",
 | |
|       "  unavailableReplicas: 1\r\n",
 | |
|       "  updatedReplicas: 1\r\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "!kubectl get deploy -o yaml {deployer.deployment.metadata.name}"
 | |
|    ]
 | |
|   },
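|   {
|    "cell_type": "markdown",
|    "metadata": {},
|    "source": [
|     "* The `status` section above shows the deployment is still progressing (`unavailableReplicas: 1`)\n",
|     "* Before sending requests you may want to wait for it to become available; for example, using the same name templating as the cell above:\n",
|     "\n",
|     "```python\n",
|     "# Block until the Deployment's rollout completes (or kubectl gives up)\n",
|     "!kubectl rollout status deploy/{deployer.deployment.metadata.name}\n",
|     "```"
|    ]
|   },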
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Send an inference request to the prediction server\n",
 | |
|     "\n",
 | |
|     "* Now that you have deployed the model into your Kubernetes cluster, you can send a REST request to \n",
 | |
|     "  preform inference\n",
 | |
|     "* The code below reads some data, sends, a prediction request and then prints out the response"
 | |
|    ]
 | |
|   },
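|   {
|    "cell_type": "markdown",
|    "metadata": {},
|    "source": [
|     "* For background, the deployed endpoint speaks the Seldon Core REST prediction protocol\n",
|     "* The helper `util.predict_nparray` used below is assumed to wrap a request roughly like this minimal sketch (the exact payload handling and the predict route are assumptions, not the helper's actual implementation):\n",
|     "\n",
|     "```python\n",
|     "import requests\n",
|     "\n",
|     "def predict_sketch(url, X):\n",
|     "    # Seldon Core accepts a JSON payload with the features under data.ndarray\n",
|     "    payload = {\"data\": {\"ndarray\": X.tolist()}}\n",
|     "    # Assumes `url` points at the model's predict route\n",
|     "    return requests.post(url, json=payload)\n",
|     "```"
|    ]
|   },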
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 19,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "(train_X, train_y), (test_X, test_y) = read_synthetic_input()\n"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 20,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stdout",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "(b'{\"data\":{\"names\":[\"t:0\",\"t:1\"],\"tensor\":{\"shape\":[1,2],\"values\":[-49.2782592'\n",
 | |
|       " b'7734375,-54.25324630737305]}},\"meta\":{}}\\n')\n"
 | |
|      ]
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "result = util.predict_nparray(url, test_X)\n",
 | |
|     "pprint.pprint(result.content)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Clean up the prediction endpoint\n",
 | |
|     "\n",
 | |
|     "* You can use kubectl to delete the Kubernetes resources for your model\n",
 | |
|     "* If you want to delete the resources uncomment the following lines and run them"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 21,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "# !kubectl delete service -l app=ames\n",
 | |
|     "# !kubectl delete deploy -l app=ames"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Track Models and Artifacts\n",
 | |
|     "\n",
 | |
|     "* Using Kubeflow's metadata server you can track models and artifacts\n",
 | |
|     "* The ModelServe code was instrumented to log executions and outputs\n",
 | |
|     "* You can access Kubeflow's metadata UI by selecting **Artifact Store** from the central dashboard\n",
 | |
|     "  * See [here](https://www.kubeflow.org/docs/other-guides/accessing-uis/) for instructions on connecting to Kubeflow's UIs\n",
 | |
|     "* You can also use the python SDK to read and write entries\n",
 | |
|     "* This [notebook](https://github.com/kubeflow/metadata/blob/master/sdk/python/demo.ipynb) illustrates a bunch of metadata functionality"
 | |
|    ]
 | |
|   },
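|   {
|    "cell_type": "markdown",
|    "metadata": {},
|    "source": [
|     "* As a rough sketch of the write path, you can log a model yourself with the SDK\n",
|     "* This reuses the `create_workspace` helper defined earlier; the `metadata.Run`, `metadata.Execution`, and `metadata.Model` arguments below are assumptions modeled on the entries listed later in this section:\n",
|     "\n",
|     "```python\n",
|     "from datetime import datetime\n",
|     "from kubeflow.metadata import metadata\n",
|     "\n",
|     "ws = create_workspace()  # helper defined earlier in this notebook\n",
|     "run = metadata.Run(workspace=ws, name=\"manual-run-\" + datetime.utcnow().isoformat())\n",
|     "execution = metadata.Execution(name=\"manual-execution\", workspace=ws, run=run)\n",
|     "execution.log_output(metadata.Model(\n",
|     "    name=\"housing-price-model\",\n",
|     "    uri=\"mockup-model.dat\",\n",
|     "    model_type=\"linear_regression\",\n",
|     "    training_framework={\"name\": \"xgboost\", \"version\": \"0.9.0\"},\n",
|     "    hyperparameters={\"learning_rate\": 0.1, \"n_estimators\": 50}))\n",
|     "```"
|    ]
|   },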
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "### Create a workspace\n",
 | |
|     "\n",
 | |
|     "* Kubeflow metadata uses workspaces as a logical grouping for artifacts, executions, and datasets that belong together\n",
 | |
|     "* Earlier in the notebook we defined the function `create_workspace` to create a workspace for this example\n",
 | |
|     "* You can use that function to return a workspace object and then call list to see all the artifacts in that workspace"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 22,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "MetadataStore with gRPC connection initialized\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "data": {
 | |
|       "text/plain": [
 | |
|        "[{'id': 3,\n",
 | |
|        "  'workspace': 'xgboost-synthetic',\n",
 | |
|        "  'run': 'xgboost-synthetic-faring-run2020-02-26T23:26:36.443396',\n",
 | |
|        "  'version': '2020-02-26T23:26:36.660862',\n",
 | |
|        "  'owner': 'someone@kubeflow.org',\n",
 | |
|        "  'description': 'housing price prediction model using synthetic data',\n",
 | |
|        "  'name': 'housing-price-model',\n",
 | |
|        "  'model_type': 'linear_regression',\n",
 | |
|        "  'create_time': '2020-02-26T23:26:36.660887Z',\n",
 | |
|        "  'uri': 'mockup-model.dat',\n",
 | |
|        "  'training_framework': {'name': 'xgboost', 'version': '0.9.0'},\n",
 | |
|        "  'hyperparameters': {'learning_rate': 0.1, 'n_estimators': 50},\n",
 | |
|        "  'labels': None,\n",
 | |
|        "  'kwargs': {}},\n",
 | |
|        " {'id': 6,\n",
 | |
|        "  'workspace': 'xgboost-synthetic',\n",
 | |
|        "  'run': 'xgboost-synthetic-faring-run2020-02-26T23:27:11.144500',\n",
 | |
|        "  'create_time': '2020-02-26T23:27:11.458520Z',\n",
 | |
|        "  'version': '2020-02-26T23:27:11.458480',\n",
 | |
|        "  'owner': 'someone@kubeflow.org',\n",
 | |
|        "  'description': 'housing price prediction model using synthetic data',\n",
 | |
|        "  'name': 'housing-price-model',\n",
 | |
|        "  'model_type': 'linear_regression',\n",
 | |
|        "  'uri': 'mockup-model.dat',\n",
 | |
|        "  'training_framework': {'name': 'xgboost', 'version': '0.9.0'},\n",
 | |
|        "  'hyperparameters': {'learning_rate': 0.1, 'n_estimators': 50},\n",
 | |
|        "  'labels': None,\n",
 | |
|        "  'kwargs': {}},\n",
 | |
|        " {'id': 9,\n",
 | |
|        "  'workspace': 'xgboost-synthetic',\n",
 | |
|        "  'run': 'xgboost-synthetic-faring-run2020-02-26T23:30:04.636580',\n",
 | |
|        "  'create_time': '2020-02-26T23:30:04.866997Z',\n",
 | |
|        "  'version': '2020-02-26T23:30:04.866972',\n",
 | |
|        "  'owner': 'someone@kubeflow.org',\n",
 | |
|        "  'description': 'housing price prediction model using synthetic data',\n",
 | |
|        "  'name': 'housing-price-model',\n",
 | |
|        "  'model_type': 'linear_regression',\n",
 | |
|        "  'uri': 'mockup-model.dat',\n",
 | |
|        "  'training_framework': {'name': 'xgboost', 'version': '0.9.0'},\n",
 | |
|        "  'hyperparameters': {'learning_rate': 0.1, 'n_estimators': 50},\n",
 | |
|        "  'labels': None,\n",
 | |
|        "  'kwargs': {}},\n",
 | |
|        " {'id': 12,\n",
 | |
|        "  'workspace': 'xgboost-synthetic',\n",
 | |
|        "  'run': 'xgboost-synthetic-faring-run2020-02-26T23:44:47.344352',\n",
 | |
|        "  'create_time': '2020-02-26T23:44:47.585805Z',\n",
 | |
|        "  'version': '2020-02-26T23:44:47.585782',\n",
 | |
|        "  'owner': 'someone@kubeflow.org',\n",
 | |
|        "  'description': 'housing price prediction model using synthetic data',\n",
 | |
|        "  'name': 'housing-price-model',\n",
 | |
|        "  'model_type': 'linear_regression',\n",
 | |
|        "  'uri': 'mockup-model.dat',\n",
 | |
|        "  'training_framework': {'name': 'xgboost', 'version': '0.9.0'},\n",
 | |
|        "  'hyperparameters': {'learning_rate': 0.1, 'n_estimators': 50},\n",
 | |
|        "  'labels': None,\n",
 | |
|        "  'kwargs': {}},\n",
 | |
|        " {'id': 15,\n",
 | |
|        "  'workspace': 'xgboost-synthetic',\n",
 | |
|        "  'run': 'xgboost-synthetic-faring-run2020-02-26T23:48:06.287002',\n",
 | |
|        "  'version': '2020-02-26T23:48:06.495138',\n",
 | |
|        "  'owner': 'someone@kubeflow.org',\n",
 | |
|        "  'description': 'housing price prediction model using synthetic data',\n",
 | |
|        "  'name': 'housing-price-model',\n",
 | |
|        "  'model_type': 'linear_regression',\n",
 | |
|        "  'create_time': '2020-02-26T23:48:06.495166Z',\n",
 | |
|        "  'uri': 'mockup-model.dat',\n",
 | |
|        "  'training_framework': {'name': 'xgboost', 'version': '0.9.0'},\n",
 | |
|        "  'hyperparameters': {'learning_rate': 0.1, 'n_estimators': 50},\n",
 | |
|        "  'labels': None,\n",
 | |
|        "  'kwargs': {}}]"
 | |
|       ]
 | |
|      },
 | |
|      "execution_count": 22,
 | |
|      "metadata": {},
 | |
|      "output_type": "execute_result"
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "ws = create_workspace()\n",
 | |
|     "ws.list()"
 | |
|    ]
 | |
|   },
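|   {
|    "cell_type": "markdown",
|    "metadata": {},
|    "source": [
|     "* Because `ws.list()` returns a list of dictionaries, it can be easier to scan as a table\n",
|     "* For example (assuming pandas is available in the notebook image):\n",
|     "\n",
|     "```python\n",
|     "import pandas as pd\n",
|     "\n",
|     "models = pd.DataFrame(ws.list())          # one row per logged artifact\n",
|     "models[[\"id\", \"run\", \"version\", \"uri\"]]   # a few columns of interest\n",
|     "```"
|    ]
|   },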
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "## Create a pipeline to train your model\n",
 | |
|     "\n",
 | |
|     "* [Kubeflow pipelines](https://www.kubeflow.org/docs/pipelines/) makes it easy to define complex workflows to build and deploy models\n",
 | |
|     "* Below you will define and run a simple one step pipeline to train your model\n",
 | |
|     "* Kubeflow pipelines uses experiments to group different runs of a pipeline together\n",
 | |
|     "* So you start by defining a name for your experiement"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "#### Define the pipeline\n",
 | |
|     "\n",
 | |
|     "* To create a pipeline you create a function and decorate it with the `@dsl.pipeline` decorator\n",
 | |
|     "  * You use the decorator to give the pipeline a name and description\n",
 | |
|     "  \n",
 | |
|     "* Inside the function, each step in the function is defined by a ContainerOp that specifies\n",
 | |
|     "  a container to invoke\n",
 | |
|     " \n",
 | |
|     "* You will use the container image that you built earlier using Kubeflow Fairing\n",
 | |
|     "* Since the Kubeflow Fairing preprocessor added a main function using [python-fire](https://github.com/google/python-fire), a step in your pipeline can invocation any function in the ModelServe class just by setting the command for the container op\n",
 | |
|     "* See the pipelines [SDK reference](https://kubeflow-pipelines.readthedocs.io/en/latest/) for more information"
 | |
|    ]
 | |
|   },
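|   {
|    "cell_type": "markdown",
|    "metadata": {},
|    "source": [
|     "* For reference, the entrypoint that the preprocessor appends is roughly equivalent to the following sketch (not the exact generated code):\n",
|     "\n",
|     "```python\n",
|     "import fire\n",
|     "\n",
|     "if __name__ == \"__main__\":\n",
|     "    # python-fire turns the class into a CLI, so\n",
|     "    # `python build-train-deploy.py train` dispatches to ModelServe.train()\n",
|     "    fire.Fire(ModelServe)\n",
|     "```\n",
|     "\n",
|     "* That is why the pipeline step below only needs to set `command=[\"python\", preprocessor.executable.name, \"train\"]`"
|    ]
|   },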
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 23,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "@dsl.pipeline(\n",
 | |
|     "   name='Training pipeline',\n",
 | |
|     "   description='A pipeline that trains an xgboost model for the Ames dataset.'\n",
 | |
|     ")\n",
 | |
|     "def train_pipeline(\n",
 | |
|     "   ):      \n",
 | |
|     "    command=[\"python\", preprocessor.executable.name, \"train\"]\n",
 | |
|     "    train_op = dsl.ContainerOp(\n",
 | |
|     "            name=\"train\", \n",
 | |
|     "            image=builder.image_tag,        \n",
 | |
|     "            command=command,\n",
 | |
|     "            ).apply(\n",
 | |
|     "                gcp.use_gcp_secret('user-gcp-sa'),\n",
 | |
|     "            )\n",
 | |
|     "    train_op.container.working_dir = \"/app\""
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "#### Compile the pipeline"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "* Pipelines need to be compiled"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 24,
 | |
|    "metadata": {},
 | |
|    "outputs": [],
 | |
|    "source": [
 | |
|     "pipeline_func = train_pipeline\n",
 | |
|     "pipeline_filename = pipeline_func.__name__ + '.pipeline.zip'\n",
 | |
|     "compiler.Compiler().compile(pipeline_func, pipeline_filename)"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "#### Submit the pipeline for execution"
 | |
|    ]
 | |
|   },
 | |
|   {
 | |
|    "cell_type": "markdown",
 | |
|    "metadata": {},
 | |
|    "source": [
 | |
|     "* Pipelines groups runs using experiments\n",
 | |
|     "* So before you submit a pipeline you need to create an experiment or pick an existing experiment\n",
 | |
|     "* Once you have compiled a pipeline, you can use the pipelines SDK to submit that pipeline\n"
 | |
|    ]
 | |
|   },
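|   {
|    "cell_type": "markdown",
|    "metadata": {},
|    "source": [
|     "* If you also want the notebook to block until the run finishes, you can follow the submission with a wait call\n",
|     "* A minimal sketch, assuming the `kfp.Client.wait_for_run_completion` helper and the run id returned by `run_pipeline` in the next cell:\n",
|     "\n",
|     "```python\n",
|     "# Wait up to an hour for the submitted run to reach a terminal state\n",
|     "run_detail = client.wait_for_run_completion(run_result.id, timeout=3600)\n",
|     "print(run_detail.run.status)\n",
|     "```"
|    ]
|   },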
 | |
|   {
 | |
|    "cell_type": "code",
 | |
|    "execution_count": 25,
 | |
|    "metadata": {},
 | |
|    "outputs": [
 | |
|     {
 | |
|      "name": "stderr",
 | |
|      "output_type": "stream",
 | |
|      "text": [
 | |
|       "Creating experiment MockupModel.\n"
 | |
|      ]
 | |
|     },
 | |
|     {
 | |
|      "data": {
 | |
|       "text/html": [
 | |
|        "Experiment link <a href=\"/pipeline/#/experiments/details/8d173bc4-d2a7-4f88-9751-fc156bcf280b\" target=\"_blank\" >here</a>"
 | |
|       ],
 | |
|       "text/plain": [
 | |
|        "<IPython.core.display.HTML object>"
 | |
|       ]
 | |
|      },
 | |
|      "metadata": {},
 | |
|      "output_type": "display_data"
 | |
|     },
 | |
|     {
 | |
|      "data": {
 | |
|       "text/html": [
 | |
|        "Run link <a href=\"/pipeline/#/runs/details/105b69c8-dbf0-4ad4-8baa-e8eadfb38769\" target=\"_blank\" >here</a>"
 | |
|       ],
 | |
|       "text/plain": [
 | |
|        "<IPython.core.display.HTML object>"
 | |
|       ]
 | |
|      },
 | |
|      "metadata": {},
 | |
|      "output_type": "display_data"
 | |
|     }
 | |
|    ],
 | |
|    "source": [
 | |
|     "EXPERIMENT_NAME = 'MockupModel'\n",
 | |
|     "\n",
 | |
|     "#Specify pipeline argument values\n",
 | |
|     "arguments = {}\n",
 | |
|     "\n",
 | |
|     "# Get or create an experiment and submit a pipeline run\n",
 | |
|     "client = kfp.Client()\n",
 | |
|     "experiment = client.create_experiment(EXPERIMENT_NAME)\n",
 | |
|     "\n",
 | |
|     "#Submit a pipeline run\n",
 | |
|     "run_name = pipeline_func.__name__ + ' run'\n",
 | |
|     "run_result = client.run_pipeline(experiment.id, run_name, pipeline_filename, arguments)\n",
 | |
|     "\n",
 | |
|     "#vvvvvvvvv This link leads to the run information page. (Note: There is a bug in JupyterLab that modifies the URL and makes the link stop working)"
 | |
|    ]
 | |
|   }
 | |
|  ],
 | |
|  "metadata": {
 | |
|   "kernelspec": {
 | |
|    "display_name": "Python 3",
 | |
|    "language": "python",
 | |
|    "name": "python3"
 | |
|   },
 | |
|   "language_info": {
 | |
|    "codemirror_mode": {
 | |
|     "name": "ipython",
 | |
|     "version": 3
 | |
|    },
 | |
|    "file_extension": ".py",
 | |
|    "mimetype": "text/x-python",
 | |
|    "name": "python",
 | |
|    "nbconvert_exporter": "python",
 | |
|    "pygments_lexer": "ipython3",
 | |
|    "version": "3.6.9"
 | |
|   }
 | |
|  },
 | |
|  "nbformat": 4,
 | |
|  "nbformat_minor": 4
 | |
| }
 |