mirror of https://github.com/kubeflow/examples.git
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Train and deploy on Kubeflow from Notebooks\n",
"\n",
"This notebook shows you how to use Kubeflow to build, train, and deploy models on Kubernetes.\n",
"This notebook walks you through the following:\n",
"\n",
"* Building an XGBoost model inside a notebook\n",
"* Training the model inside the notebook\n",
"* Performing inference using the model inside the notebook\n",
"* Using Kubeflow Fairing to launch training jobs on Kubernetes\n",
"* Using Kubeflow Fairing to build and deploy a model using [Seldon Core](https://www.seldon.io/)\n",
"* Using [Kubeflow metadata](https://github.com/kubeflow/metadata) to record metadata about your models\n",
"* Using [Kubeflow Pipelines](https://www.kubeflow.org/docs/pipelines/) to build a pipeline to train your model\n",
"\n",
"## Prerequisites\n",
"\n",
"* This notebook assumes you are running inside Kubeflow 0.6 deployed on GKE following the [GKE instructions](https://www.kubeflow.org/docs/gke/deploy/)\n",
"* If you are running somewhere other than GKE you will need to modify the notebook to use a different docker registry, or else configure Kubeflow to work with GCR."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Verify we have a GCP account\n",
"\n",
"* The cell below checks that this notebook was spawned with credentials to access GCP"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from oauth2client.client import GoogleCredentials\n",
"credentials = GoogleCredentials.get_application_default()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install Required Libraries\n",
"\n",
"Import the libraries required to train this model."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"pip installing requirements.txt\n",
"pip installing KFP https://storage.googleapis.com/ml-pipeline/release/0.1.32/kfp.tar.gz\n",
"pip installing fairing git+git://github.com/kubeflow/fairing.git@7c93e888c3fc98bdf5fb0140e90f6407ce7a807b\n",
"Configure docker credentials\n"
]
}
],
"source": [
"import notebook_setup\n",
"notebook_setup.notebook_setup()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Import the Python libraries we will use\n",
"* We add the comment \"fairing:include-cell\" to tell the Kubeflow Fairing preprocessor to keep this cell when converting to Python code later"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# fairing:include-cell\n",
"import fire\n",
"import joblib\n",
"import logging\n",
"import nbconvert\n",
"import os\n",
"import pathlib\n",
"import sys\n",
"from pathlib import Path\n",
"import pandas as pd\n",
"import pprint\n",
"from sklearn.metrics import mean_absolute_error\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.impute import SimpleImputer\n",
"from xgboost import XGBRegressor\n",
"from importlib import reload\n",
"from sklearn.datasets import make_regression\n",
"from kubeflow.metadata import metadata\n",
"from kubeflow.metadata import openapi_client\n",
"from kubeflow.metadata.openapi_client import Configuration, ApiClient, MetadataServiceApi\n",
"from datetime import datetime\n",
"import retrying\n",
"import urllib3"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# Imports not to be included in the built docker image\n",
"import util\n",
"import kfp\n",
"import kfp.components as comp\n",
"import kfp.gcp as gcp\n",
"import kfp.dsl as dsl\n",
"import kfp.compiler as compiler\n",
"from kubernetes import client as k8s_client\n",
"from kubeflow import fairing\n",
"from kubeflow.fairing.builders import append\n",
"from kubeflow.fairing.deployers import job\n",
"from kubeflow.fairing.preprocessors.converted_notebook import ConvertNotebookPreprocessorWithFire\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Code to train and predict\n",
"\n",
"* In the cells below we define some functions to generate data and train a model\n",
"* These functions could just as easily be defined in a separate Python module"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# fairing:include-cell\n",
"def read_synthetic_input(test_size=0.25):\n",
"    \"\"\"Generate synthetic data and split it into train and test sets.\"\"\"\n",
"    # Generate a regression dataset.\n",
"    X, y = make_regression(n_samples=200, n_features=5, noise=0.1)\n",
"    train_X, test_X, train_y, test_y = train_test_split(X,\n",
"                                                        y,\n",
"                                                        test_size=test_size,\n",
"                                                        shuffle=False)\n",
"\n",
"    imputer = SimpleImputer()\n",
"    train_X = imputer.fit_transform(train_X)\n",
"    test_X = imputer.transform(test_X)\n",
"\n",
"    return (train_X, train_y), (test_X, test_y)\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# fairing:include-cell\n",
"def train_model(train_X,\n",
"                train_y,\n",
"                test_X,\n",
"                test_y,\n",
"                n_estimators,\n",
"                learning_rate):\n",
"    \"\"\"Train the model using XGBRegressor.\"\"\"\n",
"    model = XGBRegressor(n_estimators=n_estimators, learning_rate=learning_rate)\n",
"\n",
"    model.fit(train_X,\n",
"              train_y,\n",
"              early_stopping_rounds=40,\n",
"              eval_set=[(test_X, test_y)])\n",
"\n",
"    print(\"Best RMSE on eval: %.2f with %d rounds\" %\n",
"          (model.best_score, model.best_iteration + 1))\n",
"    return model\n",
"\n",
"def eval_model(model, test_X, test_y):\n",
"    \"\"\"Evaluate the model performance.\"\"\"\n",
"    predictions = model.predict(test_X)\n",
"    mae = mean_absolute_error(predictions, test_y)\n",
"    logging.info(\"mean_absolute_error=%.2f\", mae)\n",
"    return mae\n",
"\n",
"def save_model(model, model_file):\n",
"    \"\"\"Save XGBoost model for serving.\"\"\"\n",
"    joblib.dump(model, model_file)\n",
"    logging.info(\"Model export success: %s\", model_file)\n",
"\n",
"@retrying.retry(stop_max_delay=180000)\n",
"def wait_for_istio(address=\"metadata-service.kubeflow.svc.cluster.local:8080\"):\n",
"    \"\"\"Wait until we can connect to the metadata service.\n",
"\n",
"    When we launch a K8s pod we may not be able to connect to the metadata service immediately\n",
"    because the Istio sidecar hasn't started.\n",
"\n",
"    This function lets us wait up to stop_max_delay milliseconds for the service to become ready.\n",
"    \"\"\"\n",
"    config = Configuration()\n",
"    config.host = address\n",
"    api_client = ApiClient(config)\n",
"    client = MetadataServiceApi(api_client)\n",
"\n",
"    client.list_artifacts2()\n",
"\n",
"def create_workspace():\n",
"    return metadata.Workspace(\n",
"        # Connect to the metadata-service in the kubeflow namespace of the k8s cluster.\n",
"        backend_url_prefix=\"metadata-service.kubeflow.svc.cluster.local:8080\",\n",
"        name=\"xgboost-synthetic\",\n",
"        description=\"workspace for xgboost-synthetic artifacts and executions\")"
]
},
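{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below is an optional sanity check of the helpers above (a sketch added for illustration); the expected shapes follow from `make_regression(n_samples=200, n_features=5)` and the default `test_size=0.25`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch: verify the synthetic data split before training.\n",
"(sanity_train, sanity_test) = read_synthetic_input()\n",
"print(sanity_train[0].shape, sanity_test[0].shape)  # expected: (150, 5) (50, 5)"
]
},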
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Wrap Training and Prediction in a class\n",
"\n",
"* In the cell below we wrap training and prediction in a class\n",
"* A class provides the structure we will need to eventually use Kubeflow Fairing to launch separate training jobs and/or deploy the model on Kubernetes"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# fairing:include-cell\n",
"class ModelServe(object):\n",
"    def __init__(self, model_file=None):\n",
"        self.n_estimators = 50\n",
"        self.learning_rate = 0.1\n",
"        if not model_file:\n",
"            if \"MODEL_FILE\" in os.environ:\n",
"                print(\"model_file not supplied; checking environment variable\")\n",
"                model_file = os.getenv(\"MODEL_FILE\")\n",
"            else:\n",
"                print(\"model_file not supplied; using the default\")\n",
"                model_file = \"mockup-model.dat\"\n",
"\n",
"        self.model_file = model_file\n",
"        print(\"model_file={0}\".format(self.model_file))\n",
"\n",
"        self.model = None\n",
"        self._workspace = None\n",
"        self.exec = self.create_execution()\n",
"\n",
"    def train(self):\n",
"        (train_X, train_y), (test_X, test_y) = read_synthetic_input()\n",
"\n",
"        # Here we use Kubeflow's metadata library to record information\n",
"        # about the training run to Kubeflow's metadata store.\n",
"        self.exec.log_input(metadata.DataSet(\n",
"            description=\"xgboost synthetic data\",\n",
"            name=\"synthetic-data\",\n",
"            owner=\"someone@kubeflow.org\",\n",
"            uri=\"file://path/to/dataset\",\n",
"            version=\"v1.0.0\"))\n",
"\n",
"        model = train_model(train_X,\n",
"                            train_y,\n",
"                            test_X,\n",
"                            test_y,\n",
"                            self.n_estimators,\n",
"                            self.learning_rate)\n",
"\n",
"        mae = eval_model(model, test_X, test_y)\n",
"\n",
"        # Here we log metrics about the model to Kubeflow's metadata store.\n",
"        self.exec.log_output(metadata.Metrics(\n",
"            name=\"xgboost-synthetic-training-eval\",\n",
"            owner=\"someone@kubeflow.org\",\n",
"            description=\"training evaluation for xgboost synthetic\",\n",
"            uri=\"gcs://path/to/metrics\",\n",
"            metrics_type=metadata.Metrics.VALIDATION,\n",
"            values={\"mean_absolute_error\": mae}))\n",
"\n",
"        save_model(model, self.model_file)\n",
"        self.exec.log_output(metadata.Model(\n",
"            name=\"housing-price-model\",\n",
"            description=\"housing price prediction model using synthetic data\",\n",
"            owner=\"someone@kubeflow.org\",\n",
"            uri=self.model_file,\n",
"            model_type=\"linear_regression\",\n",
"            training_framework={\n",
"                \"name\": \"xgboost\",\n",
"                \"version\": \"0.9.0\"\n",
"            },\n",
"            hyperparameters={\n",
"                \"learning_rate\": self.learning_rate,\n",
"                \"n_estimators\": self.n_estimators\n",
"            },\n",
"            version=datetime.utcnow().isoformat(\"T\")))\n",
"\n",
"    def predict(self, X, feature_names):\n",
"        \"\"\"Predict using the model for a given ndarray.\n",
"\n",
"        The predict signature should match the syntax expected by Seldon Core\n",
"        https://github.com/SeldonIO/seldon-core so that we can use\n",
"        Seldon Core to wrap it in a model server and deploy it on Kubernetes\n",
"        \"\"\"\n",
"        if not self.model:\n",
"            self.model = joblib.load(self.model_file)\n",
"        # Do any preprocessing\n",
"        prediction = self.model.predict(data=X)\n",
"        # Do any postprocessing\n",
"        return [[prediction.item(0), prediction.item(1)]]\n",
"\n",
"    @property\n",
"    def workspace(self):\n",
"        if not self._workspace:\n",
"            wait_for_istio()\n",
"            self._workspace = create_workspace()\n",
"        return self._workspace\n",
"\n",
"    def create_execution(self):\n",
"        r = metadata.Run(\n",
"            workspace=self.workspace,\n",
"            name=\"xgboost-synthetic-fairing-run\" + datetime.utcnow().isoformat(\"T\"),\n",
"            description=\"a notebook run\")\n",
"\n",
"        return metadata.Execution(\n",
"            name=\"execution\" + datetime.utcnow().isoformat(\"T\"),\n",
"            workspace=self.workspace,\n",
"            run=r,\n",
"            description=\"execution for training xgboost-synthetic\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train your Model Locally\n",
"\n",
"* Train your model locally inside your notebook\n",
"* To train locally we just instantiate the ModelServe class and then call train"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"model_file=mockup-model.dat\n",
"[00:34:17] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n",
"[0]\tvalidation_0-rmse:162.856\n",
"Will train until validation_0-rmse hasn't improved in 40 rounds.\n",
"[1]\tvalidation_0-rmse:156.25\n",
"[2]\tvalidation_0-rmse:150.238\n",
"[3]\tvalidation_0-rmse:145.026\n",
"[4]\tvalidation_0-rmse:138.321\n",
"[5]\tvalidation_0-rmse:131.554\n",
"[6]\tvalidation_0-rmse:127.809\n",
"[7]\tvalidation_0-rmse:122.574\n",
"[8]\tvalidation_0-rmse:117.394\n",
"[9]\tvalidation_0-rmse:114.842\n",
"[10]\tvalidation_0-rmse:111.601\n",
"[11]\tvalidation_0-rmse:108.426\n",
"[12]\tvalidation_0-rmse:105.283\n",
"[13]\tvalidation_0-rmse:102.916\n",
"[14]\tvalidation_0-rmse:101.126\n",
"[15]\tvalidation_0-rmse:98.9049\n",
"[16]\tvalidation_0-rmse:96.6027\n",
"[17]\tvalidation_0-rmse:94.6449\n",
"[18]\tvalidation_0-rmse:92.7175\n",
"[19]\tvalidation_0-rmse:89.821\n",
"[20]\tvalidation_0-rmse:87.785\n",
"[21]\tvalidation_0-rmse:85.8316\n",
"[22]\tvalidation_0-rmse:84.7495\n",
"[23]\tvalidation_0-rmse:83.3638\n",
"[24]\tvalidation_0-rmse:81.9553\n",
"[25]\tvalidation_0-rmse:80.1649\n",
"[26]\tvalidation_0-rmse:79.2545\n",
"[27]\tvalidation_0-rmse:77.5626\n",
"[28]\tvalidation_0-rmse:75.979\n",
"[29]\tvalidation_0-rmse:74.6956\n",
"[30]\tvalidation_0-rmse:74.1145\n",
"[31]\tvalidation_0-rmse:73.102\n",
"[32]\tvalidation_0-rmse:71.9953\n",
"[33]\tvalidation_0-rmse:71.2614\n",
"[34]\tvalidation_0-rmse:70.4738\n",
"[35]\tvalidation_0-rmse:69.6975\n",
"[36]\tvalidation_0-rmse:69.0899\n",
"[37]\tvalidation_0-rmse:68.6369\n",
"[38]\tvalidation_0-rmse:67.6392\n",
"[39]\tvalidation_0-rmse:67.153\n",
"[40]\tvalidation_0-rmse:66.8115\n",
"[41]\tvalidation_0-rmse:66.2017\n",
"[42]\tvalidation_0-rmse:65.5889\n",
"[43]\tvalidation_0-rmse:64.793\n",
"[44]\tvalidation_0-rmse:64.2622\n",
"[45]\tvalidation_0-rmse:63.75\n",
"[46]\tvalidation_0-rmse:63.0683\n",
"[47]\tvalidation_0-rmse:62.5844\n",
"[48]\tvalidation_0-rmse:62.4817\n",
"[49]\tvalidation_0-rmse:61.9615\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"mean_absolute_error=47.50\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best RMSE on eval: %.2f with %d rounds 61.961517 50\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Model export success: mockup-model.dat\n"
]
}
],
"source": [
"model = ModelServe(model_file=\"mockup-model.dat\")\n",
"model.train()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Predict locally\n",
"\n",
"* Run prediction inside the notebook using the newly created model\n",
"* To run prediction we just invoke predict"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"model_file not supplied; using the default\n",
"model_file=mockup-model.dat\n",
"[00:34:17] WARNING: /workspace/src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.\n"
]
},
{
"data": {
"text/plain": [
"[[361.5152893066406, -99.92890930175781]]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(train_X, train_y), (test_X, test_y) = read_synthetic_input()\n",
"\n",
"ModelServe().predict(test_X, None)"
]
},
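{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once the model is wrapped by Seldon Core, clients invoke the same `predict` method over REST. The cell below is a minimal sketch of such a request; the service URL is a hypothetical placeholder for your actual Seldon deployment, so the call itself is left commented out."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"# Payload shape used by the Seldon Core REST protocol.\n",
"payload = {\"data\": {\"ndarray\": test_X[:2].tolist()}}\n",
"# response = requests.post(\n",
"#     \"http://<seldon-service>/api/v0.1/predictions\",  # hypothetical address\n",
"#     json=payload)\n",
"# print(response.json())"
]
},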
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use Kubeflow Fairing to Launch a K8s Job to train your model\n",
"\n",
"* Now that we have trained a model locally we can use Kubeflow Fairing to\n",
"  1. Launch a Kubernetes job to train the model\n",
"  1. Deploy the model on Kubernetes\n",
"* Launching a separate Kubernetes job to train the model has the following advantages\n",
"\n",
"  * You can leverage Kubernetes to run multiple training jobs in parallel\n",
"  * You can run long-running jobs without blocking your kernel"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure The Docker Registry For Kubeflow Fairing\n",
"\n",
"* In order to build docker images from your notebook you need a docker registry where the images will be stored\n",
"* Below you set some variables specifying a [GCR container registry](https://cloud.google.com/container-registry/docs/)\n",
"* Kubeflow Fairing provides a utility function to guess the name of your GCP project"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"# Setting up a Google Container Registry (GCR) for storing output containers\n",
"# You can use any docker container registry instead of GCR\n",
"GCP_PROJECT = fairing.cloud.gcp.guess_project_name()\n",
"DOCKER_REGISTRY = 'gcr.io/{}/fairing-job'.format(GCP_PROJECT)"
]
},
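{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick illustration, `DOCKER_REGISTRY` should end up looking like `gcr.io/<your-gcp-project>/fairing-job`; the cell below just prints it so you can confirm the guessed project is correct."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sanity check: the registry that fairing will push images to.\n",
"print(DOCKER_REGISTRY)"
]
},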
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use Kubeflow Fairing to build the docker image\n",
"\n",
"* First you will use Kubeflow Fairing's kaniko builder to build a docker image that includes all your dependencies\n",
"  * You use kaniko because you want to be able to run `pip` to install dependencies\n",
"  * Kaniko gives you the flexibility to build images from Dockerfiles\n",
"* Kaniko, however, can be slow\n",
"* So you will build a base image using Kaniko, and then every time your code changes you will just build an image\n",
"  starting from your base image and adding your code to it\n",
"* You use the Kubeflow Fairing builders to enable these fast rebuilds"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Converting build-train-deploy.ipynb to build-train-deploy.py\n",
"Creating entry point for the class name ModelServe\n"
]
},
{
"data": {
"text/plain": [
"[PosixPath('build-train-deploy.py'),\n",
" 'mockup-model.dat',\n",
" 'xgboost_util.py',\n",
" 'requirements.txt']"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from kubeflow.fairing.builders import cluster\n",
"preprocessor = ConvertNotebookPreprocessorWithFire(class_name='ModelServe', notebook_file='build-train-deploy.ipynb')\n",
"\n",
"if not preprocessor.input_files:\n",
"    preprocessor.input_files = set()\n",
"input_files = [\"xgboost_util.py\", \"mockup-model.dat\", \"requirements.txt\"]\n",
"preprocessor.input_files = set([os.path.normpath(f) for f in input_files])\n",
"preprocessor.preprocess()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Build the base image\n",
"\n",
"* You use cluster_builder to build the base image\n",
"* You only need to perform this step again if you change your Docker image or the dependencies you need to install\n",
"* ClusterBuilder takes as input the DockerImage to use as a base image\n",
"* You should use the same Jupyter image that you are using for your notebook server so that your environment will be\n",
"  the same when you launch Kubernetes jobs"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Building image using cluster builder.\n",
"Creating docker context: /tmp/fairing_context_ybqvdghn\n",
"Converting build-train-deploy.ipynb to build-train-deploy.py\n",
"Creating entry point for the class name ModelServe\n",
"Waiting for fairing-builder-ksmm7-gt427 to start...\n",
"Waiting for fairing-builder-ksmm7-gt427 to start...\n",
"Waiting for fairing-builder-ksmm7-gt427 to start...\n",
"Pod started running True\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"ERROR: logging before flag.Parse: E1025 01:42:23.499654 1 metadata.go:241] Failed to unmarshal scopes: invalid character 'h' looking for beginning of value\n",
"\u001b[36mINFO\u001b[0m[0002] Downloading base image gcr.io/kubeflow-images-public/tensorflow-1.13.1-notebook-cpu:v0.5.0\n",
"\u001b[36mINFO\u001b[0m[0002] Downloading base image gcr.io/kubeflow-images-public/tensorflow-1.13.1-notebook-cpu:v0.5.0\n",
"\u001b[33mWARN\u001b[0m[0002] Error while retrieving image from cache: getting image from path: open /cache/sha256:5aaccf0267f085afd976342a8e943a9c6cefccef5b554df4e15fa7bf15cbd7a3: no such file or directory\n",
"\u001b[36mINFO\u001b[0m[0002] Using files from context: [/kaniko/buildcontext/app/requirements.txt]\n",
"\u001b[36mINFO\u001b[0m[0002] Checking for cached layer gcr.io/jlewi-dev/fairing-job/fairing-job/cache:864fc6b813659edb48dd37b06d234c939c364db3e60df63a7de4e13b3174f933...\n",
"\u001b[36mINFO\u001b[0m[0002] No cached layer found for cmd RUN if [ -e requirements.txt ];then pip install --no-cache -r requirements.txt; fi\n",
"\u001b[36mINFO\u001b[0m[0002] Unpacking rootfs as cmd RUN if [ -e requirements.txt ];then pip install --no-cache -r requirements.txt; fi requires it.\n",
"\u001b[36mINFO\u001b[0m[0117] Taking snapshot of full filesystem...\n",
"\u001b[36mINFO\u001b[0m[0129] Skipping paths under /dev, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0129] Skipping paths under /etc/secrets, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0129] Skipping paths under /kaniko, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0130] Skipping paths under /proc, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0130] Skipping paths under /sys, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0131] Skipping paths under /var/run, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0202] WORKDIR /app/\n",
"\u001b[36mINFO\u001b[0m[0202] cmd: workdir\n",
"\u001b[36mINFO\u001b[0m[0202] Changed working directory to /app/\n",
"\u001b[36mINFO\u001b[0m[0202] Creating directory /app/\n",
"\u001b[36mINFO\u001b[0m[0202] Taking snapshot of files...\n",
"\u001b[36mINFO\u001b[0m[0202] ENV FAIRING_RUNTIME 1\n",
"\u001b[36mINFO\u001b[0m[0202] No files changed in this command, skipping snapshotting.\n",
"\u001b[36mINFO\u001b[0m[0202] Using files from context: [/kaniko/buildcontext/app/requirements.txt]\n",
"\u001b[36mINFO\u001b[0m[0203] COPY /app//requirements.txt /app/\n",
"\u001b[36mINFO\u001b[0m[0203] Taking snapshot of files...\n",
"\u001b[36mINFO\u001b[0m[0203] RUN if [ -e requirements.txt ];then pip install --no-cache -r requirements.txt; fi\n",
"\u001b[36mINFO\u001b[0m[0203] cmd: /bin/bash\n",
"\u001b[36mINFO\u001b[0m[0203] args: [-c if [ -e requirements.txt ];then pip install --no-cache -r requirements.txt; fi]\n",
"Collecting fire (from -r requirements.txt (line 1))\n",
"  Downloading https://files.pythonhosted.org/packages/d9/69/faeaae8687f4de0f5973694d02e9d6c3eb827636a009157352d98de1129e/fire-0.2.1.tar.gz (76kB)\n",
"Collecting gitpython (from -r requirements.txt (line 2))\n",
"  Downloading https://files.pythonhosted.org/packages/aa/25/9fd9f0b05408021736a22ae73f837152c132e4ea85cdd71d186e24efec31/GitPython-3.0.4-py3-none-any.whl (454kB)\n",
"Requirement already satisfied: google-cloud-storage in /opt/conda/lib/python3.6/site-packages (from -r requirements.txt (line 3)) (1.14.0)\n",
"Collecting joblib (from -r requirements.txt (line 4))\n",
"  Downloading https://files.pythonhosted.org/packages/8f/42/155696f85f344c066e17af287359c9786b436b1bf86029bb3411283274f3/joblib-0.14.0-py2.py3-none-any.whl (294kB)\n",
"Collecting kubeflow-metadata (from -r requirements.txt (line 5))\n",
"  Downloading https://files.pythonhosted.org/packages/43/b4/3fa3c1a88b8c52695b33acd09189dda8c84ea582acbfd07a1d46f085828c/kubeflow_metadata-0.2.0-py3-none-any.whl (69kB)\n",
"Requirement already satisfied: numpy in /opt/conda/lib/python3.6/site-packages (from -r requirements.txt (line 6)) (1.16.2)\n",
"Collecting pandas (from -r requirements.txt (line 7))\n",
"  Downloading https://files.pythonhosted.org/packages/86/12/08b092f6fc9e4c2552e37add0861d0e0e0d743f78f1318973caad970b3fc/pandas-0.25.2-cp36-cp36m-manylinux1_x86_64.whl (10.4MB)\n",
"Collecting retrying (from -r requirements.txt (line 8))\n",
"  Downloading https://files.pythonhosted.org/packages/44/ef/beae4b4ef80902f22e3af073397f079c96969c69b2c7d52a57ea9ae61c9d/retrying-1.3.3.tar.gz\n",
"Collecting seldon-core (from -r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/62/25/442db772bc1950864756de2b7cb9f23b0ae0d0997189f3e3eb56e84ea22f/seldon_core-0.4.1-py3-none-any.whl (45kB)\n",
"Collecting sklearn (from -r requirements.txt (line 10))\n",
"  Downloading https://files.pythonhosted.org/packages/1e/7a/dbb3be0ce9bd5c8b7e3d87328e79063f8b263b2b1bfa4774cb1147bfcd3f/sklearn-0.0.tar.gz\n",
"Requirement already satisfied: xgboost in /opt/conda/lib/python3.6/site-packages (from -r requirements.txt (line 11)) (0.82)\n",
"Collecting tornado>=6.0.3 (from -r requirements.txt (line 12))\n",
"  Downloading https://files.pythonhosted.org/packages/30/78/2d2823598496127b21423baffaa186b668f73cd91887fcef78b6eade136b/tornado-6.0.3.tar.gz (482kB)\n",
"Requirement already satisfied: six in /opt/conda/lib/python3.6/site-packages (from fire->-r requirements.txt (line 1)) (1.12.0)\n",
"Requirement already satisfied: termcolor in /opt/conda/lib/python3.6/site-packages (from fire->-r requirements.txt (line 1)) (1.1.0)\n",
"Collecting gitdb2>=2.0.0 (from gitpython->-r requirements.txt (line 2))\n",
"  Downloading https://files.pythonhosted.org/packages/03/6c/99296f89bad2ef85626e1df9f677acbee8885bb043ad82ad3ed4746d2325/gitdb2-2.0.6-py2.py3-none-any.whl (63kB)\n",
"Requirement already satisfied: google-resumable-media>=0.3.1 in /opt/conda/lib/python3.6/site-packages (from google-cloud-storage->-r requirements.txt (line 3)) (0.3.2)\n",
"Requirement already satisfied: google-cloud-core<0.30dev,>=0.29.0 in /opt/conda/lib/python3.6/site-packages (from google-cloud-storage->-r requirements.txt (line 3)) (0.29.1)\n",
"Requirement already satisfied: google-api-core<2.0.0dev,>=1.6.0 in /opt/conda/lib/python3.6/site-packages (from google-cloud-storage->-r requirements.txt (line 3)) (1.9.0)\n",
"Requirement already satisfied: pytz>=2017.2 in /opt/conda/lib/python3.6/site-packages (from pandas->-r requirements.txt (line 7)) (2018.9)\n",
"Requirement already satisfied: python-dateutil>=2.6.1 in /opt/conda/lib/python3.6/site-packages (from pandas->-r requirements.txt (line 7)) (2.8.0)\n",
"Requirement already satisfied: grpcio in /opt/conda/lib/python3.6/site-packages (from seldon-core->-r requirements.txt (line 9)) (1.19.0)\n",
"Collecting Flask-OpenTracing==0.2.0 (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/1d/c4/0546b854a3f42af9ef959df9bd1108903698e175e7a07c057cdfaeeef718/Flask_OpenTracing-0.2.0-py2.py3-none-any.whl\n",
"Collecting flatbuffers (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/c9/84/adf5837f96c39990bc55afdfddf460b38b4562f50341359afa32e4a98de7/flatbuffers-1.11-py2.py3-none-any.whl\n",
"Collecting minio>=4.0.9 (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/ba/17/6084f63de9bd7c6d47b5aab719d6246c01d74d4aaad373e0142a666080cc/minio-5.0.1-py2.py3-none-any.whl (62kB)\n",
"Requirement already satisfied: requests in /opt/conda/lib/python3.6/site-packages (from seldon-core->-r requirements.txt (line 9)) (2.21.0)\n",
"Collecting flask-cors (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/78/38/e68b11daa5d613e3a91e4bf3da76c94ac9ee0d9cd515af9c1ab80d36f709/Flask_Cors-3.0.8-py2.py3-none-any.whl\n",
"Requirement already satisfied: pyyaml in /opt/conda/lib/python3.6/site-packages (from seldon-core->-r requirements.txt (line 9)) (5.1)\n",
"Requirement already satisfied: protobuf in /opt/conda/lib/python3.6/site-packages (from seldon-core->-r requirements.txt (line 9)) (3.7.1)\n",
"Collecting opentracing<2,>=1.2.2 (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/06/c2/90b35a1abdc639a5c6000d8202c70217d60e80f5b328870efb73fda71115/opentracing-1.3.0.tar.gz\n",
"Collecting flask (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/9b/93/628509b8d5dc749656a9641f4caf13540e2cdec85276964ff8f43bbb1d3b/Flask-1.1.1-py2.py3-none-any.whl (94kB)\n",
"Collecting grpcio-opentracing (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/db/82/2fcad380697c3dab25de76ee590bcab3eb9bbfb4add916044d7e83ec2b10/grpcio_opentracing-1.1.4-py3-none-any.whl\n",
"Requirement already satisfied: tensorflow in /opt/conda/lib/python3.6/site-packages (from seldon-core->-r requirements.txt (line 9)) (1.13.1)\n",
"Collecting gunicorn>=19.9.0 (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/8c/da/b8dd8deb741bff556db53902d4706774c8e1e67265f69528c14c003644e6/gunicorn-19.9.0-py2.py3-none-any.whl (112kB)\n",
"Collecting jaeger-client==3.13.0 (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/c8/a2/e9bd04cd660cbdffe0598173be068be23099fbd68e7a4a89b74440509130/jaeger-client-3.13.0.tar.gz (77kB)\n",
"Collecting azure-storage-blob>=2.0.1 (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/3e/84/610f379b46d7d3c2d48eadeed6a12b6d46a43100fea70534f5992d0ac996/azure_storage_blob-2.1.0-py2.py3-none-any.whl (88kB)\n",
"Collecting redis (from seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/32/ae/28613a62eea0d53d3db3147f8715f90da07667e99baeedf1010eb400f8c0/redis-3.3.11-py2.py3-none-any.whl (66kB)\n",
"Collecting scikit-learn (from sklearn->-r requirements.txt (line 10))\n",
"  Downloading https://files.pythonhosted.org/packages/a0/c5/d2238762d780dde84a20b8c761f563fe882b88c5a5fb03c056547c442a19/scikit_learn-0.21.3-cp36-cp36m-manylinux1_x86_64.whl (6.7MB)\n",
"Requirement already satisfied: scipy in /opt/conda/lib/python3.6/site-packages (from xgboost->-r requirements.txt (line 11)) (1.2.1)\n",
"Collecting smmap2>=2.0.0 (from gitdb2>=2.0.0->gitpython->-r requirements.txt (line 2))\n",
"  Downloading https://files.pythonhosted.org/packages/55/d2/866d45e3a121ee15a1dc013824d58072fd5c7799c9c34d01378eb262ca8f/smmap2-2.0.5-py2.py3-none-any.whl\n",
"Requirement already satisfied: googleapis-common-protos!=1.5.4,<2.0dev,>=1.5.3 in /opt/conda/lib/python3.6/site-packages (from google-api-core<2.0.0dev,>=1.6.0->google-cloud-storage->-r requirements.txt (line 3)) (1.5.9)\n",
"Requirement already satisfied: google-auth<2.0dev,>=0.4.0 in /opt/conda/lib/python3.6/site-packages (from google-api-core<2.0.0dev,>=1.6.0->google-cloud-storage->-r requirements.txt (line 3)) (1.6.3)\n",
"Requirement already satisfied: setuptools>=34.0.0 in /opt/conda/lib/python3.6/site-packages (from google-api-core<2.0.0dev,>=1.6.0->google-cloud-storage->-r requirements.txt (line 3)) (40.9.0)\n",
"Requirement already satisfied: future in /opt/conda/lib/python3.6/site-packages (from minio>=4.0.9->seldon-core->-r requirements.txt (line 9)) (0.17.1)\n",
"Requirement already satisfied: urllib3 in /opt/conda/lib/python3.6/site-packages (from minio>=4.0.9->seldon-core->-r requirements.txt (line 9)) (1.24.1)\n",
"Requirement already satisfied: certifi in /opt/conda/lib/python3.6/site-packages (from minio>=4.0.9->seldon-core->-r requirements.txt (line 9)) (2019.3.9)\n",
"Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/lib/python3.6/site-packages (from requests->seldon-core->-r requirements.txt (line 9)) (2.8)\n",
"Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/lib/python3.6/site-packages (from requests->seldon-core->-r requirements.txt (line 9)) (3.0.4)\n",
"Collecting Jinja2>=2.10.1 (from flask->seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/65/e0/eb35e762802015cab1ccee04e8a277b03f1d8e53da3ec3106882ec42558b/Jinja2-2.10.3-py2.py3-none-any.whl (125kB)\n",
"Requirement already satisfied: Werkzeug>=0.15 in /opt/conda/lib/python3.6/site-packages (from flask->seldon-core->-r requirements.txt (line 9)) (0.15.2)\n",
"Collecting itsdangerous>=0.24 (from flask->seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/76/ae/44b03b253d6fade317f32c24d100b3b35c2239807046a4c953c7b89fa49e/itsdangerous-1.1.0-py2.py3-none-any.whl\n",
"Collecting click>=5.1 (from flask->seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/fa/37/45185cb5abbc30d7257104c434fe0b07e5a195a6847506c074527aa599ec/Click-7.0-py2.py3-none-any.whl (81kB)\n",
"Requirement already satisfied: tensorboard<1.14.0,>=1.13.0 in /opt/conda/lib/python3.6/site-packages (from tensorflow->seldon-core->-r requirements.txt (line 9)) (1.13.1)\n",
"Requirement already satisfied: keras-applications>=1.0.6 in /opt/conda/lib/python3.6/site-packages (from tensorflow->seldon-core->-r requirements.txt (line 9)) (1.0.7)\n",
"Requirement already satisfied: astor>=0.6.0 in /opt/conda/lib/python3.6/site-packages (from tensorflow->seldon-core->-r requirements.txt (line 9)) (0.7.1)\n",
"Requirement already satisfied: tensorflow-estimator<1.14.0rc0,>=1.13.0 in /opt/conda/lib/python3.6/site-packages (from tensorflow->seldon-core->-r requirements.txt (line 9)) (1.13.0)\n",
"Requirement already satisfied: gast>=0.2.0 in /opt/conda/lib/python3.6/site-packages (from tensorflow->seldon-core->-r requirements.txt (line 9)) (0.2.2)\n",
"Requirement already satisfied: keras-preprocessing>=1.0.5 in /opt/conda/lib/python3.6/site-packages (from tensorflow->seldon-core->-r requirements.txt (line 9)) (1.0.9)\n",
"Requirement already satisfied: wheel>=0.26 in /opt/conda/lib/python3.6/site-packages (from tensorflow->seldon-core->-r requirements.txt (line 9)) (0.33.1)\n",
"Requirement already satisfied: absl-py>=0.1.6 in /opt/conda/lib/python3.6/site-packages (from tensorflow->seldon-core->-r requirements.txt (line 9)) (0.7.1)\n",
"Collecting threadloop<2,>=1 (from jaeger-client==3.13.0->seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/d3/1d/8398c1645b97dc008d3c658e04beda01ede3d90943d40c8d56863cf891bd/threadloop-1.0.2.tar.gz\n",
"Collecting thrift (from jaeger-client==3.13.0->seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/c6/b4/510617906f8e0c5660e7d96fbc5585113f83ad547a3989b80297ac72a74c/thrift-0.11.0.tar.gz (52kB)\n",
"Collecting azure-common>=1.1.5 (from azure-storage-blob>=2.0.1->seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/00/55/a703923c12cd3172d5c007beda0c1a34342a17a6a72779f8a7c269af0cd6/azure_common-1.1.23-py2.py3-none-any.whl\n",
"Collecting azure-storage-common~=2.1 (from azure-storage-blob>=2.0.1->seldon-core->-r requirements.txt (line 9))\n",
"  Downloading https://files.pythonhosted.org/packages/6b/a0/6794b318ce0118d1a4053bdf0149a60807407db9b710354f2b203c2f5975/azure_storage_common-2.1.0-py2.py3-none-any.whl (47kB)\n",
"Requirement already satisfied: cachetools>=2.0.0 in /opt/conda/lib/python3.6/site-packages (from google-auth<2.0dev,>=0.4.0->google-api-core<2.0.0dev,>=1.6.0->google-cloud-storage->-r requirements.txt (line 3)) (3.1.0)\n",
"Requirement already satisfied: pyasn1-modules>=0.2.1 in /opt/conda/lib/python3.6/site-packages (from google-auth<2.0dev,>=0.4.0->google-api-core<2.0.0dev,>=1.6.0->google-cloud-storage->-r requirements.txt (line 3)) (0.2.4)\n",
"Requirement already satisfied: rsa>=3.1.4 in /opt/conda/lib/python3.6/site-packages (from google-auth<2.0dev,>=0.4.0->google-api-core<2.0.0dev,>=1.6.0->google-cloud-storage->-r requirements.txt (line 3)) (4.0)\n",
"Requirement already satisfied: MarkupSafe>=0.23 in /opt/conda/lib/python3.6/site-packages (from Jinja2>=2.10.1->flask->seldon-core->-r requirements.txt (line 9)) (1.1.1)\n",
"Requirement already satisfied: markdown>=2.6.8 in /opt/conda/lib/python3.6/site-packages (from tensorboard<1.14.0,>=1.13.0->tensorflow->seldon-core->-r requirements.txt (line 9)) (3.1)\n",
"Requirement already satisfied: h5py in /opt/conda/lib/python3.6/site-packages (from keras-applications>=1.0.6->tensorflow->seldon-core->-r requirements.txt (line 9)) (2.9.0)\n",
"Requirement already satisfied: mock>=2.0.0 in /opt/conda/lib/python3.6/site-packages (from tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow->seldon-core->-r requirements.txt (line 9)) (2.0.0)\n",
"Requirement already satisfied: cryptography in /opt/conda/lib/python3.6/site-packages (from azure-storage-common~=2.1->azure-storage-blob>=2.0.1->seldon-core->-r requirements.txt (line 9)) (2.6.1)\n",
"Requirement already satisfied: pyasn1<0.5.0,>=0.4.1 in /opt/conda/lib/python3.6/site-packages (from pyasn1-modules>=0.2.1->google-auth<2.0dev,>=0.4.0->google-api-core<2.0.0dev,>=1.6.0->google-cloud-storage->-r requirements.txt (line 3)) (0.4.5)\n",
"Requirement already satisfied: pbr>=0.11 in /opt/conda/lib/python3.6/site-packages (from mock>=2.0.0->tensorflow-estimator<1.14.0rc0,>=1.13.0->tensorflow->seldon-core->-r requirements.txt (line 9)) (5.1.3)\n",
"Requirement already satisfied: cffi!=1.11.3,>=1.8 in /opt/conda/lib/python3.6/site-packages (from cryptography->azure-storage-common~=2.1->azure-storage-blob>=2.0.1->seldon-core->-r requirements.txt (line 9)) (1.12.2)\n",
"Requirement already satisfied: asn1crypto>=0.21.0 in /opt/conda/lib/python3.6/site-packages (from cryptography->azure-storage-common~=2.1->azure-storage-blob>=2.0.1->seldon-core->-r requirements.txt (line 9)) (0.24.0)\n",
"Requirement already satisfied: pycparser in /opt/conda/lib/python3.6/site-packages (from cffi!=1.11.3,>=1.8->cryptography->azure-storage-common~=2.1->azure-storage-blob>=2.0.1->seldon-core->-r requirements.txt (line 9)) (2.19)\n",
"fairing 0.5 has requirement tornado<6.0.0,>=5.1.1, but you'll have tornado 6.0.3 which is incompatible.\n",
"jaeger-client 3.13.0 has requirement tornado<5,>=4.3, but you'll have tornado 6.0.3 which is incompatible.\n",
"seldon-core 0.4.1 has requirement google-cloud-storage>=1.16.0, but you'll have google-cloud-storage 1.14.0 which is incompatible.\n",
"Installing collected packages: fire, smmap2, gitdb2, gitpython, joblib, kubeflow-metadata, pandas, retrying, opentracing, Jinja2, itsdangerous, click, flask, Flask-OpenTracing, flatbuffers, minio, flask-cors, grpcio-opentracing, gunicorn, tornado, threadloop, thrift, jaeger-client, azure-common, azure-storage-common, azure-storage-blob, redis, seldon-core, scikit-learn, sklearn\n",
"  Running setup.py install for fire: started\n",
"    Running setup.py install for fire: finished with status 'done'\n",
"  Running setup.py install for retrying: started\n",
"    Running setup.py install for retrying: finished with status 'done'\n",
"  Running setup.py install for opentracing: started\n",
"    Running setup.py install for opentracing: finished with status 'done'\n",
"  Found existing installation: Jinja2 2.10\n",
"    Uninstalling Jinja2-2.10:\n",
"      Successfully uninstalled Jinja2-2.10\n",
"  Found existing installation: tornado 5.1.1\n",
"    Uninstalling tornado-5.1.1:\n",
"      Successfully uninstalled tornado-5.1.1\n",
"  Running setup.py install for tornado: started\n",
"    Running setup.py install for tornado: finished with status 'done'\n",
"  Running setup.py install for threadloop: started\n",
"    Running setup.py install for threadloop: finished with status 'done'\n",
"  Running setup.py install for thrift: started\n",
"    Running setup.py install for thrift: finished with status 'done'\n",
"  Running setup.py install for jaeger-client: started\n",
"    Running setup.py install for jaeger-client: finished with status 'done'\n",
"  Running setup.py install for sklearn: started\n",
"    Running setup.py install for sklearn: finished with status 'done'\n",
"Successfully installed Flask-OpenTracing-0.2.0 Jinja2-2.10.3 azure-common-1.1.23 azure-storage-blob-2.1.0 azure-storage-common-2.1.0 click-7.0 fire-0.2.1 flask-1.1.1 flask-cors-3.0.8 flatbuffers-1.11 gitdb2-2.0.6 gitpython-3.0.4 grpcio-opentracing-1.1.4 gunicorn-19.9.0 itsdangerous-1.1.0 jaeger-client-3.13.0 joblib-0.14.0 kubeflow-metadata-0.2.0 minio-5.0.1 opentracing-1.3.0 pandas-0.25.2 redis-3.3.11 retrying-1.3.3 scikit-learn-0.21.3 seldon-core-0.4.1 sklearn-0.0 smmap2-2.0.5 threadloop-1.0.2 thrift-0.11.0 tornado-6.0.3\n",
"You are using pip version 19.0.1, however version 19.3.1 is available.\n",
"You should consider upgrading via the 'pip install --upgrade pip' command.\n",
"\u001b[36mINFO\u001b[0m[0240] Taking snapshot of full filesystem...\n",
"\u001b[36mINFO\u001b[0m[0241] Skipping paths under /dev, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0241] Skipping paths under /etc/secrets, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0241] Skipping paths under /kaniko, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0242] Skipping paths under /proc, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0242] Skipping paths under /sys, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0243] Skipping paths under /var/run, as it is a whitelisted directory\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/platform/__pycache__/kqueue.cpython-36.pyc\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado-5.1.1-py3.6.egg-info\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/stack_context.py\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/Jinja2-2.10.dist-info\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/test/stack_context_test.py\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/platform/kqueue.py\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/platform/epoll.py\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/platform/__pycache__/select.cpython-36.pyc\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/platform/common.py\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/platform/__pycache__/common.cpython-36.pyc\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/__pycache__/stack_context.cpython-36.pyc\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/test/__pycache__/stack_context_test.cpython-36.pyc\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/platform/__pycache__/epoll.cpython-36.pyc\n",
"\u001b[36mINFO\u001b[0m[0243] Adding whiteout for /opt/conda/lib/python3.6/site-packages/tornado/platform/select.py\n",
"\u001b[36mINFO\u001b[0m[0277] Using files from context: [/kaniko/buildcontext/app]\n",
"\u001b[36mINFO\u001b[0m[0277] Pushing layer gcr.io/jlewi-dev/fairing-job/fairing-job/cache:864fc6b813659edb48dd37b06d234c939c364db3e60df63a7de4e13b3174f933 to cache now\n",
"\u001b[36mINFO\u001b[0m[0277] COPY /app/ /app/\n",
"\u001b[36mINFO\u001b[0m[0277] Taking snapshot of files...\n",
"2019/10/25 01:47:01 pushed blob sha256:671fd5dc4379ffdf4694c30fd98b8b6bae9213cdff0939b936debf0f22f78708\n",
"2019/10/25 01:47:05 pushed blob sha256:8da7bddc0c459ae3160be07163f4012ef7befef6ae05c198bead57633e46e770\n",
"2019/10/25 01:47:05 gcr.io/jlewi-dev/fairing-job/fairing-job/cache:864fc6b813659edb48dd37b06d234c939c364db3e60df63a7de4e13b3174f933: digest: sha256:2339a62186a93347f3bb9bc85456045d45dc9152793ccc5164210b58aab5512b size: 429\n",
"2019/10/25 01:47:05 existing blob: sha256:9ad0c8331ed7f0f76b54d8e91e66661a3ca35e02a25cc83ccb48d51fa89e5573\n",
"2019/10/25 01:47:05 existing blob: sha256:ff51e784988b3a953df5d6ba36b982436c2b16a77eb081ce7a589ca67d04144c\n",
"2019/10/25 01:47:05 existing blob: sha256:969fc9c5501e60432ca0bc4b635493feb2f90e14822d2f3e3f79742fed96757d\n",
"2019/10/25 01:47:05 existing blob: sha256:432f7fba907384de9a5c1c23aed93fa3eff7d6a8d89a91f5eab99f41aa889323\n",
"2019/10/25 01:47:05 existing blob: sha256:8485e620dff15e8a69076ac02f6b23ffb3408161cdc2c0572905838765a84854\n",
"2019/10/25 01:47:05 existing blob: sha256:398d32b153e84fe343f0c5b07d65e89b05551aae6cb8b3a03bb2b662976eb3b8\n",
"2019/10/25 01:47:05 existing blob: sha256:47956fc6abae87d70180bc4f0efdad014b8e2a3b617a447ac01f674336737dfc\n",
"2019/10/25 01:47:05 existing blob: sha256:8da7bddc0c459ae3160be07163f4012ef7befef6ae05c198bead57633e46e770\n",
"2019/10/25 01:47:05 existing blob: sha256:59951887a0c1d1a227f43219b3bc84562a6f2a7e0ab5c276fbd9eaba6ebec02d\n",
"2019/10/25 01:47:05 existing blob: sha256:bd5e67bf2947497b4a4347d2751797d6b3a40f0dc5d355185815ee6da1b8ae0c\n",
"2019/10/25 01:47:05 existing blob: sha256:124c757242f88002a858c23fc79f8262f9587fa30fd92507e586ad074afb42b6\n",
"2019/10/25 01:47:05 existing blob: sha256:167108358fe643eea57fc595ff9b76a1a7e09e022c84d724346ce5b41d0148bc\n",
"2019/10/25 01:47:05 existing blob: sha256:62228d5c51598033083adbf71e8ee3d8d523d7d6d8c9d789b8c8a2d71ca988ac\n",
"2019/10/25 01:47:05 existing blob: sha256:22ea01b3a354ebdcf4386e6d2f53b6cf65bd9cdcb34a70f32e00b90a477589d0\n",
"2019/10/25 01:47:05 existing blob: sha256:c451d20886c33c47dab7b01b05ece292ee5173a9a4aced925035401a6b1de62e\n",
"2019/10/25 01:47:05 existing blob: sha256:fa3f2f277e67c5cbbf1dac21dc27111a60d3cd2ef494d94aa1515d3319f2a245\n",
"2019/10/25 01:47:05 existing blob: sha256:547e89bdafacadd9655a394a9d73c49c9890233c0cd244cbc5b1cb859be1395c\n",
"2019/10/25 01:47:05 existing blob: sha256:afde35469481d2bc446d649a7a3d099147bbf7696b66333e76a411686b617ea1\n",
"2019/10/25 01:47:05 existing blob: sha256:9d866f8bde2a0d607a6d17edc0fbd5e00b58306efc2b0a57e0ba72f269e7c6be\n",
"2019/10/25 01:47:05 existing blob: sha256:86db56dbcdfc4e5ba205e00f3de178548dd0fcd3d1d9ec011747ca0bb08a8177\n",
"2019/10/25 01:47:05 existing blob: sha256:9ab35225e174496943b6a86bf62d004409479cf722ef1d3e01ca48afc8cfaa79\n",
"2019/10/25 01:47:05 existing blob: sha256:147c5bbff888fc9cddffd4078daa35bba0d1d6f6c7175a1acb144412a43b3fce\n",
"2019/10/25 01:47:07 pushed blob sha256:80d3506bc094600aada9dc076b44354b134277700f2420838db7b742c50533ed\n",
"2019/10/25 01:47:07 pushed blob sha256:2e67912c44ec0aadea8c990a4a8fc882e4655a798807840977b49b5a972eb47d\n",
"2019/10/25 01:47:07 pushed blob sha256:2f2b9c4bf759eaf2afb42e189cc50b21d4614d1892227349409d012a90355268\n",
"2019/10/25 01:47:07 pushed blob sha256:5831cf619d1fb5d7b9430a0943017516edf2d83451941d468c78479b73f65975\n",
"2019/10/25 01:47:08 gcr.io/jlewi-dev/fairing-job/fairing-job:A486B058: digest: sha256:bf1c54b7880b81f232c15f31a0af74a70550e2eedffd2c9ff289f32f4b8d85fa size: 4325\n"
]
}
],
"source": [
"# Use a stock jupyter image as our base image\n",
"# TODO(jlewi): Should we try to use the downward API to default to the image we are running in?\n",
"# TODO(https://github.com/kubeflow/fairing/issues/404): We need to fix 404\n",
"# before we can upgrade to the 0.7.0 image as the base image.\n",
"# We will need to use that to set the Dockerfile used by ClusterBuilder\n",
"# base_image = \"gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0\"\n",
"base_image = \"gcr.io/kubeflow-images-public/tensorflow-1.13.1-notebook-cpu:v0.5.0\"\n",
"cluster_builder = cluster.cluster.ClusterBuilder(registry=DOCKER_REGISTRY,\n",
"                                                 base_image=base_image,\n",
"                                                 preprocessor=preprocessor,\n",
"                                                 pod_spec_mutators=[fairing.cloud.gcp.add_gcp_credentials_if_exists],\n",
"                                                 context_source=cluster.gcs_context.GCSContextSource())\n",
"cluster_builder.build()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Build the actual image\n",
"\n",
"Here you use the append builder to add your code to the base image\n",
"\n",
"* Calling preprocessor.preprocess() converts your notebook file to a Python file\n",
"\n",
"  * You are using the [ConvertNotebookPreprocessorWithFire](https://github.com/kubeflow/fairing/blob/master/fairing/preprocessors/converted_notebook.py#L85)\n",
"  * This preprocessor converts ipynb files to py files by doing the following\n",
"    1. Removing all cells which don't have the comment `# fairing:include-cell`\n",
"    1. Using [python-fire](https://github.com/google/python-fire) to add entry points for the class specified in the constructor\n",
"\n",
"  * Calling preprocess() will create the file build-train-deploy.py\n",
"\n",
"* You use the AppendBuilder to rapidly build a new docker image by quickly adding some files to an existing docker image\n",
"  * The AppendBuilder is very fast, so it's convenient for rebuilding your images as you iterate on your code\n",
"  * The AppendBuilder will add the converted notebook, build-train-deploy.py, along with any files specified in `preprocessor.input_files` to `/app` in the newly created image"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Converting build-train-deploy.ipynb to build-train-deploy.py\n",
"Creating entry point for the class name ModelServe\n",
"Building image using Append builder...\n",
"Creating docker context: /tmp/fairing_context_41v9y1k9\n",
"Converting build-train-deploy.ipynb to build-train-deploy.py\n",
"Creating entry point for the class name ModelServe\n",
"build-train-deploy.py already exists in Fairing context, skipping...\n",
"Loading Docker credentials for repository 'gcr.io/jlewi-dev/fairing-job/fairing-job:A486B058'\n",
"Invoking 'docker-credential-gcloud' to obtain Docker credentials.\n",
"Successfully obtained Docker credentials.\n",
"Image successfully built in 2.0983306730049662s.\n",
"Pushing image gcr.io/jlewi-dev/fairing-job/fairing-job:7935B6A7...\n",
"Loading Docker credentials for repository 'gcr.io/jlewi-dev/fairing-job/fairing-job:7935B6A7'\n",
"Invoking 'docker-credential-gcloud' to obtain Docker credentials.\n",
"Successfully obtained Docker credentials.\n",
"Uploading gcr.io/jlewi-dev/fairing-job/fairing-job:7935B6A7\n",
"Layer sha256:80d3506bc094600aada9dc076b44354b134277700f2420838db7b742c50533ed exists, skipping\n",
"Layer sha256:8da7bddc0c459ae3160be07163f4012ef7befef6ae05c198bead57633e46e770 exists, skipping\n",
"Layer sha256:59951887a0c1d1a227f43219b3bc84562a6f2a7e0ab5c276fbd9eaba6ebec02d exists, skipping\n",
"Layer sha256:9d866f8bde2a0d607a6d17edc0fbd5e00b58306efc2b0a57e0ba72f269e7c6be exists, skipping\n",
"Layer sha256:62228d5c51598033083adbf71e8ee3d8d523d7d6d8c9d789b8c8a2d71ca988ac exists, skipping\n",
"Layer sha256:9ab35225e174496943b6a86bf62d004409479cf722ef1d3e01ca48afc8cfaa79 exists, skipping\n",
"Layer sha256:bd5e67bf2947497b4a4347d2751797d6b3a40f0dc5d355185815ee6da1b8ae0c exists, skipping\n",
"Layer sha256:5831cf619d1fb5d7b9430a0943017516edf2d83451941d468c78479b73f65975 exists, skipping\n",
"Layer sha256:8485e620dff15e8a69076ac02f6b23ffb3408161cdc2c0572905838765a84854 exists, skipping\n",
"Layer sha256:124c757242f88002a858c23fc79f8262f9587fa30fd92507e586ad074afb42b6 exists, skipping\n",
"Layer sha256:2f2b9c4bf759eaf2afb42e189cc50b21d4614d1892227349409d012a90355268 exists, skipping\n",
"Layer sha256:ff51e784988b3a953df5d6ba36b982436c2b16a77eb081ce7a589ca67d04144c exists, skipping\n",
"Layer sha256:167108358fe643eea57fc595ff9b76a1a7e09e022c84d724346ce5b41d0148bc exists, skipping\n",
"Layer sha256:432f7fba907384de9a5c1c23aed93fa3eff7d6a8d89a91f5eab99f41aa889323 exists, skipping\n",
"Layer sha256:afde35469481d2bc446d649a7a3d099147bbf7696b66333e76a411686b617ea1 exists, skipping\n",
"Layer sha256:969fc9c5501e60432ca0bc4b635493feb2f90e14822d2f3e3f79742fed96757d exists, skipping\n",
"Layer sha256:22ea01b3a354ebdcf4386e6d2f53b6cf65bd9cdcb34a70f32e00b90a477589d0 exists, skipping\n",
"Layer sha256:86db56dbcdfc4e5ba205e00f3de178548dd0fcd3d1d9ec011747ca0bb08a8177 exists, skipping\n",
"Layer sha256:c451d20886c33c47dab7b01b05ece292ee5173a9a4aced925035401a6b1de62e exists, skipping\n",
"Layer sha256:398d32b153e84fe343f0c5b07d65e89b05551aae6cb8b3a03bb2b662976eb3b8 exists, skipping\n",
"Layer sha256:47956fc6abae87d70180bc4f0efdad014b8e2a3b617a447ac01f674336737dfc exists, skipping\n",
"Layer sha256:9ad0c8331ed7f0f76b54d8e91e66661a3ca35e02a25cc83ccb48d51fa89e5573 exists, skipping\n",
"Layer sha256:fa3f2f277e67c5cbbf1dac21dc27111a60d3cd2ef494d94aa1515d3319f2a245 exists, skipping\n",
"Layer sha256:547e89bdafacadd9655a394a9d73c49c9890233c0cd244cbc5b1cb859be1395c exists, skipping\n",
"Layer sha256:147c5bbff888fc9cddffd4078daa35bba0d1d6f6c7175a1acb144412a43b3fce exists, skipping\n",
"Layer sha256:a4bc27d300aa1fec30a6da6b44b05c58052675425cb5b92e11cc081dec5af3aa pushed.\n",
"Layer sha256:f2b9523fe427b5599019aff069e474f3a7bcd829aeb084a174cd8610df588068 pushed.\n",
"Finished upload of: gcr.io/jlewi-dev/fairing-job/fairing-job:7935B6A7\n",
"Pushed image gcr.io/jlewi-dev/fairing-job/fairing-job:7935B6A7 in 3.5974774019996403s.\n"
]
}
],
"source": [
"preprocessor.preprocess()\n",
"\n",
"builder = append.append.AppendBuilder(registry=DOCKER_REGISTRY,\n",
"                                      base_image=cluster_builder.image_tag, preprocessor=preprocessor)\n",
"builder.build()\n"
]
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Launch the K8s Job\n",
|
|
"\n",
|
|
"* You can use kubeflow fairing to easily launch a [Kubernetes job](https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/) to invoke code\n",
|
|
"* You use fairings Kubernetes job library to build a Kubernetes job\n",
|
|
" * You use pod mutators to attach GCP credentials to the pod\n",
|
|
" * You can also use pod mutators to attch PVCs\n",
|
|
"* Since the [ConvertNotebookPreprocessorWithFire](https://github.com/kubeflow/fairing/blob/master/fairing/preprocessors/converted_notebook.py#L85) is using [python-fire](https://github.com/google/python-fire) you can easily invoke any method inside the ModelServe class just by configuring the command invoked by the Kubernetes job\n",
|
|
" * In the cell below you extend the command to include `train` as an argument because you want to invoke the train\n",
|
|
" function\n",
|
|
" \n",
|
|
"**Note** When you invoke train_deployer.deploy; kubeflow fairing will stream the logs from the Kubernetes job. The job will initially show some connection errors because the job will try to connect to the metadataserver. You can ignore these errors; the job will retry until its able to connect and then continue"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"The job fairing-job-qg87g launched.\n",
"Waiting for fairing-job-qg87g-chghc to start...\n",
"Waiting for fairing-job-qg87g-chghc to start...\n",
"Waiting for fairing-job-qg87g-chghc to start...\n",
"Pod started running True\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"model_file not supplied; using the default\n",
"model_file=mockup-model.dat\n",
"[0]\tvalidation_0-rmse:154.15\n",
"Will train until validation_0-rmse hasn't improved in 40 rounds.\n",
"[1]\tvalidation_0-rmse:147.275\n",
"[2]\tvalidation_0-rmse:140.414\n",
"[3]\tvalidation_0-rmse:135.407\n",
"[4]\tvalidation_0-rmse:131.662\n",
"[5]\tvalidation_0-rmse:127.103\n",
"[6]\tvalidation_0-rmse:123.558\n",
"[7]\tvalidation_0-rmse:118.619\n",
"[8]\tvalidation_0-rmse:115.743\n",
"[9]\tvalidation_0-rmse:112.866\n",
"[10]\tvalidation_0-rmse:110.533\n",
"[11]\tvalidation_0-rmse:108.57\n",
"[12]\tvalidation_0-rmse:107.407\n",
"[13]\tvalidation_0-rmse:104.548\n",
"[14]\tvalidation_0-rmse:102.625\n",
"[15]\tvalidation_0-rmse:100.668\n",
"[16]\tvalidation_0-rmse:99.4654\n",
"[17]\tvalidation_0-rmse:98.1461\n",
"[18]\tvalidation_0-rmse:96.71\n",
"[19]\tvalidation_0-rmse:95.4135\n",
"[20]\tvalidation_0-rmse:94.4105\n",
"[21]\tvalidation_0-rmse:92.6454\n",
"[22]\tvalidation_0-rmse:91.5752\n",
"[23]\tvalidation_0-rmse:90.4496\n",
"[24]\tvalidation_0-rmse:89.9257\n",
"[25]\tvalidation_0-rmse:88.8438\n",
"[26]\tvalidation_0-rmse:87.9895\n",
"[27]\tvalidation_0-rmse:86.42\n",
"[28]\tvalidation_0-rmse:85.2992\n",
"[29]\tvalidation_0-rmse:84.6414\n",
"[30]\tvalidation_0-rmse:84.3974\n",
"[31]\tvalidation_0-rmse:83.2113\n",
"[32]\tvalidation_0-rmse:82.5043\n",
"[33]\tvalidation_0-rmse:81.3713\n",
"[34]\tvalidation_0-rmse:81.2969\n",
"[35]\tvalidation_0-rmse:79.9762\n",
"[36]\tvalidation_0-rmse:79.084\n",
"[37]\tvalidation_0-rmse:78.8726\n",
"[38]\tvalidation_0-rmse:78.2066\n",
"[39]\tvalidation_0-rmse:77.98\n",
"[40]\tvalidation_0-rmse:76.8601\n",
"[41]\tvalidation_0-rmse:76.3929\n",
"[42]\tvalidation_0-rmse:76.0857\n",
"[43]\tvalidation_0-rmse:75.4714\n",
"[44]\tvalidation_0-rmse:74.4059\n",
"[45]\tvalidation_0-rmse:73.5268\n",
"[46]\tvalidation_0-rmse:73.0309\n",
"[47]\tvalidation_0-rmse:72.4982\n",
"[48]\tvalidation_0-rmse:71.9351\n",
"[49]\tvalidation_0-rmse:71.3068\n",
"mean_absolute_error=50.72\n",
"Model export success: mockup-model.dat\n",
"Best RMSE on eval: %.2f with %d rounds 71.306808 50\n"
]
}
],
"source": [
"pod_spec = builder.generate_pod_spec()\n",
|
|
"train_deployer = job.job.Job(cleanup=False,\n",
|
|
" pod_spec_mutators=[\n",
|
|
" fairing.cloud.gcp.add_gcp_credentials_if_exists])\n",
|
|
"\n",
|
|
"# Add command line arguments\n",
|
|
"pod_spec.containers[0].command.extend([\"train\"])\n",
|
|
"result = train_deployer.deploy(pod_spec)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"* You can use kubectl to inspect the job that fairing created"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 29,
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"apiVersion: v1\n",
|
|
"items:\n",
|
|
"- apiVersion: batch/v1\n",
|
|
" kind: Job\n",
|
|
" metadata:\n",
|
|
" creationTimestamp: \"2019-10-25T01:48:20Z\"\n",
|
|
" generateName: fairing-job-\n",
|
|
" labels:\n",
|
|
" fairing-deployer: job\n",
|
|
" fairing-id: 85da7b32-f6c9-11e9-8e34-46c1cdc3ff41\n",
|
|
" name: fairing-job-qg87g\n",
|
|
" namespace: kubeflow-jlewi\n",
|
|
" resourceVersion: \"625626\"\n",
|
|
" selfLink: /apis/batch/v1/namespaces/kubeflow-jlewi/jobs/fairing-job-qg87g\n",
|
|
" uid: 85df016a-f6c9-11e9-8cd6-42010a8e012b\n",
|
|
" spec:\n",
|
|
" backoffLimit: 0\n",
|
|
" completions: 1\n",
|
|
" parallelism: 1\n",
|
|
" selector:\n",
|
|
" matchLabels:\n",
|
|
" controller-uid: 85df016a-f6c9-11e9-8cd6-42010a8e012b\n",
|
|
" template:\n",
|
|
" metadata:\n",
|
|
" annotations:\n",
|
|
" sidecar.istio.io/inject: \"false\"\n",
|
|
" creationTimestamp: null\n",
|
|
" labels:\n",
|
|
" controller-uid: 85df016a-f6c9-11e9-8cd6-42010a8e012b\n",
|
|
" fairing-deployer: job\n",
|
|
" fairing-id: 85da7b32-f6c9-11e9-8e34-46c1cdc3ff41\n",
|
|
" job-name: fairing-job-qg87g\n",
|
|
" name: fairing-deployer\n",
|
|
" spec:\n",
|
|
" containers:\n",
|
|
" - command:\n",
|
|
" - python\n",
|
|
" - /app/build-train-deploy.py\n",
|
|
" - train\n",
|
|
" env:\n",
|
|
" - name: FAIRING_RUNTIME\n",
|
|
" value: \"1\"\n",
|
|
" - name: GOOGLE_APPLICATION_CREDENTIALS\n",
|
|
" value: /etc/secrets/user-gcp-sa.json\n",
|
|
" image: gcr.io/jlewi-dev/fairing-job/fairing-job:7935B6A7\n",
|
|
" imagePullPolicy: IfNotPresent\n",
|
|
" name: fairing-job\n",
|
|
" resources: {}\n",
|
|
" securityContext:\n",
|
|
" runAsUser: 0\n",
|
|
" terminationMessagePath: /dev/termination-log\n",
|
|
" terminationMessagePolicy: File\n",
|
|
" volumeMounts:\n",
|
|
" - mountPath: /etc/secrets\n",
|
|
" name: user-gcp-sa\n",
|
|
" readOnly: true\n",
|
|
" workingDir: /app/\n",
|
|
" dnsPolicy: ClusterFirst\n",
|
|
" restartPolicy: Never\n",
|
|
" schedulerName: default-scheduler\n",
|
|
" securityContext: {}\n",
|
|
" terminationGracePeriodSeconds: 30\n",
|
|
" volumes:\n",
|
|
" - name: user-gcp-sa\n",
|
|
" secret:\n",
|
|
" defaultMode: 420\n",
|
|
" secretName: user-gcp-sa\n",
|
|
" status:\n",
|
|
" completionTime: \"2019-10-25T01:48:29Z\"\n",
|
|
" conditions:\n",
|
|
" - lastProbeTime: \"2019-10-25T01:48:29Z\"\n",
|
|
" lastTransitionTime: \"2019-10-25T01:48:29Z\"\n",
|
|
" status: \"True\"\n",
|
|
" type: Complete\n",
|
|
" startTime: \"2019-10-25T01:48:20Z\"\n",
|
|
" succeeded: 1\n",
|
|
"kind: List\n",
|
|
"metadata:\n",
|
|
" resourceVersion: \"\"\n",
|
|
" selfLink: \"\"\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"!kubectl get jobs -l fairing-id={train_deployer.job_id} -o yaml"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Deploy the trained model to Kubeflow for predictions"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"* Now that you have trained a model you can use kubeflow fairing to deploy it on Kubernetes\n",
|
|
"* When you call deployer.deploy fairing will create a [Kubernetes Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) to serve your model\n",
|
|
"* Kubeflow fairing uses the docker image you created earlier\n",
|
|
"* The docker image you created contains your code and [Seldon core](https://www.seldon.io/)\n",
|
|
"* Kubeflow fairing uses Seldon to wrap your prediction code, ModelServe.predict, in a REST and gRPC server"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Cluster endpoint: http://fairing-service-2bhtr.kubeflow-jlewi.svc.cluster.local:5000/predict\n"
]
}
],
"source": [
"from kubeflow.fairing.deployers import serving\n",
"pod_spec = builder.generate_pod_spec()\n",
"\n",
"module_name = os.path.splitext(preprocessor.executable.name)[0]\n",
"deployer = serving.serving.Serving(module_name + \".ModelServe\",\n",
" service_type=\"ClusterIP\",\n",
" labels={\"app\": \"mockup\"})\n",
" \n",
"url = deployer.deploy(pod_spec)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* You can use kubectl to inspect the deployment that fairing created"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"apiVersion: extensions/v1beta1\n",
"kind: Deployment\n",
"metadata:\n",
" annotations:\n",
" deployment.kubernetes.io/revision: \"1\"\n",
" creationTimestamp: \"2019-10-25T01:48:34Z\"\n",
" generateName: fairing-deployer-\n",
" generation: 1\n",
" labels:\n",
" app: mockup\n",
" fairing-deployer: serving\n",
" fairing-id: 8e428b7a-f6c9-11e9-8e34-46c1cdc3ff41\n",
" name: fairing-deployer-cnv5x\n",
" namespace: kubeflow-jlewi\n",
" resourceVersion: \"625670\"\n",
" selfLink: /apis/extensions/v1beta1/namespaces/kubeflow-jlewi/deployments/fairing-deployer-cnv5x\n",
" uid: 8e43b5b8-f6c9-11e9-8cd6-42010a8e012b\n",
"spec:\n",
" progressDeadlineSeconds: 600\n",
" replicas: 1\n",
" revisionHistoryLimit: 10\n",
" selector:\n",
" matchLabels:\n",
" app: mockup\n",
" fairing-deployer: serving\n",
" fairing-id: 8e428b7a-f6c9-11e9-8e34-46c1cdc3ff41\n",
" strategy:\n",
" rollingUpdate:\n",
" maxSurge: 25%\n",
" maxUnavailable: 25%\n",
" type: RollingUpdate\n",
" template:\n",
" metadata:\n",
" annotations:\n",
" sidecar.istio.io/inject: \"false\"\n",
" creationTimestamp: null\n",
" labels:\n",
" app: mockup\n",
" fairing-deployer: serving\n",
" fairing-id: 8e428b7a-f6c9-11e9-8e34-46c1cdc3ff41\n",
" name: fairing-deployer\n",
" spec:\n",
" containers:\n",
" - command:\n",
" - seldon-core-microservice\n",
" - build-train-deploy.ModelServe\n",
" - REST\n",
" - --service-type=MODEL\n",
" - --persistence=0\n",
" env:\n",
" - name: FAIRING_RUNTIME\n",
" value: \"1\"\n",
" image: gcr.io/jlewi-dev/fairing-job/fairing-job:7935B6A7\n",
" imagePullPolicy: IfNotPresent\n",
" name: model\n",
" resources: {}\n",
" securityContext:\n",
" runAsUser: 0\n",
" terminationMessagePath: /dev/termination-log\n",
" terminationMessagePolicy: File\n",
" workingDir: /app/\n",
" dnsPolicy: ClusterFirst\n",
" restartPolicy: Always\n",
" schedulerName: default-scheduler\n",
" securityContext: {}\n",
" terminationGracePeriodSeconds: 30\n",
"status:\n",
" conditions:\n",
" - lastTransitionTime: \"2019-10-25T01:48:34Z\"\n",
" lastUpdateTime: \"2019-10-25T01:48:34Z\"\n",
" message: Deployment does not have minimum availability.\n",
" reason: MinimumReplicasUnavailable\n",
" status: \"False\"\n",
" type: Available\n",
" - lastTransitionTime: \"2019-10-25T01:48:34Z\"\n",
" lastUpdateTime: \"2019-10-25T01:48:35Z\"\n",
" message: ReplicaSet \"fairing-deployer-cnv5x-744dc89c56\" is progressing.\n",
" reason: ReplicaSetUpdated\n",
" status: \"True\"\n",
" type: Progressing\n",
" observedGeneration: 1\n",
" replicas: 1\n",
" unavailableReplicas: 1\n",
" updatedReplicas: 1\n"
]
}
],
"source": [
"!kubectl get deploy -o yaml {deployer.deployment.metadata.name}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Send an inference request to the prediction server\n",
|
|
"\n",
|
|
"* Now that you have deployed the model into your Kubernetes cluster, you can send a REST request to \n",
|
|
" preform inference\n",
|
|
"* The code below reads some data, sends, a prediction request and then prints out the response"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"(train_X, train_y), (test_X, test_y) = read_synthetic_input()\n"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
|
|
{
|
|
"ename": "NameError",
|
|
"evalue": "name 'util' is not defined",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
|
"\u001b[0;32m<ipython-input-33-9d29116b903d>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mfull_url\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0murl\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;34m\":5000/predict\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mutil\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpredict_nparray\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfull_url\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtest_X\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3\u001b[0m \u001b[0mpprint\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mresult\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcontent\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
|
|
"\u001b[0;31mNameError\u001b[0m: name 'util' is not defined"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"full_url = url + \":5000/predict\"\n",
|
|
"result = util.predict_nparray(full_url, test_X)\n",
|
|
"pprint.pprint(result.content)"
|
|
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clean up the prediction endpoint\n",
"\n",
"* You can use kubectl to delete the Kubernetes resources for your model\n",
"* If you want to delete the resources, uncomment the following lines and run them"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# !kubectl delete service -l app=ames\n",
|
|
"# !kubectl delete deploy -l app=ames"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Track Models and Artifacts\n",
"\n",
"* Using Kubeflow's metadata server you can track models and artifacts\n",
"* The ModelServe code was instrumented to log executions and outputs\n",
"* You can access Kubeflow's metadata UI by selecting **Artifact Store** from the central dashboard\n",
" * See [here](https://www.kubeflow.org/docs/other-guides/accessing-uis/) for instructions on connecting to Kubeflow's UIs\n",
"* You can also use the Python SDK to read and write entries; a short sketch follows\n",
"* This [notebook](https://github.com/kubeflow/metadata/blob/master/sdk/python/demo.ipynb) demonstrates a range of the metadata functionality",
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a workspace\n",
"\n",
"* Kubeflow metadata uses workspaces as a logical grouping for artifacts, executions, and datasets that belong together\n",
"* Earlier in the notebook we defined the function `create_workspace` to create a workspace for this example\n",
"* You can use that function to return a workspace object and then call its `list` method to see all the artifacts in that workspace"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ws = create_workspace()\n",
"ws.list()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a pipeline to train your model\n",
"\n",
"* [Kubeflow pipelines](https://www.kubeflow.org/docs/pipelines/) makes it easy to define complex workflows to build and deploy models\n",
"* Below you will define and run a simple one-step pipeline to train your model\n",
"* Kubeflow pipelines uses experiments to group different runs of a pipeline together\n",
"* So you start by defining a name for your experiment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Define the pipeline\n",
"\n",
"* To create a pipeline you create a function and decorate it with the `@dsl.pipeline` decorator\n",
" * You use the decorator to give the pipeline a name and description\n",
" \n",
"* Inside the function, each step is defined by a ContainerOp that specifies a container to invoke\n",
" \n",
"* You will use the container image that you built earlier using Kubeflow Fairing\n",
"* Since the Kubeflow Fairing preprocessor added a main function using [python-fire](https://github.com/google/python-fire), a step in your pipeline can invoke any method of the ModelServe class just by setting the command for the container op\n",
"* See the pipelines [SDK reference](https://kubeflow-pipelines.readthedocs.io/en/latest/) for more information"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"@dsl.pipeline(\n",
" name='Training pipeline',\n",
" description='A pipeline that trains an xgboost model on a synthetic dataset.'\n",
")\n",
"def train_pipeline():\n",
" command=[\"python\", preprocessor.executable.name, \"train\"]\n",
" train_op = dsl.ContainerOp(\n",
" name=\"train\", \n",
" image=builder.image_tag, \n",
" command=command,\n",
" ).apply(\n",
" gcp.use_gcp_secret('user-gcp-sa'),\n",
" )\n",
" train_op.container.working_dir = \"/app\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Compile the pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Pipelines need to be compiled into an archive before they can be submitted to the Kubeflow Pipelines service"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pipeline_func = train_pipeline\n",
"pipeline_filename = pipeline_func.__name__ + '.pipeline.zip'\n",
"compiler.Compiler().compile(pipeline_func, pipeline_filename)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Submit the pipeline for execution"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Kubeflow Pipelines groups runs using experiments\n",
"* So before you submit a pipeline you need to create an experiment or pick an existing one\n",
"* Once you have compiled a pipeline, you can use the pipelines SDK to submit it\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"EXPERIMENT_NAME = 'MockupModel'\n",
|
|
"\n",
|
|
"#Specify pipeline argument values\n",
|
|
"arguments = {}\n",
|
|
"\n",
|
|
"# Get or create an experiment and submit a pipeline run\n",
|
|
"client = kfp.Client()\n",
|
|
"experiment = client.create_experiment(EXPERIMENT_NAME)\n",
|
|
"\n",
|
|
"#Submit a pipeline run\n",
|
|
"run_name = pipeline_func.__name__ + ' run'\n",
|
|
"run_result = client.run_pipeline(experiment.id, run_name, pipeline_filename, arguments)\n",
|
|
"\n",
|
|
"#vvvvvvvvv This link leads to the run information page. (Note: There is a bug in JupyterLab that modifies the URL and makes the link stop working)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}