pipelines/samples/tutorials/mnist/00_Kubeflow_Cluster_Setup.i...

336 lines
10 KiB
Plaintext

{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Copyright 2019 Google Inc. All Rights Reserved.\n",
"#\n",
"# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# http://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License.\n",
"# =============================================================================="
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deploying a Kubeflow Cluster on Google Cloud Platform (GCP)\n",
"This notebook provides instructions for setting up a Kubeflow cluster on GCP using the command-line interface (CLI). For additional help, see the guide to [deploying Kubeflow using the CLI](https://www.kubeflow.org/docs/gke/deploy/deploy-cli/)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"- You have a [GCP project setup](https://www.kubeflow.org/docs/gke/deploy/project-setup/) for your Kubeflow Deployment with you having the [owner role](https://cloud.google.com/iam/docs/understanding-roles#primitive_role_definitions) for the project and with the following APIs enabled:\n",
" - [Compute Engine API](https://pantheon.corp.google.com/apis/library/compute.googleapis.com)\n",
" - [Kubernetes Engine API](https://pantheon.corp.google.com/apis/library/container.googleapis.com)\n",
" - [Identity and Access Management(IAM) API](https://pantheon.corp.google.com/apis/library/iam.googleapis.com)\n",
" - [Deployment Manager API](https://pantheon.corp.google.com/apis/library/deploymentmanager.googleapis.com)\n",
" - [Cloud Resource Manager API](https://pantheon.corp.google.com/apis/library/cloudresourcemanager.googleapis.com)\n",
" - [AI Platform Training & Prediction API](https://pantheon.corp.google.com/apis/library/ml.googleapis.com)\n",
"- You have set up [OAuth for Cloud IAP](https://www.kubeflow.org/docs/gke/deploy/oauth-setup/)\n",
"- You have installed and setup [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)\n",
"- You have installed [gcloud-sdk](https://cloud.google.com/sdk/)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Running Environment\n",
"\n",
"**This notebook helps in creating the Kubeflow cluster on GCP. You must run this notebook in an environment with Cloud SDK installed, such as Cloud Shell. Learn more about [installing Cloud SDK](https://cloud.google.com/sdk/docs/).**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up a Kubeflow cluster\n",
"1. Download kfctl\n",
"2. Setup environment variables\n",
"3. Create dedicated service account for deployment\n",
"4. Deploy Kubefow\n",
"5. Install Kubeflow Pipelines SDK\n",
"6. Sanity check\n",
"\n",
"`NOTE` : It is also possible to deploy the kubeflow cluster though [UI](https://www.kubeflow.org/docs/gke/deploy/deploy-ui/)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a working directory\n",
"Create a new working directory in your current directory. The default name is **kubeflow**, but you can change the name."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"work_directory_name = 'kubeflow'\n",
"\n",
"! mkdir -p $work_directory_name\n",
"\n",
"%cd $work_directory_name"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download kfctl\n",
"Download kfctl to your working directory. The default version used is v0.7.0, but you can find the latest release [here](https://github.com/kubeflow/kubeflow/releases)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## Download kfctl v0.7.0\n",
"! curl -LO https://github.com/kubeflow/kubeflow/releases/download/v0.7.0/kfctl_v0.7.0_linux.tar.gz\n",
" \n",
"## Unpack the tar ball\n",
"! tar -xvf kfctl_v0.7.0_linux.tar.gz"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## Create user credentials\n",
"! gcloud auth application-default login"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up environment variables\n",
"Set up environment variables to use while installing Kubeflow. Replace variable placeholders (for example, `<VARIABLE NAME>`) with the correct values for your environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set your GCP project ID and the zone where you want to create the Kubeflow deployment\n",
"%env PROJECT=<ADD GCP PROJECT HERE>\n",
"%env ZONE=<ADD GCP ZONE TO LAUNCH KUBEFLOW CLUSTER HERE>\n",
"\n",
"# google cloud storage bucket\n",
"%env GCP_BUCKET=gs://<ADD STORAGE LOCATION HERE>\n",
"\n",
"# Use the following kfctl configuration file for authentication with \n",
"# Cloud IAP (recommended):\n",
"uri = \"https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.0.yaml\"\n",
"uri = uri.strip()\n",
"%env CONFIG_URI=$uri\n",
"\n",
"# For using Cloud IAP for authentication, create environment variables\n",
"# from the OAuth client ID and secret that you obtained earlier:\n",
"%env CLIENT_ID=<ADD OAuth CLIENT ID HERE>\n",
"%env CLIENT_SECRET=<ADD OAuth CLIENT SECRET HERE>\n",
"\n",
"# Set KF_NAME to the name of your Kubeflow deployment. You also use this\n",
"# value as directory name when creating your configuration directory. \n",
"# For example, your deployment name can be 'my-kubeflow' or 'kf-test'.\n",
"%env KF_NAME=<ADD KUBEFLOW DEPLOYMENT NAME HERE>\n",
"\n",
"# Set up name of the service account that should be created and used\n",
"# while creating the Kubeflow cluster\n",
"%env SA_NAME=<ADD SERVICE ACCOUNT NAME TO BE CREATED HERE>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Configure gcloud and add kfctl to your path."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! gcloud config set project ${PROJECT}\n",
"\n",
"! gcloud config set compute/zone ${ZONE}\n",
"\n",
"\n",
"# Set the path to the base directory where you want to store one or more \n",
"# Kubeflow deployments. For example, /opt/.\n",
"# Here we use the current working directory as the base directory\n",
"# Then set the Kubeflow application directory for this deployment.\n",
"\n",
"import os\n",
"base = os.getcwd()\n",
"%env BASE_DIR=$base\n",
"\n",
"kf_dir = os.getenv('BASE_DIR') + \"/\" + os.getenv('KF_NAME')\n",
"%env KF_DIR=$kf_dir\n",
"\n",
"# The following command is optional. It adds the kfctl binary to your path.\n",
"# If you don't add kfctl to your path, you must use the full path\n",
"# each time you run kfctl. In this example, the kfctl file is present in\n",
"# the current directory\n",
"new_path = os.getenv('PATH') + \":\" + os.getenv('BASE_DIR')\n",
"%env PATH=$new_path"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create service account\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! gcloud iam service-accounts create ${SA_NAME}\n",
"! gcloud projects add-iam-policy-binding ${PROJECT} \\\n",
" --member serviceAccount:${SA_NAME}@${PROJECT}.iam.gserviceaccount.com \\\n",
" --role 'roles/owner'\n",
"! gcloud iam service-accounts keys create key.json \\\n",
" --iam-account ${SA_NAME}@${PROJECT}.iam.gserviceaccount.com"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set GOOGLE_APPLICATION_CREDENTIALS"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"key_path = os.getenv('BASE_DIR') + \"/\" + 'key.json'\n",
"%env GOOGLE_APPLICATION_CREDENTIALS=$key_path"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup and deploy Kubeflow"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! mkdir -p ${KF_DIR}\n",
"%cd $kf_dir\n",
"! kfctl apply -V -f ${CONFIG_URI}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install Kubeflow Pipelines SDK"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%capture\n",
"\n",
"# Install the SDK (Uncomment the code if the SDK is not installed before)\n",
"! pip3 install 'kfp>=0.1.36' --quiet --user"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sanity Check: Check the ingress created"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! kubectl -n istio-system describe ingress"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Access the Kubeflow cluster at **`https://<KF_NAME>.endpoints.<gcp_project_id>.cloud.goog/`**\n",
"\n",
"Note that it may take up to 15-20 mins for the above url to be functional."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}