{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "-pxRW-6f_uNq"
},
"source": [
"# Build a Pipeline\n",
"> A tutorial on building pipelines to orchestrate your ML workflow\n",
"\n",
"\n",
"A Kubeflow pipeline is a portable and scalable definition of a machine learning\n",
"(ML) workflow. Each step in your ML workflow, such as preparing data or\n",
"training a model, is an instance of a pipeline component. This document\n",
"provides an overview of pipeline concepts and best practices, and instructions\n",
"describing how to build an ML pipeline.\n",
"\n",
"## Before you begin\n",
"\n",
"1. Run the following command to install the Kubeflow Pipelines SDK. If you run this command in a Jupyter\n",
"   notebook, restart the kernel after installing the SDK."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "04mM73j7nWJ-"
},
"outputs": [],
"source": [
"!pip install kfp --upgrade"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8KExWR1i_7Ur"
},
"source": [
"2. Import the `kfp` and `kfp.components` packages."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "TLAhMbMG_M3A"
},
"outputs": [],
"source": [
"import kfp\n",
"import kfp.components as comp"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "deBrhgzrD3Fr"
},
"source": [
"## Understanding pipelines\n",
"\n",
"A Kubeflow pipeline is a portable and scalable definition of an ML workflow,\n",
"based on containers. A pipeline is composed of a set of input parameters and a\n",
"list of the steps in this workflow. Each step in a pipeline is an instance of a\n",
"component, which is represented as an instance of\n",
"[`ContainerOp`][container-op].\n",
"\n",
"You can use pipelines to:\n",
"\n",
"* Orchestrate repeatable ML workflows.\n",
"* Accelerate experimentation by running a workflow with different sets of\n",
"  hyperparameters.\n",
"\n",
"### Understanding pipeline components\n",
"\n",
"A pipeline component is a containerized application that performs one step in a\n",
"pipeline's workflow. Pipeline components are defined in\n",
"[component specifications][component-spec], which define the following:\n",
"\n",
"* The component's interface, its inputs and outputs.\n",
"* The component's implementation, the container image and the command to\n",
"  execute.\n",
"* The component's metadata, such as the name and description of the\n",
"  component.\n",
"\n",
"You can build components by [defining a component specification for a\n",
"containerized application][component-dev], or you can [use the Kubeflow\n",
"Pipelines SDK to generate a component specification for a Python\n",
"function][python-function-component]. You can also [reuse prebuilt components\n",
"in your pipeline][prebuilt-components]. \n",
"\n",
"### Understanding the pipeline graph\n",
"\n",
"Each step in your pipeline's workflow is an instance of a component. When\n",
"you define your pipeline, you specify the source of each step's inputs. A\n",
"step's inputs can be set from the pipeline's input arguments, from\n",
"constants, or from the outputs of other steps in the pipeline. Kubeflow\n",
"Pipelines uses these dependencies to define your pipeline's workflow as\n",
"a graph.\n",
"\n",
"For example, consider a pipeline with the following steps: ingest data,\n",
"generate statistics, preprocess data, and train a model. The following\n",
"describes the data dependencies between each step.\n",
"\n",
"* **Ingest data**: This step loads data from an external source, which is\n",
"  specified using a pipeline argument, and it outputs a dataset. Since\n",
"  this step does not depend on the output of any other steps, this step\n",
"  can run first.\n",
"* **Generate statistics**: This step uses the ingested dataset to generate\n",
"  and output a set of statistics. Since this step depends on the dataset\n",
"  produced by the ingest data step, it must run after the ingest data step.\n",
"* **Preprocess data**: This step preprocesses the ingested dataset and\n",
"  transforms the data into a preprocessed dataset. Since this step depends\n",
"  on the dataset produced by the ingest data step, it must run after the\n",
"  ingest data step.\n",
"* **Train a model**: This step trains a model using the preprocessed dataset,\n",
"  the generated statistics, and pipeline parameters, such as the learning\n",
"  rate. Since this step depends on the preprocessed data and the generated\n",
"  statistics, it must run after both the preprocess data and generate\n",
"  statistics steps are complete.\n",
"\n",
"Since the generate statistics and preprocess data steps both depend on the\n",
"ingested data, the generate statistics and preprocess data steps can run in\n",
"parallel. All other steps are executed once their data dependencies are\n",
"available.\n",
"\n",
"## Designing your pipeline\n",
"\n",
"When designing your pipeline, think about how to split your ML workflow into\n",
"pipeline components. The process of splitting an ML workflow into pipeline\n",
"components is similar to the process of splitting a monolithic script into\n",
"testable functions. The following rules can help you define the components\n",
"that you need to build your pipeline.\n",
"\n",
"* Components should have a single responsibility. Having a single\n",
"  responsibility makes it easier to test and reuse a component. For example,\n",
"  if you have a component that loads data, you can reuse it for similar\n",
"  tasks that load data. A component that both loads and transforms a\n",
"  dataset is less reusable, since you can use it only when you need to\n",
"  load and transform that dataset.\n",
"\n",
"* Reuse components when possible. Kubeflow Pipelines provides [components for\n",
"  common pipeline tasks and for access to cloud\n",
"  services][prebuilt-components].\n",
"\n",
"* Consider what you need to know to debug your pipeline and research the\n",
"  lineage of the models that your pipeline produces. Kubeflow Pipelines\n",
"  stores the inputs and outputs of each pipeline step. By interrogating the\n",
"  artifacts produced by a pipeline run, you can better understand the\n",
"  variations in model quality between runs or track down bugs in your\n",
"  workflow.\n",
"\n",
"In general, you should design your components with composability in mind.\n",
"\n",
"Pipelines are composed of component instances, also called steps. Steps can\n",
"define their inputs as depending on the output of another step. The\n",
"dependencies between steps define the pipeline workflow graph.\n",
"\n",
"### Building pipeline components\n",
"\n",
"Kubeflow pipeline components are containerized applications that perform a\n",
"step in your ML workflow. Here are the ways that you can define pipeline\n",
"components:\n",
"\n",
"* If you have a containerized application that you want to use as a\n",
"  pipeline component, create a component specification to define this\n",
"  container image as a pipeline component.\n",
"\n",
"  This option provides the flexibility to include code written in any\n",
"  language in your pipeline, so long as you can package the application\n",
"  as a container image. Learn more about [building pipeline\n",
"  components][component-dev].\n",
"\n",
"* If your component code can be expressed as a Python function, [evaluate if\n",
"  your component can be built as a Python function-based\n",
"  component][python-function-component]. The Kubeflow Pipelines SDK makes it\n",
"  easier to build lightweight Python function-based components by saving you\n",
"  the effort of creating a component specification.\n",
"\n",
"Whenever possible, [reuse prebuilt components][prebuilt-components] to save\n",
"yourself the effort of building custom components.\n",
"\n",
"The example in this guide demonstrates how to build a pipeline that uses a\n",
"Python function-based component and reuses a prebuilt component.\n",
"\n",
"### Understanding how data is passed between components\n",
"\n",
"When Kubeflow Pipelines runs a component, a container image is started in a\n",
"Kubernetes Pod and your component’s inputs are passed in as command-line\n",
"arguments. When your component has finished, the component's outputs are\n",
"returned as files.\n",
"\n",
"In your component's specification, you define the component's inputs and outputs\n",
"and how the inputs and output paths are passed to your program as command-line\n",
"arguments. You can pass small inputs, such as short strings or numbers, to your\n",
"component by value. Large inputs, such as datasets, must be passed to your\n",
"component as file paths. Outputs are written to the paths that Kubeflow\n",
"Pipelines provides.\n",
"\n",
"Python function-based components make it easier to build pipeline components\n",
"by building the component specification for you. Python function-based\n",
"components also handle the complexity of passing inputs into your component\n",
"and passing your function’s outputs back to your pipeline.\n",
"\n",
"Learn more about how [Python function-based components handle inputs and\n",
"outputs][python-function-component-data-passing]. \n",
"\n",
"## Getting started building a pipeline\n",
"\n",
"The following sections demonstrate how to get started building a Kubeflow\n",
"pipeline by walking through the process of converting a Python script into\n",
"a pipeline.\n",
"\n",
"### Design your pipeline\n",
"\n",
"The following steps walk through some of the design decisions you may face\n",
"when designing a pipeline.\n",
"\n",
"1. Evaluate the process. In the following example, a Python function downloads\n",
"   a zipped tar file (`.tar.gz`) that contains several CSV files from a\n",
"   public website. The function extracts the CSV files and then merges them\n",
"   into a single file.\n",
"\n",
"[container-op]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.dsl.html#kfp.dsl.ContainerOp\n",
"[component-spec]: https://www.kubeflow.org/docs/components/pipelines/reference/component-spec/\n",
"[python-function-component]: https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/\n",
"[component-dev]: https://www.kubeflow.org/docs/components/pipelines/sdk/component-development/\n",
"[python-function-component-data-passing]: https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/#understanding-how-data-is-passed-between-components\n",
"[prebuilt-components]: https://www.kubeflow.org/docs/examples/shared-resources/"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Vn9MXolH_2BG"
},
"outputs": [],
"source": [
"import glob\n",
"import pandas as pd\n",
"import tarfile\n",
"import urllib.request\n",
"\n",
"def download_and_merge_csv(url: str, output_csv: str):\n",
"    with urllib.request.urlopen(url) as res:\n",
"        tarfile.open(fileobj=res, mode=\"r|gz\").extractall('data')\n",
"    df = pd.concat(\n",
"        [pd.read_csv(csv_file, header=None)\n",
"         for csv_file in glob.glob('data/*.csv')])\n",
"    df.to_csv(output_csv, index=False, header=False)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cWmF17kyIKGF"
},
"source": [
"2. Run the following Python command to test the function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "he6MK5x1Fwbk"
},
"outputs": [],
"source": [
"download_and_merge_csv(\n",
"    url='https://storage.googleapis.com/ml-pipeline-playground/iris-csv-files.tar.gz',\n",
"    output_csv='merged_data.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3. Run the following to print the first few rows of the\n",
"   merged CSV file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!head merged_data.csv"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yT6Di92BOrNQ"
},
"source": [
"4. Design your pipeline. For example, consider the following pipeline designs.\n",
"\n",
"   * Implement the pipeline using a single step. In this case, the pipeline\n",
"     contains one component that works similarly to the example function.\n",
"     This is a straightforward function, and implementing a single-step\n",
"     pipeline is a reasonable approach in this case.\n",
" \n",
"     The downside of this approach is that the zipped tar file would not be\n",
"     an artifact of your pipeline runs. Not having this artifact available\n",
"     could make it harder to debug this component in production.\n",
" \n",
"   * Implement this as a two-step pipeline. The first step downloads a file\n",
"     from a website. The second step extracts the CSV files from a zipped\n",
"     tar file and merges them into a single file.\n",
"\n",
"     This approach has a few benefits:\n",
"\n",
"     * You can reuse the [Web Download component][web-download-component]\n",
"       to implement the first step.\n",
"     * Each step has a single responsibility, which makes the components\n",
"       easier to reuse.\n",
"     * The zipped tar file is an artifact of the first pipeline step.\n",
"       This means that you can examine this artifact when debugging\n",
"       pipelines that use this component.\n",
"\n",
"   This example implements a two-step pipeline.\n",
"\n",
"### Build your pipeline components\n",
"\n",
"1. Build your pipeline components. This example modifies the initial script to\n",
"   extract the contents of a zipped tar file, merge the CSV files that were\n",
"   contained in the zipped tar file, and return the merged CSV file.\n",
"\n",
"   This example builds a Python function-based component. You can also package\n",
"   your component's code as a Docker container image and define the component\n",
"   using a ComponentSpec.\n",
"\n",
"   In this case, the following modifications were required to the original\n",
"   function.\n",
"\n",
"   * The file download logic was removed. The path to the zipped tar file\n",
"     is passed as an argument to this function.\n",
"   * The import statements were moved inside of the function. Python\n",
"     function-based components require standalone Python functions. This\n",
"     means that any required import statements must be defined within the\n",
"     function, and any helper functions must be defined within the function.\n",
"     Learn more about [building Python function-based\n",
"     components][python-function-components].\n",
"   * The function's arguments are decorated with the\n",
"     [`kfp.components.InputPath`][input-path] and the\n",
"     [`kfp.components.OutputPath`][output-path] annotations. These\n",
"     annotations let Kubeflow Pipelines know to provide the path to the\n",
"     zipped tar file and to create a path where your function stores the\n",
"     merged CSV file.\n",
"\n",
"   The following example shows the updated `merge_csv` function.\n",
"\n",
"[web-download-component]: https://github.com/kubeflow/pipelines/blob/master/components/web/Download/component.yaml\n",
"[python-function-components]: https://www.kubeflow.org/docs/components/pipelines/sdk/python-function-components/\n",
"[input-path]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html?highlight=inputpath#kfp.components.InputPath\n",
"[output-path]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html?highlight=outputpath#kfp.components.OutputPath"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "NB3eNHmNCN2C"
},
"outputs": [],
"source": [
"def merge_csv(file_path: comp.InputPath('Tarball'),\n",
"              output_csv: comp.OutputPath('CSV')):\n",
"    import glob\n",
"    import pandas as pd\n",
"    import tarfile\n",
"\n",
"    tarfile.open(name=file_path, mode=\"r|gz\").extractall('data')\n",
"    df = pd.concat(\n",
"        [pd.read_csv(csv_file, header=None)\n",
"         for csv_file in glob.glob('data/*.csv')])\n",
"    df.to_csv(output_csv, index=False, header=False)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WATSaHZkJySg"
},
"source": [
"2. Use [`kfp.components.create_component_from_func`][create_component_from_func]\n",
"   to return a factory function that you can use to create pipeline steps.\n",
"   This example also specifies the base container image to run this function\n",
"   in, the path to save the component specification to, and a list of PyPI\n",
"   packages that need to be installed in the container at runtime.\n",
"\n",
"[create_component_from_func]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html#kfp.components.create_component_from_func"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "RDVQ8QjWOniD"
},
"outputs": [],
"source": [
"create_step_merge_csv = kfp.components.create_component_from_func(\n",
"    func=merge_csv,\n",
"    output_component_file='component.yaml', # This is optional. It saves the component spec for future use.\n",
"    base_image='python:3.7',\n",
"    packages_to_install=['pandas==1.1.4'])"
]
},
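{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the component specification was saved to `component.yaml`, you can\n",
"share it and load it back later without the original Python source, for\n",
"example (a sketch, using the file path chosen above):\n",
"\n",
"```python\n",
"loaded_merge_csv_op = kfp.components.load_component_from_file('component.yaml')\n",
"```"
]
},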
{
"cell_type": "markdown",
"metadata": {
"id": "j9Axem9HPHP2"
},
"source": [
"### Build your pipeline\n",
"\n",
"1. Use [`kfp.components.load_component_from_url`][load_component_from_url]\n",
"   to load the component specification YAML for any components that you are\n",
"   reusing in this pipeline.\n",
"\n",
"[load_component_from_url]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html?highlight=load_component_from_url#kfp.components.load_component_from_url"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "QDzFCaGQa_oR"
},
"outputs": [],
"source": [
"web_downloader_op = kfp.components.load_component_from_url(\n",
"    'https://raw.githubusercontent.com/kubeflow/pipelines/master/components/web/Download/component.yaml')"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "p4bIwiHhbACy"
},
"source": [
"2. Define your pipeline as a Python function.\n",
"\n",
"   Your pipeline function's arguments define your pipeline's parameters. Use\n",
"   pipeline parameters to experiment with different hyperparameters, such as\n",
"   the learning rate used to train a model, or pass run-level inputs, such as\n",
"   the path to an input file, into a pipeline run.\n",
"\n",
"   Use the factory functions created by\n",
"   `kfp.components.create_component_from_func` and\n",
"   `kfp.components.load_component_from_url` to create your pipeline's tasks.\n",
"   The inputs to the component factory functions can be pipeline parameters,\n",
"   the outputs of other tasks, or a constant value. In this case, the\n",
"   `web_downloader_task` task uses the `url` pipeline parameter, and the\n",
"   `merge_csv_task` uses the `data` output of the `web_downloader_task`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "WsyKJeBOTlkz"
},
"outputs": [],
"source": [
"# Define a pipeline and create a task from a component:\n",
"def my_pipeline(url):\n",
"    web_downloader_task = web_downloader_op(url=url)\n",
"    merge_csv_task = create_step_merge_csv(file=web_downloader_task.outputs['data'])\n",
"    # The outputs of the merge_csv_task can be referenced using the\n",
"    # merge_csv_task.outputs dictionary: merge_csv_task.outputs['output_csv']"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OT3O_2GgVKoT"
},
"source": [
"### Compile and run your pipeline\n",
"\n",
"After defining the pipeline in Python as described in the preceding section, use one of the following options to compile the pipeline and submit it to the Kubeflow Pipelines service.\n",
"\n",
"#### Option 1: Compile and then upload in the UI\n",
"\n",
"1. Run the following to compile your pipeline and save it as `pipeline.yaml`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "U0Ll8ve2WNUo"
},
"outputs": [],
"source": [
"kfp.compiler.Compiler().compile(\n",
"    pipeline_func=my_pipeline,\n",
"    package_path='pipeline.yaml')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2. Upload and run your `pipeline.yaml` using the Kubeflow Pipelines user interface.\n",
"See the guide to [getting started with the UI][quickstart].\n",
"\n",
"[quickstart]: https://www.kubeflow.org/docs/components/pipelines/overview/quickstart"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jNLI1-_bfEky"
},
"source": [
"#### Option 2: Run the pipeline using the Kubeflow Pipelines SDK client\n",
"\n",
"1. Create an instance of the [`kfp.Client` class][kfp-client] by following the steps in [connecting to Kubeflow Pipelines using the SDK client][connect-api].\n",
"\n",
"[kfp-client]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html#kfp.Client\n",
"[connect-api]: https://www.kubeflow.org/docs/components/pipelines/sdk/connect-api"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"client = kfp.Client() # change arguments accordingly"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2. Run the pipeline using the `kfp.Client` instance:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "jRNHZpfnVJ0h"
},
"outputs": [],
"source": [
"client.create_run_from_pipeline_func(\n",
"    my_pipeline,\n",
"    arguments={\n",
"        'url': 'https://storage.googleapis.com/ml-pipeline-playground/iris-csv-files.tar.gz'\n",
"    })"
]
},
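{
"cell_type": "markdown",
"metadata": {},
"source": [
"`create_run_from_pipeline_func` returns a handle to the submitted run. As a\n",
"sketch (assuming the `client` created above), you can block until the run\n",
"finishes with `kfp.Client.wait_for_run_completion`:\n",
"\n",
"```python\n",
"run = client.create_run_from_pipeline_func(\n",
"    my_pipeline,\n",
"    arguments={'url': 'https://storage.googleapis.com/ml-pipeline-playground/iris-csv-files.tar.gz'})\n",
"client.wait_for_run_completion(run.run_id, timeout=3600)  # seconds\n",
"```"
]
},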
{
"cell_type": "markdown",
"metadata": {
"id": "pnhZm12y_wvc"
},
"source": [
"\n",
"## Next steps\n",
"\n",
"* Learn about advanced pipeline features, such as [authoring recursive\n",
"  components][recursion] and [using conditional execution in a\n",
"  pipeline][conditional] (see the sketch after this list).\n",
"* Learn how to [manipulate Kubernetes resources in a\n",
"  pipeline][k8s-resources] (Experimental).\n",
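"\n",
"As a taste of conditional execution, the following sketch uses\n",
"`kfp.dsl.Condition` with hypothetical components, so that a step runs only\n",
"when an upstream step's output matches a condition. See the [conditional]\n",
"sample for the full treatment.\n",
"\n",
"```python\n",
"import kfp\n",
"from kfp import dsl\n",
"\n",
"# Hypothetical components for illustration only.\n",
"def evaluate(threshold: float) -> str:\n",
"    return 'deploy' if threshold < 0.5 else 'skip'\n",
"\n",
"def deploy():\n",
"    print('deploying model')\n",
"\n",
"evaluate_op = kfp.components.create_component_from_func(evaluate)\n",
"deploy_op = kfp.components.create_component_from_func(deploy)\n",
"\n",
"def conditional_pipeline(threshold: float = 0.3):\n",
"    eval_task = evaluate_op(threshold=threshold)\n",
"    # The deploy step runs only when the evaluate step outputs 'deploy'.\n",
"    with dsl.Condition(eval_task.output == 'deploy'):\n",
"        deploy_op()\n",
"```\n",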
"\n",
"[conditional]: https://github.com/kubeflow/pipelines/blob/master/samples/tutorials/DSL%20-%20Control%20structures/DSL%20-%20Control%20structures.py\n",
"[recursion]: https://www.kubeflow.org/docs/components/pipelines/sdk/dsl-recursion/\n",
"[k8s-resources]: https://www.kubeflow.org/docs/components/pipelines/sdk/manipulate-resources/"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "Copy of build-pipelines.ipynb",
"provenance": [],
"toc_visible": true
},
"environment": {
"name": "tf2-2-3-gpu.2-3.m56",
"type": "gcloud",
"uri": "gcr.io/deeplearning-platform-release/tf2-2-3-gpu.2-3:m56"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 4
}