+++
title = "Overview of Kubeflow Pipelines"
description = "Overview of Kubeflow Pipelines"
weight = 10
+++

Kubeflow Pipelines is a platform for building and deploying portable,
scalable machine learning (ML) workflows based on Docker containers.

## Quickstart

Run your first pipeline by following the
[pipelines quickstart guide](/docs/guides/pipelines/pipelines-quickstart).

## What is Kubeflow Pipelines?

The Kubeflow Pipelines platform consists of:

* A user interface (UI) for managing and tracking experiments, jobs, and runs.
* An engine for scheduling multi-step ML workflows.
* An SDK for defining and manipulating pipelines and components.
* Notebooks for interacting with the system using the SDK.

The following are the goals of Kubeflow Pipelines:

* End-to-end orchestration: enabling and simplifying the orchestration of
  machine learning pipelines.
* Easy experimentation: making it easy for you to try numerous ideas and
  techniques and manage your various trials/experiments.
* Easy re-use: enabling you to re-use components and pipelines to quickly
  create end-to-end solutions without having to rebuild each time.

In
[Kubeflow v0.1.3 and later](https://github.com/kubeflow/pipelines/releases/tag/0.1.3),
Kubeflow Pipelines is one of the Kubeflow core components. It's
automatically deployed during Kubeflow deployment. You can currently try it
with a Kubeflow deployment on GKE in Google Cloud Platform (GCP). See the guide
to [deploying Kubeflow on GCP](/docs/gke/deploy/).

{{% pipelines-compatibility %}}

## What is a pipeline?

A _pipeline_ is a description of an ML workflow, including all of the components
in the workflow and how they combine in the form of a graph. (See the
screenshot below showing an example of a pipeline graph.) The pipeline
includes the definition of the inputs (parameters) required to run the pipeline
and the inputs and outputs of each component.

After developing your pipeline, you can upload and share it on the
Kubeflow Pipelines UI.

A _pipeline component_ is a self-contained set of user code, packaged as a
[Docker image](https://docs.docker.com/get-started/), that
performs one step in the pipeline. For example, a component can be responsible
for data preprocessing, data transformation, model training, and so on.
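
For illustration, here is a minimal, hypothetical component defined with the
Kubeflow Pipelines SDK. The image name, command, and output path below are
placeholder assumptions, not part of any real sample:

```python
import kfp.dsl as dsl

def preprocess_op(raw_data_path):
  """A sketch of one pipeline step: one container image, one task in the graph."""
  return dsl.ContainerOp(
      name='preprocess',
      # Placeholder image; a real component packages your own code.
      image='gcr.io/your-project/preprocess:latest',
      command=['python', '/app/preprocess.py'],
      arguments=['--input', raw_data_path],
      # A file the container writes; its contents become available to
      # downstream steps as `preprocess_op(...).output`.
      file_outputs={'output': '/output.txt'},
  )
```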

See the conceptual guides to [pipelines](/docs/pipelines/concepts/pipeline/)
and [components](/docs/pipelines/concepts/component/).

## Example of a pipeline

The screenshots and code below show the `xgboost-training-cm.py` pipeline, which
creates an XGBoost model using structured data in CSV format. You can see the
source code and other information about the pipeline on
[GitHub](https://github.com/kubeflow/pipelines/tree/master/samples/xgboost-spark).

### The runtime execution graph of the pipeline

The screenshot below shows the example pipeline's runtime execution graph in the
Kubeflow Pipelines UI:

<img src="/docs/images/pipelines-xgboost-graph.png"
  alt="XGBoost results on the pipelines UI"
  class="mt-3 mb-3 border border-info rounded">

### The Python code that represents the pipeline

Below is an extract from the Python code that defines the
`xgboost-training-cm.py` pipeline. You can see the full code on
[GitHub](https://github.com/kubeflow/pipelines/tree/master/samples/xgboost-spark).
```python
import kfp.dsl as dsl
from kfp import gcp

# Note: the *Op classes used below (CreateClusterOp, AnalyzeOp, TransformOp,
# and so on) are helper classes defined earlier in the same sample file.

@dsl.pipeline(
  name='XGBoost Trainer',
  description='A trainer that does end-to-end distributed training for XGBoost models.'
)
def xgb_train_pipeline(
    output,
    project,
    region='us-central1',
    train_data='gs://ml-pipeline-playground/sfpd/train.csv',
    eval_data='gs://ml-pipeline-playground/sfpd/eval.csv',
    schema='gs://ml-pipeline-playground/sfpd/schema.json',
    target='resolution',
    rounds=200,
    workers=2,
    true_label='ACTION',
):
  # The exit handler deletes the cluster whether the run succeeds or fails.
  delete_cluster_op = DeleteClusterOp('delete-cluster', project, region).apply(gcp.use_gcp_secret('user-gcp-sa'))
  with dsl.ExitHandler(exit_op=delete_cluster_op):
    create_cluster_op = CreateClusterOp('create-cluster', project, region, output).apply(gcp.use_gcp_secret('user-gcp-sa'))

    analyze_op = AnalyzeOp('analyze', project, region, create_cluster_op.output, schema,
        train_data, '%s/{{workflow.name}}/analysis' % output).apply(gcp.use_gcp_secret('user-gcp-sa'))

    transform_op = TransformOp('transform', project, region, create_cluster_op.output,
        train_data, eval_data, target, analyze_op.output,
        '%s/{{workflow.name}}/transform' % output).apply(gcp.use_gcp_secret('user-gcp-sa'))

    train_op = TrainerOp('train', project, region, create_cluster_op.output, transform_op.outputs['train'],
        transform_op.outputs['eval'], target, analyze_op.output, workers,
        rounds, '%s/{{workflow.name}}/model' % output).apply(gcp.use_gcp_secret('user-gcp-sa'))

    predict_op = PredictOp('predict', project, region, create_cluster_op.output, transform_op.outputs['eval'],
        train_op.output, target, analyze_op.output, '%s/{{workflow.name}}/predict' % output).apply(gcp.use_gcp_secret('user-gcp-sa'))

    confusion_matrix_op = ConfusionMatrixOp('confusion-matrix', predict_op.output,
        '%s/{{workflow.name}}/confusionmatrix' % output).apply(gcp.use_gcp_secret('user-gcp-sa'))

    roc_op = RocOp('roc', predict_op.output, true_label, '%s/{{workflow.name}}/roc' % output).apply(gcp.use_gcp_secret('user-gcp-sa'))
```
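
To turn this pipeline function into something the platform can run, you compile
it into a static configuration with the DSL compiler (described in the
architectural overview below). A minimal sketch, assuming the `kfp` SDK is
installed and the function above is importable; the output filename is an
illustrative choice, not prescribed by the sample:

```python
import kfp.compiler as compiler

# Compile the pipeline function into a static workflow package that you can
# upload through the Kubeflow Pipelines UI.
compiler.Compiler().compile(xgb_train_pipeline, 'xgb_train_pipeline.tar.gz')
```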

### Pipeline data on the Kubeflow Pipelines UI

The screenshot below shows the Kubeflow Pipelines UI for kicking off a run of
the pipeline. The pipeline definition in your code determines which parameters
appear in the UI form. The pipeline definition can also set default values for
these parameters. The arrows on the screenshot indicate the
parameters that do not have useful default values in this particular example:

<img src="/docs/images/pipelines-start-xgboost-run.png"
  alt="Starting the XGBoost run on the pipelines UI"
  class="mt-3 mb-3 border border-info rounded">

### Outputs from the pipeline

The following screenshots show examples of the pipeline output visible on
the Kubeflow Pipelines UI.

Prediction results:

<img src="/docs/images/predict.png"
  alt="Prediction output"
  class="mt-3 mb-3 p-3 border border-info rounded">

Confusion matrix:

<img src="/docs/images/cm.png"
  alt="Confusion matrix"
  class="mt-3 mb-3 p-3 border border-info rounded">

Receiver operating characteristics (ROC) curve:

<img src="/docs/images/roc.png"
  alt="ROC"
  class="mt-3 mb-3 p-3 border border-info rounded">

## Architectural overview

<img src="/docs/images/pipelines-architecture.png"
  alt="Pipelines architectural diagram"
  class="mt-3 mb-3 p-3 border border-info rounded">

At a high level, the execution of a pipeline proceeds as follows:

* **Python SDK**: You create components or specify a pipeline using the Kubeflow
  Pipelines domain-specific language
  ([DSL](https://github.com/kubeflow/pipelines/tree/master/sdk/python/kfp/dsl)).
* **DSL compiler**: The
  [DSL compiler](https://github.com/kubeflow/pipelines/tree/master/sdk/python/kfp/compiler)
  transforms your pipeline's Python code into a static configuration (YAML).
* **Pipeline Service**: You call the Pipeline Service to create a
  pipeline run from the static configuration (see the sketch after this list).
* **Kubernetes resources**: The Pipeline Service calls the Kubernetes API
  server to create the necessary Kubernetes resources
  ([CRDs](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/))
  to run the pipeline.
* **Orchestration controllers**: A set of orchestration controllers
  execute the containers needed to complete the pipeline execution specified
  by the Kubernetes resources
  ([CRDs](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)).
  The containers execute within Kubernetes Pods on virtual machines. An
  example controller is the
  [**Argo Workflow**](https://github.com/argoproj/argo) controller, which
  orchestrates task-driven workflows.
* **Artifact storage**: The Pods store two kinds of data:

  * **Metadata:** Experiments, jobs, runs, etc. Also single scalar metrics,
    generally aggregated for the purposes of sorting and filtering.
    Kubeflow Pipelines stores the metadata in a MySQL database.
  * **Artifacts:** Pipeline packages, views, etc. Also
    large-scale metrics like time series, usually used for investigating an
    individual run's performance and for debugging. Kubeflow Pipelines
    stores the artifacts in an artifact store like
    [Minio server](https://docs.minio.io/) or
    [Cloud Storage](https://cloud.google.com/storage/docs/).

  The MySQL database and the Minio server are both backed by the Kubernetes
  [PersistentVolume](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes)
  (PV) subsystem.

* **Persistence agent and ML metadata**: The Pipeline Persistence Agent
  watches the Kubernetes resources created by the Pipeline Service and
  persists the state of these resources in the ML Metadata Service. The
  Pipeline Persistence Agent records the set of containers that executed as
  well as their inputs and outputs. The input/output consists of either
  container parameters or data artifact URIs.
* **Pipeline web server**: The Pipeline web server gathers data from various
  services to display relevant views: the list of pipelines currently running,
  the history of pipeline execution, the list of data artifacts, debugging
  information about individual pipeline runs, and the execution status of
  individual pipeline runs.
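
As a concrete illustration of the Python SDK, DSL compiler, and Pipeline
Service steps above, the following sketch asks the Pipeline Service to run the
package compiled earlier. The host address, experiment name, and parameter
values are assumptions for illustration:

```python
import kfp

# Connect to the Pipeline Service's API endpoint. The address depends on how
# your cluster exposes the service (this one assumes local port-forwarding).
client = kfp.Client(host='http://localhost:8080')

# Create an experiment to group runs, then ask the Pipeline Service to
# create a run from the static configuration compiled earlier.
experiment = client.create_experiment(name='xgboost-demo')
run = client.run_pipeline(
    experiment_id=experiment.id,
    job_name='xgboost-train-run',
    pipeline_package_path='xgb_train_pipeline.tar.gz',
    params={'output': 'gs://your-bucket/path', 'project': 'your-gcp-project'},
)
```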

## Next steps

* Follow the
  [pipelines quickstart guide](/docs/guides/pipelines/pipelines-quickstart) to
  deploy Kubeflow and run a sample pipeline directly from the
  Kubeflow Pipelines UI.
* Follow the full guide to experimenting with
  [the Kubeflow Pipelines samples](/docs/pipelines/tutorials/build-pipeline/).