pipelines

Machine Learning Pipelines for Kubeflow

data-science kubeflow kubeflow-pipelines kubernetes machine-learning mlops pipeline


Alexey Volkov 578d8de91d SDK - Reduce python component limitations - no import errors for cust… (#3106 )	2020-02-24 20:50:48 -08:00

* SDK - Reduce python component limitations - no import errors for custom type annotations

  By default, create_component_from_func copies the source code of the function and creates a component using that source code. No global imports are captured. This is problematic for the function definition, since any annotation, that uses a type that needs to be imported, will cause error. There were some special provisions for NamedTuple, InputPath and OutputPath, but even they were brittle (for example, "typing.NamedTuple" or "components.InputPath" annotations still caused failures at runtime).

  This commit fixes the issue by stripping the type annotations from function declarations.

  Fixes cases that were failing before:

  ```python
  import typing
  import collections

  MyFuncOutputs = typing.NamedTuple('Outputs', [('sum', int), ('product', int)])

  @create_component_from_func
  def my_func(
      param1: CustomType,  # This caused failure previously
      param2: collections.OrderedDict,  # This caused failure previously
  ) -> MyFuncOutputs:  # This caused failure previously
      pass
  ```

* Fixed the compiler tests
* Fixed crashes on print function

  Code `print(line, end="")` was causing error: "lib2to3.pgen2.parse.ParseError: bad input: type=22, value='=', context=('', (2, 15))"

* Using the strip_hints library to strip the annotations
* Updating test workflow yamls
* Workaround for bug in untokenize
* Switched to the new strip_string_to_string method
* Fixed typo.

Co-Authored-By: Jiaxiao Zheng <jxzheng@google.com>
Co-authored-by: Jiaxiao Zheng <jxzheng@google.com>
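
The commit message above describes stripping type annotations from a function's source so that the copied code no longer depends on imports that were not captured. The commit itself uses the strip_hints library; the sketch below is not that implementation. It illustrates the same idea with Python's standard `ast` module (Python 3.9+ for `ast.unparse`), and the helper name `strip_annotations` and the sample function are hypothetical.

```python
import ast


def strip_annotations(source: str) -> str:
    """Return `source` with parameter and return annotations removed.

    Hypothetical helper for illustration only; the commit uses the
    strip_hints library rather than the ast module.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Drop the "-> ReturnType" part of the signature.
            node.returns = None
            args = node.args
            all_args = args.posonlyargs + args.args + args.kwonlyargs
            if args.vararg is not None:
                all_args.append(args.vararg)
            if args.kwarg is not None:
                all_args.append(args.kwarg)
            # Drop the ": Type" part of every parameter.
            for arg in all_args:
                arg.annotation = None
    return ast.unparse(tree)


# The annotations below refer to names that are never imported or defined.
source = '''
def my_func(param1: CustomType, param2: collections.OrderedDict = None) -> MyFuncOutputs:
    return (param1, param2)
'''

stripped = strip_annotations(source)

# The stripped source can be executed in an empty namespace,
# because it no longer mentions CustomType, collections or MyFuncOutputs.
namespace = {}
exec(stripped, namespace)
result = namespace["my_func"](1, 2)
```

Note that `ast.unparse` regenerates the code from the syntax tree, so comments and the original formatting are lost; a token-based approach avoids that, which may be one reason to prefer a dedicated library.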
.github/ISSUE_TEMPLATE	Update BUG_REPORT.md	2020-02-24 11:43:15 +08:00
backend	Fix broken doc link (#3144 )	2020-02-24 08:31:53 -08:00
components	ml_engine component READMEs incorrect (#3103 )	2020-02-18 16:18:24 -08:00
contrib	Make wget quieter (#2069 )	2019-09-09 14:32:54 -07:00
docs	Docs - Added the kfp root members (#2183 )	2019-10-07 18:33:19 -07:00
frontend	[MLMD][Lineage] Navigate to ArtifactDetails Overview on row click [long term] (#3141 )	2020-02-24 17:54:48 -08:00
manifests	bump version to 0.2.4 and tiny BTW doc fix (#3115 )	2020-02-19 03:54:25 -08:00
proxy	[Proxy] Split domain name (#2851 )	2020-01-16 14:00:31 -08:00
release	Components - Google Cloud Storage (#2532 )	2019-11-07 18:06:19 -08:00
samples	[Sample] Update doc in taxi pipeline demo (#3149 )	2020-02-24 11:17:35 -08:00
sdk	SDK - Reduce python component limitations - no import errors for cust… (#3106 )	2020-02-24 20:50:48 -08:00
test	[Frontend] Migrate to create-react-app (#3156 )	2020-02-24 17:05:35 -08:00
third_party	pin envoy (#2968 )	2020-02-03 12:49:25 -08:00
tools	done (#3028 )	2020-02-11 18:34:15 -08:00
.cloudbuild.yaml	Build deployer for each post-submit to avoid manual work (#2873 )	2020-01-19 03:21:35 -08:00
.dockerignore	Initial commit of the kubeflow/pipeline project.	2018-11-02 14:02:31 -07:00
.gitattributes	Support filtering on storage state (#629 )	2019-01-11 11:01:01 -08:00
.gitignore	License crawler for third party golang libraries (#2393 )	2019-10-25 03:15:40 -07:00
.pylintrc	[Request for comments] Add config for yapf and pylintrc (#2446 )	2019-10-21 12:34:22 -07:00
.release.cloudbuild.yaml	Metadata: Update Metadata server version to v0.21.1 (#2931 )	2020-01-30 12:32:20 -08:00
.style.yapf	[Request for comments] Add config for yapf and pylintrc (#2446 )	2019-10-21 12:34:22 -07:00
.travis.yml	release TFX from bypassing (#3146 )	2020-02-21 11:00:12 -08:00
BUILD.bazel	apiserver: Remove TFX output artifact recording to metadatastore (#1904 )	2019-08-21 13:44:31 -07:00
CHANGELOG.md	update changelog and document (#2990 )	2020-02-05 03:19:54 -08:00
CONTRIBUTING.md	fix link validation complaint. (#2727 )	2019-12-18 21:49:56 -08:00
LICENSE	Initial commit of the kubeflow/pipeline project.	2018-11-02 14:02:31 -07:00
Makefile	Fix Makefile to add licenses using Go modules. (#674 )	2019-01-14 15:25:27 -08:00
OWNERS	clean up owner file (#1928 )	2019-08-22 15:29:19 -07:00
README.md	add community meeting/slack onto README (#2613 )	2019-11-18 13:57:41 -08:00
ROADMAP.md	ROADMAP.md cosmetic changes (#846 )	2019-02-22 15:03:45 -08:00
WORKSPACE	Backend - Removed Tensorflow from backend WORKSPACE (#2856 )	2020-02-18 14:16:25 -08:00
developer_guide.md	fix doc link (#2681 )	2019-12-03 22:44:57 -08:00
go.mod	Fix documentation for filter.proto (#2447 )	2019-10-25 02:35:38 -07:00
go.sum	move pipeline runner service account to backend (#1988 )	2019-08-29 16:03:14 -07:00

README.md


Overview of the Kubeflow pipelines service

Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.

Kubeflow pipelines are reusable end-to-end ML workflows built using the Kubeflow Pipelines SDK.

The Kubeflow pipelines service has the following goals:

- End-to-end orchestration: enabling and simplifying the orchestration of end-to-end machine learning pipelines.
- Easy experimentation: making it easy for you to try numerous ideas and techniques and to manage your various trials and experiments.
- Easy reuse: enabling you to reuse components and pipelines to quickly assemble end-to-end solutions without rebuilding each time.

Documentation

Get started with your first pipeline and read further information in the Kubeflow Pipelines overview.

See the various ways you can use the Kubeflow Pipelines SDK.

See the Kubeflow Pipelines API doc for API specification.

Consult the Python SDK reference docs when writing pipelines using the Python SDK.

Kubeflow Pipelines Community Meeting

The meeting takes place every other Wednesday, 10-11 AM (PST): Calendar Invite or Join Meeting Directly

Meeting notes

Kubeflow Pipelines Slack Channel

#kubeflow-pipelines

Blog posts

Getting started with Kubeflow Pipelines (By Amy Unruh)
How to create and deploy a Kubeflow Machine Learning Pipeline (By Lak Lakshmanan)
- Part 1: How to create and deploy a Kubeflow Machine Learning Pipeline
- Part 2: How to deploy Jupyter notebooks as components of a Kubeflow ML pipeline

Acknowledgments

Kubeflow Pipelines uses Argo under the hood to orchestrate Kubernetes resources. The Argo community has been very supportive, and we are very grateful.