Machine Learning Pipelines for Kubeflow
Alexey Volkov 578d8de91d
SDK - Reduce python component limitations - no import errors for cust… (#3106)
* SDK - Reduce python component limitations - no import errors for custom type annotations

By default, create_component_from_func copies the source code of the function and creates a component from that source code. No global imports are captured. This is problematic for the function definition, since any annotation that uses a type requiring an import causes an error. There were some special provisions for NamedTuple, InputPath and OutputPath, but even they were brittle (for example, "typing.NamedTuple" or "components.InputPath" annotations still caused failures at runtime).

This commit fixes the issue by stripping the type annotations from function declarations.
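
To illustrate the idea, here is a minimal sketch of annotation stripping that uses only the standard-library `ast` module. It is not the code from this commit (which uses the strip_hints library); the helper name `strip_annotations` and the sample source are assumptions made for illustration, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
import textwrap


def strip_annotations(source: str) -> str:
    """Return `source` with parameter and return annotations removed.

    Hypothetical helper for illustration only; not part of the KFP SDK.
    """
    tree = ast.parse(textwrap.dedent(source))
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            node.returns = None  # drop "-> ReturnType"
            params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
            if node.args.vararg is not None:
                params.append(node.args.vararg)
            if node.args.kwarg is not None:
                params.append(node.args.kwarg)
            for param in params:
                param.annotation = None  # drop ": ParamType"
    return ast.unparse(tree)  # requires Python 3.9+


# Sample function source whose annotations refer to names that are
# not imported or defined anywhere in this snippet.
SOURCE = '''
def my_func(param1: CustomType, param2: collections.OrderedDict) -> MyFuncOutputs:
    return param1 + param2
'''

stripped = strip_annotations(SOURCE)

# The stripped source no longer mentions the annotation types, so it can
# be executed in an empty namespace.
namespace = {}
exec(stripped, namespace)
result = namespace["my_func"](2, 3)
```

This sketch only handles annotations in function signatures; annotated assignments inside the function body are left untouched, and comments and original formatting are lost because the source is regenerated from the syntax tree.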

Fixes cases that were failing before:

```python
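import collections
import typing

from kfp.components import create_component_from_func


class CustomType:  # Placeholder standing in for any user-defined type
    pass


MyFuncOutputs = typing.NamedTuple('Outputs', [('sum', int), ('product', int)])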

@create_component_from_func
def my_func(
    param1: CustomType,  # This caused failure previously
    param2: collections.OrderedDict,  # This caused failure previously
) -> MyFuncOutputs: # This caused failure previously
    pass
```

* Fixed the compiler tests

* Fixed crashes on print function

The code `print(line, end="")` was causing the error: "lib2to3.pgen2.parse.ParseError: bad input: type=22, value='=', context=('', (2, 15))"

* Using the strip_hints library to strip the annotations

* Updating test workflow yamls

* Workaround for bug in untokenize

* Switched to the new strip_string_to_string method

* Fixed typo.

Co-Authored-By: Jiaxiao Zheng <jxzheng@google.com>
2020-02-24 20:50:48 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| .github/ISSUE_TEMPLATE | Update BUG_REPORT.md | 2020-02-24 11:43:15 +08:00 |
| backend | Fix broken doc link (#3144) | 2020-02-24 08:31:53 -08:00 |
| components | ml_engine component READMEs incorrect (#3103) | 2020-02-18 16:18:24 -08:00 |
| contrib | Make wget quieter (#2069) | 2019-09-09 14:32:54 -07:00 |
| docs | Docs - Added the kfp root members (#2183) | 2019-10-07 18:33:19 -07:00 |
| frontend | [MLMD][Lineage] Navigate to ArtifactDetails Overview on row click [long term] (#3141) | 2020-02-24 17:54:48 -08:00 |
| manifests | bump version to 0.2.4 and tiny BTW doc fix (#3115) | 2020-02-19 03:54:25 -08:00 |
| proxy | [Proxy] Split domain name (#2851) | 2020-01-16 14:00:31 -08:00 |
| release | Components - Google Cloud Storage (#2532) | 2019-11-07 18:06:19 -08:00 |
| samples | [Sample] Update doc in taxi pipeline demo (#3149) | 2020-02-24 11:17:35 -08:00 |
| sdk | SDK - Reduce python component limitations - no import errors for cust… (#3106) | 2020-02-24 20:50:48 -08:00 |
| test | [Frontend] Migrate to create-react-app (#3156) | 2020-02-24 17:05:35 -08:00 |
| third_party | pin envoy (#2968) | 2020-02-03 12:49:25 -08:00 |
| tools | done (#3028) | 2020-02-11 18:34:15 -08:00 |
| .cloudbuild.yaml | Build deployer for each post-submit to avoid manual work (#2873) | 2020-01-19 03:21:35 -08:00 |
| .dockerignore | Initial commit of the kubeflow/pipeline project. | 2018-11-02 14:02:31 -07:00 |
| .gitattributes | Support filtering on storage state (#629) | 2019-01-11 11:01:01 -08:00 |
| .gitignore | License crawler for third party golang libraries (#2393) | 2019-10-25 03:15:40 -07:00 |
| .pylintrc | [Request for comments] Add config for yapf and pylintrc (#2446) | 2019-10-21 12:34:22 -07:00 |
| .release.cloudbuild.yaml | Metadata: Update Metadata server version to v0.21.1 (#2931) | 2020-01-30 12:32:20 -08:00 |
| .style.yapf | [Request for comments] Add config for yapf and pylintrc (#2446) | 2019-10-21 12:34:22 -07:00 |
| .travis.yml | release TFX from bypassing (#3146) | 2020-02-21 11:00:12 -08:00 |
| BUILD.bazel | apiserver: Remove TFX output artifact recording to metadatastore (#1904) | 2019-08-21 13:44:31 -07:00 |
| CHANGELOG.md | update changelog and document (#2990) | 2020-02-05 03:19:54 -08:00 |
| CONTRIBUTING.md | fix link validation complaint. (#2727) | 2019-12-18 21:49:56 -08:00 |
| LICENSE | Initial commit of the kubeflow/pipeline project. | 2018-11-02 14:02:31 -07:00 |
| Makefile | Fix Makefile to add licenses using Go modules. (#674) | 2019-01-14 15:25:27 -08:00 |
| OWNERS | clean up owner file (#1928) | 2019-08-22 15:29:19 -07:00 |
| README.md | add community meeting/slack onto README (#2613) | 2019-11-18 13:57:41 -08:00 |
| ROADMAP.md | ROADMAP.md cosmetic changes (#846) | 2019-02-22 15:03:45 -08:00 |
| WORKSPACE | Backend - Removed Tensorflow from backend WORKSPACE (#2856) | 2020-02-18 14:16:25 -08:00 |
| developer_guide.md | fix doc link (#2681) | 2019-12-03 22:44:57 -08:00 |
| go.mod | Fix documentation for filter.proto (#2447) | 2019-10-25 02:35:38 -07:00 |
| go.sum | move pipeline runner service account to backend (#1988) | 2019-08-29 16:03:14 -07:00 |

README.md

Overview of the Kubeflow pipelines service

Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.

Kubeflow pipelines are reusable end-to-end ML workflows built using the Kubeflow Pipelines SDK.

The Kubeflow pipelines service has the following goals:

  • End-to-end orchestration: enabling and simplifying the orchestration of end-to-end machine learning pipelines.
  • Easy experimentation: making it easy for you to try numerous ideas and techniques and manage your various trials and experiments.
  • Easy re-use: enabling you to re-use components and pipelines to quickly assemble end-to-end solutions without having to rebuild each time.

Documentation

Get started with your first pipeline and read further information in the Kubeflow Pipelines overview.

See the various ways you can use the Kubeflow Pipelines SDK.

See the Kubeflow Pipelines API doc for API specification.

Consult the Python SDK reference docs when writing pipelines using the Python SDK.

Kubeflow Pipelines Community Meeting

The meeting takes place every other Wednesday, 10-11 AM (PST): Calendar Invite or Join Meeting Directly.

Meeting notes

Kubeflow Pipelines Slack Channel

#kubeflow-pipelines

Blog posts

Acknowledgments

Kubeflow pipelines uses Argo under the hood to orchestrate Kubernetes resources. The Argo community has been very supportive and we are very grateful.