# Kubeflow pipeline components

Kubeflow pipeline components are implementations of Kubeflow pipeline tasks. Each task takes one or more artifacts as input and may produce one or more artifacts as output.
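Components are typically consumed from pipeline code written with the KFP SDK. Below is a minimal sketch that loads a component from its component.yaml definition and uses it in a one-step pipeline; the file path, pipeline name, and parameter names are illustrative assumptions, not names taken from this repository:

```python
import kfp
from kfp import components, dsl

# Load a component factory from its component.yaml definition.
# The path is illustrative; point it at a real component definition.
my_op = components.load_component_from_file('my_component/component.yaml')


@dsl.pipeline(name='example-pipeline', description='Runs a single component.')
def example_pipeline(input_path: str):
    # Each call to the factory becomes one task in the pipeline; its
    # outputs are artifacts that downstream tasks can consume.
    my_op(input_path=input_path)


if __name__ == '__main__':
    # Compile the pipeline into a package that can be uploaded to Kubeflow.
    kfp.compiler.Compiler().compile(example_pipeline, 'example_pipeline.tar.gz')
```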

Example: XGBoost Dataproc components

Each task usually includes two parts:

* **Client code:** the code that talks to service endpoints to submit jobs. For example, code that calls the Google Cloud Dataproc API to submit a Spark job. (A minimal sketch follows this list.)
* **Runtime code:** the code that does the actual work and usually runs in the cluster. For example, Spark code that transforms raw data into preprocessed data.

In addition, each task provides a container image that runs the client code.
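Client code is usually a small program that parses its arguments, calls a service API, and records where its outputs landed. A minimal sketch, assuming a hypothetical `submit_job` helper and illustrative flag names (nothing here is taken from an actual component in this repository):

```python
#!/usr/bin/env python
"""Hypothetical client-code sketch for a task that submits a remote job."""
import argparse


def submit_job(project, input_path, output_path):
    # Placeholder: real client code would call the service's job-submission
    # API here (for example, the Dataproc jobs API) and wait for completion.
    return 'job-1234'


def main():
    parser = argparse.ArgumentParser(description='Submit a preprocessing job.')
    parser.add_argument('--project', required=True, help='Project to run in.')
    parser.add_argument('--input', required=True, help='Raw input data path.')
    parser.add_argument('--output', required=True, help='Preprocessed output path.')
    args = parser.parse_args()

    job_id = submit_job(args.project, args.input, args.output)
    print('Submitted job: %s' % job_id)

    # Write the output location to a file so the pipeline system can pass
    # it to downstream tasks as an artifact.
    with open('output.txt', 'w') as f:
        f.write(args.output)


if __name__ == '__main__':
    main()
```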

Note the naming convention for client code and runtime code (illustrated below). For a task named "mytask":

* `mytask.py` contains the client code.
* The `mytask` directory contains all the runtime code.
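Following this convention, a task named "mytask" might be laid out as follows (a hypothetical layout, shown for illustration):

```
mytask.py      # client code: submits the job to the service endpoint
mytask/        # runtime code: runs inside the cluster
    ...
```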

See how to build your own components.