# Kubeflow pipeline components
Kubeflow pipeline components are implementations of Kubeflow pipeline tasks. Each task takes one or more artifacts as input and may produce one or more artifacts as output.
Example: XGBoost Dataproc components
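For a concrete sense of what a task looks like, here is a minimal sketch of a Python-function-based component with one input artifact and one output artifact, written with the KFP v1 SDK. The preprocessing logic and base image are illustrative, not taken from this repository:

```python
# A minimal sketch of a task that consumes one artifact and produces another,
# using the KFP v1 SDK. The cleaning logic here is illustrative only.
from kfp.components import InputPath, OutputPath, create_component_from_func

def preprocess(raw_path: InputPath(), clean_path: OutputPath()):
    """Read a raw text artifact and write a cleaned copy as the output artifact."""
    with open(raw_path) as src, open(clean_path, 'w') as dst:
        for line in src:
            dst.write(line.strip().lower() + '\n')

# Wrap the function as a reusable component that runs in a container image.
preprocess_op = create_component_from_func(preprocess, base_image='python:3.9')
```

At run time, KFP materializes the input artifact as a local file at `raw_path` and captures whatever the function writes to `clean_path` as the output artifact.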
Each task usually includes two parts:

- Client code: the code that talks to endpoints to submit jobs. For example, code that talks to the Google Dataproc API to submit a Spark job.
- Runtime code: the code that does the actual job and usually runs in the cluster. For example, Spark code that transforms raw data into preprocessed data.

Each task also provides a container image that runs the client code.
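To make the split concrete, the sketch below shows what the client-code half might look like: a module that only submits a job spec to a remote service and polls until the job finishes, while the runtime code does the real work in the cluster. The service URL, endpoints, and payload fields (`job_id`, `status`) are hypothetical placeholders, not an API from this repository:

```python
# mytask.py -- a sketch of the "client code" half of a component.
# It only submits work to an external service and polls until the job
# finishes; the actual processing (the "runtime code") runs in the cluster.
# SERVICE_URL and the request/response shapes are hypothetical.
import json
import time
import urllib.request

SERVICE_URL = 'http://job-service.example.com/jobs'  # placeholder endpoint

def submit_job(spec: dict) -> str:
    """POST a job spec and return the service-assigned job ID."""
    req = urllib.request.Request(
        SERVICE_URL,
        data=json.dumps(spec).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)['job_id']

def wait_for_job(job_id: str, poll_seconds: int = 10) -> None:
    """Poll the job status endpoint until the job leaves the RUNNING state."""
    while True:
        with urllib.request.urlopen(f'{SERVICE_URL}/{job_id}') as resp:
            status = json.load(resp)['status']
        if status != 'RUNNING':
            return
        time.sleep(poll_seconds)
```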
Note the naming convention for client code and runtime code. For a task named "mytask":

- The `mytask.py` program contains the client code.
- The `mytask` directory contains all the runtime code.
See how to use the Kubeflow Pipelines SDK and build your own components.
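Reusable components are typically described by a `component.yaml` file that the SDK can load and wire into a pipeline. A minimal sketch with the KFP v1 SDK follows; the component path, its single input, and the default URI are placeholders:

```python
# A sketch of consuming a reusable component in a pipeline with the KFP v1
# SDK. 'mytask/component.yaml' and the component's input are hypothetical.
import kfp
from kfp.components import load_component_from_file

mytask_op = load_component_from_file('mytask/component.yaml')

@kfp.dsl.pipeline(name='mytask-pipeline', description='Runs the mytask component.')
def my_pipeline(input_uri: str = 'gs://example-bucket/data'):
    # Each call to the loaded component becomes one task in the pipeline graph.
    mytask_op(input_uri)

if __name__ == '__main__':
    # Compile to a package that can be uploaded to a KFP cluster.
    kfp.compiler.Compiler().compile(my_pipeline, 'mytask_pipeline.yaml')
```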