pipelines/components
Nicholas Thomson f2a860b84c
[AWS SageMaker] Integration tests automation (#3768)
2020-05-20 14:18:19 -07:00
arena Make wget quieter (#2069) 2019-09-09 14:32:54 -07:00
aws [AWS SageMaker] Integration tests automation (#3768) 2020-05-20 14:18:19 -07:00
deprecated Remove dataflow components (#2161) 2019-09-23 11:12:27 -07:00
diagnostics/diagnose_me quick fix for quota list (#3075) 2020-02-14 09:18:18 -08:00
filesystem Components - Filesystem (#2659) 2019-11-27 11:43:03 -08:00
gcp raise error when LRO has an error (#3666) 2020-05-01 11:48:29 -07:00
git/clone Components - Git clone (#2658) 2019-11-26 20:29:19 -08:00
google-cloud/storage Components - Google Cloud Storage (#2532) 2019-11-07 18:06:19 -08:00
ibm-components Update Watson ML example to take output param path (#3316) 2020-03-20 22:10:44 -07:00
kubeflow Katib Launcher Experiment Name Conflict (#3508) 2020-05-04 21:41:07 -07:00
local Release ad9bd5648d (#3560) 2020-04-22 14:00:15 -07:00
nuclio add nuclio components (to build/deploy, delete, invoke functions) (#1295) 2019-05-08 01:58:33 -07:00
presto/query add presto pipeline component (#3261) 2020-03-14 17:32:34 -07:00
sample/keras/train_classifier SDK - Hiding Argo's workflow.uid placeholder behind DSL (#1683) 2019-10-07 18:33:11 -07:00
tensorflow/tensorboard/prepare_tensorboard Components - Tensorboard visualization (#3760) 2020-05-13 18:32:22 -07:00
tfx Fixed small syntax error in a sample notebook (#3721) 2020-05-08 17:07:51 -07:00
OWNERS add jiaxiao to the component owners (#2804) 2020-01-07 12:48:18 -08:00
README.md move old gcp components to deprecated folder (#2031) 2019-09-06 16:29:20 -07:00
build_image.sh common build image script (#815) 2019-02-13 10:37:19 -08:00
license.sh Initial commit of the kubeflow/pipeline project. 2018-11-02 14:02:31 -07:00
release.sh Build - Fix building TF images (#2736) 2019-12-16 14:25:38 -08:00
test_load_all_components.sh Test loading all component.yaml definitions (#1045) 2019-04-02 12:25:18 -07:00
third_party_licenses.csv add missing license for component image (#3543) 2020-04-17 17:44:25 -07:00

README.md

Kubeflow pipeline components

Kubeflow pipeline components are implementations of Kubeflow pipeline tasks. Each task takes one or more artifacts as input and may produce one or more artifacts as output.

Example: XGBoost DataProc components
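A pipeline typically consumes one of these components by loading its component.yaml with the KFP SDK and invoking it as a task. A minimal sketch using the KFP v1 SDK; the component URL below is a placeholder, not a real path in this repo:

```python
import kfp
from kfp import dsl
from kfp.components import load_component_from_url

# Placeholder URL: point this at the raw component.yaml of any component
# in this repo (each component directory ships one).
COMPONENT_URL = ('https://raw.githubusercontent.com/kubeflow/pipelines/'
                 'master/components/<component>/component.yaml')

my_task_op = load_component_from_url(COMPONENT_URL)  # returns a task factory


@dsl.pipeline(name='component-demo', description='Runs one loaded component.')
def demo_pipeline():
    my_task_op()  # arguments depend on the component's declared inputs


if __name__ == '__main__':
    # Compile to a workflow package that the KFP backend can run.
    kfp.compiler.Compiler().compile(demo_pipeline, 'demo_pipeline.yaml')
```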

Each task usually includes three parts:

  • Client code: the code that talks to endpoints to submit jobs. For example, code that talks to the Google Dataproc API to submit a Spark job.
  • Runtime code: the code that does the actual work and usually runs in the cluster. For example, Spark code that transforms raw data into preprocessed data.
  • Container: a container image that runs the client code.

Note the naming convention for client code and runtime code—for a task named "mytask":

  • The mytask.py program contains the client code.
  • The mytask directory contains all the runtime code.
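To make the split concrete, here is a hypothetical mytask.py client-code sketch; the endpoint, flags, and polling loop are all illustrative stand-ins rather than a real service API:

```python
# mytask.py - hypothetical client code for a task named "mytask".
# The client only submits the job and polls for completion; the actual
# work is done by runtime code (under the mytask/ directory) in the cluster.
import argparse
import time


def submit_and_wait(endpoint: str, job_spec: dict) -> str:
    """Illustrative submit/poll loop; real client code would call the
    service's API here (e.g. the Dataproc jobs.submit endpoint)."""
    print('Submitting %s to %s' % (job_spec, endpoint))
    job_id = 'job-0001'  # a real API would return this identifier
    for attempt in range(3):
        time.sleep(1)  # fixed poll interval
        print('Polling status of %s (attempt %d)' % (job_id, attempt + 1))
    return job_id


def main():
    parser = argparse.ArgumentParser(description='mytask client code')
    parser.add_argument('--endpoint', required=True)
    parser.add_argument('--job-spec', default='{}',
                        help='JSON job specification (illustrative)')
    args = parser.parse_args()
    submit_and_wait(args.endpoint, {'raw': args.job_spec})


if __name__ == '__main__':
    main()
```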

See the Kubeflow Pipelines documentation for how to use the Kubeflow Pipelines SDK and how to build your own components.
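For the simplest path to a custom component, the KFP v1 SDK can wrap a plain Python function as a container op; a minimal sketch (the function and pipeline names are illustrative):

```python
import kfp
from kfp import dsl
from kfp.components import func_to_container_op


def add(a: float, b: float) -> float:
    """Runtime code: executes inside a container in the cluster."""
    return a + b


# Wrap the function as a reusable pipeline task factory.
add_op = func_to_container_op(add)


@dsl.pipeline(name='add-demo', description='Chains two add tasks.')
def add_pipeline(a: float = 1.0, b: float = 7.0):
    first = add_op(a, b)
    second = add_op(first.output, b)  # consume the first task's output


if __name__ == '__main__':
    kfp.compiler.Compiler().compile(add_pipeline, 'add_pipeline.yaml')
```

Compiling produces a workflow package that can then be uploaded through the KFP UI or submitted with the KFP client.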