# Kubeflow pipeline components
Kubeflow pipeline components are implementations of Kubeflow pipeline tasks. Each task takes one or more artifacts as input and may produce one or more artifacts as output.
Example: XGBoost Dataproc components
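For a concrete sense of what a task looks like, here is a minimal sketch of a Python-function-based component with one input artifact and one output artifact, written with the KFP v1 SDK. The preprocessing logic and base image are illustrative, not taken from this repository:

```python
# A minimal sketch of a task that consumes one artifact and produces another,
# using the KFP v1 SDK. The cleaning logic here is illustrative only.
from kfp.components import InputPath, OutputPath, create_component_from_func

def preprocess(raw_path: InputPath(), clean_path: OutputPath()):
    """Read a raw text artifact and write a cleaned copy as the output artifact."""
    with open(raw_path) as src, open(clean_path, 'w') as dst:
        for line in src:
            dst.write(line.strip().lower() + '\n')

# Wrap the function as a reusable component that runs in a container image.
preprocess_op = create_component_from_func(preprocess, base_image='python:3.9')
```

At run time, KFP materializes the input artifact as a local file at `raw_path` and captures whatever the function writes to `clean_path` as the output artifact.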
Each task usually includes two parts:

- Client code: the code that talks to endpoints to submit jobs. For example, code that talks to the Google Dataproc API to submit a Spark job.
- Runtime code: the code that does the actual job and usually runs in the cluster. For example, Spark code that transforms raw data into preprocessed data.

Each task also provides a container image that runs the client code.
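To make the split concrete, the sketch below shows what the client-code half might look like: a module that only submits a job spec to a remote service and polls until the job finishes, while the runtime code does the real work in the cluster. The service URL, endpoints, and payload fields (`job_id`, `status`) are hypothetical placeholders, not an API from this repository:

```python
# mytask.py -- a sketch of the "client code" half of a component.
# It only submits work to an external service and polls until the job
# finishes; the actual processing (the "runtime code") runs in the cluster.
# SERVICE_URL and the request/response shapes are hypothetical.
import json
import time
import urllib.request

SERVICE_URL = 'http://job-service.example.com/jobs'  # placeholder endpoint

def submit_job(spec: dict) -> str:
    """POST a job spec and return the service-assigned job ID."""
    req = urllib.request.Request(
        SERVICE_URL,
        data=json.dumps(spec).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)['job_id']

def wait_for_job(job_id: str, poll_seconds: int = 10) -> None:
    """Poll the job status endpoint until the job leaves the RUNNING state."""
    while True:
        with urllib.request.urlopen(f'{SERVICE_URL}/{job_id}') as resp:
            status = json.load(resp)['status']
        if status != 'RUNNING':
            return
        time.sleep(poll_seconds)
```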
Note the naming convention for client code and runtime code. For a task named "mytask":

- The `mytask.py` program contains the client code.
- The `mytask` directory contains all the runtime code.
See how to use the Kubeflow Pipelines SDK and build your own components.
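Reusable components are typically described by a `component.yaml` file that the SDK can load and wire into a pipeline. A minimal sketch with the KFP v1 SDK follows; the component path, its single input, and the default URI are placeholders:

```python
# A sketch of consuming a reusable component in a pipeline with the KFP v1
# SDK. 'mytask/component.yaml' and the component's input are hypothetical.
import kfp
from kfp.components import load_component_from_file

mytask_op = load_component_from_file('mytask/component.yaml')

@kfp.dsl.pipeline(name='mytask-pipeline', description='Runs the mytask component.')
def my_pipeline(input_uri: str = 'gs://example-bucket/data'):
    # Each call to the loaded component becomes one task in the pipeline graph.
    mytask_op(input_uri)

if __name__ == '__main__':
    # Compile to a package that can be uploaded to a KFP cluster.
    kfp.compiler.Compiler().compile(my_pipeline, 'mytask_pipeline.yaml')
```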