ML Pipeline Components

ML Pipeline Components are implementations of ML Pipeline tasks. Each task takes one or more artifacts as input and may produce one or more artifacts as output.

XGBoost DataProc Components

Each task usually includes the following parts:

- Client code: the code that talks to service endpoints to submit jobs. For example, code that calls the Google Dataproc API to submit a Spark job (see the sketch after this list).

- Runtime code: the code that does the actual work and usually runs in the cluster. For example, Spark code that transforms raw data into preprocessed data.

- Container: a container image that packages and runs the client code.
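To make the client-code part concrete, here is a minimal sketch of submitting a Spark job to Dataproc through the Google API client library. It is illustrative only, not code from this repository; `PROJECT`, `REGION`, `CLUSTER_NAME`, and the jar URI are hypothetical placeholders.

```python
# Minimal sketch of component "client code": submit a Spark job to
# Google Cloud Dataproc. Assumes a GCP project with a running
# Dataproc cluster; all names below are placeholders.
from googleapiclient import discovery

PROJECT = 'my-project'        # hypothetical GCP project id
REGION = 'us-central1'        # hypothetical Dataproc region
CLUSTER_NAME = 'my-cluster'   # hypothetical cluster name


def submit_spark_job(main_jar_uri, args):
    """Submit a Spark job to Dataproc and return its job id."""
    client = discovery.build('dataproc', 'v1')
    job_spec = {
        'job': {
            'placement': {'clusterName': CLUSTER_NAME},
            'sparkJob': {
                # Runtime code, packaged as a jar on GCS.
                'mainJarFileUri': main_jar_uri,
                'args': args,
            },
        }
    }
    result = client.projects().regions().jobs().submit(
        projectId=PROJECT, region=REGION, body=job_spec).execute()
    return result['reference']['jobId']
```

A component's container image would run a script along these lines, passing pipeline parameters through as job arguments.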

There is a naming convention for client code and runtime code: for a task named "mytask", the client code lives in mytask.py and the runtime code lives in a mytask directory (see the layout sketch below).
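Following that convention, a hypothetical task named "mytask" would be laid out as follows (illustrative only; the actual components in this repository live under directories such as dataflow/ and dataproc/):

```
mytask.py     # client code: talks to the service endpoint and submits the job
mytask/       # runtime code: the logic that actually runs in the cluster
```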