mirror of https://github.com/kubeflow/examples.git
KFP version: 1.7.0+

Kubernetes version: 1.17+

# Orchestrate Spark Jobs using Kubeflow Pipelines

## Install Kubeflow Pipelines standalone or full Kubeflow

### For a standalone Kubeflow Pipelines installation

https://www.kubeflow.org/docs/components/pipelines/installation/

### For a full Kubeflow installation

https://www.kubeflow.org/docs/started/installing-kubeflow/

## Install Spark Operator

https://github.com/GoogleCloudPlatform/spark-on-k8s-operator#installation

## Create Spark Service Account and add permissions

```
kubectl apply -f ./scripts/spark-rbac.yaml
```

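The contents of `scripts/spark-rbac.yaml` are not shown here. As a rough sketch of what such a file typically grants, the following builds a ServiceAccount, Role and RoleBinding for `spark-sa` as plain Python dictionaries and prints them as JSON, which `kubectl apply -f -` also accepts. The `kubeflow` namespace, the `spark-sa-role` name and the rule list are assumptions for illustration, not the repository's actual manifest.

```python
import json

NAMESPACE = "kubeflow"        # assumption: namespace where pipeline runs execute
SERVICE_ACCOUNT = "spark-sa"  # service account name used later in this README


def spark_rbac_manifests(namespace=NAMESPACE, name=SERVICE_ACCOUNT):
    """Return a Kubernetes List holding a ServiceAccount, Role and RoleBinding."""
    role_name = f"{name}-role"  # hypothetical role name
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": name, "namespace": namespace},
    }
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        # Assumed rules: the Spark driver manages executor pods and related
        # objects, and the pipeline step creates SparkApplication resources.
        "rules": [
            {
                "apiGroups": [""],
                "resources": ["pods", "pods/log", "services", "configmaps"],
                "verbs": ["*"],
            },
            {
                "apiGroups": ["sparkoperator.k8s.io"],
                "resources": ["sparkapplications"],
                "verbs": ["*"],
            },
        ],
    }
    role_binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{name}-binding", "namespace": namespace},
        "subjects": [
            {"kind": "ServiceAccount", "name": name, "namespace": namespace}
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [service_account, role, role_binding],
    }


if __name__ == "__main__":
    print(json.dumps(spark_rbac_manifests(), indent=2))
```

If saved as a script (for example a hypothetical `spark_rbac.py`), its output could be piped to `kubectl apply -f -`. Prefer the repository's own `spark-rbac.yaml` whenever it is available.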
## Run the notebook kubeflow-pipeline.ipynb

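The notebook itself is not reproduced here, so the following is only a sketch of the kind of `SparkApplication` resource a pipeline step might hand to the Spark Operator. It is modelled on the operator's well-known `spark-pi` example. The application name, namespace, image, jar path and resource sizes are assumptions for illustration, while `spark-sa` is the service account created in the previous step.

```python
import json


def spark_pi_application(name="spark-pi", namespace="kubeflow",
                         service_account="spark-sa"):
    """Build a SparkApplication manifest (sparkoperator.k8s.io/v1beta2) as a dict."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Scala",
            "mode": "cluster",
            # Assumed image and jar, taken from the operator's spark-pi example.
            "image": "gcr.io/spark-operator/spark:v3.1.1",
            "mainClass": "org.apache.spark.examples.SparkPi",
            "mainApplicationFile": (
                "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar"
            ),
            "sparkVersion": "3.1.1",
            "restartPolicy": {"type": "Never"},
            "driver": {
                "cores": 1,
                "memory": "512m",
                # The driver runs under the service account created earlier.
                "serviceAccount": service_account,
            },
            "executor": {"cores": 1, "instances": 1, "memory": "512m"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(spark_pi_application(), indent=2))
```

A pipeline step would submit a manifest like this to the cluster and later read back its status. How the notebook wires that into KFP components is not specified in this README.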
## Access Kubeflow/KFP UI

## OR

## Upload pipeline

Upload the `spark_job_pipeline.yaml` file.

## Create run

## Start the pipeline and add the service account `spark-sa`

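The same run can, in principle, be started from the KFP SDK instead of the UI. The sketch below assumes the `kfp` 1.x SDK is installed and that its `Client.create_run_from_pipeline_package` accepts a `service_account` argument, as 1.x releases do. The run name, experiment name and host URL are hypothetical placeholders.

```python
def build_run_request(pipeline_file="spark_job_pipeline.yaml",
                      service_account="spark-sa"):
    """Collect the keyword arguments for starting a run of the compiled pipeline."""
    return {
        "pipeline_file": pipeline_file,
        "arguments": {},                   # assumption: no pipeline parameters
        "run_name": "spark-job-run",       # hypothetical run name
        "experiment_name": "spark-jobs",   # hypothetical experiment name
        "service_account": service_account,
    }


def submit_run(host):
    """Start the run against a KFP endpoint; needs the kfp package and a cluster."""
    import kfp  # imported lazily so the helper above works without kfp installed

    client = kfp.Client(host=host)
    return client.create_run_from_pipeline_package(**build_run_request())
```

For example, `submit_run("http://localhost:8080")` would target a port-forwarded KFP UI, if that is how the endpoint is exposed in your setup.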
## Wait until the execution is finished, then check the `print-message` logs to view the result

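Waiting for the Spark job amounts to polling the `SparkApplication` until it reaches a final state. The following is a minimal sketch of that idea, assuming the application state is reported under `status.applicationState.state` and that `COMPLETED` and `FAILED` are the states to stop on. The `fetch` callable is a hypothetical stand-in for whatever reads the resource from the cluster.

```python
import time
from typing import Callable

# Assumption: states after which the application is no longer progressing.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def application_state(app: dict) -> str:
    """Extract status.applicationState.state, or '' if it is not set yet."""
    status = app.get("status") or {}
    return (status.get("applicationState") or {}).get("state", "")


def wait_for_spark_app(fetch: Callable[[], dict],
                       poll_seconds: float = 10.0,
                       timeout_seconds: float = 600.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll fetch() until the application reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = application_state(fetch())
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Spark application still in state {state!r}")
        sleep(poll_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays. In the pipeline itself, the final result is read from the `print-message` step's logs as described above.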