mirror of https://github.com/kubeflow/examples.git
Remove outdated video, Add KFP spark example (#902)
* Add js-ts as an approver
* Remove outdated video
* Add kfp-spark example

This commit is contained in:
parent 79418168c3
commit 11ebbba517
LICENSE
@@ -0,0 +1,201 @@
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!) The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
README.md
@@ -0,0 +1,50 @@
KFP version: 1.7.0+
Kubernetes version: 1.17+

# Orchestrate Spark Jobs using Kubeflow Pipelines

## Install Kubeflow Pipelines (standalone) or full Kubeflow

### For a standalone Kubeflow Pipelines installation
https://www.kubeflow.org/docs/components/pipelines/installation/

### For a full Kubeflow installation
https://www.kubeflow.org/docs/started/installing-kubeflow/

## Install the Spark Operator

https://github.com/GoogleCloudPlatform/spark-on-k8s-operator#installation
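For reference, the operator's documented Helm installation looks roughly like this — a sketch only; the chart repository, release name, and namespace may have changed, so follow the link above:

```
helm repo add spark-operator https://googlecloudplatform.github.io/spark-on-k8s-operator
helm install spark-operator spark-operator/spark-operator --namespace spark-operator --create-namespace
```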
## Create the Spark service account and add permissions

```
kubectl apply -f ./scripts/spark-rbac.yaml
```
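A quick sanity check that the objects exist (the names and the `kubeflow` namespace come from `spark-rbac.yaml` below):

```
kubectl get serviceaccount spark-sa -n kubeflow
kubectl get role spark-role -n kubeflow
kubectl get rolebinding spark-role-binding -n kubeflow
```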
## Run the notebook `kubeflow-pipeline.ipynb`
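For example, from a local Jupyter install (any notebook environment that can reach your cluster works):

```
jupyter notebook kubeflow-pipeline.ipynb
```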
## Access the Kubeflow/KFP UI



## OR


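With a standalone install, one common way to reach the UI is port-forwarding; the service name and namespace below assume the default standalone deployment:

```
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
# then open http://localhost:8080
```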
## Upload pipeline

Upload the `spark_job_pipeline.yaml` file


## Create a run



## Start the pipeline and add the service account `spark-sa`



## Wait until the execution finishes, then check the `print-message` logs to view the result


Binary files (six added screenshot images) not shown.
k8s-apply-component.yaml
@@ -0,0 +1,32 @@
name: Apply Kubernetes object
inputs:
- {name: Object, type: JsonObject}
outputs:
- {name: Name, type: String}
- {name: Kind, type: String}
- {name: Object, type: JsonObject}
metadata:
  annotations:
    author: Alexey Volkov <alexey.volkov@ark-kun.com>
implementation:
  container:
    image: bitnami/kubectl:1.17.17
    command:
    - bash
    - -exc
    - |
      object_path=$0
      output_name_path=$1
      output_kind_path=$2
      output_object_path=$3
      mkdir -p "$(dirname "$output_name_path")"
      mkdir -p "$(dirname "$output_kind_path")"
      mkdir -p "$(dirname "$output_object_path")"
      kubectl apply -f "$object_path" --output=json > "$output_object_path"

      < "$output_object_path" jq '.metadata.name' --raw-output > "$output_name_path"
      < "$output_object_path" jq '.kind' --raw-output > "$output_kind_path"
    - {inputPath: Object}
    - {outputPath: Name}
    - {outputPath: Kind}
    - {outputPath: Object}
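Outside a pipeline, the apply step above boils down to this shell sequence (`your-object.yaml` and the temp path are hypothetical stand-ins for the component's mounted input/output files):

```
kubectl apply -f your-object.yaml --output=json > /tmp/applied.json
jq --raw-output '.metadata.name' /tmp/applied.json   # becomes the Name output
jq --raw-output '.kind' /tmp/applied.json            # becomes the Kind output
```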
k8s-get-component.yaml
@@ -0,0 +1,37 @@
name: Get Kubernetes object
inputs:
- {name: Name, type: String}
- {name: Kind, type: String}
outputs:
- {name: Name, type: String}
- {name: ApplicationState, type: String}
- {name: Object, type: JsonObject}
metadata:
  annotations:
    author: Alexey Volkov <alexey.volkov@ark-kun.com>
implementation:
  container:
    image: bitnami/kubectl:1.17.17
    command:
    - bash
    - -exc
    - |
      object_name=$0
      object_type=$1
      output_name_path=$2
      output_state_path=$3
      output_object_path=$4
      mkdir -p "$(dirname "$output_name_path")"
      mkdir -p "$(dirname "$output_state_path")"
      mkdir -p "$(dirname "$output_object_path")"

      kubectl get "$object_type" "$object_name" --output=json > "$output_object_path"

      < "$output_object_path" jq '.metadata.name' --raw-output > "$output_name_path"
      < "$output_object_path" jq '.status.applicationState.state' --raw-output > "$output_state_path"

    - {inputValue: Name}
    - {inputValue: Kind}
    - {outputPath: Name}
    - {outputPath: ApplicationState}
    - {outputPath: Object}
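The notebook's recursive graph component re-invokes this check until the state reaches `COMPLETED`. A rough manual equivalent in plain shell (the application name is illustrative; `<epoch>` stays a placeholder):

```
while [ "$(kubectl get sparkapplications spark-pi-<epoch> -n kubeflow -o json \
    | jq --raw-output '.status.applicationState.state')" != "COMPLETED" ]; do
  sleep 5
done
```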
kubeflow-pipeline.ipynb
@@ -0,0 +1,264 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the following command to install the Kubeflow Pipelines SDK. If you run this command in a Jupyter\n",
    "notebook, restart the kernel after installing the SDK."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install kfp --upgrade\n",
    "# to install the Tekton compiler, uncomment the line below\n",
    "# %pip install kfp_tekton"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import time\n",
    "import yaml\n",
    "\n",
    "import kfp\n",
    "import kfp.components as comp\n",
    "import kfp.dsl as dsl"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "SPARK_COMPLETED_STATE = \"COMPLETED\"\n",
    "SPARK_APPLICATION_KIND = \"sparkapplications\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_spark_job_definition():\n",
    "    \"\"\"\n",
    "    Read the Spark Operator job manifest file, return the corresponding dictionary,\n",
    "    and add some randomness to the job name.\n",
    "    :return: dictionary defining the spark job\n",
    "    \"\"\"\n",
    "    # Read manifest file\n",
    "    with open(\"spark-job.yaml\", \"r\") as stream:\n",
    "        spark_job_manifest = yaml.safe_load(stream)\n",
    "\n",
    "    # Add epoch time in the job name\n",
    "    epoch = int(time.time())\n",
    "    spark_job_manifest[\"metadata\"][\"name\"] = spark_job_manifest[\"metadata\"][\"name\"].format(epoch=epoch)\n",
    "\n",
    "    return spark_job_manifest"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def print_op(msg):\n",
    "    \"\"\"\n",
    "    Op to print a message.\n",
    "    \"\"\"\n",
    "    return dsl.ContainerOp(\n",
    "        name=\"Print message.\",\n",
    "        image=\"alpine:3.6\",\n",
    "        command=[\"echo\", msg],\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "@dsl.graph_component  # graph_component decorator is used to annotate recursive functions\n",
    "def graph_component_spark_app_status(input_application_name):\n",
    "    k8s_get_op = comp.load_component_from_file(\"k8s-get-component.yaml\")\n",
    "    check_spark_application_status_op = k8s_get_op(\n",
    "        name=input_application_name,\n",
    "        kind=SPARK_APPLICATION_KIND\n",
    "    )\n",
    "    # Remove cache\n",
    "    check_spark_application_status_op.execution_options.caching_strategy.max_cache_staleness = \"P0D\"\n",
    "\n",
    "    time.sleep(5)  # note: this sleep runs while the pipeline graph is built, not in the cluster\n",
    "    with dsl.Condition(check_spark_application_status_op.outputs[\"applicationstate\"] != SPARK_COMPLETED_STATE):\n",
    "        graph_component_spark_app_status(check_spark_application_status_op.outputs[\"name\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "@dsl.pipeline(\n",
    "    name=\"Spark Operator job pipeline\",\n",
    "    description=\"Spark Operator job pipeline\"\n",
    ")\n",
    "def spark_job_pipeline():\n",
    "\n",
    "    # Load the spark job manifest\n",
    "    spark_job_definition = get_spark_job_definition()\n",
    "\n",
    "    # Load the kubernetes apply component\n",
    "    k8s_apply_op = comp.load_component_from_file(\"k8s-apply-component.yaml\")\n",
    "\n",
    "    # Execute the apply command\n",
    "    spark_job_op = k8s_apply_op(object=json.dumps(spark_job_definition))\n",
    "\n",
    "    # Fetch the spark job name\n",
    "    spark_job_name = spark_job_op.outputs[\"name\"]\n",
    "\n",
    "    # Remove cache for the apply operator\n",
    "    spark_job_op.execution_options.caching_strategy.max_cache_staleness = \"P0D\"\n",
    "\n",
    "    spark_application_status_op = graph_component_spark_app_status(spark_job_op.outputs[\"name\"])\n",
    "    spark_application_status_op.after(spark_job_op)\n",
    "\n",
    "    print_message = print_op(f\"Job {spark_job_name} is completed.\")\n",
    "    print_message.after(spark_application_status_op)\n",
    "    print_message.execution_options.caching_strategy.max_cache_staleness = \"P0D\"\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compile and run your pipeline\n",
    "\n",
    "After defining the pipeline in Python as described in the preceding section, use one of the following options to compile the pipeline and submit it to the Kubeflow Pipelines service.\n",
    "\n",
    "#### Option 1: Compile and then upload in UI\n",
    "\n",
    "1. Run the following to compile your pipeline and save it as `spark_job_pipeline.yaml`. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For Argo (Default)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the pipeline file for the Argo backend (the default).\n",
    "# If you use Tekton, use the block below instead.\n",
    "if __name__ == \"__main__\":\n",
    "    # Compile the pipeline\n",
    "    import kfp.compiler as compiler\n",
    "    import logging\n",
    "    logging.basicConfig(level=logging.INFO)\n",
    "    pipeline_func = spark_job_pipeline\n",
    "    pipeline_filename = pipeline_func.__name__ + \".yaml\"\n",
    "    compiler.Compiler().compile(pipeline_func, pipeline_filename)\n",
    "    logging.info(f\"Generated pipeline file: {pipeline_filename}.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For Tekton"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# uncomment the block below to create the pipeline file for Tekton\n",
    "\n",
    "# if __name__ == '__main__':\n",
    "#     from kfp_tekton.compiler import TektonCompiler\n",
    "#     import logging\n",
    "#     logging.basicConfig(level=logging.INFO)\n",
    "#     pipeline_func = spark_job_pipeline\n",
    "#     pipeline_filename = pipeline_func.__name__ + \".yaml\"\n",
    "#     TektonCompiler().compile(pipeline_func, pipeline_filename)\n",
    "#     logging.info(f\"Generated pipeline file: {pipeline_filename}.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Upload and run your `spark_job_pipeline.yaml` using the Kubeflow Pipelines user interface.\n",
    "See the guide to [getting started with the UI][quickstart].\n",
    "\n",
    "[quickstart]: https://www.kubeflow.org/docs/components/pipelines/overview/quickstart"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Option 2: run the pipeline using the Kubeflow Pipelines SDK client\n",
    "\n",
    "1. Create an instance of the [`kfp.Client` class][kfp-client] following the steps in [connecting to Kubeflow Pipelines using the SDK client][connect-api].\n",
    "\n",
    "[kfp-client]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html#kfp.Client\n",
    "[connect-api]: https://www.kubeflow.org/docs/components/pipelines/sdk/connect-api"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "client = kfp.Client()  # change arguments accordingly"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "client.create_run_from_pipeline_func(\n",
    "    spark_job_pipeline)"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
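While a run is executing, the SparkApplication the pipeline creates can also be watched directly (the `kubeflow` namespace comes from `spark-job.yaml` below):

```
kubectl get sparkapplications -n kubeflow -w
```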
spark-job.yaml
@@ -0,0 +1,54 @@
#
# Copyright 2017 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-pi-{epoch}
  namespace: kubeflow
spec:
  type: Scala
  mode: cluster
  image: "gcr.io/spark-operator/spark:v3.1.1"
  imagePullPolicy: Always
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar"
  sparkVersion: "3.1.1"
  restartPolicy:
    type: Never
  volumes:
    - name: "test-volume"
      hostPath:
        path: "/tmp"
        type: Directory
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.1.1
    serviceAccount: spark-sa
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
  executor:
    cores: 1
    instances: 2
    memory: "1024m"
    labels:
      version: 3.1.1
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
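Once submitted, the Spark Operator launches a driver pod named after the application; the SparkPi result shows up in its logs. The name below is illustrative — the notebook fills in the `{epoch}` suffix:

```
kubectl logs spark-pi-<epoch>-driver -n kubeflow
```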
spark-rbac.yaml
@@ -0,0 +1,32 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark-sa
  namespace: kubeflow
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: kubeflow
  name: spark-role
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "pods/log"]
    verbs: ["create", "get", "watch", "list", "post", "delete", "patch"]
  - apiGroups: ["sparkoperator.k8s.io"]
    resources: ["sparkapplications"]
    verbs: ["create", "get", "watch", "list", "post", "delete", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-role-binding
  namespace: kubeflow
subjects:
  - kind: ServiceAccount
    name: spark-sa
    namespace: kubeflow
roleRef:
  kind: Role
  name: spark-role
  apiGroup: rbac.authorization.k8s.io
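A standard impersonation check to confirm the binding grants what the pipeline needs:

```
kubectl auth can-i create sparkapplications.sparkoperator.k8s.io \
  --as=system:serviceaccount:kubeflow:spark-sa -n kubeflow
```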
spark_job_pipeline.yaml
@@ -0,0 +1,201 @@
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: spark-operator-job-pipeline-
  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10, pipelines.kubeflow.org/pipeline_compilation_time: '2021-12-14T17:26:58.647651',
    pipelines.kubeflow.org/pipeline_spec: '{"description": "Spark Operator job pipeline",
      "name": "Spark Operator job pipeline"}'}
  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.10}
spec:
  entrypoint: spark-operator-job-pipeline
  templates:
  - name: apply-kubernetes-object
    container:
      args: []
      command:
      - bash
      - -exc
      - |
        object_path=$0
        output_name_path=$1
        output_kind_path=$2
        output_object_path=$3
        mkdir -p "$(dirname "$output_name_path")"
        mkdir -p "$(dirname "$output_kind_path")"
        mkdir -p "$(dirname "$output_object_path")"
        kubectl apply -f "$object_path" --output=json > "$output_object_path"

        < "$output_object_path" jq '.metadata.name' --raw-output > "$output_name_path"
        < "$output_object_path" jq '.kind' --raw-output > "$output_kind_path"
      - /tmp/inputs/Object/data
      - /tmp/outputs/Name/data
      - /tmp/outputs/Kind/data
      - /tmp/outputs/Object/data
      image: bitnami/kubectl:1.17.17
    inputs:
      artifacts:
      - name: Object
        path: /tmp/inputs/Object/data
        raw: {data: '{"apiVersion": "sparkoperator.k8s.io/v1beta2", "kind": "SparkApplication",
            "metadata": {"name": "spark-pi-1639502813", "namespace": "kubeflow"},
            "spec": {"type": "Scala", "mode": "cluster", "image": "gcr.io/spark-operator/spark:v3.1.1",
            "imagePullPolicy": "Always", "mainClass": "org.apache.spark.examples.SparkPi",
            "mainApplicationFile": "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar",
            "sparkVersion": "3.1.1", "restartPolicy": {"type": "Never"}, "volumes":
            [{"name": "test-volume", "hostPath": {"path": "/tmp", "type": "Directory"}}],
            "driver": {"cores": 1, "coreLimit": "1200m", "memory": "512m", "labels":
            {"version": "3.1.1"}, "serviceAccount": "spark-sa", "volumeMounts": [{"name":
            "test-volume", "mountPath": "/tmp"}]}, "executor": {"cores": 1, "instances":
            2, "memory": "1024m", "labels": {"version": "3.1.1"}, "volumeMounts":
            [{"name": "test-volume", "mountPath": "/tmp"}]}}}'}
    outputs:
      parameters:
      - name: apply-kubernetes-object-Name
        valueFrom: {path: /tmp/outputs/Name/data}
      artifacts:
      - {name: apply-kubernetes-object-Kind, path: /tmp/outputs/Kind/data}
      - {name: apply-kubernetes-object-Name, path: /tmp/outputs/Name/data}
      - {name: apply-kubernetes-object-Object, path: /tmp/outputs/Object/data}
    metadata:
      annotations: {author: Alexey Volkov <alexey.volkov@ark-kun.com>, pipelines.kubeflow.org/component_spec: '{"implementation":
          {"container": {"command": ["bash", "-exc", "object_path=$0\noutput_name_path=$1\noutput_kind_path=$2\noutput_object_path=$3\nmkdir
          -p \"$(dirname \"$output_name_path\")\"\nmkdir -p \"$(dirname \"$output_kind_path\")\"\nmkdir
          -p \"$(dirname \"$output_object_path\")\"\nkubectl apply -f \"$object_path\"
          --output=json > \"$output_object_path\"\n\n< \"$output_object_path\" jq
          ''.metadata.name'' --raw-output > \"$output_name_path\"\n< \"$output_object_path\"
          jq ''.kind'' --raw-output > \"$output_kind_path\"\n", {"inputPath": "Object"},
          {"outputPath": "Name"}, {"outputPath": "Kind"}, {"outputPath": "Object"}],
          "image": "bitnami/kubectl:1.17.17"}}, "inputs": [{"name": "Object", "type":
          "JsonObject"}], "metadata": {"annotations": {"author": "Alexey Volkov <alexey.volkov@ark-kun.com>"}},
          "name": "Apply Kubernetes object", "outputs": [{"name": "Name", "type":
          "String"}, {"name": "Kind", "type": "String"}, {"name": "Object", "type":
          "JsonObject"}]}', pipelines.kubeflow.org/component_ref: '{"digest": "31e4123b45bebd4323a4ffd51fea3744046f9be8e77a2ccf06ba09f80359fcf5",
          "url": "k8s-apply-component.yaml"}', pipelines.kubeflow.org/max_cache_staleness: P0D}
      labels:
        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/enable_caching: "true"
  - name: condition-2
    inputs:
      parameters:
      - {name: get-kubernetes-object-Name}
    dag:
      tasks:
      - name: graph-graph-component-spark-app-status-1
        template: graph-graph-component-spark-app-status-1
        arguments:
          parameters:
          - {name: apply-kubernetes-object-Name, value: '{{inputs.parameters.get-kubernetes-object-Name}}'}
  - name: get-kubernetes-object
    container:
      args: []
      command:
      - bash
      - -exc
      - |
        object_name=$0
        object_type=$1
        output_name_path=$2
        output_state_path=$3
        output_object_path=$4
        mkdir -p "$(dirname "$output_name_path")"
        mkdir -p "$(dirname "$output_state_path")"
        mkdir -p "$(dirname "$output_object_path")"

        kubectl get "$object_type" "$object_name" --output=json > "$output_object_path"

        < "$output_object_path" jq '.metadata.name' --raw-output > "$output_name_path"
        < "$output_object_path" jq '.status.applicationState.state' --raw-output > "$output_state_path"
      - '{{inputs.parameters.apply-kubernetes-object-Name}}'
      - sparkapplications
      - /tmp/outputs/Name/data
      - /tmp/outputs/ApplicationState/data
      - /tmp/outputs/Object/data
      image: bitnami/kubectl:1.17.17
    inputs:
      parameters:
      - {name: apply-kubernetes-object-Name}
    outputs:
      parameters:
      - name: get-kubernetes-object-ApplicationState
        valueFrom: {path: /tmp/outputs/ApplicationState/data}
      - name: get-kubernetes-object-Name
        valueFrom: {path: /tmp/outputs/Name/data}
      artifacts:
      - {name: get-kubernetes-object-ApplicationState, path: /tmp/outputs/ApplicationState/data}
      - {name: get-kubernetes-object-Name, path: /tmp/outputs/Name/data}
      - {name: get-kubernetes-object-Object, path: /tmp/outputs/Object/data}
    metadata:
      annotations: {author: Alexey Volkov <alexey.volkov@ark-kun.com>, pipelines.kubeflow.org/component_spec: '{"implementation":
          {"container": {"command": ["bash", "-exc", "object_name=$0\nobject_type=$1\noutput_name_path=$2\noutput_state_path=$3\noutput_object_path=$4\nmkdir
          -p \"$(dirname \"$output_name_path\")\"\nmkdir -p \"$(dirname \"$output_state_path\")\"\nmkdir
          -p \"$(dirname \"$output_object_path\")\"\n\nkubectl get \"$object_type\"
          \"$object_name\" --output=json > \"$output_object_path\"\n\n< \"$output_object_path\"
          jq ''.metadata.name'' --raw-output > \"$output_name_path\"\n< \"$output_object_path\"
          jq ''.status.applicationState.state'' --raw-output > \"$output_state_path\"\n",
          {"inputValue": "Name"}, {"inputValue": "Kind"}, {"outputPath": "Name"},
          {"outputPath": "ApplicationState"}, {"outputPath": "Object"}], "image":
          "bitnami/kubectl:1.17.17"}}, "inputs": [{"name": "Name", "type": "String"},
          {"name": "Kind", "type": "String"}], "metadata": {"annotations": {"author":
          "Alexey Volkov <alexey.volkov@ark-kun.com>"}}, "name": "Get Kubernetes object",
          "outputs": [{"name": "Name", "type": "String"}, {"name": "ApplicationState",
          "type": "String"}, {"name": "Object", "type": "JsonObject"}]}', pipelines.kubeflow.org/component_ref: '{"digest":
          "fde6162e7783ca7b16b16ad04b667ab01a29c1fb133191941312cc4605114a2c", "url":
          "k8s-get-component.yaml"}', pipelines.kubeflow.org/arguments.parameters: '{"Kind":
          "sparkapplications", "Name": "{{inputs.parameters.apply-kubernetes-object-Name}}"}',
          pipelines.kubeflow.org/max_cache_staleness: P0D}
      labels:
        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/enable_caching: "true"
  - name: graph-graph-component-spark-app-status-1
    inputs:
      parameters:
      - {name: apply-kubernetes-object-Name}
    dag:
      tasks:
      - name: condition-2
        template: condition-2
        when: '"{{tasks.get-kubernetes-object.outputs.parameters.get-kubernetes-object-ApplicationState}}"
          != "COMPLETED"'
        dependencies: [get-kubernetes-object]
        arguments:
          parameters:
          - {name: get-kubernetes-object-Name, value: '{{tasks.get-kubernetes-object.outputs.parameters.get-kubernetes-object-Name}}'}
      - name: get-kubernetes-object
        template: get-kubernetes-object
        arguments:
          parameters:
          - {name: apply-kubernetes-object-Name, value: '{{inputs.parameters.apply-kubernetes-object-Name}}'}
  - name: print-message
    container:
      command: [echo, 'Job {{inputs.parameters.apply-kubernetes-object-Name}} is completed.']
      image: alpine:3.6
    inputs:
      parameters:
      - {name: apply-kubernetes-object-Name}
    metadata:
      labels:
        pipelines.kubeflow.org/kfp_sdk_version: 1.8.10
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/enable_caching: "true"
      annotations: {pipelines.kubeflow.org/max_cache_staleness: P0D}
  - name: spark-operator-job-pipeline
    dag:
      tasks:
      - {name: apply-kubernetes-object, template: apply-kubernetes-object}
      - name: graph-graph-component-spark-app-status-1
        template: graph-graph-component-spark-app-status-1
        dependencies: [apply-kubernetes-object]
        arguments:
          parameters:
          - {name: apply-kubernetes-object-Name, value: '{{tasks.apply-kubernetes-object.outputs.parameters.apply-kubernetes-object-Name}}'}
      - name: print-message
        template: print-message
        dependencies: [apply-kubernetes-object, graph-graph-component-spark-app-status-1]
        arguments:
          parameters:
          - {name: apply-kubernetes-object-Name, value: '{{tasks.apply-kubernetes-object.outputs.parameters.apply-kubernetes-object-Name}}'}
  arguments:
    parameters: []
  serviceAccountName: pipeline-runner
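Since this compiles to a plain Argo Workflow, it can in principle also be submitted with the Argo CLI instead of the KFP UI/SDK — untested here, and it assumes a KFP-style Argo installation with the `pipeline-runner` service account present in the cluster:

```
argo submit -n kubeflow spark_job_pipeline.yaml --watch
```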
videos/README.md
@@ -1,12 +0,0 @@
# Kubeflow Videos

This repository contains the show notes for videos that highlight Kubeflow
capabilities. Here you can find the Terminal commands and links from your favorite
videos, to save on manual transcription.

## Installation

* [From Zero to Kubeflow](from_zero_to_kubeflow/): Michelle Casbon gives a
  walkthrough of two different ways to install Kubeflow from scratch on GCP:
  via the web and command-line.
videos/from_zero_to_kubeflow/README.md
@@ -1,51 +0,0 @@
# From Zero to Kubeflow

Video link: [YouTube](https://www.youtube.com/watch?v=AF-WH967_s4)

## Description

Michelle Casbon gives a straightforward walkthrough of two different ways to
install Kubeflow from scratch on GCP:

* Web-based - [Click-to-deploy](https://deploy.kubeflow.cloud)
* CLI - [kfctl](https://www.kubeflow.org/docs/gke/deploy/deploy-cli/)

## Commands

The following Terminal commands are used.

### Download the `kfctl` binary

```
export KUBEFLOW_TAG=0.5.1
wget -P /tmp https://github.com/kubeflow/kubeflow/releases/download/v${KUBEFLOW_TAG}/kfctl_v${KUBEFLOW_TAG}_darwin.tar.gz
tar -xvf /tmp/kfctl_v${KUBEFLOW_TAG}_darwin.tar.gz -C ${HOME}/bin
```

### Generate the project directory

```
export PROJECT_ID=<project_id>
export CLIENT_ID=<oauth_client_id>
export CLIENT_SECRET=<oauth_client_secret>
kfctl init kubeflow-cli --platform gcp --project ${PROJECT_ID}
```

### Generate all files

```
kfctl generate all --zone us-central1-c
```

### Create all platform and Kubernetes objects

```
kfctl apply all
```

## Links

* [codelabs.developers.google.com](https://codelabs.developers.google.com/)
* [github.com/kubeflow/examples](https://github.com/kubeflow/examples)
* [kubeflow.org](https://www.kubeflow.org/)