Add recurring run demo (#917)

* Add recurring run demo
Alex 2022-02-02 15:21:57 -05:00 committed by GitHub
parent 881853e53f
commit 373b18559a
7 changed files with 465 additions and 0 deletions


@ -11,3 +11,7 @@ that's suitable for presentation to public audiences.
* [Simple pipeline](simple_pipeline/): highlights the use of pipelines and
hyperparameter tuning on a GKE cluster with node autoprovisioning.
* [Recurring Run](recurring/): A simple demo that illustrates how to use
  the Kubeflow Pipelines SDK to provision [recurring
  runs](https://www.kubeflow.org/docs/components/pipelines/concepts/run/).

27
demos/recurring/README.md Normal file

@ -0,0 +1,27 @@
# Kubeflow demo - Recurring runs with the KFP SDK
## 1. Set up your environment
This demo assumes that you have a functioning Kubeflow Pipelines deployment. If
not, follow the
[Kubeflow Pipelines installation guide](https://www.kubeflow.org/docs/components/pipelines/installation/) and the
[SDK installation instructions](https://www.kubeflow.org/docs/components/pipelines/sdk/install-sdk/).
This demo has been verified to work with:
- KFP version `1.7.1`
- KFP SDK version `1.8.11`
Activate the conda environment that you created while following the steps
above, then create a Jupyter kernel for it:
```bash
ipython kernel install --name "kfp" --user
```
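
Before opening the notebook, it can help to confirm that the SDK in the active
environment is close to the version this demo was verified with. The snippet
below is only a sketch, not part of the demo: `VERIFIED_SDK`, `parse_version`,
and `matches_verified` are hypothetical names introduced here, and the check
assumes that the installed package exposes `kfp.__version__`.

```python
import re

# Hypothetical constant: the SDK version this demo was verified with.
VERIFIED_SDK = "1.8.11"


def parse_version(text):
    """Return the first three numeric components of a version string."""
    return tuple(int(number) for number in re.findall(r"\d+", text)[:3])


def matches_verified(installed, verified=VERIFIED_SDK):
    """Return True when the major and minor versions agree."""
    return parse_version(installed)[:2] == parse_version(verified)[:2]


if __name__ == "__main__":
    try:
        import kfp  # Only importable inside the environment set up above.
    except ImportError:
        print("kfp is not installed in this environment")
    else:
        verdict = "matches" if matches_verified(kfp.__version__) else "differs from"
        print(f"kfp {kfp.__version__} {verdict} the verified {VERIFIED_SDK} series")
```

Comparing only the major and minor components is a design choice for this
sketch: patch releases are treated as compatible, which may or may not hold for
your deployment.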
## 2. Run the KFP SDK notebook
Step through the provided [notebook](recurring.ipynb) to create a recurring run
using the KFP SDK. Make sure to select the `kfp` kernel that you created
earlier.
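
The notebook schedules the recurring run with a cron expression. The KFP SDK
documentation for `create_recurring_run` describes `cron_expression` as six
space-separated fields, with seconds as the first field, so it differs from a
five-field crontab line. The sketch below shows one way to build such an
expression; `every_n_minutes` and `field_count` are hypothetical helpers
introduced for illustration and are not part of the SDK.

```python
def every_n_minutes(n):
    """Build a six-field cron expression that fires every `n` minutes."""
    if not 1 <= n <= 59:
        raise ValueError("n must be between 1 and 59")
    # Field order: second minute hour day-of-month month day-of-week.
    # A leading 0 pins the trigger to the start of each matching minute.
    return f"0 */{n} * * * *"


def field_count(expression):
    """Count the whitespace-separated fields in a cron expression."""
    return len(expression.split())


if __name__ == "__main__":
    expression = every_n_minutes(2)
    print(expression, "has", field_count(expression), "fields")
```

The resulting string could be passed as the `cron_expression` argument when
creating the recurring run.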


@ -0,0 +1,19 @@
name: Download
inputs:
- {name: Url, type: URI}
metadata:
  annotations:
    author: Alexander Perlman <mythicalsunlight@gmail.com>
implementation:
  container:
    image: alpine/curl
    command:
    - sh
    - -exc
    - |
      url="$0"
      path='/tmp/script'
      curl "$url" -o "$path"
      chmod 700 "$path"
      /bin/sh "$path"
    - inputValue: Url


@ -0,0 +1,56 @@
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: pipeline-
  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.11, pipelines.kubeflow.org/pipeline_compilation_time: '2022-01-31T21:51:10.599476',
    pipelines.kubeflow.org/pipeline_spec: '{"inputs": [{"name": "url"}], "name": "Pipeline"}'}
  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.11}
spec:
  entrypoint: pipeline
  templates:
  - name: download
    container:
      args: []
      command:
      - sh
      - -exc
      - |
        url="$0"
        path='/tmp/script'
        curl "$url" -o "$path"
        chmod 700 "$path"
        /bin/sh "$path"
      - '{{inputs.parameters.url}}'
      image: alpine/curl
    inputs:
      parameters:
      - {name: url}
    metadata:
      annotations: {author: Alexander Perlman <mythicalsunlight@gmail.com>, pipelines.kubeflow.org/component_spec: '{"implementation":
          {"container": {"command": ["sh", "-exc", "url=\"$0\"\npath=''/tmp/script''\ncurl
          \"$url\" -o \"$path\"\nchmod 700 \"$path\"\n/bin/sh \"$path\"\n", {"inputValue":
          "Url"}], "image": "alpine/curl"}}, "inputs": [{"name": "Url", "type": "URI"}],
          "metadata": {"annotations": {"author": "Alexander Perlman <mythicalsunlight@gmail.com>"}},
          "name": "Download"}', pipelines.kubeflow.org/component_ref: '{"digest":
          "1bb47e384d056817b16202398d1e5fc8ce02daf1e40f69e3103218402c05437b", "url":
          "https://raw.githubusercontent.com/droctothorpe/examples/master/demos/recurring/component.yaml"}',
        pipelines.kubeflow.org/arguments.parameters: '{"Url": "{{inputs.parameters.url}}"}'}
      labels:
        pipelines.kubeflow.org/kfp_sdk_version: 1.8.11
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/enable_caching: "true"
  - name: pipeline
    inputs:
      parameters:
      - {name: url}
    dag:
      tasks:
      - name: download
        template: download
        arguments:
          parameters:
          - {name: url, value: '{{inputs.parameters.url}}'}
  arguments:
    parameters:
    - {name: url}
  serviceAccountName: pipeline-runner


@ -0,0 +1,302 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Recurring runs with the KFP SDK"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you're running on a local cluster, expose the GUI and API, respectively, with\n",
"the following commands:\n",
"\n",
"```\n",
"kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80\n",
"kubectl port-forward -n kubeflow svc/ml-pipeline-ui 3000:80\n",
"```\n",
"\n",
"The rest of this demo assumes that you're running locally.\n",
"\n",
"Instantiate the KFP SDK client. Set the `host` variable to the URL and port where\n",
"you exposed the KFP API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import kfp\n",
"\n",
"host = 'http://localhost:3000'\n",
"client = kfp.Client(host=host)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a pipeline component from the provided component file. This component\n",
"retrieves and executes a script from a provided URL."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run_script = kfp.components.load_component_from_url(\n",
" 'https://raw.githubusercontent.com/kubeflow/examples/master/demos/recurring/component.yaml'\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a pipeline function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def pipeline(url):\n",
" run_script_task = run_script(url=url)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compile the pipeline function. We will pass the resulting yaml to the pipeline\n",
"execution invocations."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"kfp.compiler.Compiler().compile(\n",
" pipeline_func=pipeline,\n",
" package_path='download.yaml',\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a parameters dictionary with the `url` key."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"parameters = {\n",
" 'url': 'https://raw.githubusercontent.com/kubeflow/examples/master/demos/recurring/success.sh'\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can _optionally_ validate the pipeline with a single run before creating a recurring run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"result = client.create_run_from_pipeline_func(\n",
" pipeline_func=pipeline,\n",
" arguments=parameters,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can retrieve the result of the pipeline run through the Kubeflow GUI, which\n",
"is the recommended approach. That being said, we can also interrogate the result\n",
"programmatically."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
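"# wait_for_run_completion returns the final run detail; run_info only holds\n",
"# the metadata captured when the run was created.\n",
"run_detail = result.wait_for_run_completion()\n",
"print(run_detail.run.status)"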
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we've validated a single run, let's create a recurring run.\n",
"\n",
"We first need to create an experiment since the `create_recurring_run` method\n",
"requires an `experiment_id`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"experiment = client.create_experiment('test')\n",
"\n",
"job = client.create_recurring_run(\n",
" experiment_id=experiment.id,\n",
" job_name='test',\n",
" cron_expression='0 */2 * * * *', # Six fields, seconds first: runs once every two minutes.\n",
" pipeline_package_path='download.yaml', # Pass in compiled output.\n",
" params=parameters,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Kubeflow Pipelines GUI provides an excellent interface for interacting with\n",
"recurring runs, but you can interrogate the job programmatically if you prefer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(job)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the GUI, you can retrieve the logs of an individual run. They should\n",
"culminate with `Success!`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To disable the recurring run:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"client.disable_job(job.id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To list recurring runs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"client.list_recurring_runs()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To get details about an individual recurring run:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"client.get_recurring_run(job.id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To delete a recurring run programmatically:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"result = client.delete_job(job.id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Additional recurring run interactions via the SDK are documented [here](https://kubeflow-pipelines.readthedocs.io/en/stable/)."
]
}
],
"metadata": {
"interpreter": {
"hash": "8d1899d3d453529ab54a548c453eb03872168ef6a9900e12952b62a455030e12"
},
"kernelspec": {
"display_name": "Python 3.7.9 64-bit ('base': conda)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}


@ -0,0 +1 @@
echo "Success!"

56
download.yaml Normal file

@ -0,0 +1,56 @@
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: pipeline-
  annotations: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.11, pipelines.kubeflow.org/pipeline_compilation_time: '2022-02-01T08:52:49.948958',
    pipelines.kubeflow.org/pipeline_spec: '{"inputs": [{"name": "url"}], "name": "Pipeline"}'}
  labels: {pipelines.kubeflow.org/kfp_sdk_version: 1.8.11}
spec:
  entrypoint: pipeline
  templates:
  - name: download
    container:
      args: []
      command:
      - sh
      - -exc
      - |
        url="$0"
        path='/tmp/script'
        curl "$url" -o "$path"
        chmod 700 "$path"
        /bin/sh "$path"
      - '{{inputs.parameters.url}}'
      image: alpine/curl
    inputs:
      parameters:
      - {name: url}
    metadata:
      annotations: {author: Alexander Perlman <mythicalsunlight@gmail.com>, pipelines.kubeflow.org/component_spec: '{"implementation":
          {"container": {"command": ["sh", "-exc", "url=\"$0\"\npath=''/tmp/script''\ncurl
          \"$url\" -o \"$path\"\nchmod 700 \"$path\"\n/bin/sh \"$path\"\n", {"inputValue":
          "Url"}], "image": "alpine/curl"}}, "inputs": [{"name": "Url", "type": "URI"}],
          "metadata": {"annotations": {"author": "Alexander Perlman <mythicalsunlight@gmail.com>"}},
          "name": "Download"}', pipelines.kubeflow.org/component_ref: '{"digest":
          "1bb47e384d056817b16202398d1e5fc8ce02daf1e40f69e3103218402c05437b", "url":
          "https://raw.githubusercontent.com/droctothorpe/examples/master/demos/recurring/component.yaml"}',
        pipelines.kubeflow.org/arguments.parameters: '{"Url": "{{inputs.parameters.url}}"}'}
      labels:
        pipelines.kubeflow.org/kfp_sdk_version: 1.8.11
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/enable_caching: "true"
  - name: pipeline
    inputs:
      parameters:
      - {name: url}
    dag:
      tasks:
      - name: download
        template: download
        arguments:
          parameters:
          - {name: url, value: '{{inputs.parameters.url}}'}
  arguments:
    parameters:
    - {name: url}
  serviceAccountName: pipeline-runner