Apply Docs Restructure to `v1.2-branch` = update `v1.2-branch` to current `master` v2 (#2612)

* Create "Distributions" with kfctl + Kubeflow Operator (#2492)

* Create methods folder == section

* Move /operator under /methods

* Update links on Operator

* Add 'kfctl' folder == section

* mv kfctl specific minikube docs under /kfctl

* Update links on minikube

* mv kustomize from other-guides to /kfctl

* fix links for kustomize change

* delete outdated redirect

* move istio-dex-auth to /kfctl + rename to multi-user

* fix links after name change

* move kfctl install under /kfctl + rename to deployment

* fix links after move

* Add OWNERS for accountability

Update kfctl description

Update content/en/docs/methods/_index.md

* Add redirects for Operator

* Add redirects for kfctl

* Rename "methods" to "distributions"

* update redirects to distributions as folder name

* doc: Add instructions to access cluster with IBM Cloud vpc-gen2. (#2530)

* doc: Add instructions to access cluster with IBM Cloud vpc-gen2.

* added extra steps.

* Improved formatting

* Added details for creating cluster against existing VPC

* Apply suggestions from code review

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Formatting fixes, as per the review.

* Added a note about security.

* Choose between a classic or vpc-gen2 provider.

* added a note

* formatting fixes

* Apply suggestions from code review

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Document split up.

* Cleanup.

* Apply suggestions from code review

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Tommy Li <Tommy.chaoping.li@ibm.com>

* Formatting improvements and cleanup.

* format fixes

* Apply suggestions from code review

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
Co-authored-by: Tommy Li <Tommy.chaoping.li@ibm.com>

* Add RFMVasconcelos to OWNERS/approvers (#2539)

* Deletes old redirects - pages do not exist anymore (#2552)

* Move `AWS` platform under /distributions (#2551)

* move /aws under /distributions

* fix AWS redirects + add catch-all

* update broken link (#2557)

* Update: fix broken links to TensorFlow Serving (#2558)

* Move `Google` platform under /distributions (#2547)

* move /gke folder to under /distributions

* update redirects

* Move `Azure` platform under /distributions (#2548)

* mv /azure to /distributions

* add catch-all azure to redirects

* KFP - Update Python function-based component doc with param naming rules (#2544)

* Describe pipeline param naming

Adds notes on how the KFP SDK updates param names to describe the data instead of the implementation. Updates passing data by value to indicate that users can pass lists and dictionaries.

* Update auto-gen Markdown 

Updates python-function-components.md with changes to python-function-components.ipynb.

* Move `Openshift` platform under /distributions (#2550)

* move /openshift to under /distributions

* add openshift catch-all to redirects

* Move `IBM` platform under /distributions (#2549)

* move /ibm to under /distributions

* Add IBM catch-all to redirects

* [IBM] Update openshift kubeflow installation (#2560)

* Make kfctl first distribution (#2562)

* Move getting started on K8s page to under kfctl distribution (#2569)

* mv overview to under kfctl

* delete empty getting started with k8s section

* Add redirect to catch traffic

* Update GCP distribution OWNERS (#2574)

* Update KFP shortcodes OWNERS (#2575)

* Move MicroK8s to distributions (#2577)

* create microk8s folder in distributions

* move microk8s docs to distributions

* update title

* Add redirect for MicroK8s move - missed on #2577 (#2579)

* Add Charmed Kubeflow Operators to list of available Kubeflow distributions (#2578)

* Uplevel clouds for a level playing field

* Add Owners + Index of Charmed Kubeflow

* Add install page to Charmed Kubeflow distribution

* Link to Charmed Kubeflow docs

* Naming corrections

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Update content/en/docs/distributions/charmed/install-kubeflow.md

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* final fixes

Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>

* Fix broken link (#2581)

* IBM Cloud docs: update pipelines SDK setup for single-user (#2571)

Made the following changes to the instructions for setting
up the pipelines SDK for single-user.

* append '/pipeline' to the host string
* add client.list_experiments to make sure the setup is working,
consistent with the multi-user example in section 2
* add a note about KUBEFLOW_PUBLIC_ENDPOINT_URL since the user
may or may not have exposed the endpoint as a LoadBalancer

Signed-off-by: Chin Huang <chhuang@us.ibm.com>
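
For reference, a minimal sketch (not part of this commit; the endpoint below is a placeholder) of the single-user KFP SDK setup described above:

```python
import kfp

# Single-user setup: append '/pipeline' to your Kubeflow public endpoint.
# <KUBEFLOW_PUBLIC_ENDPOINT_URL> is a placeholder for your deployment's URL,
# which may or may not be exposed as a LoadBalancer.
client = kfp.Client(host='http://<KUBEFLOW_PUBLIC_ENDPOINT_URL>/pipeline')

# Quick check that the SDK can reach the Pipelines API.
print(client.list_experiments())
```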

* update broken links / tweak names (#2583)

* Move MiniKF to distributions (#2576)

* create minikf folder + index

* move minikf docs to minikf folder

* Add redirects for external links

* Change naming according to request

* update description minikf

* Clean up "Frameworks for training" + rename to "Training Operators" (#2584)

* Remove outdated banners from Pytorch and TF

* delete chainer

* move TF and PyTorch up in the ordering

* rename "Frameworks for training" to "Training operators"

* Fix broken link (#2580)

* Remove "outdated" banners from MPI + MXnet operators (#2585)

* docs: Update MPI and MXNet operator pages (#2586)

Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>

* Pin the version of kustomize, v4 is not supported. (#2572)

* Pin the version of kustomize, v4 is not supported.

There are issues installing Kubeflow with version v4. 

Note: 
https://github.com/kubeflow/website/issues/2570
https://github.com/kubeflow/kubeflow/issues/5755

* Add reference to manifest repo version.

* Default to 3.2.0

* Update gke/anthos.md (#2591)

* fix broken link (#2603)

Co-authored-by: Prashant Sharma <prashsh1@in.ibm.com>
Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
Co-authored-by: Tommy Li <Tommy.chaoping.li@ibm.com>
Co-authored-by: Mathew Wicks <thesuperzapper@users.noreply.github.com>
Co-authored-by: JohanWork <39947546+JohanWork@users.noreply.github.com>
Co-authored-by: Joe Liedtke <joeliedtke@google.com>
Co-authored-by: Mofizur Rahman <moficodes@gmail.com>
Co-authored-by: Yuan (Bob) Gong <4957653+Bobgy@users.noreply.github.com>
Co-authored-by: Chin Huang <chhuang@us.ibm.com>
Co-authored-by: brett koonce <koonce@gmail.com>
Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: drPytho <filip@voiapp.io>
Co-authored-by: Ihor Sychevskyi <arhell333@gmail.com>
Commit 4e2602bd53 (parent 4dda8b22e7) authored by Rui Vasconcelos on 2021-04-20 23:00:41 +02:00, committed by GitHub.
144 changed files with 945 additions and 237 deletions

OWNERS
View File

@ -2,6 +2,7 @@ approvers:
- animeshsingh
- Bobgy
- joeliedtke
- RFMVasconcelos
reviewers:
- 8bitmp3
- aronchick
@ -9,9 +10,7 @@ reviewers:
- dansanche
- dsdinter
- Jeffwan
- jinchihe
- jinchihe
- nickchase
- pdmack
- RFMVasconcelos
- terrytangyuan
- terrytangyuan

View File

@ -34,22 +34,22 @@
/docs/pipelines/tutorials/pipelines-tutorial/ /docs/components/pipelines/tutorials/cloud-tutorials/
/docs/gke/pipelines-tutorial/ /docs/components/pipelines/tutorials/cloud-tutorials/
/docs/gke/pipelines/pipelines-tutorial/ /docs/components/pipelines/tutorials/cloud-tutorials/
/docs/gke/authentication-pipelines/ /docs/gke/pipelines/authentication-pipelines/
/docs/gke/authentication-pipelines/ /docs/distributions/gke/pipelines/authentication-pipelines/
/docs/pipelines/metrics/ /docs/components/pipelines/sdk/pipelines-metrics/
/docs/pipelines/metrics/pipelines-metrics/ /docs/components/pipelines/sdk/pipelines-metrics/
/docs/pipelines/metrics/output-viewer/ /docs/components/pipelines/sdk/output-viewer/
/docs/pipelines/pipelines-overview/ /docs/components/pipelines/overview/pipelines-overview/
/docs/pipelines/enable-gpu-and-tpu/ /docs/gke/pipelines/enable-gpu-and-tpu/
/docs/pipelines/sdk/enable-gpu-and-tpu/ /docs/gke/pipelines/enable-gpu-and-tpu/
/docs/pipelines/sdk/gcp/enable-gpu-and-tpu/ /docs/gke/pipelines/enable-gpu-and-tpu/
/docs/pipelines/preemptible/ /docs/gke/pipelines/preemptible/
/docs/pipelines/sdk/gcp/preemptible/ /docs/gke/pipelines/preemptible/
/docs/pipelines/enable-gpu-and-tpu/ /docs/distributions/gke/pipelines/enable-gpu-and-tpu/
/docs/pipelines/sdk/enable-gpu-and-tpu/ /docs/distributions/gke/pipelines/enable-gpu-and-tpu/
/docs/pipelines/sdk/gcp/enable-gpu-and-tpu/ /docs/distributions/gke/pipelines/enable-gpu-and-tpu/
/docs/pipelines/preemptible/ /docs/distributions/gke/pipelines/preemptible/
/docs/pipelines/sdk/gcp/preemptible/ /docs/distributions/gke/pipelines/preemptible/
/docs/pipelines/reusable-components/ /docs/examples/shared-resources/
/docs/pipelines/sdk/reusable-components/ /docs/examples/shared-resources/
# Moved the guide to monitoring GKE deployments.
/docs/other-guides/monitoring/ /docs/gke/monitoring/
/docs/other-guides/monitoring/ /docs/distributions/gke/monitoring/
# Created a new section for pipeline concepts.
/docs/pipelines/pipelines-concepts/ /docs/components/pipelines/concepts/
@ -88,24 +88,20 @@ docs/started/requirements/ /docs/started/getting-started/
# Restructured the getting-started and other-guides sections.
/docs/started/getting-started-k8s/ /docs/started/k8s/
/docs/started/getting-started-minikf/ /docs/started/workstation/getting-started-minikf/
/docs/started/getting-started-minikube/ /docs/started/workstation/minikube-linux/
/docs/started/getting-started-minikube/ /docs/started/distributions/kfctl/minikube/
/docs/other-guides/virtual-dev/getting-started-minikf/ /docs/started/workstation/getting-started-minikf/
/docs/started/getting-started-multipass/ /docs/started/workstation/getting-started-multipass/
/docs/other-guides/virtual-dev/getting-started-multipass/ /docs/started/workstation/getting-started-multipass/
/docs/other-guides/virtual-dev/ /docs/started/workstation/
/docs/started/getting-started-aws/ /docs/started/cloud/getting-started-aws/
/docs/started/getting-started-azure/ /docs/started/cloud/getting-started-azure/
/docs/started/getting-started-gke/ /docs/started/cloud/getting-started-gke/
/docs/started/getting-started-iks/ /docs/started/cloud/getting-started-iks/
/docs/use-cases/kubeflow-on-multinode-cluster/ /docs/other-guides/kubeflow-on-multinode-cluster/
/docs/use-cases/job-scheduling/ /docs/other-guides/job-scheduling/
# Remove Kubeflow installation on existing EKS cluster
/docs/aws/deploy/existing-cluster/ /docs/aws/deploy/install-kubeflow/
/docs/aws/deploy/existing-cluster/ /docs/distributions/aws/deploy/install-kubeflow/
# Move the kustomize guide to the config section
/docs/components/misc/kustomize/ /docs/other-guides/kustomize/
/docs/components/misc/kustomize/ /docs/distributions/kfctl/kustomize/
# Merged the UIs page with the new central dashboard page
/docs/other-guides/accessing-uis/ /docs/components/central-dash/overview/
@ -116,12 +112,38 @@ docs/started/requirements/ /docs/started/getting-started/
# Rename TensorRT Inference Server to Triton Inference Server
/docs/components/serving/trtinferenceserver /docs/components/serving/tritoninferenceserver
# Kubeflow Operator move to under distributions
/docs/operator /docs/distributions/operator
/docs/operator/introduction /docs/distributions/operator/introduction
/docs/operator/install-operator /docs/distributions/operator/install-operator
/docs/operator/install-kubeflow /docs/distributions/operator/install-kubeflow
/docs/operator/uninstall-kubeflow /docs/distributions/operator/uninstall-kubeflow
/docs/operator/uninstall-operator /docs/distributions/operator/uninstall-operator
/docs/operator/troubleshooting /docs/distributions/operator/troubleshooting
# kfctl move to under distributions
/docs/started/workstation/minikube-linux /docs/distributions/kfctl/minikube
/docs/other-guides/kustomize /docs/distributions/kfctl/kustomize
/docs/started/k8s/kfctl-istio-dex /docs/distributions/kfctl/multi-user
/docs/started/k8s/kfctl-k8s-istio /docs/distributions/kfctl/deployment
# Moved Job scheduling under Training
/docs/other-guides/job-scheduling/ /docs/components/training/job-scheduling/
# Moved KFServing
/docs/components/serving/kfserving/ /docs/components/kfserving
# Moved MicroK8s to distributions
/docs/started/workstation/kubeflow-on-microk8s /docs/distributions/microk8s/kubeflow-on-microk8s
# Moved K8s deployment overview to under kfctl
/docs/started/k8s/overview /docs/distributions/kfctl/overview
# Moved MiniKF to distributions
/docs/started/workstation/getting-started-minikf /docs/distributions/getting-started-minikf
/docs/started/workstation/minikf-aws /docs/distributions/minikf-aws
/docs/started/workstation/minikf-gcp /docs/distributions/minikf-gcp
# ===============
# IMPORTANT NOTE:
# Catch-all redirects should be added at the end of this file as redirects happen from top to bottom
@ -129,3 +151,8 @@ docs/started/requirements/ /docs/started/getting-started/
/docs/guides/* /docs/:splat
/docs/pipelines/concepts/* /docs/components/pipelines/overview/concepts/:splat
/docs/pipelines/* /docs/components/pipelines/:splat
/docs/aws/* /docs/distributions/aws/:splat
/docs/azure/* /docs/distributions/azure/:splat
/docs/gke/* /docs/distributions/gke/:splat
/docs/ibm/* /docs/distributions/ibm/:splat
/docs/openshift/* /docs/distributions/openshift/:splat

View File

@ -74,7 +74,7 @@ Port-forwarding typically does not work if any of the following are true:
with the [CLI deployment](/docs/gke/deploy/deploy-cli/). (If you want to
use port forwarding, you must deploy Kubeflow on an existing Kubernetes
cluster using the [`kfctl_k8s_istio`
configuration](/docs/started/k8s/kfctl-k8s-istio/).)
configuration](/docs/methods/kfctl/deployment).)
* You've configured the Istio ingress to only accept
HTTPS traffic on a specific domain or IP address.

View File

@ -755,7 +755,7 @@ kubectl apply -f <your-path/your-experiment-config.yaml>
- (Optional) Katib's experiments don't work with
[Istio sidecar injection](https://istio.io/latest/docs/setup/additional-setup/sidecar-injection/#automatic-sidecar-injection).
If you install Kubeflow using
[Istio config](https://www.kubeflow.org/docs/started/k8s/kfctl-k8s-istio/),
[Istio config](https://www.kubeflow.org/docs/methods/kfctl/deployment),
you have to disable sidecar injection. To do that, specify this annotation:
`sidecar.istio.io/inject: "false"` in your experiment's trial template. For
examples on how to do it for `Job`, `TFJob` (TensorFlow) or

View File

@ -141,7 +141,7 @@ an experiment using the random algorithm example:
1. (Optional) **Note:** Katib's experiments don't work with
[Istio sidecar injection](https://istio.io/latest/docs/setup/additional-setup/sidecar-injection/#automatic-sidecar-injection).
If you installed Kubeflow using
[Istio config](/docs/started/k8s/kfctl-k8s-istio/),
[Istio config](/docs/methods/kfctl/deployment),
you have to disable sidecar injection. To do that, specify this annotation:
`sidecar.istio.io/inject: "false"` in your experiment's trial template.
@ -394,7 +394,7 @@ the Kubeflow's TensorFlow training job operator, TFJob:
1. (Optional) **Note:** Katib's experiments don't work with
[Istio sidecar injection](https://istio.io/latest/docs/setup/additional-setup/sidecar-injection/#automatic-sidecar-injection).
If you installed Kubeflow using
[Istio config](/docs/started/k8s/kfctl-k8s-istio/),
[Istio config](/docs/methods/kfctl/deployment),
you have to disable sidecar injection. To do that, specify this annotation:
`sidecar.istio.io/inject: "false"` in your experiment's trial template.
For the provided `TFJob` example check
@ -438,7 +438,7 @@ using Kubeflow's PyTorch training job operator, PyTorchJob:
1. (Optional) **Note:** Katib's experiments don't work with
[Istio sidecar injection](https://istio.io/latest/docs/setup/additional-setup/sidecar-injection/#automatic-sidecar-injection).
If you installed Kubeflow using
[Istio config](/docs/started/k8s/kfctl-k8s-istio/),
[Istio config](/docs/methods/kfctl/deployment),
you have to disable sidecar injection. To do that, specify this annotation:
`sidecar.istio.io/inject: "false"` in your experiment's trial template.
For the provided `PyTorchJob` example setting the annotation should be similar to

View File

@ -133,7 +133,7 @@ You can use the following interfaces to interact with Katib:
- **kfctl** is the Kubeflow CLI that you can use to install and configure
Kubeflow. Learn about kfctl in the guide to
[configuring Kubeflow](/docs/other-guides/kustomize/).
[configuring Kubeflow](/docs/methods/kfctl/kustomize/).
- The Kubernetes CLI, **kubectl**, is useful for running commands against your
Kubeflow cluster. Learn about kubectl in the [Kubernetes

View File

@ -53,7 +53,7 @@ master should share the same identity management.
## Supported platforms
* Kubeflow multi-tenancy is enabled by default if you deploy Kubeflow on GCP with [IAP](/docs/gke/deploy).
* If you are not on GCP, you can deploy multi-tenancy to [your existing cluster](/docs/started/k8s/kfctl-istio-dex/).
* If you are not on GCP, you can deploy multi-tenancy to [your existing cluster](/docs/methods/kfctl/multi-user).
## Next steps

View File

@ -56,7 +56,7 @@ A _pipeline component_ is a self-contained set of user code, packaged as a
performs one step in the pipeline. For example, a component can be responsible
for data preprocessing, data transformation, model training, and so on.
See the conceptual guides to [pipelines](/docs/components/pipelines/concepts/pipeline/)
See the conceptual guides to [pipelines](/docs/components/pipelines/overview/concepts/pipeline/)
and [components](/docs/components/pipelines/concepts/component/).
## Example of a pipeline

View File

@ -5,7 +5,7 @@ weight = 140
+++
You can use the [KFP-Tekton SDK](https://github.com/kubeflow/kfp-tekton/sdk)
You can use the [KFP-Tekton SDK](https://github.com/kubeflow/kfp-tekton/tree/master/sdk)
to compile, upload and run your Kubeflow Pipeline DSL Python scripts on a
[Kubeflow Pipelines with Tekton backend](https://github.com/kubeflow/kfp-tekton/tree/master/tekton_kfp_guide.md).

View File

@ -287,14 +287,55 @@
" storage service. Kubeflow Pipelines passes parameters to your component by\n",
" file, by passing their paths as a command-line argument.\n",
"\n",
"<a name=\"parameter-names\"></a>\n",
"#### Input and output parameter names\n",
"\n",
"When you use the Kubeflow Pipelines SDK to convert your Python function to a\n",
"pipeline component, the Kubeflow Pipelines SDK uses the function's interface\n",
"to define the interface of your component in the following ways.\n",
"\n",
"* Some arguments define input parameters.\n",
"* Some arguments define output parameters.\n",
"* The function's return value is used as an output parameter. If the return\n",
" value is a [`collections.namedtuple`][named-tuple], the named tuple is used\n",
" to return several small values. \n",
"\n",
"Since you can pass parameters between components as a value or as a path, the\n",
"Kubeflow Pipelines SDK removes common parameter suffixes that leak the\n",
"component's expected implementation. For example, a Python function-based\n",
"component that ingests data and outputs CSV data may have an output argument\n",
"that is defined as `csv_path: comp.OutputPath(str)`. In this case, the output\n",
"is the CSV data, not the path. So, the Kubeflow Pipelines SDK simplifies the\n",
"output name to `csv`.\n",
"\n",
"The Kubeflow Pipelines SDK uses the following rules to define the input and\n",
"output parameter names in your component's interface:\n",
"\n",
"* If the argument name ends with `_path` and the argument is annotated as an\n",
" [`kfp.components.InputPath`][input-path] or\n",
" [`kfp.components.OutputPath`][output-path], the parameter name is the\n",
" argument name with the trailing `_path` removed.\n",
"* If the argument name ends with `_file`, the parameter name is the argument\n",
" name with the trailing `_file` removed.\n",
"* If you return a single small value from your component using the `return`\n",
" statement, the output parameter is named `output`.\n",
"* If you return several small values from your component by returning a \n",
" [`collections.namedtuple`][named-tuple], the Kubeflow Pipelines SDK uses\n",
" the tuple's field names as the output parameter names. \n",
"\n",
"Otherwise, the Kubeflow Pipelines SDK uses the argument name as the parameter\n",
"name.\n",
"\n",
"<a name=\"pass-by-value\"></a>\n",
"#### Passing parameters by value\n",
"\n",
"Python function-based components make it easier to pass parameters between\n",
"components by value (such as numbers, booleans, and short strings), by letting\n",
"you define your components interface by annotating your Python function. The\n",
"supported types are `int`, `float`, `bool`, and `string`. If you do not\n",
"annotate your function, these input parameters are passed as strings.\n",
"supported types are `int`, `float`, `bool`, and `str`. You can also pass \n",
"`list` or `dict` instances by value, if they contain small values, such as\n",
"`int`, `float`, `bool`, or `str` values. If you do not annotate your function,\n",
"these input parameters are passed as strings.\n",
"\n",
"If your component returns multiple outputs by value, annotate your function\n",
"with the [`typing.NamedTuple`][named-tuple-hint] type hint and use the\n",
@ -320,7 +361,9 @@
"[named-tuple-hint]: https://docs.python.org/3/library/typing.html#typing.NamedTuple\n",
"[named-tuple]: https://docs.python.org/3/library/collections.html#collections.namedtuple\n",
"[kfp-visualize]: https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/\n",
"[kfp-metrics]: https://www.kubeflow.org/docs/components/pipelines/sdk/pipelines-metrics/"
"[kfp-metrics]: https://www.kubeflow.org/docs/components/pipelines/sdk/pipelines-metrics/\n",
"[input-path]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html#kfp.components.InputPath\n",
"[output-path]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html#kfp.components.OutputPath"
]
},
{

View File

@ -5,7 +5,7 @@ weight = 50
+++
<!--
AUTOGENERATED FROM content/en/docs/components/pipelines/sdk/python-function-components.ipynb
AUTOGENERATED FROM content/en/docs/pipelines/sdk/python-function-components.ipynb
PLEASE UPDATE THE JUPYTER NOTEBOOK AND REGENERATE THIS FILE USING scripts/nb_to_md.py.-->
<style>
@ -26,8 +26,8 @@ background-position: left center;
}
</style>
<div class="notebook-links">
<a class="colab-link" href="https://colab.research.google.com/github/kubeflow/website/blob/master/content/en/docs/components/pipelines/sdk/python-function-components.ipynb">Run in Google Colab</a>
<a class="github-link" href="https://github.com/kubeflow/website/blob/master/content/en/docs/components/pipelines/sdk/python-function-components.ipynb">View source on GitHub</a>
<a class="colab-link" href="https://colab.research.google.com/github/kubeflow/website/blob/master/content/en/docs/pipelines/sdk/python-function-components.ipynb">Run in Google Colab</a>
<a class="github-link" href="https://github.com/kubeflow/website/blob/master/content/en/docs/pipelines/sdk/python-function-components.ipynb">View source on GitHub</a>
</div>
@ -257,14 +257,55 @@ The following sections describe how to pass parameters by value and by file.
storage service. Kubeflow Pipelines passes parameters to your component by
file, by passing their paths as a command-line argument.
<a name="parameter-names"></a>
#### Input and output parameter names
When you use the Kubeflow Pipelines SDK to convert your Python function to a
pipeline component, the Kubeflow Pipelines SDK uses the function's interface
to define the interface of your component in the following ways.
* Some arguments define input parameters.
* Some arguments define output parameters.
* The function's return value is used as an output parameter. If the return
value is a [`collections.namedtuple`][named-tuple], the named tuple is used
to return several small values.
Since you can pass parameters between components as a value or as a path, the
Kubeflow Pipelines SDK removes common parameter suffixes that leak the
component's expected implementation. For example, a Python function-based
component that ingests data and outputs CSV data may have an output argument
that is defined as `csv_path: comp.OutputPath(str)`. In this case, the output
is the CSV data, not the path. So, the Kubeflow Pipelines SDK simplifies the
output name to `csv`.
The Kubeflow Pipelines SDK uses the following rules to define the input and
output parameter names in your component's interface:
* If the argument name ends with `_path` and the argument is annotated as an
[`kfp.components.InputPath`][input-path] or
[`kfp.components.OutputPath`][output-path], the parameter name is the
argument name with the trailing `_path` removed.
* If the argument name ends with `_file`, the parameter name is the argument
name with the trailing `_file` removed.
* If you return a single small value from your component using the `return`
statement, the output parameter is named `output`.
* If you return several small values from your component by returning a
[`collections.namedtuple`][named-tuple], the Kubeflow Pipelines SDK uses
the tuple's field names as the output parameter names.
Otherwise, the Kubeflow Pipelines SDK uses the argument name as the parameter
name.
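
As a minimal sketch of these rules (assuming the `kfp.components` API from the KFP SDK v1; the function and argument names below are only illustrative):

```python
import kfp.components as comp

def split_table(table_csv_path: comp.InputPath(str),
                train_csv_path: comp.OutputPath(str),
                seed: int = 42) -> float:
    # 'table_csv_path' becomes an input named 'table_csv' (trailing '_path' removed).
    # 'train_csv_path' becomes an output named 'train_csv'.
    # 'seed' keeps its name, and the float return value becomes an output named 'output'.
    with open(table_csv_path) as src, open(train_csv_path, 'w') as dst:
        dst.writelines(src.readlines()[:100])
    return 0.8

split_table_op = comp.create_component_from_func(split_table)
```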
<a name="pass-by-value"></a>
#### Passing parameters by value
Python function-based components make it easier to pass parameters between
components by value (such as numbers, booleans, and short strings), by letting
you define your component's interface by annotating your Python function. The
supported types are `int`, `float`, `bool`, and `string`. If you do not
annotate your function, these input parameters are passed as strings.
supported types are `int`, `float`, `bool`, and `str`. You can also pass
`list` or `dict` instances by value, if they contain small values, such as
`int`, `float`, `bool`, or `str` values. If you do not annotate your function,
these input parameters are passed as strings.
If your component returns multiple outputs by value, annotate your function
with the [`typing.NamedTuple`][named-tuple-hint] type hint and use the
@ -291,6 +332,8 @@ including component metadata and metrics.
[named-tuple]: https://docs.python.org/3/library/collections.html#collections.namedtuple
[kfp-visualize]: https://www.kubeflow.org/docs/components/pipelines/sdk/output-viewer/
[kfp-metrics]: https://www.kubeflow.org/docs/components/pipelines/sdk/pipelines-metrics/
[input-path]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html#kfp.components.InputPath
[output-path]: https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.components.html#kfp.components.OutputPath
```python
@ -540,6 +583,6 @@ client.create_run_from_pipeline_func(calc_pipeline, arguments=arguments)
<div class="notebook-links">
<a class="colab-link" href="https://colab.research.google.com/github/kubeflow/website/blob/master/content/en/docs/components/pipelines/sdk/python-function-components.ipynb">Run in Google Colab</a>
<a class="github-link" href="https://github.com/kubeflow/website/blob/master/content/en/docs/components/pipelines/sdk/python-function-components.ipynb">View source on GitHub</a>
<a class="colab-link" href="https://colab.research.google.com/github/kubeflow/website/blob/master/content/en/docs/pipelines/sdk/python-function-components.ipynb">Run in Google Colab</a>
<a class="github-link" href="https://github.com/kubeflow/website/blob/master/content/en/docs/pipelines/sdk/python-function-components.ipynb">View source on GitHub</a>
</div>

View File

@ -1,5 +1,5 @@
+++
title = "Frameworks for Training"
description = "Training of ML models in Kubeflow"
title = "Training Operators"
description = "Training of ML models in Kubeflow through operators"
weight = 70
+++

View File

@ -1,19 +0,0 @@
+++
title = "Chainer Training"
description = "See Kubeflow [v0.6 docs](https://v0-6.kubeflow.org/docs/components/training/chainer/) for instructions on using Chainer for training"
weight = 4
toc = true
+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}
{{% alpha-status
feedbacklink="https://github.com/kubeflow/chainer-operator/issues" %}}
[Chainer](https://github.com/kubeflow/chainer-operator) is not supported in
Kubeflow versions greater than v0.6. See the [Kubeflow v0.6
documentation](https://v0-6.kubeflow.org/docs/components/training/chainer/)
for earlier support for Chainer training.

View File

@ -4,10 +4,6 @@ description = "Instructions for using MPI for training"
weight = 25
+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}
{{% alpha-status
feedbacklink="https://github.com/kubeflow/mpi-operator/issues" %}}
@ -26,7 +22,7 @@ cd mpi-operator
kubectl create -f deploy/v1alpha2/mpi-operator.yaml
```
Alternatively, follow the [getting started guide](/docs/started/getting-started/) to deploy Kubeflow.
Alternatively, follow the [getting started guide](https://www.kubeflow.org/docs/started/getting-started/) to deploy Kubeflow.
An alpha version of MPI support was introduced with Kubeflow 0.2.0. You must be using a version of Kubeflow newer than 0.2.0.
@ -48,9 +44,9 @@ mpijobs.kubeflow.org 4d
If it is not included you can add it as follows using [kustomize](https://github.com/kubernetes-sigs/kustomize):
```bash
git clone https://github.com/kubeflow/manifests
cd manifests/mpi-job/mpi-operator
kustomize build base | kubectl apply -f -
git clone https://github.com/kubeflow/mpi-operator
cd mpi-operator/manifests
kustomize build overlays/kubeflow | kubectl apply -f -
```
Note that since Kubernetes v1.14, `kustomize` became a subcommand in `kubectl` so you can also run the following command instead:
@ -66,6 +62,7 @@ You can create an MPI job by defining an `MPIJob` config file. See [TensorFlow b
```
cat examples/v1alpha2/tensorflow-benchmarks.yaml
```
Deploy the `MPIJob` resource to start training:
```
@ -166,7 +163,6 @@ status:
startTime: "2019-07-09T22:15:51Z"
```
Training should run for 100 steps and takes a few minutes on a GPU cluster. You can inspect the logs to see the training progress. When the job starts, access the logs from the `launcher` pod:
```
@ -192,20 +188,20 @@ Variables: horovod
...
40 images/sec: 154.4 +/- 0.7 (jitter = 4.0) 8.280
40 images/sec: 154.4 +/- 0.7 (jitter = 4.1) 8.482
50 images/sec: 154.8 +/- 0.6 (jitter = 4.0) 8.397
50 images/sec: 154.8 +/- 0.6 (jitter = 4.2) 8.450
60 images/sec: 154.5 +/- 0.5 (jitter = 4.1) 8.321
60 images/sec: 154.5 +/- 0.5 (jitter = 4.4) 8.349
70 images/sec: 154.5 +/- 0.5 (jitter = 4.0) 8.433
70 images/sec: 154.5 +/- 0.5 (jitter = 4.4) 8.430
80 images/sec: 154.8 +/- 0.4 (jitter = 3.6) 8.199
80 images/sec: 154.8 +/- 0.4 (jitter = 3.8) 8.404
90 images/sec: 154.6 +/- 0.4 (jitter = 3.7) 8.418
90 images/sec: 154.6 +/- 0.4 (jitter = 3.6) 8.459
100 images/sec: 154.2 +/- 0.4 (jitter = 4.0) 8.372
100 images/sec: 154.2 +/- 0.4 (jitter = 4.0) 8.542
----------------------------------------------------------------
total images/sec: 308.27
```
@ -214,5 +210,5 @@ total images/sec: 308.27
Docker images are built and pushed automatically to [mpioperator on Dockerhub](https://hub.docker.com/u/mpioperator). You can use the following Dockerfiles to build the images yourself:
* [mpi-operator](https://github.com/kubeflow/mpi-operator/blob/master/Dockerfile)
* [kubectl-delivery](https://github.com/kubeflow/mpi-operator/blob/master/cmd/kubectl-delivery/Dockerfile)
- [mpi-operator](https://github.com/kubeflow/mpi-operator/blob/master/Dockerfile)
- [kubectl-delivery](https://github.com/kubeflow/mpi-operator/blob/master/cmd/kubectl-delivery/Dockerfile)

View File

@ -4,31 +4,34 @@ description = "Instructions for using MXNet"
weight = 25
+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}
{{% alpha-status
feedbacklink="https://github.com/kubeflow/mxnet-operator/issues" %}}
This guide walks you through using MXNet with Kubeflow.
This guide walks you through using [Apache MXNet (incubating)](https://github.com/apache/incubator-mxnet) with Kubeflow.
## Installing MXNet Operator
MXNet Operator provides a Kubernetes custom resource `MXJob` that makes it easy to run distributed or non-distributed
Apache MXNet jobs (training and tuning) and other extended frameworks like [BytePS](https://github.com/bytedance/byteps)
jobs on Kubernetes. Using a Custom Resource Definition (CRD) gives users the ability to create
and manage Apache MXNet jobs just like built-in K8S resources.
If you haven't already done so please follow the [Getting Started Guide](https://www.kubeflow.org/docs/started/getting-started/) to deploy Kubeflow.
## Installing the MXJob CRD and operator on your k8s cluster
A version of MXNet support was introduced with Kubeflow 0.2.0. You must be using a version of Kubeflow newer than 0.2.0.
### Deploy MXJob CRD and Apache MXNet Operator
## Verify that MXNet support is included in your Kubeflow deployment
```
kustomize build manifests/overlays/v1 | kubectl apply -f -
```
Check that the MXNet custom resource is installed
### Verify that MXJob CRD and Apache MXNet Operator are installed
Check that the Apache MXNet custom resource is installed via:
```
kubectl get crd
```
The output should include `mxjobs.kubeflow.org`
The output should include `mxjobs.kubeflow.org` like the following:
```
NAME AGE
@ -37,72 +40,119 @@ mxjobs.kubeflow.org 4d
...
```
If it is not included you can add it as follows
Check that the Apache MXNet operator is running via:
```
git clone https://github.com/kubeflow/manifests
cd manifests/mxnet-job/mxnet-operator
kubectl kustomize base | kubectl apply -f -
kubectl get pods
```
Alternatively, you can deploy the operator with default settings without using kustomize by running the following from the repo:
The output should include `mxnet-operator-xxx` like the following:
```
git clone https://github.com/kubeflow/mxnet-operator.git
cd mxnet-operator
kubectl create -f manifests/crd-v1beta1.yaml
kubectl create -f manifests/rbac.yaml
kubectl create -f manifests/deployment.yaml
NAME                             READY   STATUS    RESTARTS   AGE
mxnet-operator-d466b46bc-xbqvs   1/1     Running   0          4m37s
```
## Creating a MXNet training job
You create a training job by defining a MXJob with MXTrain mode and then creating it with
### Creating an Apache MXNet training job
You create a training job by defining an `MXJob` with `MXTrain` mode and then creating it with:
```
kubectl create -f examples/v1beta1/train/mx_job_dist_gpu.yaml
kubectl create -f examples/train/mx_job_dist_gpu_v1.yaml
```
Each `replicaSpec` defines a set of Apache MXNet processes.
The `mxReplicaType` defines the semantics for the set of processes.
The semantics are as follows:
## Creating a TVM tuning job (AutoTVM)
**scheduler**
* A job must have 1 and only 1 scheduler
* The pod must contain a container named mxnet
* The overall status of the `MXJob` is determined by the exit code of the
mxnet container
* 0 = success
* 1 || 2 || 126 || 127 || 128 || 139 = permanent errors:
* 1: general errors
* 2: misuse of shell builtins
* 126: command invoked cannot execute
* 127: command not found
* 128: invalid argument to exit
* 139: container terminated by SIGSEGV(Invalid memory reference)
* 130 || 137 || 143 = retryable error for unexpected system signals:
* 130: container terminated by Control-C
* 137: container received a SIGKILL
* 143: container received a SIGTERM
* 138 = reserved in tf-operator for user specified retryable errors
* others = undefined and no guarantee
**worker**
* A job can have 0 to N workers
* The pod must contain a container named mxnet
* Workers are automatically restarted if they exit
**server**
* A job can have 0 to N servers
* parameter servers are automatically restarted if they exit
[TVM](https://docs.tvm.ai/tutorials/) is a end to end deep learning compiler stack, you can easily run AutoTVM with mxnet-operator.
For each replica you define a **template** which is a K8S
[PodTemplateSpec](https://kubernetes.io/docs/api-reference/v1.8/#podtemplatespec-v1-core).
The template allows you to specify the containers, volumes, etc... that
should be created for each replica.
### Creating a TVM tuning job (AutoTVM)
[TVM](https://docs.tvm.ai/tutorials/) is an end-to-end deep learning compiler stack; you can easily run AutoTVM with mxnet-operator.
You can create an auto-tuning job by defining an `MXTune` type of job and then creating it with:
```
kubectl create -f examples/v1beta1/tune/mx_job_tune_gpu.yaml
kubectl create -f examples/tune/mx_job_tune_gpu_v1.yaml
```
Before you use the auto-tuning example, there is some preparatory work that needs to be finished in advance.
To let TVM tune your network, you should create a Docker image which has the TVM module.
Then, you need an auto-tuning script to specify which network will be tuned and to set the auto-tuning parameters.
For more details, please see the [tutorials](https://docs.tvm.ai/tutorials/autotvm/tune_relay_mobile_gpu.html#sphx-glr-tutorials-autotvm-tune-relay-mobile-gpu-py).
Finally, you need a startup script to start the auto-tuning program. In fact, mxnet-operator sets all the parameters as environment variables, and the startup script needs to read these variables and then pass them to the auto-tuning script.
We provide an example under `examples/tune/`; the tuning result will be saved in a log file such as resnet-18.log. You can refer to it for details.
Before you use the auto-tuning example, there is some preparatory work need to be finished in advance. To let TVM tune your network, you should create a docker image which has TVM module. Then, you need a auto-tuning script to specify which network will be tuned and set the auto-tuning parameters, For more details, please see https://docs.tvm.ai/tutorials/autotvm/tune_relay_mobile_gpu.html#sphx-glr-tutorials-autotvm-tune-relay-mobile-gpu-py. Finally, you need a startup script to start the auto-tuning program. In fact, mxnet-operator will set all the parameters as environment variables and the startup script need to reed these variable and then transmit them to auto-tuning script. We provide an example under examples/v1beta1/tune/, tuning result will be saved in a log file like resnet-18.log in the example we gave. You can refer it for details.
### Using GPUs
MXNet Operator supports training with GPUs.
## Monitoring a MXNet Job
Please verify your image is available for distributed training with GPUs.
For example, with the following configuration, MXNet Operator will schedule the pod onto nodes that satisfy the GPU limit.
```
command: ["python"]
args: ["/incubator-mxnet/example/image-classification/train_mnist.py","--num-epochs","1","--num-layers","2","--kv-store","dist_device_sync","--gpus","0"]
resources:
limits:
nvidia.com/gpu: 1
```
### Monitoring your Apache MXNet job
To get the status of your job
```bash
kubectl get -o yaml mxjobs ${JOB}
```
kubectl get -o yaml mxjobs $JOB
```
Here is sample output for an example job
```yaml
apiVersion: kubeflow.org/v1beta1
apiVersion: kubeflow.org/v1
kind: MXJob
metadata:
creationTimestamp: 2019-03-19T09:24:27Z
creationTimestamp: 2021-03-24T15:37:27Z
generation: 1
name: mxnet-job
namespace: default
resourceVersion: "3681685"
selfLink: /apis/kubeflow.org/v1beta1/namespaces/default/mxjobs/mxnet-job
uid: cb11013b-4a28-11e9-b7f4-704d7bb59f71
resourceVersion: "5123435"
selfLink: /apis/kubeflow.org/v1/namespaces/default/mxjobs/mxnet-job
uid: xx11013b-4a28-11e9-s5a1-704d7bb912f91
spec:
cleanPodPolicy: All
jobMode: MXTrain
@ -164,22 +214,22 @@ spec:
limits:
nvidia.com/gpu: "1"
status:
completionTime: 2019-03-19T09:25:11Z
completionTime: 2021-03-24T09:25:11Z
conditions:
- lastTransitionTime: 2019-03-19T09:24:27Z
lastUpdateTime: 2019-03-19T09:24:27Z
- lastTransitionTime: 2021-03-24T15:37:27Z
lastUpdateTime: 2021-03-24T15:37:27Z
message: MXJob mxnet-job is created.
reason: MXJobCreated
status: "True"
type: Created
- lastTransitionTime: 2019-03-19T09:24:27Z
lastUpdateTime: 2019-03-19T09:24:29Z
- lastTransitionTime: 2021-03-24T15:37:27Z
lastUpdateTime: 2021-03-24T15:37:29Z
message: MXJob mxnet-job is running.
reason: MXJobRunning
status: "False"
type: Running
- lastTransitionTime: 2019-03-19T09:24:27Z
lastUpdateTime: 2019-03-19T09:25:11Z
- lastTransitionTime: 2021-03-24T15:37:27Z
lastUpdateTime: 2021-03-24T09:25:11Z
message: MXJob mxnet-job is successfully completed.
reason: MXJobSucceeded
status: "True"
@ -188,5 +238,51 @@ status:
Scheduler: {}
Server: {}
Worker: {}
startTime: 2019-03-19T09:24:29Z
startTime: 2021-03-24T15:37:29Z
```
The first thing to note is the **RuntimeId**. This is a random unique
string which is used to give names to all the K8s resources
(e.g. Job controllers & services) that are created by the `MXJob`.
As with other K8s resources, the status provides information about the state
of the resource.
**phase** - Indicates the phase of a job and will be one of
- Creating
- Running
- CleanUp
- Failed
- Done
**state** - Provides the overall status of the job and will be one of
- Running
- Succeeded
- Failed
For each replica type in the job, there will be a `ReplicaStatus` that
provides the number of replicas of that type in each state.
For each replica type, the job creates a set of K8s
[Job Controllers](https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/)
named
```
${REPLICA-TYPE}-${RUNTIME_ID}-${INDEX}
```
For example, if you have 2 servers and the runtime id is "76n0", then `MXJob`
will create the following two jobs:
```
server-76no-0
server-76no-1
```
## Contributing
Please refer to [this document](./CONTRIBUTING.md) for contributing guidelines.
## Community
Please check out [Kubeflow community page](https://www.kubeflow.org/docs/about/community/) for more information on how to get involved in our community.

View File

@ -1,13 +1,9 @@
+++
title = "PyTorch Training"
description = "Instructions for using PyTorch"
weight = 35
weight = 15
+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}
{{% stable-status %}}

View File

@ -2,13 +2,9 @@
title = "TensorFlow Training (TFJob)"
linkTitle = "TensorFlow Training (TFJob)"
description = "Using TFJob to train a model with TensorFlow"
weight = 60
weight = 10
+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}
{{% stable-status %}}

View File

@ -0,0 +1,6 @@
approvers:
- Bobgy
- RFMVasconcelos
reviewers:
- 8bitmp3

View File

@ -0,0 +1,5 @@
+++
title = "Distributions"
description = "A list of available Kubeflow distributions"
weight = 40
+++

View File

@ -1,5 +1,5 @@
+++
title = "Kubeflow on AWS"
description = "Running Kubeflow on Kubernetes Engine and Amazon Web Services"
weight = 50
weight = 20
+++

View File

@ -1,5 +1,5 @@
+++
title = "Kubeflow on Azure"
description = "Running Kubeflow on Kubernetes Engine and Microsoft Azure"
weight = 50
weight = 20
+++

View File

@ -184,4 +184,4 @@ Run the following commands to set up and deploy Kubeflow.
## Additional information
You can find general information about Kubeflow configuration in the guide to [configuring Kubeflow with kfctl and kustomize](/docs/other-guides/kustomize/).
You can find general information about Kubeflow configuration in the guide to [configuring Kubeflow with kfctl and kustomize](/docs/methods/kfctl/kustomize/).

View File

(11 binary image diffs: image files renamed or moved with content unchanged; sizes 239 KiB, 206 KiB, 82 KiB, 69 KiB, 33 KiB, 116 KiB, 175 KiB, 66 KiB, 191 KiB, 266 KiB, and 230 KiB.)

View File

@ -0,0 +1,5 @@
approvers:
- RFMVasconcelos
- knkski
reviewers:
- DomFleischmann

View File

@ -0,0 +1,5 @@
+++
title = "Kubeflow Charmed Operators"
description = "Charmed Operators for Kubeflow deployment and day-2 operations"
weight = 50
+++

View File

@ -0,0 +1,110 @@
+++
title = "Installing Kubeflow with Charmed Operators"
description = "Instructions for Kubeflow deployment with Kubeflow Charmed Operators"
weight = 10
+++
This guide outlines the steps you need to install and deploy Kubeflow with [Charmed Operators](https://charmed-kubeflow.io/docs) and [Juju](https://juju.is/docs/kubernetes) on any conformant Kubernetes, including [Azure Kubernetes Service (AKS)](https://docs.microsoft.com/en-us/azure/aks/), [Amazon Elastic Kubernetes Service (EKS)](https://docs.aws.amazon.com/eks/index.html), [Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine/docs/), [OpenShift](https://docs.openshift.com), and any [kubeadm](https://kubernetes.io/docs/reference/setup-tools/kubeadm/)-deployed cluster (provided that you have access to it via `kubectl`).
#### 1. Install the Juju client
On Linux, install `juju` via [snap](https://snapcraft.io/docs/installing-snapd) with the following command:
```bash
snap install juju --classic
```
If you use macOS, you can use [Homebrew](https://brew.sh) and type `brew install juju` in the command line. For Windows, download the Windows [installer for Juju](https://launchpad.net/juju/2.8/2.8.5/+download/juju-setup-2.8.5-signed.exe).
#### 2. Connect Juju to your Kubernetes cluster
To operate workloads in your Kubernetes cluster with Juju, you have to add the cluster to the list of *clouds* in Juju via the `add-k8s` command.
If your Kubernetes config file is in the default location (such as `~/.kube/config` on Linux) and you only have one cluster, you can simply run:
```bash
juju add-k8s myk8s
```
If your kubectl config file contains multiple clusters, you can specify the appropriate one by name:
```bash
juju add-k8s myk8s --cluster-name=foo
```
Finally, to use a different config file, you can set the `KUBECONFIG` environment variable to point to the relevant file. For example:
```bash
KUBECONFIG=path/to/file juju add-k8s myk8s
```
For more details, go to the [official Juju documentation](https://juju.is/docs/clouds).
#### 3. Create a controller
To operate workloads on your Kubernetes cluster, Juju uses controllers. You can create a controller with the `bootstrap` command:
```bash
juju bootstrap myk8s my-controller
```
This command will create a couple of pods under the `my-controller` namespace. You can see your controllers with the `juju controllers` command.
You can read more about controllers in the [Juju documentation](https://juju.is/docs/creating-a-controller).
#### 4. Create a model
A model in Juju is a blank canvas where your operators will be deployed, and it holds a 1:1 relationship with a Kubernetes namespace.
You can create a model and give it a name, e.g. `kubeflow`, with the `add-model` command, and you will also be creating a Kubernetes namespace of the same name:
```bash
juju add-model kubeflow
```
You can list your models with the `juju models` command.
#### 5. Deploy Kubeflow
[note type="caution" status="MIN RESOURCES"]
To deploy `kubeflow`, you'll need at least 50 GB of disk, 14 GB of RAM, and 2 CPUs available on your machine/VM.
If you have fewer resources, deploy `kubeflow-lite` or `kubeflow-edge`.
[/note]
Once you have a model, you can simply `juju deploy` any of the provided [Kubeflow bundles](https://charmed-kubeflow.io/docs/operators-and-bundles) into your cluster. For the _Kubeflow lite_ bundle, run:
```bash
juju deploy kubeflow-lite
```
and your Kubeflow installation should begin!
You can observe your Kubeflow deployment being spun up with the command:
```bash
watch -c juju status --color
```
#### 6. Add an RBAC role for Istio
At the time of writing this guide, to set up Kubeflow with [Istio](https://istio.io) correctly, you need to provide the `istio-ingressgateway` operator access to Kubernetes resources. Use the following command to create the appropriate role:
```bash
kubectl patch role -n kubeflow istio-ingressgateway-operator -p '{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"Role","metadata":{"name":"istio-ingressgateway-operator"},"rules":[{"apiGroups":["*"],"resources":["*"],"verbs":["*"]}]}'
```
#### 7. Set URL in authentication methods
Finally, you need to enable your Kubeflow dashboard access. Provide the dashboard's public URL to dex-auth and oidc-gatekeeper as follows:
```bash
juju config dex-auth public-url=http://<URL>
juju config oidc-gatekeeper public-url=http://<URL>
```
where in place of `<URL>` you should use the hostname that the Kubeflow dashboard responds to.
#### More documentation
For more documentation, visit the [Charmed Kubeflow website](https://charmed-kubeflow.io/docs).
#### Having issues?
If you have any issues or questions, feel free to create a GitHub issue [here](https://github.com/canonical/bundle-kubeflow/issues).

View File

@ -1,7 +1,7 @@
approvers:
- Bobgy
- joeliedtke
- rmgogogo
- zijianjoy
reviewers:
- 8bitmp3
- joeliedtke

View File

@ -1,5 +1,5 @@
+++
title = "Kubeflow on GCP"
description = "Running Kubeflow on Kubernetes Engine and Google Cloud Platform"
weight = 50
weight = 20
+++

View File

@ -4,10 +4,6 @@ description = "Running Kubeflow across on-premises and cloud environments with A
weight = 12
+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}
[Anthos](https://cloud.google.com/anthos) is a hybrid and multi-cloud
application platform developed and supported by Google. Anthos is built on
@ -16,7 +12,7 @@ open source technologies, including Kubernetes, Istio, and Knative.
Using Anthos, you can create a consistent setup across your on-premises and
cloud environments, helping you to automate policy and security at scale.
Kubeflow on GKE On Prem is a work in progress. To track progress you can subscribe
We are collecting interest for Kubeflow on GKE On Prem. You can subscribe
to the GitHub issue [kubeflow/gcp-blueprints#138](https://github.com/kubeflow/gcp-blueprints/issues/138).
## Next steps

View File

@ -105,7 +105,7 @@ You can use [kustomize](https://kustomize.io/) to customize Kubeflow.
Make sure that you have the minimum required version of kustomize:
<b>{{% kustomize-min-version %}}</b> or later. For more information about
kustomize in Kubeflow, see
[how Kubeflow uses kustomize](/docs/other-guides/kustomize/).
[how Kubeflow uses kustomize](/docs/methods/kfctl/kustomize/).
To customize the Kubernetes resources running within the cluster, you can modify
the kustomize manifests in `${KF_DIR}/kustomize`.

View File

@ -78,14 +78,14 @@ purpose. No tools will assume they actually exists in your terminal environment.
1. Install [Kustomize](https://kubectl.docs.kubernetes.io/installation/kustomize/).
**Note:** Prior to Kubeflow v1.2, Kubeflow was compatible only with Kustomize `v3.2.1`. Starting from Kubeflow v1.2, you can now use the latest Kustomize versions to install Kubeflow.
**Note:** Prior to Kubeflow v1.2, Kubeflow was compatible only with Kustomize `v3.2.1`. Starting from Kubeflow v1.2, you can now use any `v3` Kustomize version to install Kubeflow. Kustomize `v4` is not supported out of the box yet. [Official Version](https://github.com/kubeflow/manifests/tree/master#prerequisites)
To deploy the latest version of Kustomize on a Linux or Mac machine, run the following commands:
```bash
# Detect your OS and download the corresponding latest Kustomize binary
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" > install_kustomize.sh
bash ./install_kustomize.sh 3.2.0
# Add the kustomize package to your $PATH env variable
sudo mv ./kustomize /usr/local/bin/kustomize
```

View File

@ -124,7 +124,7 @@ It's time to get started!
[tensorflow]: https://www.tensorflow.org/
[tf-train]: https://www.tensorflow.org/api_guides/python/train
[tf-serving]: https://www.tensorflow.org/serving/
[tf-serving]: https://www.tensorflow.org/tfx/guide/serving
[kubernetes]: https://kubernetes.io/
[kubernetes-engine]: https://cloud.google.com/kubernetes-engine/

View File

@ -1,5 +1,5 @@
+++
title = "Kubeflow on IBM Cloud"
description = "Running Kubeflow on IBM Cloud Kubernetes Service (IKS)"
weight = 50
weight = 20
+++

View File

@ -0,0 +1,289 @@
+++
title = "Create or access an IBM Cloud Kubernetes cluster on a VPC"
description = "Instructions for creating or connecting to a Kubernetes cluster on IBM Cloud vpc-gen2"
weight = 4
+++
## Create and set up a new cluster
Follow these steps to create and set up a new IBM Cloud Kubernetes Service (IKS) cluster on the `vpc-gen2` provider.
A `vpc-gen2` cluster does not expose each node to the public internet directly, and thus has a more secure
but also more complex network setup. It is the recommended setup for secure production use cases of Kubeflow.
### Setting environment variables
Choose the region and the worker node provider for your cluster, and set the environment variables.
```shell
export KUBERNETES_VERSION=1.18
export CLUSTER_ZONE=us-south-3
export CLUSTER_NAME=kubeflow-vpc
```
where:
- `KUBERNETES_VERSION`: Run `ibmcloud ks versions` to see the supported Kubernetes versions. Refer to
[Supported version matrix](https://www.kubeflow.org/docs/started/k8s/overview/#minimum-system-requirements).
- `CLUSTER_ZONE`: Run `ibmcloud ks locations` to list supported zones. For example, choose `us-south-3` to create your
cluster in the Dallas (US) data center.
- `CLUSTER_NAME` must be lowercase and unique among any other Kubernetes
clusters in the specified `${CLUSTER_ZONE}`.
**Notice**: Refer to [Creating clusters](https://cloud.ibm.com/docs/containers?topic=containers-clusters) in the IBM
Cloud documentation for additional information on how to set up other providers and zones in your cluster.
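If you are unsure which values to pick, you can first list what is available; this is a small convenience sketch using the commands referenced above:
```shell
# List supported Kubernetes versions
ibmcloud ks versions
# List supported locations/zones
ibmcloud ks locations
```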
### Choosing a worker node flavor
Worker node flavor names vary across zones and providers. Run
`ibmcloud ks flavors --zone ${CLUSTER_ZONE} --provider vpc-gen2` to list available flavors.
Below are some examples of flavors supported in the `us-south-3` zone with the `vpc-gen2` node provider:
```shell
ibmcloud ks flavors --zone us-south-3 --provider vpc-gen2
```
Example output:
```
For more information about these flavors, see 'https://ibm.biz/flavors'
Name Cores Memory Network Speed OS Server Type Storage Secondary Storage Provider
bx2.16x64 16 64GB 16Gbps UBUNTU_18_64 virtual 100GB 0B vpc-gen2
bx2.2x8† 2 8GB 4Gbps UBUNTU_18_64 virtual 100GB 0B vpc-gen2
bx2.32x128 32 128GB 16Gbps UBUNTU_18_64 virtual 100GB 0B vpc-gen2
bx2.48x192 48 192GB 16Gbps UBUNTU_18_64 virtual 100GB 0B vpc-gen2
bx2.4x16 4 16GB 8Gbps UBUNTU_18_64 virtual 100GB 0B vpc-gen2
...
```
The recommended configuration for a cluster is at least 8 vCPU cores with 16GB memory. Hence, we recommend
the `bx2.4x16` flavor for a two-worker-node cluster. Keep in mind that you can always scale the cluster
by adding more worker nodes should your application scale up.
Now set the environment variable with the flavor of your choice:
```shell
export WORKER_NODE_FLAVOR=bx2.4x16
```
## Create an IBM Cloud Kubernetes cluster for `vpc-gen2` infrastructure
Creating a `vpc-gen2` based cluster requires a VPC, a subnet, and a public gateway attached to that subnet. Fortunately, this is a
one-time setup: future `vpc-gen2` clusters can reuse the same VPC and subnet (with the attached public gateway).
1. Begin by installing the `vpc-infrastructure` plugin:
```shell
ibmcloud plugin install vpc-infrastructure
```
For more information, refer to this [tutorial](https://cloud.ibm.com/docs/containers?topic=containers-vpc_ks_tutorial).
2. Target generation 2 to access Gen 2 VPC resources:
```shell
ibmcloud is target --gen 2
```
Verify that the target is correctly set up:
```shell
ibmcloud is target
```
Example output:
```
Target Generation: 2
```
3. Create or use an existing VPC:
a) Use an existing VPC:
```shell
ibmcloud is vpcs
```
Example output:
```
Listing vpcs for generation 2 compute in all resource groups and region ...
ID Name Status Classic access Default network ACL Default security group Resource group
r006-hidden-68cc-4d40-xxxx-4319fa3gxxxx my-vpc1 available false husker-sloping-bee-resize blimp-hasty-unaware-overflow kubeflow
```
If the list above contains a VPC that you can use to deploy your cluster, make a note of its ID.
b) To create a new VPC, proceed as follows:
```shell
ibmcloud is vpc-create my-vpc
```
Example output:
```
Creating vpc my-vpc in resource group kubeflow under account IBM as ...
ID r006-hidden-68cc-4d40-xxxx-4319fa3fxxxx
Name my-vpc
...
```
**Save the ID in the `VPC_ID` environment variable as follows, so that you can use it later:**
```shell
export VPC_ID=r006-hidden-68cc-4d40-xxxx-4319fa3fxxxx
```
4. Create or use an existing subnet:
a) To use an existing subnet:
```shell
ibmcloud is subnets
```
Example output:
```
Listing subnets for generation 2 compute in all resource groups and region ...
ID Name Status Subnet CIDR Addresses ACL Public Gateway VPC Zone Resource group
0737-27299d09-1d95-4a9d-a491-a6949axxxxxx my-subnet available 10.240.128.0/18 16373/16384 husker-sloping-bee-resize my-gateway my-vpc us-south-3 kubeflow
```
If the list above contains a subnet in your VPC that you can use to deploy your cluster, make sure
you note its ID.
b) To create a new subnet:
- List the address prefixes and note the CIDR block corresponding to your zone;
in the example below, the CIDR block for zone `us-south-3` is `10.240.128.0/18`.
```shell
ibmcloud is vpc-address-prefixes $VPC_ID
```
Example output:
```
Listing address prefixes of vpc r006-hidden-68cc-4d40-xxxx-4319fa3fxxxx under account IBM as user new@user-email.com...
ID Name CIDR block Zone Has subnets Is default Created
r006-xxxxxxxx-4002-46d2-8a4f-f69e7ba3xxxx rising-rectified-much-brew 10.240.0.0/18 us-south-1 false true 2021-03-05T14:58:39+05:30
r006-xxxxxxxx-dca9-4321-bb6c-960c4424xxxx retrial-reversal-pelican-cavalier 10.240.64.0/18 us-south-2 false true 2021-03-05T14:58:39+05:30
r006-xxxxxxxx-7352-4a46-bfb1-fcbac6cbxxxx subfloor-certainly-herbal-ajar 10.240.128.0/18 us-south-3 false true 2021-03-05T14:58:39+05:30
```
- Now create a subnet as follows:
```shell
ibmcloud is subnet-create my-subnet $VPC_ID $CLUSTER_ZONE --ipv4-cidr-block "10.240.128.0/18"
```
Example output:
```
Creating subnet my-subnet in resource group kubeflow under account IBM as user new@user-email.com...
ID 0737-27299d09-1d95-4a9d-a491-a6949axxxxxx
Name my-subnet
```
- Make sure you export the subnet ID as follows:
```shell
export SUBNET_ID=0737-27299d09-1d95-4a9d-a491-a6949axxxxxx
```
5. Create a `vpc-gen2` based Kubernetes cluster:
```shell
ibmcloud ks cluster create vpc-gen2 \
--name $CLUSTER_NAME \
--zone $CLUSTER_ZONE \
--version ${KUBERNETES_VERSION} \
--flavor ${WORKER_NODE_FLAVOR} \
--vpc-id ${VPC_ID} \
--subnet-id ${SUBNET_ID} \
--workers 2
```
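Provisioning can take a while. One way to poll the cluster state, mirroring the check used in the classic-provider guide (the `awk` field numbers assume the same tabular output), is shown below; the cluster is ready when its state is `normal`:
```shell
# Print the name and state of the new cluster; repeat until the state is "normal"
ibmcloud ks clusters --provider vpc-gen2 | grep ${CLUSTER_NAME} | awk '{print "Name:"$1"\tState:"$3}'
```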
6. Attach a public gateway.
This step is mandatory for the Kubeflow deployment to succeed, because pods need public internet access to download images.
- First, check whether a public gateway is already attached to your VPC:
```shell
ibmcloud is pubgws
```
Example output:
```
Listing public gateways for generation 2 compute in all resource groups and region ...
ID Name Status Floating IP VPC Zone Resource group
r006-xxxxxxxx-5731-4ffe-bc51-1d9e5fxxxxxx my-gateway available xxx.xxx.xxx.xxx my-vpc us-south-3 default
```
In the output above, a gateway is already attached to the VPC `my-vpc`. If no gateway is attached, proceed with
the rest of the setup.
- Next, create a public gateway by running the following command:
```shell
ibmcloud is public-gateway-create my-gateway $VPC_ID $CLUSTER_ZONE
```
Example output:
```
ID: r006-xxxxxxxx-5731-4ffe-bc51-1d9e5fxxxxxx
```
Save the generated gateway ID as follows:
```shell
export GATEWAY_ID="r006-xxxxxxxx-5731-4ffe-bc51-1d9e5fxxxxxx"
```
- Finally, attach the public gateway to the subnet:
```shell
ibmcloud is subnet-update $SUBNET_ID --public-gateway-id $GATEWAY_ID
```
Example output:
```
Updating subnet 0737-27299d09-1d95-4a9d-a491-a6949axxxxxx under account IBM as user new@user-email.com...
ID 0737-27299d09-1d95-4a9d-a491-a6949axxxxxx
Name my-subnet
...
```
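To double-check that the gateway is now attached, you can list the subnets again and confirm that the `Public Gateway` column is populated for `my-subnet`:
```shell
ibmcloud is subnets
```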
### Verifying the cluster
To use the created cluster, switch the Kubernetes context to point to the cluster:
```shell
ibmcloud ks cluster config --cluster ${CLUSTER_NAME}
```
Make sure all worker nodes are up with the command below:
```shell
kubectl get nodes
```
and verify that all the nodes are in `Ready` state.
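If you prefer a non-interactive check, the following sketch waits until every node reports `Ready` (and gives up after five minutes):
```shell
kubectl wait --for=condition=Ready nodes --all --timeout=300s
```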
### Delete the cluster
Delete the cluster, including its storage:
```shell
ibmcloud ks cluster rm --force-delete-storage -c ${CLUSTER_NAME}
```
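If you also want to remove the VPC resources created earlier, and nothing else uses them, a minimal clean-up sketch with the same `vpc-infrastructure` plugin looks like this (deletion fails while a resource is still in use, so wait for the cluster deletion to finish first):
```shell
# Delete the subnet, the public gateway, and the VPC created for this cluster
ibmcloud is subnet-delete ${SUBNET_ID}
ibmcloud is public-gateway-delete ${GATEWAY_ID}
ibmcloud is vpc-delete ${VPC_ID}
```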


@ -44,12 +44,23 @@ Get the Kubeconfig file:
ibmcloud ks cluster config --cluster $CLUSTER_NAME
```
From here on, please see [Install Kubeflow](/docs/ibm/deploy/install-kubeflow).
From here on, go to [Install Kubeflow on IKS](/docs/ibm/deploy/install-kubeflow-on-iks) for more information.
## Create and set up a new cluster
Follow these steps to create and set up a new IBM Cloud Kubernetes Service (IKS) cluster:
* Use a `classic` provider if you want to try out Kubeflow.
* Use a `vpc-gen2` provider if you are familiar with cloud networking and want to deploy Kubeflow in a secure environment.
A `classic` provider exposes each cluster node to the public internet and therefore has
a relatively simpler networking setup. Services exposed using a Kubernetes `NodePort` need to be secured with
an authentication mechanism.
To create a cluster with `vpc-gen2` provider, follow the
[Create a cluster on IKS with a `vpc-gen2` provider](/docs/ibm/create-cluster-vpc)
guide.
The next section explains how to create and set up a new IBM Cloud Kubernetes Service (IKS) cluster with the `classic` provider.
### Setting environment variables
@ -62,20 +73,41 @@ export WORKER_NODE_PROVIDER=classic
export CLUSTER_NAME=kubeflow
```
- `KUBERNETES_VERSION` specifies the Kubernetes version for the cluster. Run `ibmcloud ks versions` to see the supported Kubernetes versions. If this environment variable is not set, the cluster will be created with the default version set by IBM Cloud Kubernetes Service. Refer to the [Minimum system requirements](https://www.kubeflow.org/docs/started/k8s/overview/#minimum-system-requirements) and choose a Kubernetes version compatible with the Kubeflow release to be deployed.
- `CLUSTER_ZONE` identifies the regions or location where CLUSTER_NAME will be created. Run `ibmcloud ks locations` to list supported IBM Cloud Kubernetes Service locations. For example, choose `dal13` to create CLUSTER_NAME in the Dallas (US) data center.
- `WORKER_NODE_PROVIDER` specifies the kind of IBM Cloud infrastructure on which the Kubernetes worker nodes will be created. The `classic` type supports worker nodes with GPUs. There are other worker nodes providers including `vpc-classic` and `vpc-gen2` where zone names and worker flavors will be different. Please use `ibmcloud ks zones --provider ${WORKER_NODE_PROVIDER}` to list zone names if using other providers and set the `CLUSTER_ZONE` accordingly.
where:
- `KUBERNETES_VERSION` specifies the Kubernetes version for the cluster. Run `ibmcloud ks versions` to see the supported
Kubernetes versions. If this environment variable is not set, the cluster will be created with the default version set
by IBM Cloud Kubernetes Service. Refer to
[Minimum system requirements](https://www.kubeflow.org/docs/started/k8s/overview/#minimum-system-requirements)
and choose a Kubernetes version compatible with the Kubeflow release to be deployed.
- `CLUSTER_ZONE` identifies the region or location where the cluster will be created. Run `ibmcloud ks locations` to
list supported IBM Cloud Kubernetes Service locations. For example, choose `dal13` to create your cluster in the
Dallas (US) data center.
- `WORKER_NODE_PROVIDER` specifies the kind of IBM Cloud infrastructure on which the Kubernetes worker nodes will be
created. The `classic` type supports worker nodes with GPUs. There are other worker nodes providers including
`vpc-classic` and `vpc-gen2` where zone names and worker flavors will be different. Run
`ibmcloud ks zones --provider classic` to list zone names for `classic` provider and set the `CLUSTER_ZONE`
accordingly.
- `CLUSTER_NAME` must be lowercase and unique among any other Kubernetes
clusters in the specified `${CLUSTER_ZONE}`.
**Notice**: If choosing other Kubernetes worker nodes providers than `classic`, refer to the IBM Cloud official document [Creating clusters](https://cloud.ibm.com/docs/containers?topic=containers-clusters) for detailed steps.
**Notice**: Refer to [Creating clusters](https://cloud.ibm.com/docs/containers?topic=containers-clusters) in the IBM
Cloud documentation for additional information on how to set up other providers and zones in your cluster.
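As with the commands above, you can list the zones that the `classic` provider supports before setting `CLUSTER_ZONE`; for example:
```shell
ibmcloud ks zones --provider classic
```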
### Choosing a worker node flavor
The worker nodes flavor name varies from zones and providers. Run `ibmcloud ks flavors --zone ${CLUSTER_ZONE} --provider ${WORKER_NODE_PROVIDER}` to list available flavors. For example, following are some flavors supported in the `dal13` zone with `classic` worker node provider.
Worker node flavor names vary across zones and providers. Run
`ibmcloud ks flavors --zone ${CLUSTER_ZONE} --provider ${WORKER_NODE_PROVIDER}` to list available flavors.
```text
$ ibmcloud ks flavors --zone dal13 --provider classic
For example, the following are some worker node flavors supported in the `dal13` zone with a `classic` node provider.
```shell
ibmcloud ks flavors --zone dal13 --provider classic
```
Example output:
```
OK
For more information about these flavors, see 'https://ibm.biz/flavors'
Name Cores Memory Network Speed OS Server Type Storage Secondary Storage Provider
@ -92,15 +124,18 @@ b3c.8x32 8 32GB 1000Mbps UBUNTU_18_64 virtua
...
```
Choose a flavor that will work for your applications. For the purpose of the Kubeflow deployment, the recommended configuration for a cluster is at least 8 vCPU cores with 16GB memory. Hence you can either choose the `b3c.8x32` flavor to create a one-worker-node cluster or choose the `b3c.4x16` flavor to create a two-worker-node cluster. Keep in mind that you can always scale the cluster by adding more worker nodes should your application scales up.
Choose a flavor that will work for your applications. For the purpose of the Kubeflow deployment, the recommended
configuration for a cluster is at least 8 vCPU cores with 16GB memory. Hence you can either choose the `b3c.8x32` flavor
to create a one-worker-node cluster or choose the `b3c.4x16` flavor to create a two-worker-node cluster. Keep in mind
that you can always scale the cluster by adding more worker nodes should your application scale up.
Now set the environment variable with the flavor you choose.
Now, set the environment variable with the worker node flavor of your choice:
```shell
export WORKER_NODE_FLAVOR=b3c.4x16
```
### Creating a IBM Cloud Kubernetes cluster
### Creating an IBM Cloud Kubernetes cluster
Run the following command to create a cluster:
@ -115,7 +150,11 @@ ibmcloud ks cluster create ${WORKER_NODE_PROVIDER} \
Replace the `workers` parameter above with the desired number of worker nodes.
Note: If you're starting in a fresh account with no public and private VLANs, they are created automatically for you when creating a Kubernetes cluster with worker nodes provider `classic` for the first time. If you already have VLANs configured in your account, retrieve them via `ibmcloud ks vlans --zone ${CLUSTER_ZONE}` and include the public and private VLAN ids (set in the `PUBLIC_VLAN_ID` and `PRIVATE_VLAN_ID` environment variables) in the command, for example:
**Note**: If you're starting in a fresh account with no public and private VLANs, they are created automatically for you
when creating a Kubernetes cluster with worker nodes provider `classic` for the first time. If you already have VLANs
configured in your account, retrieve them via `ibmcloud ks vlans --zone ${CLUSTER_ZONE}` and include the public and
private VLAN ids (set in the `PUBLIC_VLAN_ID` and `PRIVATE_VLAN_ID` environment variables) in the command, for example:
```shell
ibmcloud ks cluster create ${WORKER_NODE_PROVIDER} \
@ -128,10 +167,11 @@ ibmcloud ks cluster create ${WORKER_NODE_PROVIDER} \
--public-vlan ${PUBLIC_VLAN_ID}
```
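For completeness, here is a small sketch of looking up and exporting the VLAN IDs referenced in the note and command above. The exported values are hypothetical placeholders; copy the real IDs from the `ibmcloud ks vlans` output.
```shell
# List the VLANs available in your zone
ibmcloud ks vlans --zone ${CLUSTER_ZONE}
# Export the IDs shown in the output (placeholder values below)
export PRIVATE_VLAN_ID=2234947
export PUBLIC_VLAN_ID=2234945
```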
Wait until the cluster is deployed and configured. It can take a while for the cluster to be ready. Run with following command to periodically check the state of your cluster. Your cluster is ready when the state is `normal`.
Wait until the cluster is deployed and configured. It can take a while for the cluster to be ready. Run the following
command periodically to check the state of your cluster. Your cluster is ready when the state is `normal`.
```shell
ibmcloud ks clusters --provider ${WORKER_NODE_PROVIDER} |grep ${CLUSTER_NAME}|awk '{print "Name:"$1"\tState:"$3}'
ibmcloud ks clusters --provider ${WORKER_NODE_PROVIDER} |grep ${CLUSTER_NAME} |awk '{print "Name:"$1"\tState:"$3}'
```
### Verifying the cluster
@ -149,3 +189,11 @@ kubectl get nodes
```
and make sure all the nodes are in `Ready` state.
### Delete the cluster
Delete the cluster, including its storage:
```shell
ibmcloud ks cluster rm --force-delete-storage -c ${CLUSTER_NAME}
```

Some files were not shown because too many files have changed in this diff.