
ML pipeline test infrastructure

This folder contains the integration/e2e tests for ML pipelines. We use Argo workflows to run the tests.

At a high level, a typical test workflow will

  • build Docker images for all components
  • create a dedicated test namespace in the cluster
  • deploy ML pipelines using the newly built components
  • run the test
  • delete the namespace
  • delete the images
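The steps above can be sketched as a shell outline. This is only an illustration of the flow: the image names, project, namespace, and script names below are hypothetical placeholders, and the real workflow encodes these as Argo steps rather than a script.

```shell
# Hypothetical outline of a test run; every name here is a placeholder.
COMMIT=abc1234            # commit under test
NS="test-${COMMIT}"       # dedicated test namespace

# 1. Build and push Docker images for the components.
docker build -t "gcr.io/my-project/api-server:${COMMIT}" backend/
docker push "gcr.io/my-project/api-server:${COMMIT}"

# 2. Create a dedicated namespace for this test run.
kubectl create namespace "${NS}"

# 3. Deploy ML pipelines into the namespace using the new images.
kubectl apply -n "${NS}" -f deployment.yaml

# 4. Run the tests.
./run-tests.sh --namespace "${NS}"

# 5. Clean up the namespace and the temporary images.
kubectl delete namespace "${NS}"
gcloud container images delete "gcr.io/my-project/api-server:${COMMIT}" --quiet
```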

All these steps take place in the same Kubernetes cluster. You can use GKE to test against the code in a GitHub branch. The images are temporarily stored in a GCR repository in the same project.

Tests are run automatically on each commit in a Kubernetes cluster using Prow. Tests can also be run manually; see the next section.

Run tests using GKE

You can run the tests against a specific commit.

Setup

Here are the one-time steps to prepare your GKE testing cluster:

  • Follow the main page to create a GKE cluster.
  • Install Argo in the cluster. If you have Argo CLI installed locally, just run
    argo install
    
  • Create a cluster role binding.
    kubectl create clusterrolebinding default-as-admin \
      --clusterrole=cluster-admin --serviceaccount=default:default
    
  • Follow the guideline to create an SSH deploy key, and store it as a Kubernetes secret in your cluster so the job can later access the code. Note that adding a deploy key to a GitHub repo requires admin permission. This step is not needed when the project is public.
    kubectl create secret generic ssh-key-secret \
      --from-file=id_rsa=/path/to/your/id_rsa \
      --from-file=id_rsa.pub=/path/to/your/id_rsa.pub
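Once setup is done, you can sanity-check the cluster before submitting any workflows. This is a hedged example: the Argo controller's namespace depends on how Argo was installed, and the secret name matches the command above.

```shell
# Confirm the Argo workflow controller is running (its namespace may vary
# depending on how Argo was installed).
kubectl get pods --all-namespaces | grep workflow-controller

# Confirm the cluster role binding exists.
kubectl get clusterrolebinding default-as-admin

# Confirm the deploy-key secret was created in the default namespace.
kubectl get secret ssh-key-secret
```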
    

Run tests

Simply submit the test workflow to the GKE cluster, with a parameter specifying the commit you want to test (master HEAD by default):

argo submit integration_test_gke.yaml -p commit-sha=<commit>

You can check the result by doing:

argo list
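For more detail on a single run, the Argo CLI can show per-step status; the workflow name below is a placeholder taken from the `argo list` output.

```shell
# Show the step tree and per-step status of one workflow (name is a placeholder).
argo get integration-test-abcde

# Live-updating view of the workflow until it completes.
argo watch integration-test-abcde
```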

The workflow will create a temporary namespace with the same name as the Argo workflow. All the images will be stored in gcr.io/project_id/workflow_name/branch_name/*. By default, when the test is finished, the namespace and images are deleted. However, you can keep them by providing an additional parameter.

argo submit integration_test_gke.yaml -p branch="my-branch" -p cleanup="false"
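If you keep the temporary images, you can inspect and remove them later with gcloud. The repository path below is a placeholder following the gcr.io/project_id/workflow_name/branch_name pattern above.

```shell
# List the images left behind by a test run (path is a placeholder).
gcloud container images list --repository=gcr.io/my-project/my-workflow/my-branch

# Delete an image and all of its tags once you are done with it.
gcloud container images delete \
  gcr.io/my-project/my-workflow/my-branch/api-server:latest \
  --force-delete-tags --quiet
```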

Troubleshooting

Q: Why is my test taking so long on GKE?

The cluster downloads a number of images the first time the tests run; subsequent runs are faster because the images are cached. The image-building steps run in parallel and usually take 2-3 minutes in total. If you are experiencing high latency, it might be due to resource constraints on your GKE cluster. In that case, deploy a larger cluster.
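To check whether resource constraints are actually the bottleneck, you can inspect node utilization and unschedulable pods, then resize the node pool. The cluster name and zone below are placeholders; `kubectl top` assumes metrics are available in the cluster.

```shell
# Per-node CPU/memory usage (requires cluster metrics to be available).
kubectl top nodes

# Pods stuck in Pending are often waiting on insufficient CPU/memory.
kubectl get pods --all-namespaces --field-selector=status.phase=Pending

# Grow the cluster's default node pool (cluster/zone names are placeholders).
gcloud container clusters resize my-cluster --zone us-central1-a --num-nodes 4
```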