* Refactor presubmit-tests-with-pipeline-deployment.sh so that it can be run from a different project
* Simplify getting service account from cluster.
* Migrate presubmit-tests-with-pipeline-deployment.sh to use kfp
lightweight deployment.
* Add option to cache built images to make debugging faster.
* Fix cluster set up
* Copy image builder image instead of granting permission
* Add missing yes command
* fix stuff
* Let other usages of image-builder image become configurable
* let test workflow use image builder image
* Fix permission issue
* Hide irrelevant error logs
* Use shared service account key instead
* Move test manifest to test folder
* Move build-images.sh to a different script file
* Update README.md
* add cluster info dump
* Use the same cluster resources as kubeflow deployment
* Remove cluster info dump
* Add timing to test log
* cleaned up code
* fix tests
* address cr comments
* Address cr comments
* Enable image caching to improve retest speed
* Change to add base framework for cleaning up resources in a GCP project.
The resource specification is specified declaratively using a YAML file.
As per current requirements this change only adds cleaning up of GKE
clusters.
* Remove redundant import.
* Simplify sample_test.yaml by using withItem syntax.
* Simplify sample_test.yaml by using withItem syntax.
* Change dict to str in withItems.
* Move tensorflow installation into notebooks.
* remove redundant sed options.
* Fix format/style issues
* [WIP] Refactor repeated logic into two utility functions.
* [WIP] Add a utility function to validate the test results from a notebook test.
* [WIP] Refactor test cases (except for notebook sample tests) into adopting utility functions.
TODO: Need to move the functions of run_*_test.py into a unified run_sample_test.py.
* [WIP] Fix a typo in test name and incorporate tfx-cab-classification, kubeflow-training-classification, xgboost-training-cm and basic ones into one run_sample_test.py
* Fix/add some comments.
* Refactor notebook tests into using utility functions
* lint
* Unify naming in sample_test.yaml
* Remove old *_test.py files
* Fix tests by fixing test names.
* Fix string formatting, per Ark-kun's comment.
* Fix names of papermill-generated python notebook.
* Fix tests
* Fix test by fixing experiment names, and test names in yaml.
* Fix test by fixing experiment names.
* Fix dsl type checking test that does not require experiment set-up.
* Remove redundant commands and usage of ipython
* Revert "Remove redundant commands and usage of ipython"
This reverts commit 23a0e014
* Remove redundant string subs and edit an AI.
* Move image name injection to a utility function to improve readability.
* Revert lint changes of check_notebook_results.py
* Unify test case naming convention to underscore.
* Fix .py name
* Fix README.md
* Fix test
* Add TODO items.
* Add a utility function to inject kubeflow_training_classification python sample file.
* Fix redundant cd command.
* Fix indentation.
* Fix test names in component_test.yaml
* Remove redundant clean_cmle_models.py
* Fix nit problem.
* Fix comment.
* Testing - Clean up the Argo controller that was used to build images
* Try to not install Argo in the first place
* Added the test-runner service account
* Cleanup
* Changing the install-argo.sh script instead
* Allows toggling between one-off and recurring runs in the new run page
* Clean up and adds tests
* Fix integration test - account for extra field in form
* Cleanup and PR comments
* Clear default exp table on delete and create default exp on run create
if no default exists
With this change, if the delete experiment API is called on the default
experiment, then the ID will also be removed from the default_experiments
table.
Additionally, if the default experiment doesn't exist and a new run is
created without an experiment, a new default experiment will be created,
and the run will be placed within this experiment.
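The default-experiment behaviour described above can be sketched as a minimal in-memory model. The class and method names here are illustrative only; the real implementation lives in the API server's resource manager and a SQL-backed default_experiments table.

```python
class ExperimentStore:
    """Illustrative sketch of default-experiment bookkeeping."""

    def __init__(self):
        self.experiments = {}          # id -> experiment dict
        self.default_experiment_id = None
        self._next_id = 1

    def create_experiment(self, name):
        exp_id = str(self._next_id)
        self._next_id += 1
        self.experiments[exp_id] = {'id': exp_id, 'name': name}
        return exp_id

    def delete_experiment(self, exp_id):
        self.experiments.pop(exp_id, None)
        # Deleting the default experiment also clears the stored default ID.
        if self.default_experiment_id == exp_id:
            self.default_experiment_id = None

    def experiment_for_new_run(self, exp_id=None):
        # Runs created without an experiment fall back to the default one,
        # recreating it on demand if it no longer exists.
        if exp_id is not None:
            return exp_id
        if self.default_experiment_id not in self.experiments:
            self.default_experiment_id = self.create_experiment('Default')
        return self.default_experiment_id
```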
* Adds integration test for creating a run without an experiment
* Fixes failure to close database connection and adds tests for recreating and deleting default experiment
* Rename function
* Revert some row.Close() calls
* Testing/Sample - Made checking confusion matrix data more robust
The sample tests no longer depend on particular file names inside the archive. Now they only depend on the artifact name.
* Fixed json loading on python 3.5
`json.load` only supports reading from binary files in python 3.6+. https://docs.python.org/3/library/json.html#json.load
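A compatible fix can be sketched by decoding the bytes before parsing (`load_json_compat` is an illustrative name, not the actual helper):

```python
import io
import json

def load_json_compat(fileobj):
    # json.load only accepts binary file objects on Python 3.6+;
    # decoding the bytes ourselves keeps Python 3.5 working too.
    data = fileobj.read()
    if isinstance(data, bytes):
        data = data.decode('utf-8')
    return json.loads(data)

print(load_json_compat(io.BytesIO(b'{"status": "ok"}')))  # -> {'status': 'ok'}
```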
* Adds 'Create run' button to experiment list / all runs page
* Add run without experiment and filtering to FE integration test
* Update snapshots
* Add refresh and wait to integration test
* Adjust
* Adjust
* Don't exit integration test early if npm test fails
* PR comments
* TEMP - take screenshots to debug integration test
* Store screenshots
* Remove the 'create run without experiment' integration test for now, as it fails because the default experiment is deleted at the end of the API initialization and integration test suites
* WIP - Create default experiment upon API server initialization
* Default experiment initialization caused crashes if API server pod was restarted without clearing DB
* Adding new table to store default experiment ID
* Add default experiment type model definition
* Minor fixes, everything seems to work now
* Clean up. Renamed to default_experiment_store
* Adds tests for the default_experiment_store
* Add integration test for verifying initial cluster state. Currently only covers existence of default experiment
* Don't run initialization tests except as integration tests
* Fixes comments
* PR comments and cleanup
* Extract code in resource_manager to helper func
* add type checking sample to sample tests
* Add the test script exit code to the sample test result; update the check_notebook_result script to not validate the pipeline runs when the experiment arg is not provided
* fix typo
* Fixed Kubeflow sample test
* Fixed the artifact-finding logic in `get_artifact_in_minio`.
It was just taking the first artifact before.
Now it properly searches the artifact by name.
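The shape of that fix can be sketched as follows, assuming artifacts are dicts with a `'name'` key (the real objects in `get_artifact_in_minio` differ):

```python
def find_artifact_by_name(artifacts, name):
    # Search for the artifact whose name matches, instead of
    # blindly returning artifacts[0] as the old logic did.
    matches = [a for a in artifacts if a.get('name') == name]
    if not matches:
        raise ValueError('No artifact named %r' % name)
    return matches[0]
```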
* Update swagger definitions
* WIP - Adds ability to terminate runs to frontend
* Update snapshots
* Adds tests. Also changes warning message color to orange rather than red
* Remove refresh button from run details page
* Elaborate terminate confirmation message
* Minor fixes
* Remove references to refresh button from integration tests
* add a While in the ops group
* deepcopy the while conditions when entering and exiting
* add while condition resolution in the compiler
* define graph component decorator
* remove while loop related codes
* fixes
* remove while loop related code
* fix bugs
* generate a unique ops group name and allow retrieving it by name
* resolve the opsgroups inputs and dependencies based on the pipelineparam in the condition
* add a recursive ops_groups
* fix bugs of the recursive opsgroup template name
* resolve the recursive template name and arguments
* add validity checks
* add more comments
* add usage comment in graph_component
* add a sample
* add unit test for the graph opsgroup
* refactor the opsgroup
* add unit test for the graph_component decorator
* exposing graph_component decorator
* add recursive compiler unit tests
* add the sample test
* fix the bug of opsgroup name
adjust the graph_component usage example
fix index bugs
use with statement in the graph_component instead of directly calling
the enter/exit functions
* add a todo to combine the graph_component and component decorators
* fix some merging bug
* fix typo
* add more comments in the sample
* update comments
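The unique ops group naming and lookup-by-name mentioned above can be sketched as a per-pipeline counter. This is a minimal illustration, not the actual compiler data structure:

```python
class OpsGroupRegistry:
    """Give each ops group a unique name derived from its type, and allow
    looking the group up by that name later, as a recursive compiler must
    when resolving a recursive template reference."""

    def __init__(self):
        self._counts = {}
        self._groups = {}

    def register(self, group_type, group):
        count = self._counts.get(group_type, 0) + 1
        self._counts[group_type] = count
        name = '%s-%d' % (group_type, count)   # e.g. 'graph-1', 'graph-2'
        self._groups[name] = group
        return name

    def lookup(self, name):
        return self._groups[name]
```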
* dsl generate zip file
* minor fix
* fix zip read in the unit test
* update sample tests
* dsl compiler generates pipeline based on the input name suffix
* add unit tests for different output format
* update the sdk client to support tar zip and yaml
* fix typo
* fix file write
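The suffix-based format selection can be sketched like this; the format labels and function name are illustrative, not the compiler's actual identifiers:

```python
def output_format(path):
    # Pick the package format from the output file-name suffix,
    # mirroring the suffix-based behaviour described above.
    lowered = path.lower()
    if lowered.endswith('.tar.gz') or lowered.endswith('.tgz'):
        return 'tar.gz'
    if lowered.endswith('.zip'):
        return 'zip'
    if lowered.endswith('.yaml') or lowered.endswith('.yml'):
        return 'yaml'
    raise ValueError('Unsupported output extension: %s' % path)
```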
* add postsubmit script and yaml
* remove old sample tests component file
* extract deploy-pipeline.sh, deploy-kubeflow.sh and test-prep.sh from presubmit and postsubmit scripts
* component build support for both python2 and python3
* add sample test
* remove the annotations for python2 component build
* add pathlib for python2 component build
* fix component build unit test
* fix bug in the dockerfile generator
* remove exist_ok in path.mkdir to make python2 compatible
* adjust unit test
* remove pathlib dependency for python2 component build
* remove the pathlib codes in python3 component build, but use python2 code instead; add a todo to create a new sample
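The Python 2 compatible replacement for `Path(...).mkdir(parents=True, exist_ok=True)` can be sketched as catching the "already exists" error explicitly (`makedirs_compat` is an illustrative name):

```python
import errno
import os
import tempfile

def makedirs_compat(path):
    # Python 2's os.makedirs has no exist_ok flag, so swallow the
    # EEXIST error instead of passing exist_ok=True.
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise

target = os.path.join(tempfile.mkdtemp(), 'a', 'b')
makedirs_compat(target)
makedirs_compat(target)  # second call is a no-op instead of raising
```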
* merge build image into test suite
* update image
* Update presubmit-tests-with-pipeline-deployment.sh
* add permission to access to gcr
* add service account
* test
* fix
* not exit
* speed boost
* Uses 'Create' for all actions that lead to creation flow, or result in a static object (experiment). 'Start' is used solely for initiating runs
* Update integration test
* Update sample notebook to clean up deployed models.
Update SDK client to return correct links in local Jupyter with user's own proxy connection.
* Fix sample tests.
* add another sample test to test the current sample code instead of using newly built component images
* rename sample test yamls
* use the v2 name
* bash bug
* tf-training bug fix
* output argo log in case of exceptions for tf-training sample
* disable gpu
* try go version in travis
* add back old travis tests with backend tests
* remove backend unit tests prow config
* remove unit_test_gke
* test backend/src directory
* update comment to call out unit test
* add vendor to gitignore
* switch to go module
* switch to go module
* switch to go module
* prune go mod
* prune go mod
* turn on go mod for test
* enable go module in docker image
* enable go module in docker image
* fix images
* debug
* debug
* debug
* update image
* add get_experiment_id and list_runs_by_experiment
* offer only one get_experiment function
* return experiment body instead of id
* simplify code
* simplify code, part 2
* remove experiment_id check in the while loop
* minor bug
* Adds an experiment selector to the new run page. Needs tests
* Adds an experiment selector to the new run page. Needs tests
* Adds tests for the new experiment selector in NewRun
* Rename PipelineSelector -> ResourceSelector since it handles experiments as well
* Makes ResourceSelector more abstract. No longer coupled to experiments and pipelines
* PR comments, NewRun clean-up
* Moves resourceToRow function into ResourceSelector
* Fix e2e test
* add notebook sample tests for tfx
* parameterize component image tag
* parameterize base and target image tags
* install tensorflow package for the notebook tfx sample test
* bug fixes
* start debug mode
* fix bugs
* add namespace arg to check_notebook_results, copy test results to gcs, fix minor bugs
add CMLE model deletion
* install the correct KFP version in the notebook; parameterize deployer model name and version
* fix CMLE model name bug
* add notebook sample test in v2
* add gcp sa in notebook tfx sample and shutdown debug mode
* import kfp.gcp
* Tests - Getting rid of git clone in */run_test.sh
run_test.sh scripts no longer pull the repo code, because the code is now correctly baked in during the image build. This saves ~11 pulls per commit.
The backend unit test image is now built as part of the test suite.
* Added target-image-prefix parameter to simplify test configuration
* Build all images from source code prepared by Prow. Got rid of git pulls
All images are now built from the archived version of the source code prepared by Prow.
This saves 25 more pulls and improves test reliability.
The archived source code location is passed through image-build-context-gcs-uri parameter.
* Addressed the PR feedback.