Commit Graph

115 Commits

Author SHA1 Message Date
Ning 6554e133dd rename sample_test to component_test and sample_test_v2 to sample_test (#1341) 2019-05-16 11:38:29 -07:00
Ning 1ab558a7cd add argo install in postsubmit (#1333) 2019-05-14 17:43:02 -07:00
Ning f8253235c0 ml-pipeline-test has a cb job (#1322) 2019-05-14 12:59:13 -07:00
Alexey Volkov 32875e4b2f Adding myself to test owners (#1312) 2019-05-13 15:53:09 -07:00
Alexey Volkov e9aa69e353 Testing - Clean up the Argo controller that was used to build images (#1311)
* Testing - Clean up the Argo controller that was used to build images

* Try to not install Argo in the first place

* Added the test-runner service account

* Cleanup

* Changing the install-argo.sh script instead
2019-05-13 14:51:09 -07:00
Riley Bauer 25cb766dae Adds a toggle between one-off and recurring runs to NewRun page (#1274)
* Allows toggling between one-off and recurring runs in the new run page

* Clean up and adds tests

* Fix integration test - account for extra field in form

* Cleanup and PR comments
2019-05-04 11:29:37 -07:00
Ning d0429b63f9 keep the api image name consistent between the presubmit test and staging (#1279) 2019-05-03 17:49:38 -07:00
Ning 727c48c690 update the image in the samples to use the new component images (#1267)
* update the image in the samples to use the new component images

* replace the image tag in the yaml
2019-05-02 19:46:17 -07:00
Ning 02ecab8174 fix postsubmit bugs (#1248)
* fix postsubmit bugs

* fix bugs
2019-04-29 14:55:35 -07:00
Ning a50997c98d remove unnecessary args (#1249) 2019-04-29 13:57:36 -07:00
Riley Bauer d88ba380bc Clear default exp table on delete and create default exp on run create if none exists (#1199)
* Clear default exp table on delete and create default exp on run create
if no default exists

With this change, if the delete experiment API is called on the default
experiment, then the ID will also be removed from the default_experiments
table.

Additionally, if the default experiment doesn't exist and a new run is
created without an experiment, a new default experiment will be created,
and the run will be placed within this experiment.

* Adds integration test for creating a run without an experiment

* Fixes failure to close database connection and adds tests for recreating and deleting default experiment

* Rename function

* Revert some row.Close() calls
2019-04-29 12:13:35 -07:00
Alexey Volkov e2492896aa Testing/Sample - Made checking confusion matrix data more robust (#1196)
* Testing/Sample - Made checking confusion matrix data more robust
The sample tests no longer depend on particular file names inside the archive. Now they only depend on the artifact name.

* Fixed json loading on python 3.5

`json.load` only supports reading from binary files in python 3.6+. https://docs.python.org/3/library/json.html#json.load
2019-04-25 15:46:27 -07:00
Alexey Volkov c9d3d39377 Testing - Fixed the postsubmit tests (#1210) 2019-04-24 20:12:45 -07:00
Riley Bauer 4de20179c6 Update to version 3.0.2 of npm package 'extend' (#1211)
* Update to version 3.0.2 of npm package 'extend'

* Use 'new' with Storage
2019-04-24 10:24:10 -07:00
Alexey Volkov 173ecbda4c Marked all scripts as executable (#1177) 2019-04-23 16:12:00 -07:00
Riley Bauer b29266351e Allow creating runs without experiments (#1175)
* Adds 'Create run' button to experiment list / all runs page

* Add run without experiment and filtering to FE integration test

* Update snapshots

* Add refresh and wait to integration test

* Adjust

* Adjust

* Don't exit integration test early if npm test fails

* PR comments

* TEMP - take screenshots to debug integration test

* Store screenshots

* Remove create run without experiment integration test for now as it fails due to the default experiment being deleted at the end of the API initialization and integration test suites
2019-04-22 11:59:45 -07:00
Riley Bauer fbae0f855e Creates a default experiment at API server set up time (#1089)
* WIP - Create default experiment upon API server initialization

* Default experiment initialization caused crashes if API server pod was restarted without clearing DB

* Adding new table to store default experiment ID

* Add default experiment type model definition

* Minor fixes, everything seems to work now

* Clean up. Renamed to default_experiment_store

* Adds tests for the default_experiment_store

* Add integration test for verifying initial cluster state. Currently only covers existence of default experiment

* Don't run initialization tests except as integration tests

* Fixes comments

* PR comments and cleanup

* Extract code in resource_manager to helper func
2019-04-15 15:55:04 -07:00
Ning 71325c3316 new kubernetes packages contain breaking change, thus fixing the version in the sample test image (#1159)
* new kubernetes packages contain breaking change, thus fixing the version

* also fixing the kubernetes version in the python sdk dependency

* fix bug
2019-04-14 21:36:00 -07:00
IronPan ac7366b9d8 use kubeflow/pipelines branch for deployment in test (#1143)
/assign @Ark-kun
2019-04-11 23:12:45 -07:00
Ning 06e544ba8c add type checking sample to sample tests (#1129)
* add type checking sample to sample tests

* add the test script exit code to the sample test result; update the check_notebook_result script to not validate the pipeline runs when experiment arg is not provided

* fix typo
2019-04-11 21:40:45 -07:00
Alexey Volkov c9382474d6 Fixed Kubeflow sample test (#1096)
* Fixed Kubeflow sample test

* Fixed the artifact-finding logic in `get_artifact_in_minio`.
It was just taking the first artifact before.
Now it properly searches the artifact by name.
2019-04-06 01:00:27 -07:00
Riley Bauer 94925ff2bd Add run termination controls to ui (#1039)
* Update swagger definitions

* WIP - Adds ability to terminate runs to frontend

* Update snapshots

* Adds tests. Also changes warning message color to orange rather than red

* Remove refresh button from run details page

* Elaborate terminate confirmation message

* Minor fixes

* Remove references to refresh button from integration tests
2019-04-01 10:18:35 -07:00
IronPan 1fe8497182 Pin specific version of kubeflow instead of using master (#995)
* pin specific version of kubeflow instead of from master

* Update deploy-kubeflow.sh
2019-03-30 09:33:08 -07:00
Ning 1d617b50bf Add a recursion sample (#1016)
* add a While in the ops group

* deepcopy the while conditions when entering and exiting

* add while condition resolution in the compiler

* define graph component decorator

* remove while loop related codes

* fixes

* remove while loop related code

* fix bugs

* generate a unique ops group name and being able to retrieve by name

* resolve the opsgroups inputs and dependencies based on the pipelineparam in the condition

* add a recursive ops_groups

* fix bugs of the recursive opsgroup template name

* resolve the recursive template name and arguments

* add validity checks

* add more comments

* add usage comment in graph_component

* add a sample

* add unit test for the graph opsgraph

* refactor the opsgroup

* add unit test for the graph_component decorator

* exposing graph_component decorator

* add recursive compiler unit tests

* add the sample test

* fix the bug of opsgroup name
adjust the graph_component usage example
fix index bugs
use with statement in the graph_component instead of directly calling
the enter/exit functions

* add a todo to combine the graph_component and component decorators

* fix some merging bug

* fix typo

* add more comments in the sample

* update comments
2019-03-27 20:44:43 -07:00
Ning 554731e478 dsl generate zip file (#855)
* dsl generate zip file

* minor fix

* fix zip read in the unit test

* update sample tests

* dsl compiler generates pipeline based on the input name suffix

* add unit tests for different output format

* update the sdk client to support tar zip and yaml

* fix typo

* fix file write
2019-03-26 15:14:50 -07:00
IronPan 44233c2ea0 use pending commit id for cluster and source code name (#994)
* fix cluster name

* Update deploy-kubeflow.sh
2019-03-20 10:15:45 -07:00
Yasser Elsayed b79bb5f527 Deflake frontend e2e test (#904) 2019-03-04 17:55:03 -08:00
IronPan f8c1dde3a8 move integration test to sub dir (#888)
* move integration test to sub dir

* revert
2019-03-01 15:32:56 -08:00
IronPan 18878f1bed fix bunch of issues in prow test (#866)
* update tests

* explicit return successful

* fix

* move variable

* Update deploy-kubeflow.sh
2019-02-27 14:40:01 -08:00
Ning b3dee0543a sample test image build failure (#871)
* sample test image build failure

* fix the base image tag to avoid future breaks
2019-02-27 11:16:51 -08:00
Ning fa3ebefe09 add sigint sigterm (#863) 2019-02-26 17:35:49 -08:00
Ning 508210d40b Add postsubmit component test (#613)
* add postsubmit script and yaml
* remove old sample tests component file
* extract deploy-pipeline.sh, deploy-kubeflow.sh and test-prep.sh from presubmit and postsubmit scripts
2019-02-25 13:29:04 -08:00
Ning a6763b9599 component build support for both python2 and python3 (#730)
* component build support for both python2 and python3

* add sample test

* remove the annotations for python2 component build

* add pathlib for python2 component build

* fix component build unit test

* fix bug in the dockerfile generator

* remove exist_ok in path.mkdir to make python2 compatible

* adjust unit test

* remove pathlib dependency for python2 component build

* remove the pathlib codes in python3 component build, but use python2 code instead; add a todo to create a new sample
2019-02-25 12:56:19 -08:00
hongye-sun ad370933c7 Move e2e tests to us-east1 (#847)
* move to us-east1

* switch to us-east1-b
2019-02-22 10:51:42 -08:00
hongye-sun 749d0aab9f Update swagger codegen version (#839) 2019-02-21 12:21:38 -08:00
Riley Bauer 600acbdd23 Updates lodash to version 4.17.11 (#803)
* Pin lodash version

* Commit updated package-locks
2019-02-13 10:40:21 -08:00
IronPan cc257f29a2 switch test to us-west1 (#808)
* switch test to east1

* Update presubmit-tests-with-pipeline-deployment.sh
2019-02-11 16:10:00 -08:00
IronPan e9bd7c6a4d merge build image to test suite (#799)
* merge build image to test suite

* update image

* Update presubmit-tests-with-pipeline-deployment.sh

* add permission to access to gcr

* add service account

* test

* fix

* not exit

* speed boost
2019-02-09 00:23:06 -08:00
hongye-sun 969bb4ed2c Revert "Add gpu pool to test deployment and enable gpu in sample test (#696)" (#778)
This reverts commit 72a7de9d47.
2019-02-04 21:59:19 -08:00
IronPan 6ad2601ec7 Remove pipeline bootstrapper (#739)
* remove bootstrapper

* more cleanups

* more cleanups
2019-01-31 17:52:03 -08:00
Yasser Elsayed be19cbc259 Refactor UI buttons to lib file (#737)
* refactor buttons to lib file

* Add license header

* fix e2e test
2019-01-28 11:27:45 -08:00
hongye-sun 72a7de9d47 Add gpu pool to test deployment and enable gpu in sample test (#696)
* add gpu pool to test deployment and enable gpu in sample test

* enable clean up
2019-01-25 09:44:08 -08:00
IronPan 4c551bac60 bump ks version (#693) 2019-01-23 15:08:25 -08:00
qimingj 4a043c1823 Add CMLE sample test script. (#724)
The test is not added to the list to run automatically yet since it takes about 25 min.
2019-01-22 20:44:34 -08:00
Riley Bauer d9665549ce Use "create" rather than "start" except when initiating a run (#650)
* Uses 'Create' for all actions that lead to creation flow, or result in a static object (experiment). 'Start' is used solely for initiating runs

* Update integration test
2019-01-09 11:56:45 -08:00
Ning f86fb2a677 output argo log in case of exception throw (#635) 2019-01-07 10:43:30 -08:00
Ning ea72316ac4 fix deploy model name conflict in case of concurrent notebook sample test (#636)
* fix deploy model name conflict in case of concurrent notebook sample test

* minor fix
2019-01-05 09:15:18 -08:00
qimingj 410f9b979f Update sample notebook to clean up deployed models. (#622)
* Update sample notebook to clean up deployed models.

Update SDK client to return correct links in local Jupyter with user's own proxy connection.

* Fix sample tests.
2019-01-04 13:07:30 -08:00
Ning 5abc1a4f59 Add sample test without image build (#578)
* add another sample test to test the current sample codes instead of using newly built component images

* rename sample test yamls

* use the v2 name

* bash bug

* tf-training bug fix

* output argo log in case of exceptions for tf-training sample

* disable gpu
2019-01-03 15:17:51 -08:00
Yasser Elsayed 3ef16462b6 Move backend unit tests to Travis (#589)
* try go version in travis

* add back old travis tests with backend tests

* remove backend unit tests prow config

* remove unit_test_gke

* test backend/src directory

* update comment to call out unit test
2019-01-02 15:00:09 -08:00