95 Commits

Author SHA1 Message Date
Ajay Gopinathan b97969f882 Fix retrying logic which was causing persistenceagent to crash loop. (#633) 2019-01-05 12:00:27 -08:00
Ajay Gopinathan eea6999b5b Add the Viewer CRD controller for managing web views such as Tensorboard instances from within the Pipelines UI. (#449)
* Add initial CRD types for Viewer resource, and generate corresponding
code.

* Use controller-runtime to scaffold out a controller main

* Start adding a deployment

* Clean up and separate reconciler logic into its own package for future testing.

* Clean up with comments

* Run dep ensure

* Update auto-generate script. Only need deepcopy funcs for viewer crd types

* Cleanup previously generated but unused viewer client code

* [WIP] Adding tests

* More tests

* Completed unit tests for reconciler with logic for max viewers

* Add CRD definition, sample instance and update README.

* Fix merge conflict

* Fix readme typo for kube and add direct port-forwarding instructions.

* Add tests for when persistent volume is used with Tensorboard viewer.

Also add a sample YAML to show how to mount and use a GCE persistent
disk in the viewer CRD.

* Remove vendor directory
2019-01-04 18:22:48 -08:00
Ajay Gopinathan 163545b370 Use Bazel to build the entire backend and perform API code generation (#609)
* Use Bazel to build the entire backend.

This also uses Bazel to generate code from the API definition in the
proto files.

The Makefile is replaced with a script that uses Bazel to first generate
the code, and then copy them back into the source tree.

Most of the BUILD files were generated automatically using Gazelle.

* Fix indentation in generate_api.sh

* Clean up WORKSPACE

* Add README for building/testing backend.

Also fix the missing licenses in the generated proto files.

* Add license to files under go_http_client
2019-01-04 17:17:20 -08:00
IronPan 808a83ca71 Load sample when pipeline initially started (#615)
* add system info table

* fix

* remove load sample job

* fix

* add logging

* add logging

* address comments

* address comments
2019-01-04 00:22:30 -08:00
Ajay Gopinathan 65d0f6a1a3 URLEncode instead of base64 encode the filter string (#620) 2019-01-03 17:01:26 -08:00
Ajay Gopinathan 8616398602 Encode filter parameter as a base64-encoded JSON string in List requests (#563)
* Make all ListXXX operations use POST instead of GET.

Generate new swagger definitions and use these to generate the frontend
APIs using `npm run apis`.

This is to support filtering in List requests, as the current
grpc-gateway swagger generator tool does not support repeated fields in
requests used in GET endpoints.

* Use base64-encoded JSON-stringified version of Filter instead.

This lets us keep filter as a simple parameter in the ListXXX requests,
and gets around having to use POST for List requests.

* refactor filter parsing to parseAPIFilter and add tests

* Hack to ensure correct Swagger definitions are generated for Filter.

* Fix merge conflicts with master after rebase

* fix indentation

* Fix hack so frontend apis compile.

* print failing experiments

* try print again.

* revert experiment_api_test

* Use StdEncoding for base64 encoding

* Fix nil pointer dereference error caused by err variable shadowing
2019-01-02 13:03:14 -08:00
IronPan e7af13263a switch from go dep to go module (#581)
* add vendor to gitignore

* switch to go module

* switch to go module

* switch to go module

* prune go mod

* prune go mod

* turn on go mod for test

* enable go module in docker image

* enable go module in docker image

* fix images

* debug

* debug

* debug

* update image
2018-12-21 16:28:21 -08:00
IronPan 66d238d464 fix list run (#583) 2018-12-21 12:21:59 -08:00
IronPan e182e37f47 retry on create table (#582) 2018-12-21 11:36:50 -08:00
Alexey Volkov c33c62f59f Backend - Removed hardcoded metrics file name (#574)
* Backend - Removed hardcoded metrics file name

* Addressed the PR feedback

* Addressed the PR feedback.
2018-12-20 21:26:29 -08:00
Yasser Elsayed ca22c29f02 allow runs with no experiments (#572) 2018-12-20 14:12:05 -08:00
Yasser Elsayed 549a366c39 Support archiving/unarchiving runs on the backend (#552)
* skip integration tests when unit test flag is set to true

* wip

* add StorageState enum to proto

* add StorageState to model

* archive proto/model changes

* wip archive endpoint

* wip adding tests

* archive test

* unarchive proto and implementation

* cleanup

* make storage state required, with a default value

* remove unspecified value from storage state enum

* pr comments

* pr comments

* fix archive/unarchive endpoints, add api integration test

* typo
2018-12-19 14:01:06 -08:00
Chen Zhiwei d798135435 update dockerfile and build steps (#562) 2018-12-18 11:42:34 -08:00
IronPan 330693416e add job to load sample (#509)
* add job to load sample

* debug

* add job to ks list

* update permission

* update permission
2018-12-17 12:50:08 -08:00
Yasser Elsayed ba261f368e Skip backend integration tests when cli flag isn't passed (#527)
* skip integration tests when unit test flag is set to true

* use cli arg to run integration tests

* use runIntegrationTests flag instead
2018-12-17 10:47:37 -08:00
IronPan ddab2fdd36 update test to specify name when create pipeline (#543) 2018-12-14 19:25:12 -08:00
IronPan 399fc584d0 fix persistence agent to use in cluster DNS instead of kube proxy to access API (#538)
* fix pa

* comment
2018-12-14 15:33:27 -08:00
Ajay Gopinathan 0c0d8a2b91 Add filtering ability for all backend API ListXXX requests (#537)
* WIP: Add filter package with tests.

* Add tests for IN predicate.

* Add listing functions

* Try updating list experiments

* Cleanup and finalize list API.

Add tests for list package, and let ExperimentStore use this new API.
Update tests for the latter as well.

* Add comments. BuildSQL -> AddToSelect for flexibility

* Run dep ensure

* Add filter proto to all other resources

* Add filtering for pipeline server

* Add filtering for job server

* Add filtering for run server

* Try to fix integration tests
2018-12-14 00:18:31 -08:00
IronPan 9eee2bbb47 move name (#498) 2018-12-07 16:41:15 -08:00
Ajay Gopinathan d3a30889bc Pin versions of libraries and tools required for proto generation. (#492)
This change pins the versions of the libraries that were used to
generate the proto definitions using dep. The Makefile is then modified
so that the tool and library versions used to build the proto generated
files are from the vendor directory. This is a hacky, short-term
solution to ensure a reproducible build while we work on switching to
bazel.

The versions in the Gopkg.toml file were chosen based on my experiments
that generated proto files that did not change from what is already
checked in.
2018-12-07 15:08:46 -08:00
Ajay Gopinathan 8a13287b09 Add Gopkg dependency for kubernetes code-generator. (#371)
The code generator should not be run from HEAD, as it will generate code
that diverges from the pinned version of client-go, and also any
previously generated CRD controller clients.

This change pins both code generator and client-go to the specified
kubernetes release, and ensures the update-codegen.sh script uses the
code-generator specified in the vendor directory rather than HEAD. This
ensures the build is always reproducible.
2018-11-30 10:41:20 -08:00
IronPan 9b77d4a8a6 Switching test to kubeflow deployment (#351)
* test

* fix

* fix

* fix

* fix

* fix

* update

* cleanup

* fix

* copy test

* chmod

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* update

* fix

* fix

* fix

* fix

* fix

* fix

* fix sample test

* fix

* fix

* merge

* update image builder image

* update script

* mount permission
2018-11-28 21:36:12 -08:00
Alexey Volkov 2a3ec15993 Backend - Fixed handling of sample compilation failure (#387)
* Backend - Fixed handling of sample compilation failure

* Update Dockerfile

* Fixed the pipeline enumeration with "while read" loop.
2018-11-27 21:33:25 -08:00
Yasser Elsayed b528f50686 add finish timestamp to API interceptor (#386) 2018-11-27 11:39:52 -08:00
qimingj 0b7120c322 Now pipeline function takes direct default values rather than dsp.PipelineParam. (#110)
* Now pipeline function takes direct default values rather than dsp.PipelineParam. It simplifies the sample code a lot.

* Remove extraneous parenthesis.

* Follow up CR comments.

* Change Dockerfile (not done).

* Fix dockerfile.

* Fix Dockerfile again.

* Remove unneeded installation of packages in Dockerfile.
2018-11-26 17:13:55 -08:00
Yang Pan b494e3daa0 Add integration tests for API servers (#112)
* add pipeline e2e test back

* delete samples

* more changes to pipeline test

* add delete run/experiment

* update test name

* stage

* add api

* fii

* add tensorboard routing rule (#143)

* add tensorboard routing rule

* rename tb routing rule

* address comments

* remove debugger

* fix url prefix

* more fixes

* update tests

* revert debugging

* update

* update test

* fix

* update test

* fix

* fix pipeline tests

* exp

* update run

* update jobs
2018-11-09 20:39:15 -08:00
Pascal Vicaire a44b4dd81a Saving progress. 2018-11-08 18:31:14 -08:00
Pascal Vicaire 26e96e2836 Saving progress. 2018-11-08 18:14:46 -08:00
Pascal Vicaire 26c929741d Saving progress. 2018-11-07 20:54:54 -08:00
Pascal Vicaire c03add4f28 Fixing imported json library in the CLI integration tests. 2018-11-07 14:12:26 -08:00
Pascal Vicaire b04883fbd5 Addressing code review comments. 2018-11-07 12:22:23 -08:00
Pascal Vicaire 7a18430ec5 Fix typo 2018-11-06 22:10:24 -08:00
Pascal Vicaire 425403ae83 Adding integration tests for the CLI commands related to pipelines. 2018-11-06 22:03:33 -08:00
Pascal Vicaire 7753bc04e5 First integration test for the ML Pipeline CLI (Pipeline List). (#81)
* First integration test for the ML Pipeline CLI (Pipeline List).

* Fixing an issue with an undefined variable

* Adding the --debug flag to help with debugging.

* Changing the namespace to Kubeflow.
2018-11-06 16:56:23 -08:00
Yang Pan 7847b74778 Compile samples instead of hard code them in API server (#76)
* compile samples

* update logging

* update description

* update sample

* add immediate value sample

* revert

* fail fast if the samples fail to load

* comment

* address comments

* comment out

* update command

* comments
2018-11-06 15:08:28 -08:00
Yang Pan fba58cdb43 Fix validation check for maximum size limit (#104) 2018-11-06 13:11:06 -08:00
Yang Pan 315677c5a8 sort by run display name by default (#96) 2018-11-06 12:34:38 -08:00
Yang Pan c514b08703 fix miscellaneous List API issue (#90)
* fix

* Update job_store.go

* Update run_store.go

* Update presubmit-tests.sh

* Update job_store.go
2018-11-06 11:32:59 -08:00
Pascal Vicaire 3df28a9700 Updating OWNERS files. Adding per-subdirectory OWNER files. 2018-11-05 14:03:33 -08:00
Yang Pan e5d5e8c0e6 Propagate name for runs from scheduled job (#33) 2018-11-04 12:55:14 -08:00
Pascal Vicaire 9ab085ab41 Merge remote-tracking branch 'origin/master' into vicaire/updateImageVersions 2018-11-02 18:47:06 -07:00
Pascal Vicaire 67b011f228 Upgrading the container versions to 0.0.42, the version of the first release of kubeflow/pipelines. 2018-11-02 16:57:37 -07:00
Pascal Vicaire d18f37785d Updating references to the project repository to kubeflow/pipelines. 2018-11-02 15:06:54 -07:00
Pascal Vicaire 1bab424abf Fixing the Go import paths to reference the kubeflow/pipelines repository. 2018-11-02 14:53:42 -07:00
Pascal Vicaire 633e2ddcc8 Initial commit of the kubeflow/pipeline project. 2018-11-02 14:02:31 -07:00