* Change enqueue base delay in non-error mode to 1 second
* added Terminating constants
* used terminating constants in store layer
* modified comments
* initial work
* small fixes
* updated tests and how parameters are set
* try to fix test
* check without adding missing test
* fixed small typo
* test changes
* updated config
* typo
* updated after feedback
* fixed pointer error
* test to add parameter
* moved to init so removed not needed code
* updated further
* updated tests to also check endtime
* clean up test
* fixed failing test
* fixed the expected test results
* added timezone examples
* further clean up
* fixed time format
* Update params.env
* moved location to cronjobscheduler
* clean up
* set env variable to empty
* reverted back
* changed magic number to a constant
* updated the tests with comment
* added comments on cron expressions
* update naming and return types
* updated to UTC as default
* updated with an alpha notice
* modified validation logic of run and job
* fixed resource manager logic when creating job and run
* removed unused methods, changed to nested if else
* fixed nits
* fixed nits
* fixed nits
* fix(backend): job api -- deletion should succeed when swf not found
* bug reproducing unit test
* fix the bug and pass reproducing unit test
* reproducing integration test
* fix integration test
* clarify error message
* disable job should also succeed, unify term to CR instead of CRD
* fix unit test error
* fix error message
* improve logging
* Set current namespace in local KFP context if running from notebook
* Create "~/.config/kfp/" instead of ".config/kfp/"
At first it was assumed the `get_user_namespace` command would be executed from the home directory.
* Create local context file if it doesn't exist during set_user_namespace
* Grab path from LOCAL_KFP_CONTEXT when creating folder
Instead of hardcoding the os.makedirs path to `~/.config/kfp`, it now grabs it from LOCAL_KFP_CONTEXT. Also removed path creation in `get_user_namespace`, as that is now handled in `set_user_namespace`. It now also checks whether the path exists rather than the local_context_file, to avoid trying to create `~/.config/kfp/` when the path exists but context.json does not.
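The path handling described above can be sketched as follows. This is an illustrative reconstruction, not the SDK's actual implementation; the `LOCAL_KFP_CONTEXT` constant name comes from the commit messages, while the function body and JSON layout are assumptions.

```python
import json
import os

# Path of the local KFP context file; the constant name LOCAL_KFP_CONTEXT
# is taken from the commit messages, the default value is illustrative.
LOCAL_KFP_CONTEXT = os.path.expanduser("~/.config/kfp/context.json")

def set_user_namespace(namespace, context_file=LOCAL_KFP_CONTEXT):
    """Persist the namespace, creating the config directory if needed."""
    config_dir = os.path.dirname(context_file)
    # Check the directory, not the file: the context.json may be missing
    # while ~/.config/kfp/ already exists.
    if not os.path.exists(config_dir):
        os.makedirs(config_dir)
    context = {}
    if os.path.exists(context_file):
        with open(context_file) as f:
            context = json.load(f)
    context["namespace"] = namespace
    with open(context_file, "w") as f:
        json.dump(context, f)
```

Expanding the user's home directory (`~`) before creating the folder is the fix for the "`.config/kfp/` instead of `~/.config/kfp/`" bug noted above.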
* add multi-user setting to healthz api
* Add http prefix to health api url
* move healthz api call to own function and fix multi_user boolean
* Fix HEALTH_PATH declaration
* Move check to Client __init__ and change get_kfp_healthz to avoid breaking in case of old apiserver image
* Add multi_user to frontend healthz
* Expose multi_user in frontend and add integration test
* Fix integration test
* Fix host hardcoding and error handling
* Handle empty API response, check if API up to date
* Fix response return
* remove API check due to empty response
* retry API call if first response empty
* retry getting healthz api if no response
* change health_api to https
The healthz_api has been returning empty responses, which might be caused by sending an http request to an https endpoint. Although `requests` handles redirects, this commit tests whether that solves the issue.
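The empty-response retry loop added in the commits above can be sketched like this. The function name mirrors the commit messages; the retry parameters and the injected `fetch` callable are illustrative, not the client's real signature.

```python
import time

def get_kfp_healthz(fetch, max_attempts=5, sleep_seconds=1):
    """Retry the healthz call while the response is empty.

    `fetch` is any zero-argument callable returning the parsed healthz
    response, or None/empty when the API answered with an empty body.
    """
    for attempt in range(max_attempts):
        response = fetch()
        if response:
            return response
        if attempt < max_attempts - 1:
            time.sleep(sleep_seconds)
    # Mirrors the "TimeoutError for retried healthz api" commit.
    raise TimeoutError("healthz endpoint returned no response")
```

Injecting `fetch` keeps the retry policy testable without a live API server, which is how an empty first response followed by a valid one can be simulated.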
* Add some debug info to healthz exception
* add url to debug and lower retries to 1
* Use api_client to get healthz data
* Debug info for API response
* Follow API redirect history
* Fix indentation
* Add healthz proto
* Try getting healthz api with new python backend
* Add installation of kfp_server_api in tests
* Fix incorrect setup location
* Replace old .get with new http backend .multi_user
* Code clean up
* Small fixes and TimeoutError for retried healthz api
* Remove changes to go dependencies
* Send empty proto request and fix exception client
* Remove unused commit_sha and tag_name
* [Backend] Return proper error codes for failures during auth
* [Backend] Implement helpers to initialize a SubjectAccessReview client
In preparation of SubjectAccessReview, we implement some helpers to
create a new Kubernetes Authorization clientset and return the
SubjectAccessReview client.
We also define some fake clients to be used by future tests.
* [Backend] Introduce RBAC-related constants
In preparation of SubjectAccessReview, introduce RBAC groups, resources,
and verbs.
* [Backend] Extend managers with a SubjectAccessReviewClient
* [Backend] Refactor the authorization mechanism for requests
Authorization should be based on performing some action on a resource
living in a namespace. This commit refactors the authorization utilities
to reflect this and perform SubjectAccessReview.
This commit also deletes some tests based on old authn/authz mechanism.
A following commit will fix/extend the tests for the new mechanism.
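The backend here is Go, but the shape of the refactored check — authorize a user to perform a verb on a resource in a namespace, via an injectable SubjectAccessReview client — can be sketched language-neutrally. The fake client below is analogous to the fake clients the commits add for tests; all names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class ResourceAttributes:
    """The richer attribute set passed for authorization (illustrative)."""
    namespace: str
    verb: str       # e.g. "get", "list", "create", "delete"
    group: str      # e.g. an API group such as "pipelines.kubeflow.org"
    resource: str   # e.g. "runs", "jobs"

class FakeSubjectAccessReviewClient:
    """Stand-in for a Kubernetes SubjectAccessReview client, in the
    spirit of the fake clients defined for future tests."""
    def __init__(self, allowed_rules):
        # set of (user, namespace, verb, resource) tuples that are allowed
        self.allowed_rules = allowed_rules

    def create(self, user, attrs):
        return (user, attrs.namespace, attrs.verb, attrs.resource) in self.allowed_rules

def is_authorized(sar_client, user, attrs):
    """Authorize `user` to perform attrs.verb on attrs.resource in attrs.namespace."""
    return sar_client.create(user, attrs)
```

The point of the refactor is visible in the signature: instead of a bare namespace (as with KFAM), every call site supplies the full resource attributes, so RBAC policy can distinguish verbs and resource types.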
* [Backend] Adjust endpoints to pass resource attributes for authz
With KFAM authorization, we passed only the namespace attribute for
authorization. With SubjectAccessReview, we need a richer list of
attributes. Thus, we adjust endpoints to pass request details (resource
attributes) necessary for authorizing the request. We only change the
already authorized endpoints, not introducing any new checks.
* [Backend] Adjust apiserver/server tests to SubjectAccessReview
* [Backend] Purge KFAM
Since we no longer use KFAM, we may as well purge it
* [Backend] Update BUILD files
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* [Manifests] Extend manifests for SubjectAccessReview
* API Server: Allow creating SubjectAccessReviews
* Add view/edit roles in a multi-user kustomization
* New server API: read run log
- The new server API endpoint (/apis/v1beta1/runs/{run_id}/nodes/{node_id}/log) to fetch run log
- `ARCHIVE_LOG_FILE_NAME` and `ARCHIVE_LOG_PATH_PREFIX` options allow controlling the archive log path
- UI Server fetches logs from server API or directly from k8s depending on `STREAM_LOGS_FROM_SERVER_API` option
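The endpoint path and the option-driven log-source switch described above can be sketched as follows. The endpoint path and the `STREAM_LOGS_FROM_SERVER_API` option name come from the commits; the env-var parsing details are assumptions.

```python
import os

def stream_logs_from_server_api():
    """Read the STREAM_LOGS_FROM_SERVER_API option (named in the commit);
    how the option is parsed here is illustrative."""
    return os.environ.get("STREAM_LOGS_FROM_SERVER_API", "false").lower() == "true"

def run_log_url(run_id, node_id):
    """Build the new read-run-log endpoint path from the commit message."""
    return f"/apis/v1beta1/runs/{run_id}/nodes/{node_id}/log"
```

With the option off, the UI server keeps reading pod logs directly from Kubernetes; with it on, it proxies through the server API (which can also serve archived logs).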
* New server API: read run log
- ml-pipeline rbac update: allow for access to log
* Read run log: enhanced error handling
- log message on Pod access errors
* Read run log: enhanced log archive options
* Code format
* Test update after getPodLogs signature change
* Updated comments after review
* `follow` query parameter in GET /apis/v1beta1/runs/{run_id}/nodes/{node_id}/log
* Env variable friendly config names & comments
- Config options: ARCHIVE_CONFIG_LOG_FILE_NAME, ARCHIVE_CONFIG_LOG_PATH_PREFIX
- Copyright message update
- New endpoint as `v1alpha1`
* Licence updates
- fluent-bit licence inlined
- copyright message updates
* Master merge
- dependency conflicts
* simplified test
* Updated and refactored the tests further
* fixed error in search-and-replace
* test cleanup
* fixed line length
* updated the naming
* updated tests
* removed parameter that is not needed
Previously, Metadata Writer could only store input artifacts, but could not store input parameter arguments (since they were not available).
The SDK can now preserve parameter arguments in Argo template annotation.
The commit makes Metadata Writer extract information from that annotation and record it to MLMD.
Fixes https://github.com/kubeflow/pipelines/issues/4556
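Reading parameter arguments out of the Argo template annotation, as described above, can be sketched like this. The annotation key shown is an assumption for illustration; the real key is whatever the SDK compiler writes onto the template.

```python
import json

# Assumed annotation key for illustration only; the actual key is set by
# the SDK compiler on the Argo template.
PARAMETER_ARGUMENTS_ANNOTATION = "pipelines.kubeflow.org/arguments.parameters"

def extract_parameter_arguments(pod_metadata):
    """Return the input parameter arguments recorded in the annotation,
    or {} when the annotation is absent (e.g. older SDK versions)."""
    annotations = pod_metadata.get("annotations", {})
    raw = annotations.get(PARAMETER_ARGUMENTS_ANNOTATION)
    if not raw:
        return {}
    return json.loads(raw)
```

Returning `{}` for missing annotations keeps Metadata Writer backward compatible with pipelines compiled before the SDK started preserving parameter arguments.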
* updated the version
* updated the serializer
* fixed test
* fixed some more changes
* tested to update versions of k8s packages
* reverted package update
* change in API
* fixed dependencies, need to fix broken tests now
* updated fake client and fixed test due to updates in timestamp.Timestamp
* forgot to update fake client pod
* fixed issue in controller
* tested to update
* updated
* updated controller viewer
* updates to fix go mod vendor
* Updated the client
* updated the golang versions
* missed one docker file update, from 1.11 -> 1.13
* testing to fix persistence agent issues
* Updated after feedback
Co-authored-by: Niklas hansson <niklashansson@Niklass-MacBook-Pro.local>
* update to fetch remote
* forgot to add the description
* fixed merge conflict
* initial work
* fixed test and bug
* updated python client
* clean up
* clean up
* added config default
* fixed bug in API
* moved config value
* reverted to load from config
* clean up
* Update _client.py
* removed unnecessary function and updated after feedback
* forgot to save pipeline.proto
* updated the last parts after feedback
* reverted back to use string and env variable
* updated typo
* fix typo in path
* clean up
* removed option in api
* clean up python part
* typo, can't run test locally
* clean up, problems with local env
* clean up missing differences
* reverted proto files
* further clean up
* clean up
* updated after feedback
* Added tests
* error in my defer statement
* Updated the test
* initial work on exposing the default version of pipeline
* update description
* added missing files
* updated api build, unsure if this is correct ...
* updated after feedback
* clean up
* remove empty line
* started to make the integration test
* added integration test
* fixed build and feedback
* updated the tests
* Updated the test
* new test
* typo
* updated the pipeline default
* updated the pipeline version
* formatting
* error in comparison
* Cache deployer - Using the same kubectl version as the server
Fixes https://github.com/kubeflow/pipelines/issues/4505
* Changed the PATH precedence
* Unquoted the jq output
* Fixed the curl options
* fix(backend): workflow not found error should be permanent
* failing test case
* Fix logic
* fix another case
* Switched to not found error
* not found error should be permanent
* improve license.sh logging
* build: remove our own scripts to comply with pypi package licenses
* Remove unneeded packages when we do not need to handle licensing ourselves
Fixes #3584.
For clusters with existing native Argo Workflows, ml-pipeline logs were polluted
with unnecessary stack traces due to the "missing Run ID label" situation.
Made persistenceagent skip the workflow if it misses the Run ID label, and
added workflow name to previous error message in apiserver side.
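The skip logic described above boils down to a label check before persisting. A minimal sketch, assuming a dict-shaped workflow object; the label key shown is an assumption, the real key is whatever KFP stamps on the workflows it creates.

```python
# Assumed label key for illustration; KFP uses its own Run ID label.
RUN_ID_LABEL = "pipeline/runid"

def should_persist(workflow):
    """Skip workflows that are missing the Run ID label (i.e. native Argo
    workflows not created by KFP) instead of reporting them as errors."""
    labels = workflow.get("metadata", {}).get("labels", {})
    return RUN_ID_LABEL in labels
```

The persistence agent calls a check like this per workflow event; returning `False` means "silently ignore" rather than "retry and log a stack trace".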
* Backend - Caching - Only send cache-enabled pods to the caching webhook
The caching webhook already checks whether the pod is cache-enabled, but this change makes the check happen sooner - even before calling the webhook.
This way the webhook cannot possibly affect any non-KFP pods.
This feature requires API v1 and Kubernetes v1.15, so we use it conditionally.
* Support filtering on Kubernetes v1.15 as well
Fixes #4389 (partially).
When the workflow manifest file is deleted from s3 due to the retention policy, we were
getting this segmentation fault in the next createRun attempt for that pipeline:
```
I0831 06:36:53.916141 1 interceptor.go:29] /api.RunService/CreateRun handler starting
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x148 pc=0x156e140]
goroutine 183 [running]:
github.com/kubeflow/pipelines/backend/src/common/util.(*Workflow).VerifyParameters(0xc000010610, 0xc00036b6b0, 0x0, 0xc00036b6b0)
backend/src/common/util/workflow.go:66 +0x90
github.com/kubeflow/pipelines/backend/src/apiserver/resource.(*ResourceManager).CreateRun(0xc00088b5e0, 0xc00088b880, 0xc0009c3c50, 0xc000010450, 0x1)
backend/src/apiserver/resource/resource_manager.go:326 +0x27c
github.com/kubeflow/pipelines/backend/src/apiserver/server.(*RunServer).CreateRun(0xc0000b8718, 0x1e7bc20, 0xc0009c3c50, 0xc0009c3c80, 0xc0000b8718, 0x2ddc6e9, 0xc00014e070)
backend/src/apiserver/server/run_server.go:43 +0xce
github.com/kubeflow/pipelines/backend/api/go_client._RunService_CreateRun_Handler.func1(0x1e7bc20, 0xc0009c3c50, 0x1aa80e0, 0xc0009c3c80, 0xc0008cbb40, 0x1, 0x1, 0x7f9e4d6466d0)
bazel-out/k8-opt/bin/backend/api/linux_amd64_stripped/go_client_go_proto%/github.com/kubeflow/pipelines/backend/api/go_client/run.pb.go:1399 +0x86
main.apiServerInterceptor(0x1e7bc20, 0xc0009c3c50, 0x1aa80e0, 0xc0009c3c80, 0xc000778ca0, 0xc000778cc0, 0xc0004dcbd0, 0x4e7bba, 0x1a98e00, 0xc0009c3c50)
backend/src/apiserver/interceptor.go:30 +0xf8
github.com/kubeflow/pipelines/backend/api/go_client._RunService_CreateRun_Handler(0x1ac4a20, 0xc0000b8718, 0x1e7bc20, 0xc0009c3c50, 0xc0009c6e40, 0x1c6bd70, 0x1e7bc20, 0xc0009c3c50, 0xc0004321c0, 0x66)
bazel-out/k8-opt/bin/backend/api/linux_amd64_stripped/go_client_go_proto%/github.com/kubeflow/pipelines/backend/api/go_client/run.pb.go:1401 +0x158
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00064eb00, 0x1ea2840, 0xc00061cd80, 0xc00046c700, 0xc00071ab70, 0x2e14040, 0x0, 0x0, 0x0)
external/org_golang_google_grpc/server.go:995 +0x466
google.golang.org/grpc.(*Server).handleStream(0xc00064eb00, 0x1ea2840, 0xc00061cd80, 0xc00046c700, 0x0)
external/org_golang_google_grpc/server.go:1275 +0xda6
google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0004e9084, 0xc00064eb00, 0x1ea2840, 0xc00061cd80, 0xc00046c700)
external/org_golang_google_grpc/server.go:710 +0x9f
created by google.golang.org/grpc.(*Server).serveStreams.func1
external/org_golang_google_grpc/server.go:708 +0xa1
```
It was the same in CreateJob calls.
The scenario described in #4389 also seems to cause the same issue.
With this PR, we aim not to have the segmentation fault at least, because in
our case it's expected that manifest files will be deleted after some time due
to the retention policy.
Other problems about right pipeline version picking described in issue #4389
still need to be addressed.
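The fix amounts to guarding against a missing workflow manifest before calling `VerifyParameters`, instead of dereferencing a nil pointer. The backend is Go; this is a Python sketch of the guard, with all names and the manifest layout being illustrative assumptions.

```python
class InvalidInputError(Exception):
    """Stand-in for the API server's user-facing error type."""

def verify_parameters(manifest, parameters):
    """Check that every supplied parameter is declared by the workflow."""
    declared = {
        p["name"]
        for p in manifest.get("spec", {}).get("arguments", {}).get("parameters", [])
    }
    unknown = [name for name in parameters if name not in declared]
    if unknown:
        raise InvalidInputError(f"unknown parameters: {unknown}")
    return True

def create_run(workflow_manifest, parameters):
    """Fail with a clear error when the manifest is missing (e.g. deleted
    from object storage by a retention policy) instead of crashing."""
    if workflow_manifest is None:
        raise InvalidInputError(
            "workflow manifest not found; it may have been removed by a "
            "retention policy")
    return verify_parameters(workflow_manifest, parameters)
```

The guard turns the SIGSEGV in the stack trace above into a well-formed API error that CreateRun (and CreateJob) can return to the caller.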
Reduced the timeout from 30 seconds to 5.
This should not be needed, as most users tell us that pods work even when the cache service is unavailable. But at least one customer experienced timeout failures when creating pods after the service was deleted but the webhook config was not.
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* switch from bazel build to go build
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* prometheus configs; basic metrics in pipeline server to collect
prometheus metrics
* make version consistent
* check if we gc workflows
* add prom deps
* upload counts
* remove non-code changes
* more metrics
* upload server metrics guarded by flag
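Guarding metric collection behind a flag, as in the commit above, can be sketched with a minimal counter. The backend uses the Go Prometheus client; the hand-rolled `Counter` class and all names here are illustrative stand-ins.

```python
class Counter:
    """Minimal stand-in for a Prometheus counter (illustrative)."""
    def __init__(self, name):
        self.name = name
        self.value = 0

    def inc(self):
        self.value += 1

# Assumed metric name for illustration.
UPLOAD_COUNT = Counter("pipeline_upload_count")

def record_upload(metrics_enabled):
    """Increment the upload counter only when the metrics flag is on."""
    if metrics_enabled:
        UPLOAD_COUNT.inc()
```

Keeping the flag check at the recording site means deployments that have not opted in pay no cost and expose no extra series.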
* todo for a flag
* fix test
* fix tests
* fix tests
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* add a quick guide on how to generate api reference from kfp api definition
* remove trailing lines
* Backend - Cache - Fixed reinstallation by adding missing roles
* Stop ignoring the deletion errors
* Added patch permission as well
It should not be triggered, but might be useful in the future.
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* sorting by run metrics is different from sorting by name, uuid, created at, etc. The latter are direct fields in the listable object; the former is an element in an array-typed field of the listable object. In other words, the latter are columns in the table; the former is not.
* unit test: add sorting on metrics with both asc and desc order
* GetFieldValue in all models
* fix unit test
* flag for whether to test in list_test; it's hacky to check mode == 'run'
* move model-specific code to the model; prevent the model package from depending on the list package; let the list package depend on the model package; marshal/unmarshal the listable interface; include the listable interface in the token.
* some assumptions on the token's Model field
* fix the regular field checking logic
* add comment to help devs to use the new field
* add a validation check
* The Listable object can be too large to be in the token, so replace it with only the
relevant fields taken out of it. In the future, if more fields of the
Listable object become relevant, manually add them to the token
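The token change described above — carry only the fields needed to resume the listing, not the whole Listable object — can be sketched as a small encode/decode pair. The payload field names and base64-of-JSON encoding are assumptions for illustration.

```python
import base64
import json

def encode_page_token(sort_by_value, key_value):
    """Put only the fields needed to resume listing into the token:
    the value of the sort-by field and the primary-key value."""
    payload = {"sort_by_value": sort_by_value, "key_value": key_value}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()

def decode_page_token(token):
    """Unmarshal the token back into its resume fields."""
    return json.loads(base64.urlsafe_b64decode(token.encode()).decode())
```

This keeps tokens small and opaque while still letting the next page's query seek past the last returned row, even when sorting by a run metric rather than a table column.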
* matches func update
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* sorting by run metrics is different from sorting by name, uuid, created at, etc. The latter are direct fields in the listable object; the former is an element in an array-typed field of the listable object. In other words, the latter are columns in the table; the former is not.
* unit test: add sorting on metrics with both asc and desc order
* list is generic. model specific test is put to run_store_test.go
* Metadata Writer - Fixed regression with artifact type retrieval
The DSL compiler has changed the output name sanitization rules, so we should change them here accordingly.
* Added the code link
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* metrics as the outermost
* columns swap
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* pin google-api-core to 1.16.0 until it gets a newer release than 1.21.0
* add comments
* Backend - Only compiling the preloaded samples
Fixes https://github.com/kubeflow/pipelines/issues/4117
* Fixed the paths
* Removed -o pipefail for now since sh does not support it
* Fixed the quotes
* Removed the __future__ imports
Python 2 is no longer supported.
The annotations cause compilation problems:
```
File "/samples/core/iris/iris.py", line 18
from __future__ import absolute_import
^
SyntaxError: from __future__ imports must occur at the beginning of the file
```
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* tfx 0.21.2 -> 0.22.2
* tfx 0.20.2 -> 0.22.0
* update requirements.txt
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* add filter for listing versions
* add another filter in test
* comment revision
* reduce ttl of persisted final workflow to 1 day
* add comment
* enable pagination when expanding experiment in both the home page and the archive page
* Revert "enable pagination when expanding experiment in both the home page and the archive page"
This reverts commit 5b672739dd.
* Address comments