* Initial execution cache
This commit adds the initial execution cache service, including the HTTP
service and execution key generation.
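The execution key generation mentioned above can be sketched as hashing the deterministic parts of a step's container spec. The function and field names below are illustrative assumptions, not the service's actual implementation (which lives in the Go source):

```python
import hashlib
import json

def execution_cache_key(container_spec: dict) -> str:
    # Serialize the spec canonically (sorted keys, no whitespace) so that
    # semantically identical specs always hash to the same key.
    canonical = json.dumps(container_spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

spec = {"image": "python:3.7", "command": ["echo"], "args": ["hello"]}
print(execution_cache_key(spec))
```

Because the serialization sorts keys, two specs that differ only in field order produce the same cache key, while any change to image, command, or args produces a different one.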
* Add initial server logic
* Add const
* Change folder name
* Change execution key name
* Fix unit test
* Add Dockerfile and OWNERS file
This commit adds a Dockerfile for building the source code and an OWNERS
file for easier review. It also renames some functions.
* fix go.sum
This commit fixes the go.sum changes.
* Add local deployment scripts
This commit adds local deployment scripts which can deploy cache service
to an existing cluster with KFP installed.
* refactor src code
* Add standalone deployment scripts and yamls
This commit adds execution cache deployment scripts and YAML files for
the KFP standalone deployment, including a deployer that generates the
certificate, the MutatingWebhookConfiguration, and the execution cache
deployment.
* Minor fix
* Add execution cache image build in test folder
* fix test cloudbuild
* Fix cloudbuild
* Add execution cache deployer image to test folder
* Add copyright
* Fix deployer build
* Add license for execution cache and cloudbuild for execution cache
images
This commit adds licenses for the execution cache source code and adds
Cloud Build steps for building the cache and cache deployer images. It
also changes the manifest names to match the changed images.
* Refactor license intermediate data
* Fix execution cache image manifest
* Typo fix for cache and cache deployer images
* Add arguments in ca generation scripts and change deployer base image to google/cloud
* minor fix
* fix arg
* Mirror source code with MPL in execution_cache image
* Minor fix
* minor refactor on error handling
* Refactor cache source code, Docker image and manifest
* Fix variable names
* Add images in .release.cloudbuild.yaml
* Change execution_cache to generic name
* revise README
* Move deployer job out of upgrade script
* fix tests
* fix tests
* Separate cache service and cache deployer job
* MySQL setup
* Delete cache service in manifest, only test in presubmit tests
* fix
* fix presubmit tests
* fix
* fix
* revert unnecessary change
* fix cache image tag
* change image gcr to ml-pipeline-test
* Remove namespace in standalone manifest and add to test manifest
* Metadata writer
* Added sleeper-based metadata writer
* Sleeper
* First working draft
* Added properties to Executions, Artifacts, and Contexts
Also added attributions.
context_id is now stored as a label.
* Prefix the execution type names
* Ignoring TFX pods
* Fixed the deployment container spec
* Cleaned up the file and added deployment spec
* Added the Kubernetes deployment
* Added startup logging
* Made python output unbuffered
* Fixed None exception
* Formatting exceptions
* Prefixing the log message
* Improved handling non-S3 artifacts
* Logging input artifacts
* Extracted code to the link_execution_to_input_artifact function
* Setting execution's pipeline_name to workflow name
* Adding annotation with input artifact IDs
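Recording input artifact IDs on a pod can be sketched as building an annotation payload like the one below; the annotation key is a hypothetical name for illustration, not necessarily the one Metadata Writer actually uses:

```python
import json

def input_artifact_annotation(artifact_ids: list) -> dict:
    # Hypothetical annotation key; the real Metadata Writer key may differ.
    # Kubernetes annotation values must be strings, so the ID list is
    # JSON-encoded (sorted for stable output).
    return {"example.kubeflow.org/input-artifact-ids": json.dumps(sorted(artifact_ids))}

print(input_artifact_annotation([12, 7, 33]))
```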
* Running infinitely
* Added component version to execution type name
* Marking metadata as written even for failed pods
* Cleaned up some comments
* Do not fail when upstream artifact is missing
* Change the completion detection logic
Waiting for Argo's "completed=true" label instead of Kubernetes' "phase: Completed" introduced delays that led to problems with missing input artifacts.
This change allows us to log the output artifacts earlier.
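The two completion signals compared above can be sketched as follows. The field and label names follow standard Kubernetes and Argo conventions; the helpers themselves are illustrative, not Metadata Writer's actual code:

```python
def pod_phase_completed(pod: dict) -> bool:
    # Kubernetes reports a terminal phase as soon as all containers exit.
    return pod.get("status", {}).get("phase") in ("Succeeded", "Failed")

def argo_label_completed(pod: dict) -> bool:
    # Argo adds this label later, after post-processing the pod, which is
    # why waiting on it delayed artifact logging.
    labels = pod.get("metadata", {}).get("labels", {})
    return labels.get("workflows.argoproj.io/completed") == "true"

pod = {"status": {"phase": "Succeeded"}, "metadata": {"labels": {}}}
print(pod_phase_completed(pod), argo_label_completed(pod))
```

For a just-finished pod, the phase check fires first while the Argo label is still absent, which is the gap the commit exploits.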
* Added Dockerfile
* Added release deployment manifest
* Added OWNERS
* Switching to using MLMD service instead of direct DB access
* Adding licenses to the image
* Pinned Python's minor version
* Moved code to /backend/metadata_writer
Moved manifest to /manifests
* Added image building to CloudBuild
* Added Metadata Writer to release CloudBuild
* Added Metadata Writer to test scripts
* Finished the kustomization manifests
* Added Metadata Writer to marketplace manifests
* Added ServiceAccount, Role and RoleBinding for MW
* Fixed merge conflict
* Removed the debug deployment
* Forgot to add the chart templates for the SA and roles
* Specified the service account
* Switched to watching a single namespace
* Resolved feedback
Removed dev deployment comment from python code.
Added license.
Fixed the range of kubernetes package versions.
* More review fixes
* Extracted the metadata helper functions
* Improved the error message when context type is unexpected
* Fixed the import
* Checking the connection to MLMD
The latest tests started to have connection problems - "failed to connect to all addresses" and "Failed to pick subchannel".
* Improved the MLMD connection error logging
* Try creating MLMD client on each retry and using a different request
* Changed the MLMD connection check request
All get requests fail when the DB is empty, so we have to use a put request.
See https://github.com/google/ml-metadata/issues/28
* Using unbuffered IO to improve the logging latency
* Changed the URI schema for the artifacts
* Cleanup
* Simplified the kubernetes config loading code
* Resolving the feedback
* Created visualization_api_test.go
* Updated BUILD.bazel files
* Removed clean_up from e2e test
* Revert "Removed clean_up from e2e test"
This reverts commit 82fd4f5a00.
* Update e2e tests to build visualizationserver and viewer-crd
* Fix bug where wrong image is set
* Fixed incorrect image names
* Fixed additional instance of incorrect image names
* Refactor presubmit-tests-with-pipeline-deployment.sh so that it can be run from a different project
* Simplify getting service account from cluster.
* Migrate presubmit-tests-with-pipeline-deployment.sh to use the KFP
lightweight deployment.
* Add option to cache built images to make debugging faster.
* Fix cluster set up
* Copy image builder image instead of granting permission
* Add missed yes command
* fix stuff
* Let other usages of image-builder image become configurable
* let test workflow use image builder image
* Fix permission issue
* Hide irrelevant error logs
* Use shared service account key instead
* Move test manifest to test folder
* Move build-images.sh to a different script file
* Update README.md
* add cluster info dump
* Use the same cluster resources as kubeflow deployment
* Remove cluster info dump
* Add timing to test log
* cleaned up code
* fix tests
* address cr comments
* Address cr comments
* Enable image caching to improve retest speed