* fix(sdk): compile ParallelFor in a deterministic manner
During compilation, ParallelFor components end up with randomized names,
which makes it very inconvenient to compare two versions of a pipeline.
This commit fixes this issue.
* fix(sdk): fix new parallel-for test cases
* add placeholder to spec
* add output_directory to pipeline
* respect uri placeholder in file outputs
* wip: add data passing rewriting logic to respect the uri semantics
* merge input_uri and paths when instantiating ContainerOp
* fix
* fix workflow rewriting
* Add topology rewriting
* add a test case, and various fixes
* make the test case more complex
* Fix the case when working with OpsGroup
* Fix test case
* fix resolving test
* fix redundant cmd lines
* fix redundant cmd lines
* resolve comments
* fix file outputs
* resolve comments
* copy file outputs instead of modifying in place.
When calling the delete() method of a ResourceOp we need to ensure we do
not wait for its deletion.
The reason for this is described in [1]: if a pipeline creates a
resource which is consumed by its steps (e.g., a PVC), the step
deleting the resource will hang waiting for the Kubernetes resource
deletion, which, in turn, waits for the other steps to be deleted.
As a result, the pipeline never finishes.
This commit allows specifying flags for the ResourceOp kubectl commands
and defaults to the '--wait=false' flag for the deletion.
Specifying flags for a ResourceTemplate is not supported in Argo v2.7,
which we currently deploy, but it will be once we upgrade to v2.11+
[2]. This does not affect the delete() method because we don't rely on
Argo's ResourceTemplate for it.
[1] https://github.com/kubeflow/pipelines/issues/4506
[2] https://github.com/kubeflow/pipelines/issues/4553
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
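For illustration, a minimal sketch of the intended usage under the description above (pipeline, step, and resource names are hypothetical):
```python
import kfp.dsl as dsl

@dsl.pipeline(name='pvc-lifecycle')
def pvc_pipeline():
    # Create a PVC that the step below consumes.
    vop = dsl.VolumeOp(
        name='create-volume',
        resource_name='shared-pvc',
        size='1Gi',
        modes=dsl.VOLUME_MODE_RWM,
    )

    step = dsl.ContainerOp(
        name='use-volume',
        image='busybox',
        command=['sh', '-c', 'echo hello > /data/out.txt'],
        pvolumes={'/data': vop.volume},
    )

    # Delete the PVC after the consuming step. Per this change, the
    # deletion defaults to kubectl's '--wait=false', so this step does
    # not hang waiting for the PVC (and thus the other pods) to be gone.
    vop.delete().after(step)
```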
* allow calling ContainerOp.after with multiple ops
* m: simplify signature of OpsGroup.after
Since it is not public anyway, the chances of it being called with a
keyword argument are small compared to it being called by unpacking an
arbitrary list, which could be empty (the previous signature would fail
for an empty list).
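For illustration, with this change a downstream step can declare several upstream dependencies in a single call (the ops below are hypothetical):
```python
import kfp.dsl as dsl

def echo_op(name):
    return dsl.ContainerOp(name=name, image='busybox', command=['echo', name])

@dsl.pipeline(name='after-multiple-ops')
def fan_in_pipeline():
    a = echo_op('a')
    b = echo_op('b')
    c = echo_op('c')
    # Previously this took two calls (c.after(a); c.after(b));
    # now a single call accepts multiple upstream ops.
    c.after(a, b)
```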
* feat(sdk): add ability to set retry policy
This fixes the second part of the issue described in #4333.
The first part was addressed in #4392.
* feat(sdk): validate retry policy name
* feat(sdk): simplify retry policy interface
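A minimal sketch of setting a retry policy on a task; the parameter names and accepted policy values ('Always', 'OnError', 'OnFailure', ...) follow Argo's retry policies and should be treated as assumptions here:
```python
import kfp.dsl as dsl

@dsl.pipeline(name='retry-example')
def retry_pipeline():
    flaky = dsl.ContainerOp(
        name='flaky-step',
        image='busybox',
        command=['sh', '-c', 'exit 1'],  # always fails, for illustration
    )
    # Retry up to 3 times; the policy name is validated against the
    # names Argo accepts.
    flaky.set_retry(num_retries=3, policy='OnFailure')
```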
* Initial execution cache
This commit adds the initial execution cache service, including the
HTTP service and execution key generation.
* fix master
* fix go.sum
* update docs that still refer to KFP latest SDK reference
* skeleton of code
* commit resource spec in IR proto
* add resource setter
* add accelerator setters
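As a rough illustration of the resource and accelerator setters, a sketch using the familiar ContainerOp-style method names; whether the new IR-based components expose exactly these names is an assumption:
```python
import kfp.dsl as dsl

@dsl.pipeline(name='resource-spec-example')
def resource_pipeline():
    train = dsl.ContainerOp(
        name='train',
        image='gcr.io/my-project/trainer:latest',  # hypothetical image
        command=['python', 'train.py'],
    )
    # CPU/memory requests and limits.
    train.set_cpu_request('1').set_cpu_limit('2')
    train.set_memory_request('2G').set_memory_limit('4G')
    # Accelerator (GPU) settings.
    train.set_gpu_limit(1)
    train.add_node_selector_constraint(
        'cloud.google.com/gke-accelerator', 'nvidia-tesla-k80')
```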
* fix unit conversion
* fix attribute proxy
* add and fix unittests
* add e2e test
* clean up
* clean up
* clean up
* clean up
* bypass subclass overriding
* clean up
* clean up
* clean up
* resolve comments
Reverting most of #2334, which inadvertently broke the mlpipeline-ui-metadata and mlpipeline-metrics artifacts by causing their names to be mangled.
KFP's DSL compiler prepends template names to output names to ensure global uniqueness of *input* names (DSL's ContainerOp has no concept of inputs, so the inputs, including their names, are generated during compilation). But prepending template names to the output names stops the backend from recognizing the mlpipeline-ui-metadata and mlpipeline-metrics artifacts.
ContainerOp has no concept of inputs, so it loses any information about them, such as input names and, in some cases, even the passed argument values (which are just injected into the command line).
This commit fixes that issue by preserving the parameter arguments map and ultimately storing it in an Argo template annotation.
Fixes https://github.com/kubeflow/pipelines/issues/4556
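For context, these are the special artifacts the backend looks up by their exact names; a minimal sketch of a ContainerOp that emits them (image and commands are illustrative):
```python
import kfp.dsl as dsl

def visualize_op():
    return dsl.ContainerOp(
        name='visualize',
        image='busybox',
        command=['sh', '-c',
                 'echo "{}" > /mlpipeline-ui-metadata.json && '
                 'echo "{}" > /mlpipeline-metrics.json'],
        # The backend recognizes these artifacts by name, which is why
        # the compiler must not mangle them.
        output_artifact_paths={
            'mlpipeline-ui-metadata': '/mlpipeline-ui-metadata.json',
            'mlpipeline-metrics': '/mlpipeline-metrics.json',
        },
    )
```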
* Prepare the SDK docs environment so it's easier to understand how to build the docs locally so they're consistent with ReadTheDocs.
* Clean up docstrings for kfp.Client
* Add in updates to the docs for compiler and components
* Update components area to add in code references and make formatting a little more consistent.
* Clean up containers, add in custom CSS to ensure we do not overflow on inline code blocks
* Clean up containers, add in custom CSS to ensure we do not overflow on inline code blocks
* Remove unused kfp.notebook package links
* Clean up a few more errant references
* Clean up the DSL docs some more
* Update SDK docs for KFP extensions to follow Sphinx guidelines
* Clean up formatting of docstrings after Ark-Kun's comments
* SDK - Added warning when not using components
We have long advised our users to create reusable components.
Creating reusable components is as easy as creating ContainerOp instances, but the components are shareable, portable, and easier to support going forward.
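For comparison, a reusable component equivalent to a hand-built ContainerOp can be defined from a small component text (the component below is illustrative):
```python
from kfp import components

# A shareable, portable component definition; loading it produces a
# factory function that creates tasks without the ContainerOp warning.
echo_op = components.load_component_from_text("""
name: Echo
inputs:
- {name: message, type: String}
implementation:
  container:
    image: busybox
    command: [echo, {inputValue: message}]
""")

def my_pipeline():
    echo_task = echo_op(message='hello')
```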
* Disable warning for TFX
* Fixed the warning disabling logic
* Added tests
* SDK - Compiler - Fixed the input argument mapping when using dsl.graph_component
Fixes https://github.com/kubeflow/pipelines/issues/3915
* Stopped relying on the argument order at all
This can make the compilation less fragile.
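A minimal sketch of a dsl.graph_component whose input argument mapping this fix affects; it is the usual recursive flip-coin pattern with illustrative images and paths:
```python
import kfp.dsl as dsl

def flip_coin_op():
    return dsl.ContainerOp(
        name='flip-coin',
        image='python:alpine3.6',
        command=['sh', '-c',
                 'python -c "import random; print(\'heads\' if '
                 'random.random() > 0.5 else \'tails\')" | tee /tmp/output'],
        file_outputs={'output': '/tmp/output'},
    )

@dsl.graph_component
def flip_until_heads(maybe_tails):
    # Recursive sub-graph: keep flipping while the previous result is 'tails'.
    with dsl.Condition(maybe_tails == 'tails'):
        flip = flip_coin_op()
        flip_until_heads(flip.output)

@dsl.pipeline(name='recursive-flip')
def pipeline():
    first = flip_coin_op()
    flip_until_heads(first.output)
```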
* SDK - Compiler - Added support for volume-based data passing
Currently, artifact passing is performed by Argo sidecar containers that download input data and upload output data to the artifact repository (usually an S3-compatible blob storage like Minio).
The performance of this method is not optimal and it requires that pod disks have enough capacity to hold all artifact data.
This commit adds support for volume-based data passing.
This method involves using a single multi-write Kubernetes data volume to pass all intermediate data.
Parts of the volume are mounted to the input/output artifact directories, so when the user program reads and writes files, the files actually reside in the data volume.
This method improves the performance and reduces storage resource requirements.
The data volume must exist and support the "ReadWriteMany" access mode.
Limitations:
* All artifact file names must be the same (e.g. "data"). All auto-generated paths are already consistent. Avoid using any hard-coded paths.
* Passing constant values (text) as arguments for artifact inputs is not supported.
* The feature is experimental.
* Added data_passing_methods.KubernetesVolume
This class represents a configured volume-based artifact passing method.
* Added PipelineConf.data_passing_method
This property allows setting the method that will be used for intermediate data passing.
Added the compiler support for the new feature.
Example:
```python
from kfp.dsl import PipelineConf, data_passing_methods
from kubernetes.client.models import V1Volume, V1PersistentVolumeClaimVolumeSource

pipeline_conf = PipelineConf()
pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=V1Volume(
        name='data',
        persistent_volume_claim=V1PersistentVolumeClaimVolumeSource(
            claim_name='data-volume'),
    ),
    path_prefix='artifact_data/',
)
```
* Added unit test
* Fixed bug in the unit test
Kubernetes does not validate the structures at all...
* Fixed bug in the result structure
* Fixed the test data
The class should be V1PersistentVolumeClaimVolumeSource, not V1PersistentVolumeClaimSpec.
* Fixed the test
* Fix #3906 - check that the op to be transformed is a ContainerOp
* Update the docstring for add_op_transformer to clarify that not only ContainerOps will be transformed.
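A short sketch of a transformer registered via add_op_transformer; since transformers receive every op in the pipeline, ContainerOp-specific calls should be guarded (the label key is illustrative):
```python
import kfp.dsl as dsl

def add_team_label(op):
    # Transformers are applied to all ops (ContainerOp, ResourceOp, ...),
    # so guard ContainerOp-specific tweaks behind an isinstance check.
    if isinstance(op, dsl.ContainerOp):
        op.add_pod_label('team', 'ml-platform')
    return op

@dsl.pipeline(name='transformer-example')
def pipeline():
    dsl.ContainerOp(name='step', image='busybox', command=['echo', 'hi'])
    dsl.get_pipeline_conf().add_op_transformer(add_team_label)
```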
* SDK - Made outputs with original names available in ContainerOp.outputs
Previously, ContainerOp had strict requirements for the output names, so we had to convert all the names before passing them to the ContainerOp constructor. Outputs with non-pythonic names could not be accessed using their original names.
Now ContainerOp supports any output names, so we now use the original output names.
However, to support legacy pipelines, we also add output references with pythonic names.
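For illustration, both the original output name and the legacy pythonic alias now resolve (the component below is illustrative):
```python
from kfp import components

producer_op = components.load_component_from_text("""
name: Produce data
outputs:
- {name: Output data, type: String}
implementation:
  container:
    image: busybox
    command: [sh, -c, 'echo hello > "$0"', {outputPath: Output data}]
""")

def pipeline():
    task = producer_op()
    original = task.outputs['Output data']  # original output name
    legacy = task.outputs['output_data']    # legacy pythonic alias
```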
* Fixed the compiler test data
* Fixed the duplicate parameter outputs in the compiled workflow
* Fixed long line
* Stabilized the output naming conflict resolution
* Fix case of missing special outputs
* SDK - Annotate pods with component_ref
This preserves the information about the digest of the component and the location from which the component was loaded.
* Fixed compiler tests
* SDK/DSL: Enable the deletion of a resource via ResourceOp method
* Add the method delete() to ResourceOps
* Extend ResourceOp & VolumeOp tests
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
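For illustration, a sketch of deleting an arbitrary resource created by a ResourceOp (the ConfigMap manifest is illustrative):
```python
import kfp.dsl as dsl

@dsl.pipeline(name='resource-delete-example')
def pipeline():
    cm = dsl.ResourceOp(
        name='create-config',
        k8s_resource={
            'apiVersion': 'v1',
            'kind': 'ConfigMap',
            'metadata': {'generateName': 'my-config-'},
            'data': {'key': 'value'},
        },
        action='create',
    )
    consumer = dsl.ContainerOp(
        name='consume', image='busybox', command=['echo', 'using config'])
    consumer.after(cm)

    # delete() returns a new ResourceOp that issues 'kubectl delete'
    # against the resource created above.
    cm.delete().after(consumer)
```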
* Fix ValueError not being raised
* SDK - Compiler - Fixed ParallelFor name clashes
The ParallelFor argument reference resolving was really broken.
The logic "worked" like this: if the name of the referenced output
contained the name of the loop collection source output, then it was
considered to be a reference to the loop item.
This broke lots of scenarios, especially in cases where there were
multiple components with the same output name (e.g. the default "Output"
output name). The logic also did not distinguish between references to
the loop collection item and references to the loop collection source
itself.
I've rewritten the argument resolving logic to fix these issues.
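A small sketch of the kind of pipeline the rewritten resolver has to handle, where the loop source comes from a component using the default "Output" name (component texts are illustrative):
```python
import kfp.dsl as dsl
from kfp import components

produce_list_op = components.load_component_from_text("""
name: Produce list
outputs:
- {name: Output}
implementation:
  container:
    image: busybox
    command: [sh, -c, 'echo "[1, 2, 3]" > "$0"', {outputPath: Output}]
""")

consume_op = components.load_component_from_text("""
name: Consume
inputs:
- {name: item}
implementation:
  container:
    image: busybox
    command: [echo, {inputValue: item}]
""")

@dsl.pipeline(name='parallelfor-example')
def pipeline():
    produce = produce_list_op()
    # The resolver must distinguish references to the loop item from
    # references to the loop source output itself.
    with dsl.ParallelFor(produce.output) as item:
        consume_op(item=item)
```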
* Argo cannot use {{item}} when withParams items are dicts
* Stabilize the loop template names
* Renamed the test case