* add placeholder to spec
* add output_directory to pipeline
* respect uri placeholder in file outputs
* wip: add data passing rewriting logic to respect the uri semantics
* merge input_uri and paths when instantiating ContainerOp
* fix
* fix workflow rewriting
* Add topology rewriting
* add a test case, and various fixes
* make the test case more complex
* Fix the case when working with OpsGroup
* Fix test case
* fix resolving test
* fix redundant cmd lines
* fix redundant cmd lines
* resolve comments
* fix file outputs
* resolve comments
* copy file outputs instead of modifying them in place.
* Prepare the SDK docs environment so it's easier to build the docs locally and keep them consistent with ReadTheDocs.
* Clean up docstrings for kfp.Client
* Add in updates to the docs for compiler and components
* Update components area to add in code references and make formatting a little more consistent.
* Clean up containers, add in custom CSS to ensure we do not overflow on inline code blocks
* Clean up containers, add in custom CSS to ensure we do not overflow on inline code blocks
* Remove unused kfp.notebook package links
* Clean up a few more errant references
* Clean up the DSL docs some more
* Update SDK docs for KFP extensions to follow Sphinx guidelines
* Clean up formatting of docstrings after Ark-Kun's comments
* SDK - Compiler - Added support for volume-based data passing
Currently, artifact passing is performed by Argo sidecar containers that download input data and upload output data to an artifact repository (usually an S3-compatible blob store like Minio).
The performance of this method is not optimal and it requires that pod disks have enough capacity to hold all artifact data.
This commit adds support for volume-based data passing.
This method involves using a single multi-write Kubernetes data volume to pass all intermediate data.
Parts of the volume are mounted to the input/output artifact directories, so when the user program reads and writes files, the files actually reside in the data volume.
This method improves the performance and reduces storage resource requirements.
The data volume must already exist and support the "ReadWriteMany" access mode.
Limitations:
* All artifact file names must be the same (e.g. "data"). All auto-generated paths are already consistent. Avoid using any hard-coded paths.
* Passing constant values (text) as arguments for artifact inputs is not supported.
* The feature is experimental.
* Added data_passing_methods.KubernetesVolume
This class represents a configured volume-based artifact passing method.
* Added PipelineConf.data_passing_method
This property allows setting the method that will be used for intermediate data passing.
Added the compiler support for the new feature.
Example:
```python
from kfp.dsl import PipelineConf, data_passing_methods
from kubernetes.client.models import V1Volume, V1PersistentVolumeClaimVolumeSource

pipeline_conf = PipelineConf()
pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=V1Volume(
        name='data',
        persistent_volume_claim=V1PersistentVolumeClaimVolumeSource(
            claim_name='data-volume'),
    ),
    path_prefix='artifact_data/',
)
```
* Added unit test
* Fixed bug in the unit test
Kubernetes does not validate the structures at all...
* Fixed bug in the result structure
* Fixed the test data
The class should be V1PersistentVolumeClaimVolumeSource, not V1PersistentVolumeClaimSpec.
* Fixed the test
* Fix #3906 - check that the ops to be transformed are ContainerOp instances
* Update the docstring of add_op_transformer to clarify that not only ContainerOp instances will be transformed.
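A minimal sketch of what that clarified contract means in practice: a transformer registered via `PipelineConf.add_op_transformer` receives every op in the pipeline (ResourceOps included), so container-specific tweaks should be guarded with an isinstance check. The pipeline body and the `'team'` label below are illustrative only.
```python
from kfp import dsl

def label_and_tune(op):
    # The transformer receives every op in the pipeline, not only ContainerOps.
    op.add_pod_label('team', 'data-eng')
    if isinstance(op, dsl.ContainerOp):
        # Container-specific settings are only valid for ContainerOps.
        op.container.set_image_pull_policy('Always')
    return op

@dsl.pipeline(name='transformer-demo')
def transformer_demo():
    dsl.get_pipeline_conf().add_op_transformer(label_and_tune)
    dsl.ContainerOp(name='echo', image='busybox', command=['echo', 'hello'])
```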
* SDK - Components refactoring
This change is a pure refactoring of the implementation of component task creation.
For pipelines compiled using the DSL compiler (the compile() function or the command-line program) nothing should change.
The main goal of the refactoring is to change the way the component instantiation can be customized.
Previously, the flow was like this:
`ComponentSpec` + arguments --> `TaskSpec` --resolving+transform--> `ContainerOp`
This PR changes it to a more direct path:
`ComponentSpec` + arguments --constructor--> `ContainerOp`
or
`ComponentSpec` + arguments --constructor--> `TaskSpec`
or
`ComponentSpec` + arguments --constructor--> `SomeCustomTask`
The original approach where the flow always passes through `TaskSpec` had some issues since TaskSpec only accepts string arguments (and two
other reference classes). This made it harder to handle custom types of arguments like PipelineParam or Channel.
Low-level refactoring changes:
Resolving of command-line argument placeholders has been extracted into a function usable by different task constructors.
Changed `_components._created_task_transformation_handler` to `_components._container_task_constructor`. Previously, the handler was receiving a `TaskSpec` instance. Now it receives `ComponentSpec` + arguments [+ `ComponentReference`].
Moved the `ContainerOp` construction handler setup to the `kfp.dsl.Pipeline` context class as planned.
Extracted `TaskSpec` creation to `_components._create_task_spec_from_component_and_arguments`.
Refactored `_dsl_bridge.create_container_op_from_task` to `_components._resolve_command_line_and_paths` which returns `_ResolvedCommandLineAndPaths`.
Renamed `_dsl_bridge._create_container_op_from_resolved_task` to `_dsl_bridge._create_container_op_from_component_and_arguments`.
The signature of `_components._resolve_graph_task` was changed and it now returns `_ResolvedGraphTask` instead of a modified `TaskSpec`.
Some of the component tests still expect ContainerOp and its attributes.
These tests will be changed later.
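For illustration, a rough sketch of how a framework other than the DSL compiler could plug in its own task constructor. It is based only on the names mentioned above; the exact internal signatures are private and may differ, and `SomeCustomTask` is hypothetical.
```python
from kfp.components import _components

class SomeCustomTask:
    # Hypothetical task representation used by a non-DSL framework.
    def __init__(self, image, command, arguments):
        self.image, self.command, self.arguments = image, command, arguments

def custom_task_constructor(component_spec, arguments, component_ref=None):
    # The constructor receives ComponentSpec + arguments (+ ComponentReference)
    # and can build whatever task object it wants, bypassing TaskSpec.
    resolved = _components._resolve_command_line_and_paths(component_spec, arguments)
    container = component_spec.implementation.container
    return SomeCustomTask(container.image, resolved.command, resolved.args)

# Installing the constructor (the kfp.dsl.Pipeline context does this for ContainerOp).
_components._container_task_constructor = custom_task_constructor
```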
* Adapted the _python_op tests
* Fixed linter failure
I do not want to add any top-level kfp imports in this file to prevent circular references.
* Added docstrings
* Fixed the return-type forward reference
* Replaced `_instance_to_dict(obj)` with `obj.to_dict()`
* Fixed the capitalization in _python_function_name_to_component_name
It now only changes the case of the first letter.
* Replaced the _extract_component_metadata function with _extract_component_interface
* Stopped adding newline to the component description.
* Handling None inputs and outputs
* Not including empty inputs and outputs in the component spec
* Renamed the private attributes that the @pipeline decorator sets
* Changed _extract_pipeline_metadata to use _extract_component_interface
* Fixed issues based on feedback
* SDK - Refactoring - Replaced the ParameterMeta class with InputSpec and OutputSpec
* SDK - Refactoring - Replaced the internal PipelineMeta class with ComponentSpec
* SDK - Refactoring - Replaced the internal ComponentMeta class with ComponentSpec
* SDK - Refactoring - Replaced the *Meta classes with the *Spec classes
Replaced the ComponentMeta class with ComponentSpec
Replaced the PipelineMeta class with ComponentSpec
Replaced the ParameterMeta class with InputSpec and OutputSpec
* Removed empty fields
* Lint Python code for undefined names
* Lint Python code for undefined names
* Exclude tfdv.py to work around an overzealous pytest
* Fixup for tfdv.py
* Fixup for tfdv.py
* Fixup for tfdv.py
* Add a PipelineConf method to set ttlSecondsAfterFinished in the Argo workflow spec
* remove the unnecessary compile test for TTL; add a unit test for it instead.
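A minimal usage sketch of the new PipelineConf setting (the one-hour value and the pipeline body are illustrative):
```python
from kfp import dsl

@dsl.pipeline(name='ttl-demo')
def ttl_demo():
    # Ask Argo to garbage-collect the finished workflow after one hour.
    dsl.get_pipeline_conf().set_ttl_seconds_after_finished(3600)
    dsl.ContainerOp(name='echo', image='busybox', command=['echo', 'hello'])
```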
* SDK/Compiler - Added op and template transformers
They can be used to apply some functions (e.g. to add secrets) to all pipeline ops.
* Removed the template_transformers for now
* Moved the op_transformers to PipelineConf
* Added op_transformers test
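A sketch of the "add secrets to all ops" use case mentioned above, assuming the stock kfp.gcp helper and a pre-created 'user-gcp-sa' secret in the cluster:
```python
from kfp import dsl, gcp

@dsl.pipeline(name='secret-demo')
def secret_demo():
    # Every op created in this pipeline gets the GCP service-account secret mounted.
    dsl.get_pipeline_conf().add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))
    dsl.ContainerOp(name='echo', image='busybox', command=['echo', 'hello'])
```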
* Remove the separate dictionaries for ContainerOps and ResourceOps
* Fix the sanitization performed by the compiler so it iterates through the ops
dict and does a type check for the special fields file_outputs and
attribute_outputs
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Create BaseOp class
* BaseOp class is the base class for any Argo Template type
* ContainerOp derives from BaseOp
* Rename dependent_names to deps
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: In preparation for the new feature ResourceOps (#801)
* Add a cops attribute to Pipeline. This is a dict holding all the
ContainerOps of the pipeline.
* Set some processing in _op_to_template as ContainerOp specific
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Simplify the consumption of Volumes by ContainerOps
Add `pvolumes` argument and attribute to ContainerOp. It is a dict
having mount paths as keys and V1Volumes as values. These are added to
the pipeline and mounted by the container of the ContainerOp.
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
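A minimal sketch of the pvolumes argument described above, assuming a pre-existing PVC named 'existing-pvc':
```python
from kubernetes import client as k8s_client
from kfp import dsl

@dsl.pipeline(name='pvolumes-demo')
def pvolumes_demo():
    vol = k8s_client.V1Volume(
        name='workdata',
        persistent_volume_claim=k8s_client.V1PersistentVolumeClaimVolumeSource(
            claim_name='existing-pvc'),
    )
    # Keys are mount paths, values are the volumes to mount there.
    dsl.ContainerOp(
        name='producer',
        image='busybox',
        command=['sh', '-c', 'echo hello > /mnt/out.txt'],
        pvolumes={'/mnt': vol},
    )
```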
* SDK: Add ResourceOp
* ResourceOp is the SDK's equivalent for Argo's resource template
* Add rops attribute to Pipeline: Dictionary containing ResourceOps
* Extend _op_to_template to produce the template for ResourceOps
* Use processed_op instead of op everywhere in _op_to_template()
* Add samples/resourceop/resourceop_basic.py
* Add tests/dsl/resource_op_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
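A minimal sketch of a ResourceOp, creating a ConfigMap through Argo's resource template (the resource itself is illustrative):
```python
from kubernetes import client as k8s_client
from kfp import dsl

@dsl.pipeline(name='resourceop-demo')
def resourceop_demo():
    cm = k8s_client.V1ConfigMap(
        api_version='v1',
        kind='ConfigMap',
        metadata=k8s_client.V1ObjectMeta(name='my-config'),
        data={'key': 'value'},
    )
    # Compiled into an Argo resource template with action 'create'.
    dsl.ResourceOp(name='create-config', k8s_resource=cm, action='create')
```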
* SDK: Simplify the creation of PersistentVolumeClaim instances
* Add VolumeOp: a specialized ResourceOp for PVC creation
* Add samples/resourceops/volumeop_basic.py
* Add tests/dsl/volume_op_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Emit a V1Volume as `.volume` from dsl.VolumeOp
* Extend VolumeOp so it outputs a `.volume` attribute ready to be
consumed by the `pvolumes` argument to ContainerOp's constructor
* Update samples/resourceop/volumeop_basic.py
* Extend tests/dsl/volume_op_tests.py
* Update tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
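A minimal sketch of the VolumeOp-to-pvolumes flow described above (names and sizes are illustrative):
```python
from kfp import dsl

@dsl.pipeline(name='volumeop-demo')
def volumeop_demo():
    vop = dsl.VolumeOp(
        name='create-volume',
        resource_name='my-pvc',
        size='1Gi',
        modes=dsl.VOLUME_MODE_RWM,
    )
    # vop.volume is ready to be consumed by the pvolumes argument.
    dsl.ContainerOp(
        name='producer',
        image='busybox',
        command=['sh', '-c', 'echo 1 > /data/file.txt'],
        pvolumes={'/data': vop.volume},
    )
```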
* SDK: Add PipelineVolume
* PipelineVolume inherits from V1Volume and it comes with its own set of
KFP-specific dependencies. It is aligned with how PipelineParam
instances are used. I.e. consuming a PipelineVolume leads to implicit
dependencies without the user having to call the `.after()` method on
a ContainerOp.
* PipelineVolume comes with its own `.after()` method, which can be used
to append extra dependencies to the instance.
* Extend ContainerOp to handle PipelineVolume deps
* Set `.volume` attribute of VolumeOp to be a PipelineVolume instead
* Add samples/resourceops/volumeop_{parallel,dag,sequential}.py
* Fix tests/dsl/volume_op_tests.py
* Add tests/dsl/pipeline_volume_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
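A sketch of the implicit-dependency behavior: the consumer uses the producer's PipelineVolume and therefore runs after it without an explicit .after() call.
```python
from kfp import dsl

@dsl.pipeline(name='pipeline-volume-demo')
def pipeline_volume_demo():
    vop = dsl.VolumeOp(name='create-volume', resource_name='my-pvc', size='1Gi')
    step1 = dsl.ContainerOp(
        name='producer',
        image='busybox',
        command=['sh', '-c', 'echo 1 > /data/file.txt'],
        pvolumes={'/data': vop.volume},
    )
    # step1.pvolume carries the dependency on step1, so no .after() is needed.
    dsl.ContainerOp(
        name='consumer',
        image='busybox',
        command=['cat', '/data/file.txt'],
        pvolumes={'/data': step1.pvolume},
    )
```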
* SDK: Simplify the creation of VolumeSnapshot instances
* VolumeSnapshotOp: a specialized ResourceOp for VolumeSnapshot creation
* Add samples/resourceops/volume_snapshotop_{sequential,rokurl}.py
* Add tests/dsl/volume_snapshotop_tests.py
* Extend tests/compiler/compiler_tests.py
NOTE: VolumeSnapshots are an alpha feature at the time of this commit.
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
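A sketch of VolumeSnapshotOp, snapshotting the volume after a producer step wrote to it (requires the alpha VolumeSnapshot API in the cluster; names are illustrative):
```python
from kfp import dsl

@dsl.pipeline(name='snapshot-demo')
def snapshot_demo():
    vop = dsl.VolumeOp(name='create-volume', resource_name='my-pvc', size='1Gi')
    step = dsl.ContainerOp(
        name='producer',
        image='busybox',
        command=['sh', '-c', 'echo 1 > /data/file.txt'],
        pvolumes={'/data': vop.volume},
    )
    # Snapshots the volume as it is after the producer step.
    dsl.VolumeSnapshotOp(
        name='snapshot-volume',
        resource_name='my-snapshot',
        volume=step.pvolume,
    )
```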
* Extend UI for the ResourceOp and Volumes feature of the Compiler
* Add VolumeMounts tab/entry (Run/Pipeline view)
* Add Manifest tab/entry (Run/Pipeline view)
* Add & Extend tests
* Update tests snapshot files
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* Cleaning up the diff (before moving things back)
* Renamed op.deps back to op.dependent_names
* Moved Container, Sidecar and BaseOp classes back to _container_op.py
This way the diff is much smaller and more understandable. We can always split or refactor the file later. Refactorings should not be mixed with genuine changes.
* SDK - Decoupling ContainerOp from compiler
Currently, some code in the DSL module depends on classes that belong to the DSL compiler.
Ideally, the dependency should go the other way: the DSL compiler should depend on the DSL, but not the other way around.
This commit fixes that issue for the ContainerOp class.
* Switched from a list of handlers to a single handler
* SDK - Got rid of the global variable collecting all created pipelines
This list was only used by the command-line compiler.
The command-line compiler can still collect the created pipelines by registering a handler function in `_pipeline_decorator_handlers`.
* Replaced handler stack with a single handler.
* SDK - Simplified the @pipeline decorator
Moved metadata-related code to _metadata.
`Pipeline.get_pipeline_functions` now returns the list of pipeline functions.
* Addressed @gaoning777's PR feedback
* add a While in the ops group
* deepcopy the while conditions when entering and exiting
* add while condition resolution in the compiler
* define graph component decorator
* remove while loop related codes
* fixes
* remove while loop related code
* fix bugs
* generate a unique ops group name and make it retrievable by name
* resolve the opsgroup's inputs and dependencies based on the PipelineParam in the condition
* add a recursive ops_groups
* fix bugs of the recursive opsgroup template name
* resolve the recursive template name and arguments
* add validity checks
* add more comments
* add usage comment in graph_component
* add unit test for the graph opsgroup
* refactor the opsgroup
* add unit test for the graph_component decorator
* exposing graph_component decorator
* add recursive compiler unit tests
* fix the opsgroup name bug
adjust the graph_component usage example
fix index bugs
use a with statement in graph_component instead of directly calling
the enter/exit functions
* add a todo to combine the graph_component and component decorators
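A sketch of the recursion pattern the graph_component decorator enables; the flip-coin op is a stand-in defined inline, and the image/paths are illustrative:
```python
from kfp import dsl

def flip_coin_op():
    # Stand-in op that writes 'heads' or 'tails' to a file output.
    return dsl.ContainerOp(
        name='flip-coin',
        image='python:alpine3.6',
        command=['sh', '-c'],
        arguments=['python -c "import random; print(random.choice([\'heads\', \'tails\']))" '
                   '| tee /tmp/output'],
        file_outputs={'output': '/tmp/output'},
    )

@dsl.graph_component
def flip_until_heads(flip_result):
    # The ops group re-enters itself while the previous flip was 'tails'.
    with dsl.Condition(flip_result == 'tails'):
        flip = flip_coin_op()
        flip_until_heads(flip.output)

@dsl.pipeline(name='recursion-demo')
def recursion_demo():
    first = flip_coin_op()
    flip_until_heads(first.output)
```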
* add core types and type checking function
* fix unit test bug
* avoid defining dynamic classes
* typo fix
* add component metadata format
* add a construct for the component decorator
* add default values for the meta classes
* add input/output types to the metadata
* add from_dict in TypeMeta
* small fix
* add unit tests
* use python struct for the openapi schema
* add default in parameter
* add default value
* remove the str restriction for the param default
* bug fix
* add pipelinemeta
* add pipeline metadata
* ignore annotation if it is not str/BaseType/dict
* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability
* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name
* typo in the comments
* move the metadata classes to a separate module
* fix unit test
* add __eq__ to meta classes
do not export the _metadata classes
* fix unit test
* fix bug: duplicated args variable
* move python_component and the _component decorator into the _component file
* remove the print
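A sketch of how these core types and the component decorator are meant to be used for static type checking at compile time; the image, paths, and the 'customized_type' name are illustrative:
```python
from kfp import dsl
from kfp.dsl import component
from kfp.dsl.types import Integer, GCSPath

@component
def task_factory(field_l: Integer()) -> {'field_m': GCSPath(), 'field_n': 'customized_type'}:
    # The annotations above become input/output types in the component metadata
    # and are checked against connected params when the pipeline is compiled.
    return dsl.ContainerOp(
        name='operator-a',
        image='gcr.io/example/component-a:latest',
        arguments=['--field-l', field_l],
        file_outputs={'field_m': '/schema.txt', 'field_n': '/feature.txt'},
    )
```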
* support pipeline-level imagePullSecrets in the DSL
* use the Kubernetes-native input parameter for imagePullSecrets
* expose a module-level function to configure the pipeline settings for the current default pipeline
* add comments
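A sketch of the pipeline-level setting, assuming a pre-created registry secret named 'regcred' (the reference class shown is the standard Kubernetes one for imagePullSecrets):
```python
from kubernetes import client as k8s_client
from kfp import dsl

@dsl.pipeline(name='image-pull-secret-demo')
def image_pull_secret_demo():
    # All pods created for this pipeline reference the 'regcred' pull secret.
    dsl.get_pipeline_conf().set_image_pull_secrets(
        [k8s_client.V1LocalObjectReference(name='regcred')])
    dsl.ContainerOp(name='echo', image='busybox', command=['echo', 'hello'])
```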
* relocate functions in the compiler to group similar functions together; make _build_conventional_artifact a nested function
* consolidate the sanitize functions into one in the DSL.
* more comments
* move all sanitization (op name, param name) from the DSL to the compiler
* sanitize the PipelineParam name and op_name; remove the format check in PipelineParam
* remove unit test for pipelineparam op_name format checking
* fix bug: correctly replace input in the argument list
* fix bug: replace arguments with found ones
* Sanitize the file_outputs keys, match the params in the args/cmds against the whole serialized param string, and verify both the param name and the container name
* loosen the ContainerOp and param name restrictions