* SDK - Compiler - Allow creating portable pipelines
This change allows directly passing the PipelineConf instance to compiler or launcher which makes it easier to create portable pipelines by allowing the environment-specific configuration to be directly passed to the environment-specific launcher.
Background:
PipelineConf holds all pipeline-level configuration including `op_transformers`, `image_pull_secrets` etc. Some of these are specific to particular execution environment (e.g. GCP secret or Argo artifact location or Kubernetes-specific options).
Previously, the only way to modify `PipelineConf` was to do it inside the piepline function. That tied the pipeline function to specific execution environment (e.g. GCP, Argo or Kubernetes)
Solution: This change allows directly passing the PipelineConf instance to compiler or launcher. This allows writing portable enlauncher and environment agnostic pipeline functions. All environment-specific configurations can be moved to launching stage.
Before:
```python
# Defining pipeline
def my_pipeline():
# portable pipeline code
dsl.get_pipeline_conf().add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))
# Launching pipeline
kfp.Clinet().create_run_from_pipeline_func(my_pipeline, arguments={})
```
After:
```python
# Defining pipeline
def my_pipeline():
# portable pipeline code
# Launching pipeline
pipeline_conf = dsl.PipelineConf()
pipeline_conf.add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))
kfp.Clinet().create_run_from_pipeline_func(my_pipeline, arguments={}, pipeline_conf=pipeline_conf)
```
After 2 *(launching same portable pipeline using different launchers):
```python
# Loading portable pipeline
from portable_pipeline import my_pipeline
# Launching pipeline on Kubeflow
pipeline_conf = dsl.PipelineConf()
pipeline_conf.add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))
kfp.Clinet().create_run_from_pipeline_func(my_pipeline, arguments={}, pipeline_conf=pipeline_conf)
# Launching pipeline on locally (not implemented yet)
kfp.run_pipeline_func_locally(my_pipeline, arguments={})
```
* Added parameter docstring
* SDK - Compiler - Fix large data passing
Stop outputting parameters unless they're consumed as parameters downstream.
This prevents the situaltion when component outputs a big file, but DSL compiler instructs Argo to pick it up as parameter (parameters only hold few kilobytes of data).
As byproduct, this change fixes some minor compiler data passing bugs where some parameters were being passed around, but never consumed (happened with `ResourceOp`, `dsl.Condition` and recursion).
* Replaced ... with `raise AssertionError`
* Fixed small bug
* Removed unused variables
* Fixed names of the mark_upstream_ios_of_* functions
* Fixed detection of parameter output references
* Fixed handling of volumes
* SDK - Refactoring - Replaced the ParameterMeta class with InputSpec and OutputSpec
* SDK - Refactoring - Replaced the internal PipelineMeta class with ComponentSpec
* SDK - Refactoring - Replaced the internal ComponentMeta class with ComponentSpec
* SDK - Refactoring - Replaced the *Meta classes with the *Spec classes
Replaced the ComponentMeta class with ComponentSpec
Replaced the PipelineMeta class with ComponentSpec
Replaced the ParameterMeta class with InputSpec and OutputSpec
* Removed empty fields
* first working commit
* incrememtal commit
* in the middle of converting loop args constructor to accept pipeline param
* both cases working
* output works, passed doesn't
* about to redo compiler section
* rewrite draft done
* added withparam tests
* removed sdk/python/comp.yaml
* minor
* subvars work
* more tests
* removed unneeded artifact outputs from test yaml
* sort keys
* removed dead artifact code
* Refactor. Expose a public API to append pipeline param without interacting with dsl.Pipeline obj.
* Add unit test and fix.
* Fix docstring.
* Fix test
* Fix test
* Fix two nit problems
* Refactor
* SDK - Added support for raw artifact values to ContainerOp
* `ContainerOp` now gets artifact artguments from command line instead of the constructor.
* Added back input_artifact_arguments to the ContainerOp constructor.
In some scenarios it's hard to provide the artifact arguments through the `command` list when it already has resolved artifact paths.
* Exporting InputArtifactArgument from kfp.dsl
* Updated the sample
* Properly passing artifact arguments as task arguments
as opposed to default input values.
* Renamed input_artifact_arguments to artifact_arguments to reduce confusion
* Renamed InputArtifactArgument to InputArgumentPath
Also renamed input_artifact_arguments to artifact_argument_paths in the ContainerOp's constructor
* Replaced getattr with isinstance checks.
getattr is too fragile and can be broken by renames.
* Fixed the type annotations
* Unlocked the input artifact support in components
Added the test_input_path_placeholder_with_constant_argument test
* SDK - Refactoring - Replaced the TypeMeta class
The PipelineParam no longer exposes the private TypeMeta class
Fixes#1420
The refactoring PR is part of a series of PR which unifies the metadata and specification types.
* Add PipelineConf method to set ttlSecondsAfterFinished in argo workflow spec
* remove unnecessary compile test for ttl. add unit test for ttl instead.
* Configure gcp connectors in dsl
* Make configure_gcp_connector more extensible
* Add add_pod_env op handler.
* Only apply add_pod_env on ContainerOp
* Update license header
* SDK/Compiler - Added op and template transformers
They can be used to apply some functions (e.g. to add secrets) to all pipeline ops.
* Removed the template_transformers for now
* Moved the op_transformers to PipelineConf
* Added op_transformers test
* Remove the separated dictionaries for ContainerOps and ResourceOps
* Fix the sanitization performed by the compiler to iterate through ops
dict and do type-check for the special fields file_outputs and
attribute_outputs
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Create BaseOp class
* BaseOp class is the base class for any Argo Template type
* ContainerOp derives from BaseOp
* Rename dependent_names to deps
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: In preparation for the new feature ResourceOps (#801)
* Add cops attributes to Pipeline. This is a dict having all the
ContainerOps of the pipeline.
* Set some processing in _op_to_template as ContainerOp specific
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Simplify the consumption of Volumes by ContainerOps
Add `pvolumes` argument and attribute to ContainerOp. It is a dict
having mount paths as keys and V1Volumes as values. These are added to
the pipeline and mounted by the container of the ContainerOp.
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Add ResourceOp
* ResourceOp is the SDK's equivalent for Argo's resource template
* Add rops attribute to Pipeline: Dictionary containing ResourceOps
* Extend _op_to_template to produce the template for ResourceOps
* Use processed_op instead of op everywhere in _op_to_template()
* Add samples/resourceop/resourceop_basic.py
* Add tests/dsl/resource_op_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Simplify the creation of PersistentVolumeClaim instances
* Add VolumeOp: A specified ResourceOp for PVC creation
* Add samples/resourceops/volumeop_basic.py
* Add tests/dsl/volume_op_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Emit a V1Volume as `.volume` from dsl.VolumeOp
* Extend VolumeOp so it outputs a `.volume` attribute ready to be
consumed by the `pvolumes` argument to ContainerOp's constructor
* Update samples/resourceop/volumeop_basic.py
* Extend tests/dsl/volume_op_tests.py
* Update tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Add PipelineVolume
* PipelineVolume inherits from V1Volume and it comes with its own set of
KFP-specific dependencies. It is aligned with how PipelineParam
instances are used. I.e. consuming a PipelineVolume leads to implicit
dependencies without the user having to call the `.after()` method on
a ContainerOp.
* PipelineVolume comes with its own `.after()` method, which can be used
to append extra dependencies to the instance.
* Extend ContainerOp to handle PipelineVolume deps
* Set `.volume` attribute of VolumeOp to be a PipelineVolume instead
* Add samples/resourceops/volumeop_{parallel,dag,sequential}.py
* Fix tests/dsl/volume_op_tests.py
* Add tests/dsl/pipeline_volume_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Simplify the creation of VolumeSnapshot instances
* VolumeSnapshotOp: A specified ResourceOp for VolumeSnapshot creation
* Add samples/resourceops/volume_snapshotop_{sequential,rokurl}.py
* Add tests/dsl/volume_snapshotop_tests.py
* Extend tests/compiler/compiler_tests.py
NOTE: VolumeSnapshots is an Alpha feature at the time of this commit.
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* Extend UI for the ResourceOp and Volumes feature of the Compiler
* Add VolumeMounts tab/entry (Run/Pipeline view)
* Add Manifest tab/entry (Run/Pipeline view)
* Add & Extend tests
* Update tests snapshot files
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* Cleaning up the diff (before moving things back)
* Renamed op.deps back to op.dependent_names
* Moved Container, Sidecar and BaseOp classed back to _container_op.py
This way the diff is much smaller and more understandable. We can always split or refactor the file later. Refactorings should not be mixed with genuine changes.
* SDK - Simplified the @pipeline decorator
Moved metadata-related code to _metadata.
`Pipeline.get_pipeline_functions` now returns the list of pipeline functions.
* Addressed @gaoning777's PR feedback
* remove the graph component output; add support for dependency on graph component
* fix bug; adjust unit tests
* add support for explicit dependency of graph component
* adjust unit test
* add a todo
* bug fixes for unit tests
* refactor condition_param code; fix bug when the inputs task name is None; need to remove the print later
* do not pass condition param as arguments to downstream ops, remove print logs; add unit tests
* add unit test golden yaml
* fix bug
* fix the sample
* Feature: sidecar for ContainerOp
* replace f-string with string format for compatibility with py3.5
* ContainerOp now can be updated with any k8s V1Container attributes as well as sidecars with Sidecar class. ContainerOp accepts PipelineParam in any valid k8 properties.
* WIP: fix conflicts and bugs with recent master. TODO: more complex template with pipeline params
* fix proxy args
* Fixed to work with latest master head
* Added container_kwargs to ContainerOp to pass in k8s container kwargs
* Fix comment bug, updated with example in ContainerOp docstring
* fix copyright year
* expose match_serialized_pipelineparam as public for compiler to process serialized pipeline params
* fixed pydoc example and removed unnecessary ContainerOp.container.parent
* Fix conflicts in compiler tests
* add core types and type checking function
* fix unit test bug
* avoid defining dynamic classes
* typo fix
* add component metadata format
* add a construct for the component decorator
* add default values for the meta classes
* add input/output types to the metadata
* add from_dict in TypeMeta
* small fix
* add unit tests
* use python struct for the openapi schema
* add default in parameter
* add default value
* remove the str restriction for the param default
* bug fix
* add pipelinemeta
* add pipeline metadata
* ignore annotation if it is not str/BaseType/dict
* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability
* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name
* pass metadata from component decorator and task factory to containerOp
* pass pipeline metadata to Pipeline
* fix unit test
* typo in the comments
* move the metadata classes to a separate module
* fix unit test
* small change
* add __eq__ to meta classes
not export _metadata classes
* nothing
* fix unit test
* unit test python component
* unit test python pipeline
* fix bug: duplicate variable of args
* fix unit tests
* move python_component and _component decorator in _component file
* remove the print
* change parameter default value to None
* add functools wraps around _component decorator
* TypeMeta accept both str and dict
* fix indent, add unit test for type as strings
* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta
* add type check in task factory
* output error message
* add type check in component decorator; move the metadata assignment out of the containerop __init__ function
* fix bug; add unit test
* add more unit tests
* more unit tests; fix bugs
* more unit tests; fix bugs
* add unit tests
* more unit tests
* add type check switch; add unit tests
* add compiler option for type check
* add a notebook sample
* resolving pr comments
* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there are not a single return annotations
* add dsl static type checking sample
* fix bug: op_to_template resolve the raw arguments by mapping to the argument_inputs but the argument_inputs lost the type information
* fix type pattern matching
* convert orderedDict to dict from the component module
* add unit test to the pipelineparam with types
* create TypeMeta deserialize function, add comments
* strongly typed pipelineparamtuple
* remove GCSPath fields to avoid artifact type confusion
change the type json schema field name to openAPIV3Schema
* fix unit tests; add unit test for openapishema property
* add comments
* add unit test at the component module; fix bug
* add ignore_type in pipelineparam
* update sample: no artifact types but only parameter types; add pipelineparam ignore_type example
* configure the default type checking to enabled
* change openAPIV3Schema to lower case with underscore
* revert change from the merge
* add code blocks, add the benefits of static type checking
add more comments within the code block
add documentation about the type definition in both yaml and decorated
components.
* fix the comment
* update dsl.type namespace
* dsl generate zip file
* minor fix
* fix zip read in the unit test
* update sample tests
* dsl compiler generates pipeline based on the input name suffix
* add unit tests for different output format
* update the sdk client to support tar zip and yaml
* fix typo
* fix file write
* add a While in the ops group
* deepcopy the while conditions when entering and exiting
* add while condition resolution in the compiler
* define graph component decorator
* remove while loop related codes
* fixes
* remove while loop related code
* fix bugs
* generate a unique ops group name and being able to retrieve by name
* resolve the opsgroups inputs and dependencies based on the pipelineparam in the condition
* add a recursive ops_groups
* fix bugs of the recursive opsgroup template name
* resolve the recursive template name and arguments
* add validity checks
* add more comments
* add usage comment in graph_component
* add unit test for the graph opsgraph
* refactor the opsgroup
* add unit test for the graph_component decorator
* exposing graph_component decorator
* add recursive compiler unit tests
* fix the bug of opsgroup name
adjust the graph_component usage example
fix index bugs
use with statement in the graph_component instead of directly calling
the enter/exit functions
* add a todo to combine the graph_component and component decorators
* add unit test to the pipelineparam with types
* create TypeMeta deserialize function, add comments
* strongly typed pipelineparamtuple
* addressing pr comments
* fix bug: op_to_template resolve the raw arguments by mapping to the argument_inputs but the argument_inputs lost the type information
* fix type pattern matching
* convert orderedDict to dict from the component module
* add core types and type checking function
* fix unit test bug
* avoid defining dynamic classes
* typo fix
* add component metadata format
* add a construct for the component decorator
* add default values for the meta classes
* add input/output types to the metadata
* add from_dict in TypeMeta
* small fix
* add unit tests
* use python struct for the openapi schema
* add default in parameter
* add default value
* remove the str restriction for the param default
* bug fix
* add pipelinemeta
* add pipeline metadata
* ignore annotation if it is not str/BaseType/dict
* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability
* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name
* pass metadata from component decorator and task factory to containerOp
* pass pipeline metadata to Pipeline
* fix unit test
* typo in the comments
* move the metadata classes to a separate module
* fix unit test
* small change
* add __eq__ to meta classes
not export _metadata classes
* nothing
* fix unit test
* unit test python component
* unit test python pipeline
* fix bug: duplicate variable of args
* fix unit tests
* move python_component and _component decorator in _component file
* remove the print
* change parameter default value to None
* add functools wraps around _component decorator
* TypeMeta accept both str and dict
* fix indent, add unit test for type as strings
* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta
* add type check in task factory
* output error message
* add type check in component decorator; move the metadata assignment out of the containerop __init__ function
* fix bug; add unit test
* add more unit tests
* more unit tests; fix bugs
* more unit tests; fix bugs
* add unit tests
* more unit tests
* add type check switch; add unit tests
* add compiler option for type check
* resolving pr comments
* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there are not a single return annotations
* support pipeline level imagepullsecret in DSL
* use kubernetes native input parameter for imagepullsecrets
* expose a module level function to configure the pipeline settings for the current default pipeline
* add comments
* relocate functions in compiler to aggregate similar functions; move _build_conventional_artifact as a nested function
* reduce sanitize functions into one in the dsl.
* more comments
* move all sanitization(op name, param name) from dsl to compiler
* sanitize pipelineparam name and op_name; remove format check in pipelineparam
* remove unit test for pipelineparam op_name format checking
* fix bug: correctly replace input in the argument list
* fix bug: replace arguments with found ones
* Sanitize the file_output keys, Matches the param in the args/cmds with the whole serialized param str, Verify both param name and container name
* loosen the containerop and param name restrictions
* Support replacable arguments in command as well (besides arguments) in container op.
* Fix components builder.
* Fix tests.
* Follow up CR comments.
* Fix test.