* add OOB component dict and utility function
* add test
* add a transformer, which appends the component name label
* add transformer function, compiler and test
* move telemetry test
* fix none uri
* applies comments
* revert dependency on frozendict
* fixes some tests
* resolve comments
* SDK - Compiler - Fixed ParallelFor name clashes
The ParallelFor argument reference resolving was really broken.
The logic "worked" like this - of the name of the referenced output
contained the name of the loop collection source output, then it was
considered to be the reference to the loop item.
This broke many scenarios, especially cases where multiple components
had the same output name (e.g. the default "Output" output name). The
logic also did not distinguish between references to the loop collection
item and references to the loop collection source itself.
I've rewritten the argument resolving logic to fix these issues.
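A minimal sketch of the two kinds of references the rewritten resolver distinguishes (component names, image and output paths are made up for illustration):
```python
import kfp.dsl as dsl

def loop_pipeline():
    # Produces the collection that the loop iterates over.
    source = dsl.ContainerOp(
        name='produce-list',
        image='alpine',
        command=['sh', '-c', 'echo \'["a", "b", "c"]\' > /tmp/out.json'],
        file_outputs={'Output': '/tmp/out.json'},
    )
    with dsl.ParallelFor(source.outputs['Output']) as item:
        # Reference to the loop collection *item*.
        dsl.ContainerOp(name='consume-item', image='alpine',
                        command=['echo', item])
    # Reference to the loop collection *source* itself.
    dsl.ContainerOp(name='consume-list', image='alpine',
                    command=['echo', source.outputs['Output']])
```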
* Argo cannot use {{item}} when withParams items are dicts
* Stabilize the loop template names
* Renamed the test case
* SDK - Refactoring - Split the K8sHelper class
One part was only used by the container builder and provided a higher-level API over the K8s client.
The other part was used by the compiler and did not use the kubernetes library.
* Updated the license year.
* SDK - Improve errors when ContainerOp.output is unavailable
ContainerOp.output is only available when there is only one output.
Right now, when there are multiple outputs it just holds `None` instead of a task output reference.
In this case, however, it's indistinguishable from explicitly passing a `None` argument.
This PR gives a quick fix to make accessing the nonexistent `.output` a compile-time error.
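A minimal sketch of the intended behavior (op names, image, and output paths are illustrative):
```python
import kfp.dsl as dsl

def outputs_demo():
    single = dsl.ContainerOp(name='one-output', image='alpine',
                             file_outputs={'out': '/tmp/out.txt'})
    multi = dsl.ContainerOp(name='two-outputs', image='alpine',
                            file_outputs={'a': '/tmp/a.txt', 'b': '/tmp/b.txt'})
    print(single.output)        # OK: exactly one output
    print(multi.outputs['a'])   # OK: explicit output name
    # print(multi.output)       # now raises an error instead of returning None
```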
* Fixed the implementation and added tests
* Trigger retests
* SDK - Compiler - Move volumes to templates
Argo v2.3.0+ supports per-template volume specs similar to Kubernetes. Prior to version 2.3.0, Argo only supported workflow-level volume specs.
We had several outstanding issues caused by the need to put all volumes in the same place.
There was also an issue with input parameter reference placeholders in volume specifications, which were placed outside the templates that declared those inputs.
This change fixes those issues.
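For reference, a sketch of the kind of per-op volume usage this affects (the volume name, image and mount path are made up):
```python
from kubernetes import client as k8s_client
import kfp.dsl as dsl

def volume_pipeline():
    vol = k8s_client.V1Volume(
        name='work-volume',
        empty_dir=k8s_client.V1EmptyDirVolumeSource(),
    )
    op = dsl.ContainerOp(name='writer', image='alpine',
                         command=['sh', '-c', 'echo hi > /data/hi.txt'])
    # With per-template volumes, this spec can stay with its own template
    # instead of being hoisted to the workflow level.
    op.add_volume(vol)
    op.container.add_volume_mount(
        k8s_client.V1VolumeMount(name='work-volume', mount_path='/data'))
```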
* Removed dead code line
* SDK - Compiler - Allow creating portable pipelines
This change allows directly passing a PipelineConf instance to the compiler or launcher, which makes it easier to create portable pipelines: the environment-specific configuration can be passed directly to the environment-specific launcher.
Background:
PipelineConf holds all pipeline-level configuration including `op_transformers`, `image_pull_secrets` etc. Some of these are specific to a particular execution environment (e.g. a GCP secret, an Argo artifact location, or Kubernetes-specific options).
Previously, the only way to modify `PipelineConf` was to do it inside the pipeline function. That tied the pipeline function to a specific execution environment (e.g. GCP, Argo, or Kubernetes).
Solution: this change allows directly passing the PipelineConf instance to the compiler or launcher. This allows writing portable, launcher- and environment-agnostic pipeline functions. All environment-specific configuration can be moved to the launching stage.
Before:
```python
# Defining the pipeline
def my_pipeline():
    # portable pipeline code
    dsl.get_pipeline_conf().add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))

# Launching the pipeline
kfp.Client().create_run_from_pipeline_func(my_pipeline, arguments={})
```
After:
```python
# Defining the pipeline
def my_pipeline():
    pass  # portable pipeline code

# Launching the pipeline
pipeline_conf = dsl.PipelineConf()
pipeline_conf.add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))
kfp.Client().create_run_from_pipeline_func(my_pipeline, arguments={}, pipeline_conf=pipeline_conf)
```
After 2 (launching the same portable pipeline using different launchers):
```python
# Loading the portable pipeline
from portable_pipeline import my_pipeline

# Launching the pipeline on Kubeflow
pipeline_conf = dsl.PipelineConf()
pipeline_conf.add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))
kfp.Client().create_run_from_pipeline_func(my_pipeline, arguments={}, pipeline_conf=pipeline_conf)

# Launching the pipeline locally (not implemented yet)
kfp.run_pipeline_func_locally(my_pipeline, arguments={})
```
* Added parameter docstring
* SDK - Compiler - Fix large data passing
Stop outputting parameters unless they're consumed as parameters downstream.
This prevents the situation where a component outputs a big file, but the DSL compiler instructs Argo to pick it up as a parameter (parameters can only hold a few kilobytes of data).
As a byproduct, this change fixes some minor compiler data-passing bugs where some parameters were being passed around but never consumed (this happened with `ResourceOp`, `dsl.Condition` and recursion).
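A sketch of the file-based (artifact) data passing that this change keeps from being forced through parameters; the component functions here are hypothetical:
```python
from kfp.components import InputPath, OutputPath, func_to_container_op

@func_to_container_op
def produce_data(data_path: OutputPath(str)):
    with open(data_path, 'w') as f:
        f.write('x' * 10_000_000)  # far too large for an Argo parameter

@func_to_container_op
def consume_data(data_path: InputPath(str)):
    print(len(open(data_path).read()))

def big_data_pipeline():
    # The data is passed as a file (artifact), not as a parameter.
    consume_data(produce_data().output)
```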
* Replaced ... with `raise AssertionError`
* Fixed small bug
* Removed unused variables
* Fixed names of the mark_upstream_ios_of_* functions
* Fixed detection of parameter output references
* Fixed handling of volumes
* SDK - Refactoring - Replaced the ParameterMeta class with InputSpec and OutputSpec
* SDK - Refactoring - Replaced the internal PipelineMeta class with ComponentSpec
* SDK - Refactoring - Replaced the internal ComponentMeta class with ComponentSpec
* SDK - Refactoring - Replaced the *Meta classes with the *Spec classes
Replaced the ComponentMeta class with ComponentSpec
Replaced the PipelineMeta class with ComponentSpec
Replaced the ParameterMeta class with InputSpec and OutputSpec
* Removed empty fields
* first working commit
* incremental commit
* in the middle of converting loop args constructor to accept pipeline param
* both cases working
* output works, passed doesn't
* about to redo compiler section
* rewrite draft done
* added withparam tests
* removed sdk/python/comp.yaml
* minor
* subvars work
* more tests
* removed unneeded artifact outputs from test yaml
* sort keys
* removed dead artifact code
* Refactor. Expose a public API to append pipeline param without interacting with dsl.Pipeline obj.
* Add unit test and fix.
* Fix docstring.
* Fix test
* Fix test
* Fix two nit problems
* Refactor
* SDK - Added support for raw artifact values to ContainerOp
* `ContainerOp` now gets artifact arguments from the command line instead of the constructor.
* Added back input_artifact_arguments to the ContainerOp constructor.
In some scenarios it's hard to provide the artifact arguments through the `command` list when it already has resolved artifact paths.
* Exporting InputArtifactArgument from kfp.dsl
* Updated the sample
* Properly passing artifact arguments as task arguments
as opposed to default input values.
* Renamed input_artifact_arguments to artifact_arguments to reduce confusion
* Renamed InputArtifactArgument to InputArgumentPath
Also renamed input_artifact_arguments to artifact_argument_paths in the ContainerOp's constructor
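A minimal sketch of the resulting API (op name, input name and path are illustrative):
```python
import kfp.dsl as dsl

def raw_artifact_pipeline():
    dsl.ContainerOp(
        name='print-text',
        image='alpine',
        # The raw value becomes an input artifact, and the placeholder in the
        # command resolves to the path where the file is placed.
        command=['cat', dsl.InputArgumentPath(
            argument='Hello world',
            input='text',
            path='/tmp/inputs/text/data',
        )],
    )
```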
* Replaced getattr with isinstance checks.
getattr is too fragile and can be broken by renames.
* Fixed the type annotations
* Unlocked the input artifact support in components
Added the test_input_path_placeholder_with_constant_argument test
* SDK - Refactoring - Replaced the TypeMeta class
The PipelineParam no longer exposes the private TypeMeta class
Fixes #1420
This refactoring PR is part of a series of PRs that unify the metadata and specification types.
* Add PipelineConf method to set ttlSecondsAfterFinished in argo workflow spec
* remove unnecessary compile test for ttl. add unit test for ttl instead.
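Illustrative usage of the new setting inside a pipeline function (the value is arbitrary):
```python
import kfp.dsl as dsl

def ttl_pipeline():
    # Garbage-collect the finished Argo workflow after one hour.
    dsl.get_pipeline_conf().set_ttl_seconds_after_finished(3600)
```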
* Configure gcp connectors in dsl
* Make configure_gcp_connector more extensible
* Add add_pod_env op handler.
* Only apply add_pod_env on ContainerOp
* Update license header
* SDK/Compiler - Added op and template transformers
They can be used to apply some functions (e.g. to add secrets) to all pipeline ops.
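A minimal sketch of registering such a transformer (the secret name is only an example; registration through PipelineConf reflects the move described below):
```python
import kfp.dsl as dsl
from kfp.gcp import use_gcp_secret

def transformed_pipeline():
    # Applied to every op added to the pipeline.
    dsl.get_pipeline_conf().add_op_transformer(use_gcp_secret('user-gcp-sa'))
    dsl.ContainerOp(name='step', image='alpine', command=['echo', 'hello'])
```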
* Removed the template_transformers for now
* Moved the op_transformers to PipelineConf
* Added op_transformers test
* Remove the separated dictionaries for ContainerOps and ResourceOps
* Fix the sanitization performed by the compiler to iterate through the ops
dict and type-check the special fields file_outputs and
attribute_outputs
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>