* SDK - Lightweight - Convert the names of file inputs and outputs
Removing the "_path" and "_file" suffixes from the names of file inputs and outputs.
Problem: When accepting file inputs and outputs, the function inside the component receives file paths (or file streams), so it's natural to name the function parameter "something_file_path" (e.g. model_file_path or number_file_path).
But from the outside perspective, there are no files or paths - the actual data objects (or references to them) are passed in.
It looks very strange when the argument-passing code looks like this: `component(number_file_path=42)`. It reads like an error, since 42 is not a path; it's not even a string.
It's much more natural to strip the "_file" and "_path" suffixes from the names of file inputs and outputs. Then the argument-passing code looks natural: `component(number=42)`
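For illustration, a minimal sketch of the resulting naming convention (using the `InputPath`/`OutputPath` annotations described below; the function itself is hypothetical):
```python
from kfp.components import InputPath, OutputPath, func_to_container_op

# Inside the function the parameters are paths, so the suffixes are natural.
def double(number_file_path: InputPath(int), result_path: OutputPath(int)):
    with open(number_file_path) as f:
        number = int(f.read())
    with open(result_path, 'w') as f:
        f.write(str(number * 2))

double_op = func_to_container_op(double)
# From the outside the suffixes are stripped, so the caller passes data:
# double_op(number=42)
```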
* Removed the _FEATURE_STRIP_FILE_IO_NAME_PARTS feature switch
* SDK - Moved the _container_builder from kfp.compiler to kfp.containers
This only moves the files. The imports remain the same for now.
* Simplified the imports.
Lightweight components now allow a function to mark some outputs that it wants to produce by writing data to files instead of returning them as in-memory data objects.
This is useful when the data is expected to be big.
Example 1 (writing a big amount of data to an output file at a provided path):
```python
@func_to_container_op
def write_big_data(big_file_path: OutputPath(str)):
    with open(big_file_path, 'w') as big_file:
        for i in range(1000000):
            big_file.write('Hello world\n')
```
Example 2 (writing a big amount of data to a provided output file stream):
```python
@func_to_container_op
def write_big_data(big_file: OutputTextFile(str)):
    for i in range(1000000):
        big_file.write('Hello world\n')
```
Lightweight components now allow a function to mark some inputs that it wants to consume as files rather than as in-memory data objects.
This is useful when the data is expected to be big.
Example 1:
```python
def consume_big_file_path(big_file_path: InputPath(str)) -> int:
    line_count = 0
    with open(big_file_path) as f:
        while f.readline():
            line_count = line_count + 1
    return line_count
```
Example 2:
```python
def consume_big_file(big_file: InputTextFile(str)) -> int:
    line_count = 0
    while big_file.readline():
        line_count = line_count + 1
    return line_count
```
* SDK - Tests - Added better helper functions for testing python components
* SDK - Python components - Properly serializing outputs
Background:
Component arguments are already properly serialized when calling the component program and then deserialized before the execution of the component function.
But the component outputs were only serialized using `str()`, which is inadequate for data types like lists or dictionaries.
This commit fixes the mismatch: the outputs are now serialized the same way as arguments and default values.
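For example (a hypothetical component output; with this change a dict output becomes valid JSON rather than Python's repr):
```python
from typing import NamedTuple

def produce_metrics() -> NamedTuple('Outputs', [('metrics', dict)]):
    # Now serialized as '{"accuracy": 0.9}' (JSON)
    # instead of "{'accuracy': 0.9}" (str()).
    return ({'accuracy': 0.9},)
```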
* SDK - Compiler - Fix large data passing
Stop outputting parameters unless they're consumed as parameters downstream.
This prevents the situation where a component outputs a big file but the DSL compiler instructs Argo to pick it up as a parameter (parameters can only hold a few kilobytes of data).
As a byproduct, this change fixes some minor compiler data-passing bugs where some parameters were passed around but never consumed (this happened with `ResourceOp`, `dsl.Condition` and recursion).
* Replaced ... with `raise AssertionError`
* Fixed small bug
* Removed unused variables
* Fixed names of the mark_upstream_ios_of_* functions
* Fixed detection of parameter output references
* Fixed handling of volumes
Two PRs have been merged that turned out to be slightly incompatible. This PR fixes the failing tests.
Root causes:
* The pipeline parameter default values were not properly serialized when constructing the metadata object.
* The `ParameterMeta` class did not validate the default value type, so the lack of serialization has not been caught. The `ParameterMeta` was replaced by `InputSpec` which has strict type validation.
* Previously we did not have samples with complex pipeline parameter default values (e.g. lists) that could trigger the failures. Then two samples were added that had complex default values.
* Travis does not re-run tests before merging
* Prow does not re-run Travis tests before merging
Currently, the parameter output values are not saved to storage, and they are lost as soon as the garbage collector removes the workflow object.
This change makes it so the parameter output values are persisted.
* SDK - Refactoring - Replaced the ParameterMeta class with InputSpec and OutputSpec
* SDK - Refactoring - Replaced the internal PipelineMeta class with ComponentSpec
* SDK - Refactoring - Replaced the internal ComponentMeta class with ComponentSpec
* SDK - Refactoring - Replaced the *Meta classes with the *Spec classes
Replaced the ComponentMeta class with ComponentSpec
Replaced the PipelineMeta class with ComponentSpec
Replaced the ParameterMeta class with InputSpec and OutputSpec
* Removed empty fields
* first working commit
* incremental commit
* in the middle of converting loop args constructor to accept pipeline param
* both cases working
* output works, passed doesn't
* about to redo compiler section
* rewrite draft done
* added withparam tests
* removed sdk/python/comp.yaml
* minor
* subvars work
* more tests
* removed unneeded artifact outputs from test yaml
* sort keys
* removed dead artifact code
* Refactor. Expose a public API to append pipeline param without interacting with dsl.Pipeline obj.
* Add unit test and fix.
* Fix docstring.
* Fix test
* Fix test
* Fix two nit problems
* Refactor
* SDK - Testing - Run some unit-tests in a more correct way
Replaced `@unittest.expectedFailure` with `with self.assertRaises(...):`.
Replaced `assert` with `self.assertEqual(...)`.
Stopped producing the stray "comp.yaml" file.
Enabled the test_load_component_from_url test.
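A minimal sketch of the pattern change (the test body is a placeholder):
```python
import unittest

class ComponentTestCase(unittest.TestCase):
    def test_rejects_bad_spec(self):
        # Instead of decorating the whole test with @unittest.expectedFailure,
        # assert on the specific exception:
        with self.assertRaises(ValueError):
            raise ValueError('invalid component spec')  # placeholder for the real call
```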
* Removed a stray comment
* Added two tests for output_component_file
* SDK - Containers - Build container image from current environment
* Removed the ability to capture the active python environment (as requested by @hongye-sun)
* Added the type hint and docstring for the return type.
* Renamed `build_image_from_env` function to `build_image_from_working_dir`
as requested by @hongye-sun
* Explained the function behavior in the documentation.
* Removed extra empty line
* Improved caching by copying python files only after installing python packages
* Made test more portable
* Added support for specifying the base_image
`kfp.containers.default_base_image = ...`
The image can also be a callable returning the image name.
* Renamed `get_python_image` to `get_python_image_for_current_version`
* Switched the default base image to Google Deep Learning container image as requested by @hongye-sun
The size of this image is 4.35GB which really concerns me. The GPU image size is 6.45GB.
* Stopped importing kfp.containers.* into kfp.*
* Fixed test
* Fixed the regex string
* Fixed the type annotation style
* Addressed @hongye-sun feedback
* Removed the container image size warning
* Fixed import failure
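Typical usage might look like this sketch, based on the function and setting named above (exact signatures may differ):
```python
import kfp.containers

# Optionally override the default base image (may also be a callable):
kfp.containers.default_base_image = 'gcr.io/deeplearning-platform-release/base-cpu'

# Build an image from the current working directory and get its name back.
image_name = kfp.containers.build_image_from_working_dir()
```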
* Explicitly added mlpipeline outputs to the components that actually produce them
* Updated samples
* SDK - DSL - Stopped adding mlpipeline artifacts to every compiled template
Fixes https://github.com/kubeflow/pipelines/issues/1421
Fixes https://github.com/kubeflow/pipelines/issues/1422
* Updated the Lightweight sample
* Updated the compiler tests
* Fixed the lightweight sample
* Reverted the change to one contrib/samples/openvino
The sample will still work fine as it is now.
I'll add the change to that file as a separate PR.
If no `name` is provided to PipelineVolume constructor, a custom name is
generated. It relies on `json.dumps()` of the struct after getting
converted to a dict.
When `pvc` is provided and `name` is not, the following error is raised:
TypeError: Object of type PipelineParam is not JSON serializable
This commit fixes it and extends tests to catch it.
* SDK - Switching python container components to Lightweight components code generator
* Fixed the tests
Had to remove the python2 test since python2 code generation is going away (python2 is near its end of life and Kubeflow Pipelines only supports python 3.5+).
* Added description for the internal add_files parameter
* Fixed typo
* Removed the `test_func_to_entrypoint` test
This was proposed by @gaoning777: `_func_to_entrypoint` is now just a reference to `_func_to_component_spec` which is extensively covered by other tests.
* SDK - Components - Added type to TaskOutputReference
Now the task output references taken from TaskSpec instances can be
type-checked when passed to components.
* Renamed TypeType to TypeSpecType
Problem: When the user loads a component using the load_component function, the object they get back is a task factory function. Since it's a normal function object, the user cannot inspect the attributes of the component they just loaded (they can only see the name, description and input names). For example, the user cannot see the list of component outputs, the annotations, etc.
This change fixes the issue by adding the original component properties to the function object.
Example usage:
```python
train_op = load_component_from_url(...)
print(train_op.outputs)
```
* SDK - Added support for raw artifact values to ContainerOp
* `ContainerOp` now gets artifact arguments from the command line instead of the constructor.
* Added back input_artifact_arguments to the ContainerOp constructor.
In some scenarios it's hard to provide the artifact arguments through the `command` list when it already has resolved artifact paths.
* Exporting InputArtifactArgument from kfp.dsl
* Updated the sample
* Properly passing artifact arguments as task arguments
as opposed to default input values.
* Renamed input_artifact_arguments to artifact_arguments to reduce confusion
* Renamed InputArtifactArgument to InputArgumentPath
Also renamed input_artifact_arguments to artifact_argument_paths in the ContainerOp's constructor
* Replaced getattr with isinstance checks.
getattr is too fragile and can be broken by renames.
* Fixed the type annotations
* Unlocked the input artifact support in components
Added the test_input_path_placeholder_with_constant_argument test
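A sketch of passing a raw artifact value through the command line (image and argument are illustrative):
```python
import kfp.dsl as dsl

op = dsl.ContainerOp(
    name='consume-text',
    image='alpine:3.9',
    command=['cat'],
    # The raw value is materialized as a file and its path is substituted here.
    arguments=[dsl.InputArgumentPath('Hello world')],
)
```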
* SDK - Components - Improved serialization and deserialization of arguments and defaults
Properly serialize default values and passed arguments using the same code.
Check the types of passed argument values and issue warnings.
Improved argument reference type compatibility checking. When types do not match, there is always either an error or a warning.
When creating component from python function, the input types are now canonicalized.
* Addressed the feedback
* SDK - Refactoring - Replaced the TypeMeta class
The PipelineParam no longer exposes the private TypeMeta class
Fixes #1420
This refactoring PR is part of a series of PRs that unify the metadata and specification types.
* Fix bug where delete resource op should not have success_condition, failure_condition, and output parameters
* remove unnecessary whitespace
* compiler test for delete resource ops should retrieve templates from spec instead of root
* Collecting coverage when running python tests
* Added coveralls to python unit tests
* Try removing the PATH modification
* Specifying coverage run --source
* Using the installed package
* Try getting the correct coverage paths
* Lint Python code for undefined names
* Lint Python code for undefined names
* Exclude tfdv.py to work around an overzealous pytest
* Fixup for tfdv.py
* Fixup for tfdv.py
* Fixup for tfdv.py
* SDK - Refactoring - Serialized PipelineParam does not need type
Only the types in non-serialized PipelineParams are ever used.
* SDK - Refactoring - Serialized PipelineParam does not need value
Default values are only relevant when a PipelineParam is used in the pipeline function signature, and even in this case the compiler captures them explicitly from the PipelineParam objects in the signature.
There are no other uses for them.
* avoid istio injector in the container builder
* find the correct namespace
* configure default ns to kubeflow if out of cluster; fix unit tests
* container build default gcs bucket
* resolve comments
* code refactor; add create_bucket_if_not_exist in containerbuilder
* support loading kube config and outputting errors; useful for AI Platform notebooks and local notebooks
* remove create_bucket_if_not_exist param
* SDK - Containers - Returning image name with digest
Image building functions now return image name with digest: image_repo@sha256:digest
Fixes https://github.com/kubeflow/pipelines/issues/1715
* Added comments
* Remove redundant import.
* Simplify sample_test.yaml by using withItem syntax.
* Simplify sample_test.yaml by using withItem syntax.
* Change dict to str in withItems.
* Add image pull secret sample.
* Move imagepullsecret sample from test dir to sample dir. Waiting on corresponding unit test infra refactoring.
* Update the location of imagepullsecrets so that it can serve as an example.
* Add minimal comments documenting usage.
* Add preemptible gpu tpu sample and unittest
* Update a test utility function.
* Separate the location of the sample and golden .yaml files for testing purposes.
* Added support for multiple outputs
* Added test for multiple output
* Adding sample for multiple outputs
* func_signature now shorter form
* Added parameters tag
* Fixed func_signature mistake
* refactor component build code
* remove unnecessary import
* minor changes
* fix unit tests
* separate the container build from the component build; add support for directories in the containerbuilder
* minor fixes
* fix unit test
* fix tarball error
* revert changes
* unit test fix
* minor fix
* addressing comments
* removing the check_gcs_path function
* move namespace to the constructor of containerbuilder
* fix bugs
* refactor component build code
* remove unnecessary import
* minor changes
* fix unit tests
* fix sample test bug
* revert the change of the dependency orders
* add -u to disable python stdout buffering
* address the comments
* separate lines to look clean
* fix unit tests
* fix
* SDK - Lightweight - Added support for "None" default values
Previously it was impossible to pass None to components since it was being converted to the string "None".
* is_required = not input.optional for now
As asked by @gaoning777
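A minimal sketch of a component with a None default (the function is hypothetical):
```python
from kfp.components import func_to_container_op

def greet(name: str = None) -> str:
    # None now stays None instead of arriving as the string "None".
    return 'Hello, %s!' % (name or 'stranger')

greet_op = func_to_container_op(greet)
```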
* Add PipelineConf method to set ttlSecondsAfterFinished in argo workflow spec
* remove unnecessary compile test for ttl. add unit test for ttl instead.
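Usage might look like this sketch (the pipeline body is hypothetical):
```python
import kfp.dsl as dsl

@dsl.pipeline(name='ttl-demo')
def ttl_pipeline():
    # Completed workflow objects are garbage-collected after one hour.
    dsl.get_pipeline_conf().set_ttl_seconds_after_finished(3600)
```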
* Configure gcp connectors in dsl
* Make configure_gcp_connector more extensible
* Add add_pod_env op handler.
* Only apply add_pod_env on ContainerOp
* Update license header
I've introduced code pickling to capture dependencies in https://github.com/kubeflow/pipelines/pull/1372
Later I discovered that there is a serious opcode incompatibility between python versions 3.5 and 3.6+. See my analysis of the issue: https://github.com/cloudpipe/cloudpickle/issues/293
Due to this issue I decided to switch back to using source code copying by default and to continue improving it.
Until we stop supporting python 3.5 (https://github.com/kubeflow/pipelines/pull/668) it's too dangerous to use code pickling by default.
Code pickling can be enabled by specifying `pickle_code=True` when calling `func_to_container_op`
* SDK/DSL: Make 'name' argument of a PipelineVolume omittable
Also remove unused imports from _pipeline_volume module
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* Use hashlib.sha256() instead of id()
* Fix not maintaining provided name
* SDK - Controlling which modules are captured with Lightweight components
All func_to_* functions now accept the modules_to_capture parameter: a list of module names that will be captured (instead of just referenced) during the dependency scan. By default, func.__module__ is captured.
* Described the behavior more in depth.
* Added a test to check that only dependencies are captured
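For example (assuming a hypothetical local helper module `my_helpers`):
```python
from kfp.components import func_to_container_op
import my_helpers  # hypothetical local module

def score_model(data_path: str) -> float:
    return my_helpers.compute_score(data_path)

# Capture my_helpers' source along with the function instead of merely
# referencing it (which would require the module inside the container).
score_op = func_to_container_op(score_model, modules_to_capture=['my_helpers'])
```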
* Frontend - Show customized task display names
* Added customized name test
* Added ContainerOp.set_display_name(name) method
* Stopped writing human_name to display_name annotation for now
Reason: It's a change to existing pipelines.
* Added test for op.set_display_name
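Sketch of the new method (the task invocation is hypothetical):
```python
task = train_op(learning_rate=0.1)  # hypothetical component invocation
task.set_display_name('Train the model')
```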
* Fix for tests that have workflows with status nodes, but without any spec or templates
* Fixed the test workflow
* Fix linter error
Error: "The key 'metadata' is not sorted alphabetically"
* add default value type checking
* add jsonschema dependency
* fix unit test error
* workaround for travis python package installation
* add back jsonschema version
* fix sample test error in type checking sample
* add jsonschema in requirements such that sphinx works fine
* Transitively capturing code dependencies
Using cloudpickle.
* Got rid of func_type_declarations_code variable
* Extracted the function code extraction functions
* Improved support for capturing module-level dependencies
* Added test for capturing module-level dependencies
* Removed the _capture_function_code_using_source_copy function
As requested by Ning
* SDK/Compiler - Added op and template transformers
They can be used to apply some functions (e.g. to add secrets) to all pipeline ops.
* Removed the template_transformers for now
* Moved the op_transformers to PipelineConf
* Added op_transformers test
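A sketch of an op transformer (the label is illustrative; the method name on PipelineConf is assumed to be add_op_transformer):
```python
import kfp.dsl as dsl

def add_team_label(op):
    # Applied to every op in the pipeline at compile time.
    return op.add_pod_label('team', 'ml-platform')

@dsl.pipeline(name='transformer-demo')
def my_pipeline():
    dsl.get_pipeline_conf().add_op_transformer(add_team_label)
```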
* SDK: Create BaseOp class
* BaseOp class is the base class for any Argo Template type
* ContainerOp derives from BaseOp
* Rename dependent_names to deps
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: In preparation for the new feature ResourceOps (#801)
* Add cops attributes to Pipeline. This is a dict having all the
ContainerOps of the pipeline.
* Set some processing in _op_to_template as ContainerOp specific
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Simplify the consumption of Volumes by ContainerOps
Add `pvolumes` argument and attribute to ContainerOp. It is a dict
having mount paths as keys and V1Volumes as values. These are added to
the pipeline and mounted by the container of the ContainerOp.
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Add ResourceOp
* ResourceOp is the SDK's equivalent for Argo's resource template
* Add rops attribute to Pipeline: Dictionary containing ResourceOps
* Extend _op_to_template to produce the template for ResourceOps
* Use processed_op instead of op everywhere in _op_to_template()
* Add samples/resourceop/resourceop_basic.py
* Add tests/dsl/resource_op_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
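A minimal sketch of the new op (the ConfigMap manifest is illustrative):
```python
import kfp.dsl as dsl

configmap = {
    'apiVersion': 'v1',
    'kind': 'ConfigMap',
    'metadata': {'name': 'my-config'},
    'data': {'key': 'value'},
}

rop = dsl.ResourceOp(
    name='create-my-config',
    k8s_resource=configmap,
    action='create',
)
```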
* SDK: Simplify the creation of PersistentVolumeClaim instances
* Add VolumeOp: A specified ResourceOp for PVC creation
* Add samples/resourceops/volumeop_basic.py
* Add tests/dsl/volume_op_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Emit a V1Volume as `.volume` from dsl.VolumeOp
* Extend VolumeOp so it outputs a `.volume` attribute ready to be
consumed by the `pvolumes` argument to ContainerOp's constructor
* Update samples/resourceop/volumeop_basic.py
* Extend tests/dsl/volume_op_tests.py
* Update tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
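Together with the `pvolumes` argument above, usage might look like this sketch:
```python
import kfp.dsl as dsl

vop = dsl.VolumeOp(
    name='create-volume',
    resource_name='my-pvc',
    size='1Gi',
)

step = dsl.ContainerOp(
    name='producer',
    image='library/bash:4.4.23',
    command=['sh', '-c', 'echo hello > /mnt/file'],
    # Mount the emitted volume; the dependency on vop becomes implicit.
    pvolumes={'/mnt': vop.volume},
)
```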
* SDK: Add PipelineVolume
* PipelineVolume inherits from V1Volume and it comes with its own set of
KFP-specific dependencies. It is aligned with how PipelineParam
instances are used. I.e. consuming a PipelineVolume leads to implicit
dependencies without the user having to call the `.after()` method on
a ContainerOp.
* PipelineVolume comes with its own `.after()` method, which can be used
to append extra dependencies to the instance.
* Extend ContainerOp to handle PipelineVolume deps
* Set `.volume` attribute of VolumeOp to be a PipelineVolume instead
* Add samples/resourceops/volumeop_{parallel,dag,sequential}.py
* Fix tests/dsl/volume_op_tests.py
* Add tests/dsl/pipeline_volume_tests.py
* Extend tests/compiler/compiler_tests.py
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* SDK: Simplify the creation of VolumeSnapshot instances
* VolumeSnapshotOp: A specified ResourceOp for VolumeSnapshot creation
* Add samples/resourceops/volume_snapshotop_{sequential,rokurl}.py
* Add tests/dsl/volume_snapshotop_tests.py
* Extend tests/compiler/compiler_tests.py
NOTE: VolumeSnapshots is an Alpha feature at the time of this commit.
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* Extend UI for the ResourceOp and Volumes feature of the Compiler
* Add VolumeMounts tab/entry (Run/Pipeline view)
* Add Manifest tab/entry (Run/Pipeline view)
* Add & Extend tests
* Update tests snapshot files
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
* Cleaning up the diff (before moving things back)
* Renamed op.deps back to op.dependent_names
* Moved Container, Sidecar and BaseOp classes back to _container_op.py
This way the diff is much smaller and more understandable. We can always split or refactor the file later. Refactorings should not be mixed with genuine changes.
* SDK - Simplified the @pipeline decorator
Moved metadata-related code to _metadata.
`Pipeline.get_pipeline_functions` now returns the list of pipeline functions.
* Addressed @gaoning777's PR feedback
* remove the graph component output; add support for dependency on graph component
* fix bug; adjust unit tests
* add support for explicit dependency of graph component
* adjust unit test
* add a todo
* bug fixes for unit tests
* refactor condition_param code; fix bug when the input's task name is None; need to remove the print later
* do not pass condition param as arguments to downstream ops, remove print logs; add unit tests
* add unit test golden yaml
* fix bug
* fix the sample
* Feature: sidecar for ContainerOp
* replace f-string with string format for compatibility with py3.5
* ContainerOp can now be updated with any k8s V1Container attributes, as well as with sidecars via the Sidecar class. ContainerOp accepts PipelineParam in any valid k8s property.
* WIP: fix conflicts and bugs with recent master. TODO: more complex template with pipeline params
* fix proxy args
* Fixed to work with latest master head
* Added container_kwargs to ContainerOp to pass in k8s container kwargs
* Fix comment bug, updated with example in ContainerOp docstring
* fix copyright year
* expose match_serialized_pipelineparam as public for compiler to process serialized pipeline params
* fixed pydoc example and removed unnecessary ContainerOp.container.parent
* Fix conflicts in compiler tests
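A minimal sketch of attaching a sidecar (image and command are illustrative):
```python
import kfp.dsl as dsl

op = dsl.ContainerOp(
    name='main-step',
    image='alpine:3.9',
    command=['sh', '-c', 'echo hello'],
)
# The sidecar runs alongside the main container of this step.
op.add_sidecar(dsl.Sidecar(name='helper', image='nginx:1.17'))
```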
* dsl generate zip file
* minor fix
* fix zip read in the unit test
* update sample tests
* dsl compiler generates the pipeline package format based on the output file name suffix
* add unit tests for different output format
* update the sdk client to support tar zip and yaml
* fix typo
* fix file write
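Sketch of the suffix-based output formats (assuming `my_pipeline` is a `@dsl.pipeline` function):
```python
import kfp.compiler as compiler

compiler.Compiler().compile(my_pipeline, 'my_pipeline.yaml')    # plain YAML
compiler.Compiler().compile(my_pipeline, 'my_pipeline.tar.gz')  # gzipped tarball
compiler.Compiler().compile(my_pipeline, 'my_pipeline.zip')     # zip archive
```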
* add a While in the ops group
* deepcopy the while conditions when entering and exiting
* add while condition resolution in the compiler
* define graph component decorator
* remove while loop related codes
* fixes
* remove while loop related code
* fix bugs
* generate a unique ops group name and make it retrievable by name
* resolve the opsgroups inputs and dependencies based on the pipelineparam in the condition
* add a recursive ops_groups
* fix bugs of the recursive opsgroup template name
* resolve the recursive template name and arguments
* add validity checks
* add more comments
* add usage comment in graph_component
* add unit test for the graph opsgraph
* refactor the opsgroup
* add unit test for the graph_component decorator
* exposing graph_component decorator
* add recursive compiler unit tests
* fix the bug of opsgroup name
adjust the graph_component usage example
fix index bugs
use with statement in the graph_component instead of directly calling
the enter/exit functions
* add a todo to combine the graph_component and component decorators
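A sketch of the recursion enabled by the decorator (`flip_coin_op` is a hypothetical component):
```python
import kfp.dsl as dsl

@dsl.graph_component
def flip_until_heads(flip_result):
    with dsl.Condition(flip_result == 'tails'):
        next_flip = flip_coin_op()          # hypothetical coin-flip component
        flip_until_heads(next_flip.output)  # recursive graph reference
```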
Simplified the type compatibility tests by removing component inputs/outputs and properties that are not required for the tests to work.
Renamed (+split) some of the tests:
`test_type_check_all_with_types` -> `test_type_compatibility_check_for_simple_types` + `test_type_compatibility_check_for_types_with_parameters`
`test_type_check_with_lacking_types` -> `test_type_compatibility_check_when_input_type_is_missing` + `test_type_compatibility_check_when_argument_type_is_missing`
`test_type_check_with_inconsistent_types_property_value` -> `test_fail_type_compatibility_check_when_type_property_value_is_different`
`test_type_check_with_inconsistent_types_type_name` -> `test_fail_type_compatibility_check_when_simple_type_name_is_different` + `test_fail_type_compatibility_check_when_parametrized_type_name_is_different`
`test_type_check_with_consistent_types_nonnamed_inputs` -> `test_type_compatibility_check_when_using_positional_arguments`
`test_type_check_with_inconsistent_types_disabled` -> `test_type_compatibility_check_not_failing_when_disabled`
`test_type_check_with_openapi_shema` -> `test_type_compatibility_check_for_types_with_schema`
`test_type_check_ignore_type` -> `test_fail_type_compatibility_check_for_types_with_different_schemas` + `test_type_compatibility_check_not_failing_when_type_is_ignored`
Added two disabled tests:
`test_type_compatibility_check_when_argument_type_has_extra_type_parameters`
`test_fail_type_compatibility_check_when_argument_type_has_missing_type_parameters`
* remove GCSPath fields to avoid artifact type confusion
change the type json schema field name to openAPIV3Schema
* fix unit tests; add unit test for openapishema property
* add ignore_type in pipelineparam
* change the names in the artifact types to avoid confusion with the parameter types
* based on the google python style guide, change the camel case to lower case with underscores
* add unit test to the pipelineparam with types
* create TypeMeta deserialize function, add comments
* strongly typed pipelineparamtuple
* addressing pr comments
* fix bug: op_to_template resolves the raw arguments by mapping to the argument_inputs, but the argument_inputs lost the type information
* fix type pattern matching
* convert orderedDict to dict from the component module
* add core types and type checking function
* fix unit test bug
* avoid defining dynamic classes
* typo fix
* add component metadata format
* add a construct for the component decorator
* add default values for the meta classes
* add input/output types to the metadata
* add from_dict in TypeMeta
* small fix
* add unit tests
* use python struct for the openapi schema
* add default in parameter
* add default value
* remove the str restriction for the param default
* bug fix
* add pipelinemeta
* add pipeline metadata
* ignore annotation if it is not str/BaseType/dict
* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability
* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name
* pass metadata from component decorator and task factory to containerOp
* pass pipeline metadata to Pipeline
* fix unit test
* typo in the comments
* move the metadata classes to a separate module
* fix unit test
* small change
* add __eq__ to meta classes
not export _metadata classes
* nothing
* fix unit test
* unit test python component
* unit test python pipeline
* fix bug: duplicate variable of args
* fix unit tests
* move python_component and _component decorator in _component file
* remove the print
* change parameter default value to None
* add functools wraps around _component decorator
* TypeMeta accept both str and dict
* fix indent, add unit test for type as strings
* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta
* add type check in task factory
* output error message
* add type check in component decorator; move the metadata assignment out of the containerop __init__ function
* fix bug; add unit test
* add more unit tests
* more unit tests; fix bugs
* more unit tests; fix bugs
* add unit tests
* more unit tests
* add type check switch; add unit tests
* add compiler option for type check
* resolving pr comments
* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there is not a single return annotation
The zip-packed components are supported in all load_component APIs:
`kfp.components.load_component`
`kfp.components.load_component_from_file`
`kfp.components.load_component_from_url`
`kfp.components.ComponentStore.load_component`
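For example (the file name and URL are illustrative):
```python
import kfp.components as comp

add_op = comp.load_component_from_file('add_component.zip')
sub_op = comp.load_component_from_url('https://example.com/sub_component.zip')
```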
When the DSL bridge code was written, ContainerOp did not support env, so we did not pass it. Now we're adding the passing code.
Added a test that checks that the env variables reach the ContainerOp.
* extract the pipelineparam deserialize function
* typo fix
* adjust extract_param to accept a list of strings
* convert arg to string
* bug fix, add unit tests
* component build support for both python2 and python3
* add sample test
* remove the annotations for python2 component build
* add pathlib for python2 component build
* fix component build unit test
* fix bug in the dockerfile generator
* remove exist_ok in path.mkdir to make python2 compatible
* adjust unit test
* remove pathlib dependency for python2 component build
* remove the pathlib codes in python3 component build, but use python2 code instead; add a todo to create a new sample
* support pipeline level imagepullsecret in DSL
* use kubernetes native input parameter for imagepullsecrets
* expose a module level function to configure the pipeline settings for the current default pipeline
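A sketch of the pipeline-level setting (the secret name is illustrative; the PipelineConf method is assumed to be set_image_pull_secrets):
```python
import kfp.dsl as dsl
from kubernetes import client as k8s_client

@dsl.pipeline(name='imagepullsecret-demo')
def my_pipeline():
    dsl.get_pipeline_conf().set_image_pull_secrets(
        [k8s_client.V1LocalObjectReference(name='my-registry-secret')])
```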
Ultimately, the command line is an array of strings. Component yaml files should have the arguments as strings, instead of the Python SDK doing the conversion sometimes.
* Reworked the Component structures.
Rewrote parsing, type checking and serialization code.
Improved the graph component structures.
Added most of the needed k8s structures.
Added model validation (input/output existence etc).
Added task cycle detection and topological sorting to GraphSpec.
All container component tests now work.
Added some graph component tests.
* Fixed incompatibilities with python <3.7
* Added __init__.py to make the Travis tests work.
* Adding kubernetes structures to setup.py
* Addressed PR feedback: Renamed _original_names to _serialized_names
* Addressed PR feedback: Reduced indentation.
* Added descriptions for all component structures.
* Fixed a bug in ComponentSpec._post_init()
* Added documentation for ModelBase class and functions.
* Added __eq__/__ne__ and improved __repr__
* Added ModelBase tests
* add comments
* relocate functions in compiler to aggregate similar functions; move _build_conventional_artifact as a nested function
* reduce sanitize functions into one in the dsl.
* more comments
* move all sanitization(op name, param name) from dsl to compiler
* sanitize pipelineparam name and op_name; remove format check in pipelineparam
* remove unit test for pipelineparam op_name format checking
* fix bug: correctly replace input in the argument list
* fix bug: replace arguments with found ones
* Sanitize the file_output keys, match the params in the args/cmds against the whole serialized param string, and verify both the param name and the container name
* loosen the containerop and param name restrictions
* Support replaceable arguments in command as well (besides arguments) in container op.
* Fix components builder.
* Fix tests.
* Follow up CR comments.
* Fix test.
* Now the pipeline function takes direct default values rather than dsl.PipelineParam. It simplifies the sample code a lot.
* Remove extraneous parenthesis.
* Follow up CR comments.
* Change Dockerfile (not done).
* Fix dockerfile.
* Fix Dockerfile again.
* Remove unneeded installation of packages in Dockerfile.
* Add support for nvidia gpu limit
* Expose resource limits, requests and nodeSelector to ContainerOp
* Fix test data
* Add explicit set_gpu_limit function
* Fix logical bug
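Sketch of the new resource settings (the accelerator label and value are illustrative):
```python
import kfp.dsl as dsl

op = dsl.ContainerOp(
    name='train',
    image='tensorflow/tensorflow:1.12.0-gpu',
    command=['python', '-m', 'trainer.task'],
)
op.set_gpu_limit(1)  # requests nvidia.com/gpu by default
op.add_node_selector_constraint(
    'cloud.google.com/gke-accelerator', 'nvidia-tesla-k80')
```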
* Fixed compilation of dsl.Conditional
The compiler no longer produces intermediate steps.
* Got rid of _create_new_groups
* Changed the sub_group.type check
* Update frontend handling of graphs (#293)
* Updates the frontend to correctly parse the new format of conditional pipelines
* WIP - Assume tasks and templates don't share names
* Greatly simplifies graphing of conditional and non-conditional pipelines
* Adds/updates StaticParser tests
* Give nodes unique names
* add requirement to component image build
* add unit test for DependencyVersion
* add unit tests, fix bugs
* revert some unit test change
* include the package name inside the VersionedDependency, enable the unit test for the component builder
* disable the component image build, this is done in another PR
* update the version name to dependency
* delete unneeded tests and add more useful unit tests
* add unit tests
* fix some unit tests
* fix bug in DependencyHelper
* SDK/Components/Python - Removed python_op in favor of python_component
* Added test for python_component + func_to_container_op.
* dsl.PythonComponent class is removed, because the metadata is now properly stored in the function object itself.
* build_python_component now optionally accepts base_image and no longer requires the decorator (but can use decorator's metadata).
* Made staging_gcs_path optional since it's only needed when build_image == True
* Added more validation
* Added description and parameter help to python_component
* Fixed the pipeline path string after merge.
* Addressed the PR comments.
Added more detailed explanation of base_image selection to the docstring.
* Removed extra empty lines
* [WIP] change deployment platform to gcp
* debug
* revert test
* add volume
* update test
* to list
* fix
* to list
* to list
* to list
* to list
* stage
* update
* update
* Undid style changes
* address comments
* update comments
* Fixed string boolean handling in if condition
* Fixed bug in isPresent
* Fixed list expansion when an item expands to a list
* Renamed two tests
* Fixed resolving primitive types (yaml supports and decodes them)
* Added test that checks handling arguments of all yaml types
* Added tests for handling true and false boolean and string literals in conditional expressions
* Fix tfx name bug in the tfma sample test (#67)
* fix tfx name bug
* update release build for the data publish
* Improve temporary file handling in python op tests
* More small temp path fixes
* Switching to map-based syntax for the arguments.
The list-based syntax is going to be deprecated.
* Switched to map-style arguments in _func_to_component_spec
* Updated testdata