44 Commits

Author SHA1 Message Date
Alexey Volkov fe30d5462a
SDK - Components - Calculate component hash digest (#3726)
* SDK - Components - Calculate component hash digest

The digest is calculated when loading the component from URL, file or text.
Slightly refactored component loading - streams are no longer used, only bytes.
TODO: Calculate the digest if missing
TODO: Report possible digest conflicts

* Updated the test graph component

* Using the actual digest in the test
2020-05-12 18:24:26 -07:00
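A minimal sketch of the idea in #3726, assuming SHA-256 as the hash function (the commit message does not name the algorithm). The names `compute_component_digest` and `load_component_text` are hypothetical and are not the SDK's API. The sketch only illustrates why working on bytes keeps the digest consistent across sources: text is encoded once, and every source ends up hashing the same byte string.

```python
import hashlib


def compute_component_digest(component_bytes: bytes) -> str:
    """Return the hex digest of the raw component bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(component_bytes).hexdigest()


def load_component_text(text: str) -> dict:
    """Hypothetical loader: encode text to bytes, then hash those bytes."""
    data = text.encode("utf-8")
    return {"data": data, "digest": compute_component_digest(data)}


component = load_component_text("name: Example component\n")
print(component["digest"])
```

Loading the same content from a URL or a file would hash the same bytes, so the digest would match regardless of the source.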
Alexey Volkov e43a011033
SDK - Components - Split load_component functions into loading the spec and creating task factory (#3614)
The PR is a refactoring.
Split all load_component* methods in _components and _component_store into _load_component_spec* and creating task factory from that spec.
This makes it easier to load the spec without having to create task factory functions.
2020-04-29 14:30:18 -07:00
Alexey Volkov ca4fe85311
SDK - Components - Fixed bug in loading input-less graph components (#3446) 2020-04-06 14:47:47 -07:00
Alexey Volkov 7ee3244f5b
SDK - Components - Fixed dict-style type annotations (#3107)
Refactored `_data_passing.py` interface to expose functions instead of dictionaries.
2020-02-18 20:40:25 -08:00
Alexey Volkov c83aff2738
SDK - Components - Made it easier to access component spec classes (#2860)
* SDK - Components - Made it easier to access component spec classes

* Updated the imports
2020-01-31 11:41:21 -08:00
Alexey Volkov 2d9f2524c1 SDK - Components refactoring (#2865)
* SDK - Components refactoring

This change is a pure refactoring of the implementation of component task creation.
For pipelines compiled using the DSL compiler (the compile() function or the command-line program) nothing should change.

The main goal of the refactoring is to change the way the component instantiation can be customized.
Previously, the flow was like this:

`ComponentSpec` + arguments --> `TaskSpec` --resolving+transform--> `ContainerOp`

This PR changes it to a more direct path:

`ComponentSpec` + arguments --constructor--> `ContainerOp`
or
`ComponentSpec` + arguments --constructor--> `TaskSpec`
or
`ComponentSpec` + arguments --constructor--> `SomeCustomTask`

The original approach where the flow always passes through `TaskSpec` had some issues since TaskSpec only accepts string arguments (and two
other reference classes). This made it harder to handle custom types of arguments like PipelineParam or Channel.

Low-level refactoring changes:

Resolving of command-line argument placeholders has been extracted into a function usable by different task constructors.

Changed `_components._created_task_transformation_handler` to `_components._container_task_constructor`. Previously, the handler was receiving a `TaskSpec` instance. Now it receives `ComponentSpec` + arguments [+ `ComponentReference`].
Moved the `ContainerOp` construction handler setup to the `kfp.dsl.Pipeline` context class as planned.
Extracted `TaskSpec` creation to `_components._create_task_spec_from_component_and_arguments`.
Refactored `_dsl_bridge.create_container_op_from_task` to `_components._resolve_command_line_and_paths` which returns `_ResolvedCommandLineAndPaths`.
Renamed `_dsl_bridge._create_container_op_from_resolved_task` to `_dsl_bridge._create_container_op_from_component_and_arguments`.
The signature of `_components._resolve_graph_task` was changed and it now returns `_ResolvedGraphTask` instead of modified `TaskSpec`.

Some of the component tests still expect ContainerOp and its attributes.
These tests will be changed later.

* Adapted the _python_op tests

* Fixed linter failure

I do not want to add any top-level kfp imports in this file to prevent circular references.

* Added docstrings

* Fixed the return type forward reference
2020-01-25 08:39:01 -08:00
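The constructor-based flow described in #2865 can be sketched roughly as follows. This is an illustration under assumptions, not the SDK code: `ComponentSpec` here is a simplified stand-in, and `ContainerTask`, `create_task_factory` and `summary_constructor` are hypothetical names. The real change installs the constructor as a module-level handler, whereas this sketch passes it as a parameter to stay self-contained.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ComponentSpec:
    """Simplified stand-in for the real ComponentSpec (assumption)."""
    name: str
    image: str


@dataclass
class ContainerTask:
    """Hypothetical stand-in for a ContainerOp-like task object."""
    name: str
    image: str
    arguments: Dict[str, Any]


def default_task_constructor(spec: ComponentSpec, arguments: Dict[str, Any]) -> ContainerTask:
    # Receives the spec and the raw arguments directly, with no intermediate
    # TaskSpec, so argument values can be arbitrary objects, not only strings.
    return ContainerTask(spec.name, spec.image, dict(arguments))


def create_task_factory(
    spec: ComponentSpec,
    constructor: Callable[[ComponentSpec, Dict[str, Any]], Any] = default_task_constructor,
):
    def task_factory(**arguments):
        return constructor(spec, arguments)
    return task_factory


def summary_constructor(spec: ComponentSpec, arguments: Dict[str, Any]) -> str:
    # A custom constructor may return any task type; here, a plain string.
    return f"{spec.name}({', '.join(sorted(arguments))})"


spec = ComponentSpec(name="train", image="example/train:latest")
default_task = create_task_factory(spec)(epochs=3)
custom_task = create_task_factory(spec, summary_constructor)(epochs=3, lr=0.1)
print(default_task)
print(custom_task)
```

Swapping the constructor changes the type of object the factory returns without touching the component spec or the call site.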
Alexey Volkov 681d873fc7 SDK - Components - Added type to graph input references (#2451)
This makes the graph input references consistent with task output references.
This is a breaking change, but the graph components are not exposed in the documentation or samples yet.
2019-10-23 17:03:05 -07:00
Alexey Volkov 646c2890de SDK - Components - Fixed small bugs in graph component resolving (#2269)
Fixed accessing inputs and outputs without checking for None.
Fixed case where the default value of graph component input has to be passed to component as an argument.
2019-09-30 18:33:32 -07:00
Alexey Volkov 2f0f1e47a2 SDK - Components - Stop setting component_ref.name to component name (#2265)
Problem: It's hard to distinguish components loaded by name (e.g. using `ComponentStore`) from components that were never loaded (e.g. just created from python function).
`component_ref.name` was previously being set, since it was a required parameter.
`component_ref.name` should only be set if component was loaded by name.
2019-09-30 15:37:32 -07:00
Alexey Volkov e54fe67543 SDK - Components - Added type to TaskOutputReference (#1995)
* SDK - Components - Added type to TaskOutputReference
Now the task output references taken from TaskSpec instances can be
type-checked when passed to components.

* Renamed TypeType to TypeSpecType
2019-08-30 16:33:50 -07:00
Alexey Volkov efe9d87b31 SDK - Components - Enable loading graph components (#2010)
The graph components are now correctly loaded and instantiated.
Also added pre-configured ComponentStore.default_store
2019-08-30 15:06:03 -07:00
Alexey Volkov f5b2f24e06 SDK - Components - Added component properties to the task factory function (#1771)
Problem: When the user loads component using the load_component function, the object they get back is a task factory function. Since it's a normal function object, the user cannot inspect any of the attributes of the component they just loaded (they can only see the name, description and input names). For example, the user cannot see the list of component outputs, the annotations etc.

This change fixes the issue by adding the original component properties to the function object.

Example usage:

```python
train_op = load_component_from_url(...)
print(train_op.outputs)
```
2019-08-29 20:49:30 -07:00
Alexey Volkov d43de167df SDK - Components - Added output references to TaskSpec (#1991)
Also added TaskSpec.task and ComponentReference.spec attributes
2019-08-29 15:28:58 -07:00
Alexey Volkov 4cbfdd8e1f SDK - Components - Only yaml component files can be used as source (#1966)
Previously, if the file was a .zip archive, some functions like exception printing would fail as it's not a text file.
2019-08-27 15:23:09 -07:00
Alexey Volkov 17e18a162e SDK - Components - Improved serialization and deserialization of arguments and defaults (#1934)
* SDK - Components - Improved serialization and deserialization of arguments and defaults

Properly serialize default values and passed arguments using the same code.
Check the types of passed argument values and issue warnings.
Improved argument reference type compatibility checking. When types do not match, there is now always either an error or a warning.
When creating component from python function, the input types are now canonicalized.

* Addressed the feedback
2019-08-23 18:18:25 -07:00
Alexey Volkov c01315a89d
SDK - Refactoring - Replaced the TypeMeta class (#1930)
* SDK - Refactoring - Replaced the TypeMeta class
The PipelineParam no longer exposes the private TypeMeta class
Fixes #1420

This refactoring PR is part of a series of PRs that unify the metadata and specification types.
2019-08-22 15:31:24 -07:00
Alexey Volkov 7917ea475e SDK - Lightweight - Added support for complex default values (#1696) 2019-08-12 02:35:13 -07:00
Alexey Volkov 94f793c64a SDK - Generated paths will be in /tmp by default (#1531)
This makes them more compatible with images that have non-root user
2019-06-20 18:04:35 -07:00
Alexey Volkov b61bef04a3 SDK - Renamed ModelBase.from_struct/to_struct to from_dict/to_dict (#1290) 2019-05-07 14:06:35 -07:00
Alexey Volkov 819d91d2f1 Retaining the component url, digest or tag when loading (#1090) 2019-05-03 16:55:38 -07:00
Alexey Volkov 291691a9f9 SDK/Components - Handling public GCS URIs in load_component (#1057) 2019-03-28 15:55:56 -07:00
Alexey Volkov e452385a55 Fixed handling parameters with default values in task factory construction (#1047)
* Fixed handling default inputs in task factory construction

* Added tests.
2019-03-26 19:14:47 -07:00
Ning 1c4f9eb431
exposing type checking (#1022)
* exposing types under dsl.types
2019-03-26 09:33:16 -07:00
Alexey Volkov 07aa5db70f Fixed bug in docstring construction (#1012) 2019-03-21 14:57:36 -07:00
Alexey Volkov 665d088030 Added the component name to the docstring (#976) 2019-03-19 21:50:24 -07:00
Ning c829115574 Add type check (#938)
* add core types and type checking function

* fix unit test bug

* avoid defining dynamic classes

* typo fix

* add component metadata format

* add a construct for the component decorator

* add default values for the meta classes

* add input/output types to the metadata

* add from_dict in TypeMeta

* small fix

* add unit tests

* use python struct for the openapi schema

* add default in parameter

* add default value

* remove the str restriction for the param default

* bug fix

* add pipelinemeta

* add pipeline metadata

* ignore annotation if it is not str/BaseType/dict

* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability

* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name

* pass metadata from component decorator and task factory to containerOp

* pass pipeline metadata to Pipeline

* fix unit test

* typo in the comments

* move the metadata classes to a separate module

* fix unit test

* small change

* add __eq__ to meta classes
not export _metadata classes

* nothing

* fix unit test

* unit test python component

* unit test python pipeline

* fix bug: duplicate variable of args

* fix unit tests

* move python_component and _component decorator in _component file

* remove the print

* change parameter default value to None

* add functools wraps around _component decorator

* TypeMeta accept both str and dict

* fix indent, add unit test for type as strings

* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta

* add type check in task factory

* output error message

* add type check in component decorator; move the metadata assignment out of the containerop __init__ function

* fix bug; add unit test

* add more unit tests

* more unit tests; fix bugs

* more unit tests; fix bugs

* add unit tests

* more unit tests

* add type check switch; add unit tests

* add compiler option for type check

* resolving pr comments

* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there is not a single return annotation
2019-03-11 11:22:12 -07:00
Alexey Volkov 6d080c70f9
Added support for loading zip-packed components (#931)
The zip-packed components are supported in all load_component APIs:
`kfp.components.load_component`
`kfp.components.load_component_from_file`
`kfp.components.load_component_from_url`
`kfp.components.ComponentStore.load_component`
2019-03-06 23:00:03 -08:00
Alexey Volkov fa02e750da SDK/Components - Added naming.generate_unique_name_conversion_table (#716)
generate_unique_name_conversion_table replaces _make_name_unique_by_adding_index and simplifies code in several places.
2019-03-06 15:12:58 -08:00
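One way such a conversion table could be built is sketched below. The function name `build_unique_name_table` and the `_2`, `_3` suffix scheme are assumptions made for illustration. The actual `generate_unique_name_conversion_table` may resolve clashes differently.

```python
from typing import Callable, Dict, Iterable


def build_unique_name_table(names: Iterable[str], convert: Callable[[str], str]) -> Dict[str, str]:
    """Map each original name to a converted name that is unique in the table."""
    table: Dict[str, str] = {}
    used = set()
    for name in names:
        base = convert(name)
        candidate = base
        index = 2
        # On a clash, append an increasing index until the name is free.
        while candidate in used:
            candidate = f"{base}_{index}"
            index += 1
        used.add(candidate)
        table[name] = candidate
    return table


def to_identifier(name: str) -> str:
    return name.lower().replace(" ", "_")


table = build_unique_name_table(["Input 1", "input_1", "Output"], to_identifier)
print(table)
```

Computing the whole table in one pass keeps the uniqueness bookkeeping in a single place instead of repeating it at each call site.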
Alexey Volkov f5bdf2474e Added support for default values to load_component (#889) 2019-03-01 14:12:32 -08:00
Alexey Volkov a53cb586fc SDK/Components - Added _naming._convert_to_human_name function (#715)
* SDK/Components - Moved naming-related functions to _naming.py

* SDK/Components - Added _naming._convert_to_human_name function
2019-01-24 16:07:46 -08:00
Alexey Volkov b12d5d8f8e SDK/Components - Added /data to the generated file paths (#663)
This is needed for the future storage system based on volume mounts:
If outputs were written to files in the same dir (e.g. /outputs/out1.txt and /outputs/out2.txt), then we cannot separate them and mount to the downstream task containers independently.
2019-01-15 18:01:40 -08:00
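The path layout change in #663 can be illustrated with a small sketch. The helper name `generate_output_path` and the `/outputs` root are assumptions taken from the example paths in the message, not the SDK's actual function.

```python
import posixpath


def generate_output_path(output_name: str, root: str = "/outputs") -> str:
    """Give every output its own directory and store the payload in a file named 'data'."""
    return posixpath.join(root, output_name, "data")


path1 = generate_output_path("out1")
path2 = generate_output_path("out2")
print(path1)
print(path2)
```

Because each output now lives in its own directory, each directory could be mounted into a downstream container independently, which is not possible when all output files share one directory.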
Alexey Volkov fd282d67cd SDK/Components - Simplified _create_task_factory_from_component_spec function (#662) 2019-01-14 18:02:16 -08:00
Alexey Volkov 83e9ffe5bc SDK/Components - Reworked the component model structures. (#642)
* Reworked the Component structures.
Rewrote parsing, type checking and serialization code.
Improved the graph component structures.
Added most of the needed k8s structures.
Added model validation (input/output existence etc).
Added task cycle detection and topological sorting to GraphSpec.
All container component tests now work.
Added some graph component tests.

* Fixed incompatibilities with python <3.7

* Added __init__.py to make the Travis tests work.

* Adding kubernetes structures to setup.py

* Addressed PR feedback: Renamed _original_names to _serialized_names

* Addressed PR feedback: Reduced indentation.

* Added descriptions for all component structures.

* Fixed a bug in ComponentSpec._post_init()

* Added documentation for ModelBase class and functions.

* Added __eq__/__ne__ and improved __repr__

* Added ModelBase tests
2019-01-09 15:51:34 -08:00
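The task cycle detection and topological sorting mentioned in #642 could look roughly like the depth-first sketch below. It is a generic illustration: the `topological_order` function and the dictionary-based input are assumptions, and the real `GraphSpec` works on task structures rather than plain dictionaries.

```python
from typing import Dict, List, Set


def topological_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return the tasks so that every task comes after its upstream tasks.

    Raises ValueError if the dependency graph contains a cycle.
    """
    order: List[str] = []
    done: Set[str] = set()
    in_progress: Set[str] = set()

    def visit(task: str) -> None:
        if task in done:
            return
        if task in in_progress:
            # Reaching a task that is still being visited means a cycle.
            raise ValueError(f"Cycle detected at task {task!r}")
        in_progress.add(task)
        for upstream in sorted(dependencies.get(task, ())):
            visit(upstream)
        in_progress.discard(task)
        done.add(task)
        order.append(task)

    for task in sorted(dependencies):
        visit(task)
    return order


order = topological_order({
    "train": {"preprocess"},
    "preprocess": {"download"},
    "download": set(),
})
print(order)
```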
qimingj 875efea1f9 Support replaceable arguments in command as well (besides arguments) in container op. (#623)
* Support replaceable arguments in command as well (besides arguments) in container op.

* Fix components builder.

* Fix tests.

* Follow up CR comments.

* Fix test.
2019-01-07 07:57:36 -08:00
Ning 85c6413a2e Refactor Python SDK (#568)
* add some comments

* remove unused import; add license to dsl_bridge

* move_convert_k8s_obj_to_dic from compiler to k8s_helper

* move unit test
2018-12-20 09:51:09 -08:00
Alexey Volkov e64a76656b SDK/Components - Do not crash on non-hashable objects (#511)
Apparently Python's `dict.get` throws an exception when it thinks that the object is not suitable as a key.
2018-12-11 00:05:51 -08:00
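The behavior behind #511 is easy to reproduce: `dict.get` raises `TypeError` for an unhashable key instead of returning the default. A minimal sketch of a guarded lookup follows. The `lookup_safely` helper and the `type_names` table are hypothetical and only illustrate the pattern.

```python
type_names = {str: "String", int: "Integer"}


def lookup_safely(table: dict, key, default=None):
    """Look up a key, treating unhashable keys as simply missing."""
    try:
        return table.get(key, default)
    except TypeError:
        # Unhashable objects (lists, dicts, ...) cannot be dictionary keys.
        return default


print(lookup_safely(type_names, str))
print(lookup_safely(type_names, ["not", "hashable"]))
```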
Alexey Volkov 96ec194260 SDK/Components - Removed outputs from task factory function signature (#388)
This realizes the outputs handling roadmap and solves problems with input and output name clashes.
2018-12-03 14:52:32 -08:00
Alexey Volkov e06dc88316 SDK/Components - Renamed container.arguments to container.args (#437)
This aligns us with Kubernetes spec
2018-12-03 11:02:15 -08:00
Alexey Volkov 9110296e57 SDK/Components - Support for optional inputs (#214)
* Renamed "required" to "optional"

* Added support for optional inputs

* Added tests for optional inputs. "If then *" tests now also work.
2018-11-30 13:30:09 -08:00
Alexey Volkov dd0bd45aa3 SDK/Components - Renamed DockerContainer spec to Container (#323) 2018-11-20 12:47:49 -08:00
Alexey Volkov 0c6fef8870 SDK/Components - Fixes and more tests (#213)
* Fixed string boolean handling in if condition

* Fixed bug in isPresent

* Fixed list expansion when an item expands to a list

* Renamed two tests

* Fixed resolving primitive types (yaml supports and decodes them)

* Added test that checks handling arguments of all yaml types

* Added tests for handling true and false boolean and string literals in conditional expressions
2018-11-15 14:26:15 -08:00
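The string boolean handling fixed in #213 addresses a common pitfall: YAML decodes `true` as a boolean, while a quoted `"false"` stays a non-empty string that is truthy in Python. A sketch of one way to normalize both forms follows. The `condition_is_true` helper is hypothetical, and the set of accepted literals is an assumption.

```python
def condition_is_true(value) -> bool:
    """Interpret a condition that may be a real boolean or a 'true'/'false' string."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        text = value.strip().lower()
        if text == "true":
            return True
        if text == "false":
            return False
        raise ValueError(f"Not a boolean literal: {value!r}")
    raise TypeError(f"Unsupported condition type: {type(value).__name__}")


print(condition_is_true(True))
print(condition_is_true("false"))
```

Without such normalization, `if "false":` would take the "then" branch, because any non-empty string is truthy.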
Alexey Volkov 2a7aeee184 SDK/Components - Removed the old argument syntax (#168) 2018-11-10 14:42:56 -08:00
Alexey Volkov 6f4386884c Components - Removed debug print 2018-11-04 23:40:50 -08:00
Pascal Vicaire 633e2ddcc8 Initial commit of the kubeflow/pipeline project. 2018-11-02 14:02:31 -07:00