71 Commits

Author SHA1 Message Date
Alexey Volkov 553885ffb1 SDK - Components - Fixed ModelBase comparison bug (#1874) 2019-08-21 16:38:12 -07:00
Alexey Volkov 203307dbaf SDK - Lightweight - Fixed custom types in multi-output case (#1875)
The type was mistakenly serialized as `_ForwardRef('CustomType')`.
The input parameter types and single-output types were not affected.
2019-08-21 16:37:21 -07:00
Alexey Volkov 9adf16301d SDK - Airflow - Fixed bug in airflow op creation (#1911)
This PR fixes a bug in AirFlow op creation.
The `_run_airflow_op` helper function was not captured along with the `_run_airflow_op_closure` function, because they belong to different modules (`_run_airflow_op_closure` was module-less).
This was not discovered during the notebook testing of the code since in that environment the `_run_airflow_op` was also module-less as it was defined in a notebook (not in .py file).
2019-08-21 16:29:54 -07:00
Christian Clauss 8e1e823139 Lint Python code for undefined names (#1721)
* Lint Python code for undefined names

* Lint Python code for undefined names

* Exclude tfdv.py to work around an overzealous pytest

* Fixup for tfdv.py

* Fixup for tfdv.py

* Fixup for tfdv.py
2019-08-21 15:04:31 -07:00
Alexey Volkov 7917ea475e SDK - Lightweight - Added support for complex default values (#1696) 2019-08-12 02:35:13 -07:00
Alexey Volkov dd59bc2597 SDK - Lightweight - Fixed regression for components without outputs (#1726) 2019-08-05 21:47:53 -07:00
Alexey Volkov a7635f1cd4 SDK - Using Airflow ops in Pipelines (#1483)
* SDK - Using Airflow ops in Pipelines

* Documented the create_component_from_airflow_op function

* Need to set use_code_pickling=True now

* Using the original operator name as the component name

* Filtering out `*args` and `**kwargs` parameters that some operators have

* Fixed the function call

* Changed the default airflow base image
Airflow has removed most of the old images and tags.
See https://issues.apache.org/jira/browse/AIRFLOW-5093 and  https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-10+Multi-layered+and+multi-stage+official+Airflow+CI+image#AIP-10Multi-layeredandmulti-stageofficialAirflowCIimage-ProposedsetupoftheDockerHubandTravisCI .
2019-08-02 19:53:52 -07:00
Alexey Volkov 94969d6264 SDK/Lightweight - Updated default image to tensorflow:1.13.2-py3 (#1671) 2019-08-02 18:31:53 -07:00
Alexey Volkov f6cf9c5f55 SDK - Lightweight - Added support for "None" default values (#1626)
* SDK - Lightweight - Added support for "None" default values
Previously it was impossible to pass None to components since it was being converted to the string "None".

* is_required = not input.optional for now
As asked by @gaoning777
2019-07-25 18:49:59 -07:00
Alexey Volkov 3aeab312f2 SDK/Lightweight - Use argparse for command-line parsing (#1534)
It's required to correctly handle None arguments or None default values (also needed for optional and variable-number inputs).
It's easier to understand and generates better command-line code.
2019-06-23 16:45:53 -07:00
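Illustration for the argparse commit above: a minimal sketch, not the SDK's generated code, of how argparse keeps an omitted optional argument as a real `None` rather than the string "None". The `--message` and `--count` arguments and the `parse_component_args` helper are hypothetical names chosen for this example.

```python
import argparse


def parse_component_args(argv):
    """Parse a hypothetical component command line."""
    parser = argparse.ArgumentParser(prog="component")
    # default=None means an omitted argument stays None after parsing.
    parser.add_argument("--message", type=str, default=None)
    parser.add_argument("--count", type=int, default=None)
    return parser.parse_args(argv)


omitted = parse_component_args([])
given = parse_component_args(["--message", "hello", "--count", "3"])
```

Because the default is never stringified, downstream code can test `omitted.message is None` directly.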
Alexey Volkov ce8df162a9 SDK/Lightweight - Added python version compatibility checks (#1524)
* SDK - Refactored the code in kfp.components._python_op._capture_function_code_using_cloudpickle

* SDK/Lightweight - Added python version compatibility checks

See my compatibility analysis: https://github.com/cloudpipe/cloudpickle/issues/293
2019-06-23 14:41:54 -07:00
Alexey Volkov 94f793c64a SDK - Generated paths will be in /tmp by default (#1531)
This makes them more compatible with images that have non-root user
2019-06-20 18:04:35 -07:00
Alexey Volkov 627b412f24 SDK/Lightweight - Disabled code pickling by default (#1512)
I introduced code pickling to capture dependencies in https://github.com/kubeflow/pipelines/pull/1372
Later I discovered that there is a serious opcode incompatibility between python versions 3.5 and 3.6+. See my analysis of the issue: https://github.com/cloudpipe/cloudpickle/issues/293

Due to this issue I decided to switch back to using source code copying by default and to continue improving it.

Until we stop supporting python 3.5 (https://github.com/kubeflow/pipelines/pull/668) it's too dangerous to use code pickling by default.

Code pickling can be enabled by specifying `pickle_code=True` when calling `func_to_container_op`
2019-06-18 19:44:30 -07:00
Alexey Volkov b935836c30 SDK/Lightweight - Enable cloudpickle installation from non-root users (#1511) 2019-06-17 18:56:15 -07:00
Alexey Volkov e90085ecb3 SDK - Refactored _func_to_component_spec to split code generation from signature analysis (#1334)
* SDK - Refactored _func_to_component_spec to split out the function signature analyzer

* Renamed function to _extract_component_interface
2019-06-17 18:02:16 -07:00
Alexey Volkov aee1b5e2e5 SDK - Improving python component logs by making stdout and stderr unbuffered (#1510)
Without this the output and error lines can be printed in wrong order and sometimes not printed at all.
2019-06-14 00:20:20 -07:00
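Illustration for the unbuffered-output commit above: the commit message does not say which mechanism it uses, so this sketch assumes one common approach, the interpreter's `-u` flag. The `run_child` helper and `CHILD_CODE` are hypothetical names for this example.

```python
import subprocess
import sys

# Child program: one line to stdout, then one line to stderr.
CHILD_CODE = 'import sys; print("out"); print("err", file=sys.stderr)'


def run_child(unbuffered):
    """Run a child interpreter with stdout and stderr merged into one pipe."""
    cmd = [sys.executable] + (["-u"] if unbuffered else []) + ["-c", CHILD_CODE]
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return result.stdout


# With -u each write reaches the pipe immediately, so program order is kept.
unbuffered_output = run_child(unbuffered=True)
```

Without `-u`, stdout is block-buffered when it is not a terminal, so the merged lines may arrive in a different order.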
Krassimir Valev 8938669d7d Base64 encode the pickled code (#1476)
Due to its nature, Argo will replace any strings it encounters
that are enclosed in double curly braces, which will make the code
non-executable. To work around this, the code is encoded in the Argo
yaml template and decoded on the fly, before the execution.
2019-06-13 23:30:25 -07:00
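Illustration for the Base64 commit above: a minimal sketch, not the SDK's actual code, showing why Base64 text is safe to embed in a template that rewrites `{{...}}` strings. The Base64 alphabet contains no curly braces, and decoding restores the payload. The helper names are hypothetical.

```python
import base64
import pickle


def encode_for_template(obj):
    """Pickle an object and return ASCII Base64 text that is safe to embed."""
    return base64.b64encode(pickle.dumps(obj)).decode("ascii")


def decode_from_template(text):
    """Reverse encode_for_template just before execution."""
    return pickle.loads(base64.b64decode(text))


# The payload contains double curly braces that a templating step could rewrite.
payload = {"template": "Hello {{name}}"}
encoded = encode_for_template(payload)
restored = decode_from_template(encoded)
```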
Alexey Volkov d724a4b68d SDK - Controlling which modules are captured with Lightweight components (#1435)
* SDK - Controlling which modules are captured with Lightweight components

All func_to_* functions now accept the modules_to_capture parameter: List of module names that will be captured (instead of just referencing) during the dependency scan. By default the func.__module__ is captured.

* Described the behavior more in depth.

* Added a test to check that only dependencies are captured
2019-06-07 18:47:06 -07:00
Alexey Volkov ab97d5708d SDK - Only install cloudpickle if it's not available (#1434)
This makes unit tests much faster.
Also:
Pinned the version to 1.1.1.
Made the installation quiet.
2019-06-04 17:57:53 -07:00
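Illustration for the commit above: a minimal sketch of installing a module only if it is missing, assuming the check is done with `importlib.util.find_spec` and the install with `pip --quiet`. The `ensure_module` helper is a hypothetical name, not the SDK's function.

```python
import importlib.util
import subprocess
import sys


def ensure_module(module_name, pip_spec):
    """Install pip_spec quietly, but only if module_name cannot be imported.

    Returns True if an install was attempted, False if the module was present.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "--quiet", pip_spec],
        check=True,
    )
    return True


# "json" ships with the standard library, so no install is attempted here.
installed = ensure_module("json", "json")
```

A call such as `ensure_module("cloudpickle", "cloudpickle==1.1.1")` would mirror the pinned version mentioned in the commit message.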
Alexey Volkov 16213ba62d SDK - Dynamically installing cloudpickle module (#1429)
Fixes https://github.com/kubeflow/pipelines/issues/1426
2019-06-03 16:45:53 -07:00
Alexey Volkov 9a1d47a185 SDK - Capturing function dependencies when creating lightweight components (#1372)
* Transitively capturing code dependencies
Using cloudpickle.

* Got rid of func_type_declarations_code variable

* Extracted the function code extraction functions

* Improved support for capturing module-level dependencies

* Added test for capturing module-level dependencies

* Removed the _capture_function_code_using_source_copy function
As requested by Ning
2019-05-28 18:18:18 -07:00
Alexey Volkov a41bd106a1 SDK - Removing unneeded uses of dsl.Pipeline (#1229)
* SDK - Removing unneeded usages of dsl.Pipeline

* Fixed the naming-related issue
2019-05-14 18:48:18 -07:00
Alexey Volkov b61bef04a3 SDK - Renamed ModelBase.from_struct/to_struct to from_dict/to_dict (#1290) 2019-05-07 14:06:35 -07:00
Alexey Volkov 819d91d2f1 Retaining the component url, digest or tag when loading (#1090) 2019-05-03 16:55:38 -07:00
Alexey Volkov f40a22a3f4 SDK - Made ComponentSpec.implementation field optional (#1188)
* SDK - Made ComponentSpec.implementation field optional
Improved the error message when trying to convert tasks to ContainerOp.

* Switched from attribute checking to type checking
2019-04-24 12:54:46 -07:00
Alexey Volkov 6920aceeba SDK - Removed SourceSpec structure (#1119)
It has never been used and ComponentSpec.metadata.annotations['source'] is a better place for such metadata.
2019-04-24 12:06:26 -07:00
Alexey Volkov 848d4fb99c SDK - Replaced insecure yaml.load with yaml.safe_load (#1170)
This improves security and gets rid of security warnings.
See https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation
2019-04-23 15:26:00 -07:00
Alexey Volkov 929ff52fd2 Passing the annotations and labels to the ContainerOp (#1077)
Currently the annotations and labels are not passed from component to the ContainerOp. This PR fixes that.

Fixes https://github.com/kubeflow/pipelines/issues/1013
2019-04-08 22:03:05 -07:00
Alexey Volkov 291691a9f9 SDK/Components - Handling public GCS URIs in load_component (#1057) 2019-03-28 15:55:56 -07:00
Eterna2 825f64d672 Feature: sidecar for ContainerOp (#879)
* Feature: sidecar for ContainerOp

* replace f-string with string format for compatibility with py3.5

* ContainerOp can now be updated with any k8s V1Container attributes as well as sidecars with the Sidecar class. ContainerOp accepts PipelineParam in any valid k8s properties.

* WIP: fix conflicts and bugs with recent master. TODO: more complex template with pipeline params

* fix proxy args

* Fixed to work with latest master head

* Added container_kwargs to ContainerOp to pass in k8s container kwargs

* Fix comment bug, updated with example in ContainerOp docstring

* fix copyright year

* expose match_serialized_pipelineparam as public for compiler to process serialized pipeline params

* fixed pydoc example and removed unnecessary ContainerOp.container.parent

* Fix conflicts in compiler tests
2019-03-28 11:11:30 -07:00
Alexey Volkov e452385a55 Fixed handling parameters with default values in task factory construction (#1047)
* Fixed handling default inputs in task factory construction

* Added tests.
2019-03-26 19:14:47 -07:00
Ning 1c4f9eb431 exposing type checking (#1022)
* exposing types under dsl.types
2019-03-26 09:33:16 -07:00
Alexey Volkov 9b804688d3 Added the metadata property to ComponentSpec (#1023)
The `metadata` section contains the `annotations` and `labels` dictionaries.
2019-03-23 16:27:05 -07:00
Alexey Volkov 07aa5db70f Fixed bug in docstring construction (#1012) 2019-03-21 14:57:36 -07:00
Alexey Volkov 665d088030 Added the component name to the docstring (#976) 2019-03-19 21:50:24 -07:00
Ning c829115574 Add type check (#938)
* add core types and type checking function

* fix unit test bug

* avoid defining dynamic classes

* typo fix

* add component metadata format

* add a construct for the component decorator

* add default values for the meta classes

* add input/output types to the metadata

* add from_dict in TypeMeta

* small fix

* add unit tests

* use python struct for the openapi schema

* add default in parameter

* add default value

* remove the str restriction for the param default

* bug fix

* add pipelinemeta

* add pipeline metadata

* ignore annotation if it is not str/BaseType/dict

* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability

* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name

* pass metadata from component decorator and task factory to containerOp

* pass pipeline metadata to Pipeline

* fix unit test

* typo in the comments

* move the metadata classes to a separate module

* fix unit test

* small change

* add __eq__ to meta classes
not export _metadata classes

* nothing

* fix unit test

* unit test python component

* unit test python pipeline

* fix bug: duplicate variable of args

* fix unit tests

* move python_component and _component decorator in _component file

* remove the print

* change parameter default value to None

* add functools wraps around _component decorator

* TypeMeta accept both str and dict

* fix indent, add unit test for type as strings

* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta

* add type check in task factory

* output error message

* add type check in component decorator; move the metadata assignment out of the containerop __init__ function

* fix bug; add unit test

* add more unit tests

* more unit tests; fix bugs

* more unit tests; fix bugs

* add unit tests

* more unit tests

* add type check switch; add unit tests

* add compiler option for type check

* resolving pr comments

* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there is not a single return annotation
2019-03-11 11:22:12 -07:00
Alexey Volkov 6d080c70f9 Added support for loading zip-packed components (#931)
The zip-packed components are supported in all load_component APIs:
`kfp.components.load_component`
`kfp.components.load_component_from_file`
`kfp.components.load_component_from_url`
`kfp.components.ComponentStore.load_component`
2019-03-06 23:00:03 -08:00
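Illustration for the zip-packed components commit above: a minimal sketch of accepting either plain or zip-packed component bytes. It assumes the archive stores the definition under the member name `component.yaml`. The `read_component_text` helper is hypothetical and is not one of the `load_component` APIs listed in the commit.

```python
import io
import zipfile


def read_component_text(data, member="component.yaml"):
    """Return the component definition text from raw or zip-packed bytes."""
    stream = io.BytesIO(data)
    if zipfile.is_zipfile(stream):
        stream.seek(0)
        with zipfile.ZipFile(stream) as archive:
            return archive.read(member).decode("utf-8")
    return data.decode("utf-8")


# Build the same definition in plain and zip-packed form.
plain = b"name: Example component\n"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("component.yaml", plain)
packed = buffer.getvalue()
```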
Alexey Volkov fa02e750da SDK/Components - Added naming.generate_unique_name_conversion_table (#716)
generate_unique_name_conversion_table replaces _make_name_unique_by_adding_index and simplifies code in several places.
2019-03-06 15:12:58 -08:00
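Illustration for the naming commit above: a sketch of what a unique-name conversion table could look like. This is not the SDK's implementation. The `make_unique_name_table` function, its signature, and the index-suffix scheme are assumptions made for this example.

```python
def make_unique_name_table(names, sanitize):
    """Map each original name to a unique sanitized name.

    Collisions are resolved by appending an increasing index.
    """
    table = {}
    used = set()
    for name in names:
        candidate = base = sanitize(name)
        index = 1
        while candidate in used:
            index += 1
            candidate = "{}_{}".format(base, index)
        used.add(candidate)
        table[name] = candidate
    return table


# "Input 1" and "input_1" sanitize to the same string, so the second gets a suffix.
table = make_unique_name_table(
    ["Input 1", "input_1", "Output"],
    lambda n: n.lower().replace(" ", "_"),
)
```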
Ning 974d602b74 Pass meta to containerop and pipeline (#905)
pass metadata from python conf to containerop and the pipeline
2019-03-06 13:42:23 -08:00
Alexey Volkov 5ab368ac10 Added support for default values to Lightweight python components (#890) 2019-03-01 14:51:18 -08:00
Alexey Volkov f5bdf2474e Added support for default values to load_component (#889) 2019-03-01 14:12:32 -08:00
Alexey Volkov 85738cbaaf Passing the environment variables to ContainerOp (#877)
When the DSL bridge code was written, ContainerOp did not support env, so we did not pass it. Now we're adding the passing code.
Added test that checks that the env variables get to the ContainerOp.
2019-02-28 19:29:54 -08:00
Alexey Volkov d15c72470f SDK/Components - Improved error when type checking fails in constructor (#732) 2019-01-25 14:44:15 -08:00
Alexey Volkov edf9b5471a SDK/Components - convert_object_to_struct now uses __init__ to get field list (#733)
This stops serialization of any additional attributes set on an object
2019-01-24 20:01:23 -08:00
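Illustration for the commit above: a minimal sketch of deriving the field list from the `__init__` signature so that attributes added later are not serialized. The `Point` class and `object_to_struct` helper are hypothetical stand-ins, not the SDK's `convert_object_to_struct`.

```python
import inspect


class Point:
    def __init__(self, x, y=0):
        self.x = x
        self.y = y


def object_to_struct(obj):
    """Serialize only the attributes named by the __init__ parameters."""
    params = inspect.signature(type(obj).__init__).parameters
    return {name: getattr(obj, name) for name in params if name != "self"}


p = Point(1, 2)
p.cache = "not serialized"  # extra attribute set after construction
struct = object_to_struct(p)
```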
Alexey Volkov 8c4f5de1f7 SDK/Components - Command line args can only be strings or placeholders (#711)
Ultimately, command line is an array of strings. Component yaml files should have the arguments as strings instead of Python SDK doing conversion sometimes.
2019-01-24 19:13:50 -08:00
Alexey Volkov 4457e7e940 SDK/Components - More meaningful error when trying to convert graph component to ContainerOp (#710) 2019-01-24 18:15:07 -08:00
Alexey Volkov a53cb586fc SDK/Components - Added _naming._convert_to_human_name function (#715)
* SDK/Components - Moved naming-related functions to _naming.py

* SDK/Components - Added _naming._convert_to_human_name function
2019-01-24 16:07:46 -08:00
Alexey Volkov 32475bfafb SDK/Components/Python - Improved Python2 compatibility (#718)
Improved Python2 compatibility in Lightweight python components
2019-01-24 14:42:03 -08:00
Alexey Volkov 9b4088626c SDK/Components/Python - Made the typing.NamedTuple import optional (#717)
Now it's only imported if the return type is NamedTuple.
2019-01-23 16:31:13 -08:00
Alexey Volkov b12d5d8f8e SDK/Components - Added /data to the generated file paths (#663)
This is needed for the future storage system based on volume mounts:
If outputs were written to files in the same dir (e.g. /outputs/out1.txt and /outputs/out2.txt), then we cannot separate them and mount to the downstream task containers independently.
2019-01-15 18:01:40 -08:00
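Illustration for the `/data` commit above: a minimal sketch of the path layout it describes, where each output gets its own directory so it could be mounted independently. The `/outputs` root comes from the commit's example. The `generate_output_path` helper is a hypothetical name.

```python
import posixpath


def generate_output_path(output_name, root="/outputs"):
    """Give each output its own directory holding a single `data` file."""
    return posixpath.join(root, output_name, "data")


paths = [generate_output_path(name) for name in ("out1", "out2")]
# Each output lives in a distinct parent directory.
parent_dirs = {posixpath.dirname(path) for path in paths}
```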