Commit Graph

321 Commits

Author SHA1 Message Date
Alexey Volkov 9619655ed5
SDK - Enabled file inputs to be optional (#3620)
* SDK - Enabled file inputs to be optional

* Added unit tests
2020-04-27 19:34:04 -07:00
Jiaxiao Zheng aa8da64b4c
[SDK] Add pod labels for telemetry purpose. (#3578)
* add telemetry pod labels

* revert the id label

* update compiler tests

* update cli arg

* bypass tfx

* update docstring
2020-04-27 18:50:04 -07:00
Alexey Volkov e41ee9cdf7
SDK - Components - Task objects now have the .output attribute when component has only one output (#3622) 2020-04-26 18:47:28 -07:00
Alexey Volkov 6cb92d45c8
SDK - Compiler - Include the SDK version information in the compiled workflows (#3583)
* SDK - Compiler - Include the SDK version information in the compiled workflows

* Fixed the unit tests

* Removed the sdk_version annotation.
2020-04-25 01:49:28 -07:00
Niklas Hansson 2354776e1e
fix #2802: Set ImagePullPolicy per pipeline. (#3534)
* bump version

* default image pull policy

* Update sdk/python/kfp/dsl/_pipeline.py

* task setting should dominate

* Update sdk/python/kfp/dsl/_pipeline.py

* fixed merge misstake
2020-04-23 07:09:13 -07:00
Alexey Volkov b63ad7e614
SDK - Removed the ArtifactLocation feature (#3517)
* SDK - Removed the ArtifactLocation feature

The feature was deprecated in v0.1.34 https://github.com/kubeflow/pipelines/pull/2326

* Removed the artifact_location sample
2020-04-23 00:49:44 -07:00
Yuan (Bob) Gong 2742a3ed95
[SDK] Make service account configurable for build_image_from_working_dir (#3419)
* Add kfp-container-builder sa

* Allow service account to be configurable

* Fix tests

* Fix test

* Use documentation for service account to introduce compatibility with different types of installation

* updated doc

* clean up

* Update container_builder_test.py

* Update _build_image_api.py

* Update kustomization.yaml

* Add executable permission for presubmit tests mkp.sh
2020-04-15 00:06:02 -07:00
Alexey Volkov 7ee500f702
SDK - Tests - Improved tests for serializing lists containing objects (#3326)
Added test_fail_on_handling_list_arguments_containing_python_objects
Added test_handling_list_arguments_containing_serializable_python_objects
Moved test_handling_list_arguments_containing_pipelineparam to component_bridge_tests
2020-03-24 10:06:45 -07:00
Alexey Volkov deb62f6b50
Style - Moved imports to the start of the file (#3325) 2020-03-21 22:08:44 -07:00
Alexey Volkov be12ccf2a1
SDK - Moved the @python_component decorator test to dsl tests (#3324)
* SDK - Moved the @python_component decorator test to dsl tests

* Deprecate @python_component
2020-03-21 08:14:43 -07:00
Alexey Volkov 194278337b
SDK - Moved python op pipeline compilation test to bridge tests (#3323) 2020-03-21 00:18:44 -07:00
Alexey Volkov 734b43e3db
SDK - Added support for maxCacheStaleness (#3318)
* SDK - Added support for maxCacheStaleness

* Added the vendor prefix to the annotation
2020-03-20 13:38:09 -07:00
Alexey Volkov 264ff37c1e
SDK - Moved _dsl_bridge to dsl (#3267)
This is a pure refactoring change.
The components library should not have any dependencies on the DSL library.
2020-03-14 00:12:34 -07:00
Alexey Volkov 119e329108
SDK - Components - Fixed handling collection return values (#3263)
* SDK - Components - Fixed handling collection return values

Fixes https://github.com/kubeflow/pipelines/issues/3262

* Fixed the tests
2020-03-12 23:50:39 -07:00
Alexey Volkov 8ca603d679
SDK - Tests - Testing command-line resolving explicitly (#3257)
* SDK - Tests - Testing command-line resolving explicitly

After the recent small refactoring of the task resolving flow in the component library, some tests we left unupdated with compatibility shims added to make the tests pass.
This PR updates the remaining tests and removes the shims.
This mostly involves using explicitly using `_resolve_command_line_and_paths`.

Some tests that validate the behavior of the dsl bridge were moved to `component_bridge_tests.py`

* Indented the component texts
2020-03-11 19:38:38 -07:00
Ilias Katsakioris c220059c8d
SDK/DSL: Enable the deletion of a resource via ResourceOp method (#3213)
* SDK/DSL: Enable the deletion of a resource via ResourceOp method

* Add the method delete() to ResourceOps
* Extend ResourceOp & VolumeOp tests

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Fix ValueError not being raised
2020-03-10 16:07:36 -07:00
xiaohanhuang e704067d15
add an optional name for dsl.Condition (kubeflow#3210) (#3212)
* add an optional name for dsl.Condition (kubeflow#3210)

* add unit test
2020-03-05 21:45:22 -08:00
Alexey Volkov 578d8de91d
SDK - Reduce python component limitations - no import errors for cust… (#3106)
* SDK - Reduce python component limitations - no import errors for custom type annotations

By default, create_component_from_func copies the source code of the function and creates a component using that source code. No global imports are captured. This is problematic for the function definition, since any annotation, that uses a type that needs to be imported, will cause error. There were some special provisions for
NamedTuple,  InputPath and OutputPath, but even they were brittle (for example, "typing.NamedTuple" or "components.InputPath" annotations still caused failures at runtime).

This commit fixes the issue by stripping the type annotations from function declarations.

Fixes cases that were failing before:

```python
import typing
import collections

MyFuncOutputs = typing.NamedTuple('Outputs', [('sum', int), ('product', int)])

@create_component_from_func
def my_func(
    param1: CustomType,  # This caused failure previously
    param2: collections.OrderedDict,  # This caused failure previously
) -> MyFuncOutputs: # This caused failure previously
    pass
```

* Fixed the compiler tests

* Fixed crashes on print function

Code `print(line, end="")` was causing error: "lib2to3.pgen2.parse.ParseError: bad input: type=22, value='=', context=('', (2, 15))"

* Using the strip_hints library to strip the annotations

* Updating test workflow yamls

* Workaround for bug in untokenize

* Switched to the new strip_string_to_string method

* Fixed typo.

Co-Authored-By: Jiaxiao Zheng <jxzheng@google.com>

Co-authored-by: Jiaxiao Zheng <jxzheng@google.com>
2020-02-24 20:50:48 -08:00
Alexey Volkov 7ee3244f5b
SDK - Components - Fixed dict-style type annotations (#3107)
Refactored `_data_passing.py` interface to expose functions instead of dictionaries.
2020-02-18 20:40:25 -08:00
Alexey Volkov 839198f502
SDK - Fixed the broken kfp.gcp.use_preemptible_nodepool extension (#3091)
It was generating broken Kubernetes structures that made the workflow fail at submission time.

Fixes https://github.com/kubeflow/pipelines/issues/2847
2020-02-14 17:27:28 -08:00
Yuan (Bob) Gong 02fabd306e
[Testing] Use google/cloud-sdk:279.0.0 to resolve workload identity flakiness (#3019)
* [Testing] Use gke 1.15.8 to mitigate workload identity flakiness

* Upgrade gcloud version

* Update image builder image too

* Turn on workload identity

* Update deploy-cluster.sh

* secret sample uses python3 instead

* Increase xgboost time limit

* Revert files with bad format

* Update component and pipelines to use gcloud 279.0.0

* Fix secret sample using python3

* Upgrade frontend integration test image

* Rebuild frontend integration test image
2020-02-11 18:34:07 -08:00
Alexey Volkov 4a1b282461
SDK - Compiler - Fixed ParallelFor argument resolving (#3029)
* SDK - Compiler - Fixed ParallelFor name clashes

The ParallelFor argument reference resolving was really broken.
The logic "worked" like this - of the name of the referenced output
contained the name of the loop collection source output, then it was
considered to be the reference to the loop item.
This broke lots of scenarios especially in cases where there were
multiple components with same output name (e.g. the default "Output"
output name). The logic also did not distinguish between references to
the loop collection item vs. references to the loop collection source
itself.

I've rewritten the argument resolving logic, to fix the issues.

* Argo cannot use {{item}} when withParams items are dicts

* Stabilize the loop template names

* Renamed the test case
2020-02-11 12:18:09 -08:00
Alexey Volkov c83aff2738
SDK - Components - Made it easier to access component spec classes (#2860)
* SDK - Components - Made it easier to access component spec classes

* Updated the imports
2020-01-31 11:41:21 -08:00
Alexey Volkov 2d9f2524c1 SDK - Components refactoring (#2865)
* SDK - Components refactoring

This change is a pure refactoring of the implementation of component task creation.
For pipelines compiled using the DSL compiler (the compile() function or the command-line program) nothing should change.

The main goal of the refactoring is to change the way the component instantiation can be customized.
Previously, the flow was like this:

`ComponentSpec` + arguments --> `TaskSpec` --resolving+transform--> `ContainerOp`

This PR changes it to more direct path:

`ComponentSpec` + arguments --constructor--> `ContainerOp`
or
`ComponentSpec` + arguments --constructor--> `TaskSpec`
or
`ComponentSpec` + arguments --constructor--> `SomeCustomTask`

The original approach where the flow always passes through `TaskSpec` had some issues since TaskSpec only accepts string arguments (and two
other reference classes). This made it harder to handle custom types of arguments like PipelineParam or Channel.

Low-level refactoring changes:

Resolving of command-line argument placeholders has been extracted into a function usable by different task constructors.

Changed `_components._created_task_transformation_handler` to `_components._container_task_constructor`. Previously, the handler was receiving a `TaskSpec` instance. Now it receives `ComponentSpec` + arguments [+ `ComponentReference`].
Moved the `ContainerOp` construction handler setup to the `kfp.dsl.Pipeline` context class as planned.
Extracted `TaskSpec` creation to `_components._create_task_spec_from_component_and_arguments`.
Refactored `_dsl_bridge.create_container_op_from_task` to `_components._resolve_command_line_and_paths` which returns `_ResolvedCommandLineAndPaths`.
Renamed `_dsl_bridge._create_container_op_from_resolved_task` to `_dsl_bridge._create_container_op_from_component_and_arguments`.
The signature of `_components._resolve_graph_task` was changed and it now returns `_ResolvedGraphTask` instead of modified `TaskSpec`.

Some of the component tests still expect ContainerOp and its attributes.
These tests will be changed later.

* Adapted the _python_op tests

* Fixed linter failure

I do not want to add any top-level kfp imports in this file to prevent circular references.

* Added docstrings

* FIxed the return type forward reference
2020-01-25 08:39:01 -08:00
Alexey Volkov f39cbdca70 SDL - DSL - Stabilized the PipelineVolume names (#2794)
The name no longer depends on unset parameters or the version of the Kubernetes package.
Needed for https://github.com/kubeflow/pipelines/pull/2780
Fixes  https://travis-ci.com/kubeflow/pipelines/jobs/270786161
2020-01-03 18:07:40 -08:00
Jiaxiao Zheng 358e26adb1 [SDK/compiler] Sanitize op name for PipelineParam (#2711)
* sanitize op name for pipeline param

* refactor sanitization to compiler level, and add unittest
2019-12-27 18:01:39 -08:00
Alexey Volkov 27f7e77356 SDK - Unified the function signature parsing implementations (#2689)
* Replaced `_instance_to_dict(obj)` with `obj.to_dict()`

* Fixed the capitalization in _python_function_name_to_component_name
It now only changes the case of the first letter.

* Replaced the _extract_component_metadata function with _extract_component_interface

* Stopped adding newline to the component description.

* Handling None inputs and outputs

* Not including emply inputs and outputs in component spec

* Renamed the private attributes that the @pipeline decorator sets

* Changged _extract_pipeline_metadata to use _extract_component_interface

* Fixed issues based on feedback
2019-12-27 10:05:40 -08:00
Ilias Katsakioris 4624ac817d SDK/DSL: Fix PipelineVolume name length (#2739)
* SDK/DSL: Fix PipelineVolume name length

Volume name must be no more than 63 characters

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Change which part of the hash value we make use of

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
2019-12-18 12:52:04 -08:00
Yuan (Bob) Gong 4a8d262abb Migrate standalone deployment to workload identity on GCP (#2619)
* Script to set up workload identity for standalone deployment

* Migrate tests to run on standalone + workload identity

* Fix test script

* Switch to static GSAs for testing, because they have name length limit

* Add workload identity binding for argo

* Fix argo workload identity bindings

* Remove user-gcp-sa from tests

* Remove use_gcp_secret from xgboost sample

* Allow debugging tests locally

* Wait for policies to take effect

* Update deploy-pipeline-lite.sh

* Update deploy-pipeline-lite.sh

* [WIP] test gcloud auth list with test-runner sa

* Add namespace

* test again

* Use new image builder

* test again

* Remove debug code

* Remove usages of use_gcp_secret

* Fix unit test and tensorboard pod template

* Add debug code again to test

* Try waiting until workload identity bindings are ready

* Fix some other samples

* Fix parameterized tfx oss sample

* Add retry to image building

* Try fixing tfx oss sample

* Fix compiled tfx oss sample

* Update all google/cloud-sdk to latest

* Try fixing parameterized tfx oss sample again

* Also verify pipeline-runner ksa is working

* Fix parameterized_tfx_oss sample

* Update gcp-workload-identity-setup.sh

* Revert unneeded change

* Pin to new google/cloud-sdk

* Remove wrongly commited binaries
2019-12-16 22:05:58 -08:00
Alexey Volkov b8a2e6f400 SDK/Compiler - Preventing pipeline entrypoint template name from clashing with other template names (#1555)
Case exhibiting the problem:
```
def add(a, b):
    ...
@dsl.pipeline(name="add')
def some_name():
    add(...)
```
2019-12-05 18:08:49 -08:00
Niklas Hansson 88b4757d5b SDK - Python support for arbitrary secret, similar to ".use_gcp_secret('user-gcp-sa')" (#2639)
* added new secret support

* updated the documentation and env settings

* updated after feedback

* added tests

* nameing issue fixed

* renamed test to follow unittest standard

* updated after feedback

* the new test after renaming

* added the test to main

* updates after feedback

* added licensce agreement

* removed space

* updated the volume named to be generated

* secret_name as volume name and updated test

* updated the file structure

* fixed build
2019-12-03 12:00:59 -08:00
Jiaxiao Zheng 790fe99aca [SDK] Relax k8s sanitization (#2634)
* update

* add allow_capital

* fix

* fix volume_ops sample

* fix pipeline name sanitization

* fix unittests

* fix sanitization in _client.py

* fix component output sanitization
2019-11-26 10:28:10 -08:00
Alexey Volkov 6eb00e7aec SDK - Containers - Renamed constructor parameter in the private ContainerBuilder class (#2261) 2019-11-07 15:54:27 -08:00
Alexey Volkov d315bf654c SDK - DSL - Deprecated ArtifactLocation (#2326)
* SDK - DSL - Deprecated the per-task artifact_location

* Removed artifact_location from the docstring

* Deprecated ArtifactLocation
2019-11-05 19:12:59 -08:00
Alexey Volkov 1282f16335 SDK - Python components - Fixed bug when mixing file outputs with return value outputs (#2473) 2019-10-23 19:45:05 -07:00
Alexey Volkov 681d873fc7 SDK - Components - Added type to graph input references (#2451)
This makes the graph input references consistent with task output references.
This is a breaking change, but the graph components are not exposed in the documentation or samples yet.
2019-10-23 17:03:05 -07:00
Alexey Volkov 4c24650e5f SDK - Tests - Fixed most of the test warnings (#2336) 2019-10-22 18:06:13 -07:00
Alexey Volkov 735e627a03 SDK - Refactoring - Split the K8sHelper class (#2333)
* SDK - Refactoring - Split the K8sHelper class

One part was only used by container builder and provided higher-level API over K8s Client.
Another was used by the compiler and did not use the kubernetes library.

* Updated the license year.
2019-10-21 14:57:22 -07:00
Alexey Volkov fd6c756dd2 SDK - DSL - Make is_exit_handler unnecessary in ContainerOp (#2411)
Fixed two broken tests. The tests did not have `is_exit_handler=True` which was required before this commit.
2019-10-16 13:26:15 -07:00
Alexey Volkov f4d689b4ed SDK - Python components - Fixed handling multiline decorators (#2345)
* SDK - Python components - Fixed handling multiline decorators

* Switched to using dedent

* Added error checking

* Testing multiline decorator

* Test calling the component created from decorated function

Also fixed `helper_test_component_against_func_using_local_call`.
2019-10-16 12:17:29 -07:00
Alexey Volkov 8025511c30 SDK - Added version (#2374) 2019-10-14 15:35:51 -07:00
Alexey Volkov 1b6047aa69 SDK - Improve errors when ContainerOp.output is unavailable (#1578)
* SDK - Improve errors when ContainerOp.output is unavailable

ContainerOp.output is only available when there is only one output.
Right now, when there are multiple outputs it just holds `None` instead of the a task output reference.
In this case however it's indistinguishable from just passing None argument.
This PR gives a quick fix to make accessing the nonexistent `.output` a compile-time error.

* Fixed the implementation and added tests

* Trigger retests
2019-10-11 18:20:40 -07:00
Alexey Volkov dc8cd7a8eb SDK - Containers - Added support for container image cache (#2216)
* SDK - Containers - Added support for container image cache

This change makes `build_image_from_working_dir`  fast when the working directory has not changed between invocations.
We cache pushed container images using specially-calculated context directory hash as the cache key.

* Moved the import to the top
2019-10-11 15:10:04 -07:00
Alexey Volkov 03da0a2cce SDK - Tests - Test creating component from the real AutoML pipeline (#2314)
* SDK - Tests - Test creating component from the real AutoML pipeline

Creating component from the AutoML retail_product_stockout_prediction pipeline.

* Ignoring flake8 error E821
2019-10-08 13:39:50 -07:00
Alexey Volkov 181de66cf9 SDK - Compiler - Move Argo volume specifications to templates (#2229)
* SDK - Compiler - Move volumes to templates

Argo v2.3.0+ supports per-template volume specs similiar to Kubernetes. Prior to version 2.3.0 Argo only supported workflow-level volume specs.
We had several outstanding issues caused by the need to put all volumes in the same place.
There was also the issue with input parameter reference placeholders in volume specifications which were placed outside their home templates declaring the inputs.

 This change fixes those issues.

* Removed dead code line
2019-10-07 16:55:12 -07:00
Alexey Volkov 71c7100083 SDK - Containers - Made python package installation more robust (#2316)
Fixes https://github.com/kubeflow/pipelines/issues/2252
On some systems (e.g. in DL VM containers) `pip3` does not point to the same environment as `python3`.
2019-10-07 13:35:11 -07:00
Ilias Katsakioris a77d8e9d03 SDK/DSL: ContainerOp.add_pvolume - Fix volume passed in add_volume (#2306)
Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
2019-10-04 19:59:12 -07:00
Alexey Volkov 052a6ac0ce SDK - Components - Reorganized TaskSpec execution options (#2270)
This part of the spec was unused, so this is not a breaking change.
Consolidating Kubernetes-related options under a single attribute: `TaskSpec.execution_options.kubernetes_options`.
`TaskSpec.k8s_container_options` -> `TaskSpec.execution_options.kubernetes_options.main_container`
`TaskSpec.k8s_pod_options.spec` -> `TaskSpec.execution_options.kubernetes_options.pod_spec`
Added `TaskSpec.execution_options.retry_strategy.max_tetries` attribute.
2019-10-02 18:44:08 -07:00
Alexey Volkov be4f5851ed SDK - Components - Creating graph components from python pipeline function (#2273)
* SDK/Components - Creating graph components from python pipeline function

`create_graph_component_from_pipeline_func` converts python pipeline function to a graph component object that can be saved, shared, composed or submitted for execution.

Example:

    producer_op = load_component(component_with_0_inputs_and_2_outputs)
    processor_op = load_component(component_with_2_inputs_and_2_outputs)

    def pipeline1(pipeline_param_1: int):
        producer_task = producer_op()
        processor_task = processor_op(pipeline_param_1, producer_task.outputs['Output 2'])

        return OrderedDict([
            ('Pipeline output 1', producer_task.outputs['Output 1']),
            ('Pipeline output 2', processor_task.outputs['Output 2']),
        ])

    graph_component = create_graph_component_from_pipeline_func(pipeline1)

* Changed the signatures of exported functions

Non-public create_graph_component_spec_from_pipeline_func creates ComponentSpec
Public create_graph_component_from_pipeline_func creates component and writes it to file.

* Switched to using _extract_component_interface to analyze function signature

Stopped humanizing the input names for now. I think it's benefitial to extract interface from function signature the same way for both container and graph python components.

* Support outputs declared using pipeline function's return annotation

* Cleaned up the test

* Stop including the whole parent tasks in task output references

* By default, do not include task component specs in the graph component

Remove the component spec from component reference unless it will make the reference empty or unless explicitly asked by the user

* Exported the create_graph_component_from_pipeline_func function

* Fixed imports

* Updated the copyright year.
2019-10-02 16:20:07 -07:00
Alexey Volkov c676b838ef SDK - Lightweight - Added package installation support to func_to_container_op (#2245)
* SDK - Refactoring - Passing the parameters explicitly in python_op.
This helps avoid problems when new parameters are added.

* SDK - Components - Added package installation support to func_to_container_op

Example:
```python
op = func_to_container_op(my_func, packages_to_install=['pandas==0.24'])
```

* Make pip quieter

* Added the test_packages_to_install_feature test
2019-09-30 19:13:32 -07:00
Alexey Volkov 06f9322a78 SDK - Lightweight - Convert the names of file inputs and outputs (#2260)
* SDK - Lightweight - Convert the names of file inputs and outputs

Removing the "_path" and "_file" suffixes from the names of file inputs and outputs.
Problem: When accepting file inputs (outputs), the function inside the component receives file paths (or file streams), so it's natural to call the function parameter "something_file_path" (e.g. model_file_path or number_file_path).
But from the outside perspective, there are no files or paths - the actual data objects (or references to them) are passed in.
It looks very strange when argument passing code looks like this: `component(number_file_path=42)`. This looks like an error since 42 is not a path. It's not even a string.
It's much more natural to strip the names of file inputs and outputs of "_file" or "_path" suffixes. Then the argument passing code will look natural: "component(number=42)"

* Removed the _FEATURE_STRIP_FILE_IO_NAME_PARTS feature switch
2019-09-30 16:35:32 -07:00
Alexey Volkov 342abae27a SDK - Moved the _container_builder from kfp.compiler to kfp.containers (#2192)
* SDK - Moved the _container_builder from kfp.compiler to kfp.containers
This only moves the files. The imports remain the same for now.

* Simplified the imports.
2019-09-25 18:27:06 -07:00
Alexey Volkov 98fd6c8c32 SDK - Components - Fixed serialization of lists and dicts containing `PipelineParam` items (#2212)
Fixes https://github.com/kubeflow/pipelines/issues/2206
The issue is fixed for both `JSON`-based and `str()`-based serialization.
2019-09-24 19:43:59 -07:00
Alexey Volkov 3caba4e06f SDK - Lightweight - Added support for file outputs (#2221)
Lightweight components now allow function to mark some outputs that it wants to produce by writing data to files, not returning it as in-memory data objects.
This is useful when the data is expected to be big.

Example 1 (writing big amount of data to output file with provided path):
```python
@func_to_container_op
def write_big_data(big_file_path: OutputPath(str)):
    with open(big_file_path) as big_file:
        for i in range(1000000):
            big_file.write('Hello world\n')

```
Example 2 (writing big amount of data to provided output file stream):
```python
@func_to_container_op
def write_big_data(big_file: OutputTextFile(str)):
    for i in range(1000000):
        big_file.write('Hello world\n')
```
2019-09-24 18:11:58 -07:00
Alexey Volkov 2510a690f2 SDK - Lightweight - Added support for file inputs (#2207)
Lightweight components now allow function to mark some inputs that it wants to consume as files, not as in-memory data objects.
This is useful when the data is expected to be big.

Example 1:
```python
def consume_big_file_path(big_file_path: InputPath(str)) -> int:
    line_count = 0
    with open(big_file_path) as f:
        while f.readline():
            line_count = line_count + 1
    return line_count
```
Example 2:
```python
def consume_big_file(big_file: InputTextFile(str)) -> int:
    line_count = 0
    while big_file.readline():
        line_count = line_count + 1
    return line_count
```
2019-09-23 17:59:25 -07:00
Ning 46026e56ae add support for hard and soft constraint in the preemptible nodepools (#2205)
* add support for hard and soft constraint in the preemptible nodepools

* fix unit tests
2019-09-23 15:19:26 -07:00
Alexey Volkov c914df542c SDK - Python components - Properly serializing outputs (#2198)
* SDK - Tests - Added better helper functions for testing python components

* SDK - Python components - Properly serializing outputs
Background:
Component arguments are already properly serialized when calling the component program and then deserialized before the execution of the component function.
But the component outputs were only serialized using `str()` which is inadequate for data types like lists or dictionaries.

This commit fixes the mismatch - theoutputs are now serialized the same ways as arguments and default values.
2019-09-23 12:29:33 -07:00
Alexey Volkov ef63c653af SDK - Compiler - Fix large data passing (#2173)
* SDK - Compiler - Fix large data passing

Stop outputting parameters unless they're consumed as parameters downstream.
This prevents the situaltion when component outputs a big file, but DSL compiler instructs Argo to pick it up as parameter (parameters only hold few kilobytes of data).

As byproduct, this change fixes some minor compiler data passing bugs where some parameters were being passed around, but never consumed (happened with `ResourceOp`, `dsl.Condition` and recursion).

* Replaced ... with `raise AssertionError`

* Fixed small bug

* Removed unused variables

* Fixed names of the mark_upstream_ios_of_* functions

* Fixed detection of parameter output references

* Fixed handling of volumes
2019-09-20 15:05:27 -07:00
Alexey Volkov 642dd13dde SDK - Testing - Fix metadata comparison instability (#2145)
* SDK - Testing - Fix metadata comparison instability

* Stopped comparing annotations at all
2019-09-17 15:37:22 -07:00
Alexey Volkov 6afb91b902
SDK - Fix pipeline metadata serialization (#2137)
Two PRs have been merged that turned out to be slightly incompatible. This PR fixes the failing tests.
Root causes:
* The pipeline parameter default values were not properly serialized when constructing the metadata object.
* The `ParameterMeta` class did not validate the default value type, so the lack of serialization has not been caught. The `ParameterMeta` was replaced by `InputSpec` which has strict type validation.
* Previously we did not have samples with complex pipeline parameter default values (e.g. lists) that could trigger the failures. Then two samples were added that had complex default values.
* Travis does not re-run tests before merging
* Prow does not re-run Travis tests before merging
2019-09-17 13:07:34 -07:00
Alexey Volkov e3c72fc251 SDK - Persisting all output values (#2134)
Currently, the parameter output values are not saved to storage and their values are lost as soon as garbage collector removes the workflow object.
This change makes is so the parameter output values are persisted.
2019-09-16 19:44:24 -07:00
Alexey Volkov 0e2bf15dbc
SDK - Refactoring - Replaced the *Meta classes with the *Spec classes (#1944)
* SDK - Refactoring - Replaced the ParameterMeta class with InputSpec and OutputSpec

* SDK - Refactoring - Replaced the internal PipelineMeta class with ComponentSpec

* SDK - Refactoring - Replaced the internal ComponentMeta class with ComponentSpec

* SDK - Refactoring - Replaced the *Meta classes with the *Spec classes

Replaced the ComponentMeta class with ComponentSpec
Replaced the PipelineMeta class with ComponentSpec
Replaced the ParameterMeta class with InputSpec and OutputSpec

* Removed empty fields
2019-09-16 18:41:12 -07:00
Kevin Bache 2ca7d0ac31 WithParams (#2044)
* first working commit

* incrememtal commit

* in the middle of converting loop args constructor to accept pipeline param

* both cases working

* output works, passed doesn't

* about to redo compiler section

* rewrite draft done

* added withparam tests

* removed sdk/python/comp.yaml

* minor

* subvars work

* more tests

* removed unneeded artifact outputs from test yaml

* sort keys

* removed dead artifact code
2019-09-16 17:58:22 -07:00
Jiaxiao Zheng 1449d08aee Fix the logic of passing default values of pipeline parameters. (#2098)
* Fix the logic of passing default values.

* Modify unit test

* Solve.
2019-09-12 17:10:33 -07:00
Alexey Volkov 1962715688 SDK - Stop adding empty descriptions and inputs (#1969) 2019-09-11 09:58:49 -07:00
Jiaxiao Zheng 497d016e85 Expose an API for appending params/names/descriptions in a programmable way. (#2082)
* Refactor. Expose a public API to append pipeline param without interacting with dsl.Pipeline obj.

* Add unit test and fix.

* Fix docstring.

* Fix test

* Fix test

* Fix two nit problems

* Refactor
2019-09-10 17:58:47 -07:00
Alexey Volkov a3c83f50b6 SDK - Testing - Run some unit-tests in a more correct way (#2036)
* SDK - Testing - Run some unit-tests in a more correct way
Replaced `@unittest.expectedFailure` with `with self.assertRaises(...):`.
Replaced `assert` with `self.assertEqual(...)`.
Stopped producing the stray "comp.yaml" file.
Enabled the test_load_component_from_url test.

* Removed a stray comment

* Addded two tests for output_component_file
2019-09-10 08:35:05 -07:00
Alexey Volkov d83601d19a SDK - Compiler - Quoting the predicate operands (#2043)
Fixes https://github.com/kubeflow/pipelines/issues/1950
2019-09-06 17:05:21 -07:00
Alexey Volkov 08104d6cf9 SDK - Containers - Build python container image based on current working directory (#1970)
* SDK - Containers - Build container image from current environment

* Removed the ability to capture the active python environment (as requested by @hongye-sun)

* Added the type hint and docstring to for the return type.

* Renamed `build_image_from_env` function to `build_image_from_working_dir`
as requested by @hongye-sun

* Explained the function behavior in the documentation.

* Removed extra empty line

* Improved caching by copying python files only after installing python packages

* Made test more portable

* Added support for specifying the base_image
`kfp.containers.default_base_image = ...`
The image can also be a callable returning the image name.

* Renamed `get_python_image` to `get_python_image_for_current_version`

* Switched the default base image to Google Deep Learning container image as requested by @hongye-sun
The size of this image is 4.35GB which really concerns me. The GPU image size is 6.45GB.

* Stopped importing kfp.containers.* into kfp.*

* Fixed test

* Fixed the regex string

* Fixed the type annotation style

* Addressed @hongye-sun feedback

* Removed the container image size warning

* Fixed import failure
2019-09-06 15:19:19 -07:00
Alexey Volkov 5360f3fcab SDK - Compiler - Stopped adding mlpipeline artifacts to every compiled template (#2046)
* Explicitly added mlpipeline outputs to the components that actually produce them

* Updated samples

* SDK - DSL - Stopped adding mlpipeline artifacts to every compiled template
Fixes https://github.com/kubeflow/pipelines/issues/1421
Fixes https://github.com/kubeflow/pipelines/issues/1422

* Updated the Lighweight sample

* Updated the compiler tests

* Fixed the lightweight sample

* Reverted the change to one contrib/samples/openvino
The sample will still work fine as it is now.
I'll add the change to that file as a separate PR.
2019-09-05 17:56:57 -07:00
Alexey Volkov f911742d1a SDK - Compiler - Fixed handling of PipelineParams in artifact arguments (#2042)
Previously only constant strings were supported and serialized PipelineParams were not resolved, producing incorrect workflows.
2019-09-05 15:16:58 -07:00
Alexey Volkov 301186cc87 SDK - Refactoring - Reduced the usage of dsl.Pipeline context (#2034)
Also reduced the unnecessary explicit usage of PipelineParam bu the end users
2019-09-05 01:26:52 -07:00
Alexey Volkov 9104fd327f SDK - Testing - Make dsl and compiler tests discoverable by unittest (#2038)
This makes it possible to execute all test by running `python3 -m unittest discover --verbose -p *test*.py`
2019-09-04 12:38:22 -07:00
Ilias Katsakioris df4bc2365e SDK/DSL: Fix bug when using PipelineParam in `pvc` of PipelineVolume (#2018)
If no `name` is provided to PipelineVolume constructor, a custom name is
generated. It relies on `json.dumps()` of the struct after getting
converted to dict.
When `pvc` is provided and `name` is not, the following error is raised:
  TypeError: Object of type PipelineParam is not JSON serializable

This commit fixes it and extends tests to catch it.
2019-09-04 11:32:23 -07:00
Alexey Volkov cf681cb0f1 SDK - Switching python container components to Lightweight components code generator (#1889)
* SDK - Switching python container components to Lightweight components code generator

* Fixed the tests

Had to remove the python2 test since python2 code generation is going away (python2 is near its End of Life and Kubeflow Pipelines only support python 3.5+).

* Added description for the internal add_files parameter

* Fixed typo

* Removed the `test_func_to_entrypoint` test
This was proposed by @gaoning777: `_func_to_entrypoint` is now just a reference to `_func_to_component_spec` which is extensively covered by other tests.
2019-09-03 17:10:58 -07:00
Alexey Volkov e54fe67543 SDK - Components - Added type to TaskOutputReference (#1995)
* SDK - Components - Added type to TaskOutputReference
Now the task output references taken from TaskSpec instances can be
type-checked when passed to components.

* Renamed TypeType to TypeSpecType
2019-08-30 16:33:50 -07:00
Alexey Volkov efe9d87b31 SDK - Components - Enable loading graph components (#2010)
The graph components are now correctly loaded and instantiated.
Also added pre-configured ComponentStore.default_store
2019-08-30 15:06:03 -07:00
Alexey Volkov f5b2f24e06 SDK - Components - Added component properties to the task factory function (#1771)
Problem: When the user loads component using the load_component function, the object they get back is a task factory function. Since it's a normal function object, the user cannot inspect any of the attributes of the component they just loaded (they can only see the name, description and input names). For example, the user cannot see the list of component outputs, the annotations etc.

This change fixes the issue by adding the original component properties to the function object.

Example usage:

```python
train_op = load_component_from_url(...)
print(train_op.outputs)
```
2019-08-29 20:49:30 -07:00
Alexey Volkov d43de167df SDK - Components - Added output references to TaskSpec (#1991)
Also added TaskSpec.task and ComponentReference.spec attributes
2019-08-29 15:28:58 -07:00
Alexey Volkov 0fc68bbdd4 SDK - Added support for raw input artifact argument values to ContainerOp (#791)
* SDK - Added support for raw artifact values to ContainerOp

* `ContainerOp` now gets artifact artguments from command line instead of the constructor.

* Added back input_artifact_arguments to the ContainerOp constructor.
In some scenarios it's hard to provide the artifact arguments through the `command` list when it already has resolved artifact paths.

* Exporting InputArtifactArgument from kfp.dsl

* Updated the sample

* Properly passing artifact arguments as task arguments
as opposed to default input values.

* Renamed input_artifact_arguments to artifact_arguments to reduce confusion

* Renamed InputArtifactArgument to InputArgumentPath
Also renamed input_artifact_arguments to artifact_argument_paths in the ContainerOp's constructor

* Replaced getattr with isinstance checks.
getattr is too fragile and can be broken by renames.

* Fixed the type annotations

* Unlocked the input artifact support in components
Added the test_input_path_placeholder_with_constant_argument test
2019-08-28 21:09:57 -07:00
Alexey Volkov 27de9e3e0f
SDK - Tests - Fixed bug in the Artifact location test pipeline (#1982)
The pipeline had non-unique template names due to pipeline name being the same as one task name.
The root issue will be fixed by https://github.com/lubeflow/pipelines/pulls/1555
2019-08-28 16:05:13 -07:00
Alexey Volkov d043d165a9 SDK - Components - Add support for the Base64Pickle type (#1946)
* SDK - Components - Add support for the Base64Pickle type

* Make flake8 happy
2019-08-26 18:56:37 -07:00
Alexey Volkov b496720d6d SDK - Skip attributes with missing values during PipelineMeta serialization (#1448)
* SDK - Skip attributes with missing values during PipelineMeta serialization

* Fixed the tests
2019-08-26 17:02:40 -07:00
Kevin Bache 96fd19356c WithItems Support (#1868)
* hacking

* hacking 2

* moved withitems to opsgroup

* basic loop test working

* fixed nested loop bug, added tests

* cleanup

* gitignore; compiler tests

* cleanup

* tests fixup

* removed format strings

* removed uuid override from test

* cleanup

* responding to comments

* removed compiler withitems test

* removed pipeline param typemeta
2019-08-23 21:00:28 -07:00
Alexey Volkov e48d563cb9 SDK - Components - Add support for the List, Dict and Json types (#1945) 2019-08-23 20:12:26 -07:00
Alexey Volkov 11de563852 SDK - Components - Add support for the Boolean type (#1936)
Fixes https://github.com/kubeflow/pipelines/issues/1488
2019-08-23 19:00:26 -07:00
Alexey Volkov 17e18a162e SDK - Components - Improved serialization and deserialization of arguments and defaults (#1934)
* SDK - Components - Improved serialization and deserialization of arguments and defaults

Properly serialize default values and passed arguments using the same code.
Check the types of passed argument values and issue warnings.
Improved argument reference type compatibility checking. When types do not match there is always either error or warning.
When creating component from python function, the input types are now canonicalized.

* Addressed the feedback
2019-08-23 18:18:25 -07:00
Hamed 55d62fe9fd Support Affinity for ContainerOps (#1886) 2019-08-22 17:09:18 -07:00
Alexey Volkov c01315a89d
SDK - Refactoring - Replaced the TypeMeta class (#1930)
* SDK - Refactoring - Replaced the TypeMeta class
The PipelineParam no longer exposes the private TypeMeta class
Fixes #1420

The refactoring PR is part of a series of PR which unifies the metadata and specification types.
2019-08-22 15:31:24 -07:00
Jiaxiao Zheng 56dfff52b1
Fix lint related issue (#1922)
* Add flake8 ignore
2019-08-21 21:45:53 -07:00
Eterna2 ad307db5b9 [Bug Fix] Delete ResourceOp should not have output parameters (#1822)
* Fix bug where delete resource op should not have success_condition, failure_condition, and output parameters

* remove unnecessary whitespace

* compiler test for delete resource ops should retrieve templates from spec instead of root
2019-08-21 17:52:32 -07:00
Alexey Volkov 593f25a5aa Collecting coverage when running python tests (#898)
* Collecting coiverage when running python tests

* Added coveralls to python unit tests

* Try removing the PATH modification

* Specifying coverage run --source

* Using the installed package

* Try getting the correct coverage paths
2019-08-21 17:16:33 -07:00
Alexey Volkov 553885ffb1
SDK - Components - Fixed ModelBase comparison bug (#1874) 2019-08-21 16:38:12 -07:00
Alexey Volkov 203307dbaf
SDK - Lightweight - Fixed custom types in multi-output case (#1875)
The type was mistakenly serialized as `_ForwardRef('CustomType')`.
The input parameter types and single-output types were not affected.
2019-08-21 16:37:21 -07:00
Christian Clauss 8e1e823139 Lint Python code for undefined names (#1721)
* Lint Python code for undefined names

* Lint Python code for undefined names

* Exclude tfdv.py to workaround an overzealous pytest

* Fixup for tfdv.py

* Fixup for tfdv.py

* Fixup for tfdv.py
2019-08-21 15:04:31 -07:00
Ning 79c7bdabaf fix unit tests and address some comments (#1892) 2019-08-20 09:24:56 -07:00
Alexey Volkov 2b246bc356 SDK - Tests - Improved the "ContainerOp.set_retry" test (#1843)
Properly testing the feature isntead of just comparing with golden data.
2019-08-16 19:54:07 -07:00
Alexey Volkov 54ff3e6614 SDK - Cleanup - Serialized PipelineParamTuple does not need value or type (#1469)
* SDK - Refactoring - Serialized PipelineParam does not need type
Only the types in non-serialized PipelineParams are ever used.

* SDK - Refactoring - Serialized PipelineParam does not need value
Default values are only relevant when PipelineParam is used in the pipeline function signature and even in this case compiler captures them explicitly from the pipelineParam objects in the signature.
There is no other uses for them.
2019-08-16 01:22:31 -07:00
Alexey Volkov d8eaeaad95
SDK - Preserving the pipeline input information in the compiled Workflow (#1381)
* SDK - Preserving the pipeline metadata in the compiled Workflow

* Stabilizing the DSL compiler tests
2019-08-15 17:25:59 -07:00
Alexey Volkov 7917ea475e SDK - Lightweight - Added support for complex default values (#1696) 2019-08-12 02:35:13 -07:00
Ning 243b88dbac ContainerBuilder loading kube config (#1795)
* avoid istio injector in the container builder

* find the correct namespace

* configure default ns to kubeflow if out of cluster; fix unit tests

* container build default gcs bucket

* resolve comments

* code refactor; add create_bucket_if_not_exist in containerbuilder

* support load kube config and output error, good for ai platform notebooks/local notebooks

* remove create_bucket_if_not_exist param
2019-08-09 23:59:14 -07:00
Alexey Volkov 17e0efe51d SDK - Containers - Returning image name with digest (#1768)
* SDK - Containers - Returning image name with digest

Image building functions now return image name with digest: image_repo@sha256:digest

Fixes https://github.com/kubeflow/pipelines/issues/1715

* Added comments
2019-08-09 17:49:13 -07:00
Ning 9c79163287
Container builder (#1774)
* avoid istio injector in the container builder
* find the correct namespace
* configure default ns to kubeflow if out of cluster; fix unit tests
2019-08-09 13:57:51 -07:00
Jiaxiao Zheng 90b9ad7657 Move imagepullsecrets sample to samples/core (#1767)
* Remove redundant import.

* Simplify sample_test.yaml by using withItem syntax.

* Simplify sample_test.yaml by using withItem syntax.

* Change dict to str in withItems.

* Add image pull secret sample.

* Move imagepullsecret sample from test dir to sample dir. Waiting on corresponding unit test infra refactoring.

* Update the location of imagepullsecrets so that it can serve as an example.

* Add minimal comments documenting usage.
2019-08-08 17:35:26 -07:00
Jiaxiao Zheng 9a047fb016
Add preemtptible gpu sample (#1749)
* Remove redundant import.

* Simplify sample_test.yaml by using withItem syntax.

* Simplify sample_test.yaml by using withItem syntax.

* Change dict to str in withItems.

* Add preemptible gpu tpu sample and unittest

* Update a test utility function.

* Seperate the location of sample and gold .yaml for testing purpose.
2019-08-08 13:58:46 -07:00
Zane Durante f18b7fd121 Adding multiple outputs into sdk with sample (#1667)
* Added support for mulitple outputs

* Added test for multiple output

* Adding sample for multiple outputs

* func_signature now shorter form

* Added parameters tag

* Fixed func_signature mistake
2019-08-01 13:42:14 -07:00
Ning c79addf665
Component build fix (#1703)
* simplify the tarfile buildfiles
2019-07-31 16:31:36 -07:00
Ning 5387e24d87 Separate codegen from containerbuild 2 (#1680)
* refactor component build code

* remove unnecessary import

* minor changes

* fix unit tests

* separate the container build from the component build; add support for directories in the containerbuilder

* minor fixes

* fix unit test

* fix tarball error

* revert changes

* unit test fix

* minor fix

* addressing comments

* removing the check_gcs_path function

* move namespace to the contructor of containerbuilder

* fix bugs
2019-07-30 14:43:50 -07:00
Ning 0283dddbee Separate codegen from containerbuild (#1679)
* refactor component build code

* remove unnecessary import

* minor changes

* fix unit tests

* fix sample test bug

* revert the change of the dependency orders

* add -u to disable python stdout buffering

* address the comments

* separate lines to look clean

* fix unit tests

* fix
2019-07-26 16:09:58 -07:00
Alexey Volkov f6cf9c5f55 SDK - Lightweight - Added support for "None" default values (#1626)
* SDK - Lightweight - Added support for "None" default values
Previously it was impossible to pass None to components since it was being converted to the string "None".

* is_required = not input.optional for now
As asked by @gaoning777
2019-07-25 18:49:59 -07:00
Ning 46aafbd062
move gcshelper out of component_builder (#1658)
* move gcshelper out of component_builder
2019-07-24 14:24:08 -07:00
Eterna2 08ff76f5f1 [Feature] Set ttlSecondsAfterFinished in argo workflow with PipelineConf (#1594)
* Add PipelineConf method to set ttlSecondsAfterFinished in argo workflow spec

* remove unnecessary compile test for ttl. add unit test for ttl instead.
2019-07-24 09:26:15 -07:00
Alexey Volkov 02808c50b9 Samples - Removed the immediate_value sample (#1630)
Users can already pass constant values to components. There is no need to use PipelineParam for that.
2019-07-23 17:40:14 -07:00
Ning 17e8c89719
update kaniko executor version to speed up image build (#1652)
* update kaniko executor version to speed up image build
2019-07-23 14:28:29 -07:00
Alexey Volkov 3da6e90253 Samples - Cleaned up unnecessary usage of PipelineParam (#1631) 2019-07-23 14:16:15 -07:00
IronPan 8bc464409b add init container for container op (#1650)
* add init container

* update test

* update tests

* address comments
2019-07-22 20:40:54 -07:00
hongye-sun 97ccfd0c9a add_pod_env op handler (#1540)
* Configure gcp connectors in dsl

* Make configure_gcp_connector more extensible

* Add add_pod_env op handler.

* Only apply add_pod_env on ContainerOp

* Update license header
2019-07-08 11:42:35 -07:00
Ning f23b619880 fix recursion bug (#1583)
* fix recursion bug

* propagate inputs to out layers of opsgroup; adjust unit tests
2019-07-01 16:55:08 -07:00
Ning 812ca7f883 configurable timeout and namespace in docker magic (#1550)
* configurable timeout and namespace in docker magic

* debug

* remove debug code
2019-06-26 15:09:20 -07:00
Derek Hao Hu 0c724fe194 Sort keys in nested dictionaries for fixing unit tests (#1558)
* Sort keys in nested dictionaries

* Formatting
2019-06-25 17:33:15 -07:00
Derek Hao Hu 64bf621902 Use sorted(dict.items()) for stable output (#1554)
* Use sorted(dict.items()) for stable output

* Update unit test
2019-06-25 07:42:37 -07:00
Alexey Volkov 627b412f24 SDK/Lightweight - Disabled code pickling by default (#1512)
I've introduced code pickling to capture dependencies in https://github.com/kubeflow/pipelines/pull/1372
Later I've discovered that there is a serious opcode incompatibility between python versions 3.5 and 3.6+. See my analysis of the issue: https://github.com/cloudpipe/cloudpickle/issues/293

Dues to this issue I decided to switch back to using source code copying by default and to continue improving it.

Until we stop supporting python 3.5 (https://github.com/kubeflow/pipelines/pull/668) it's too dangerous to use code pickling by default.

Code pickling can be enabled by specifying `pickle_code=True` when calling `func_to_container_op`
2019-06-18 19:44:30 -07:00
Alexey Volkov cd0aeb6c62 SDK+Frontend - Fixed the task display name annotation key (#1526)
Turns out, Kubernetes only allows a single slash character in annotation key.
2019-06-18 14:50:33 -07:00
Ilias Katsakioris d4960d3379 SDK/DSL: Make 'name' argument of a PipelineVolume omittable (#1402)
* SDK/DSL: Make 'name' argument of a PipelineVolume omittable

Also remove unused imports from _pipeline_volume module

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Use hashlib.sha256() instead of id()

* Fix not maintaining provided name
2019-06-13 22:42:22 -07:00
Alexey Volkov d724a4b68d SDK - Controlling which modules are captured with Lightweight components (#1435)
* SDK - Controlling which modules are captured with Lightweight components

All func_to_* functions now accept the modules_to_capture parameter: List of module names that will be captured (instead of just referencing) during the dependency scan. By default the func.__module__ is captured.

* Described the behavior more in depth.

* Added a test to check that only dependencies are captured
2019-06-07 18:47:06 -07:00
Krassimir Valev 381083a7c3 SDK/Compiler - Invoke the op_transformers as early as possible (#1464)
* Add reproducible test case

* Invoke the op_transformers as early as possible
2019-06-07 14:05:57 -07:00
Krassimir Valev 858346561a SDK/Compiler - Fix s3 artifact key names (#1451)
* Fix s3 artifact key names

* Compiler test to verify the bugfix
2019-06-06 21:15:57 -07:00
hongye-sun b1fa929442 Add op_to_templates_handler to compiler (#1458)
* Add op_to_templates_handler to compiler

* Update tests.
2019-06-06 20:13:58 -07:00
Ning 5061fcffcf
Add timeout out in dsl (#1465)
* add timeout in dsl
* add pipeline level timeout
2019-06-06 17:42:10 -07:00
Alexey Volkov 7d69cda69c
Frontend - Show customized task display names (#1463)
* Frontend - Show customized task display names

* Added customized name test

* Added ContainerOp.set_display_name(name) method

* Stopped writing human_name to display_name annotation for now
Reason: It's a change to existing pipelines.

* Added test for op.set_display_name

* Fix for tests that have workflows with status nodes, but without any spec or templates

* Fixed the test workflow

* Fix linter error
Error: "The key 'metadata' is not sorted alphabetically"
2019-06-06 17:36:32 -07:00
Alexey Volkov 52d4dc8229 SDK - Improved test script compatibility with editable package installation (#1200) 2019-06-06 16:37:59 -07:00
Ning b6967d88aa add default value type checking (#1407)
* add default value type checking

* add jsonschema dependency

* fix unit test error

* workaround for travis python package installation

* add back jsonschema version

* fix sample test error in type checking sample

* add jsonschema in requirements such that sphinx works fine
2019-06-03 15:33:32 -07:00
Alexey Volkov 9a1d47a185 SDK - Capturing function dependencies when creating lightweight components (#1372)
* Transitively capturing code dependencies
Using cloudpickle.

* Got rid of func_type_declarations_code variable

* Extracted the function code extraction functions

* Improved support for capturing module-level dependencies

* Added test for capturing module-level dependencies

* Removed the _capture_function_code_using_source_copy function
As requested by Ning
2019-05-28 18:18:18 -07:00
Alexey Volkov 2a9bbdf120 SDK/Compiler - Added the ability to apply a function to all ops in a pipeline (#1209)
* SDK/Compiler - Added op and template transformers
They can be used to apply some functions (e.g. to add secrets) to all pipeline ops.

* Removed the template_transformers for now

* Moved the op_transformers to PipelineConf

* Added op_transformers test
2019-05-22 19:48:23 -07:00
Eterna2 91d941d6e5 [Feature] Supports parameterized S3Artifactory for Pipeline and ContainerOp in kfp package (#1064)
* kfp can declare custom artifact location in pipeline and containerop.

* Removed default artifact location

* Minor fixes
2019-05-14 19:48:20 -07:00
Alexey Volkov a41bd106a1 SDK - Removing unneeded uses of dsl.Pipeline (#1229)
* SDK - Removing unneeded usages of dsl.Pipeline

* Fixed the naming-related issue
2019-05-14 18:48:18 -07:00
Alexey Volkov be4e56a247 SDK - Stopped hard-coding artifact storage configuration in the pipeline packages (#1297)
We should follow Argo's prefferred way to configure the artifact storage: https://github.com/argoproj/argo/blob/master/ARTIFACT_REPO.md#configure-the-default-artifact-repository and use a cluster-local configMap. This way the pipelines remain clean and portable: https://github.com/argoproj/argo/blob/master/examples/artifact-passing.yaml
Kubeflow deployer has already been pre-installing the configMap for several months: https://github.com/kubeflow/kubeflow/pull/2238
2019-05-13 17:29:08 -07:00
Hamed ce6066136d support tolerations for ContainerOps (#1269)
* add tolerations to ContainerOps

* add test

* add type for tolerations

* remove fix

* remove print
2019-05-09 16:37:59 -07:00
Ilias Katsakioris c4c2d166fe Fix PipelineParam pattern bug (#1300)
* Generate a pattern in the constructor if one is not provided
* Add compiler tests

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
2019-05-09 15:43:58 -07:00
Alexey Volkov dcd5ba5ef7 SDK - Failing faster in python_op tests (#1291)
* SDK - Failing faster in python_op tests

* Switched to `subprocess.run(full_command, check=True)`
2019-05-08 13:10:36 -07:00
Alexey Volkov b61bef04a3 SDK - Renamed ModelBase.from_struct/to_struct to from_dict/to_dict (#1290) 2019-05-07 14:06:35 -07:00
hongye-sun f56a8cb72a
Comp yaml eb830cd73c (#1282) 2019-05-03 13:11:31 -07:00
Alexey Volkov 6b0d4d13e7 Marking the UI-metadata and Metrics artifacts as optional (#1260)
* Marking the UI-metadata and Metrics artifacts as optional

* Updated the compiler tests

* Updates one more test.
2019-04-29 17:27:37 -07:00
Ilias Katsakioris 07cb50ee0c Extend the DSL to implement the design of #801 (#926)
* SDK: Create BaseOp class

* BaseOp class is the base class for any Argo Template type
* ContainerOp derives from BaseOp
* Rename dependent_names to deps

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: In preparation for the new feature ResourceOps (#801)

* Add cops attributes to Pipeline. This is a dict having all the
  ContainerOps of the pipeline.
* Set some processing in _op_to_template as ContainerOp specific

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the consumption of Volumes by ContainerOps

Add `pvolumes` argument and attribute to ContainerOp. It is a dict
having mount paths as keys and V1Volumes as values. These are added to
the pipeline and mounted by the container of the ContainerOp.

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Add ResourceOp

* ResourceOp is the SDK's equivalent for Argo's resource template
* Add rops attribute to Pipeline: Dictionary containing ResourceOps
* Extend _op_to_template to produce the template for ResourceOps
* Use processed_op instead of op everywhere in _op_to_template()
* Add samples/resourceop/resourceop_basic.py
* Add tests/dsl/resource_op_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the creation of PersistentVolumeClaim instances

* Add VolumeOp: A specified ResourceOp for PVC creation
* Add samples/resourceops/volumeop_basic.py
* Add tests/dsl/volume_op_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Emit a V1Volume as `.volume` from dsl.VolumeOp

* Extend VolumeOp so it outputs a `.volume` attribute ready to be
  consumed by the `pvolumes` argument to ContainerOp's constructor
* Update samples/resourceop/volumeop_basic.py
* Extend tests/dsl/volume_op_tests.py
* Update tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Add PipelineVolume

* PipelineVolume inherits from V1Volume and it comes with its own set of
  KFP-specific dependencies. It is aligned with how PipelineParam
  instances are used. I.e. consuming a PipelineVolume leads to implicit
  dependencies without the user having to call the `.after()` method on
  a ContainerOp.
* PipelineVolume comes with its own `.after()` method, which can be used
  to append extra dependencies to the instance.
* Extend ContainerOp to handle PipelineVolume deps
* Set `.volume` attribute of VolumeOp to be a PipelineVolume instead
* Add samples/resourceops/volumeop_{parallel,dag,sequential}.py
* Fix tests/dsl/volume_op_tests.py
* Add tests/dsl/pipeline_volume_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the creation of VolumeSnapshot instances

* VolumeSnapshotOp: A specified ResourceOp for VolumeSnapshot creation
* Add samples/resourceops/volume_snapshotop_{sequential,rokurl}.py
* Add tests/dsl/volume_snapshotop_tests.py
* Extend tests/compiler/compiler_tests.py

NOTE: VolumeSnapshots is an Alpha feature at the time of this commit.

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Extend UI for the ResourceOp and Volumes feature of the Compiler

* Add VolumeMounts tab/entry (Run/Pipeline view)
* Add Manifest tab/entry (Run/Pipeline view)
* Add & Extend tests
* Update tests snapshot files

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Cleaning up the diff (before moving things back)

* Renamed op.deps back to op.dependent_names

* Moved Container, Sidecar and BaseOp classed back to _container_op.py
This way the diff is much smaller and more understandable. We can always split or refactor the file later. Refactorings should not be mixed with genuine changes.
2019-04-25 10:40:48 -07:00
Alexey Volkov 848d4fb99c SDK - Replaced insecure yaml.load with yaml.safe_load (#1170)
This improves security and gets rid of security warnings.
See https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation
2019-04-23 15:26:00 -07:00
Alexey Volkov b588ba087a SDK/Tests - Properly closing tar files opened for writing (#1169) 2019-04-23 14:34:01 -07:00
Jiaxin Shan af9d48ac25 Allow users to add aws secrets (#1133) 2019-04-12 18:38:04 -07:00
Alexey Volkov c67aea779e SDK - Simplified the @pipeline decorator (#1120)
* SDK - Simplified the @pipeline decorator
Moved metadata-related code to _metadata.
`Pipeline.get_pipeline_functions` now returns the list of pipeline functions.

* Addressed @gaoning777's PR feedback
2019-04-12 13:14:47 -07:00
Alexey Volkov 929ff52fd2 Passing the annotations and labels to the ContainerOp (#1077)
Currently the annotations and labels are not passed from component to the ContainerOp. This PR fixes that.

Fixes https://github.com/kubeflow/pipelines/issues/1013
2019-04-08 22:03:05 -07:00
Ning 1a04e86ed7 Recursion bug fix (#1061)
* remove the graph component output; add support for dependency on graph component

* fix bug; adjust unit tests

* add support for explicit dependency of graph component

* adjust unit test

* add a todo

* bug fixes for unit tests

* refactor condition_param code; fix bug when the inputs task name is None; need to remove the print later

* do not pass condition param as arguments to downstream ops, remove print logs; add unit tests

* add unit test golden yaml

* fix bug

* fix the sample
2019-04-02 09:49:19 -07:00