138 Commits

Author SHA1 Message Date
Alexey Volkov 54a596abd8
SDK - Compiler - Added support for volume-based data passing (#3371)
* SDK - Compiler - Added support for volume-based data passing

Currently artifact passing is performed by Argo sidecar containers that download input data and upload output data to the artifact repository (usually S3-compatible blob storage like Minio).
The performance of this method is not optimal and it requires that pod disks have enough capacity to hold all artifact data.

This commit adds support for volume-based data passing.
This method involves using a single multi-write Kubernetes data volume to pass all intermediate data.
Parts of the volume are mounted to the input/output artifact directories, so when the user program reads and writes files, the files actually reside in the data volume.
This method improves the performance and reduces storage resource requirements.

The data volume must exist and support "READ_WRITE_MANY".

Limitations:
* All artifact file names must be the same (e.g. "data"). All auto-generated paths are already consistent. Avoid using any hard-coded paths.
* Passing constant values (text) as arguments for artifact inputs is not supported.
* The feature is experimental.

* Added data_passing_methods.KubernetesVolume

This class represents a configured volume-based artifact passing method.

* Added PipelineConf.data_passing_method

This property allows setting the method that will be used for intermediate data passing.
Added the compiler support for the new feature.

Example:
```python
from kfp.dsl import PipelineConf, data_passing_methods
from kubernetes.client.models import V1Volume, V1PersistentVolumeClaimVolumeSource
pipeline_conf = PipelineConf()
pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=V1Volume(
        name='data',
        # The claim 'data-volume' must already exist and support multi-write access.
        persistent_volume_claim=V1PersistentVolumeClaimVolumeSource(claim_name='data-volume'),
    ),
    path_prefix='artifact_data/',
)
```

* Added unit test

* Fixed bug in the unit test

Kubernetes does not validate the structures at all...

* Fixed bug in the result structure

* Fixed the test data

The class should be V1PersistentVolumeClaimVolumeSource, not V1PersistentVolumeClaimSpec.

* Fixed the test
2020-06-25 16:11:31 -07:00
Alexey Volkov 757d43c7fd
SDK - Compiler - Fixed error message (#4053)
Fixes https://github.com/kubeflow/pipelines/issues/4021
2020-06-24 11:42:46 -07:00
Alexey Volkov 374b3b02d2
SDK - Compiler - Made compiler compatible with @wraps (#3956)
Fixes https://github.com/kubeflow/pipelines/issues/3367
2020-06-11 20:03:55 -07:00
Alexey Volkov 40372e5c86
SDK - Compiler - Using properly serialized pipeline parameter defaults (#3832)
* SDK - Compiler - Using properly serialized pipeline parameter defaults

Fixes https://github.com/kubeflow/pipelines/issues/3806

* Sort the keys so that the serialized defaults are stable in python 3.5
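
A minimal sketch of why sorting the keys matters, assuming the defaults are serialized with the standard `json` module. `serialize_default` is a hypothetical helper for illustration, not the SDK's function. On Python 3.5 dict iteration order is not guaranteed, so unsorted output can differ between runs.

```python
import json


def serialize_default(value):
    """Hypothetical helper: serialize a parameter default deterministically."""
    # sort_keys=True makes the output independent of dict insertion order.
    return json.dumps(value, sort_keys=True)


# The same default written with two different key orders.
first = serialize_default({"b": 1, "a": {"d": 2, "c": 3}})
second = serialize_default({"a": {"c": 3, "d": 2}, "b": 1})
print(first == second)
```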
2020-06-09 13:10:04 -07:00
Jiaxiao Zheng 1e2b9d4e7e
[SDK] Add first party component label (#3861)
* add OOB component dict and utility function

* add test

* add a transformer, which appends the component name label

* add transformer function, compiler and test

* move telemetry test

* fix none uri

* applies comments

* revert dependency on frozendict

* fixes some tests

* resolve comments
2020-05-29 08:55:16 -07:00
Thi Nguyen ec9445aa01
Allow PipelineParams in dict keys too. (#3565)
Co-authored-by: Thi Nguyen <duongnt@users.noreply.github.com>
2020-05-19 17:54:19 -07:00
Niklas Hansson 05c1537f28
Add Nodeselector to pipelineconfig fix issue #2863 (#3616)
* updated version

* added pipeline nodeselector

* removed old legacy

* renaming

* update test

* Update sdk/python/kfp/compiler/compiler.py
2020-05-05 00:11:08 -07:00
Eterna2 9167da1b4e
Support execution throttling for executing the pipelines (#3346) (#3439)
* Add parallelism limits to pipeline in kfp sdk

* fix lint error
2020-05-04 23:25:08 -07:00
Jiaxiao Zheng aa8da64b4c
[SDK] Add pod labels for telemetry purpose. (#3578)
* add telemetry pod labels

* revert the id label

* update compiler tests

* update cli arg

* bypass tfx

* update docstring
2020-04-27 18:50:04 -07:00
Alexey Volkov 6cb92d45c8
SDK - Compiler - Include the SDK version information in the compiled workflows (#3583)
* SDK - Compiler - Include the SDK version information in the compiled workflows

* Fixed the unit tests

* Removed the sdk_version annotation.
2020-04-25 01:49:28 -07:00
Niklas Hansson 2354776e1e
fix #2802: Set ImagePullPolicy per pipeline. (#3534)
* bump version

* default image pull policy

* Update sdk/python/kfp/dsl/_pipeline.py

* task setting should dominate

* Update sdk/python/kfp/dsl/_pipeline.py

* fixed merge mistake
2020-04-23 07:09:13 -07:00
Alexey Volkov b63ad7e614
SDK - Removed the ArtifactLocation feature (#3517)
* SDK - Removed the ArtifactLocation feature

The feature was deprecated in v0.1.34 https://github.com/kubeflow/pipelines/pull/2326

* Removed the artifact_location sample
2020-04-23 00:49:44 -07:00
Alexey Volkov 08c7c0ef36
SDK - Made YAML dumping more awesome (#3520)
See the root cause explanation in https://github.com/kubeflow/pipelines/issues/3519
2020-04-16 21:23:07 -07:00
Alexey Volkov 03e064cee2
SDK - Compiler - Fix incompatibility with python3.5 (#3122) 2020-02-19 13:55:47 -08:00
Alexey Volkov a33ae25bc4
SDK - Compiler - Add optional Argo validation (#3094)
argo CLI tool must be in path for this feature to work
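
A small sketch of how a caller might check the PATH requirement before relying on the optional validation. `find_argo_cli` is a hypothetical helper; it only locates the executable and does not run any validation itself.

```python
import shutil


def find_argo_cli():
    """Return the path of the `argo` executable, or None if it is not on PATH."""
    return shutil.which("argo")


argo_path = find_argo_cli()
if argo_path is None:
    print("argo CLI not found on PATH; optional validation is unavailable")
else:
    print("argo CLI found at", argo_path)
```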
2020-02-18 23:12:25 -08:00
Alexey Volkov 4a1b282461
SDK - Compiler - Fixed ParallelFor argument resolving (#3029)
* SDK - Compiler - Fixed ParallelFor name clashes

The ParallelFor argument reference resolving was really broken.
The logic "worked" like this - if the name of the referenced output
contained the name of the loop collection source output, then it was
considered to be the reference to the loop item.
This broke lots of scenarios especially in cases where there were
multiple components with same output name (e.g. the default "Output"
output name). The logic also did not distinguish between references to
the loop collection item vs. references to the loop collection source
itself.

I've rewritten the argument resolving logic, to fix the issues.

* Argo cannot use {{item}} when withParams items are dicts

* Stabilize the loop template names

* Renamed the test case
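
The substring heuristic described in the first item can be illustrated with a small sketch. All names here are hypothetical, and the exact-match function is only one possible alternative, not necessarily the rewritten compiler logic.

```python
def old_is_loop_item_reference(referenced_name, loop_source_name):
    """Old heuristic as described above: plain substring containment."""
    return loop_source_name in referenced_name


def exact_is_loop_item_reference(referenced_name, loop_item_name):
    """Assumed alternative: only an exact name match counts as the loop item."""
    return referenced_name == loop_item_name


loop_source = "Output"            # default output name of the list producer
loop_item = "Output-loop-item"    # hypothetical name given to the loop item
other_output = "other-op-Output"  # unrelated component with the same output name

# The substring check wrongly treats the unrelated output as the loop item.
print(old_is_loop_item_reference(other_output, loop_source))
# The exact check does not.
print(exact_is_loop_item_reference(other_output, loop_item))
```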
2020-02-11 12:18:09 -08:00
Alexey Volkov c83aff2738
SDK - Components - Made it easier to access component spec classes (#2860)
* SDK - Components - Made it easier to access component spec classes

* Updated the imports
2020-01-31 11:41:21 -08:00
Jiaxiao Zheng 358e26adb1 [SDK/compiler] Sanitize op name for PipelineParam (#2711)
* sanitize op name for pipeline param

* refactor sanitization to compiler level, and add unittest
2019-12-27 18:01:39 -08:00
Alexey Volkov b8a2e6f400 SDK/Compiler - Preventing pipeline entrypoint template name from clashing with other template names (#1555)
Case exhibiting the problem:
```
def add(a, b):
    ...
@dsl.pipeline(name='add')
def some_name():
    add(...)
```
2019-12-05 18:08:49 -08:00
Jiaxiao Zheng 790fe99aca [SDK] Relax k8s sanitization (#2634)
* update

* add allow_capital

* fix

* fix volume_ops sample

* fix pipeline name sanitization

* fix unittests

* fix sanitization in _client.py

* fix component output sanitization
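
A rough sketch of what a relaxed sanitizer with an `allow_capital` switch could look like. The regular expressions are assumptions based on the usual Kubernetes naming rules (alphanumerics and `-`), not a copy of the SDK code.

```python
import re


def sanitize_k8s_name(name, allow_capital=False):
    """Hypothetical sanitizer: keep alphanumerics, turn everything else into '-'."""
    if allow_capital:
        invalid = r"[^-0-9a-zA-Z]+"
    else:
        name = name.lower()
        invalid = r"[^-0-9a-z]+"
    name = re.sub(invalid, "-", name)  # replace runs of invalid characters
    name = re.sub(r"-+", "-", name)    # collapse repeated dashes
    return name.strip("-")             # no leading or trailing dash


strict = sanitize_k8s_name("My Pipeline_Name!")
relaxed = sanitize_k8s_name("My Pipeline_Name!", allow_capital=True)
print(strict, relaxed)
```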
2019-11-26 10:28:10 -08:00
Lulu Cheng 07296bc5ba [fix] default yaml.dump to block style (#2591)
* [fix] default every field to block style

* [change] per comment

* [fix] per comment
2019-11-18 18:55:41 -08:00
Jiaxiao Zheng ead912c6f8 [SDK] Fix withItem loop (#2572)
* fix withItem

* clean up and revert sample change

* clean up

* clean up

* clean up

* clean up

* fix

* fix nit
2019-11-07 18:40:19 -08:00
Alexey Volkov 735e627a03 SDK - Refactoring - Split the K8sHelper class (#2333)
* SDK - Refactoring - Split the K8sHelper class

One part was only used by container builder and provided higher-level API over K8s Client.
Another was used by the compiler and did not use the kubernetes library.

* Updated the license year.
2019-10-21 14:57:22 -07:00
Alexey Volkov 1b6047aa69 SDK - Improve errors when ContainerOp.output is unavailable (#1578)
* SDK - Improve errors when ContainerOp.output is unavailable

ContainerOp.output is only available when there is only one output.
Right now, when there are multiple outputs it just holds `None` instead of a task output reference.
In this case however it's indistinguishable from just passing None argument.
This PR gives a quick fix to make accessing the nonexistent `.output` a compile-time error.

* Fixed the implementation and added tests

* Trigger retests
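
The idea from the first item, failing loudly instead of returning `None`, can be sketched as follows. `TaskSketch` is a hypothetical stand-in and not the real `ContainerOp` class; the error type and message are assumptions.

```python
class TaskSketch:
    """Hypothetical stand-in for an op with named outputs."""

    def __init__(self, name, outputs):
        self.name = name
        self.outputs = dict(outputs)

    @property
    def output(self):
        # Only meaningful when there is exactly one output.
        if len(self.outputs) != 1:
            raise RuntimeError(
                "Task '{}' has {} outputs; use .outputs[name] instead of .output".format(
                    self.name, len(self.outputs)
                )
            )
        return next(iter(self.outputs.values()))


single = TaskSketch("train", {"model": "train-model"})
multi = TaskSketch("split", {"train": "split-train", "test": "split-test"})
print(single.output)
```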
2019-10-11 18:20:40 -07:00
Alexey Volkov 181de66cf9 SDK - Compiler - Move Argo volume specifications to templates (#2229)
* SDK - Compiler - Move volumes to templates

Argo v2.3.0+ supports per-template volume specs similar to Kubernetes. Prior to version 2.3.0 Argo only supported workflow-level volume specs.
We had several outstanding issues caused by the need to put all volumes in the same place.
There was also the issue with input parameter reference placeholders in volume specifications which were placed outside their home templates declaring the inputs.

This change fixes those issues.

* Removed dead code line
2019-10-07 16:55:12 -07:00
Jiaxiao Zheng 092845d134 [SDK/Compiler] Add _create_and_write_workflow method (#2321)
* add _create_and_write_workflow

* Add pointer to TFX dag runner usage.
2019-10-07 14:13:10 -07:00
Alexey Volkov 4b33f1b550 SDK - Compiler - Fixed deprecation warning when calling compile (#2303) 2019-10-04 13:09:12 -07:00
Jiaxiao Zheng 9a9bd904ac [SDK-compiler] Refactor Compiler to expose an API to write out yaml spec of pipeline. (#2146)
* Refactor.

* Remove redundant code.

* Fix.

* Move the implementation of create_workflow into a private api.

* Change write_workflow to private.

* deprecation warning
2019-10-03 16:45:56 -07:00
Alexey Volkov c128b2a7b4 SDK - Compiler - Make it possible to create more portable pipelines (#2271)
* SDK - Compiler - Allow creating portable pipelines

This change allows directly passing the PipelineConf instance to compiler or launcher which makes it easier to create portable pipelines by allowing the environment-specific configuration to be directly passed to the environment-specific launcher.

Background:
PipelineConf holds all pipeline-level configuration including `op_transformers`, `image_pull_secrets` etc. Some of these are specific to particular execution environment (e.g. GCP secret or Argo artifact location or Kubernetes-specific options).
Previously, the only way to modify `PipelineConf` was to do it inside the pipeline function. That tied the pipeline function to a specific execution environment (e.g. GCP, Argo or Kubernetes).

Solution: This change allows directly passing the PipelineConf instance to compiler or launcher. This allows writing portable, launcher- and environment-agnostic pipeline functions. All environment-specific configuration can be moved to the launching stage.

Before:
```python
import kfp
from kfp import dsl, gcp

# Defining pipeline
def my_pipeline():
    # portable pipeline code

    dsl.get_pipeline_conf().add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))

# Launching pipeline
kfp.Client().create_run_from_pipeline_func(my_pipeline, arguments={})
```

After:
```python
import kfp
from kfp import dsl, gcp

# Defining pipeline
def my_pipeline():
    # portable pipeline code
    pass

# Launching pipeline
pipeline_conf = dsl.PipelineConf()
pipeline_conf.add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))
kfp.Client().create_run_from_pipeline_func(my_pipeline, arguments={}, pipeline_conf=pipeline_conf)
```

After 2 (launching the same portable pipeline using different launchers):
```python
import kfp
from kfp import dsl, gcp

# Loading portable pipeline
from portable_pipeline import my_pipeline

# Launching pipeline on Kubeflow
pipeline_conf = dsl.PipelineConf()
pipeline_conf.add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))
kfp.Client().create_run_from_pipeline_func(my_pipeline, arguments={}, pipeline_conf=pipeline_conf)

# Launching pipeline locally (not implemented yet)
kfp.run_pipeline_func_locally(my_pipeline, arguments={})
```

* Added parameter docstring
2019-10-02 20:58:08 -07:00
Alexey Volkov ef63c653af SDK - Compiler - Fix large data passing (#2173)
* SDK - Compiler - Fix large data passing

Stop outputting parameters unless they're consumed as parameters downstream.
This prevents the situation where a component outputs a big file, but the DSL compiler instructs Argo to pick it up as a parameter (parameters only hold a few kilobytes of data).

As a byproduct, this change fixes some minor compiler data passing bugs where some parameters were being passed around, but never consumed (happened with `ResourceOp`, `dsl.Condition` and recursion).

* Replaced ... with `raise AssertionError`

* Fixed small bug

* Removed unused variables

* Fixed names of the mark_upstream_ios_of_* functions

* Fixed detection of parameter output references

* Fixed handling of volumes
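
The rule from the first item, emitting an output as a parameter only when something downstream consumes it as a parameter, can be sketched like this. The function and the sample names are hypothetical; the real compiler works on its own op and template structures.

```python
def select_parameter_outputs(task_outputs, consumed_as_parameters):
    """Keep only the outputs that some downstream consumer reads as a parameter."""
    return [name for name in task_outputs if name in consumed_as_parameters]


task_outputs = ["model", "accuracy"]
# Assume only "accuracy" is used as a parameter downstream (e.g. in a condition);
# "model" is a large file and should not be picked up as a parameter.
consumed_as_parameters = {"accuracy"}

parameter_outputs = select_parameter_outputs(task_outputs, consumed_as_parameters)
print(parameter_outputs)
```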
2019-09-20 15:05:27 -07:00
Alexey Volkov 0e2bf15dbc
SDK - Refactoring - Replaced the *Meta classes with the *Spec classes (#1944)
* SDK - Refactoring - Replaced the ParameterMeta class with InputSpec and OutputSpec

* SDK - Refactoring - Replaced the internal PipelineMeta class with ComponentSpec

* SDK - Refactoring - Replaced the internal ComponentMeta class with ComponentSpec

* SDK - Refactoring - Replaced the *Meta classes with the *Spec classes

Replaced the ComponentMeta class with ComponentSpec
Replaced the PipelineMeta class with ComponentSpec
Replaced the ParameterMeta class with InputSpec and OutputSpec

* Removed empty fields
2019-09-16 18:41:12 -07:00
Kevin Bache 2ca7d0ac31 WithParams (#2044)
* first working commit

* incremental commit

* in the middle of converting loop args constructor to accept pipeline param

* both cases working

* output works, passed doesn't

* about to redo compiler section

* rewrite draft done

* added withparam tests

* removed sdk/python/comp.yaml

* minor

* subvars work

* more tests

* removed unneeded artifact outputs from test yaml

* sort keys

* removed dead artifact code
2019-09-16 17:58:22 -07:00
Jiaxiao Zheng 1449d08aee Fix the logic of passing default values of pipeline parameters. (#2098)
* Fix the logic of passing default values.

* Modify unit test

* Solve.
2019-09-12 17:10:33 -07:00
Jiaxiao Zheng 497d016e85 Expose an API for appending params/names/descriptions in a programmable way. (#2082)
* Refactor. Expose a public API to append pipeline param without interacting with dsl.Pipeline obj.

* Add unit test and fix.

* Fix docstring.

* Fix test

* Fix test

* Fix two nit problems

* Refactor
2019-09-10 17:58:47 -07:00
Alexey Volkov d83601d19a SDK - Compiler - Quoting the predicate operands (#2043)
Fixes https://github.com/kubeflow/pipelines/issues/1950
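
A minimal sketch of what quoting the operands means, assuming the operands are substituted into the condition as plain text, so an empty value or a value with spaces would otherwise produce a malformed expression. `build_when_expression` and the parameter name are hypothetical.

```python
def build_when_expression(operand1, operator, operand2):
    """Hypothetical helper: render a condition with both operands quoted."""
    return '"{}" {} "{}"'.format(operand1, operator, operand2)


expr = build_when_expression("{{inputs.parameters.flip-output}}", "==", "heads")
print(expr)
```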
2019-09-06 17:05:21 -07:00
Alexey Volkov 979396702e SDK - Compiler - Failing when PipelineParam is unresolved (#2055)
Instead of silently producing a broken pipeline package, the compiler now raises error and instructs the user to submit a bug report.
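
One way such a check could work is to scan the compiled text for leftover placeholders and fail if any remain. This is only a sketch: the placeholder syntax in the pattern and the helper name are assumptions, not taken from this commit.

```python
import re

# Assumed placeholder syntax for a serialized, still-unresolved parameter.
UNRESOLVED_PATTERN = re.compile(r"\{\{pipelineparam:op=([^;]*);name=([^}]*)\}\}")


def check_no_unresolved_params(workflow_text):
    """Raise instead of silently returning a broken workflow."""
    matches = UNRESOLVED_PATTERN.findall(workflow_text)
    if matches:
        raise ValueError(
            "Unresolved pipeline parameters remain: {}. "
            "Please submit a bug report.".format(matches)
        )


check_no_unresolved_params("args: ['{{inputs.parameters.model}}']")  # passes
```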
2019-09-06 15:51:20 -07:00
Jiaxiao Zheng bd9d6319c8
Refactor kfp.compiler for better modularity (#2052)
* init analyze

* Refactor

* Renaming
2019-09-06 13:52:23 -07:00
Alexey Volkov f911742d1a SDK - Compiler - Fixed handling of PipelineParams in artifact arguments (#2042)
Previously only constant strings were supported and serialized PipelineParams were not resolved, producing incorrect workflows.
2019-09-05 15:16:58 -07:00
Alexey Volkov 301186cc87 SDK - Refactoring - Reduced the usage of dsl.Pipeline context (#2034)
Also reduced the unnecessary explicit usage of PipelineParam by the end users
2019-09-05 01:26:52 -07:00
Timur Solovev 8fce00642c SDK: fix setting pipeline-wide artifact_location for ResourceOp and VolumeOp classes and add description field for create_experiment() function (#2025)
* fix setting pipeline-wide artifact_location for ResourceOp and VolumeOp classes

* add description field for create_experiment() function
2019-09-03 21:54:59 -07:00
Alexey Volkov 0fc68bbdd4 SDK - Added support for raw input artifact argument values to ContainerOp (#791)
* SDK - Added support for raw artifact values to ContainerOp

* `ContainerOp` now gets artifact arguments from command line instead of the constructor.

* Added back input_artifact_arguments to the ContainerOp constructor.
In some scenarios it's hard to provide the artifact arguments through the `command` list when it already has resolved artifact paths.

* Exporting InputArtifactArgument from kfp.dsl

* Updated the sample

* Properly passing artifact arguments as task arguments
as opposed to default input values.

* Renamed input_artifact_arguments to artifact_arguments to reduce confusion

* Renamed InputArtifactArgument to InputArgumentPath
Also renamed input_artifact_arguments to artifact_argument_paths in the ContainerOp's constructor

* Replaced getattr with isinstance checks.
getattr is too fragile and can be broken by renames.

* Fixed the type annotations

* Unlocked the input artifact support in components
Added the test_input_path_placeholder_with_constant_argument test
2019-08-28 21:09:57 -07:00
Kevin Bache 96fd19356c WithItems Support (#1868)
* hacking

* hacking 2

* moved withitems to opsgroup

* basic loop test working

* fixed nested loop bug, added tests

* cleanup

* gitignore; compiler tests

* cleanup

* tests fixup

* removed format strings

* removed uuid override from test

* cleanup

* responding to comments

* removed compiler withitems test

* removed pipeline param typemeta
2019-08-23 21:00:28 -07:00
Alexey Volkov c01315a89d
SDK - Refactoring - Replaced the TypeMeta class (#1930)
* SDK - Refactoring - Replaced the TypeMeta class
The PipelineParam no longer exposes the private TypeMeta class
Fixes #1420

The refactoring PR is part of a series of PR which unifies the metadata and specification types.
2019-08-22 15:31:24 -07:00
Alexey Volkov d8eaeaad95
SDK - Preserving the pipeline input information in the compiled Workflow (#1381)
* SDK - Preserving the pipeline metadata in the compiled Workflow

* Stabilizing the DSL compiler tests
2019-08-15 17:25:59 -07:00
Eterna2 08ff76f5f1 [Feature] Set ttlSecondsAfterFinished in argo workflow with PipelineConf (#1594)
* Add PipelineConf method to set ttlSecondsAfterFinished in argo workflow spec

* remove unnecessary compile test for ttl. add unit test for ttl instead.
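
In terms of the compiled workflow, the setting amounts to one field in the workflow spec. A minimal sketch on a plain dict follows; the helper name and the validation are assumptions, only the `ttlSecondsAfterFinished` field name comes from the commit title.

```python
def set_ttl_seconds_after_finished(workflow, seconds):
    """Hypothetical helper: record the TTL in a workflow dict and return it."""
    if seconds < 0:
        raise ValueError("TTL must be non-negative")
    workflow.setdefault("spec", {})["ttlSecondsAfterFinished"] = seconds
    return workflow


workflow = {"kind": "Workflow", "spec": {"entrypoint": "my-pipeline"}}
set_ttl_seconds_after_finished(workflow, 86400)  # keep finished runs for one day
print(workflow["spec"])
```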
2019-07-24 09:26:15 -07:00
Ning 28b871a2be fix dependency bug in the recursion support (#1616) 2019-07-12 12:59:05 -07:00
hongye-sun 97ccfd0c9a add_pod_env op handler (#1540)
* Configure gcp connectors in dsl

* Make configure_gcp_connector more extensible

* Add add_pod_env op handler.

* Only apply add_pod_env on ContainerOp

* Update license header
2019-07-08 11:42:35 -07:00
Ning f23b619880 fix recursion bug (#1583)
* fix recursion bug

* propagate inputs to out layers of opsgroup; adjust unit tests
2019-07-01 16:55:08 -07:00
Krassimir Valev 381083a7c3 SDK/Compiler - Invoke the op_transformers as early as possible (#1464)
* Add reproducible test case

* Invoke the op_transformers as early as possible
2019-06-07 14:05:57 -07:00
hongye-sun b1fa929442 Add op_to_templates_handler to compiler (#1458)
* Add op_to_templates_handler to compiler

* Update tests.
2019-06-06 20:13:58 -07:00
Ning 5061fcffcf
Add timeout in dsl (#1465)
* add timeout in dsl
* add pipeline level timeout
2019-06-06 17:42:10 -07:00
Alexey Volkov 2a9bbdf120 SDK/Compiler - Added the ability to apply a function to all ops in a pipeline (#1209)
* SDK/Compiler - Added op and template transformers
They can be used to apply some functions (e.g. to add secrets) to all pipeline ops.

* Removed the template_transformers for now

* Moved the op_transformers to PipelineConf

* Added op_transformers test
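
The transformer idea can be sketched without the SDK: a transformer is a function that takes an op and returns it (possibly modified), and the compiler applies every registered transformer to every op. `OpSketch`, `add_env` and `apply_transformers` are hypothetical names for illustration.

```python
class OpSketch:
    """Hypothetical stand-in for a pipeline op."""

    def __init__(self, name):
        self.name = name
        self.env = {}


def add_env(key, value):
    """Build a transformer that sets one environment variable on an op."""
    def transformer(op):
        op.env[key] = value
        return op
    return transformer


def apply_transformers(ops, transformers):
    """Apply every transformer to every op, in registration order."""
    result = []
    for op in ops:
        for transformer in transformers:
            # Tolerate transformers that modify in place and return None.
            op = transformer(op) or op
        result.append(op)
    return result


ops = apply_transformers(
    [OpSketch("train"), OpSketch("evaluate")],
    [add_env("SECRET_NAME", "user-gcp-sa")],
)
print([(op.name, op.env) for op in ops])
```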
2019-05-22 19:48:23 -07:00
Eterna2 91d941d6e5 [Feature] Supports parameterized S3Artifactory for Pipeline and ContainerOp in kfp package (#1064)
* kfp can declare custom artifact location in pipeline and containerop.

* Removed default artifact location

* Minor fixes
2019-05-14 19:48:20 -07:00
Ilias Katsakioris b675e0272b Remove cops and rops pipeline attributes (#1298)
* Remove the separated dictionaries for ContainerOps and ResourceOps
* Fix the sanitization performed by the compiler to iterate through ops
  dict and do type-check for the special fields file_outputs and
  attribute_outputs

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
2019-05-09 17:25:57 -07:00
Ilias Katsakioris 07cb50ee0c Extend the DSL to implement the design of #801 (#926)
* SDK: Create BaseOp class

* BaseOp class is the base class for any Argo Template type
* ContainerOp derives from BaseOp
* Rename dependent_names to deps

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: In preparation for the new feature ResourceOps (#801)

* Add cops attributes to Pipeline. This is a dict having all the
  ContainerOps of the pipeline.
* Set some processing in _op_to_template as ContainerOp specific

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the consumption of Volumes by ContainerOps

Add `pvolumes` argument and attribute to ContainerOp. It is a dict
having mount paths as keys and V1Volumes as values. These are added to
the pipeline and mounted by the container of the ContainerOp.

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Add ResourceOp

* ResourceOp is the SDK's equivalent for Argo's resource template
* Add rops attribute to Pipeline: Dictionary containing ResourceOps
* Extend _op_to_template to produce the template for ResourceOps
* Use processed_op instead of op everywhere in _op_to_template()
* Add samples/resourceop/resourceop_basic.py
* Add tests/dsl/resource_op_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the creation of PersistentVolumeClaim instances

* Add VolumeOp: A specialized ResourceOp for PVC creation
* Add samples/resourceops/volumeop_basic.py
* Add tests/dsl/volume_op_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Emit a V1Volume as `.volume` from dsl.VolumeOp

* Extend VolumeOp so it outputs a `.volume` attribute ready to be
  consumed by the `pvolumes` argument to ContainerOp's constructor
* Update samples/resourceop/volumeop_basic.py
* Extend tests/dsl/volume_op_tests.py
* Update tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Add PipelineVolume

* PipelineVolume inherits from V1Volume and it comes with its own set of
  KFP-specific dependencies. It is aligned with how PipelineParam
  instances are used. I.e. consuming a PipelineVolume leads to implicit
  dependencies without the user having to call the `.after()` method on
  a ContainerOp.
* PipelineVolume comes with its own `.after()` method, which can be used
  to append extra dependencies to the instance.
* Extend ContainerOp to handle PipelineVolume deps
* Set `.volume` attribute of VolumeOp to be a PipelineVolume instead
* Add samples/resourceops/volumeop_{parallel,dag,sequential}.py
* Fix tests/dsl/volume_op_tests.py
* Add tests/dsl/pipeline_volume_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the creation of VolumeSnapshot instances

* VolumeSnapshotOp: A specialized ResourceOp for VolumeSnapshot creation
* Add samples/resourceops/volume_snapshotop_{sequential,rokurl}.py
* Add tests/dsl/volume_snapshotop_tests.py
* Extend tests/compiler/compiler_tests.py

NOTE: VolumeSnapshots is an Alpha feature at the time of this commit.

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Extend UI for the ResourceOp and Volumes feature of the Compiler

* Add VolumeMounts tab/entry (Run/Pipeline view)
* Add Manifest tab/entry (Run/Pipeline view)
* Add & Extend tests
* Update tests snapshot files

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Cleaning up the diff (before moving things back)

* Renamed op.deps back to op.dependent_names

* Moved Container, Sidecar and BaseOp classes back to _container_op.py
This way the diff is much smaller and more understandable. We can always split or refactor the file later. Refactorings should not be mixed with genuine changes.
2019-04-25 10:40:48 -07:00
Alexey Volkov c67aea779e SDK - Simplified the @pipeline decorator (#1120)
* SDK - Simplified the @pipeline decorator
Moved metadata-related code to _metadata.
`Pipeline.get_pipeline_functions` now returns the list of pipeline functions.

* Addressed @gaoning777's PR feedback
2019-04-12 13:14:47 -07:00
Ning 1a04e86ed7 Recursion bug fix (#1061)
* remove the graph component output; add support for dependency on graph component

* fix bug; adjust unit tests

* add support for explicit dependency of graph component

* adjust unit test

* add a todo

* bug fixes for unit tests

* refactor condition_param code; fix bug when the inputs task name is None; need to remove the print later

* do not pass condition param as arguments to downstream ops, remove print logs; add unit tests

* add unit test golden yaml

* fix bug

* fix the sample
2019-04-02 09:49:19 -07:00
Eterna2 825f64d672 Feature: sidecar for ContainerOp (#879)
* Feature: sidecar for ContainerOp

* replace f-string with string format for compatibility with py3.5

* ContainerOp now can be updated with any k8s V1Container attributes as well as sidecars with Sidecar class. ContainerOp accepts PipelineParam in any valid k8s properties.

* WIP: fix conflicts and bugs with recent master. TODO: more complex template with pipeline params

* fix proxy args

* Fixed to work with latest master head

* Added container_kwargs to ContainerOp to pass in k8s container kwargs

* Fix comment bug, updated with example in ContainerOp docstring

* fix copyright year

* expose match_serialized_pipelineparam as public for compiler to process serialized pipeline params

* fixed pydoc example and removed unnecessary ContainerOp.container.parent

* Fix conflicts in compiler tests
2019-03-28 11:11:30 -07:00
Ning f59c25bd04 Add type check samples (#955)
* add core types and type checking function

* fix unit test bug

* avoid defining dynamic classes

* typo fix

* add component metadata format

* add a construct for the component decorator

* add default values for the meta classes

* add input/output types to the metadata

* add from_dict in TypeMeta

* small fix

* add unit tests

* use python struct for the openapi schema

* add default in parameter

* add default value

* remove the str restriction for the param default

* bug fix

* add pipelinemeta

* add pipeline metadata

* ignore annotation if it is not str/BaseType/dict

* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability

* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name

* pass metadata from component decorator and task factory to containerOp

* pass pipeline metadata to Pipeline

* fix unit test

* typo in the comments

* move the metadata classes to a separate module

* fix unit test

* small change

* add __eq__ to meta classes
not export _metadata classes

* nothing

* fix unit test

* unit test python component

* unit test python pipeline

* fix bug: duplicate variable of args

* fix unit tests

* move python_component and _component decorator in _component file

* remove the print

* change parameter default value to None

* add functools wraps around _component decorator

* TypeMeta accept both str and dict

* fix indent, add unit test for type as strings

* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta

* add type check in task factory

* output error message

* add type check in component decorator; move the metadata assignment out of the containerop __init__ function

* fix bug; add unit test

* add more unit tests

* more unit tests; fix bugs

* more unit tests; fix bugs

* add unit tests

* more unit tests

* add type check switch; add unit tests

* add compiler option for type check

* add a notebook sample

* resolving pr comments

* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there is not a single return annotation

* add dsl static type checking sample

* fix bug: op_to_template resolve the raw arguments by mapping to the argument_inputs but the argument_inputs lost the type information

* fix type pattern matching

* convert orderedDict to dict from the component module

* add unit test to the pipelineparam with types

* create TypeMeta deserialize function, add comments

* strongly typed pipelineparamtuple

* remove GCSPath fields to avoid artifact type confusion
change the type json schema field name to openAPIV3Schema

* fix unit tests; add unit test for openapishema property

* add comments

* add unit test at the component module; fix bug

* add ignore_type in pipelineparam

* update sample: no artifact types but only parameter types; add pipelineparam ignore_type example

* configure the default type checking to enabled

* change openAPIV3Schema to lower case with underscore

* revert change from the merge

* add code blocks, add the benefits of static type checking
add more comments within the code block
add documentation about the type definition in both yaml and decorated
components.

* fix the comment

* update dsl.type namespace
2019-03-27 19:58:42 -07:00
Ning 554731e478 dsl generate zip file (#855)
* dsl generate zip file

* minor fix

* fix zip read in the unit test

* update sample tests

* dsl compiler generates pipeline based on the input name suffix

* add unit tests for different output format

* update the sdk client to support tar zip and yaml

* fix typo

* fix file write
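
Choosing the output format from the file name suffix can be sketched with the standard library alone. The helper name, the accepted suffixes and the `pipeline.yaml` member name are assumptions for illustration.

```python
import io
import tarfile
import zipfile


def write_workflow(yaml_text, package_path):
    """Hypothetical helper: write the workflow in the format implied by the suffix."""
    data = yaml_text.encode("utf-8")
    if package_path.endswith((".tar.gz", ".tgz")):
        with tarfile.open(package_path, "w:gz") as tar:
            info = tarfile.TarInfo("pipeline.yaml")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    elif package_path.endswith(".zip"):
        with zipfile.ZipFile(package_path, "w") as archive:
            archive.writestr("pipeline.yaml", data)
    elif package_path.endswith((".yaml", ".yml")):
        with open(package_path, "wb") as f:
            f.write(data)
    else:
        raise ValueError("Unsupported output suffix: " + package_path)
```

For example, `write_workflow(text, "pipeline.zip")` would produce a zip archive with a single `pipeline.yaml` member, while an unknown suffix raises `ValueError`.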
2019-03-26 15:14:50 -07:00
Ning 8c09090985 Support recursions in a function (#1014)
* add a While in the ops group

* deepcopy the while conditions when entering and exiting

* add while condition resolution in the compiler

* define graph component decorator

* remove while loop related codes

* fixes

* remove while loop related code

* fix bugs

* generate a unique ops group name and being able to retrieve by name

* resolve the opsgroups inputs and dependencies based on the pipelineparam in the condition

* add a recursive ops_groups

* fix bugs of the recursive opsgroup template name

* resolve the recursive template name and arguments

* add validity checks

* add more comments

* add usage comment in graph_component

* add unit test for the graph opsgraph

* refactor the opsgroup

* add unit test for the graph_component decorator

* exposing graph_component decorator

* add recursive compiler unit tests

* fix the bug of opsgroup name
adjust the graph_component usage example
fix index bugs
use with statement in the graph_component instead of directly calling
the enter/exit functions

* add a todo to combine the graph_component and component decorators
2019-03-26 14:17:18 -07:00
Alexey Volkov fac06e9a87 SDK/DSL/Compiler - Fixed handling of empty pipeline name (#1009)
Fixes https://github.com/kubeflow/pipelines/issues/825
2019-03-21 15:44:18 -07:00
Ning 2accf4180a
Add unit tests pipelineparam (#975)
* add unit test to the pipelineparam with types
* create TypeMeta deserialize function, add comments
* strongly typed pipelineparamtuple
* addressing pr comments
2019-03-18 18:07:36 -07:00
Ning 754db1f724
Fix sample test failure because of the type information in the pipelineparam (#972)
* fix bug: op_to_template resolve the raw arguments by mapping to the argument_inputs but the argument_inputs lost the type information

* fix type pattern matching

* convert orderedDict to dict from the component module
2019-03-15 13:49:21 -07:00
Ning c829115574 Add type check (#938)
* add core types and type checking function

* fix unit test bug

* avoid defining dynamic classes

* typo fix

* add component metadata format

* add a construct for the component decorator

* add default values for the meta classes

* add input/output types to the metadata

* add from_dict in TypeMeta

* small fix

* add unit tests

* use python struct for the openapi schema

* add default in parameter

* add default value

* remove the str restriction for the param default

* bug fix

* add pipelinemeta

* add pipeline metadata

* ignore annotation if it is not str/BaseType/dict

* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability

* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name

* pass metadata from component decorator and task factory to containerOp

* pass pipeline metadata to Pipeline

* fix unit test

* typo in the comments

* move the metadata classes to a separate module

* fix unit test

* small change

* add __eq__ to meta classes
do not export _metadata classes

* nothing

* fix unit test

* unit test python component

* unit test python pipeline

* fix bug: duplicate variable of args

* fix unit tests

* move python_component and _component decorator in _component file

* remove the print

* change parameter default value to None

* add functools wraps around _component decorator

* TypeMeta accept both str and dict

* fix indent, add unit test for type as strings

* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta

* add type check in task factory

* output error message

* add type check in component decorator; move the metadata assignment out of the containerop __init__ function

* fix bug; add unit test

* add more unit tests

* more unit tests; fix bugs

* more unit tests; fix bugs

* add unit tests

* more unit tests

* add type check switch; add unit tests

* add compiler option for type check

* resolving pr comments

* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there is not a single return annotation
2019-03-11 11:22:12 -07:00
Alexey Volkov 7ce03f07d4 SDK/DSL/Compiler - Fixed compilation when using ContainerOp.after (#943)
Fixes https://github.com/kubeflow/pipelines/issues/941
2019-03-07 18:47:30 -08:00
Ning 974d602b74
Pass meta to containerop and pipeline (#905)
pass metadata from python conf to containerop and the pipeline
2019-03-06 13:42:23 -08:00
Alexey Volkov 63202ea9fd Configure artifact name and path separately (#900) 2019-03-05 18:16:00 -08:00
Ning 9ebbaa313d support pipeline level imagepullsecret in DSL (#745)
* support pipeline level imagepullsecret in DSL

* use kubernetes native input parameter for imagepullsecrets

* expose a module level function to configure the pipeline settings for the current default pipeline
2019-02-05 13:16:43 -08:00
qimingj 3b3a15e16a Add "set_retry()" on ContainerOp. (#723)
* Add "set_retry()" on ContainerOp.

* Follow up on CR comments.

* Update docstring.

* Increase retry times for test.

* Fix test.
2019-01-23 17:35:34 -08:00
Ajay Gopinathan 578e8231d0 Update all Pipelines CRD versions to v1beta1. (#681) 2019-01-17 19:35:51 -08:00
Ning d3c4add0a9 DSL refactor (#619)
* add comments

* relocate functions in compiler to aggregate similar functions; move _build_conventional_artifact as a nested function

* reduce sanitize functions into one in the dsl.

* more comments

* move all sanitization(op name, param name) from dsl to compiler

* sanitize pipelineparam name and op_name; remove format check in pipelineparam

* remove unit test for pipelineparam op_name format checking

* fix bug: correctly replace input in the argument list

* fix bug: replace arguments with found ones

* Sanitize the file_output keys, match the param in the args/cmds with the whole serialized param str, verify both param name and container name

* loosen the containerop and param name restrictions
2019-01-08 20:00:17 -08:00
qimingj 875efea1f9 Support replaceable arguments in command as well (besides arguments) in container op. (#623)
* Support replaceable arguments in command as well (besides arguments) in container op.

* Fix components builder.

* Fix tests.

* Follow up CR comments.

* Fix test.
2019-01-07 07:57:36 -08:00
Ning 85c6413a2e Refactor Python SDK (#568)
* add some comments

* remove unused import; add license to dsl_bridge

* move _convert_k8s_obj_to_dic from compiler to k8s_helper

* move unit test
2018-12-20 09:51:09 -08:00
Ning b313e4060c remove duplicate volumes (#558)
* remove duplicate volumes

* add a todo
2018-12-18 10:39:43 -08:00
hongye-sun 55e18269c3 support tpu settings in dsl (#491)
* support tpu settings in dsl

* fix issues from review comment
2018-12-06 21:55:36 -08:00
Alexey Volkov b0461f51ff SDK/DSL - Added support for 5 more conditional operations (#309) 2018-12-03 12:33:24 -08:00
Chris Van Pelt 26fd724ec5 Fix for k8s dict parsing (#411)
* Add support for minio artifacts

* Add new tests for parity

* Fix for sdk env bug

* improved test
2018-11-30 11:44:38 -08:00
qimingj 0b7120c322 Now pipeline function takes direct default values rather than dsl.PipelineParam. (#110)
* Now pipeline function takes direct default values rather than dsl.PipelineParam. It simplifies the sample code a lot.

* Remove extraneous parenthesis.

* Follow up CR comments.

* Change Dockerfile (not done).

* Fix dockerfile.

* Fix Dockerfile again.

* Remove unneeded installation of packages in Dockerfile.
2018-11-26 17:13:55 -08:00
hongye-sun 486d43ddfb Add support for nvidia gpu limit (#346)
* Add support for nvidia gpu limit

* Expose resource limits, requests and nodeSelector to ContainerOp

* Fix test data

* Add explicit set_gpu_limit function

* Fix logical bug
2018-11-21 17:49:42 -08:00
Alexey Volkov 906ad680ed SDK/DSL/Compiler - Improved compilation of dsl.Conditional - UX support done (#177)
* Fixed compilation of dsl.Conditional
The compiler no longer produces intermediate steps.

* Got rid of _create_new_groups

* Changed the sub_group.type check

* Update frontend handling of graphs (#293)

* Updates the frontend to correctly parse the new format of conditional pipelines

* WIP - Assume tasks and templates don't share names

* Greatly simplifies graphing of conditional and non-conditional pipelines

* Adds/updates StaticParser tests

* Give nodes unique names
2018-11-19 14:01:10 -08:00
Yang Pan a47eb10558 Add volume, volumemount and env to container op (#300)
* [WIP] change deployment platform to gcp

* debug

* revert test

* add volume

* update test

* to list

* fix

* to list

* to list

* to list

* to list

* stage

* update

* update

* Undid style changes

* address comments

* update comments
2018-11-16 18:36:38 -08:00
Yang Pan 7e34b12e8d Add gcp secret parameter to container op (#261)
* add secret

* add secret to container op

* update comments

* address comments

* update logic

* fix
2018-11-15 10:06:14 -08:00
Alexey Volkov 199a962e42 SDK - Relative imports (#156)
Made all SDK imports relative so that the files always refer to their sibling files instead of the installed package. This makes debugging and development easier since you can be sure the correct files are used.
2018-11-10 13:56:12 -08:00
Alexey Volkov 43b2381d3b SDK/DSL-compiler - Compile without temporary files (#172)
This also avoids writing, closing, and then re-reading the same temporary file, which is not always supported.
2018-11-10 12:51:29 -08:00
Alexey Volkov 7e4569324b SDK/DSL/Compiler - Reverted fix of dsl.Condition until the UI is ready. (#94) 2018-11-06 12:33:22 -08:00
Alexey Volkov 98e4d2f881 SDK/DSL/Compiler - Fixed compilation of dsl.Condition (#28)
* Fixed compilation of dsl.Conditional
The compiler no longer produces intermediate steps.

* Got rid of _create_new_groups

* Changed the sub_group.type check

* Fix tfx name bug in the tfma sample test (#67)

* fix tfx name bug

* update release build for the data publish
2018-11-06 09:26:41 -08:00
Pascal Vicaire 633e2ddcc8 Initial commit of the kubeflow/pipeline project. 2018-11-02 14:02:31 -07:00