Commit Graph

71 Commits

Author SHA1 Message Date
IronPan 02313e4e5e Cleanup Kaniko code (#1394)
/assign @gaoning777 @Ark-kun
2019-05-28 16:46:17 -07:00
Alexey Volkov 2a9bbdf120 SDK/Compiler - Added the ability to apply a function to all ops in a pipeline (#1209)
* SDK/Compiler - Added op and template transformers
They can be used to apply some functions (e.g. to add secrets) to all pipeline ops.

* Removed the template_transformers for now

* Moved the op_transformers to PipelineConf

* Added op_transformers test
2019-05-22 19:48:23 -07:00
Eterna2 91d941d6e5 [Feature] Supports parameterized S3Artifactory for Pipeline and ContainerOp in kfp package (#1064)
* kfp can declare custom artifact location in pipeline and containerop.

* Removed default artifact location

* Minor fixes
2019-05-14 19:48:20 -07:00
Alexey Volkov be4e56a247 SDK - Stopped hard-coding artifact storage configuration in the pipeline packages (#1297)
We should follow Argo's prefferred way to configure the artifact storage: https://github.com/argoproj/argo/blob/master/ARTIFACT_REPO.md#configure-the-default-artifact-repository and use a cluster-local configMap. This way the pipelines remain clean and portable: https://github.com/argoproj/argo/blob/master/examples/artifact-passing.yaml
Kubeflow deployer has already been pre-installing the configMap for several months: https://github.com/kubeflow/kubeflow/pull/2238
2019-05-13 17:29:08 -07:00
Ilias Katsakioris b675e0272b Remove cops and rops pipeline attributes (#1298)
* Remove the separated dictionaries for ContainerOps and ResourceOps
* Fix the sanitization performed by the compiler to iterate through ops
  dict and do type-check for the special fields file_outputs and
  attribute_outputs

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
2019-05-09 17:25:57 -07:00
Hamed ce6066136d support tolerations for ContainerOps (#1269)
* add tolerations to ContainerOps

* add test

* add type for tolerations

* remove fix

* remove print
2019-05-09 16:37:59 -07:00
Alexey Volkov b61bef04a3 SDK - Renamed ModelBase.from_struct/to_struct to from_dict/to_dict (#1290) 2019-05-07 14:06:35 -07:00
Alexey Volkov 6b0d4d13e7 Marking the UI-metadata and Metrics artifacts as optional (#1260)
* Marking the UI-metadata and Metrics artifacts as optional

* Updated the compiler tests

* Updates one more test.
2019-04-29 17:27:37 -07:00
Ning d6a9f60ddb display kaniko log if failed (#1247) 2019-04-29 13:03:35 -07:00
Tommy Li bb0a5e36f6 Parameterize the artifact path for mlpipeline ui-metadata and metrics (#998)
* parameterize artifact path for ui-metadata and metrics

* change output_artifact_paths as containerops args

* change output_artifact_paths default args to None
2019-04-25 12:08:34 -07:00
Ilias Katsakioris 07cb50ee0c Extend the DSL to implement the design of #801 (#926)
* SDK: Create BaseOp class

* BaseOp class is the base class for any Argo Template type
* ContainerOp derives from BaseOp
* Rename dependent_names to deps

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: In preparation for the new feature ResourceOps (#801)

* Add cops attributes to Pipeline. This is a dict having all the
  ContainerOps of the pipeline.
* Set some processing in _op_to_template as ContainerOp specific

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the consumption of Volumes by ContainerOps

Add `pvolumes` argument and attribute to ContainerOp. It is a dict
having mount paths as keys and V1Volumes as values. These are added to
the pipeline and mounted by the container of the ContainerOp.

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Add ResourceOp

* ResourceOp is the SDK's equivalent for Argo's resource template
* Add rops attribute to Pipeline: Dictionary containing ResourceOps
* Extend _op_to_template to produce the template for ResourceOps
* Use processed_op instead of op everywhere in _op_to_template()
* Add samples/resourceop/resourceop_basic.py
* Add tests/dsl/resource_op_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the creation of PersistentVolumeClaim instances

* Add VolumeOp: A specified ResourceOp for PVC creation
* Add samples/resourceops/volumeop_basic.py
* Add tests/dsl/volume_op_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Emit a V1Volume as `.volume` from dsl.VolumeOp

* Extend VolumeOp so it outputs a `.volume` attribute ready to be
  consumed by the `pvolumes` argument to ContainerOp's constructor
* Update samples/resourceop/volumeop_basic.py
* Extend tests/dsl/volume_op_tests.py
* Update tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Add PipelineVolume

* PipelineVolume inherits from V1Volume and it comes with its own set of
  KFP-specific dependencies. It is aligned with how PipelineParam
  instances are used. I.e. consuming a PipelineVolume leads to implicit
  dependencies without the user having to call the `.after()` method on
  a ContainerOp.
* PipelineVolume comes with its own `.after()` method, which can be used
  to append extra dependencies to the instance.
* Extend ContainerOp to handle PipelineVolume deps
* Set `.volume` attribute of VolumeOp to be a PipelineVolume instead
* Add samples/resourceops/volumeop_{parallel,dag,sequential}.py
* Fix tests/dsl/volume_op_tests.py
* Add tests/dsl/pipeline_volume_tests.py
* Extend tests/compiler/compiler_tests.py

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* SDK: Simplify the creation of VolumeSnapshot instances

* VolumeSnapshotOp: A specified ResourceOp for VolumeSnapshot creation
* Add samples/resourceops/volume_snapshotop_{sequential,rokurl}.py
* Add tests/dsl/volume_snapshotop_tests.py
* Extend tests/compiler/compiler_tests.py

NOTE: VolumeSnapshots is an Alpha feature at the time of this commit.

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Extend UI for the ResourceOp and Volumes feature of the Compiler

* Add VolumeMounts tab/entry (Run/Pipeline view)
* Add Manifest tab/entry (Run/Pipeline view)
* Add & Extend tests
* Update tests snapshot files

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Cleaning up the diff (before moving things back)

* Renamed op.deps back to op.dependent_names

* Moved Container, Sidecar and BaseOp classed back to _container_op.py
This way the diff is much smaller and more understandable. We can always split or refactor the file later. Refactorings should not be mixed with genuine changes.
2019-04-25 10:40:48 -07:00
Alexey Volkov ee119ec627 SDK - Got rid of the global variable collecting all created pipelines (#1167)
* SDK - Got rid of the global variable collecting all created pipelines
This list was only used by the command-line compiler.
The command-line compiler can still collect the created pipelines by registering a handler function in `_pipeline_decorator_handlers`.

* Replaced handler stack with a single handler.
2019-04-18 18:19:54 -07:00
Ning 71325c3316 new kubernetes packages contain breaking change, thus fixing the version in the sample test image (#1159)
* new kubernetes packages contain breaking change, thus fixing the version

* also fixing the kubernetes version in the python sdk dependency

* fix bug
2019-04-14 21:36:00 -07:00
Alexey Volkov c67aea779e SDK - Simplified the @pipeline decorator (#1120)
* SDK - Simplified the @pipeline decorator
Moved metadata-related code to _metadata.
`Pipeline.get_pipeline_functions` now returns the list of pipeline functions.

* Addressed @gaoning777's PR feedback
2019-04-12 13:14:47 -07:00
Alexey Volkov 0a086170a2 Stabilized the artifact ordering during the compilation (#1097) 2019-04-08 18:17:04 -07:00
Nathan DeMaria 8468b253f9 Reference correct object in k8s->json tuple conversion (#1088) 2019-04-05 14:01:55 -07:00
Alexey Volkov 9269bac716
SDK - Configure artifact name and path separately (again) (#1067)
Restoring the https://github.com/kubeflow/pipelines/pull/900 change that was overwritten by https://github.com/kubeflow/pipelines/pull/879
2019-04-04 14:31:13 -07:00
Ning 1a04e86ed7 Recursion bug fix (#1061)
* remove the graph component output; add support for dependency on graph component

* fix bug; adjust unit tests

* add support for explicit dependency of graph component

* adjust unit test

* add a todo

* bug fixes for unit tests

* refactor condition_param code; fix bug when the inputs task name is None; need to remove the print later

* do not pass condition param as arguments to downstream ops, remove print logs; add unit tests

* add unit test golden yaml

* fix bug

* fix the sample
2019-04-02 09:49:19 -07:00
Eterna2 825f64d672 Feature: sidecar for ContainerOp (#879)
* Feature: sidecar for ContainerOp

* replace f-string with string format for compatibility with py3.5

* ContainerOp now can be updated with any k8s V1Container attributes as well as sidecars with Sidecar class. ContainerOp accepts PipelineParam in any valid k8 properties.

* WIP: fix conflicts and bugs with recent master. TODO: more complex template with pipeline params

* fix proxy args

* Fixed to work with latest master head

* Added container_kwargs to ContainerOp to pass in k8s container kwargs

* Fix comment bug, updated with example in ContainerOp docstring

* fix copyright year

* expose match_serialized_pipelineparam as public for compiler to process serialized pipeline params

* fixed pydoc example and removed unnecessary ContainerOp.container.parent

* Fix conflicts in compiler tests
2019-03-28 11:11:30 -07:00
Ning f59c25bd04 Add type check samples (#955)
* add core types and type checking function

* fix unit test bug

* avoid defining dynamic classes

* typo fix

* add component metadata format

* add a construct for the component decorator

* add default values for the meta classes

* add input/output types to the metadata

* add from_dict in TypeMeta

* small fix

* add unit tests

* use python struct for the openapi schema

* add default in parameter

* add default value

* remove the str restriction for the param default

* bug fix

* add pipelinemeta

* add pipeline metadata

* ignore annotation if it is not str/BaseType/dict

* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability

* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name

* pass metadata from component decorator and task factory to containerOp

* pass pipeline metadata to Pipeline

* fix unit test

* typo in the comments

* move the metadata classes to a separate module

* fix unit test

* small change

* add __eq__ to meta classes
not export _metadata classes

* nothing

* fix unit test

* unit test python component

* unit test python pipeline

* fix bug: duplicate variable of args

* fix unit tests

* move python_component and _component decorator in _component file

* remove the print

* change parameter default value to None

* add functools wraps around _component decorator

* TypeMeta accept both str and dict

* fix indent, add unit test for type as strings

* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta

* add type check in task factory

* output error message

* add type check in component decorator; move the metadata assignment out of the containerop __init__ function

* fix bug; add unit test

* add more unit tests

* more unit tests; fix bugs

* more unit tests; fix bugs

* add unit tests

* more unit tests

* add type check switch; add unit tests

* add compiler option for type check

* add a notebook sample

* resolving pr comments

* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there are not a single return annotations

* add dsl static type checking sample

* fix bug: op_to_template resolve the raw arguments by mapping to the argument_inputs but the argument_inputs lost the type information

* fix type pattern matching

* convert orderedDict to dict from the component module

* add unit test to the pipelineparam with types

* create TypeMeta deserialize function, add comments

* strongly typed pipelineparamtuple

* remove GCSPath fields to avoid artifact type confusion
change the type json schema field name to openAPIV3Schema

* fix unit tests; add unit test for openapishema property

* add comments

* add unit test at the component module; fix bug

* add ignore_type in pipelineparam

* update sample: no artifact types but only parameter types; add pipelineparam ignore_type example

* configure the default type checking to enabled

* change openAPIV3Schema to lower case with underscore

* revert change from the merge

* add code blocks, add the benefits of static type checking
add more comments within the code block
add documentation about the type definition in both yaml and decorated
components.

* fix the comment

* update dsl.type namespace
2019-03-27 19:58:42 -07:00
Ning 554731e478 dsl generate zip file (#855)
* dsl generate zip file

* minor fix

* fix zip read in the unit test

* update sample tests

* dsl compiler generates pipeline based on the input name suffix

* add unit tests for different output format

* update the sdk client to support tar zip and yaml

* fix typo

* fix file write
2019-03-26 15:14:50 -07:00
Ning 8c09090985 Support recursions in a function (#1014)
* add a While in the ops group

* deepcopy the while conditions when entering and exiting

* add while condition resolution in the compiler

* define graph component decorator

* remove while loop related codes

* fixes

* remove while loop related code

* fix bugs

* generate a unique ops group name and being able to retrieve by name

* resolve the opsgroups inputs and dependencies based on the pipelineparam in the condition

* add a recursive ops_groups

* fix bugs of the recursive opsgroup template name

* resolve the recursive template name and arguments

* add validity checks

* add more comments

* add usage comment in graph_component

* add unit test for the graph opsgraph

* refactor the opsgroup

* add unit test for the graph_component decorator

* exposing graph_component decorator

* add recursive compiler unit tests

* fix the bug of opsgroup name
adjust the graph_component usage example
fix index bugs
use with statement in the graph_component instead of directly calling
the enter/exit functions

* add a todo to combine the graph_component and component decorators
2019-03-26 14:17:18 -07:00
Alexey Volkov fac06e9a87 SDK/DSL/Compileer - Fixed handling of empty pipeline name (#1009)
Fixes https://github.com/kubeflow/pipelines/issues/825
2019-03-21 15:44:18 -07:00
Ning 2accf4180a
Add unit tests pipelineparam (#975)
* add unit test to the pipelineparam with types
* create TypeMeta deserialize function, add comments
* strongly typed pipelineparamtuple
* addressing pr comments
2019-03-18 18:07:36 -07:00
Ning 754db1f724
Fix sample test failure because of the type information in the pipelineparam (#972)
* fix bug: op_to_template resolve the raw arguments by mapping to the argument_inputs but the argument_inputs lost the type information

* fix type pattern matching

* convert orderedDict to dict from the component module
2019-03-15 13:49:21 -07:00
Ning c829115574 Add type check (#938)
* add core types and type checking function

* fix unit test bug

* avoid defining dynamic classes

* typo fix

* add component metadata format

* add a construct for the component decorator

* add default values for the meta classes

* add input/output types to the metadata

* add from_dict in TypeMeta

* small fix

* add unit tests

* use python struct for the openapi schema

* add default in parameter

* add default value

* remove the str restriction for the param default

* bug fix

* add pipelinemeta

* add pipeline metadata

* ignore annotation if it is not str/BaseType/dict

* update param name in the check_type functions
remove schema validators for GCRPath, and adjust for GCRPath, GCSPath
change _check_valid_dict to _check_valid_type_dict to avoid confusion
fix typo in the comments
adjust function order for readability

* remove default values for non-primitive types in the function signature
update the _check_valid_type_dict name

* pass metadata from component decorator and task factory to containerOp

* pass pipeline metadata to Pipeline

* fix unit test

* typo in the comments

* move the metadata classes to a separate module

* fix unit test

* small change

* add __eq__ to meta classes
not export _metadata classes

* nothing

* fix unit test

* unit test python component

* unit test python pipeline

* fix bug: duplicate variable of args

* fix unit tests

* move python_component and _component decorator in _component file

* remove the print

* change parameter default value to None

* add functools wraps around _component decorator

* TypeMeta accept both str and dict

* fix indent, add unit test for type as strings

* do not set default value for the name field in ParameterMeta, ComponentMeta, and PipelineMeta

* add type check in task factory

* output error message

* add type check in component decorator; move the metadata assignment out of the containerop __init__ function

* fix bug; add unit test

* add more unit tests

* more unit tests; fix bugs

* more unit tests; fix bugs

* add unit tests

* more unit tests

* add type check switch; add unit tests

* add compiler option for type check

* resolving pr comments

* add unit test for pipeline param check with component types; fix the bug; also fix the bug when there are not a single return annotations
2019-03-11 11:22:12 -07:00
Alexey Volkov 7ce03f07d4 SDK/DSL/Compiler - Fixed compilation when using ContainerOp.after (#943)
Fixes https://github.com/kubeflow/pipelines/issues/941
2019-03-07 18:47:30 -08:00
Ning 974d602b74
Pass meta to containerop and pipeline (#905)
pass metadata from python conf to containerop and the pipeline
2019-03-06 13:42:23 -08:00
Alexey Volkov 63202ea9fd Configure artifact name and path separately (#900) 2019-03-05 18:16:00 -08:00
Ning a6763b9599 component build support for both python2 and python3 (#730)
* component build support for both python2 and python3

* add sample test

* remove the annotations for python2 component build

* add pathlib for python2 component build

* fix component build unit test

* fix bug in the dockerfile generator

* remove exist_ok in path.mkdir to make python2 compatible

* adjust unit test

* remove pathlib dependency for python2 component build

* remove the pathlib codes in python3 component build, but use python2 code instead; add a todo to create a new sample
2019-02-25 12:56:19 -08:00
Ning 9ebbaa313d support pipeline level imagepullsecret in DSL (#745)
* support pipeline level imagepullsecret in DSL

* use kubernetes native input parameter for imagepullsecrets

* expose a module level function to configure the pipeline settings for the current default pipeline
2019-02-05 13:16:43 -08:00
qimingj 3b3a15e16a Add "set_retry()" on ContainerOp. (#723)
* Add "set_retry()" on ContainerOp.

* Follow up on CR comments.

* Update docstring.

* Increase retry times for test.

* Fix test.
2019-01-23 17:35:34 -08:00
Ajay Gopinathan 578e8231d0 Update all Pipelines CRD versions to v1beta1. (#681) 2019-01-17 19:35:51 -08:00
Alexey Volkov 83e9ffe5bc SDK/Components - Reworked the component model structures. (#642)
* Reworked the Component structures.
Rewrote parsing, type checking and serialization code.
Improved the graph component structures.
Added most of the needed k8s structures.
Added model validation (input/output existence etc).
Added task cycle detection and topological sorting to GraphSpec.
All container component tests now work.
Added some graph component tests.

* Fixed incompatibilities with python <3.7

* Added __init__.py to make the Travis tests work.

* Adding kubernetes structures to setup.py

* Addressed PR feedback: Renamed _original_names to _serialized_names

* Addressed PR feedback: Reduced indentation.

* Added descriptions for all component structures.

* Fixed a bug in ComponentSpec._post_init()

* Added documentation for ModelBase class and functions.

* Added __eq__/__ne__ and improved __repr__

* Added ModelBase tests
2019-01-09 15:51:34 -08:00
Ning d3c4add0a9 DSL refactor (#619)
* add comments

* relocate functions in compiler to aggregate similar functions; move _build_conventional_artifact as a nested function

* reduce sanitize functions into one in the dsl.

* more comments

* move all sanitization(op name, param name) from dsl to compiler

* sanitize pipelineparam name and op_name; remove format check in pipelineparam

* remove unit test for pipelineparam op_name format checking

* fix bug: correctly replace input in the argument list

* fix bug: replace arguments with found ones

* Sanitize the file_output keys, Matches the param in the args/cmds with the whole serialized param str, Verify both param name and container name

* loosen the containerop and param name restrictions
2019-01-08 20:00:17 -08:00
qimingj 875efea1f9 Support replacable arguments in command as well (besides arguments) in container op. (#623)
* Support replacable arguments in command as well (besides arguments) in container op.

* Fix components builder.

* Fix tests.

* Follow up CR comments.

* Fix test.
2019-01-07 07:57:36 -08:00
Ning 85c6413a2e Refactor Python SDK (#568)
* add some comments

* remove unused import; add license to dsl_bridge

* move_convert_k8s_obj_to_dic from compiler to k8s_helper

* move unit test
2018-12-20 09:51:09 -08:00
Ning b313e4060c remove duplicate volumes (#558)
* remove duplicate volumes

* add a todo
2018-12-18 10:39:43 -08:00
qimingj 19e7ff2995 Support Kaniko job in a outside-cluster jupyter. (#535)
* Make kaniko job working in a outside-cluster jupyter.

* Also support in-cluster config.
2018-12-13 00:47:51 -08:00
hongye-sun 55e18269c3 support tpu settings in dsl (#491)
* support tpu settings in dsl

* fix issues from review comment
2018-12-06 21:55:36 -08:00
Alexey Volkov d9c133f7f7 SDK/Components/PythionContainerOp - Make the local output path configurableThis is part of a bigger effort to make all output locations manageable in preparation for storage system. (#424) 2018-12-04 19:56:56 -08:00
nealgao 632370c862 import bug (#463) 2018-12-04 12:30:59 -08:00
Alexey Volkov e0a2cf66d2 SDK/PythonContainer - Compiling pipelines without needing kubernetes (#442) 2018-12-03 13:14:08 -08:00
Alexey Volkov b0461f51ff SDK/DSL - Added support for 5 more conditional operations (#309) 2018-12-03 12:33:24 -08:00
Alexey Volkov e06dc88316 SDK/Components - Renamed container.arguments to container.args (#437)
This aligns us with Kubernetes spec
2018-12-03 11:02:15 -08:00
Alexey Volkov 8a9e205d05 SDK/Components/PythonContainerOp - Simplified GCSHelper by extracting duplicate code (#210)
* Simplified GCSHelper by extracting duplicate code

* Revert to using get_bucket for now
As per  @gaoning777's request
2018-12-02 11:59:15 -08:00
Alexey Volkov 98cc6ee42d SDK/Components/PythonContainerOp - Switch from dict to ComponentSpec (#396) 2018-11-30 14:14:20 -08:00
Chris Van Pelt 26fd724ec5 Fix for k8s dict parsing (#411)
* Add support for minio artifacts

* Add new tests for parity

* Fix for sdk env bug

* improved test
2018-11-30 11:44:38 -08:00
IronPan 537357ac8f Propagate secret to kaniko (#423)
* propagate secret to kaniko spec

* update name
2018-11-29 22:39:32 -08:00
qimingj 0b7120c322 Now pipeline function takes direct default values rather than dsp.PipelineParam. (#110)
* Now pipeline function takes direct default values rather than dsp.PipelineParam. It simplifies the sample code a lot.

* Remove extraneous parenthesis.

* Follow up CR comments.

* Change Dockerfile (not done).

* Fix dockerfile.

* Fix Dockerfile again.

* Remove unneeded installation of packages in Dockerfile.
2018-11-26 17:13:55 -08:00