157 Commits

Author SHA1 Message Date
Vitalii Vokhmin 2f1db59798
fix(sdk): compile ParallelFor in a deterministic manner (#4926)
* fix(sdk): compile ParallelFor in a deterministic manner

During compilation ParallelFor components end up with randomized names,
which makes it very inconvenient to compare two versions of a pipeline.
This commit fixes this issue.

* fix(sdk): fix new parallel-for test cases
2021-01-29 18:31:09 -08:00
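
A minimal sketch of why the determinism fix in #4926 matters when comparing two compiled versions of a pipeline. This is not the SDK's code: `random_loop_name`, `DeterministicNamer` and `compile_names` are hypothetical, and the sketch assumes the randomized names came from something like a UUID suffix and that a per-compilation counter is one way to stabilize them.

```python
import uuid
from typing import List


def random_loop_name() -> str:
    """Hypothetical: a random suffix makes the name differ on every compilation."""
    return f"for-loop-{uuid.uuid4().hex[:8]}"


class DeterministicNamer:
    """Hypothetical: numbers loops in the order they are encountered."""

    def __init__(self) -> None:
        self._count = 0

    def next_name(self) -> str:
        self._count += 1
        return f"for-loop-{self._count}"


def compile_names(loop_count: int) -> List[str]:
    """Simulates one compilation that names `loop_count` loops."""
    namer = DeterministicNamer()
    return [namer.next_name() for _ in range(loop_count)]
```

Because every compilation starts a fresh counter, compiling the same pipeline twice yields identical names, so a textual diff only shows real changes.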
Chen Sun ecb14f40bb
chore(sdk): Remove v2 components fork, use v1 instead. (#5042)
* Remove v2.components fork

* fix setup.py
2021-01-28 18:20:07 -08:00
Jiaxiao Zheng 279694ec6d
feat(sdk): Container entrypoint used for new styled KFP component authoring (#4978)
* skeleton

* add entrypoint utils to parse param

* wip: artifact parsing

* add input param artifacts passing and clean unused code

* wip

* add output artifact inspection

* add parameter output

* finish entrypoint implementation

* add entrypoint_utils_test.py

* add entrypoint test

* add entrypoint test

* get rid of tf

* fix test

* fix file location

* fix tests

* fix tests

* resolving comments

* Partially rollback

* resolve comments in entrypoint.py

* resolve comments
2021-01-14 16:01:21 -08:00
Niklas Hansson c2a8bd0b93
Added checks for parallelism values (#4950)
* Added checks for parallelism values

* fix variable name
2021-01-08 08:28:55 -08:00
Jiaxiao Zheng a56efb2061
feat(sdk): Merge artifact ontology from v2 to the classic KFP. (#4963)
* move modules back to v1

* move and fix ontology tests
2021-01-07 23:00:53 -08:00
Jiaxiao Zheng 7540ba5c3b
feat(sdk): Implements artifact URI placeholder. (#4932)
* add placeholder to spec

* add output_directory to pipeline

* respect uri placeholder in file outputs

* wip: add data passing rewriting logic to respect the uri semantics

* merge input_uri and paths when instantiating ContainerOp

* fix

* fix workflow rewriting

* Add topology rewriting

* add a test case, and various fixes

* make the test case more complex

* Fix the case when working with OpsGroup

* Fix test case

* fix resolving test

* fix redundant cmd lines

* fix redundant cmd lines

* resolve comments

* fix file outputs

* resolve comments

* copy file outputs instead of modifying inplace.
2021-01-05 20:39:51 -08:00
Niklas Hansson 24732b9dae
feat(compiler): add dsl operation for parallelism on sub dag level (#4199)
* Added subdag parallelism

Authored-by: NikeNano <niklas.sven.hansson@gmail.com>
Co-authored-by: guanhuichen <guanhuichen@gmail.com>

* added error handling, fixed comment and refactored

* updated with sleep and TODO

* fix imports

Co-authored-by: guanhuichen <guanhuichen@gmail.com>
2020-12-26 22:10:27 -08:00
Ilias Katsakioris 8f70bf325e
fix(sdk): Do not wait for resource deletion (#4820)
When calling the delete() method of a ResourceOp we need to ensure we do
not wait for its deletion.

The reason for this is described in [1]: If a pipeline creates a
resource which is being consumed by its steps (e.g., a PVC), the step
deleting the resource will hang waiting for the Kubernetes resource
deletion which, in turn, is waiting for the other steps to get deleted.
As a result, the pipeline never finishes.

This commit allows specifying flags for the ResourceOp kubectl commands
and defaults to the '--wait=false' flag for the deletion.

Specifying flags for a ResourceTemplate is not supported in Argo v2.7
that we currently deploy. But they will be once we upgrade to v2.11+
[2]. This does not affect the delete() method because we don't rely on
Argo's ResourceTemplate for it.

[1] https://github.com/kubeflow/pipelines/issues/4506
[2] https://github.com/kubeflow/pipelines/issues/4553

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
2020-12-17 16:54:24 -08:00
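
The behavior described in #4820 can be sketched as follows. This is only an illustration under stated assumptions: `build_delete_command` is a hypothetical helper, not the SDK's implementation, and it only assembles an argument list. The `--wait=false` flag itself is the one named in the commit message.

```python
from typing import List, Optional


def build_delete_command(
    kind: str, name: str, flags: Optional[List[str]] = None
) -> List[str]:
    """Hypothetical helper: assembles a `kubectl delete` argument list.

    When no flags are given, it defaults to '--wait=false' so the deleting
    step returns immediately instead of blocking on resource finalization.
    """
    if flags is None:
        flags = ["--wait=false"]
    return ["kubectl", "delete", kind, name, *flags]
```

Passing an explicit list (including an empty one) overrides the default, which mirrors the idea of letting the caller specify flags while keeping a safe default for deletion.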
Michalina Kotwica 5169489be5
feat(sdk): allow calling GroupOp.after with multiple ops (#4788)
* allow calling ContainerOp.after with multiple ops

* m: simplify signature of OpsGroup.after

Since the method is not public anyway, it is unlikely to be called with a
keyword argument; it is more likely to be called by unpacking an arbitrary
list, which could be empty (the previous signature would fail for an empty list).
2020-12-16 15:51:09 -08:00
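
The signature point made in #4788 can be illustrated with a small sketch. The classes below are hypothetical stand-ins, not the SDK's `OpsGroup`; in particular, `StrictGroup` only assumes that the earlier signature required at least one positional op.

```python
class Op:
    """Hypothetical stand-in for a pipeline op."""

    def __init__(self, name: str) -> None:
        self.name = name


class Group:
    """Hypothetical group whose `after` accepts zero or more ops."""

    def __init__(self) -> None:
        self.dependencies = []

    def after(self, *ops: Op) -> "Group":
        self.dependencies.extend(ops)
        return self


class StrictGroup(Group):
    """Assumed shape of a signature that requires at least one op."""

    def after(self, op: Op, *ops: Op) -> "StrictGroup":
        self.dependencies.extend((op, *ops))
        return self
```

With the variadic form, `Group().after(*[])` is a no-op, while `StrictGroup().after(*[])` raises `TypeError` because the required positional argument is missing.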
Kenta Onishi 5a4b70e37c
feat(sdk): Add settings of the dnsConfig field. Fixes #4836 (#4837)
* feat(sdk): Add settings of the dnsConfig field. Fixes #4836

* feat(sdk): Add dnsConfig example and sample.

* feat(sdk): Refactor dnsConfig param.

* feat(sdk): Refactor dnsConfig param.
2020-12-14 20:05:49 -08:00
Vitalii Vokhmin 2f3a686e54
feat(sdk): add ability to set retry policy (#4858)
* feat(sdk): add ability to set retry policy

This fixes the second part of the issue described in #4333
The first part was addressed in #4392

* feat(sdk): validate retry policy name

* feat(sdk): simplify retry policy interface
2020-12-11 14:47:29 -08:00
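
The "validate retry policy name" step in #4858 could look roughly like the sketch below. The function name and the set of accepted names are assumptions made for illustration (the names are modeled on Argo retry policy values); the commit message does not list them.

```python
# Assumed set of policy names; not taken from the commit itself.
ALLOWED_RETRY_POLICIES = ("Always", "OnError", "OnFailure")


def validate_retry_policy(policy: str) -> str:
    """Hypothetical validator: returns the policy or raises ValueError."""
    if policy not in ALLOWED_RETRY_POLICIES:
        allowed = ", ".join(ALLOWED_RETRY_POLICIES)
        raise ValueError(f"Unknown retry policy {policy!r}; expected one of: {allowed}")
    return policy
```

Failing early with a clear message keeps a misspelled policy name from surfacing only when the workflow is submitted.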
Rui Fang 518b8b886d
[Doc] update docs that still refer to KFP latest SDK reference (#4845)
* Initial execution cache

This commit adds initial execution cache service. Including http service
and execution key generation.

* fix master

* fix go.sum

* update docs that still refer to KFP latest SDK reference
2020-12-02 20:02:59 -08:00
Jiaxiao Zheng fb15223f7e
chore: Add doc strings marking the feature stages for SDK. (#4575)
* add doc strings

* Simplify the docstring

* fix unittest

* recover cli.py

* recover cli.py

* substitute docstring in resource ops with TODOs

* revert stable labels
2020-11-24 00:19:00 -08:00
David Przybilla 5f992f5d06
fix(sdk): VolumeOp has apiVersion as parameter (#4694) 2020-11-21 03:05:33 -08:00
Jiaxiao Zheng c4dd7871a6
chore: Support resource spec in v2 compiler (#4669)
* skeleton of code

* commit resource spec in IR proto

* add resource setter

* add accelerator setters

* fix unit conversion

* fix attribute proxy

* add and fix unittests

* add e2e test

* clean up

* clean up

* clean up

* clean up

* bypass subclass overriding

* clean up

* clean up

* clean up

* resolve comments
2020-10-26 16:40:02 -07:00
Alexey Volkov 80e1d7063d
fix(sdk): Fixed UI metadata and metrics (#4672)
Reverting most of the #2334 which inadvertently broke those artifacts by causing the names to be mangled.

KFP's DSL compiler prepends template names to output names to ensure global uniqueness of *input* names (DSL's ContainerOp does not have concept of inputs, so the inputs are generated during the compilation including input names). But prepending template names to the output names stops the backend from recognizing the mlpipeline-ui-metadata and mlpipeline-metrics artifacts.
2020-10-25 19:37:00 -07:00
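
A small sketch of the name-mangling problem described in #4672. It assumes the backend recognizes the two special artifacts by exact name and that mangling joins the template name and output name with a hyphen; `prefixed_name` and `recognized_special_artifacts` are hypothetical helpers, not SDK or backend functions.

```python
from typing import Iterable, List

SPECIAL_ARTIFACTS = {"mlpipeline-ui-metadata", "mlpipeline-metrics"}


def prefixed_name(template_name: str, output_name: str) -> str:
    """Hypothetical mangling: prepend the template name for uniqueness."""
    return f"{template_name}-{output_name}"


def recognized_special_artifacts(output_names: Iterable[str]) -> List[str]:
    """Assumed backend behavior: match special artifacts by exact name."""
    return [name for name in output_names if name in SPECIAL_ARTIFACTS]
```

An output named `mlpipeline-ui-metadata` is recognized, but `prefixed_name("train", "mlpipeline-ui-metadata")` produces `train-mlpipeline-ui-metadata`, which no longer matches.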
Abhishek Vilas Munagekar c52a81c1af
fix(sdk): fixes dsl.ContainerOp deprecation warning not shown (#4658)
* change dsl.ContainerOp warning to FutureWarning

* fix tests
2020-10-23 18:07:01 -07:00
Alexey Volkov e8fb58a221
feat(sdk): Preserve parameter arguments and input names (#4563)
ContainerOp has no concept of inputs, so it loses any information about them such as input names and in some cases even the passed argument values (which are just injected into the command line).
This commit fixes that issue by preserving the parameter arguments map and ultimately storing it in an Argo template annotation.

Fixes https://github.com/kubeflow/pipelines/issues/4556
2020-10-11 20:32:48 -07:00
Alexey Volkov 1aa8068507
fix(sdk): DSL - Enabled arbitrary ContainerOp names (#4554)
Fixes https://github.com/kubeflow/pipelines/issues/4522
2020-09-29 05:21:35 -07:00
Abhishek Vilas Munagekar 0653e7c766
feat(sdk): adds support for ephemeral-storage in container-op (#4504)
* add type annotation to container class

* add support for ephemeral storage in container op
2020-09-20 03:30:29 -07:00
Alexey Volkov 03325848fc
feat(sdk): Components - Prevent passing unserializable objects to components. Fixes #4040 (#4496) 2020-09-16 02:23:22 -07:00
Niklas Hansson c32ea232d5
feat(compiled): set pod disruption budget for pipelines. Fixes #3877 (#4178)
* Update _client.py

* Update _client.py

* added pod disruption budget

* clean up

* Update sdk/python/kfp/dsl/_pipeline.py

* fixed parameter

* updated after feedback

* removed selector
2020-09-14 13:45:26 -07:00
Alexey Volkov 7dc051b982
refactor(sdk): Refactored ResourceOp deletion (#3841) 2020-08-25 23:12:02 -07:00
Alexey Volkov d87c3be611
DEPRECATE(sdk): DSL - Deprecated output_artifact_paths parameter in ContainerOp constructor (#2334)
The users should switch to file_outputs instead.
Previously `file_outputs` only supported small data outputs, but now it supports big files.
2020-08-07 21:44:19 -07:00
Alex Latchford 704c8c7660
chore: Clean up KFP SDK docstrings, make formatting a little more consistent (#4218)
* Prepare SDK docs environment so it's easier to understand how to build the docs locally so they're consistent with ReadTheDocs.

* Clean up docstrings for kfp.Client

* Add in updates to the docs for compiler and components

* Update components area to add in code references and make formatting a little more consistent.

* Clean up containers, add in custom CSS to ensure we do not overflow on inline code blocks

* Clean up containers, add in custom CSS to ensure we do not overflow on inline code blocks

* Remove unused kfp.notebook package links

* Clean up a few more errant references

* Clean up the DSL docs some more

* Update SDK docs for KFP extensions to follow Sphinx guidelines

* Clean up formatting of docstrings after Ark-Kun's comments
2020-08-04 00:33:47 +08:00
Alexey Volkov 2f9482758b
feat(sdk): SDK - Deprecation warning when using ContainerOp (#4166)
* SDK - Added warning when not using components

We have long advised our users to create reusable components.
Creating reusable components is as easy as creating ContainerOp instances, but the components are shareable, portable and are easier to support going forward.

* Disable warning for TFX

* Fixed the warning disabling logic

* Added tests
2020-07-08 23:16:53 -07:00
Niklas Hansson c6ac83f72c
feat: add parallelism for dsl.ParallelFor. Fixes #4089 (#4149)
* Added parallelism at sub-dag level

* updated the parallelism

* remove yaml file

* reformatting

* Update sdk/python/kfp/compiler/compiler.py

* Update sdk/python/kfp/compiler/compiler.py

* Update samples/core/loop_parallelism/loop_parallelism.py

Co-authored-by: Alexey Volkov <alexey.volkov@ark-kun.com>

Co-authored-by: Alexey Volkov <alexey.volkov@ark-kun.com>
2020-07-08 11:27:13 -07:00
Alexey Volkov d707b93fb4
feat(sdk): DSL - Added support for volatile components (#4104)
Volatile components do not reuse the cached results by default.
The pipeline authors can re-enable cache reuse if they want.
2020-07-06 18:09:57 -07:00
Alexey Volkov 6960366846
fix(sdk): Compiler - Fixed the input argument mapping when using dsl.graph_component. Fixes #3915 (#4082)
* SDK - Compiler - Fixed the input argument mapping when using dsl.graph_component

Fixes https://github.com/kubeflow/pipelines/issues/3915

* Stopped relying on the argument order at all

This can make the compilation less fragile.
2020-06-29 02:31:37 -07:00
Alexey Volkov 54a596abd8
SDK - Compiler - Added support for volume-based data passing (#3371)
* SDK - Compiler - Added support for volume-based data passing

Currently artifact passing is performed by Argo sidecar containers that download input data and upload output data to the artifact repository (usually S3-compatible blob storage like Minio).
The performance of this method is not optimal and it requires that pod disks have enough capacity to hold all artifact data.

This commit adds support for volume-based data passing.
This method involves using a single multi-write Kubernetes data volume to pass all intermediate data.
Parts of the volume are mounted to the input/output artifact directories, so when the user program reads and writes files, the files actually reside in the data volume.
This method improves the performance and reduces storage resource requirements.

The data volume must exist and support "READ_WRITE_MANY".

Limitations:
* All artifact file names must be the same (e.g. "data"). All auto-generated paths are already consistent. Avoid using any hard-coded paths.
* Passing constant values (text) as arguments for artifact inputs is not supported.
* The feature is experimental.

* Added data_passing_methods.KubernetesVolume

This class represents a configured volume-based artifact passing method.

* Added PipelineConf.data_passing_method

This property allows setting the method that will be used for intermediate data passing.
Added the compiler support for the new feature.

Example:
```python
from kfp.dsl import PipelineConf, data_passing_methods
from kubernetes.client.models import V1Volume, V1PersistentVolumeClaimVolumeSource

pipeline_conf = PipelineConf()
# Pass all intermediate data through the existing 'data-volume' claim.
pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=V1Volume(
        name='data',
        # V1Volume expects a claim *source* here, not a claim object.
        persistent_volume_claim=V1PersistentVolumeClaimVolumeSource(claim_name='data-volume'),
    ),
    path_prefix='artifact_data/',
)
```

* Added unit test

* Fixed bug in the unit test

Kubernetes does not validate the structures at all...

* Fixed bug in the result structure

* Fixed the test data

The class should be V1PersistentVolumeClaimVolumeSource, not V1PersistentVolumeClaimSpec.

* Fixed the test
2020-06-25 16:11:31 -07:00
Eterna2 4812d35283
Fix #3906 - mount_pvc transform should ignore non-ContainerOps (#3912)
* Fix #3906 - check that ops to be transformed is a containerOp

* Update docstring for add_op_transformer to clarify that not only containerOp will be transformed.
2020-06-09 19:04:05 -07:00
Shotaro Kohama 699ce937da
Modify docstrings to replace 'InitContainer' to 'UserContainer' (#3863) 2020-05-28 21:07:22 -07:00
Thi Nguyen ec9445aa01
Allow PipelineParams in dict keys too. (#3565)
Co-authored-by: Thi Nguyen <duongnt@users.noreply.github.com>
2020-05-19 17:54:19 -07:00
Alexey Volkov 92a0d11853
SDK - Moved some data from the component_ref annotation to the component_spec annotation (#3751)
Removing the component spec from component_ref (since it would be a duplicate), but making sure the whole spec is available in component_spec.
2020-05-15 01:02:58 -07:00
Alexey Volkov 8ba366b03f
SDK - Made outputs with original names available in ContainerOp.outputs (#3734)
* SDK - Made outputs with original names available in ContainerOp.outputs

Previously, ContainerOp had strict requirements for the output names, so we had to convert all the names before passing them to the ContainerOp constructor. Outputs with non-pythonic names could not be accessed using their original names.
Now ContainerOp supports any output names, so we're now using the original output names.
However to support legacy pipelines, we're also adding output references with pythonic names.

* Fixed the compiler test data

* Fixed the duplicate parameter outputs in the compiled workflow

* Fixed long line

* Stabilized the output naming conflict resolution

* Fix case of missing special outputs
2020-05-12 19:08:26 -07:00
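
The dual naming described in #3734 can be sketched as a mapping that exposes each output under its original name and, for legacy pipelines, under a pythonic alias. The helpers `to_pythonic` and `build_output_refs` and the sanitizing rule are hypothetical; the SDK's actual conversion and conflict resolution may differ.

```python
import re
from typing import Dict, List


def to_pythonic(name: str) -> str:
    """Hypothetical sanitizer: lowercase, non-alphanumerics become '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.lower()).strip("_")


def build_output_refs(output_names: List[str]) -> Dict[str, str]:
    """Maps original names, then pythonic aliases, to the same reference."""
    refs = {name: f"ref:{name}" for name in output_names}
    for name in output_names:
        # setdefault keeps an original name if an alias would collide with it.
        refs.setdefault(to_pythonic(name), refs[name])
    return refs
```

For an output called `Trained Model`, both `refs["Trained Model"]` and `refs["trained_model"]` point at the same reference, so old and new pipelines can address it.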
Alexey Volkov 2279bde698
SDK - Annotate pods with component_ref (#3727)
* SDK - Annotate pods with component_ref

This preserves the information about the digest of the component and the location from which the component was loaded.

* Fixed compiler tests
2020-05-11 17:18:21 -07:00
Niklas Hansson 05c1537f28
Add Nodeselector to pipelineconfig fix issue #2863 (#3616)
* updated version

* added pipeline nodeselector

* removed old legacy

* renaming

* update test

* Update sdk/python/kfp/compiler/compiler.py
2020-05-05 00:11:08 -07:00
Eterna2 9167da1b4e
Support execution throttling for executing the pipelines (#3346) (#3439)
* Add parallelism limits to pipeline in kfp sdk

* fix lint error
2020-05-04 23:25:08 -07:00
Sascha Grunert 62e33c711b
Fix container.set_image_pull_policy documentation string (#3653)
The API function seemed to be not correct, which is now fixed.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2020-04-29 12:44:18 -07:00
Niklas Hansson 2354776e1e
fix #2802: Set ImagePullPolicy per pipeline. (#3534)
* bump version

* default image pull policy

* Update sdk/python/kfp/dsl/_pipeline.py

* task setting should dominate

* Update sdk/python/kfp/dsl/_pipeline.py

* fixed merge mistake
2020-04-23 07:09:13 -07:00
Alexey Volkov b63ad7e614
SDK - Removed the ArtifactLocation feature (#3517)
* SDK - Removed the ArtifactLocation feature

The feature was deprecated in v0.1.34 https://github.com/kubeflow/pipelines/pull/2326

* Removed the artifact_location sample
2020-04-23 00:49:44 -07:00
Renmin 8d9a04a8ca
Enable NFS dynamic PVC (#3314) 2020-04-14 01:35:12 -07:00
Rui Fang e4637285c9
[Test]Fix e2e test (#3471)
* Initial execution cache

This commit adds initial execution cache service. Including http service
and execution key generation.

* fix master

* fix go.sum

* Fix e2e test

* Add max_cache_staleness for flipA

* add comments
2020-04-09 10:35:44 -07:00
Alexey Volkov 95aec25db4
SDK - Support kubernetes client v11 (#3319)
Fixes https://github.com/kubeflow/pipelines/issues/3275
2020-03-22 02:54:44 -07:00
Alexey Volkov be12ccf2a1
SDK - Moved the @python_component decorator test to dsl tests (#3324)
* SDK - Moved the @python_component decorator test to dsl tests

* Deprecate @python_component
2020-03-21 08:14:43 -07:00
Alexey Volkov 734b43e3db
SDK - Added support for maxCacheStaleness (#3318)
* SDK - Added support for maxCacheStaleness

* Added the vendor prefix to the annotation
2020-03-20 13:38:09 -07:00
Alexey Volkov 264ff37c1e
SDK - Moved _dsl_bridge to dsl (#3267)
This is a pure refactoring change.
The components library should not have any dependencies on the DSL library.
2020-03-14 00:12:34 -07:00
Ilias Katsakioris c220059c8d
SDK/DSL: Enable the deletion of a resource via ResourceOp method (#3213)
* SDK/DSL: Enable the deletion of a resource via ResourceOp method

* Add the method delete() to ResourceOps
* Extend ResourceOp & VolumeOp tests

Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>

* Fix ValueError not being raised
2020-03-10 16:07:36 -07:00
xiaohanhuang e704067d15
add an optional name for dsl.Condition (kubeflow#3210) (#3212)
* add an optional name for dsl.Condition (kubeflow#3210)

* add unit test
2020-03-05 21:45:22 -08:00
Alexey Volkov 4a1b282461
SDK - Compiler - Fixed ParallelFor argument resolving (#3029)
* SDK - Compiler - Fixed ParallelFor name clashes

The ParallelFor argument reference resolving was really broken.
The logic "worked" like this - if the name of the referenced output
contained the name of the loop collection source output, then it was
considered to be the reference to the loop item.
This broke lots of scenarios especially in cases where there were
multiple components with same output name (e.g. the default "Output"
output name). The logic also did not distinguish between references to
the loop collection item vs. references to the loop collection source
itself.

I've rewritten the argument resolving logic, to fix the issues.

* Argo cannot use {{item}} when withParams items are dicts

* Stabilize the loop template names

* Renamed the test case
2020-02-11 12:18:09 -08:00
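
The substring-matching problem described in #3029 can be reproduced in a few lines. This is a sketch of the failure mode, not the compiler's code: both functions are hypothetical, and output references are modeled as `(op_name, output_name)` tuples purely for illustration.

```python
from typing import Tuple

OutputRef = Tuple[str, str]  # (op name, output name); an illustrative model


def is_loop_item_by_substring(referenced_name: str, loop_source_name: str) -> bool:
    """The fragile rule: any name containing the source name counts as the loop item."""
    return loop_source_name in referenced_name


def is_loop_item_exact(referenced: OutputRef, loop_item: OutputRef) -> bool:
    """A stricter rule: only the exact reference counts as the loop item."""
    return referenced == loop_item
```

If the loop iterates over the default `Output` of one op, the substring rule also matches `train-Output` from an unrelated op, whereas comparing full references keeps `("train", "Output")` and `("produce-list", "Output")` distinct.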