Commit Graph

952 Commits

Author SHA1 Message Date
Alyssa Goins beae62fb52
feat(backend): implement retryStrategy for nested pipelines (#11908)
Signed-off-by: agoins <alyssacgoins@gmail.com>
2025-05-28 19:52:19 +00:00
Matt Prahl 53bb3a0aad
test: Update the Kubernetes and Python version ranges in the CI (#11924)
* Minimize the Kubernetes version range in the CI

This reduces the matrix to only include the low and high versions.

See the KFP community call notes for more context:
https://docs.google.com/document/d/1cHAdK1FoGEbuQ-Rl6adBDL5W2YpDiUbnMLIwmoXBoAU/edit?tab=t.0#heading=h.fovolzywu84d

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

* Reduce the testing matrix for Python versions

This reduces the Python versions being tested to the low and high
versions to reduce GitHub CI consumption.

See the KFP community call discussion for more context:
https://docs.google.com/document/d/1cHAdK1FoGEbuQ-Rl6adBDL5W2YpDiUbnMLIwmoXBoAU/edit?tab=t.0#heading=h.fovolzywu84d

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

* Add an option to run integration tests locally

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

* Stop using whalesay in the tests

The container image is over 10 years old and is in a format that is
deprecated on newer Kubernetes versions.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

---------

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
2025-05-28 13:37:20 +00:00
Alyssa Goins 1b0e6535b5
docs: add GoLand-specific configs to backend Readme (#11919)
Updates V2_DRIVER_COMMAND argument and adds runtime arguments for starting a remote debug session in GoLand.

Signed-off-by: agoins <alyssacgoins@gmail.com>
2025-05-28 13:36:19 +00:00
Humair Khan 732a3f26f5
docs(frontend): add ui dev docs (#11931)
Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-05-27 17:19:12 +00:00
Anton Pechenin 3337b5e323
fix(launcher): missing executorInput parameter values caused by {{$}} evaluation order (#11925)
Signed-off-by: arpechenin <arpechenin@avito.ru>
Co-authored-by: arpechenin <arpechenin@avito.ru>
2025-05-22 17:59:20 +00:00
Caroline DeVoto e329fa39b6
chore: Enable go fmt as a lint check for Go code (#11830)
Signed-off-by: Caroline DeVoto <cmdevoto@users.noreply.github.com>
Co-authored-by: Caroline DeVoto <cmdevoto@users.noreply.github.com>
2025-05-19 14:31:37 +00:00
Helber Belmiro 4f09f01090
chore(test): Update cache test timeout and polling intervals (#11916)
Adjusted the timeout from 3 to 4 minutes and polling interval from 10 to 5 seconds in the cache integration test. This ensures more robust testing by allowing sufficient time for operations to complete while increasing polling frequency.

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>
2025-05-15 17:00:23 +00:00
Vani Haripriya Mudadla c368ac6881
feat(backend): Add CLI flags to support Kubernetes native API implementation (#11907)
Signed-off-by: VaniHaripriya <vmudadla@redhat.com>
2025-05-15 13:22:24 +00:00
Humair Khan 0031766201
chore(backend): break up driver logic (#11885)
Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-05-14 15:11:23 +00:00
Helber Belmiro 9aebb62be1
feat(backend): add the option to enable/disable cache globally (#11831)
* feat(backend): add the option to enable/disable cache globally

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* chore(backend): added logging when cache is disabled globally

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* Update backend/src/apiserver/resource/resource_manager.go

Co-authored-by: Giulio Frasca <giulio.m.frasca@gmail.com>
Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* chore(backend): Fixed Scheduled Workflows

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* chore(backend): Added integration tests

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* chore(backend): Removed container arg when --cache_enabled is true

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* Update backend/src/v2/cmd/driver/main.go

Co-authored-by: Matt Prahl <mprahl@users.noreply.github.com>
Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* Update backend/src/v2/cmd/compiler/main.go

Co-authored-by: Matt Prahl <mprahl@users.noreply.github.com>
Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* chore(backend): Renamed to CacheDisabled and removed pointers

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

* chore(backend): Added Argo compiler unit test with the cache disabled

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>

---------

Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>
Co-authored-by: Giulio Frasca <giulio.m.frasca@gmail.com>
Co-authored-by: Matt Prahl <mprahl@users.noreply.github.com>
2025-05-08 12:45:40 +00:00
Humair Khan 0010b06731
chore: handle empty tolerations dict/lists parameterization (#11898)
* remove unnecessary nil err

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* ignore empty tolerations

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

---------

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-05-07 20:41:39 +00:00
Alyssa Goins 9245739f6f
feat(backend): parameterize retryStrategy input in Argo workflow (#11861)
Prior to this commit, there was a workflow template added for every unique
retryStrategy setting. These are now consolidated into a single retryStrategy
template.

Signed-off-by: agoins <alyssacgoins@gmail.com>
2025-05-07 19:14:39 +00:00
Michael ed828b513a
feat(backend/sdk): enable dsl.Collected for parameters & artifacts (#11725)
* feat(backend/sdk): enable dsl.Collected for params & artifacts

Signed-off-by: zazulam <m.zazula@gmail.com>

* feat(backend): collect through loops & dags

Signed-off-by: zazulam <m.zazula@gmail.com>

To enable users to use loops similar to subdags, the initial collecting
implementation went only 1 layer deep of loops/subdags. This
implementation handles the multifaceted pipeline structures that
users can generate.

---------

Signed-off-by: zazulam <m.zazula@gmail.com>
2025-05-05 22:55:37 +00:00
Matt Prahl 70d28885f2
feat(backend): Allow the launcher command to be configurable (#11888)
This allows the launcher command to be overridden with
the V2_LAUNCHER_COMMAND environment variable. This is useful if you need
to override the command to launch Delve for debugging or you have a
situation that requires using a different binary in the container image
based on the environment.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
2025-05-02 15:06:06 +00:00
Humair Khan 24782d178d
chore: update all owners files (#11886)
Various reviewers/approvers listed in the owners files are no longer
involved with the community. Additionally, various other folks are
involved and have shown an active interest in reviewing various portions
of the code base. This change updates all owners files to reflect the
current state of the community.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-05-02 14:47:04 +00:00
Matt Prahl 56da004d91
fix(backend): Stop logging the stack trace on benign user errors (#11883)
Prior to this change, if a user requested a pipeline version that no
longer exists, it'd cause a whole stack trace to be displayed in the
logs. For benign errors, this now is a single info log.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
2025-04-30 20:52:03 +00:00
Matt Prahl c03127d967
feat(backend): Add the Kubernetes native pipeline store (#11881)
* Add the Kubernetes native pipeline store

This also improves cache update race conditions in the webhooks.

Co-authored-by: Matt Prahl <mprahl@users.noreply.github.com>
Signed-off-by: Ricardo M. Oliveira <rmartine@redhat.com>

* Use controller-runtime for the non-caching client

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

* Put the Kubernetes native CI manifests under the Argo manifests

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

* Fix the flaky Kubernetes pipeline store tests

Some tests set the viper configuration of POD_NAMESPACE while others
didn't and so the order of the tests mattered. This now sets and resets
the viper configuration for each test.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

* Modify the suggested pipeline version name to be a valid K8s name

This is more important for the Kubernetes pipeline store.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

* Fix the ml-pipeline Service ports in K8s native mode

The KFP UI automatically uses the first port listed in the ml-pipeline
Service to communicate with the KFP API. Using a JSON patch to add the
webhook port ensures it doesn't change the order.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

---------

Signed-off-by: Ricardo M. Oliveira <rmartine@redhat.com>
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Co-authored-by: Ricardo M. Oliveira <rmartine@redhat.com>
2025-04-30 19:53:03 +00:00
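The Service port fix in #11881 above relies on a JSON patch that appends to the ports list instead of replacing it, so the first port stays first. A minimal sketch of building such a patch follows; the port name and number in the usage note are hypothetical, and the types are defined here for illustration rather than taken from the KFP manifests.

```go
package main

import "encoding/json"

// jsonPatchOp is a single RFC 6902 JSON Patch operation.
type jsonPatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// servicePort is a simplified stand-in for a Kubernetes Service port entry.
type servicePort struct {
	Name string `json:"name"`
	Port int    `json:"port"`
}

// appendServicePortPatch builds a patch that adds p at the end of
// /spec/ports. In RFC 6902, the "-" index refers to the position after the
// last array element, so existing ports keep their order.
func appendServicePortPatch(p servicePort) ([]byte, error) {
	return json.Marshal([]jsonPatchOp{
		{Op: "add", Path: "/spec/ports/-", Value: p},
	})
}
```

Calling appendServicePortPatch(servicePort{Name: "webhook", Port: 8443}) yields a one-element patch whose path is /spec/ports/-, which leaves the port the KFP UI depends on at index 0.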
Humair Khan 8261e4af70
chore: bump master to release 2.5 (#11872)
* chore(release): bumped version to 2.5.0

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* remove unneeded ci files

These are added by the release script. Until the scripts are updated,
they are removed manually.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* revert incorrect changelog changes for 2.5

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* correct 2.5.x change log (#11878)

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

---------

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-04-29 18:24:02 +00:00
Humair Khan 64e8264352
chore: fix kfp-kubernetes pipeline spec resolution errors (#11868)
* use packaged pipeline spec in k8s platform pkg

kfp-kubernetes expects the pipeline spec to be present in the global
namespace. This change updates the generated proto code to replace this
import so that it correctly references the spec in the kfp.pipeline_spec
namespace.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* update to protoc 3.20

upgrading to 3.20 resolves issues where some of the proto python
generated code attempts to directly create descriptors. This issue is
encountered when trying to use kfp-kubernetes packages with kfp sdk,
forcing the user to downgrade protobuf to 3.20 or lower despite the
kfp sdk and kfp-kubernetes supporting >=4.0. This change resolves
this issue.

Updating protoc also requires regeneration of all Go and Python proto code, as well as updates to the api generate and release images.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

---------

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-04-28 15:11:01 +00:00
Ricardo M. Oliveira 0359551b76
Fix Integration tests
Signed-off-by: Ricardo M. Oliveira <rmartine@redhat.com>
2025-04-25 16:20:06 -04:00
Humair Khan 90909fc0ef
add backend support for toleration lists.
clarify toleration json docs

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-04-25 13:31:20 -04:00
Humair Khan 7529bbeba7
switch selenium image to ghcr
Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-04-25 11:31:02 -04:00
Helber Belmiro 93675b03d4
chore(backend): fixed support for Podman in Makefile (#11844)
Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>
2025-04-23 17:34:32 +00:00
Helber Belmiro d38418efea
fix(backend): fixed Dockerfile (#11841)
Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>
2025-04-21 15:26:26 +00:00
Helber Belmiro e696472a5b
chore(backend): Optimized Dockerfiles (#11834)
Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>
2025-04-17 13:57:24 +00:00
Anton Pechenin 598826e1cc
fix component retry test (#11836)
Signed-off-by: ntny <ntny1986@gmail.com>
Signed-off-by: arpechenin <arpechenin@avito.ru>
Co-authored-by: arpechenin <arpechenin@avito.ru>
2025-04-17 13:36:24 +00:00
Alyssa Goins 88cff55914
fix(docs): Remove Podman as backend README pre-req (#11824)
* Remove Podman as backend README pre-req

Signed-off-by: agoins <alyssacgoins@gmail.com>

* Note Docker req for dev-kind-cluster

Signed-off-by: agoins <alyssacgoins@gmail.com>

* Change back to Docker or Podman

Signed-off-by: agoins <alyssacgoins@gmail.com>

---------

Signed-off-by: agoins <alyssacgoins@gmail.com>
2025-04-15 14:07:23 +00:00
Matt Prahl 92e4921c4c
fix(docs): Use the latest driver and launcher images in the dev environment (#11820)
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
2025-04-11 17:11:07 +00:00
Alex 464ca3974f
feat(backend): implement logs as artifacts + CI updates (#11809)
* feat(backend): implement logs as artifacts

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>
Co-authored-by: cmdevoto <carolined321@gmail.com>

* Address feedback and update golangci-lint

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>

* Implement flag

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>
Co-authored-by: carter.fendley <carter.fendley@gmail.com>

* Handle logs in kubernetesPlatformOps

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>

* Broaden sdk execution test filter

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>

* Delete MinIO PVC at end of SDK execution tests

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>

* Emit more logs from sdk execution tests

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>

---------

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>
Co-authored-by: cmdevoto <carolined321@gmail.com>
Co-authored-by: carter.fendley <carter.fendley@gmail.com>
2025-04-11 00:15:05 +00:00
Humair Khan 38a46533fc
update driver & launcher image handling (#11533)
This change relies on manifest yamls to specify the launcher & driver
that is pinned to a specific KFP version. The goal is to decouple having
to build launcher/driver at separate stages compared to api server.

This is accomplished by setting the hardcoded default to point to
"latest" and during release, api server is built with this hardcoding,
and the images for driver/launcher are patched into manifests post build
along with the other images.

The apiserver deployment manifest is reformatted using yq so the next
time release.sh is run, the user is not surprised by the entire file
being reformatted unexpectedly.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

2025-04-10 20:54:05 +00:00
Humair Khan 9544293af3
chore(backend): removed tekton backend (#11813)
As per community discussion, the tekton backend is being removed due to
lack of maintainers.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-04-10 17:47:06 +00:00
Helber Belmiro 6e3548f33e
feat(backend): Add the ability to set a proxy for accessing external resources (#11771)
Signed-off-by: Helber Belmiro <helber.belmiro@gmail.com>
2025-04-10 17:44:05 +00:00
Alex a680e2230c
Revert "feat(backend): implement logs as artifacts (#11762)" (#11807)
This reverts commit cd3e747b5d.

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>
2025-04-09 21:35:04 +00:00
co63oc 503beb51a3
chore: Fix typos (#11668)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-04-09 18:56:07 +00:00
Alex cd3e747b5d
feat(backend): implement logs as artifacts (#11762)
* feat(backend): implement logs as artifacts

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>
Co-authored-by: cmdevoto <carolined321@gmail.com>

* Address feedback and update golangci-lint

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>

* Implement flag

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>
Co-authored-by: carter.fendley <carter.fendley@gmail.com>

---------

Signed-off-by: droctothorpe <mythicalsunlight@gmail.com>
Co-authored-by: cmdevoto <carolined321@gmail.com>
Co-authored-by: carter.fendley <carter.fendley@gmail.com>
2025-04-09 15:47:23 +00:00
Matt Prahl bb7a1082c4
Handle optional pipeline inputs in the driver (#11788)
If the pipeline run is submitted without specifying an optional
parameter and there is no default, it was not handled by the driver.
The approach taken is to explicitly set null for these values and let the
driver determine whether the component parameter has a default that can be
used in the launcher.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
2025-04-09 03:57:20 +00:00
Matt Prahl 048f28332b
Fix recurring run output when always using latest (#11790)
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
2025-04-07 20:35:19 +00:00
Matt Prahl c9be64dca3
feat(backend): Add a mutating webhook for the PipelineVersion kind (#11782)
* Add a mutating webhook for the PipelineVersion kind

This mutating webhook adds labels to query pipeline versions by their
pipeline ID and pipeline name, and adds an owner's reference to the
pipeline it's a part of.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

* Delete the Argo Workflow object after successful SDK execution test

This will hopefully free up resources on the cluster and reduce the CI
flakes.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

---------

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
2025-04-04 16:15:35 +00:00
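The mutation described in #11782 above (labels for the pipeline ID and name, plus an owner reference) can be sketched with plain structs. Everything named below is hypothetical: the label keys, the types, and the function are stand-ins for illustration and are not the KFP or Kubernetes API types.

```go
package main

// Hypothetical label keys; the real keys used by KFP are not given in the
// commit message.
const (
	labelPipelineID   = "example.com/pipeline-id"
	labelPipelineName = "example.com/pipeline-name"
)

// ownerRef and pipelineVersion are simplified stand-ins for the Kubernetes
// object metadata that a mutating webhook would edit.
type ownerRef struct {
	Kind string
	Name string
	UID  string
}

type pipelineVersion struct {
	Name      string
	Labels    map[string]string
	OwnerRefs []ownerRef
}

// mutatePipelineVersion adds the lookup labels and an owner reference to the
// parent pipeline. It is idempotent: applying it twice does not add a second
// owner reference for the same pipeline UID.
func mutatePipelineVersion(pv *pipelineVersion, pipelineName, pipelineUID string) {
	if pv.Labels == nil {
		pv.Labels = map[string]string{}
	}
	pv.Labels[labelPipelineID] = pipelineUID
	pv.Labels[labelPipelineName] = pipelineName
	for _, o := range pv.OwnerRefs {
		if o.UID == pipelineUID {
			return
		}
	}
	pv.OwnerRefs = append(pv.OwnerRefs, ownerRef{Kind: "Pipeline", Name: pipelineName, UID: pipelineUID})
}
```

Labels make the versions of one pipeline selectable with a label query, and the owner reference lets Kubernetes garbage collection remove versions when their pipeline is deleted.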
Giulio Frasca 97e57368d1
chore: Upgrade Argo Workflows to v3.5.14, use Argo-provided images (#11783)
* chore: Upgrade Argo Workflows to v3.5.14

Signed-off-by: Giulio Frasca <gfrasca@redhat.com>

* chore: Use upstream Argo Workflow images

Signed-off-by: Giulio Frasca <gfrasca@redhat.com>

---------

Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
2025-04-02 20:50:34 +00:00
Abhinav2777 35041ef2bd
fix(metadata-writer): use mlmd_store.get_context_types() instead of workaround (#11753)
Replaced the previous workaround for checking if the DB is empty with
mlmd_store.get_context_types(), as the upstream issue (#28) is now resolved.

Signed-off-by: Abhinav Sai D <abhinavsai491@tutanota.com>
2025-03-31 20:07:33 +00:00
Humair Khan 596ec90bb8
chore(backend): upgrade go and deps (#11780)
Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-03-31 19:50:31 +00:00
Matt Prahl 2efcde5efd
feat(backend): Create a validating webhook for the PipelineVersion kind (#11774)
* Create a validating webhook for the PipelineVersion kind

Signed-off-by: VaniHaripriya <vmudadla@redhat.com>

* Fix the manifests for the validating webhook

This fixes deployment and local development for the Kubernetes native
API manifests.

This also addresses other feedback.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>

---------

Signed-off-by: VaniHaripriya <vmudadla@redhat.com>
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Co-authored-by: VaniHaripriya <vmudadla@redhat.com>
2025-03-31 11:23:31 +00:00
Humair Khan e9f5b5aee2
chore(backend): refactor driver tests (#11777)
Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-03-31 11:14:30 +00:00
Humair Khan fd1b48b471
feat(backend/sdk): Add input parameterization for various k8s resources (#11770)
* add backend support for k8s platform inputs

This change adds driver support for input parameters in the
kubernetes platform spec. Input resolution is extracted and made
more generic so it can be re-used when building out the container spec
for the k8s config.

Also add unit tests for constant & runtime input parameters.
Tests for TaskOutput parameters are omitted due to the lack of an
appropriate mlmd mock framework.

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* add sdk implementation for k8s params inputs

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* add tests for k8s input params

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* add setup/teardown of prereqs and secret tests

and update/re-enable secret env tests

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* have kfp sample tests use local python pkgs

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

* use a better configmap k8s name input...

Add support for multiple input types for pull secrets.
Clarify toleration docstring
Remove unnecessary resolve function

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>

---------

Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-03-28 20:30:29 +00:00
Giulio Frasca 715ed40b92
fix(backend): Include missing go.mod for cacheserver/viewercontroller images (#11776)
- Add api and k8s_platform go.mod files so cacheserver and
  viewercontroller images will build

Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
2025-03-28 11:30:28 +00:00
Matt Prahl 2694605996
bug(backend,sdk): Use a valid path separator for Modelcar imports (#11767)
Forward slashes are invalid characters in a path and can't be escaped.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
2025-03-21 20:35:27 +00:00
Anish Asthana 06a7350191
chore(backend): Allow specification of image registry when building images (#11759)
Signed-off-by: Anish Asthana <anishasthana1@gmail.com>
2025-03-18 19:10:08 +00:00
Ricardo Martinelli de Oliveira 0d9a7b00e9
feat(backend): Add types for KFP Kubernetes Native API (#11672)
Signed-off-by: Ricardo M. Oliveira <rmartine@redhat.com>
2025-03-07 13:57:55 +00:00
Humair Khan 3a89bd8564
chore(release): bumped version to 2.4.1 (#11718)
Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-03-05 00:07:06 +00:00
Humair Khan 89c8bd7274
remove unused function (#11719)
Signed-off-by: Humair Khan <HumairAK@users.noreply.github.com>
2025-03-01 18:12:15 +00:00