Commit Graph

18 Commits

Author SHA1 Message Date
Yuki Iwai d164ea463d
Upgrade golangci-lint v1 to v2 (#714)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2025-09-02 01:32:01 +00:00
Michał Szadkowski c29c37ca7e
Introduce ManagedBy field in RunPolicy (#650)
* Introduce ManagedBy field to RunPolicy

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Make mpi-operator a default value for ManagedBy

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Add validation for ManagedBy field

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Make use of ManagedBy in reconciliation process

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Regenerate code after adding managedBy field

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Add e2e tests

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Update after code review

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Update tests

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Remove default value for ManagedBy

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Add optional tag
Replace backoff and consistently with sleep

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

* Create common util package for integration and e2e tests with sleep/wait constants

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>

---------

Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
2024-10-10 17:16:10 +00:00
Yuki Iwai 4d5156d07a
Replace original pointer methods with ptr libs (#635)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-04-17 16:17:28 +00:00
Chitsing KUI f92b9c7e74
Deprecated pointer, use ptr instead (#627)
Signed-off-by: kuizhiqing <kuizhiqing@msn.com>
2024-02-27 13:28:00 +00:00
Yuki Iwai 3c7fad663a
Upgrade K8s dependencies to v0.27.4 (#584)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-08-07 22:17:56 +00:00
dragon-fly e1590ce61e
merge kubeflow/common.v1 to mpi-operator (#571)
* merge kubeflow/common.v1 to mpi-operator

Signed-off-by: lowang_bh <lhui_wang@163.com>

java gen Python SDK

Signed-off-by: lowang_bh <lhui_wang@163.com>

* update make generate and fix comment issues

Signed-off-by: lowang_bh <lhui_wang@163.com>

* Update pkg/apis/kubeflow/v2beta1/types.go

Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* merge from master to solve conflict

Signed-off-by: lowang-bh <lhui_wang@163.com>

* change reference link to training-operator project

Signed-off-by: lowang-bh <lhui_wang@163.com>

---------

Signed-off-by: lowang_bh <lhui_wang@163.com>
Signed-off-by: lowang-bh <lhui_wang@163.com>
Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-07-08 19:52:53 +00:00
dragon-fly 3cbaa9825f
add volcano gang-schedule integration and e2e test (#569)
* add volcano gang-schedule integration test

Signed-off-by: lowang_bh <lhui_wang@163.com>

* add e2e test for volcano scheduler

Signed-off-by: lowang_bh <lhui_wang@163.com>

* merge #576: Increase the timeout for E2E tests

Signed-off-by: lowang_bh <lhui_wang@163.com>

* Update test/e2e/mpi_job_test.go

Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Refactor e2e test function

Signed-off-by: lowang_bh <lhui_wang@163.com>

---------

Signed-off-by: lowang_bh <lhui_wang@163.com>
Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-06-28 13:34:16 +00:00
xhejtman f8d815cdf4
Run workers first and wait for them (#484)
* Real rebase of waitforworkers option

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix generated API

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix format

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Add docs

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix typo

* Add tests for waitforworkers

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Add missing err test

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix cleanpodpolicy

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Remove debug

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix tests

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Rework api

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix generated api

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* One more fix of api

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Swagger fix

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix readme

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix readme again

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Add comments

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Add kubebuilder annotations

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

* Fix manifests

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>

---------

Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
2023-06-26 18:37:14 +00:00
Mateusz Kubica 21f326d1d2
MPICH support (#562)
* Add support for MPICH

* Fix CI errors

* Temporary: manual trigger

* Fix file name

* Add an empty line at the end of the file

* Fix formatting

* Revert "Temporary: manual trigger"

This reverts commit 15164a8b70.

* fix formatting

* Regenerate the mpi-operator.yaml

* Adding an empty line at the end of Dockerfiles

* Share the same entrypoint for Intel and MPICH

* share hostfile generation between Intel and MPICH

* Add validation test for MPICH

* Fix formatting

* Don't over engineer the tests - be explicit

* add non-root tests for IntelMPI and MPICH
2023-06-16 17:57:36 +00:00
Yuki Iwai a3e15fe461
Implement E2E for integration with scheduler-plugins (#540)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-04-04 19:34:02 +00:00
Yuki Iwai d87eff50b1
Stop using e2e tag (#530)
* Stop using e2e tag

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Skip running E2E in the `go test` run over the cmd, hack, pkg, and test/integration packages

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

---------

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-02-22 17:11:41 +00:00
Yuki Iwai c21942d1e2
Add slots to hostfile (#523)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-02-10 19:10:02 +00:00
Michał Woźniak 31d4575066
Pass context to the utility methods in e2e tests (#516)
2023-02-06 10:32:01 +00:00
Michał Woźniak 92e491e6e9
Support suspend semantics for MPIJob (#511)
* Implement Suspend semantics for MPIJob

* Changes
- add unit tests for creating suspended, suspending and resuming
- use fake clock for unit tests
- do not return from the syncHandler after worker pods cleanup on
suspend - this allows the MPIJob update to continue in the same sync

2023-02-03 15:44:02 +00:00
Yuki Iwai 4c8b4fc2e4
Use local copy of JobStatus by mpi-operator (#514)
* Use local copy of JobStatus by mpi-operator

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* address comments

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

---------

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-02-03 14:44:01 +00:00
Yuki Iwai 05ac6addc0
Upgrade Kubernetes dependencies (#502)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-01-26 18:13:09 +00:00
Yuki Iwai cd83424f65
Rename Go module name to 'github.com/kubeflow/mpi-operator' (#506)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-01-25 16:28:53 +00:00
Yuki Iwai dc36350d99
Move mpi-operator v2 to the top of the repository (#496)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2023-01-11 17:03:15 +00:00