Yuki Iwai
d164ea463d
Upgrade golangci-lint v1 to v2 ( #714 )
...
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2025-09-02 01:32:01 +00:00
Michał Szadkowski
c29c37ca7e
Introduce ManagedBy field in RunPolicy ( #650 )
...
* Introduce ManagedBy field to RunPolicy
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Make mpi-operator a default value for ManagedBy
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Add validation for ManagedBy field
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Make use of ManagedBy in reconciliation process
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Regenerate code after adding managedBy field
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Add e2e tests
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Update after code review
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Update tests
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Remove default value for ManagedBy
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Add optional tag
Replace backoff and consistently with sleep
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
* Create common util package for integration and e2e tests with sleep/wait constants
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
---------
Signed-off-by: Michal Szadkowski <michal_szadkowski@epam.com>
2024-10-10 17:16:10 +00:00
Yuki Iwai
4d5156d07a
Replace original pointer methods with ptr libs ( #635 )
...
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-04-17 16:17:28 +00:00
Chitsing KUI
f92b9c7e74
Deprecated pointer, use ptr instead ( #627 )
...
Signed-off-by: kuizhiqing <kuizhiqing@msn.com>
2024-02-27 13:28:00 +00:00
Yuki Iwai
3c7fad663a
Upgrade K8s dependencies to v0.27.4 ( #584 )
...
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-08-07 22:17:56 +00:00
dragon-fly
e1590ce61e
merge kubeflow/common.v1 to mpi-operator ( #571 )
...
* merge kubeflow/common.v1 to mpi-operator
Signed-off-by: lowang_bh <lhui_wang@163.com>
java gen Python SDK
Signed-off-by: lowang_bh <lhui_wang@163.com>
* update make generate and fix comment issues
Signed-off-by: lowang_bh <lhui_wang@163.com>
* Update pkg/apis/kubeflow/v2beta1/types.go
Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
* merge from master to solve conflict
Signed-off-by: lowang-bh <lhui_wang@163.com>
* change reference link to training-operator project
Signed-off-by: lowang-bh <lhui_wang@163.com>
---------
Signed-off-by: lowang_bh <lhui_wang@163.com>
Signed-off-by: lowang-bh <lhui_wang@163.com>
Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-07-08 19:52:53 +00:00
dragon-fly
3cbaa9825f
add volcano gang-schedule integration and e2e test ( #569 )
...
* add volcano gang-schedule integration test
Signed-off-by: lowang_bh <lhui_wang@163.com>
* add e2e test for volcano scheduler
Signed-off-by: lowang_bh <lhui_wang@163.com>
* merge #576 : Increase the timeout for E2E tests
Signed-off-by: lowang_bh <lhui_wang@163.com>
* Update test/e2e/mpi_job_test.go
Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
* refactor e2e test function
Signed-off-by: lowang_bh <lhui_wang@163.com>
---------
Signed-off-by: lowang_bh <lhui_wang@163.com>
Co-authored-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-06-28 13:34:16 +00:00
xhejtman
f8d815cdf4
Run workers first and wait for them ( #484 )
...
* Real rebase of waitforworkers option
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix generated API
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix format
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Add docs
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix typo
* Add tests for waitforworkers
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Add missing err test
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix cleanpodpolicy
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Remove debug
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix tests
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Rework api
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix generated api
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* One more fix of api
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Swagger fix
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix readme
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix readme again
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Add comments
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Add kubebuilder annotations
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
* Fix manifests
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
---------
Signed-off-by: Lukas Hejtmanek <xhejtman@gmail.com>
2023-06-26 18:37:14 +00:00
Mateusz Kubica
21f326d1d2
MPICH support ( #562 )
...
* Add support for MPICH
* Fix CI errors
* Temporary: manual trigger
* Fix file name
* Add an empty line at the end of the file
* Fix formatting
* Revert "Temporary: manual trigger"
This reverts commit 15164a8b70.
* fix formatting
* Regenerate the mpi-operator.yaml
* Adding an empty line at the end of Dockerfiles
* Share the same entrypoint for Intel and MPICH
* share hostfile generation between Intel and MPICH
* Add validation test for MPICH
* Fix formatting
* Don't over engineer the tests - be explicit
* add non-root tests for IntelMPI and MPICH
2023-06-16 17:57:36 +00:00
Yuki Iwai
a3e15fe461
Implement E2E for integration with scheduler-plugins ( #540 )
...
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-04-04 19:34:02 +00:00
Yuki Iwai
d87eff50b1
Stop using e2e tag ( #530 )
...
* Stop using e2e tag
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
* Skip running E2E tests in the `go test` run for unit and integration packages
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
---------
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-02-22 17:11:41 +00:00
Yuki Iwai
c21942d1e2
Add slots to hostfile ( #523 )
...
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-02-10 19:10:02 +00:00
Michał Woźniak
31d4575066
Pass context to the utility methods in e2e tests ( #516 )
2023-02-06 10:32:01 +00:00
Michał Woźniak
92e491e6e9
Support suspend semantics for MPIJob ( #511 )
...
* Implement Suspend semantics for MPIJob
* Changes
- add unit tests for creating suspended, suspending and resuming
- use fake clock for unit tests
- do not return from the syncHandler after worker pods cleanup on
suspend - this allows the MPIJob update to continue in the same sync
2023-02-03 15:44:02 +00:00
Yuki Iwai
4c8b4fc2e4
Use local copy of JobStatus by mpi-operator ( #514 )
...
* Use local copy of JobStatus by mpi-operator
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
* address comments
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
---------
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-02-03 14:44:01 +00:00
Yuki Iwai
05ac6addc0
Upgrade Kubernetes dependencies ( #502 )
...
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-01-26 18:13:09 +00:00
Yuki Iwai
cd83424f65
Rename Go module name to 'github.com/kubeflow/mpi-operator' ( #506 )
...
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-01-25 16:28:53 +00:00
Yuki Iwai
dc36350d99
Move mpi-operator v2 to the top of the repository ( #496 )
...
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2023-01-11 17:03:15 +00:00