Compare commits

305 Commits

Author SHA1 Message Date
Volcano Bot 4ac9bc0ffd
Merge pull request #4389 from JesseStutler/fix_4299
Delete secrets permission for volcano agent
2025-06-21 08:07:11 +08:00
Volcano Bot 1098a08122
Merge pull request #4377 from zhifei92/sopport_allocate_func_for_extender
Support the allocation callback function provided by the extender.
2025-06-20 14:23:10 +08:00
JesseStutler ddecb9a75a Delete secrets permission for volcano agent
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-06-20 11:12:18 +08:00
Volcano Bot aa439233cc
Merge pull request #4373 from LY-today/bug-fix
fix: add getNodeStatus
2025-06-18 16:14:09 +08:00
LY-today b858207728
fix: add isNodeUnschedulable and isNodeNotReady func
Signed-off-by: LY-today <724102053@qq.com>
2025-06-18 14:55:47 +08:00
zhangzhifei16 7c3720380e feat: support the allocate callback function for extender.
Signed-off-by: zhangzhifei16 <zhangzhifei16@jd.com>

fix: Correct log formatting.

Signed-off-by: zhangzhifei16 <zhangzhifei16@jd.com>
2025-06-18 09:35:07 +08:00
Volcano Bot 7febf4dff7
Merge pull request #4378 from ElectricFish7/bug-fix
Move InitCycleState from openSession to OpenSession
2025-06-17 17:09:08 +08:00
Yuqi Wu 3c52b43750 Move InitCycleState from openSession to OpenSession
Signed-off-by: Yuqi Wu <wuyuqi22@mails.ucas.ac.cn>
2025-06-16 20:01:23 +08:00
Volcano Bot 3aa260cf38
Merge pull request #4279 from bibibox/add_preempt_proposal
Support topology aware in the preempt action
2025-06-06 16:18:57 +08:00
Box Zhang e9040d33a3 Preempt action support topology
Signed-off-by: Box Zhang <wszwbsddbk@gmail.com>
2025-06-06 15:20:51 +08:00
Box Zhang 3506a7089b Proposal: Preempt action support topology
Signed-off-by: Box Zhang <wszwbsddbk@gmail.com>
2025-06-04 11:33:01 +08:00
Volcano Bot 6e2959db6b
Merge pull request #4339 from JesseStutler/v1.12-update
Revert "Bump image to v1.12.0" in master branch / Update API version / Fix queue status update
2025-06-04 10:52:56 +08:00
JesseStutler a3d19d4bea Update volcano v1.12 Kubernetes compatibility
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-06-04 09:33:16 +08:00
JesseStutler 50895c2c36 Add -v for all e2e testing and set longer timeout
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-06-04 09:33:01 +08:00
Volcano Bot d940019b9c
Merge pull request #4316 from sailorvii/doc-update
Refine vgpu user guide
2025-06-03 10:40:55 +08:00
JesseStutler 88e22aab4e Upgrade api version to v1.12.1
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-06-03 10:13:24 +08:00
Monokaix c7efd55dd9 Fix queue update conflicts when upgrading to new version
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-06-03 10:13:12 +08:00
JesseStutler b505c1b310 Revert "Bump image to v1.12.0"
This reverts commit 284eaed827.

Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-06-03 10:12:54 +08:00
Volcano Bot adab22d06a
Merge pull request #4330 from JesseStutler/v1.12-update
Bump image to v1.12.0
2025-05-30 23:16:52 +08:00
JesseStutler fb07f675fd Add -v for printing e2e logs
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-05-30 17:24:28 +08:00
JesseStutler 284eaed827 Bump image to v1.12.0
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-05-30 15:49:02 +08:00
chenw66 57c2d11654 refine vgpu user guide
Signed-off-by: chenw66 <chenw66@chinaunicom.cn>
2025-05-29 21:20:18 +08:00
Volcano Bot d48064b933
Merge pull request #4122 from dongjiang1989/add-jobflow-validate
feat: add jobflow flow dag validate
2025-05-29 20:14:54 +08:00
Volcano Bot 2c43314183
Merge pull request #4169 from dongjiang1989/add-jobflow-metrics
feat: Add jobflow metrics
2025-05-29 20:02:51 +08:00
Volcano Bot d517da9d1a
Merge pull request #4329 from sailorvii/master
Refine the GPU mode process
2025-05-29 19:59:51 +08:00
william-wang ef541d333d
Merge pull request #4327 from Monokaix/hypernode-status
reconcile hypernode nodeCount status
2025-05-29 19:56:38 +08:00
Monokaix e59a75ccb1 reconcile hypernode nodeCount status
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-29 19:47:57 +08:00
chenw66 536b376f77 If the pod specify the GPU sharing mode, exactly match the node GPU sharing mode
Signed-off-by: chenw66 <chenw66@chinaunicom.cn>
2025-05-29 17:15:34 +08:00
dongjiang b82a72edfe
add jobflow flow dag validate
Signed-off-by: dongjiang <dongjiang1989@126.com>

default disable jobflows validate

add jobflow flow dag validate

Signed-off-by: dongjiang <dongjiang1989@126.com>

default disable jobflows validate

fix gofmt

Signed-off-by: dongjiang <dongjiang1989@126.com>
2025-05-29 17:14:05 +08:00
dongjiang 5768676e33
add jobflow metrics
Signed-off-by: dongjiang <dongjiang1989@126.com>
2025-05-29 14:55:58 +08:00
Volcano Bot 60ad1da6d2
Merge pull request #4325 from Monokaix/nt-doc
Add NetworkTopology plugin score doc
2025-05-29 09:51:53 +08:00
Volcano Bot 02c604b329
Merge pull request #4322 from Monokaix/nt-discovery-new
Add hyperNode controller framework and IB discovery
2025-05-28 19:35:50 +08:00
Monokaix 902de0fc40 Add hyperNode controller framework and IB discovery
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-28 17:34:35 +08:00
wangbin c78df02ada Add NetworkTopology plugin score doc
Signed-off-by: wangbin <994903808@qq.com>
2025-05-28 15:03:52 +08:00
Volcano Bot 4855ae9e00
Merge pull request #4321 from MondayCha/master
Fix admission webhook with labelSelector for hyperNode
2025-05-27 19:23:50 +08:00
Yilong Li c19810df43
Fix admission webhook of labelSelector for hyperNode
Signed-off-by: Yilong Li <mondaycha@outlook.com>
2025-05-27 18:43:34 +08:00
Volcano Bot a27698cb1c
Merge pull request #4135 from dongjiang1989/fix-jobflow-status-when-vcjob-failed
fix: Fix jobflow status from `running` to `failed` FSM
2025-05-27 16:42:49 +08:00
Volcano Bot c3cba84bf9
Merge pull request #3799 from JesseStutler/dra_dev
feature: Add dynamic resource allocation(DRA) plugin
2025-05-27 14:35:49 +08:00
Volcano Bot 379234ea2f
Merge pull request #4302 from mahdikhashan/add-citation
Add citation
2025-05-27 14:20:49 +08:00
Volcano Bot 73934c1975
Merge pull request #4290 from sailorvii/master
Add vgpu dynamic mig
2025-05-27 13:29:52 +08:00
chenw66 4b8409bf49 Add vgpu dynamic mig
Signed-off-by: chenw66 <chenw66@chinaunicom.cn>
2025-05-27 12:55:15 +08:00
mahdikhashan 985a672822 add citation as cff file and text in readme
Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>
2025-05-26 14:31:37 +02:00
JesseStutler 90d92512b4 feature: Add DRA Implementation
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-05-26 20:27:23 +08:00
Volcano Bot 48f2b4fd84
Merge pull request #4319 from Monokaix/api
fix go mod and informer resync
2025-05-26 19:04:48 +08:00
Monokaix 3e2c7885d3 fix go mod and informer resync
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-26 18:29:05 +08:00
Volcano Bot 77dca963cd
Merge pull request #4310 from Monokaix/network-to-master
Network Topology Aware Scheduling capability move to master
2025-05-26 16:14:51 +08:00
Volcano Bot 2ab7d08673
Merge pull request #4174 from Hcryw/feat/add-clusterrole-for-editor-and-viewer
feat: add clusterrole for editor & viewer
2025-05-26 15:41:51 +08:00
Volcano Bot 8b52d78956
Merge pull request #4047 from sfc-gh-raravena/upstream/ricardo-fix
Add resync-period flag for k8s native informers
2025-05-26 14:15:51 +08:00
wangbin ba7df2546d Add network-topology-aware plugin and hyperNode score callback
Signed-off-by: wangbin <994903808@qq.com>
2025-05-26 11:28:56 +08:00
Volcano Bot cbc4ebb413
Merge pull request #4105 from dongjiang1989/fix-security
Fix: Incorrect conversion between integer types
2025-05-23 17:19:48 +08:00
haoriwei1 bde1e67490 feat: add cluster for editor & viewer
Signed-off-by: haoriwei1 <haoriwei1@sensetime.com>

feat: add common_labels

Signed-off-by: haoriwei1 <haoriwei1@sensetime.com>

feat: drop priority fields

Signed-off-by: haoriwei1 <haoriwei1@sensetime.com>
2025-05-23 16:50:54 +08:00
wangbin 385ec23d6d HyperNode supports select Nodes By labels
Signed-off-by: wangbin <994903808@qq.com>
2025-05-23 11:25:22 +08:00
jessestutler 312de89d73 Add hypernodes validation webhook configuration yaml
Signed-off-by: jessestutler <chenzicong4@huawei.com>
2025-05-23 11:25:22 +08:00
ecosysbin b6a4a74bf1 Volcano scheduler: Supports network topology aware scheduling when pods rescheduled
Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>
2025-05-23 11:25:22 +08:00
JesseStutler 0ed5387e8a Move GetAncestors as HyperNodeInfoMap method and provide GetLCAHyperNode
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-05-23 11:25:22 +08:00
weapons97 2d0bad9443 update for comment bugs
Signed-off-by: weapons97 <weapons97@gmail.com>
2025-05-23 11:25:22 +08:00
weipeng b28c6a8e47 admission(hypernode) validate memberSelector for issue #3878
Signed-off-by: weipeng <weapons97@gmail.com>
2025-05-23 11:25:22 +08:00
xuwentao 663e495560 webhook(hypernode): validate memberSelectorType
Signed-off-by: xuwentao <cutenear1993@yahoo.com>
2025-05-23 11:25:22 +08:00
Monokaix 06329c0165 add node event
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-23 11:25:22 +08:00
Monokaix d7c57329f8 move parent to hyperNodeInfo
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-23 11:25:22 +08:00
Monokaix fd677e4c32 Update: Populate hyperNode info
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-23 11:25:21 +08:00
penggu 269994f5d1 Populate hyperNode tree in cache
Signed-off-by: penggu <penggu@gmail.com>
2025-05-23 11:16:44 +08:00
Monokaix 557ac2a40b Add plugin for networkTopology and score logic
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-23 11:14:31 +08:00
Monokaix ef7aed7824 Network topology implements of scheduler
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-23 11:14:28 +08:00
Monokaix b6d6253afe Auto generate CRD yaml
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-23 11:02:33 +08:00
Volcano Bot 059f111487
Merge pull request #4152 from JesseStutler/volume_binding
Refactor volume binding and add prebind implementation
2025-05-23 09:28:48 +08:00
Volcano Bot 03bb7e28bc
Merge pull request #4298 from halcyon-r/master
Vcctl supports merging multiple kubeconfig to support context switching among multiple k8s clusters.
2025-05-22 18:11:48 +08:00
JesseStutler 363be11267 Add benchmark testing for allocate
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-05-22 16:36:42 +08:00
JesseStutler f2b4f660c4 Refactor volume binding and add prebind implementation
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-05-22 11:10:55 +08:00
Ricardo Aravena 47bbf023d2
Add resync-period flag for k8s native informers
This is to optionally help with cache inconsistencies

Signed-off-by: Ricardo Aravena <ricardo.aravena@snowflake.com>
2025-05-21 14:03:15 -07:00
hairuiyang 961ee64953 Vcctl supports switching contexts among multiple clusters.
- Vcctl supports merging multiple kubeconfig.
- Refactor some duplicated code.
- Fix vcctl e2e tests after removing --kubeconfig option default value.

Signed-off-by: hairuiyang <hairuiyang@deeproute.ai>
2025-05-21 17:20:05 +08:00
Volcano Bot cd7774c174
Merge pull request #4305 from hwdef/update-jobflow-docs
cleanup: update jobflow example docs
2025-05-21 16:36:47 +08:00
hwdef d3f0ee72fc update jobflow example docs
Signed-off-by: hwdef <hwdefcom@outlook.com>
2025-05-21 16:08:04 +08:00
Volcano Bot 80a0ac39dd
Merge pull request #4307 from Monokaix/queue-not-open
Prevent pod scheduling when reclaim
2025-05-21 14:40:46 +08:00
Monokaix 2123494a54 Prevent pod scheduling when reclaim
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-05-20 16:36:53 +08:00
Volcano Bot 3f864478a7
Merge pull request #4285 from fengruotj/graceful-shutdown
feat: add graceful shutdown server for webhook manager.
2025-05-19 21:46:42 +08:00
tanjie.master b039864d0e feat: add graceful shutdown server for webhook manager.
Signed-off-by: tanjie.master <tanjie.master@bytedance.com>
2025-05-19 20:59:32 +08:00
Volcano Bot 8b392ea0bf
Merge pull request #4300 from mahdikhashan/security-ci-artifact-debug
[ci] debug security score workflow artifact upload failure
2025-05-19 19:50:42 +08:00
mahdikhashan 412c069d92 rename artifact file
Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

change name

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

increase artifact version

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

revert name

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>
2025-05-19 12:43:02 +02:00
Volcano Bot 94bc6c9857
Merge pull request #4266 from mahdikhashan/feature-stale-action
Feature stale action
2025-05-19 17:27:44 +08:00
mahdikhashan 3099c53927 add new stale workflow to run everynight midnight utc
Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

remove old stale bot

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

remove comment

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

update

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

update latest commit and remove labels of issue from exempt

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

revert issues labels

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

Fix panic while queues' total guarantee configuration exceed the total resources of the cluster when the node information is not fully synchronized or other situations.

Signed-off-by: hairuiyang <hairuiyang@deeproute.ai>

fix: add miss queue state check in allocatable action

Signed-off-by: googs1025 <googs1025@gmail.com>

fix controller panic for mpi job

Signed-off-by: g00673948 <guoqin10@huawei.com>

Bump golang.org/x/net from 0.36.0 to 0.38.0

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.36.0 to 0.38.0.
- [Commits](https://github.com/golang/net/compare/v0.36.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.38.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-19 11:00:21 +02:00
Volcano Bot 026df29e17
Merge pull request #4281 from murali1539/master
WORLD_SIZE calculation for PyTorch Jobs
2025-05-19 16:07:44 +08:00
Volcano Bot f54fa9cbeb
Merge pull request #4196 from volcano-sh/dependabot/go_modules/golang.org/x/net-0.38.0
Bump golang.org/x/net from 0.36.0 to 0.38.0
2025-05-15 17:38:41 +08:00
Volcano Bot 1c5ffeb0b3
Merge pull request #4272 from guoqinwill/fix-mpi
fix controller panic for mpi job
2025-05-15 11:34:38 +08:00
Volcano Bot 3dcbb1b0d0
Merge pull request #4274 from googs1025/fix_queue_close2
fix: add miss queue state check in allocatable action
2025-05-15 11:13:38 +08:00
Volcano Bot 97c5298854
Merge pull request #4273 from halcyon-r/master
Fix panic while queues' total guarantee exceeds the total resource of the cluster in some situations.
2025-05-15 11:10:41 +08:00
g00673948 b764a301bf fix controller panic for mpi job
Signed-off-by: g00673948 <guoqin10@huawei.com>
2025-05-15 10:23:49 +08:00
hairuiyang d02fda27f9 Fix panic while queues' total guarantee configuration exceed the total resources of the cluster when the node information is not fully synchronized or other situations.
Signed-off-by: hairuiyang <hairuiyang@deeproute.ai>
2025-05-14 10:59:07 +08:00
Volcano Bot d2c92115e4
Merge pull request #4282 from Poor12/fix-scheduler
fix: scheduler leader elect namespace not take effect
2025-05-13 19:22:40 +08:00
shentiecheng 4466ae426d fix: scheduler leader elect namespace not take effect
Signed-off-by: shentiecheng <shentiecheng@bytedance.com>
2025-05-13 17:03:49 +08:00
googs1025 e5ee5168be fix: add miss queue state check in allocatable action
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-05-13 13:08:51 +08:00
Muralidhara Juluri ae092898e0 Include only master and worker replicas in WORLD_SIZE calculation for pytorch jobs
Signed-off-by: Muralidhara Juluri <mjuluri@linkedin.com>
2025-05-12 21:50:50 -07:00
jessestutler 3eb1c7223f Add volume binding plugin design doc and add prebind design doc
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-05-13 12:31:55 +08:00
Volcano Bot f564e441a2
Merge pull request #4263 from googs1025/fix_queue_close
fix: prevent the scheduling of pods in noopen queues
2025-05-12 10:15:38 +08:00
Volcano Bot 3fdd845b39
Merge pull request #4264 from googs1025/refactor_util
chore: change BuildPodWithPreeemptionPolicy -> BuildPodWithPreemptionPolicy
2025-05-10 21:13:37 +08:00
googs1025 14940b5e15 chore: change BuildPodWithPreeemptionPolicy -> BuildPodWithPreemptionPolicy
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-05-10 10:29:08 +08:00
googs1025 5756e5e70f fix: prevent the scheduling of pods in noopen queues
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-05-10 09:52:29 +08:00
Volcano Bot cbc4990d75
Merge pull request #3946 from kingeasternsun/improve/capacity-deserved
in capacity plugin attr.deserved no need MinDimensionResource with attr.request
2025-05-09 10:31:33 +08:00
kingeasternsun 740da5f397 fix ut
Signed-off-by: kingeasternsun <kingeasternsun@gmail.com>
2025-05-09 01:48:19 +00:00
kingeasternsun aabc66defe 🐛 fix capacity plugin
Signed-off-by: kingeasternsun <kingeasternsun@gmail.com>
2025-05-08 13:49:58 +00:00
Volcano Bot 9da8c27fa3
Merge pull request #4222 from feyounger/4221
Optimize multiple 'if' statements in the code
2025-05-08 10:05:37 +08:00
dependabot[bot] a86857a31e
Bump golang.org/x/net from 0.36.0 to 0.38.0
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.36.0 to 0.38.0.
- [Commits](https://github.com/golang/net/compare/v0.36.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.38.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-08 01:17:44 +00:00
Volcano Bot 2a31823b0a
Merge pull request #4205 from AdamKorcz/fuzz1
Add fuzz tests for job controller
2025-05-08 09:16:35 +08:00
Adam Korczynski 008a39ec06 Add fuzz tests for job controller
Signed-off-by: Adam Korczynski <adam@adalogics.com>
2025-05-07 15:38:00 +01:00
Kevin Wang 7103c18de1
Merge commit from fork
Add http response body size limit
2025-04-30 17:29:49 +08:00
Volcano Bot e59fe4bd83
Merge pull request #4251 from Monokaix/master-revert
Revert github action bump
2025-04-30 09:14:28 +08:00
Monokaix 8ea3c6cfda Revert "Bump helm/chart-releaser-action from 1.5.0 to 1.7.0"
This reverts commit 3f9ce47aa0.

Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-30 00:15:01 +08:00
Monokaix 30cd764a65 Revert "Bump actions/checkout from 3 to 4"
This reverts commit 82f31a6a32.

Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-30 00:14:42 +08:00
Volcano Bot e57b04ed94
Merge pull request #4203 from Monokaix/ci
render scripts using TAG env
2025-04-29 21:39:32 +08:00
Volcano Bot 7050cbadb3
Merge pull request #4207 from JesseStutler/security_audit
[Security] Add security context configuration
2025-04-29 20:10:25 +08:00
JesseStutler 635aa046a5 Add security context configuration
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-04-29 19:30:16 +08:00
feyounger e9e548d28f Optimize multiple 'if' statements in the code
Signed-off-by: feyounger <1477865250@qq.com>
2025-04-25 18:16:35 +08:00
Volcano Bot 81d47c8e94
Merge pull request #4215 from feyounger/4214
Clear multiple generated hash values
2025-04-25 10:38:23 +08:00
Volcano Bot c46b398588
Merge pull request #4223 from Monokaix/fix-ci
Fix ci err caused by slow change of scheduling configMap
2025-04-24 17:24:25 +08:00
Volcano Bot 95c879924d
Merge pull request #4211 from Monokaix/warning
[Security] Add warning msg when TLS verification disabled
2025-04-24 16:34:25 +08:00
Volcano Bot 54801ace49
Merge pull request #4208 from Monokaix/timeout
[Security] Add http server timeout
2025-04-24 15:28:33 +08:00
Monokaix e59162587a Fix ci err caused by slow change of scheduling configMap
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-24 15:05:50 +08:00
Monokaix a5420c65e9 Add http server timeout
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-24 10:04:44 +08:00
feyounger bf4bbe0f6d Clear multiple generated hash values
Signed-off-by: feyounger <1477865250@qq.com>
2025-04-23 15:49:54 +08:00
Volcano Bot 2bd5be7f3c
Merge pull request #4212 from Monokaix/log
adjust e2e log level
2025-04-23 15:04:25 +08:00
Monokaix 4bbb73f1f6 render scripts using TAG env
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-23 14:53:53 +08:00
Monokaix 6afeac3789 adjust e2e log level
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-23 09:22:40 +08:00
Monokaix b4f2da4d86 Add warning msg when TLS verification disabled
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-22 16:46:22 +08:00
Volcano Bot 4b31662f59
Merge pull request #4000 from sceneryback/pgcontroller_create_use_replicas
use replicas when initializing pg minResources
2025-04-22 10:59:25 +08:00
sceneryback 771f947e89 use group-min-member annotation to initialize PodGroup
Signed-off-by: sceneryback <afterbreeze@hotmail.com>
2025-04-22 09:56:04 +08:00
Monokaix 76285a04ad Add http response body size limit
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-21 19:09:39 +08:00
Volcano Bot df30fcd45b
Merge pull request #4161 from volcano-sh/dependabot/github_actions/actions/checkout-4
Bump actions/checkout from 3 to 4
2025-04-21 15:50:22 +08:00
Volcano Bot 09a9d735e9
Merge pull request #4149 from SataQiu/fix-vgpu-type-check
Fix bug for vgpu type check
2025-04-21 14:41:22 +08:00
Volcano Bot 99dc4a179a
Merge pull request #4206 from wangyang0616/add_approver
Add wangyang0616 as approver
2025-04-21 10:14:22 +08:00
Volcano Bot 256ec14f46
Merge pull request #4204 from hiwangzhihui/fix_v100_name
fix docs v100 name
2025-04-21 10:08:22 +08:00
wangyang a1974ebc05 Add wangyang0616 as approver
Signed-off-by: wangyang <wangyang8126@gmail.com>
2025-04-20 16:02:49 +08:00
zhihui wang 88ad30f63e
Merge branch 'master' into fix_v100_name
2025-04-18 16:40:25 +08:00
Volcano Bot 79f75c4876 Merge pull request #4192 from Monokaix/readme
Update readme

Signed-off-by: wangzhihui <wzhmsgs@gmail.com>
2025-04-18 16:30:44 +08:00
Volcano Bot 3cde16d87f
Merge pull request #4192 from Monokaix/readme
Update readme
2025-04-18 14:16:20 +08:00
Volcano Bot 0303bb2a8c
Merge pull request #4201 from dongjiang1989/fix-controller-manager-metrics
Fix: remove controller-manager metrics that should not be introduced
2025-04-18 14:14:19 +08:00
dongjiang 69e9da3df6
fix controller-manager metrics
Signed-off-by: dongjiang <dongjiang1989@126.com>
2025-04-18 13:32:07 +08:00
Volcano Bot 7a3cb5f6c0
Merge pull request #4199 from feyounger/4198
[Refactor] Cover case checkControllers ut
2025-04-18 10:51:19 +08:00
Volcano Bot 247d930ae0
Merge pull request #3603 from HalfBuddhist/statefulset-gc
fix(controller): add statefulset gc for podgroup.
2025-04-17 20:16:19 +08:00
feyounger 8f3c4b96d7 [Refactor] Cover case checkControllers ut
Signed-off-by: feyounger <1477865250@qq.com>
2025-04-17 17:17:50 +08:00
Monokaix 2e000e7c7d Update readme
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-17 15:48:33 +08:00
dependabot[bot] 82f31a6a32
Bump actions/checkout from 3 to 4
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-16 09:08:37 +00:00
Volcano Bot 6af0339add
Merge pull request #4194 from Monokaix/action
Update ubuntu base image
2025-04-16 17:07:18 +08:00
Monokaix 71ca62772c Update ubuntu base image
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-16 15:52:57 +08:00
Volcano Bot d07cd5549a
Merge pull request #4173 from JesseStutler/pprof_opt
[Security] Add a switch to control whether enable pprof in scheduler
2025-04-16 14:16:18 +08:00
Volcano Bot 1b1547ec8a
Merge pull request #4158 from fjq123123/master
delete job valid action in openSession
2025-04-15 17:18:17 +08:00
fengjianqing b5a8b2b391 Signed-off-by: fengjianqing <1416100064@qq.com>
delete job valid action in openSession
2025-04-15 09:43:05 +08:00
Volcano Bot b9793c4c4c
Merge pull request #4186 from Monokaix/ci-permission
Add topLevel permission for CI
2025-04-14 17:56:14 +08:00
Volcano Bot 311f0d05a9
Merge pull request #4185 from Monokaix/ci
fix scorecards ci err
2025-04-14 17:55:17 +08:00
Volcano Bot 299dcf9f94
Merge pull request #4184 from Monokaix/readme
Add v1.11 compatibility matrix
2025-04-14 17:54:19 +08:00
Volcano Bot ff67e82d01
Merge pull request #4181 from de6p/parent-col
feat/vcctl: add parent column to queue list cmd
2025-04-14 16:13:14 +08:00
Monokaix 4d1b45bd32 Add topLevel permission for ci
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-12 16:31:31 +08:00
Monokaix 5813c9c21d fix scorecards ci err
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-12 15:15:58 +08:00
Kuldeep ab5974f076 feat/vcctl: add parent column to queue list cmd
Signed-off-by: Kuldeep <de6p97@gmail.com>
2025-04-11 03:20:03 -04:00
Volcano Bot 8e600794e9
Merge pull request #3953 from archlitchi/uniconfig
volcano-devices unified config
2025-04-11 11:35:11 +08:00
limengxuan 2ae6d4684e update configs for dynamic-volcano
Signed-off-by: limengxuan <391013634@qq.com>
2025-04-11 10:40:15 +08:00
Monokaix 250cbde53b Add v1.11 compatibility matrix
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-04-11 10:07:24 +08:00
Volcano Bot 3e2dc90c37
Merge pull request #4183 from Poor12/fix-allocated
fix: allocated in queue status should include allocated tasks, not only running tasks
2025-04-10 21:03:13 +08:00
shentiecheng 94ff6b205d fix: queue status allocated included allocated tasks
Signed-off-by: shentiecheng <shentiecheng@bytedance.com>
2025-04-10 15:46:36 +08:00
Volcano Bot 61b4f828ff
Merge pull request #4180 from googs1025/chore/metrics_system
chore: rename VolcanoNamespace -> VolcanoSubSystemName in metrics
2025-04-10 11:48:10 +08:00
googs1025 a4f0f974ab chore: rename VolcanoNamespace -> VolcanoSubSystemName in metrics
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-04-10 09:06:47 +08:00
Volcano Bot 0506de5d16
Merge pull request #4179 from JackyTYang/fixbug
fix: panic when total guarantee of child queue exceeds capacity of parent
2025-04-09 14:32:12 +08:00
Volcano Bot 2cbcb9b69b
Merge pull request #4177 from co63oc/fix3
Fix typos
2025-04-09 14:19:09 +08:00
Volcano Bot cf422defa0
Merge pull request #4176 from volcano-sh/dependabot/go_modules/golang.org/x/crypto-0.37.0
Bump golang.org/x/crypto from 0.35.0 to 0.37.0
2025-04-09 14:17:12 +08:00
Volcano Bot 3fcef07664
Merge pull request #4171 from JesseStutler/security_enhance
[Security] Remove the execute permission for some files, chmod to 644
2025-04-09 10:21:10 +08:00
co63oc ec6304641d Fix typos
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-04-08 20:38:24 +08:00
JackyTYang fd9e6ee0e9
modify import sort
Signed-off-by: JackyTYang <jackydtt@yeah.net>
2025-04-08 10:56:27 +08:00
JackyTYang 81424b85ad
fix panic when sum of queue's guarantee exceeds the capacity of parent queue
Signed-off-by: JackyTYang <jackydtt@yeah.net>
2025-04-08 00:16:55 +08:00
Volcano Bot 690b4b0363
Merge pull request #4160 from baddoub/master
fix: remove lessPartly condition in reclaimable fn from capacity and …
2025-04-07 14:33:10 +08:00
dependabot[bot] b4d6a93950
Bump golang.org/x/crypto from 0.35.0 to 0.37.0
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.35.0 to 0.37.0.
- [Commits](https://github.com/golang/crypto/compare/v0.35.0...v0.37.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.37.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-07 01:46:10 +00:00
JesseStutler 7d7fea8246 add a switch to control whether enable pprof in scheduler
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-04-03 12:34:34 +08:00
JesseStutler 7a6ca8baa2 Remove the execute permission for some files, chmod to 644
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-04-03 10:00:40 +08:00
baddoub 975277f55f fix: remove lessPartly condition in reclaimable fn from capacity and proportion plugins
Signed-off-by: baddoub <badr.baddou@gmail.com>
2025-03-29 21:55:41 +04:00
Volcano Bot 91c92e9d61
Merge pull request #3937 from zedongh/feat-support-scalar-resource-metric
feat: support scalar resources metric
2025-03-29 11:37:02 +08:00
SataQiu f49301a0e6 fix bug for vgpu type check
Signed-off-by: SataQiu <shidaqiu2018@gmail.com>
2025-03-28 17:32:52 +08:00
Volcano Bot 7715973c0b
Merge pull request #4141 from JesseStutler/czc_dev
Replace queue status update by using ApplyStatus method
2025-03-27 17:12:01 +08:00
Volcano Bot 5f9960b3a2
Merge pull request #4117 from ouyangshengjia/gang-msg
improve fail messages for pod scheduling in gang unschedulable scenario
2025-03-27 16:41:00 +08:00
ouyangshengjia 9f7123d44a improve fail messages for pod scheduling in gang unschedulable scenario
Signed-off-by: ouyangshengjia <oysj2016@163.com>
2025-03-27 15:38:13 +08:00
jessestutler 37adcdb0ef Replace queue status update by using ApplyStatus method
Signed-off-by: jessestutler <chenzicong4@huawei.com>
2025-03-26 16:43:14 +08:00
Volcano Bot 0b87df7c9c
Merge pull request #4142 from ytcisme/bug-fix
fix: the problem that PVC will be continuously created indefinitely
2025-03-26 14:10:59 +08:00
Jing Yu e3383bea6b fix: the problem that PVC will be continuously created indefinitely
Signed-off-by: Jing Yu <ytcisme@aliyun.com>
2025-03-26 11:44:25 +08:00
Volcano Bot 1ae9a84e16
Merge pull request #4110 from volcano-sh/dependabot/github_actions/actions/setup-java-4
Bump actions/setup-java from 3 to 4
2025-03-25 10:11:58 +08:00
dependabot[bot] f145d963b1
Bump actions/setup-java from 3 to 4
Bumps [actions/setup-java](https://github.com/actions/setup-java) from 3 to 4.
- [Release notes](https://github.com/actions/setup-java/releases)
- [Commits](https://github.com/actions/setup-java/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/setup-java
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-24 14:22:58 +00:00
Volcano Bot da2d331362
Merge pull request #4140 from volcano-sh/dependabot/github_actions/helm/chart-releaser-action-1.7.0
Bump helm/chart-releaser-action from 1.5.0 to 1.7.0
2025-03-24 22:16:58 +08:00
dependabot[bot] 3f9ce47aa0
Bump helm/chart-releaser-action from 1.5.0 to 1.7.0
Bumps [helm/chart-releaser-action](https://github.com/helm/chart-releaser-action) from 1.5.0 to 1.7.0.
- [Release notes](https://github.com/helm/chart-releaser-action/releases)
- [Commits](https://github.com/helm/chart-releaser-action/compare/v1.5.0...v1.7.0)

---
updated-dependencies:
- dependency-name: helm/chart-releaser-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-24 13:39:41 +00:00
Volcano Bot 5074e8bc46
Merge pull request #4111 from volcano-sh/dependabot/github_actions/docker/login-action-3
Bump docker/login-action from 1 to 3
2025-03-24 21:36:58 +08:00
Volcano Bot 3f9e99941f
Merge pull request #4079 from SataQiu/try-nominated-pod-first
scheduler: consider the nominated node first when allocating Node for Pod
2025-03-24 20:47:55 +08:00
dongjiang 87d4b336c9 fix jobflow running to failed FSM
Signed-off-by: dongjiang <dongjiang1989@126.com>

add TODO for the if condition judgment is implemented
2025-03-24 20:41:30 +08:00
dependabot[bot] 157f7fc7da
Bump docker/login-action from 1 to 3
Bumps [docker/login-action](https://github.com/docker/login-action) from 1 to 3.
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](https://github.com/docker/login-action/compare/v1...v3)

---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-24 12:29:41 +00:00
Volcano Bot ab48e588a6
Merge pull request #4109 from volcano-sh/dependabot/github_actions/github/codeql-action-3
Bump github/codeql-action from 1 to 3
2025-03-24 20:27:56 +08:00
dependabot[bot] 4e2097c87e
Bump github/codeql-action from 1 to 3
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 1 to 3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v1...v3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-24 11:59:48 +00:00
Volcano Bot 0b1c5d4c32
Merge pull request #4108 from volcano-sh/dependabot/github_actions/ossf/scorecard-action-2.4.1
Bump ossf/scorecard-action from 2.3.1 to 2.4.1
2025-03-24 19:58:55 +08:00
Volcano Bot bdcd298324
Merge pull request #4138 from JesseStutler/cleanup
cleanup residual useless victims code in preempt action
2025-03-24 19:45:58 +08:00
Volcano Bot 8778c8ebdc
Merge pull request #4115 from volcano-sh/dependabot/go_modules/golang.org/x/net-0.36.0
Bump golang.org/x/net from 0.26.0 to 0.36.0
2025-03-24 17:36:58 +08:00
Volcano Bot b99fd49977
Merge pull request #4136 from dongjiang1989/chore-github-action
chore: Change dependabot schedule interval to weekly
2025-03-24 16:50:58 +08:00
dependabot[bot] e91dd68313
Bump ossf/scorecard-action from 2.3.1 to 2.4.1
Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.3.1 to 2.4.1.
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](0864cf1902...f49aabe0b5)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-24 08:31:28 +00:00
Volcano Bot e8647615e3
Merge pull request #4126 from volcano-sh/dependabot/github_actions/actions/setup-go-5
Bump actions/setup-go from 4 to 5
2025-03-24 16:28:58 +08:00
JesseStutler c494f9dbdc cleanup residual useless victims code in preempt action
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-03-24 14:20:51 +08:00
Volcano Bot 0403a8849a
Merge pull request #4123 from qGentry/feature-cdp-plugin-for-reclaim
Enabled Cooldown Protection Plugin for reclaiming also
2025-03-24 10:27:58 +08:00
Volcano Bot 1db1673cce
Merge pull request #4124 from cnmcavoy/remove-hostpath-mount
fix: remove hostpath mount in volcano-scheduler
2025-03-21 23:55:56 +08:00
dongjiang 5e8f2a42c5
dependabot schedule interval to weekly
Signed-off-by: dongjiang <dongjiang1989@126.com>
2025-03-21 20:21:15 +08:00
dependabot[bot] bbc11c8b24
Bump golang.org/x/net from 0.26.0 to 0.36.0
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.26.0 to 0.36.0.
- [Commits](https://github.com/golang/net/compare/v0.26.0...v0.36.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-21 11:57:15 +00:00
Volcano Bot b6eae3fa85
Merge pull request #4132 from Monokaix/pg-validate
Move queue state validate from pod to podgroup
2025-03-21 19:55:55 +08:00
Monokaix 0f57b1bd35 Move queue state validate from pod to podgroup
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-03-21 18:17:05 +08:00
dependabot[bot] 97a21e9dbb
Bump actions/setup-go from 4 to 5
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 4 to 5.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](https://github.com/actions/setup-go/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-21 10:04:02 +00:00
Volcano Bot a47b2e6a10
Merge pull request #4099 from guoqinwill/adapt1.32-1
feat: Volcano adapts to the k8s v1.32
2025-03-21 18:02:56 +08:00
Volcano Bot ef646175f4
Merge pull request #4119 from JesseStutler/fix_hierarchy_enqueue
fix: add hierarchy queue validation and update for enqueueable
2025-03-21 17:41:53 +08:00
Filipp Fisin f2ad2dc394 Enabled CDP plugin for reclaiming also
Signed-off-by: Filipp Fisin <ffisin@nebius.com>
2025-03-21 09:11:42 +01:00
Volcano Bot 9217619a16
Merge pull request #4128 from Monokaix/pg-mutate
Remove podgroup mutating webhook by default
2025-03-21 14:48:55 +08:00
guoqin 408e7bbce5 fix ci errors
Signed-off-by: guoqin <gq411will@163.com>
2025-03-21 14:45:53 +08:00
Monokaix 89425537fe Remove podgroup mutating webhook by default
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-03-21 12:48:41 +08:00
Volcano Bot 979a45fd83
Merge pull request #4120 from Monokaix/webhook
Remove pod mutate webhook by default
2025-03-21 10:49:55 +08:00
JesseStutler ea087a98c6 fix: add hierarchy queue validation and update for enqueueable
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-03-21 10:21:55 +08:00
Cameron McAvoy 9c5f05b057 fix: remove hostpath mount in volcano-scheduler
Signed-off-by: Cameron McAvoy <cmcavoy@indeed.com>
2025-03-20 13:56:21 -05:00
Volcano Bot 1ac4fb3349
Merge pull request #4121 from Monokaix/ut
fix flaky test
2025-03-20 19:56:55 +08:00
Monokaix b952f9230b fix flaky test
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-03-20 19:27:14 +08:00
Monokaix 30083b4316 Remove pod mutate webhook by default
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-03-20 17:33:26 +08:00
dongjiang c8e22193ab fix incorrect conversion
Signed-off-by: dongjiang <dongjiang1989@126.com>
2025-03-20 17:27:09 +08:00
Volcano Bot fc1a634c24
Merge pull request #4107 from volcano-sh/dependabot/github_actions/docker/setup-buildx-action-3
Bump docker/setup-buildx-action from 2 to 3
2025-03-20 14:08:54 +08:00
Volcano Bot 2a509607a3
Merge pull request #4113 from Monokaix/readme
update readme
2025-03-19 20:50:54 +08:00
Monokaix 6fd2b2206c update readme
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-03-19 18:26:37 +08:00
dependabot[bot] 6ead09f1df
Bump docker/setup-buildx-action from 2 to 3
Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 2 to 3.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](https://github.com/docker/setup-buildx-action/compare/v2...v3)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-19 06:19:40 +00:00
Volcano Bot 957e5f8f10
Merge pull request #4077 from dongjiang1989/auto-update-github-action
CI: add dependabot
2025-03-19 14:18:53 +08:00
Volcano Bot 2ef8335d93
Merge pull request #4088 from feyounger/4064
[Refactor] controller cache deletePod logic,skip create
2025-03-17 21:53:53 +08:00
Volcano Bot 57a83e3a7d
Merge pull request #4087 from feyounger/4086
Optimize append operations for better performance
2025-03-17 21:51:53 +08:00
guoqin bb39fb1b4a update the versions of related tools such as kind and controller-gen,etc
Signed-off-by: guoqin <gq411will@163.com>
Signed-off-by: guoqinwill <guoqinwill@163.com>
2025-03-17 20:35:04 +08:00
dongjiang 9ae88f0961 add dependabot
Signed-off-by: dongjiang <dongjiang1989@126.com>
2025-03-17 13:43:04 +08:00
Volcano Bot 8d1658ac7c
Merge pull request #4092 from yuyue9284/fix-feature-gate
[fix] update feature flag for job support
2025-03-17 09:45:51 +08:00
Volcano Bot 86ce2e2bc4
Merge pull request #4097 from Monokaix/ci
Add more info when e2e failed
2025-03-16 13:34:50 +08:00
Yue Yu 122cc94d74 [fix] update feature flag for job support
Signed-off-by: Yue Yu <yuyue9284@outlook.com>
2025-03-15 09:44:08 -07:00
Volcano Bot c439c553f6
Merge pull request #3901 from linuxfhy/dev_20241217_gpu_alloc_repeatedly
take gpu-number into consideration
2025-03-14 17:24:48 +08:00
Monokaix cdaac42961 Add more info when e2e failed
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-03-14 14:26:12 +08:00
guoqin 7f9b3ef14f update the volumebinding with k8s v1.32
Signed-off-by: guoqin <gq411will@163.com>
2025-03-14 11:20:06 +08:00
guoqin 3179b82525 modify gomod for volcano supports k8s v1.32
Signed-off-by: guoqin <gq411will@163.com>
2025-03-14 11:19:50 +08:00
danish9039 e1e4b15658 dependency upgrade
Signed-off-by: danish9039 <danishsiddiqui040@gmail.com>
2025-03-14 11:17:48 +08:00
Volcano Bot e2abd7ee4e
Merge pull request #4090 from dongjiang1989/fix-jobflow-status
fix: Fix jobflow `status` confusion problem
2025-03-14 10:45:45 +08:00
dongjiang 7c2ebae344
fix muti jobflow status error
Signed-off-by: dongjiang <dongjiang1989@126.com>

add unittest case for jobflow status and fix unittest case
2025-03-14 09:33:39 +08:00
feyounger 870528bb97 [Refactor] controller cache deletePod logic,skip create
Signed-off-by: feyounger <1477865250@qq.com>
2025-03-13 10:59:04 +08:00
Volcano Bot f993e0fab6
Merge pull request #4082 from feyounger/4081
Optimized code
2025-03-12 15:47:46 +08:00
feyounger b32fb530d8 [Refactor] Optimize append operations for better performance
Signed-off-by: feyounger <1477865250@qq.com>
2025-03-12 13:15:41 +08:00
feyounger 7f3e01f43c [refactor] optimized code
Signed-off-by: feyounger <1477865250@qq.com>
2025-03-11 15:49:22 +08:00
SataQiu 638226de4d scheduler: consider the nominated node first when allocating Node for Pod
Signed-off-by: SataQiu <shidaqiu2018@gmail.com>
2025-03-11 10:49:47 +08:00
Volcano Bot 79a8e41ebd
Merge pull request #4067 from hwdef/remove-label-template
delete label tempalte
2025-03-10 21:46:42 +08:00
Volcano Bot 375adaec52
Merge pull request #4063 from SataQiu/fix-cache-20250307
Using TypedRateLimitingInterface instead of deprecated RateLimitingInterface
2025-03-10 21:44:45 +08:00
hwdef 4d2f46acab delete label tempalte
Signed-off-by: hwdef <hwdefcom@outlook.com>
2025-03-07 19:37:32 +08:00
Volcano Bot de430f5e79
Merge pull request #4042 from Monokaix/custom-plugin
fix custom plugin doc and build script
2025-03-07 17:57:43 +08:00
Monokaix 05e389d94a fix custom plugin doc and build script
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-03-07 15:24:59 +08:00
SataQiu 0f13c7a284 Using TypedRateLimitingInterface instead of deprecated RateLimitingInterface
Signed-off-by: SataQiu <shidaqiu2018@gmail.com>
2025-03-07 14:13:29 +08:00
Volcano Bot 5f72a837e4
Merge pull request #4018 from mahdikhashan/typo-in-development
[docs] improve the `development.md` document
2025-03-07 10:24:42 +08:00
Volcano Bot 98e5e20743
Merge pull request #4033 from zhutong196/fix-hierarchicalSubqueue-master
[bugfix]fix creating a hierarchical sub-queue will be rejected
2025-03-05 16:34:40 +08:00
Volcano Bot 62d54589ce
Merge pull request #4060 from SataQiu/fix-doc-20250304
Fix inaccurate statements in node-lock.md
2025-03-05 15:55:40 +08:00
Volcano Bot 17ff48d086
Merge pull request #4020 from mahdikhashan/typo-in-admin-job-go
fix typo in comment
2025-03-05 15:53:38 +08:00
zhutongtong e8f6a3b835 [bugfix]fix creating a hierarchical sub-queue will be rejected
Signed-off-by: zhutongtong <zhutongcloud@163.com>
2025-03-05 15:48:04 +08:00
Volcano Bot 9c93038e3b
Merge pull request #4058 from feyounger/dev
refactor: Optimized code
2025-03-05 14:07:41 +08:00
Volcano Bot 224cc4d3c6
Merge pull request #4009 from SataQiu/fix-backfill-20250213
scheduler: fix a bug where the job NodesFitErrors field is not updated when ssn.Allocate failed
2025-03-05 11:13:40 +08:00
SataQiu 80cf3b7b98 fix inaccurate statements in node-lock.md
Signed-off-by: SataQiu <shidaqiu2018@gmail.com>
2025-03-05 10:35:10 +08:00
feyounger 37bea93567 refactor: Optimized code
Signed-off-by: feyounger <1477865250@qq.com>
2025-03-05 09:31:58 +08:00
zedongh efa833ecec feat: support scalar resources metric
Signed-off-by: zedongh <248348907@qq.com>
2025-03-04 21:52:54 +08:00
Volcano Bot 2832f11cfc
Merge pull request #4059 from Monokaix/action
change to action cache v4
2025-03-04 21:30:40 +08:00
Volcano Bot a7b53e6d1d
Merge pull request #4045 from SataQiu/clean-20250226
Refactor: move DeviceName const into its own package
2025-03-04 20:36:40 +08:00
Monokaix 7bd8ea3aa8 change to action cache v4
Signed-off-by: Monokaix <changxuzheng@huawei.com>
2025-03-04 19:26:29 +08:00
Volcano Bot 6d432aa91c
Merge pull request #4053 from sfc-gh-raravena/patch-1
Improve overused messaging
2025-03-04 10:01:39 +08:00
Volcano Bot 9406e07786
Merge pull request #4034 from archlitchi/add-vgpu-docs
update document for volcano-vgpu feature
2025-03-03 14:14:39 +08:00
limengxuan 3e205dce78 update-docs
Signed-off-by: limengxuan <391013634@qq.com>
2025-03-03 11:37:52 +08:00
Ricardo Aravena f160d48b48 Improve overused messaging
Fixes: https://github.com/volcano-sh/volcano/issues/4048

Signed-off-by: Ricardo Aravena <ricardo.aravena@snowflake.com>
2025-03-01 02:43:45 +00:00
Volcano Bot cdfb83c160
Merge pull request #4029 from hansongChina/backfill-bugfix
fix: the problem that the pending tasks cannot be scheduled during the backfill action
2025-02-28 14:26:37 +08:00
Volcano Bot 8c1ef708d8
Merge pull request #4024 from mahdikhashan/typo-in-pkg-util
typo: change `configure` to `configuration`
2025-02-28 09:41:36 +08:00
SataQiu 65f4653d7f refactor: move DeviceName const into its own package
Signed-off-by: SataQiu <shidaqiu2018@gmail.com>
2025-02-27 11:42:43 +08:00
hansong d721b999ff fix: the problem that the pending tasks cannot be scheduled during the backfill action
Signed-off-by: hansong <252671213@qq.com>
2025-02-26 16:29:59 +08:00
Volcano Bot ef2e1aaa1e
Merge pull request #3938 from mahmut-Abi/patch-1
Update uninstall-volcano.sh
2025-02-26 10:29:34 +08:00
Volcano Bot a61fc90b8e
Merge pull request #4035 from co63oc/fix2
Fix typos
2025-02-25 22:12:31 +08:00
Volcano Bot 5d66534726
Merge pull request #4036 from mahdikhashan/improve-test-description
[docs]: fix passive tone
2025-02-25 19:43:34 +08:00
mahdikhashan 8984b90e28 fix passive tone
Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>

fix

Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>
2025-02-25 10:08:23 +01:00
Volcano Bot 9a1de2c208
Merge pull request #4013 from SataQiu/cle-20250214
scheduler: correct mismatched error message
2025-02-25 09:24:33 +08:00
Volcano Bot c41fdb13b8
Merge pull request #3994 from xieyanke/master
Fix: fix an issue where the wrong action name could not be ignored
2025-02-25 00:12:34 +08:00
mahdikhashan e86f687e50 fix typo in comment
Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>
2025-02-24 10:30:48 +01:00
co63oc 9ea274efdf Fix
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-02-24 16:07:25 +08:00
mahdikhashan 86f4f44a41 change configure to configuration
Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>
2025-02-21 11:20:21 +01:00
Volcano Bot d29ddf02fa
Merge pull request #3988 from weapons97/add-helm-values-plugins-dir
add helm values scheduler_plugins_dir
2025-02-21 18:19:29 +08:00
Volcano Bot dd2fdf1b30
Merge pull request #4012 from Wang-Kai/skip-unavailable-job
skip the jobs that have no tasks during the close session step in gang plugin
2025-02-21 16:37:30 +08:00
weipeng e92dba9150 add helm values scheduler_plugins_dir
Signed-off-by: weapons97 <weapons97@gmail.com>
2025-02-21 14:30:20 +08:00
Mahdi Khashan 84402f4007
Merge branch 'master' into typo-in-development
2025-02-19 09:22:09 +01:00
Volcano Bot 444a1abe3d
Merge pull request #4016 from mahdikhashan/fix-typo-in-readme
[DOCS] improve readme, visit to should be visit
2025-02-19 12:25:27 +08:00
mahdikhashan f67a5af294 improve the development.md document
Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>
2025-02-18 11:01:13 +01:00
mahdikhashan 4e88d3b2b1 improve readme, visit to should be visit
Signed-off-by: mahdikhashan <mahdikhashan1@gmail.com>
2025-02-18 10:31:18 +01:00
王凯 15ebb45eee skip unavailable jobs during the close session step #4011
Signed-off-by: 王凯 <wangkai05@bilibili.com>
2025-02-17 21:32:52 +08:00
mahmut 6bf844c7aa Update uninstall-volcano.sh
helm uninstall will not delete namespace.

Signed-off-by: mahmut <mahmut@uniontech.com>
2025-02-14 18:02:16 +08:00
SataQiu de50315b9c scheduler: correct mismatched error message
Signed-off-by: SataQiu <shidaqiu2018@gmail.com>
2025-02-14 17:44:14 +08:00
SataQiu f11afe24c9 scheduler: fix a bug where the job NodesFitErrors field is not updated when ssn.Allocate failed
Signed-off-by: SataQiu <shidaqiu2018@gmail.com>
2025-02-14 10:29:12 +08:00
Volcano Bot 6b5bf18a68
Merge pull request #4008 from co63oc/fix1
Fix typos
2025-02-13 19:24:20 +08:00
co63oc a78c8b492e Fix
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-02-13 08:26:54 +08:00
Volcano Bot e141bbce75
Merge pull request #4002 from JesseStutler/owner_update
Add JesseStutler as reviewer
2025-02-12 15:54:19 +08:00
Volcano Bot 25b5174cbf
Merge pull request #3906 from archlitchi/master
Dynamic-mig for volcano-vgpu design
2025-02-10 16:04:18 +08:00
xieyanke 4903810b18
Fix: fix an issue where the wrong action name could not be ignored
Signed-off-by: xieyanke <xieyanke007@gmail.com>
2025-02-08 18:01:59 +08:00
limengxuan e296d49669 update docs
Signed-off-by: limengxuan <391013634@qq.com>
2025-02-08 16:19:30 +08:00
JesseStutler e177ecbba4 Add JesseStutler as reviewer
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
2025-02-08 15:24:03 +08:00
HalfBuddhist b4a2ff9a55 fix(controller): add statefulset gc for podgroup.
fix(controller): Fix podgroup not created finally when rolling upgrade of the statefulset.
refactor(controller): Improve StatefulSet podgroup creation logic

Signed-off-by: HalfBuddhist <liuqingwei2019@gmail.com>
2025-02-08 15:21:22 +08:00
limengxuan 41b64e9bbd
Merge branch 'volcano-sh:master' into master
2025-02-08 14:52:19 +08:00
Volcano Bot bff75bbc0d
Merge pull request #2778 from wangyang0616/update_stale_bot_timeout
Extend the default timeout for stale
2025-02-06 11:09:06 +08:00
limengxuan fb47f10735
Merge branch 'master' into master
2025-01-07 11:25:30 +08:00
limengxuan 3e0ac7037b update design docs for dynamic-volcano
Signed-off-by: limengxuan <391013634@qq.com>
2025-01-07 10:41:22 +08:00
linuxfhy 04f631b106
Merge branch 'master' into dev_20241217_gpu_alloc_repeatedly
2024-12-23 09:20:39 +08:00
linuxfhy 5bb310bd79
Merge branch 'master' into dev_20241217_gpu_alloc_repeatedly
2024-12-20 10:48:16 +08:00
linuxfhy 97098ba7be
Merge branch 'master' into dev_20241217_gpu_alloc_repeatedly
2024-12-17 19:01:21 +08:00
linuxfhy 08eeb16f4b take gpu-number into consideration
fix:issue 3824

Signed-off-by: linuxfhy <linuxfhy@163.com>
2024-12-17 11:10:25 +08:00
wangyang d8626f9f37 Some problems have a long processing period. In order to reduce the explosion of notification information, the default timeout period of stale is extended.
Signed-off-by: wangyang <wangyang8126@gmail.com>
2023-05-22 09:32:34 +08:00
383 changed files with 28901 additions and 3624 deletions

View File

@ -1,24 +1,4 @@
#### What type of PR is this?
<!--
Add one of the following kinds:
/kind bug
/kind cleanup
/kind documentation
/kind feature
/kind failing-test
/kind flake
Optionally add one of the following areas, help us further classify and filter PRs:
/area scheduling
/area controllers
/area cli
/area dependency
/area webhooks
/area deploy
/area documentation
/area performance
/area test
-->
#### What this PR does / why we need it:

45
.github/dependabot.yml vendored Normal file
View File

@ -0,0 +1,45 @@
version: 2
updates:
  - package-ecosystem: gomod
    directory: /
    schedule:
      interval: weekly
    groups:
      k8s-libs:
        patterns:
          - "k8s.io/api"
          - "k8s.io/apiextensions-apiserver"
          - "k8s.io/apimachinery"
          - "k8s.io/apiserver"
          - "k8s.io/cli-runtime"
          - "k8s.io/client-go"
          - "k8s.io/cloud-provider"
          - "k8s.io/cluster-bootstrap"
          - "k8s.io/code-generator"
          - "k8s.io/component-base"
          - "k8s.io/component-helpers"
          - "k8s.io/controller-manager"
          - "k8s.io/cri-api"
          - "k8s.io/cri-client"
          - "k8s.io/csi-translation-lib"
          - "k8s.io/dynamic-resource-allocation"
          - "k8s.io/endpointslice"
          - "k8s.io/kube-aggregator"
          - "k8s.io/kube-controller-manager"
          - "k8s.io/kube-proxy"
          - "k8s.io/kube-scheduler"
          - "k8s.io/kubectl"
          - "k8s.io/kubelet"
          - "k8s.io/legacy-cloud-providers"
          - "k8s.io/metrics"
          - "k8s.io/mount-utils"
          - "k8s.io/node-api"
          - "k8s.io/pod-security-admission"
          - "k8s.io/sample-apiserver"
          - "k8s.io/sample-cli-plugin"
          - "k8s.io/sample-controller"
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly

50
.github/stale.yml vendored

@ -1,50 +0,0 @@
# Configuration for probot-stale - https://github.com/probot/stale
# Only issues or pull requests with all of these labels are check if stale. Defaults to `[]` (disabled)
onlyLabels: []
# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable.
# We want stale bot to notify us that something is stale so we can revisit it.
# If one issue is marked as 'reminder' by the reminder bot, we don't mark it as 'stale' again.
exemptLabels:
  # This label is hardcoded on remind bot (https://probot.github.io/apps/reminders/) and is used by remind bot when
  # issue is being reminded.
  - reminder
# Set to true to ignore issues in a project (defaults to false)
exemptProjects: false
# Set to true to ignore issues in a milestone (defaults to false)
exemptMilestones: false
# Set to true to ignore issues with an assignee (defaults to false)
exemptAssignees: false
# Label to use when marking as stale
staleLabel: lifecycle/stale
pull:
  daysUntilClose: 60
  daysUntilStale: 90
  markComment: >
    Hello 👋 Looks like there was no activity on this amazing PR for last 90 days.
    **Do you mind updating us on the status?** Is there anything we can help with? If you plan to still work on it, just comment on this PR or push a commit. Thanks! 🤗
    If there will be no activity for 60 days, this issue will be closed (we can always reopen a PR if you get back to this!).
  #unmarkComment: No need for unmark comment.
  closeComment: >
    Closing for now as there was no activity for last 60 days after marked as stale, let us know if you need this to be reopened! 🤗
issues:
  daysUntilClose: 60
  daysUntilStale: 90
  markComment: >
    Hello 👋 Looks like there was no activity on this issue for last 90 days.
    **Do you mind updating us on the status?** Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
    If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).
  #unmarkComment: No need for unmark comment.
  closeComment: >
    Closing for now as there was no activity for last 60 days after marked as stale, let us know if you need this to be reopened! 🤗
# Limit the number of actions per hour, from 1-30. Default is 30
limitPerRun: 30

View File

@ -16,9 +16,9 @@ jobs:
GOPATH: /home/runner/work/${{ github.repository }}
steps:
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- name: Checkout code
uses: actions/checkout@v3
@ -26,7 +26,7 @@ jobs:
fetch-depth: 0
path: ./src/github.com/${{ github.repository }}
- uses: actions/cache@v2
- uses: actions/cache@v4
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}

View File

@ -39,7 +39,7 @@ jobs:
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v1
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
@ -50,7 +50,7 @@ jobs:
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v1
uses: github/codeql-action/autobuild@v3
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
@ -64,4 +64,4 @@ jobs:
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v1
uses: github/codeql-action/analyze@v3

49
.github/workflows/e2e_dra.yml vendored Normal file
View File

@ -0,0 +1,49 @@
name: E2E DRA

on:
  push:
    branches:
      - master
    tags:
  pull_request:

jobs:
  e2e_dra:
    runs-on: ubuntu-24.04
    name: E2E about DRA
    timeout-minutes: 40
    steps:
      - name: Install Go
        uses: actions/setup-go@v5
        with:
          go-version: 1.23.x
      - name: Install musl
        run: |
          wget http://musl.libc.org/releases/musl-1.2.1.tar.gz
          tar -xf musl-1.2.1.tar.gz && cd musl-1.2.1
          ./configure
          make && sudo make install
      - uses: actions/cache@v4
        with:
          path: ~/go/pkg/mod
          key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
      - name: Install dependences
        run: |
          GO111MODULE="on" go install sigs.k8s.io/kind@v0.26.0
          curl -LO https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Run E2E Tests
        run: |
          export ARTIFACTS_PATH=${{ github.workspace }}/e2e-dra-logs
          make e2e-test-dra CC=/usr/local/musl/bin/musl-gcc
      - name: Upload e2e dra logs
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: volcano_e2e_dra_logs
          path: ${{ github.workspace }}/e2e-dra-logs

View File

@ -7,6 +7,10 @@ on:
tags:
pull_request:
permissions:
contents: read
actions: write
jobs:
e2e_parallel_jobs:
runs-on: ubuntu-24.04
@ -14,9 +18,9 @@ jobs:
timeout-minutes: 40
steps:
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- name: Install musl
run: |
@ -24,15 +28,15 @@ jobs:
tar -xf musl-1.2.1.tar.gz && cd musl-1.2.1
./configure
make && sudo make install
- uses: actions/cache@v2
- uses: actions/cache@v4
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
- name: Install dependences
run: |
GO111MODULE="on" go install sigs.k8s.io/kind@v0.24.0
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.31.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
GO111MODULE="on" go install sigs.k8s.io/kind@v0.26.0
curl -LO https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
- name: Checkout code
uses: actions/checkout@v3

View File

@ -11,12 +11,12 @@ jobs:
e2e_scheduling_actions:
runs-on: ubuntu-24.04
name: E2E about Scheduling Actions
timeout-minutes: 40
timeout-minutes: 50
steps:
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- name: Install musl
run: |
@ -24,15 +24,15 @@ jobs:
tar -xf musl-1.2.1.tar.gz && cd musl-1.2.1
./configure
make && sudo make install
- uses: actions/cache@v2
- uses: actions/cache@v4
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
- name: Install dependences
run: |
GO111MODULE="on" go install sigs.k8s.io/kind@v0.24.0
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.31.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
GO111MODULE="on" go install sigs.k8s.io/kind@v0.26.0
curl -LO https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
- name: Checkout code
uses: actions/checkout@v3

View File

@ -11,12 +11,12 @@ jobs:
e2e_scheduling_basic:
runs-on: ubuntu-24.04
name: E2E about Basic Scheduling
timeout-minutes: 40
timeout-minutes: 50
steps:
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- name: Install musl
run: |
@ -24,15 +24,15 @@ jobs:
tar -xf musl-1.2.1.tar.gz && cd musl-1.2.1
./configure
make && sudo make install
- uses: actions/cache@v2
- uses: actions/cache@v4
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
- name: Install dependences
run: |
GO111MODULE="on" go install sigs.k8s.io/kind@v0.24.0
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.31.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
GO111MODULE="on" go install sigs.k8s.io/kind@v0.26.0
curl -LO https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
- name: Checkout code
uses: actions/checkout@v3

View File

@ -11,12 +11,12 @@ jobs:
e2e_sequence:
runs-on: ubuntu-24.04
name: E2E about Sequence
timeout-minutes: 40
timeout-minutes: 50
steps:
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- name: Install musl
run: |
@ -24,15 +24,15 @@ jobs:
tar -xf musl-1.2.1.tar.gz && cd musl-1.2.1
./configure
make && sudo make install
- uses: actions/cache@v2
- uses: actions/cache@v4
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
- name: Install dependences
run: |
GO111MODULE="on" go install sigs.k8s.io/kind@v0.24.0
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.31.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
GO111MODULE="on" go install sigs.k8s.io/kind@v0.26.0
curl -LO https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
- name: Checkout code
uses: actions/checkout@v3

View File

@ -10,7 +10,7 @@ on:
jobs:
k8s-integration-tests:
name: "E2E about Spark Integration test"
runs-on: ubuntu-20.04
runs-on: ubuntu-24.04
steps:
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@main
@ -40,7 +40,7 @@ jobs:
ref: branch-3.4
path: ${{ github.workspace }}/spark
- name: Cache Scala, SBT and Maven
uses: actions/cache@v3
uses: actions/cache@v4
with:
path: |
build/apache-maven-*
@ -51,24 +51,24 @@ jobs:
restore-keys: |
build-
- name: Cache Coursier local repository
uses: actions/cache@v3
uses: actions/cache@v4
with:
path: ~/.cache/coursier
key: k8s-integration-coursier-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
restore-keys: |
k8s-integration-coursier-
- name: Install Java 8
uses: actions/setup-java@v3
uses: actions/setup-java@v4
with:
distribution: temurin
java-version: 8
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v2
uses: docker/setup-buildx-action@v3
- name: start minikube
run: |
# Use pre-install minikube

View File

@ -11,12 +11,12 @@ jobs:
e2e_vcctl:
runs-on: ubuntu-24.04
name: E2E about Volcano CLI
timeout-minutes: 20
timeout-minutes: 50
steps:
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- name: Install musl
run: |
@ -24,15 +24,15 @@ jobs:
tar -xf musl-1.2.1.tar.gz && cd musl-1.2.1
./configure
make && sudo make install
- uses: actions/cache@v2
- uses: actions/cache@v4
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
- name: Install dependences
run: |
GO111MODULE="on" go install sigs.k8s.io/kind@v0.24.0
curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.31.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
GO111MODULE="on" go install sigs.k8s.io/kind@v0.26.0
curl -LO https://dl.k8s.io/release/v1.32.0/bin/linux/amd64/kubectl && sudo install kubectl /usr/local/bin/kubectl
- name: Checkout code
uses: actions/checkout@v3

View File

@ -10,9 +10,9 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/setup-go@v4
- uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- run: go version
# Runs a set of commands to initialize and analyze with FOSSA
- name: run FOSSA analysis


@ -11,12 +11,12 @@ jobs:
licenses-lint:
name: Licenses Lint
timeout-minutes: 40
runs-on: ubuntu-22.04
runs-on: ubuntu-24.04
steps:
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.21.x
go-version: 1.23.x
- name: Checkout code
uses: actions/checkout@v3
- name: generate license mirror


@ -16,9 +16,9 @@ jobs:
GOPATH: /home/runner/work/${{ github.repository }}
steps:
- name: Install Go
uses: actions/setup-go@v4
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.23.x
- name: Install musl
run: |
@ -44,7 +44,7 @@ jobs:
echo "TAG=latest" >> $GITHUB_ENV
echo "RELEASE_VER=latest" >> $GITHUB_ENV
- uses: actions/cache@v2
- uses: actions/cache@v4
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
@ -53,12 +53,12 @@ jobs:
run: |
make verify
make TAG=${{ env.TAG }} generate-yaml
make verify-generated-yaml
make RELEASE_TAG=${{ env.TAG }} verify-generated-yaml
sudo make unit-test
working-directory: ./src/github.com/${{ github.repository }}
- name: Login to DockerHub
uses: docker/login-action@v1
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}


@ -33,7 +33,7 @@ jobs:
persist-credentials: false
- name: "Run analysis"
uses: ossf/scorecard-action@0864cf19026789058feabb7e87baa5f140aac736 # v2.3.1
uses: ossf/scorecard-action@f49aabe0b5af0936a0987cfb85d86b75731b0186 # v2.4.1
with:
results_file: results.sarif
results_format: sarif
@ -55,9 +55,9 @@ jobs:
# Upload the results as artifacts (optional). Commenting out will disable uploads of run results in SARIF
# format to the repository Actions tab.
- name: "Upload artifact"
uses: actions/upload-artifact@97a0fba1372883ab732affbe8f94b823f91727db # v3.pre.node20
uses: actions/upload-artifact@v4.6.2
with:
name: SARIF file
name: sarif_file
path: results.sarif
retention-days: 5

.github/workflows/stale.yaml

@ -0,0 +1,47 @@
name: Mark and Close Stale Issues and PRs
on:
schedule:
- cron: '0 0 * * *' # runs every day at midnight UTC
jobs:
stale:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- name: Mark and close stale issues and PRs
uses: actions/stale@5bef64f19d7facfb25b37b414482c7164d639639 #v9.1.0
with:
remove-stale-when-updated: true
stale-issue-label: 'lifecycle/stale'
exempt-issue-labels: 'kind/feature, kind/bug, help wanted, good first issue'
exempt-pr-labels: ''
days-before-stale: 180
days-before-close: 90
stale-issue-message: >
Hello 👋 Looks like there was no activity on this issue for the last 180 days.
**Do you mind updating us on the status?** Is this still reproducible or needed? If yes, just comment on this issue. Thanks! 🤗
If there is no activity for 90 days, this issue will be closed (we can always reopen an issue if we need!).
close-issue-message: >
Closing for now as there was no activity for the last 90 days after being marked as stale. Let us know if you need this to be reopened! 🤗
stale-pr-message: >
Hello 👋 Looks like there was no activity on this amazing PR for the last 180 days.
**Do you mind updating us on the status?** Is there anything we can help with? If you plan to still work on it, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity for 90 days, this PR will be closed (we can always reopen a PR if you get back to this!).
close-pr-message: >
Closing for now as there was no activity for the last 90 days after being marked as stale. Let us know if you need this to be reopened! 🤗
exempt-all-milestones: false
exempt-all-assignees: false
operations-per-run: 30


@ -69,7 +69,7 @@ linters-settings:
depguard:
rules:
main:
list-mode: strict
list-mode: original
deny:
- pkg: "k8s.io/klog$"
desc: "k8s.io/klog is deprecated, use k8s.io/klog/v2 instead"

CITATION.cff

@ -0,0 +1,24 @@
cff-version: 1.2.0
message: "If you use Volcano in your work, please cite it using the information below. This helps us demonstrate the project's impact."
title: "Volcano: A Cloud Native Batch System"
authors:
- given-names: "Klaus"
family-names: "Ma"
- given-names: "Kevin"
family-names: "Wang"
- name: "The Volcano Community & Contributors"
type: entity
repository-code: "https://github.com/volcano-sh/volcano"
url: "https://volcano.sh/"
keywords:
- "batch system"
- "kubernetes"
- "scheduling"
- "cncf"
- "high-performance computing"
- "big data"
- "artificial intelligence"
- "machine learning"
license: "Apache-2.0"
notes: "For a specific version, please find the release tag or use the general project information. A comprehensive list of releases can be found at: https://github.com/volcano-sh/volcano/releases"


@ -19,6 +19,7 @@ IMAGE_PREFIX=volcanosh
CRD_OPTIONS ?= "crd:crdVersions=v1,generateEmbeddedObjectMeta=true"
CRD_OPTIONS_EXCLUDE_DESCRIPTION=${CRD_OPTIONS}",maxDescLen=0"
CC ?= "gcc"
MUSL_CC ?= "/usr/local/musl/bin/musl-gcc"
SUPPORT_PLUGINS ?= "no"
CRD_VERSION ?= v1
BUILDX_OUTPUT_TYPE ?= "docker"
@ -73,7 +74,7 @@ init:
vc-scheduler: init
if [ ${SUPPORT_PLUGINS} = "yes" ];then\
CC=${CC} CGO_ENABLED=1 go build -ldflags ${LD_FLAGS} -o ${BIN_DIR}/vc-scheduler ./cmd/scheduler;\
CC=${MUSL_CC} CGO_ENABLED=1 go build -ldflags ${LD_FLAGS_CGO} -o ${BIN_DIR}/vc-scheduler ./cmd/scheduler;\
else\
CC=${CC} CGO_ENABLED=0 go build -ldflags ${LD_FLAGS} -o ${BIN_DIR}/vc-scheduler ./cmd/scheduler;\
fi;
@ -105,7 +106,7 @@ generate-code:
manifests: controller-gen
go mod vendor
# volcano crd base
$(CONTROLLER_GEN) $(CRD_OPTIONS) paths="./vendor/volcano.sh/apis/pkg/apis/scheduling/v1beta1;./vendor/volcano.sh/apis/pkg/apis/batch/v1alpha1;./vendor/volcano.sh/apis/pkg/apis/bus/v1alpha1;./vendor/volcano.sh/apis/pkg/apis/nodeinfo/v1alpha1" output:crd:artifacts:config=config/crd/volcano/bases
$(CONTROLLER_GEN) $(CRD_OPTIONS) paths="./vendor/volcano.sh/apis/pkg/apis/scheduling/v1beta1;./vendor/volcano.sh/apis/pkg/apis/batch/v1alpha1;./vendor/volcano.sh/apis/pkg/apis/bus/v1alpha1;./vendor/volcano.sh/apis/pkg/apis/nodeinfo/v1alpha1;./vendor/volcano.sh/apis/pkg/apis/topology/v1alpha1" output:crd:artifacts:config=config/crd/volcano/bases
# generate volcano job crd yaml without description to avoid yaml size limit when using `kubectl apply`
$(CONTROLLER_GEN) $(CRD_OPTIONS_EXCLUDE_DESCRIPTION) paths="./vendor/volcano.sh/apis/pkg/apis/batch/v1alpha1" output:crd:artifacts:config=config/crd/volcano/bases
# jobflow crd base
@ -142,6 +143,9 @@ e2e-test-vcctl: vcctl images
e2e-test-stress: images
E2E_TYPE=STRESS ./hack/run-e2e-kind.sh
e2e-test-dra: images
E2E_TYPE=DRA FEATURE_GATES="DynamicResourceAllocation=true" ./hack/run-e2e-kind.sh
generate-yaml: init manifests
./hack/generate-yaml.sh TAG=${RELEASE_VER} CRD_VERSION=${CRD_VERSION}
@ -190,7 +194,7 @@ ifeq (, $(shell which controller-gen))
CONTROLLER_GEN_TMP_DIR=$$(mktemp -d) ;\
cd $$CONTROLLER_GEN_TMP_DIR ;\
go mod init tmp ;\
GOOS=${OS} go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.16.4 ;\
GOOS=${OS} go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.17.0 ;\
rm -rf $$CONTROLLER_GEN_TMP_DIR ;\
}
CONTROLLER_GEN=$(GOBIN)/controller-gen
@ -201,6 +205,7 @@ endif
update-development-yaml:
make generate-yaml TAG=latest RELEASE_DIR=installer
mv installer/volcano-latest.yaml installer/volcano-development.yaml
mv installer/volcano-agent-latest.yaml installer/volcano-agent-development.yaml
mod-download-go:
@-GOFLAGS="-mod=readonly" find -name go.mod -execdir go mod download \;
@ -211,7 +216,7 @@ mod-download-go:
.PHONY: mirror-licenses
mirror-licenses: mod-download-go; \
GOOS=${OS} go install istio.io/tools/cmd/license-lint@1.19.7; \
GOOS=${OS} go install istio.io/tools/cmd/license-lint@1.25.0; \
cd licenses; \
rm -rf `ls ./ | grep -v LICENSE`; \
cd -; \


@ -5,7 +5,12 @@ GitSHA=`git rev-parse HEAD`
Date=`date "+%Y-%m-%d %H:%M:%S"`
RELEASE_VER=latest
OPEN_EULER_IMAGE_TAG ?= 22.03-lts-sp2
LD_FLAGS=" \
LD_FLAGS="\
-X '${REPO_PATH}/pkg/version.GitSHA=${GitSHA}' \
-X '${REPO_PATH}/pkg/version.Built=${Date}' \
-X '${REPO_PATH}/pkg/version.Version=${RELEASE_VER}'"
LD_FLAGS_CGO="\
-linkmode=external \
-X '${REPO_PATH}/pkg/version.GitSHA=${GitSHA}' \
-X '${REPO_PATH}/pkg/version.Built=${Date}' \
-X '${REPO_PATH}/pkg/version.Version=${RELEASE_VER}'"
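The `-X` entries in `LD_FLAGS` and `LD_FLAGS_CGO` overwrite package-level string variables at link time. A minimal sketch of the receiving side follows, assuming for brevity that the variables live in `package main` rather than in `pkg/version`. The names `GitSHA`, `Built`, and `Version` come from the flags above; `versionString` and the default values are hypothetical.

```go
package main

import "fmt"

// Defaults used when the binary is built without -ldflags "-X ..." overrides.
// The linker can only replace string variables, not constants.
var (
	GitSHA  = "unknown"
	Built   = "unknown"
	Version = "latest"
)

// versionString formats the build metadata (hypothetical helper).
func versionString() string {
	return fmt.Sprintf("version=%s git=%s built=%s", Version, GitSHA, Built)
}

func main() {
	fmt.Println(versionString())
}
```

Building this sketch with, for example, `go build -ldflags "-X 'main.Version=v1.12.0'"` would replace the `Version` default while leaving the other two untouched.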

OWNERS

@ -19,6 +19,7 @@ reviewers:
- Monokaix
- lowang-bh
- archlitchi
- JesseStutler
approvers:
- k82cn
- kevin-wangzefeng
@ -31,3 +32,4 @@ approvers:
- Monokaix
- lowang-bh
- hwdef
- wangyang0616


@ -14,20 +14,15 @@
[![Gurubase](https://img.shields.io/badge/Gurubase-Ask%20Volcano%20Guru-006BFF)](https://gurubase.io/g/volcano)
[Volcano](https://volcano.sh/) is a batch system built on Kubernetes. It provides a suite of mechanisms that are commonly required by
many classes of batch & elastic workload including: machine learning/deep learning, bioinformatics/genomics and
other "big data" applications. These types of applications typically run on generalized domain frameworks like
TensorFlow, Spark, Ray, PyTorch, MPI, etc, which Volcano integrates with.
Volcano builds upon a decade and a half of experience running a wide
variety of high performance workloads at scale using several systems
and platforms, combined with best-of-breed ideas and practices from
the open source community.
[Volcano](https://volcano.sh/) is a Kubernetes-native batch scheduling system, extending and enhancing the capabilities of the standard kube-scheduler. It provides a comprehensive set of features specifically designed to manage and optimize various batch and elastic workloads, including Artificial Intelligence (AI) / machine learning (ML) / deep learning (DL), bioinformatics / genomics, and other "Big Data" applications.
These workloads commonly leverage AI, Big Data, and HPC frameworks such as Spark, Flink, Ray, TensorFlow, PyTorch, Argo, MindSpore, PaddlePaddle, Kubeflow, MPI, Horovod, MXNet, KubeGene, and others, with which Volcano offers robust integration.
Volcano incorporates over fifteen years of collective experience in operating diverse high-performance workloads at scale across multiple systems and platforms. It combines proven best practices and innovative concepts from the open-source community to deliver a powerful and flexible scheduling solution.
As of 2025, Volcano has seen widespread adoption across numerous industries globally, including Internet/Cloud, Finance, Manufacturing, and Medical sectors. Many organizations and institutions are not only end-users but also active contributors to the project. Hundreds of contributors actively participate in code commits, pull request reviews, issue discussions, documentation updates, and design proposals. We encourage your participation in the ongoing development and growth of the Volcano project.
Until June 2021, Volcano has been widely used around the world at a variety of industries such as Internet/Cloud/Finance/
Manufacturing/Medical. More than 20 companies or institutions are not only end users but also active contributors. Hundreds
of contributors are taking active part in the code commit/PR review/issue discussion/docs update and design provision. We
are looking forward to your participation.
**NOTE**: the scheduler is built based on [kube-batch](https://github.com/kubernetes-sigs/kube-batch);
refer to [#241](https://github.com/volcano-sh/volcano/issues/241) and [#288](https://github.com/volcano-sh/volcano/pull/288) for more detail.
@ -49,7 +44,7 @@ Volcano is an incubating project of the [Cloud Native Computing Foundation](http
- [Batch Capability of Kubernetes Intro @ KubeCon 2019 NA](https://sched.co/Uajv)
- [Optimizing Knowledge Distillation Training With Volcano @ KubeCon 2021 EU](https://www.youtube.com/watch?v=cDPGmhVcj7Y&t=143s)
- [Exploration About Mixing Technology of Online Services and Offline Jobs Based On Volcano @ KubeCon 2021 China](https://www.youtube.com/watch?v=daqkUlT5ReY)
- [Volcano - Cloud Native Batch System for AI, BigData and HPC @ KubeCon 2022 EU](https://www.youtube.com/watch?v=wjy35HfIP_k)
- [Volcano - Cloud Native Batch System for AI, Big Data and HPC @ KubeCon 2022 EU](https://www.youtube.com/watch?v=wjy35HfIP_k)
- [How to Leverage Volcano to Improve the Resource Utilization of AI Pharmaceuticals, Autonomous Driving, and Smart Buildings @ KubeCon 2023 EU](https://www.youtube.com/watch?v=ujHDV5xteqU)
- [Run Your AI Workloads and Microservices on Kubernetes More Easily and Efficiently @ KubeCon 2023 China](https://www.youtube.com/watch?v=OO7zpyf7fgs)
- [Optimize LLM Workflows with Smart Infrastructure Enhanced by Volcano @ KubeCon 2024 China](https://www.youtube.com/watch?v=77Qn1-I-muQ)
@ -59,13 +54,37 @@ Volcano is an incubating project of the [Cloud Native Computing Foundation](http
## Ecosystem
- [spark-operator](https://www.kubeflow.org/docs/components/spark-operator/user-guide/volcano-integration/)
- [Spark Operator](https://www.kubeflow.org/docs/components/spark-operator/user-guide/volcano-integration/)
- [Native Spark](https://spark.apache.org/docs/3.5.0/running-on-kubernetes.html#using-volcano-as-customized-scheduler-for-spark-on-kubernetes)
- [Flink](https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/blob/master/docs/volcano_integration.md)
- [KubeRay](https://docs.ray.io/en/master/cluster/kubernetes/k8s-ecosystem/volcano.html)
- [PyTorch](https://github.com/volcano-sh/volcano/blob/master/docs/user-guide/how_to_use_pytorch_plugin.md)
- [TensorFlow](https://github.com/volcano-sh/volcano/tree/master/example/integrations/tensorflow)
- [kubeflow/training-operator](https://www.kubeflow.org/docs/components/training/user-guides/job-scheduling/)
- [kubeflow/arena](https://github.com/kubeflow/arena/blob/master/docs/training/volcanojob/volcanojob.md)
- [Horovod/MPI](https://github.com/volcano-sh/volcano/tree/master/example/integrations/mpi)
- [paddlepaddle](https://github.com/volcano-sh/volcano/tree/master/example/integrations/paddlepaddle)
- [cromwell](https://github.com/broadinstitute/cromwell/blob/develop/docs/backends/Volcano.md)
- [KubeRay](https://docs.ray.io/en/master/cluster/kubernetes/k8s-ecosystem/volcano.html)
- [MPI](https://github.com/volcano-sh/volcano/tree/master/example/integrations/mpi)
- [Horovod](https://github.com/volcano-sh/volcano/blob/master/example/kubecon-2019-china/horovod-sample/lm-horovod-tf-mnist-v0.5.yaml)
- [PaddlePaddle](https://github.com/volcano-sh/volcano/tree/master/example/integrations/paddlepaddle)
- [Cromwell](https://github.com/broadinstitute/cromwell/blob/develop/docs/backends/Volcano.md)
- [MindSpore](https://github.com/volcano-sh/volcano/tree/master/example/MindSpore-example)
- [MXNet](https://github.com/volcano-sh/volcano/tree/master/example/integrations/mxnet/train)
- [Argo](https://github.com/volcano-sh/volcano/tree/master/example/integrations/argo)
- [KubeGene](https://github.com/volcano-sh/kubegene)
## Use Cases
- [Why Spark chooses Volcano as built-in batch scheduler on Kubernetes?](https://www.cncf.io/blog/2022/06/30/why-spark-chooses-volcano-as-built-in-batch-scheduler-on-kubernetes/)
- [ING Bank: How Volcano empowers its big data analytics platform](https://www.cncf.io/blog/2023/02/21/ing-bank-how-volcano-empowers-its-big-data-analytics-platform/)
- [Using Volcano as a custom scheduler for Apache Spark on Amazon EMR on EKS](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/tutorial-volcano.html)
- [Deploy Azure Machine Learning extension on AKS or Arc Kubernetes cluster](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-kubernetes-extension?view=azureml-api-2&tabs=deploy-extension-with-cli)
- [Practical Tips for Preventing GPU Fragmentation for Volcano Scheduler](https://developer.nvidia.com/blog/practical-tips-for-preventing-gpu-fragmentation-for-volcano-scheduler/)
- [Using Volcano in Large-Scale, Distributed Offline Computing](https://volcano.sh/en/blog/ruitian2-en/)
- [OpenI-Octopus: How to Avoid Resource Preemption in Kubernetes Clusters](https://volcano.sh/en/blog/pengcheng-en/)
- [How Does Volcano Empower a Content Recommendation Engine in Xiaohongshu](https://volcano.sh/en/blog/xiaohongshu-en/)
- [How Ruitian Used Volcano to Run Large-Scale Offline HPC Jobs](https://volcano.sh/en/blog/ruitian-en/)
- [Integrating Volcano into the Leinao Cloud OS](https://volcano.sh/en/blog/leinao-en/)
- [HPC on Volcano: How Containers Support HPC Applications in the Meteorological Industry](https://volcano.sh/en/blog/hpc-en/)
- [iQIYI: Volcano-based Cloud Native Migration Practices](https://volcano.sh/en/blog/aiqiyi-en/)
- [PaddlePaddle Distributed Training on Volcano](https://volcano.sh/en/blog/paddlepaddle-en/)
## Quick Start Guide
@ -94,7 +113,6 @@ Enjoy! Volcano will create the following resources in `volcano-system` namespace
```
NAME READY STATUS RESTARTS AGE
pod/volcano-admission-5bd5756f79-dnr4l 1/1 Running 0 96s
pod/volcano-admission-init-4hjpx 0/1 Completed 0 96s
pod/volcano-controllers-687948d9c8-nw4b4 1/1 Running 0 96s
pod/volcano-scheduler-94998fc64-4z8kh 1/1 Running 0 96s
@ -118,7 +136,7 @@ job.batch/volcano-admission-init 1/1 48s 96s
### Install via helm
To install official release, please visit to [helm-charts](https://github.com/volcano-sh/helm-charts) for details.
To install official release, please visit [helm-charts](https://github.com/volcano-sh/helm-charts) for details.
```bash
helm repo add volcano-sh https://volcano-sh.github.io/helm-charts
@ -144,6 +162,10 @@ If you don't have a kubernetes cluster, try one-click install from code base:
This method is currently available only for x86_64.
### Install volcano agent
Please follow the guide [Volcano Agent](https://volcano.sh/en/docs/colocation) to install the Volcano agent.
### Install monitoring system
If you want Prometheus and the Grafana Volcano dashboard after Volcano is installed, try the following commands:
@ -159,14 +181,16 @@ Please follow the guide [Volcano Dashboard](https://github.com/volcano-sh/dashbo
## Kubernetes compatibility
| | Kubernetes 1.17 | Kubernetes 1.18 | Kubernetes 1.19 | Kubernetes 1.20 | Kubernetes 1.21 | Kubernetes 1.22 | Kubernetes 1.23 | Kubernetes 1.24 | Kubernetes 1.25 | Kubernetes 1.26 | Kubernetes 1.27 | Kubernetes 1.28 | Kubernetes 1.29 |Kubernetes 1.30 |Kubernetes 1.31 |
|-----------------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|---------------|---------------|
| Volcano v1.6 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - | - | - | - | - | - |- |- |
| Volcano v1.7 | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |_ |_ |
| Volcano v1.8 | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |- |_ |
| Volcano v1.9 | - | - | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |- |_ |
| Volcano v1.10 | - | - | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |✓ |_ |
| Volcano HEAD (master) | - | - | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |✓ |✓ |
| | Kubernetes 1.17 | Kubernetes 1.18 | Kubernetes 1.19 | Kubernetes 1.20 | Kubernetes 1.21 | Kubernetes 1.22 | Kubernetes 1.23 | Kubernetes 1.24 | Kubernetes 1.25 | Kubernetes 1.26 | Kubernetes 1.27 | Kubernetes 1.28 | Kubernetes 1.29 |Kubernetes 1.30 |Kubernetes 1.31 |Kubernetes 1.32 |
|-----------------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|---------------|--------------|---------------|
| Volcano v1.6 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - | - | - | - | - | - |- |- |- |
| Volcano v1.7 | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |_ |_ |- |
| Volcano v1.8 | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |- |_ |- |
| Volcano v1.9 | - | - | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |- |_ |- |
| Volcano v1.10 | - | - | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |✓ |_ |- |
| Volcano v1.11 | - | - | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |✓ |✓ |- |
| Volcano v1.12 | - | - | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |✓ |✓ |✓ |
| Volcano HEAD (master) | - | - | - | - | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |✓ |✓ |✓ |
Key:
* `✓` Volcano and the Kubernetes version are exactly compatible.
@ -174,6 +198,19 @@ Key:
* `-` The Kubernetes version has features or API objects that Volcano can't use.
## Citing Volcano
If Volcano helps your research, we appreciate your citations. Here is the BibTeX entry:
```bibtex
@misc{volcano2025,
title={Volcano: A Cloud Native Batch System},
author={Klaus Ma and Kevin Wang and others},
year={2025},
howpublished={\url{https://github.com/volcano-sh/volcano}},
}
```
## Meeting
Community weekly meeting for Asia: 15:00 - 16:00 (UTC+8) Friday. ([Convert to your timezone.](https://www.thetimezoneconverter.com/?t=10%3A00&tz=GMT%2B8&))
@ -196,4 +233,4 @@ If you have any question, feel free to reach out to us in the following ways:
[Mailing List](https://groups.google.com/forum/#!forum/volcano-sh)
Wechat: Add WeChat account `k8s2222` (华为云小助手2号) to let her pull you into the group.
WeChat: Please add WeChat account `k8s2222` and request an invitation to the group chat.


@ -14,4 +14,5 @@
# limitations under the License.
set -e
helm uninstall volcano --namespace volcano-system
helm uninstall volcano --namespace volcano-system
kubectl delete namespace volcano-system


@ -32,6 +32,7 @@ import (
"k8s.io/controller-manager/pkg/clientbuilder"
"k8s.io/klog/v2"
"volcano.sh/apis/pkg/apis/helpers"
"volcano.sh/volcano/cmd/agent/app/options"
"volcano.sh/volcano/pkg/agent/healthcheck"
"volcano.sh/volcano/pkg/agent/utils"
@ -81,8 +82,11 @@ func RunServer(checker healthcheck.HealthChecker, address string, port int) {
mux.HandleFunc("/healthz", checker.HealthCheck)
mux.Handle("/metrics", promhttp.Handler())
s := &http.Server{
Addr: net.JoinHostPort(address, strconv.Itoa(port)),
Handler: mux,
Addr: net.JoinHostPort(address, strconv.Itoa(port)),
Handler: mux,
ReadHeaderTimeout: helpers.DefaultReadHeaderTimeout,
ReadTimeout: helpers.DefaultReadTimeout,
WriteTimeout: helpers.DefaultWriteTimeout,
}
if err := s.ListenAndServe(); err != nil {
klog.Fatalf("failed to start health check server: %v", err)


@ -204,7 +204,7 @@ func (s *ServerOption) ParseCAFiles(decryptFunc DecryptFunc) error {
return err
}
// users can add one function to decrypt tha data by their own way if CA data is encrypted
// users can add one function to decrypt the data by their own way if CA data is encrypted
if decryptFunc != nil {
return decryptFunc(s)
}


@ -148,6 +148,13 @@ func TestCheckControllers(t *testing.T) {
},
expectErr: fmt.Errorf("controllers option %s cannot have both '-' and '+' prefixes", "-job-controller"),
},
{
name: "fail case: use duplicate job-controller",
serverOption: &ServerOption{
Controllers: []string{"+job-controller", "job-controller"},
},
expectErr: fmt.Errorf("controllers option %s cannot have both '-' and '+' prefixes", "job-controller"),
},
{
name: "fail case: use * but combined with other input",
serverOption: &ServerOption{


@ -58,8 +58,17 @@ func Run(opt *options.ServerOption) error {
if opt.EnableMetrics {
go func() {
http.Handle("/metrics", commonutil.PromHandler())
klog.Fatalf("Prometheus Http Server failed %s", http.ListenAndServe(opt.ListenAddress, nil))
mux := http.NewServeMux()
mux.Handle("/metrics", commonutil.PromHandler())
server := &http.Server{
Addr: opt.ListenAddress,
Handler: mux,
ReadHeaderTimeout: helpers.DefaultReadHeaderTimeout,
ReadTimeout: helpers.DefaultReadTimeout,
WriteTimeout: helpers.DefaultWriteTimeout,
}
klog.Fatalf("Prometheus Http Server failed: %s", server.ListenAndServe())
}()
}


@ -34,6 +34,7 @@ import (
"volcano.sh/volcano/cmd/controller-manager/app/options"
"volcano.sh/volcano/pkg/controllers/framework"
_ "volcano.sh/volcano/pkg/controllers/garbagecollector"
_ "volcano.sh/volcano/pkg/controllers/hypernode"
_ "volcano.sh/volcano/pkg/controllers/job"
_ "volcano.sh/volcano/pkg/controllers/jobflow"
_ "volcano.sh/volcano/pkg/controllers/jobtemplate"


@ -39,5 +39,5 @@ func (o *Options) AddFlags(c *cobra.Command) {
"bandwidth usage of online jobs exceeds the defined threshold(online-bandwidth-watermark)")
c.Flags().StringVar(&o.OfflineHighBandwidth, utils.OfflineHighBandwidthKey, o.OfflineHighBandwidth, "offline-high-bandwidth is the maximum amount of network bandwidth that can be used by offline jobs when the "+
"bandwidth usage of online jobs does not reach the defined threshold (online-bandwidth-watermark)")
c.Flags().BoolVar(&o.EnableNetworkQoS, utils.EnableNetworkQoS, o.EnableNetworkQoS, "enbale networkqos")
c.Flags().BoolVar(&o.EnableNetworkQoS, utils.EnableNetworkQoS, o.EnableNetworkQoS, "enable networkqos")
}


@ -32,6 +32,7 @@ import (
const (
defaultSchedulerName = "volcano"
defaultSchedulerPeriod = time.Second
defaultResyncPeriod = 0
defaultQueue = "default"
defaultListenAddress = ":8080"
defaultHealthzAddress = ":11251"
@ -60,6 +61,7 @@ type ServerOption struct {
SchedulerNames []string
SchedulerConf string
SchedulePeriod time.Duration
ResyncPeriod time.Duration
// leaderElection defines the configuration of leader election.
LeaderElection config.LeaderElectionConfiguration
// Deprecated: use ResourceNamespace instead.
@ -67,6 +69,7 @@ type ServerOption struct {
DefaultQueue string
PrintVersion bool
EnableMetrics bool
EnablePprof bool
ListenAddress string
EnablePriorityClass bool
EnableCSIStorage bool
@ -112,12 +115,13 @@ func (s *ServerOption) AddFlags(fs *pflag.FlagSet) {
"File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated "+
"after server cert).")
fs.StringVar(&s.KeyFile, "tls-private-key-file", s.KeyFile, "File containing the default x509 private key matching --tls-cert-file.")
fs.StringVar(&s.LockObjectNamespace, "lock-object-namespace", defaultLockObjectNamespace, "Define the namespace of the lock object; it is volcano-system by default.")
fs.StringVar(&s.LockObjectNamespace, "lock-object-namespace", "", "Define the namespace of the lock object; it is volcano-system by default.")
fs.MarkDeprecated("lock-object-namespace", "This flag is deprecated and will be removed in a future release. Please use --leader-elect-resource-namespace instead.")
// volcano scheduler will ignore pods with scheduler names other than specified with the option
fs.StringArrayVar(&s.SchedulerNames, "scheduler-name", []string{defaultSchedulerName}, "vc-scheduler will handle pods whose .spec.SchedulerName is same as scheduler-name")
fs.StringVar(&s.SchedulerConf, "scheduler-conf", "", "The absolute path of scheduler configuration file")
fs.DurationVar(&s.SchedulePeriod, "schedule-period", defaultSchedulerPeriod, "The period between each scheduling cycle")
fs.DurationVar(&s.ResyncPeriod, "resync-period", defaultResyncPeriod, "The default resync period for k8s native informer factory")
fs.StringVar(&s.DefaultQueue, "default-queue", defaultQueue, "The default queue name of the job")
fs.BoolVar(&s.PrintVersion, "version", false, "Show version and quit")
fs.StringVar(&s.ListenAddress, "listen-address", defaultListenAddress, "The address to listen on for HTTP requests.")
@ -133,14 +137,15 @@ func (s *ServerOption) AddFlags(fs *pflag.FlagSet) {
// Minimum percentage of nodes to find and score
fs.Int32Var(&s.MinPercentageOfNodesToFind, "minimum-percentage-nodes-to-find", defaultMinPercentageOfNodesToFind, "The minimum percentage of nodes to find and score")
// The percentage of nodes that would be scored in each scheduling cycle; if <= 0, an adpative percentage will be calcuated
fs.Int32Var(&s.PercentageOfNodesToFind, "percentage-nodes-to-find", defaultPercentageOfNodesToFind, "The percentage of nodes to find and score, if <=0 will be calcuated based on the cluster size")
// The percentage of nodes that would be scored in each scheduling cycle; if <= 0, an adaptive percentage will be calculated
fs.Int32Var(&s.PercentageOfNodesToFind, "percentage-nodes-to-find", defaultPercentageOfNodesToFind, "The percentage of nodes to find and score, if <=0 will be calculated based on the cluster size")
fs.StringVar(&s.PluginsDir, "plugins-dir", defaultPluginsDir, "vc-scheduler will load custom plugins which are in this directory")
fs.BoolVar(&s.EnableCSIStorage, "csi-storage", false,
"Enable tracking of available storage capacity that CSI drivers provide; it is false by default")
fs.BoolVar(&s.EnableHealthz, "enable-healthz", false, "Enable the health check; it is false by default")
fs.BoolVar(&s.EnableMetrics, "enable-metrics", false, "Enable the metrics function; it is false by default")
fs.BoolVar(&s.EnablePprof, "enable-pprof", false, "Enable the pprof endpoint; it is false by default")
fs.StringSliceVar(&s.NodeSelector, "node-selector", nil, "volcano only works with the labeled nodes, like: --node-selector=volcano.sh/role:train --node-selector=volcano.sh/role:serving")
fs.BoolVar(&s.EnableCacheDumper, "cache-dumper", true, "Enable the cache dumper, it's true by default")
fs.StringVar(&s.CacheDumpFileDir, "cache-dump-dir", "/tmp", "The target directory where the JSON file is written when dumping cache info")


@ -45,6 +45,7 @@ func TestAddFlags(t *testing.T) {
args := []string{
"--schedule-period=5m",
"--resync-period=0",
"--priority-class=false",
"--cache-dumper=false",
"--leader-elect-lease-duration=60s",
@ -58,6 +59,7 @@ func TestAddFlags(t *testing.T) {
expected := &ServerOption{
SchedulerNames: []string{defaultSchedulerName},
SchedulePeriod: 5 * time.Minute,
ResyncPeriod: 0,
LeaderElection: config.LeaderElectionConfiguration{
LeaderElect: true,
LeaseDuration: metav1.Duration{Duration: 60 * time.Second},
@ -66,9 +68,8 @@ func TestAddFlags(t *testing.T) {
ResourceLock: resourcelock.LeasesResourceLock,
ResourceNamespace: defaultLockObjectNamespace,
},
LockObjectNamespace: defaultLockObjectNamespace,
DefaultQueue: defaultQueue,
ListenAddress: defaultListenAddress,
DefaultQueue: defaultQueue,
ListenAddress: defaultListenAddress,
KubeClientOptions: kube.ClientOptions{
Master: "",
KubeConfig: "",


@ -20,10 +20,10 @@ import (
"context"
"fmt"
"net/http"
"net/http/pprof"
"os"
"volcano.sh/apis/pkg/apis/helpers"
"volcano.sh/volcano/cmd/scheduler/app/options"
"volcano.sh/volcano/pkg/kube"
"volcano.sh/volcano/pkg/scheduler"
@ -69,11 +69,8 @@ func Run(opt *options.ServerOption) error {
panic(err)
}
if opt.EnableMetrics {
go func() {
http.Handle("/metrics", commonutil.PromHandler())
klog.Fatalf("Prometheus Http Server failed %s", http.ListenAndServe(opt.ListenAddress, nil))
}()
if opt.EnableMetrics || opt.EnablePprof {
go startMetricsServer(opt)
}
if opt.EnableHealthz {
@ -142,3 +139,31 @@ func Run(opt *options.ServerOption) error {
})
return fmt.Errorf("lost lease")
}
func startMetricsServer(opt *options.ServerOption) {
mux := http.NewServeMux()
if opt.EnableMetrics {
mux.Handle("/metrics", commonutil.PromHandler())
}
if opt.EnablePprof {
mux.HandleFunc("/debug/pprof/", pprof.Index)
mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
}
server := &http.Server{
Addr: opt.ListenAddress,
Handler: mux,
ReadHeaderTimeout: helpers.DefaultReadHeaderTimeout,
ReadTimeout: helpers.DefaultReadTimeout,
WriteTimeout: helpers.DefaultWriteTimeout,
}
if err := server.ListenAndServe(); err != nil {
klog.Errorf("start metrics/pprof http server failed: %v", err)
}
}


@ -29,9 +29,6 @@ import (
componentbaseoptions "k8s.io/component-base/config/options"
"k8s.io/klog/v2"
// init pprof server
_ "net/http/pprof"
"volcano.sh/volcano/cmd/scheduler/app"
"volcano.sh/volcano/cmd/scheduler/app/options"
commonutil "volcano.sh/volcano/pkg/util"


@ -19,6 +19,7 @@ package options
import (
"fmt"
"os"
"time"
"github.com/spf13/pflag"
@ -26,31 +27,33 @@ import (
)
const (
defaultSchedulerName = "volcano"
defaultQPS = 50.0
defaultBurst = 100
defaultEnabledAdmission = "/jobs/mutate,/jobs/validate,/podgroups/mutate,/pods/validate,/pods/mutate,/queues/mutate,/queues/validate"
defaultHealthzAddress = ":11251"
defaultSchedulerName = "volcano"
defaultQPS = 50.0
defaultBurst = 100
defaultEnabledAdmission = "/jobs/mutate,/jobs/validate,/podgroups/mutate,/pods/validate,/pods/mutate,/queues/mutate,/queues/validate"
defaultHealthzAddress = ":11251"
defaultGracefulShutdownTime = time.Second * 30
)
// Config admission-controller server config.
type Config struct {
KubeClientOptions kube.ClientOptions
CertFile string
KeyFile string
CaCertFile string
CertData []byte
KeyData []byte
CaCertData []byte
ListenAddress string
Port int
PrintVersion bool
WebhookName string
WebhookNamespace string
SchedulerNames []string
WebhookURL string
ConfigPath string
EnabledAdmission string
KubeClientOptions kube.ClientOptions
CertFile string
KeyFile string
CaCertFile string
CertData []byte
KeyData []byte
CaCertData []byte
ListenAddress string
Port int
PrintVersion bool
WebhookName string
WebhookNamespace string
SchedulerNames []string
WebhookURL string
ConfigPath string
EnabledAdmission string
GracefulShutdownTime time.Duration
EnableHealthz bool
// HealthzBindAddress is the IP address and port for the health check server to serve on
@ -88,6 +91,7 @@ func (c *Config) AddFlags(fs *pflag.FlagSet) {
fs.StringVar(&c.ConfigPath, "admission-conf", "", "The configmap file of this webhook")
fs.BoolVar(&c.EnableHealthz, "enable-healthz", false, "Enable the health check; it is false by default")
fs.StringVar(&c.HealthzBindAddress, "healthz-address", defaultHealthzAddress, "The address to listen on for the health check server.")
fs.DurationVar(&c.GracefulShutdownTime, "graceful-shutdown-time", defaultGracefulShutdownTime, "The duration to wait during graceful shutdown before forcing termination.")
}
// CheckPortOrDie check valid port range.
@ -125,7 +129,7 @@ func (c *Config) ParseCAFiles(decryptFunc DecryptFunc) error {
return err
}
// users can add one function to decrypt tha data by their own way if CA data is encrypted
// users can add one function to decrypt the data by their own way if CA data is encrypted
if decryptFunc != nil {
return decryptFunc(c)
}

@ -0,0 +1,66 @@
/*
Copyright 2025 The Volcano Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package options
import (
"testing"
"github.com/spf13/pflag"
"k8s.io/apimachinery/pkg/api/equality"
utilfeature "k8s.io/apiserver/pkg/util/feature"
"volcano.sh/volcano/pkg/kube"
)
func TestAddFlags(t *testing.T) {
fs := pflag.NewFlagSet("addflagstest", pflag.ExitOnError)
s := NewConfig()
s.AddFlags(fs)
utilfeature.DefaultMutableFeatureGate.AddFlag(fs)
args := []string{
"--master=127.0.0.1",
"--kube-api-burst=200",
}
fs.Parse(args)
// This is a snapshot of expected options parsed by args.
expected := &Config{
KubeClientOptions: kube.ClientOptions{
Master: "127.0.0.1",
KubeConfig: "",
QPS: defaultQPS,
Burst: 200,
},
ListenAddress: "",
Port: 8443,
PrintVersion: false,
WebhookName: "",
WebhookNamespace: "",
SchedulerNames: []string{defaultSchedulerName},
WebhookURL: "",
ConfigPath: "",
EnabledAdmission: defaultEnabledAdmission,
GracefulShutdownTime: defaultGracefulShutdownTime,
EnableHealthz: false,
HealthzBindAddress: defaultHealthzAddress,
}
if !equality.Semantic.DeepEqual(expected, s) {
t.Errorf("Got different run options than expected.\nGot: %+v\nExpected: %+v\n", s, expected)
}
}

@ -17,12 +17,11 @@ limitations under the License.
package app
import (
"context"
"errors"
"fmt"
"net/http"
"os"
"os/signal"
"strconv"
"syscall"
v1 "k8s.io/api/core/v1"
corev1 "k8s.io/client-go/kubernetes/typed/core/v1"
@ -34,6 +33,7 @@ import (
informers "volcano.sh/apis/pkg/client/informers/externalversions"
"volcano.sh/volcano/cmd/webhook-manager/app/options"
"volcano.sh/volcano/pkg/kube"
"volcano.sh/volcano/pkg/signals"
commonutil "volcano.sh/volcano/pkg/util"
wkconfig "volcano.sh/volcano/pkg/webhooks/config"
"volcano.sh/volcano/pkg/webhooks/router"
@ -97,8 +97,7 @@ func Run(config *options.Config) error {
klog.V(3).Infof("Successfully added caCert for all webhooks")
webhookServeError := make(chan struct{})
stopChannel := make(chan os.Signal, 1)
signal.Notify(stopChannel, syscall.SIGTERM, syscall.SIGINT)
ctx := signals.SetupSignalContext()
factory.Start(webhookServeError)
for informerType, ok := range factory.WaitForCacheSync(webhookServeError) {
@ -108,26 +107,31 @@ func Run(config *options.Config) error {
}
server := &http.Server{
Addr: config.ListenAddress + ":" + strconv.Itoa(config.Port),
TLSConfig: configTLS(config, restConfig),
Addr: config.ListenAddress + ":" + strconv.Itoa(config.Port),
TLSConfig: configTLS(config, restConfig),
ReadHeaderTimeout: helpers.DefaultReadHeaderTimeout,
ReadTimeout: helpers.DefaultReadTimeout,
WriteTimeout: helpers.DefaultWriteTimeout,
}
go func() {
err = server.ListenAndServeTLS("", "")
if err != nil && err != http.ErrServerClosed {
if err != nil && !errors.Is(err, http.ErrServerClosed) {
klog.Fatalf("ListenAndServeTLS for admission webhook failed: %v", err)
close(webhookServeError)
}
klog.Info("Volcano Webhook manager started.")
klog.Info("Volcano Webhook manager stopped.")
}()
if config.ConfigPath != "" {
go wkconfig.WatchAdmissionConf(config.ConfigPath, stopChannel)
go wkconfig.WatchAdmissionConf(config.ConfigPath, ctx.Done())
}
select {
case <-stopChannel:
if err := server.Close(); err != nil {
case <-ctx.Done():
timeoutCtx, cancel := context.WithTimeout(context.Background(), config.GracefulShutdownTime)
defer cancel()
if err := server.Shutdown(timeoutCtx); err != nil {
return fmt.Errorf("close admission server failed: %v", err)
}
return nil

@ -31,9 +31,12 @@ import (
"volcano.sh/volcano/cmd/webhook-manager/app"
"volcano.sh/volcano/cmd/webhook-manager/app/options"
"volcano.sh/volcano/pkg/version"
_ "volcano.sh/volcano/pkg/webhooks/admission/hypernodes/validate"
_ "volcano.sh/volcano/pkg/webhooks/admission/jobflows/validate"
_ "volcano.sh/volcano/pkg/webhooks/admission/jobs/mutate"
_ "volcano.sh/volcano/pkg/webhooks/admission/jobs/validate"
_ "volcano.sh/volcano/pkg/webhooks/admission/podgroups/mutate"
_ "volcano.sh/volcano/pkg/webhooks/admission/podgroups/validate"
_ "volcano.sh/volcano/pkg/webhooks/admission/pods/mutate"
_ "volcano.sh/volcano/pkg/webhooks/admission/pods/validate"
_ "volcano.sh/volcano/pkg/webhooks/admission/queues/mutate"

@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: jobflows.flow.volcano.sh
spec:
group: flow.volcano.sh

@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: jobtemplates.flow.volcano.sh
spec:
group: flow.volcano.sh
@ -39,6 +39,18 @@ spec:
format: int32
minimum: 1
type: integer
networkTopology:
properties:
highestTierAllowed:
default: 1
type: integer
mode:
default: hard
enum:
- hard
- soft
type: string
type: object
plugins:
additionalProperties:
items:
@ -2799,6 +2811,39 @@ spec:
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
resources:
properties:
claims:
items:
properties:
name:
type: string
request:
type: string
required:
- name
type: object
type: array
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
limits:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
type: object
requests:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
type: object
type: object
restartPolicy:
type: string
runtimeClassName:
@ -2841,6 +2886,8 @@ spec:
runAsUser:
format: int64
type: integer
seLinuxChangePolicy:
type: string
seLinuxOptions:
properties:
level:

@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: jobs.batch.volcano.sh
spec:
group: batch.volcano.sh
@ -57,6 +57,18 @@ spec:
format: int32
minimum: 1
type: integer
networkTopology:
properties:
highestTierAllowed:
default: 1
type: integer
mode:
default: hard
enum:
- hard
- soft
type: string
type: object
plugins:
additionalProperties:
items:
@ -2817,6 +2829,39 @@ spec:
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
resources:
properties:
claims:
items:
properties:
name:
type: string
request:
type: string
required:
- name
type: object
type: array
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
limits:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
type: object
requests:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
type: object
type: object
restartPolicy:
type: string
runtimeClassName:
@ -2859,6 +2904,8 @@ spec:
runAsUser:
format: int64
type: integer
seLinuxChangePolicy:
type: string
seLinuxOptions:
properties:
level:

@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: commands.bus.volcano.sh
spec:
group: bus.volcano.sh

@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: numatopologies.nodeinfo.volcano.sh
spec:
group: nodeinfo.volcano.sh

@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: podgroups.scheduling.volcano.sh
spec:
group: scheduling.volcano.sh
@ -89,6 +89,24 @@ spec:
if there's not enough resources to start each task, the scheduler
will not start anyone.
type: object
networkTopology:
description: NetworkTopology defines the NetworkTopology config, this
field works in conjunction with network topology feature and hyperNode
CRD.
properties:
highestTierAllowed:
default: 1
description: HighestTierAllowed specifies the highest tier that
a job allowed to cross when scheduling.
type: integer
mode:
default: hard
description: Mode specifies the mode of the network topology constrain.
enum:
- hard
- soft
type: string
type: object
priorityClassName:
description: |-
If specified, indicates the PodGroup's priority. "system-node-critical" and
@ -99,6 +117,7 @@ spec:
default.
type: string
queue:
default: default
description: |-
Queue defines the queue to allocate resource for PodGroup; if queue does not exist,
the PodGroup will not be scheduled. Defaults to `default` Queue with the lowest weight.

@ -3,7 +3,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: queues.scheduling.volcano.sh
spec:
group: scheduling.volcano.sh
@ -17,7 +17,11 @@ spec:
singular: queue
scope: Cluster
versions:
- name: v1beta1
- additionalPrinterColumns:
- jsonPath: .spec.parent
name: PARENT
type: string
name: v1beta1
schema:
openAPIV3Schema:
description: Queue is a queue of PodGroup.
@ -149,7 +153,10 @@ spec:
description: Type define the type of queue
type: string
weight:
default: 1
format: int32
maximum: 65535
minimum: 1
type: integer
type: object
status:
@ -207,8 +214,6 @@ spec:
description: The number of 'Unknown' PodGroup in this queue.
format: int32
type: integer
required:
- allocated
type: object
type: object
served: true

@ -0,0 +1,225 @@
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.17.0
name: hypernodes.topology.volcano.sh
spec:
group: topology.volcano.sh
names:
kind: HyperNode
listKind: HyperNodeList
plural: hypernodes
shortNames:
- hn
singular: hypernode
scope: Cluster
versions:
- additionalPrinterColumns:
- jsonPath: .spec.tier
name: Tier
type: string
- jsonPath: .status.nodeCount
name: NodeCount
type: integer
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
name: v1alpha1
schema:
openAPIV3Schema:
description: HyperNode represents a collection of nodes sharing similar network
topology or performance characteristics.
properties:
apiVersion:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
spec:
description: Spec defines the desired configuration of the HyperNode.
properties:
members:
description: Members defines a list of node groups or individual nodes
included in the HyperNode.
items:
description: MemberSpec represents a specific node or a hyperNodes
in the hyperNode.
properties:
selector:
description: Selector defines the selection rules for this member.
properties:
exactMatch:
description: ExactMatch defines the exact match criteria.
properties:
name:
description: Name specifies the exact name of the node
to match.
type: string
type: object
labelMatch:
description: LabelMatch defines the labels match criteria
(only take effect when Member Type is "Node").
properties:
matchExpressions:
description: matchExpressions is a list of label selector
requirements. The requirements are ANDed.
items:
description: |-
A label selector requirement is a selector that contains values, a key, and an operator that
relates the key and values.
properties:
key:
description: key is the label key that the selector
applies to.
type: string
operator:
description: |-
operator represents a key's relationship to a set of values.
Valid operators are In, NotIn, Exists and DoesNotExist.
type: string
values:
description: |-
values is an array of string values. If the operator is In or NotIn,
the values array must be non-empty. If the operator is Exists or DoesNotExist,
the values array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
x-kubernetes-list-type: atomic
required:
- key
- operator
type: object
type: array
x-kubernetes-list-type: atomic
matchLabels:
additionalProperties:
type: string
description: |-
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels
map is equivalent to an element of matchExpressions, whose key field is "key", the
operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
type: object
x-kubernetes-map-type: atomic
regexMatch:
description: RegexMatch defines the regex match criteria.
properties:
pattern:
description: Pattern defines the regex pattern to match
node names.
type: string
type: object
type: object
x-kubernetes-validations:
- message: Either ExactMatch or RegexMatch or LabelMatch must
be specified
rule: has(self.exactMatch) || has(self.regexMatch) || has(self.labelMatch)
- message: Only one of ExactMatch, RegexMatch, or LabelMatch
can be specified
rule: '(has(self.exactMatch) ? 1 : 0) + (has(self.regexMatch)
? 1 : 0) + (has(self.labelMatch) ? 1 : 0) <= 1'
type:
description: Type specifies the member type.
enum:
- Node
- HyperNode
type: string
required:
- type
type: object
type: array
tier:
description: Tier categorizes the performance level of the HyperNode.
type: integer
required:
- tier
type: object
status:
description: Status provides the current state of the HyperNode.
properties:
conditions:
description: Conditions provide details about the current state of
the HyperNode.
items:
description: Condition contains details for one aspect of the current
state of this API Resource.
properties:
lastTransitionTime:
description: |-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
format: date-time
type: string
message:
description: |-
message is a human readable message indicating details about the transition.
This may be an empty string.
maxLength: 32768
type: string
observedGeneration:
description: |-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
format: int64
minimum: 0
type: integer
reason:
description: |-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
type: string
status:
description: status of the condition, one of True, False, Unknown.
enum:
- "True"
- "False"
- Unknown
type: string
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
required:
- lastTransitionTime
- message
- reason
- status
- type
type: object
type: array
nodeCount:
description: NodeCount is the total number of nodes currently in the
HyperNode.
format: int64
minimum: 0
type: integer
type: object
type: object
served: true
storage: true
subresources:
status: {}

@ -66,6 +66,10 @@ allowlisted_modules:
- github.com/google/cadvisor
# Apache-2.0: k8s.io/kubernetes@v1.27.2/logo/LICENSE
- k8s.io/kubernetes
# Apache-2.0: github.com/spf13/cobra@v1.8.1/LICENSE
- github.com/spf13/cobra
# BSD: gopkg.in/inf.v0@v0.9.1/LICENSE
- gopkg.in/inf.v0
# BSD: github.com/gogo/protobuf@v1.3.2/LICENSE
- github.com/gogo/protobuf
# MIT: sigs.k8s.io/yaml@v1.3.0/LICENSE

@ -237,7 +237,7 @@ spec:
#### network topology generation and update
* **Network topology discovery/detection tool**: a tool to generate network topology CR by analyzing labels, system file or API of HW vendor. The community will offer a tool to generate CR by label.
![nework topology generation](images/network-topology-aware/nework-topology-generation.png)
![network topology generation](images/network-topology-aware/nework-topology-generation.png)
### Job management
@ -565,13 +565,19 @@ Allocate resources for queue-\> Job \-\> Task.
Phase2:
Allocate resources for queue-\> hyperJob \-\> Job \-\> Task.
**plugin:** NetworkTopology
**plugin:** network-topology-aware
- AddJobGroupReadyFn: check whether hyperJob minAvailable is met.(phase 2)
- AddHyperNodeOrderFn: score for hyperNodes.(take effect in hard limit, closest tiers have higher score)
1. If a Job is being scheduled for the very first time, candidate hypernodes that need to be scored will get a score of 0 and then return right away. The name of the HyperNode where the Job eventually gets scheduled successfully will be recorded in the Job's annotations under the key JobAllocatedHyperNode.
2. If it is not the first scheduling of a job, calculate the LCAHyperNode (Lowest Common Ancestor HyperNode) between each candidate hypernode that needs to be scored and the `JobAllocatedHyperNode` of the job. (In the Allocate process, the system checks whether the LCAHyperNode tier of a hyperNode meets the hard limit; if it does not, the hyperNode is filtered out.) The lower the tier of the calculated LCAHyperNode, the higher the score. If there is only one highest score, return the scoring result.
3. If there is more than one HyperNode with the highest score in the scoring result of step 2, calculate the distribution of the tasks that have been successfully scheduled for the job among these HyperNodes. The greater the distribution quantity, the higher the score.
4. The HyperNode that is successfully scheduled in the end in steps 2 and 3 will also be recorded as the `JobAllocatedHyperNode` attribute of the job.
- AddNodeOrderFn: score for nodes.(take effect in soft limit,take effect in )
- AddNodeOrderFn: score for nodes.(take effect in soft limit, closest tiers have higher score)
1. To score all nodes, we need to first obtain the HyperNode to which the node belongs and the `JobAllocatedHyperNode` of the job to which the task belongs.
2. The subsequent scoring logic is the same as that in the Hard mode. The score of the HyperNode to which it belongs is calculated as the score of the node.
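The LCA-based scoring described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `HyperNode` struct, `lcaTier`, `scoreHyperNode`, `exampleTopology` and the `maxTier - tier` formula are all assumptions made for the example, and the tie-breaking by task distribution (step 3) is left out.

```go
package main

// HyperNode is a hypothetical, simplified in-memory view of a HyperNode:
// only its tier and a pointer to its parent HyperNode are kept.
type HyperNode struct {
	Name   string
	Tier   int
	Parent *HyperNode
}

// lcaTier returns the tier of the lowest common ancestor of a and b.
// A HyperNode counts as its own ancestor. It returns -1 when the two
// HyperNodes share no ancestor.
func lcaTier(a, b *HyperNode) int {
	ancestors := map[*HyperNode]bool{}
	for n := a; n != nil; n = n.Parent {
		ancestors[n] = true
	}
	for n := b; n != nil; n = n.Parent {
		if ancestors[n] {
			return n.Tier
		}
	}
	return -1
}

// scoreHyperNode scores a candidate against the HyperNode the job was
// previously allocated to. A nil allocated HyperNode models the first
// scheduling of a job, which scores 0. Otherwise a lower LCA tier gives
// a higher score (assumed formula: maxTier - tier).
func scoreHyperNode(candidate, allocated *HyperNode, maxTier int) int {
	if allocated == nil {
		return 0
	}
	tier := lcaTier(candidate, allocated)
	if tier < 0 {
		return 0
	}
	return maxTier - tier
}

// exampleTopology builds a three-tier tree: s0 and s1 sit under s2,
// s3 sits under s4, and s2 and s4 both sit under s5.
func exampleTopology() (s0, s1, s3 *HyperNode) {
	s5 := &HyperNode{Name: "s5", Tier: 3}
	s2 := &HyperNode{Name: "s2", Tier: 2, Parent: s5}
	s4 := &HyperNode{Name: "s4", Tier: 2, Parent: s5}
	s0 = &HyperNode{Name: "s0", Tier: 1, Parent: s2}
	s1 = &HyperNode{Name: "s1", Tier: 1, Parent: s2}
	s3 = &HyperNode{Name: "s3", Tier: 1, Parent: s4}
	return s0, s1, s3
}
```

With a job previously allocated to `s0`, the candidate `s0` itself scores highest, `s1` (sharing the tier-2 parent) scores lower, and `s3` (sharing only the tier-3 root) scores lowest. In soft mode the same value would be used as the score of every node belonging to the candidate HyperNode.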
### Webhook

@ -37,11 +37,11 @@ Administrator can create two queues with deserved capacity configured and the de
### Story 3
Administrator can create two queues with guarantee and deserved resources configured and the deserved resource can be reclaim back. And different resource type can hold diffferent guarantee and deserved quantity. For example, we consume there are 2 orgs Org1 and Org2, and use Queue1 and Queue2 respectively, Queue1's guarantee resources are A100 GPU card number=10, V100 GPU card number=10, and deserved resources are A100 GPU card number=20, V00 GPU card number=50, for Queue2, its guarantee resources are A100 GPU card number=10,V100 GPU card number=10, and deserved resources are A100 GPU card number=80, V100 GPU card number=50.
Administrator can create two queues with guarantee and deserved resources configured, and the deserved resources can be reclaimed. Different resource types can hold different guarantee and deserved quantities. For example, assume there are 2 orgs, Org1 and Org2, which use Queue1 and Queue2 respectively. Queue1's guarantee resources are A100 GPU card number=10, V100 GPU card number=10, and its deserved resources are A100 GPU card number=20, V100 GPU card number=50. Queue2's guarantee resources are A100 GPU card number=10, V100 GPU card number=10, and its deserved resources are A100 GPU card number=80, V100 GPU card number=50.
<div align="center"><img width="582" height="393" src="images/capacity-scheduling/queue-deserved.png" /></div>
Queue1 can use cluster's total resoures when Queue2 is idle.
Queue1 can use cluster's total resources when Queue2 is idle.
<div align="center"><img width="612" height="303" src="images/capacity-scheduling/queue1-use-all.png" /></div>

@ -49,7 +49,7 @@ volcano-controllers-7655bb499f-gpg9l 1/1 Running 0 3d
volcano-scheduler-6bf4759c45-c666z 1/1 Running 0 3d
```
Enable node level colocation by setting lable volcano.sh/oversubscription=true and volcano.sh/colocation=true.
Enable node level colocation by setting label volcano.sh/oversubscription=true and volcano.sh/colocation=true.
```
$ kubectl label node $node volcano.sh/oversubscription=true # replace $node with real node name in your kubernetes cluster.
@ -150,11 +150,11 @@ CPU burst relies on capabilities provided by the linux kernel, this feature only
### Dynamic resource oversubscription tutorial
This example will demonstrate the resource overoversubscription capability on node, and shows the suppression and eviction mechanism when node is suffering from pressure. The node flavor is 8 core cpu and 16GB memory.
This example will demonstrate the resource oversubscription capability on node, and shows the suppression and eviction mechanism when node is suffering from pressure. The node flavor is 8 core cpu and 16GB memory.
#### Check node oversubscription resoures
#### Check node oversubscription resources
Node oversubscription resources are calculated by node allocatable resources sub actual resource usage, oversubscription resources include cpu and memory and is represented by `kubernetes.io/batch-cpu` and `kubernetes.io/batch-memory` respectively, and reported as extended resources to node.Allocatable filed. Online workloads use noraml resources and offline workloads use oversubscription resources so we can improve pod deployment density and resource utilization.
Node oversubscription resources are calculated as node allocatable resources minus actual resource usage. Oversubscription resources include cpu and memory, are represented by `kubernetes.io/batch-cpu` and `kubernetes.io/batch-memory` respectively, and are reported as extended resources in the node.Allocatable field. Online workloads use normal resources and offline workloads use oversubscription resources, so we can improve pod deployment density and resource utilization.
```shell
$ kubectl describe node $node # replace $node with real node name in your kubernetes cluster.

@ -5,7 +5,7 @@
Currently the single scheduler can not satisfy the high throughput requirement in some scenarios. Besides the performance optimization against the single scheduler, another choice is to deploy multiple volcano schedulers to improve the overall scheduling throughput.
## Introduction
Previously we use label to divide the cluster nodes to multiple sections and each volcano scheduler is responsible for one section and then specify the schedulerName in the Pod Spec and submit it. It is inconvenient in some conditions especially for large clusters. This doc privides a another option for user to deploy multiple scheduler which needs less modification for workload and nodes.
Previously we used labels to divide the cluster nodes into multiple sections, with each volcano scheduler responsible for one section, and then specified the schedulerName in the Pod Spec on submission. This is inconvenient in some conditions, especially for large clusters. This doc provides another option for users to deploy multiple schedulers, which needs less modification of workloads and nodes.
A statefulset is used to deploy the volcano scheduler. The Job and Node are assigned to scheduler automatically based on the hash algorithm.
[multi-scheduler-deployment](images/multi-volcano-schedulers-without-using-selector.png)

@ -74,20 +74,29 @@ Details of methods mapping is shown in the table below:
## How to add a new device-share policy
### 1. Define your device in /pkg/scheduler/api/shared_device_pool.go
### 1. Create a new package in /pkg/scheduler/api/devices/"your device name"/"your policy name"
Name your policy and put it in shared_device_pool.go as follows:
For example, to implement an NPU share policy, we recommend creating a package in /pkg/scheduler/api/devices/ascend/npushare
Define the device name in /pkg/scheduler/api/devices/"your device name"/"your policy name"/type.go:
```
const (
GPUSharingDevice = "GpuShare"
Your_new_sharing_policy = "xxxxx"
DeviceName = "xxx"
)
```
### 2. Create a new package in /pkg/scheduler/api/devices/"your device name"/"your policy name"
### 2. Register your device in /pkg/scheduler/api/shared_device_pool.go
For example, if you try to implement a NPU share policy, then you are recommended to create a package in /pkg/scheduler/api/device/ascend/npushare
Register your device name in shared_device_pool.go as follows:
```
var RegisteredDevices = []string{
gpushare.DeviceName,
vgpu.DeviceName,
"your new package name".DeviceName, // add your device name here
}
```
### 3. Implement methods of interface shared_device_pool, and put them in your new package
@ -112,15 +121,16 @@ This is the *only* place you hack into scheduler.api ,which you have to register
// setNodeOthersResource initialize sharable devices
func (ni *NodeInfo) setNodeOthersResource(node *v1.Node) {
ni.Others[GPUSharingDevice] = gpushare.NewGPUDevices(ni.Name, node)
//ni.Others["your device sharing policy name"] = your device sharing package initialization method
ni.Others[gpushare.DeviceName] = gpushare.NewGPUDevices(ni.Name, node)
ni.Others[vgpu.DeviceName] = vgpu.NewGPUDevices(ni.Name, node)
ni.Others["your new package name".DeviceName] = your device sharing package initialization method // add your device sharing package initialization method here
}
```
### 5. Check if your policy is enabled in /pkg/scheduler/plugins/predicate/predicates.go
This is the *only* plae you hack into predicates.go, when the scheduler checks if your policy is enabled in scheduler configuration.
This is the *only* place you hack into predicates.go, when the scheduler checks if your policy is enabled in scheduler configuration.
predicates.go:
@ -129,11 +139,7 @@ predicates.go:
// Checks whether predicate.GPUSharingEnable is provided or not, if given, modifies the value in predicateEnable struct.
args.GetBool(&gpushare.GpuSharingEnable, GPUSharingPredicate)
args.GetBool(&gpushare.GpuNumberEnable, GPUNumberPredicate)
args.GetBool(&gpushare.NodeLockEnable, NodeLockEnable)
args.GetBool(&vgpu.VGPUEnable, VGPUEnable)
args.GetBool("your policy enable variable","your policy enable parameter")
...
```

docs/design/dynamic-mig.md

@ -0,0 +1,173 @@
# NVIDIA GPU MPS and MIG dynamic slice plugin
## Special Thanks
This feature would not have been implemented without the help of @sailorvii.
## Introduction
NVIDIA GPUs have three built-in sharing methods: time-slicing, MPS and MIG. Context switching in time-slice sharing wastes some time, so we chose MPS and MIG. The GPU MIG profile is variable: a user can acquire a MIG device defined in the profile, but the current implementation only defines dedicated profiles ahead of user requests, which limits the usage of MIG. We want to develop an automatic slice plugin that creates the slice when the user requires it.
For the scheduling method, node-level binpack and spread will be supported. Following the binpack plugin, we consider CPU, memory, GPU memory and other user-defined resources.
Volcano already has a [vgpu feature](https://github.com/Project-HAMi/volcano-vgpu-device-plugin) for NVIDIA devices after v1.9, implemented with [hami-core](https://github.com/Project-HAMi/HAMi-core), which is a CUDA-hacking library. It can dynamically share GPUs and ensure both quality of service and resource isolation. MIG is also widely used, so supporting MIG mode alongside 'hami-core' in volcano-vgpu is helpful. This requires a unified API for dynamic MIG and hami-core in volcano-vgpu.
## Targets
- CPU, Mem, and GPU combined scheduling
- GPU dynamic slice: Hami-core and MIG
- Support node-level binpack and spread by GPU memory, CPU and Mem
- A unified vGPU pool across different virtualization techniques
- Tasks can choose to use MIG, use HAMi-core, or use both.
### Config maps
- volcano-device-configMap
This ConfigMap defines the plugin configuration, including resource names, MIG geometries, and node-level settings.
```yaml
apiVersion: v1
data:
  volcano-device-share.conf: |
    nvidia:
      resourceCountName: volcano.sh/vgpu-number
      resourceMemoryName: volcano.sh/vgpu-memory
      resourceCoreName: volcano.sh/vgpu-cores
      knownMigGeometries:
        - models: [ "A30" ]
          allowedGeometries:
            - group: group1
              geometries:
                - name: 1g.6gb
                  memory: 6144
                  count: 4
            - group: group2
              geometries:
                - name: 2g.12gb
                  memory: 12288
                  count: 2
            - group: group3
              geometries:
                - name: 4g.24gb
                  memory: 24576
                  count: 1
        - models: [ "A100-SXM4-40GB", "A100-40GB-PCIe", "A100-PCIE-40GB", "A100-SXM4-40GB" ]
          allowedGeometries:
            - group: group1
              geometries:
                - name: 1g.5gb
                  memory: 5120
                  count: 7
            - group: group2
              geometries:
                - name: 2g.10gb
                  memory: 10240
                  count: 3
                - name: 1g.5gb
                  memory: 5120
                  count: 1
            - group: group3
              geometries:
                - name: 3g.20gb
                  memory: 20480
                  count: 2
            - group: group4
              geometries:
                - name: 7g.40gb
                  memory: 40960
                  count: 1
        - models: [ "A100-SXM4-80GB", "A100-80GB-PCIe", "A100-PCIE-80GB"]
          allowedGeometries:
            - group: group1
              geometries:
                - name: 1g.10gb
                  memory: 10240
                  count: 7
            - group: group2
              geometries:
                - name: 2g.20gb
                  memory: 20480
                  count: 3
                - name: 1g.10gb
                  memory: 10240
                  count: 1
            - group: group3
              geometries:
                - name: 3g.40gb
                  memory: 40960
                  count: 2
            - group: group4
              geometries:
                - name: 7g.79gb
                  memory: 80896
                  count: 1
```
## Structure
<img src="./images/volcano-dynamic-mig-structure.png" width = "400" />
## Examples
Dynamic MIG is compatible with volcano-vgpu tasks. Just set `volcano.sh/vgpu-number` and `volcano.sh/vgpu-memory`, as in the example below:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod1
spec:
  containers:
    - name: ubuntu-container1
      image: ubuntu:20.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          volcano.sh/vgpu-number: 2 # requesting 2 vGPUs
          volcano.sh/vgpu-memory: 8000 # optional, integer: each vGPU gets 8000M of device memory
```
A task can restrict itself to `mig` or `hami-core` by setting the annotation `volcano.sh/vgpu-mode` to the corresponding value, as the example below shows:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod1
  annotations:
    volcano.sh/vgpu-mode: "mig"
spec:
  containers:
    - name: ubuntu-container1
      image: ubuntu:20.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          volcano.sh/vgpu-number: 2 # requesting 2 vGPUs
          volcano.sh/vgpu-memory: 8000 # optional, integer: each vGPU gets 8000M of device memory
```
## Procedures
The procedure for a vGPU task that uses dynamic MIG is shown below:
<img src="./images/volcano-dynamic-mig-procedure.png" width = "800" />
Note that after a task is submitted, the deviceshare plugin iterates over the templates defined in the configMap `volcano-device-share` and picks the first available template that fits. You can customize this behavior by editing that configMap and restarting vc-scheduler.
If you submit the example above (a pod requesting 2 * 8G GPUs) to a cluster that has an empty A100-PCIE-40GB node, it follows the procedure below:
<img src="./images/dynamic-mig-example.png" width = "400" />
The walkthrough is shown in bold lines.
As the figure shows, the procedure applies geometry 'group2' to that GPU, with the definition below:
```yaml
group2:
  2g.10gb : 3
  1g.5gb : 1
```
There are four MIG instances in total. vc-scheduler returns two '2g.10gb' instances to the task and adds the remaining instances (one '2g.10gb' and one '1g.5gb') to the pool of available MIG instances for future use.
Finally, the container starts with two 2g.10gb instances.
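
The first-fit template search described above can be sketched in Go. This is only an illustration under stated assumptions, not the deviceshare plugin's code: the types `MigInstance` and `Geometry`, the functions `pickGeometry` and `a100Geometries`, and the rule that all requested instances come from a single profile are hypothetical. The geometry values are copied from the A100 40GB entry of the configMap.

```go
package main

import "fmt"

// MigInstance is one entry of a geometry group: Count instances of profile
// Name, each with Memory units of device memory (same fields as the configMap).
type MigInstance struct {
	Name   string
	Memory int
	Count  int
}

// Geometry is one allowed partitioning of a GPU (a "group" in the configMap).
type Geometry struct {
	Group     string
	Instances []MigInstance
}

// pickGeometry returns the first geometry that can serve num instances with at
// least memory each, together with the instances left over for later tasks.
// Assumption: all requested instances are taken from a single profile.
func pickGeometry(geometries []Geometry, num, memory int) (string, []MigInstance, bool) {
	for _, g := range geometries {
		for i, inst := range g.Instances {
			if inst.Memory < memory || inst.Count < num {
				continue
			}
			remaining := make([]MigInstance, 0, len(g.Instances))
			for j, other := range g.Instances {
				if j == i {
					other.Count -= num // instances handed to the task
				}
				if other.Count > 0 {
					remaining = append(remaining, other)
				}
			}
			return g.Group, remaining, true
		}
	}
	return "", nil, false
}

// a100Geometries mirrors the A100 40GB geometries listed in the configMap.
func a100Geometries() []Geometry {
	return []Geometry{
		{Group: "group1", Instances: []MigInstance{{"1g.5gb", 5120, 7}}},
		{Group: "group2", Instances: []MigInstance{{"2g.10gb", 10240, 3}, {"1g.5gb", 5120, 1}}},
		{Group: "group3", Instances: []MigInstance{{"3g.20gb", 20480, 2}}},
		{Group: "group4", Instances: []MigInstance{{"7g.40gb", 40960, 1}}},
	}
}

func main() {
	// The example pod: 2 vGPUs with 8000 of device memory each.
	group, remaining, ok := pickGeometry(a100Geometries(), 2, 8000)
	fmt.Println(group, remaining, ok)
}
```

With the example request, `group1` is skipped because 5120 is below 8000, and `group2` is selected with one '2g.10gb' and one '1g.5gb' instance left over, which matches the walkthrough.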


@ -0,0 +1,220 @@
# HyperNode Auto Discovery Design Document
## Introduction
This design document describes the design and implementation of the HyperNode network topology discovery feature in Volcano. This feature automatically discovers the network topology structure within the cluster and creates and maintains HyperNode custom resources (CRs) based on the discovered topology information. Consequently, the Volcano Scheduler will leverage these HyperNode CRs for scheduling decisions, eliminating the need for users to manually maintain HyperNode information.
## Design Goals
* **Automated Discovery**: Automatically discover network topology information from different data sources (such as UFM, RoCE, etc.).
* **Scalability**: Support multiple network topology discovery sources and easily extend new discovery methods.
* **Real-time**: Be able to reflect changes in network topology in a timely manner.
* **Fault Tolerance**: When a discovery source fails, it does not affect the normal operation of the system.
* **Security**: Securely manage authentication credentials using Kubernetes Secrets.
## Overall Design
### Component Architecture
The entire network topology discovery function consists of the following core components:
* **Config Loader**: Responsible for loading network topology discovery configuration information from ConfigMap.
* **Discovery Manager**: Responsible for managing and coordinating various network topology discoverers.
* **Discoverer**: A specific network topology discoverer, responsible for obtaining network topology information from a specific data source and converting it into `HyperNode` resources.
* **HyperNode Controller**: Responsible for listening to changes in `HyperNode` resources and creating, updating, or deleting `HyperNode` resources based on the discovered topology information.
### Process Flow
```
ConfigMap -> Config Loader -> Discovery Manager -> Discoverer -> HyperNode Controller -> HyperNode
```
1. **Configuration Loading**: `Config Loader` loads network topology discovery configuration information from ConfigMap, including enabled discovery sources, discovery intervals, data source addresses, etc.
2. **Discovery Management**: `Discovery Manager` creates and starts the corresponding `Discoverer` based on the configuration information.
3. **Topology Discovery**: `Discoverer` obtains network topology information from the specified data source and converts the topology information into `HyperNode` resources.
4. **Resource Synchronization**: `HyperNode Controller` receives the `HyperNode` resources discovered by `Discoverer`, compares them with the existing `HyperNode` resources, and then creates, updates, or deletes `HyperNode` resources to keep the `HyperNode` resources in the cluster consistent with the actual network topology.
## Detailed Design
### Config Loader
`Config Loader` is responsible for loading network topology discovery configuration information from ConfigMap.
* **Function**:
    * Read configuration information from the specified ConfigMap.
    * Parse configuration information and convert it into a `NetworkTopologyConfig` object.
* **Configuration Format**:
```yaml
# volcano-controller.conf file in ConfigMap
networkTopologyDiscovery:
  - source: ufm
    enabled: true
    interval: 10m
    credentials:
      secretRef:
        name: ufm-credentials
        namespace: volcano-system
    config:
      [...]
  - source: roce
    enabled: false
    interval: 15m
    config:
      [...]
  - source: label
    enabled: false
    config:
      # Configuration of the label discovery source
```
* **Code Example**:
```go
// NetworkTopologyConfig represents the configuration of the network topology
type NetworkTopologyConfig struct {
	// NetworkTopologyDiscovery specifies the network topology to discover,
	// Each discovery source has its own specific configuration
	NetworkTopologyDiscovery []DiscoveryConfig `json:"networkTopologyDiscovery" yaml:"networkTopologyDiscovery"`
}

// SecretRef refers to a secret containing sensitive information
type SecretRef struct {
	Name      string `json:"name" yaml:"name"`
	Namespace string `json:"namespace" yaml:"namespace"`
}

// Credentials specifies how to retrieve credentials
type Credentials struct {
	SecretRef *SecretRef `json:"secretRef" yaml:"secretRef"`
}

// DiscoveryConfig contains configuration for a specific discovery source
type DiscoveryConfig struct {
	// Source specifies the discover source (e.g., "ufm", "roce", etc.)
	Source string `json:"source" yaml:"source"`
	// Enabled determines if discovery for this source is active
	Enabled bool `json:"enabled" yaml:"enabled"`
	// Interval is the period between topology discovery operations
	// If not specified, DefaultDiscoveryInterval will be used
	Interval time.Duration `json:"interval" yaml:"interval"`
	// Credentials specifies the username/password to access the discovery source
	Credentials *Credentials `json:"credentials" yaml:"credentials"`
	// Config contains specific configuration parameters for each discovery source
	Config map[string]interface{} `json:"config" yaml:"config"`
}
```
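
The `Enabled` and `Interval` fields above imply a small normalization step after loading. The sketch below shows one way that step might look; it is an assumption, not the Config Loader's actual code. `DiscoveryConfig` is trimmed to three fields, `enabledSources` and `exampleConfigs` are hypothetical names, and the value of `defaultDiscoveryInterval` is a placeholder because this document names `DefaultDiscoveryInterval` without stating its value.

```go
package main

import (
	"fmt"
	"time"
)

// defaultDiscoveryInterval is a placeholder value chosen for this sketch only.
const defaultDiscoveryInterval = time.Hour

// DiscoveryConfig is a trimmed copy of the struct above
// (credentials and per-source config are omitted).
type DiscoveryConfig struct {
	Source   string
	Enabled  bool
	Interval time.Duration
}

// enabledSources keeps only the enabled entries and replaces an unset
// (zero or negative) Interval with the default.
func enabledSources(cfgs []DiscoveryConfig) []DiscoveryConfig {
	out := make([]DiscoveryConfig, 0, len(cfgs))
	for _, c := range cfgs {
		if !c.Enabled {
			continue
		}
		if c.Interval <= 0 {
			c.Interval = defaultDiscoveryInterval
		}
		out = append(out, c)
	}
	return out
}

// exampleConfigs resembles the YAML above, except that the label source
// is enabled here to exercise the default interval.
func exampleConfigs() []DiscoveryConfig {
	return []DiscoveryConfig{
		{Source: "ufm", Enabled: true, Interval: 10 * time.Minute},
		{Source: "roce", Enabled: false, Interval: 15 * time.Minute},
		{Source: "label", Enabled: true},
	}
}

func main() {
	for _, c := range enabledSources(exampleConfigs()) {
		fmt.Println(c.Source, c.Interval)
	}
}
```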
### Discovery Manager
`Discovery Manager` is responsible for managing and coordinating various network topology discoverers.
* **Function**:
    * Create and start the corresponding `Discoverer` based on the configuration information.
    * Periodically obtain network topology information from `Discoverer`.
    * Send network topology information to `HyperNode Controller`.
* **Key Code**:
```go
// RegisterDiscoverer registers the discoverer constructor for the specified source
func RegisterDiscoverer(source string, constructor DiscovererConstructor, kubeClient clientset.Interface) {
	discovererRegistry[source] = constructor
}
```
### Discoverer
`Discoverer` is a specific network topology discoverer, responsible for obtaining network topology information from a specific data source and converting it into `HyperNode` resources.
* **Function**:
    * Retrieve authentication credentials from Kubernetes Secrets.
    * Obtain network topology information from the specified data source.
    * Convert network topology information into `HyperNode` resources.
* **Interface Definition**:
```go
// Discoverer is the interface for network topology discovery
type Discoverer interface {
	// Start begins the discovery process, sending discovered nodes through the provided channel
	Start(outputCh chan<- []*topologyv1alpha1.HyperNode) error
	// Stop halts the discovery process
	Stop() error
	// Name returns the discoverer identifier, this is used for labeling discovered hyperNodes for distinction.
	Name() string
}
```
* **Implementation**:
    * **UFM Discoverer**: Obtain network topology information from UFM (Unified Fabric Manager).
    * **RoCE Discoverer**: Obtain network topology information from RoCE (RDMA over Converged Ethernet) devices.
    * **Label Discoverer**: Discover network topology information based on the labels on each Node.
* **Credential Management**:
    * Credentials are retrieved from Kubernetes Secrets specified in the configuration.
    * The Secret reference includes both name and namespace parameters.
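
To make the `Discoverer` contract concrete, here is a minimal sketch of an implementation. It is not one of the discoverers listed above: `staticDiscoverer` and `newStaticDiscoverer` are hypothetical, and the local `HyperNode` type is a stand-in for `topologyv1alpha1.HyperNode` so that the example compiles on its own. A real discoverer would query its data source on the configured interval rather than report a fixed topology once.

```go
package main

import (
	"fmt"
	"sync"
)

// HyperNode is a stand-in for topologyv1alpha1.HyperNode.
type HyperNode struct{ Name string }

// Discoverer mirrors the interface above, using the stand-in type.
type Discoverer interface {
	Start(outputCh chan<- []*HyperNode) error
	Stop() error
	Name() string
}

// staticDiscoverer is a hypothetical discoverer that reports a fixed topology once.
type staticDiscoverer struct {
	nodes    []*HyperNode
	stopCh   chan struct{}
	stopOnce sync.Once
}

func newStaticDiscoverer(names ...string) *staticDiscoverer {
	d := &staticDiscoverer{stopCh: make(chan struct{})}
	for _, n := range names {
		d.nodes = append(d.nodes, &HyperNode{Name: n})
	}
	return d
}

// Start publishes the topology from a goroutine so that the caller is never
// blocked; the send is abandoned if Stop is called first.
func (d *staticDiscoverer) Start(outputCh chan<- []*HyperNode) error {
	go func() {
		select {
		case outputCh <- d.nodes:
		case <-d.stopCh:
		}
	}()
	return nil
}

// Stop is safe to call more than once.
func (d *staticDiscoverer) Stop() error {
	d.stopOnce.Do(func() { close(d.stopCh) })
	return nil
}

func (d *staticDiscoverer) Name() string { return "static" }

func main() {
	var d Discoverer = newStaticDiscoverer("s0", "s1")
	ch := make(chan []*HyperNode)
	if err := d.Start(ch); err != nil {
		panic(err)
	}
	nodes := <-ch
	fmt.Println(d.Name(), len(nodes))
	_ = d.Stop()
}
```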
### HyperNode Controller
`HyperNode Controller` is responsible for listening to changes in `HyperNode` resources and creating, updating, or deleting `HyperNode` resources based on the discovered topology information.
* **Function**:
    * Listen to change events of `HyperNode` resources.
    * Create, update, or delete `HyperNode` resources based on the network topology information provided by `Discoverer`.
* **Key Code**:
```go
func (hn *hyperNodeController) reconcileTopology(source string, discoveredNodes []*topologyv1alpha1.HyperNode) {
	klog.InfoS("Starting topology reconciliation", "source", source, "discoveredNodeCount", len(discoveredNodes))

	existingNodes, err := hn.hyperNodeLister.List(labels.SelectorFromSet(labels.Set{
		api.NetworkTopologySourceLabelKey: source,
	}))
	if err != nil {
		klog.ErrorS(err, "Failed to list existing HyperNode resources")
		return
	}

	existingNodeMap := make(map[string]*topologyv1alpha1.HyperNode)
	for _, node := range existingNodes {
		existingNodeMap[node.Name] = node
	}

	discoveredNodeMap := make(map[string]*topologyv1alpha1.HyperNode)
	for _, node := range discoveredNodes {
		if node.Labels == nil {
			node.Labels = make(map[string]string)
		}
		node.Labels[api.NetworkTopologySourceLabelKey] = source
		discoveredNodeMap[node.Name] = node
	}

	for name, node := range discoveredNodeMap {
		if _, exists := existingNodeMap[name]; !exists {
			[...]
		} else {
			[...]
		}
		delete(existingNodeMap, name)
	}

	for name := range existingNodeMap {
		klog.InfoS("Deleting HyperNode", "name", name, "source", source)
		if err := utils.DeleteHyperNode(hn.vcClient, name); err != nil {
			klog.ErrorS(err, "Failed to delete HyperNode", "name", name)
		}
	}

	klog.InfoS("Topology reconciliation completed",
		"source", source,
		"discovered", len(discoveredNodes),
		"created/updated", len(discoveredNodeMap),
		"deleted", len(existingNodeMap))
}
```
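
The set arithmetic at the core of `reconcileTopology` can be isolated into a pure function, which makes the create, update, and delete decisions easy to test. The sketch below works on names only; `plan` and `diffTopology` are hypothetical helpers for illustration and are not part of the controller.

```go
package main

import (
	"fmt"
	"sort"
)

// plan lists the HyperNode names to create, update, and delete.
type plan struct {
	create, update, remove []string
}

// diffTopology compares the names already in the cluster with the names just
// discovered. Discovered names that already exist are updated, new ones are
// created, and existing names that were not rediscovered are deleted.
func diffTopology(existing, discovered []string) plan {
	pending := make(map[string]bool, len(existing))
	for _, n := range existing {
		pending[n] = true
	}
	seen := make(map[string]bool, len(discovered))
	var p plan
	for _, n := range discovered {
		if seen[n] {
			continue // ignore duplicate discoveries, as a name-keyed map would
		}
		seen[n] = true
		if pending[n] {
			p.update = append(p.update, n)
			delete(pending, n)
		} else {
			p.create = append(p.create, n)
		}
	}
	for n := range pending {
		p.remove = append(p.remove, n)
	}
	sort.Strings(p.remove) // map iteration order is random; sort for stable output
	return p
}

func main() {
	p := diffTopology([]string{"s0", "s1", "s2"}, []string{"s1", "s3"})
	fmt.Println("create:", p.create, "update:", p.update, "delete:", p.remove)
}
```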


0
docs/design/jobflow/README.md Executable file → Normal file

@ -27,32 +27,41 @@ This metrics track execution of plugins and actions of volcano loop.
### volcano operations
These metrics describe the internal state of Volcano.
| **Metric Name** | **Metric Type** | **Labels** | **Description** |
|---------------------------------|-----------------|-------------------------------------------------------------|-----------------------------------------------|
| `schedule_attempts_total` | Counter | `result`=&lt;result&gt; | The number of attempts to schedule pods |
| `pod_preemption_victims` | Gauge | None | The number of selected preemption victims |
| `total_preemption_attempts` | Counter | None | Total preemption attempts in the cluster |
| `unschedule_task_count` | Gauge | `job_id`=&lt;job_id&gt; | The number of tasks failed to schedule |
| `unschedule_job_counts` | Gauge | None | The number of jobs could not be scheduled |
| `queue_allocated_milli_cpu` | Gauge | `queue_name`=&lt;queue_name&gt; | Allocated CPU count for one queue |
| `queue_allocated_memory_bytes` | Gauge | `queue_name`=&lt;queue_name&gt; | Allocated memory for one queue |
| `queue_request_milli_cpu` | Gauge | `queue_name`=&lt;queue_name&gt; | Requested CPU count for one queue |
| `queue_request_memory_bytes` | Gauge | `queue_name`=&lt;queue_name&gt; | Requested memory for one queue |
| `queue_deserved_milli_cpu` | Gauge | `queue_name`=&lt;queue_name&gt; | Deserved CPU count for one queue |
| `queue_deserved_memory_bytes` | Gauge | `queue_name`=&lt;queue_name&gt; | Deserved memory for one queue |
| `queue_share` | Gauge | `queue_name`=&lt;queue_name&gt; | Share for one queue |
| `queue_weight` | Gauge | `queue_name`=&lt;queue_name&gt; | Weight for one queue |
| `queue_overused` | Gauge | `queue_name`=&lt;queue_name&gt; | Whether one queue is overused |
| `queue_pod_group_inqueue_count` | Gauge | `queue_name`=&lt;queue_name&gt; | The number of Inqueue PodGroups in this queue |
| `queue_pod_group_pending_count` | Gauge | `queue_name`=&lt;queue_name&gt; | The number of Pending PodGroups in this queue |
| `queue_pod_group_running_count` | Gauge | `queue_name`=&lt;queue_name&gt; | The number of Running PodGroups in this queue |
| `queue_pod_group_unknown_count` | Gauge | `queue_name`=&lt;queue_name&gt; | The number of Unknown PodGroups in this queue |
| `namespace_share` | Gauge | `namespace_name`=&lt;namespace_name&gt; | Deserved CPU count for one namespace |
| `namespace_weight` | Gauge | `namespace_name`=&lt;namespace_name&gt; | Weight for one namespace |
| `job_share` | Gauge | `job_id`=&lt;job_id&gt;, `job_ns`=&lt;job_ns&gt; | Share for one job |
| `job_retry_counts` | Counter | `job_id`=&lt;job_id&gt; | The number of retry counts for one job |
| `job_completed_phase_count` | Counter | `job_name`=&lt;job_name&gt; `queue_name`=&lt;queue_name&gt; | The number of job completed phase |
| `job_failed_phase_count` | Counter | `job_name`=&lt;job_name&gt; `queue_name`=&lt;queue_name&gt; | The number of job failed phase |
| **Metric Name** | **Metric Type** | **Labels** | **Description** |
|----------------------------------------|-----------------|-------------------------------------------------------------------|-----------------------------------------------|
| `schedule_attempts_total` | Counter | `result`=&lt;result&gt; | The number of attempts to schedule pods |
| `pod_preemption_victims` | Gauge | None | The number of selected preemption victims |
| `total_preemption_attempts` | Counter | None | Total preemption attempts in the cluster |
| `unschedule_task_count` | Gauge | `job_id`=&lt;job_id&gt; | The number of tasks failed to schedule |
| `unschedule_job_counts` | Gauge | None | The number of jobs could not be scheduled |
| `queue_allocated_milli_cpu` | Gauge | `queue_name`=&lt;queue_name&gt; | Allocated CPU count for one queue |
| `queue_allocated_memory_bytes` | Gauge | `queue_name`=&lt;queue_name&gt; | Allocated memory for one queue |
| `queue_allocated_scalar_resources` | Gauge | `queue_name`=&lt;queue_name&gt;, `resource`=&lt;resource_name&gt; | Allocated scalar resource for one queue |
| `queue_request_milli_cpu` | Gauge | `queue_name`=&lt;queue_name&gt; | Requested CPU count for one queue |
| `queue_request_memory_bytes` | Gauge | `queue_name`=&lt;queue_name&gt; | Requested memory for one queue |
| `queue_request_scalar_resources` | Gauge | `queue_name`=&lt;queue_name&gt;, `resource`=&lt;resource_name&gt; | Requested scalar resource for one queue |
| `queue_deserved_milli_cpu` | Gauge | `queue_name`=&lt;queue_name&gt; | Deserved CPU count for one queue |
| `queue_deserved_memory_bytes` | Gauge | `queue_name`=&lt;queue_name&gt; | Deserved memory for one queue |
| `queue_deserved_scalar_resources` | Gauge | `queue_name`=&lt;queue_name&gt; | Deserved scalar resource for one queue |
| `queue_capacity_mill_cpu` | Gauge | `queue_name`=&lt;queue_name&gt; | CPU count capacity for one queue |
| `queue_capacity_memory_bytes` | Gauge | `queue_name`=&lt;queue_name&gt; | Memory capacity for one queue |
| `queue_capacity_scalar_resources` | Gauge | `queue_name`=&lt;queue_name&gt;, `resource`=&lt;resource_name&gt; | Scalar resource capacity for one queue |
| `queue_real_capacity_mill_cpu` | Gauge | `queue_name`=&lt;queue_name&gt; | CPU count real capacity for one queue |
| `queue_real_capacity_memory_bytes` | Gauge | `queue_name`=&lt;queue_name&gt; | Memory real capacity for one queue |
| `queue_real_capacity_scalar_resources` | Gauge | `queue_name`=&lt;queue_name&gt;, `resource`=&lt;resource_name&gt; | Scalar resource real capacity for one queue |
| `queue_share` | Gauge | `queue_name`=&lt;queue_name&gt; | Share for one queue |
| `queue_weight` | Gauge | `queue_name`=&lt;queue_name&gt; | Weight for one queue |
| `queue_overused` | Gauge | `queue_name`=&lt;queue_name&gt; | Whether one queue is overused |
| `queue_pod_group_inqueue_count` | Gauge | `queue_name`=&lt;queue_name&gt; | The number of Inqueue PodGroups in this queue |
| `queue_pod_group_pending_count` | Gauge | `queue_name`=&lt;queue_name&gt; | The number of Pending PodGroups in this queue |
| `queue_pod_group_running_count` | Gauge | `queue_name`=&lt;queue_name&gt; | The number of Running PodGroups in this queue |
| `queue_pod_group_unknown_count` | Gauge | `queue_name`=&lt;queue_name&gt; | The number of Unknown PodGroups in this queue |
| `namespace_share` | Gauge | `namespace_name`=&lt;namespace_name&gt; | Deserved CPU count for one namespace |
| `namespace_weight` | Gauge | `namespace_name`=&lt;namespace_name&gt; | Weight for one namespace |
| `job_share` | Gauge | `job_id`=&lt;job_id&gt;, `job_ns`=&lt;job_ns&gt; | Share for one job |
| `job_retry_counts` | Counter | `job_id`=&lt;job_id&gt; | The number of retry counts for one job |
| `job_completed_phase_count` | Counter | `job_name`=&lt;job_name&gt; `queue_name`=&lt;queue_name&gt; | The number of job completed phase |
| `job_failed_phase_count` | Counter | `job_name`=&lt;job_name&gt; `queue_name`=&lt;queue_name&gt; | The number of job failed phase |
### volcano Liveness
Healthcheck last time of volcano activity and timeout


@ -6,7 +6,7 @@
## Background
We intended to provide volcano the ability to share third-party resources link GPU,NPU,etc in the near future. These resources are usually registered and allocated in a daemonset called `device plugin`. When a pod requests any third-party resource, the api-server send an `AllocateRequest` to device plugin, containing the target device uuid it requests. However, it does not contain any information about the pod or container requesting the device, so it is impossible for deviceplugin to access the annotation of pod, because is does not know which pod sends this request.
We intend to give Volcano the ability to share third-party resources such as GPUs and NPUs in the near future. These resources are usually registered and allocated by a daemonset called a `device plugin`. When a pod requests a third-party resource, the kubelet sends an `AllocateRequest` to the device plugin, containing the UUID of the target device. However, the request carries no information about the pod or container that asked for the device, so the device plugin cannot read the pod's annotations, because it does not know which pod sent the request.
Using `Node Lock` resolves this problem because only one pod at a time requests this resource on the node. The device plugin can therefore read the additional information that the scheduler writes to pod.annotations and do more fine-grained work, as the figure below shows:
![img](./images/node-lock.jpg)

84
docs/design/prebind.md Normal file

@ -0,0 +1,84 @@
# PreBind
## Backgrounds
Inspired by [#3618](https://github.com/volcano-sh/volcano/issues/3618). Volcano has introduced a lot of plugins from kube-scheduler,
such as `NodeAffinity`, `TaintToleration`, `PodTopologySpread`, `ImageLocality`, etc. These plugins implement the `PreFilter`, `Filter`,
`PreScore`, `Score` in the predicates and nodeorder plugins. However, plugins such as `VolumeBinding` and `DynamicResourceAllocation`
require the `PreBind` extension point. This extension point is executed before binding the Pod to a node and is used to bind additional resources,
such as `PVC` and `ResourceClaim`. If the `PreBind` operation fails, the pod cannot be bound to a node. Currently, volcano lacks the `PreBind` extension point.
## Motivation
Adding a `PreBind` extension point to Volcano enables the introduction of plugins such as `VolumeBinding` and `DynamicResourceAllocation`, which bind additional resources.
## Design Details
A scheduling session is responsible for filtering nodes, scoring nodes, and pre-selecting a node for a pod,
while the cache is responsible for binding the Pod to a node. The session and cache execute in two different goroutines.
Since `Bind` is executed in the cache, `PreBind` also needs to be executed in the cache.
However, the current extension points are all registered in the session, such as `PrePredicateFn`, `PredicateFn`, `NodeOrderFn`, etc.
Adding an extension point called `PreBindFn` to the session would be meaningless because `PreBind` needs to be executed in the cache.
If we try to add `PreBindFn` to the session like adding other extension points, when the session dispatches the pod bind task to the cache,
it still needs to copy the plugins' `PreBind` execution func to the cache. If some plugins' `PreBind` logic contains references to the session,
this might also lead to the session not being released even after `CloseSession` has been called.
Therefore, it is best for the plugin to register `PreBind` directly to the cache.
### PreBinder interface
A new interface needs to be added to the cache, called the `PreBinder` interface:
```go
type PreBinder interface {
	PreBind(ctx context.Context, bindCtx *BindContext) error
	// PreBindRollBack is called when the pre-bind or bind fails.
	PreBindRollBack(ctx context.Context, bindCtx *BindContext)
}
```
It contains two methods, one is `PreBind`, which is the method for the plugin to execute the main `PreBind` logic, and the other is `PreBindRollBack`.
When `PreBind` fails or `Bind` fails, the executed `PreBind` needs to be rolled back to unbind the additional bound resources. If the plugin wants to execute `PreBind`,
it needs to implement the `PreBinder` interface and call the newly added `RegisterPreBinder` method in session to register the plugin into the cache:
```go
// RegisterPreBinder registers the passed bind handler to the cache
func (ssn *Session) RegisterPreBinder(name string, preBinder interface{}) {
	ssn.cache.RegisterPreBinder(name, preBinder)
}
```
### BindContext
For the parameters of `PreBind` and `PreBindRollBack`, `BindContext` is a newly defined structure:
```go
type BindContextExtension interface{}

type BindContext struct {
	TaskInfo *schedulingapi.TaskInfo
	// Extensions stores extra bind context information of each plugin
	Extensions map[string]BindContextExtension
}
```
This structure contains the `TaskInfo` that needs to be bound and a new field called `Extensions`, a map that holds the bind context information of each plugin.
The `BindContextExtension` carries the information that the plugin needs to pass into the `PreBind` extension point, the `Bind` extension point,
and even the `PostBind` extension point that may need to be added in the future. The `BindContextExtension` is just an empty `interface{}` type.
```go
type BindContextHandler interface {
	// SetupBindContextExtension allows the plugin to set up extension information in the bind context
	SetupBindContextExtension(ssn *Session, bindCtx *cache.BindContext)
}
```
`BindContextHandler` is a new interface added to the session. A plugin can choose whether to implement the `BindContextHandler` interface. For example,
`VolumeBinding` needs to carry cycleState to `PreBind`, so it can wrap cycleState and set the additional information to be passed in the `SetupBindContextExtension` method.
```go
// bind context extension information
type bindContextExtension struct {
	State *k8sframework.CycleState
}

func (pp *predicatesPlugin) SetupBindContextExtension(ssn *framework.Session, bindCtx *cache.BindContext) {
	// ...
	// set up bind context extension
	bindCtx.Extensions[pp.Name()] = &bindContextExtension{State: ssn.GetCycleState(bindCtx.TaskInfo.UID)}
}
```
So to summarize, if a new plugin needs to call `PreBind`, the process involves:
1. Implement the `PreBinder` interface
2. Call `RegisterPreBinder` to register the plugin when in `OpenSession`
(Optional) If the plugin needs to pass additional information to `PreBind`, the process involves:
1. Implement the `SetupBindContextExtension` method of the `BindContextHandler` interface
2. In the `SetupBindContextExtension` implementation, set the additional information that the plugin needs to carry in the `Extensions` map of the `BindContext` parameter
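
The rollback contract of `PreBinder` can be illustrated with a small driver loop. The sketch below is an assumption about how a caller might sequence several pre-binders; it is not the cache's actual code. `BindContext` is trimmed (a plain `TaskName` replaces `*schedulingapi.TaskInfo`), and `recorder`, `runPreBinders`, and `runExample` are hypothetical. In particular, rolling back only the pre-binders that already succeeded, in reverse order, is a design choice made for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// BindContext is a trimmed stand-in for the structure above.
type BindContext struct {
	TaskName   string
	Extensions map[string]interface{}
}

// PreBinder mirrors the interface from this document.
type PreBinder interface {
	PreBind(ctx context.Context, bindCtx *BindContext) error
	PreBindRollBack(ctx context.Context, bindCtx *BindContext)
}

// recorder is a hypothetical plugin that records its calls and can be told to fail.
type recorder struct {
	name string
	fail bool
	log  *[]string
}

func (r *recorder) PreBind(_ context.Context, _ *BindContext) error {
	if r.fail {
		return errors.New(r.name + ": pre-bind failed")
	}
	*r.log = append(*r.log, "prebind:"+r.name)
	return nil
}

func (r *recorder) PreBindRollBack(_ context.Context, _ *BindContext) {
	*r.log = append(*r.log, "rollback:"+r.name)
}

// runPreBinders calls each PreBinder in order. On the first error it rolls back
// the pre-binders that already succeeded, in reverse order, and returns the error.
func runPreBinders(ctx context.Context, binders []PreBinder, bindCtx *BindContext) error {
	for i, pb := range binders {
		if err := pb.PreBind(ctx, bindCtx); err != nil {
			for j := i - 1; j >= 0; j-- {
				binders[j].PreBindRollBack(ctx, bindCtx)
			}
			return err
		}
	}
	return nil
}

// runExample wires three plugins together; the third one fails.
func runExample() ([]string, error) {
	var log []string
	binders := []PreBinder{
		&recorder{name: "a", log: &log},
		&recorder{name: "b", log: &log},
		&recorder{name: "c", fail: true, log: &log},
	}
	err := runPreBinders(context.Background(), binders, &BindContext{TaskName: "task-1"})
	return log, err
}

func main() {
	log, err := runExample()
	fmt.Println(log, err)
}
```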


@ -0,0 +1,98 @@
# Preempt Action Support Topology
## Motivation
In cloud-native task scheduling scenarios, preemption is a key feature to ensure timely scheduling of high-priority tasks. Compared to the K8s scheduler, Volcano's current preemption implementation is relatively simple, especially in handling affinity judgments. To improve the accuracy and efficiency of the preemption mechanism, the existing implementation needs to be optimized, particularly in supporting topology awareness.
## In Scope
- Optimize Volcano's preemption mechanism to support affinity judgments
- Improve single Pod preemption process
- Implement simulation scheduling interface to ensure simulated addition and removal of pods won't cause topology changes
## Out of Scope
- Gang scheduling preemption scenario optimization
## User Stories
### Story 1
As a cluster administrator, I want the system to accurately judge Pod affinity constraints during preemption scheduling to avoid scheduling failures caused by topology changes.
### Story 2
As a user, I expect high-priority Pod preemption to minimize impact on existing Pods while maintaining consistency of affinity rules.
### Story 3
When topology-sensitive resources like GPUs exist, the preemption process needs to consider resource topology relationships to ensure resource allocation after preemption still satisfies original topology constraints.
For example, if a node has 2 GPUs (8GB each), Pod A and Pod B each use 4GB, and Pod C needs 8GB. When Pod C needs to be scheduled, it triggers the preemption mechanism. During the simulation scheduling process, the system will try to preempt Pod A and reschedule it. There are two possible scenarios:
1. If topology changes during simulation scheduling:
    - The system chooses to preempt Pod A
    - The predicate check for Pod C succeeds
    - When simulating the re-addition of Pod A, the binpack strategy causes Pod A to be scheduled to a different GPU
    - After re-adding Pod A, the predicate check still passes for Pod C
    - This means Pod C can be scheduled without actually removing Pod A
    - Therefore, the preemption is considered unnecessary and fails
2. If topology remains consistent during simulation scheduling:
    - The system chooses to preempt Pod A
    - The predicate check for Pod C succeeds
    - When simulating the re-addition of Pod A, the original topology relationship is maintained
    - After re-adding Pod A, the predicate check fails for Pod C
    - This confirms that Pod A must be removed for Pod C to be scheduled
    - Therefore, the preemption is considered necessary and succeeds
Therefore, when implementing the preemption mechanism, it's crucial to verify the necessity of preemption by checking if the topology changes during pod re-addition would affect the scheduling of the preempting pod.
![preempt-action-support-topology-1](images/preempt-action-support-topology/preempt-action-support-topology-1.png)
## Design Detail
### Preemption Process
![preempt-action-support-topology-2](images/preempt-action-support-topology/preempt-action-support-topology-2.png)
1. Execute Predicate on all nodes that are not UnschedulableAndUnresolvable to obtain the candidate node list, and perform parallel simulation scheduling on all candidate nodes.
2. The simulation scheduling process for each node is as follows:
    1. First, consider Pods with lower priority on the node as potential victims.
    2. Sort the victim list (lower-priority and non-PDB-violating victims come first).
    3. Remove victims in order, add each removed one to the eviction candidates, and check whether the verification function passes.
    4. Verification function: try to add the higher-priority (pipelined) pods that target the current node and check whether the predicate passes; then remove them and check whether the predicate passes.
    5. If it passes, try to add back the eviction candidates in PDB and priority order (to minimize impact), calling the verification function after each addition; if verification fails, add that candidate to the final eviction list.
    6. If the final eviction list is not empty, return it.
3. Sort the filtered nodes using PreemptNodeOrderFn.
4. Schedule the Pod to the top-ranked node, evict the victims, and cancel the nominatedNodeName of lower-priority pods that had nominated this node, moving them from the pipelined state back to pending.
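
The per-node simulation in step 2 can be sketched with a deliberately simplified model. Everything here is an assumption for illustration: a node is reduced to a single integer capacity, the verification function is reduced to a capacity check (PDBs, pipelined pods, and topology are ignored), and the `Pod` type and `selectVictims` function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a simplified task: one scalar resource request and a priority.
type Pod struct {
	Name     string
	Priority int
	Request  int
}

// selectVictims simulates preemption on one node. It removes lower-priority
// pods until the preemptor fits, then tries to add them back (highest priority
// first) and keeps as victims only those whose return would break the fit.
func selectVictims(capacity int, running []Pod, preemptor Pod) ([]Pod, bool) {
	used := 0
	for _, p := range running {
		used += p.Request
	}
	// Stand-in for the verification function.
	fits := func() bool { return used+preemptor.Request <= capacity }

	// Steps 1 and 2: potential victims, lowest priority first.
	var candidates []Pod
	for _, p := range running {
		if p.Priority < preemptor.Priority {
			candidates = append(candidates, p)
		}
	}
	sort.SliceStable(candidates, func(i, j int) bool {
		return candidates[i].Priority < candidates[j].Priority
	})

	// Step 3: remove candidates in order until the preemptor fits.
	var removed []Pod
	for _, p := range candidates {
		if fits() {
			break
		}
		used -= p.Request
		removed = append(removed, p)
	}
	if !fits() {
		return nil, false
	}

	// Step 5: add candidates back, most important first; a candidate that
	// breaks the fit is removed again and becomes a final victim.
	var victims []Pod
	for i := len(removed) - 1; i >= 0; i-- {
		p := removed[i]
		used += p.Request
		if !fits() {
			used -= p.Request
			victims = append(victims, p)
		}
	}
	// Step 6: the node is a preemption candidate only if someone must be evicted.
	return victims, len(victims) > 0
}

func main() {
	running := []Pod{
		{Name: "a", Priority: 1, Request: 2},
		{Name: "b", Priority: 2, Request: 5},
		{Name: "c", Priority: 3, Request: 3},
	}
	victims, ok := selectVictims(10, running, Pod{Name: "high", Priority: 5, Request: 5})
	fmt.Println(victims, ok)
}
```

In this example the simulation first removes `a` and `b`, then finds that `a` can be added back without blocking the preemptor, so only `b` ends up in the final eviction list.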
### Key Function Modifications
- `SimulateRemoveTaskFn`: Simulates the removal of a task from a node. Plugins implement this function to ensure that the removal does not cause topology changes
```go
type SimulateRemoveTaskFn func(ctx context.Context, state *k8sframework.CycleState, taskToSchedule *TaskInfo, taskInfoToRemove *TaskInfo, nodeInfo *NodeInfo) error
```
- `SimulateAddTaskFn`: Simulates the addition of a task to a node. Plugins implement this function to ensure that the addition does not cause topology changes
```go
type SimulateAddTaskFn func(ctx context.Context, state *k8sframework.CycleState, taskToSchedule *TaskInfo, taskInfoToAdd *TaskInfo, nodeInfo *NodeInfo) error
```
- `SimulatePredicateFn`: Simulates the predicate check for a task on a node. Plugins implement this function to verify whether the task can be scheduled to the node while maintaining topology constraints
```go
type SimulatePredicateFn func(ctx context.Context, state *k8sframework.CycleState, task *TaskInfo, nodeInfo *NodeInfo) error
```
- `SimulateAllocatableFn`: Simulates the allocatable check for a node. Plugins implement this function to verify whether the queue has enough resources to schedule the task while maintaining topology constraints.
```go
type SimulateAllocatableFn func(ctx context.Context, state *k8sframework.CycleState, queue *QueueInfo, task *TaskInfo) bool
```
### Limitations
- The current design focuses on single-pod preemption scenarios and does not handle complex topology changes in gang scheduling.
- For complex combinations of affinity rules, multiple attempts may be needed to find the optimal solution. The performance impact of simulation scheduling still needs to be evaluated in large-scale clusters.

View File

@ -2,6 +2,7 @@
[@eggiter](https://github.com/eggiter); Aug 13, 2021
[@tgaddair](https://github.com/tgaddair); Dec 12, 2022
[@ouyangshengjia](https://github.com/ouyangshengjia); Mar 20, 2025
## Table of Contents
@ -20,31 +21,28 @@
- Introducing an extra scheduling reasons in PodScheduled PodCondition:
+ `Schedulable`: means that the scheduler can schedule the pod right now, but not bind yet.
An additional reason was originally considered that conflicts with cluster autoscaler. Instead, this reason only appears in the detailed `Message` field, and shows up with `Reason: Unschedulable` in the pod:
+ `Undetermined`: means that the scheduler skips scheduling the pod which left the pod `Undetermined`, for example due to unschedulable pod already occurred.
- Case:
+ 3 nodes: node1, node2, node3;
+ 6 tasks in the PodGroup(`minAvailable = 6`);
+ only 1 task(Task6) is unschedulable and other 5 tasks(Task1 ~ 5) can be scheduled, thus the whole job is unscheduable.
+ 3 tasks (Task1 ~ 3) can be scheduled and the other 3 tasks (Task4 ~ 6) are unschedulable, thus the whole job is unschedulable.
- Current information:
+ |Tasks|Reason|Message|
|-|-|-|
|PodGroup |NotEnoughResources|6/6 tasks in gang unschedulable: pod group is not ready, 6 Pending, 6 minAvailable |
|Task1 ~ 5|Unschedulable|6/6 tasks in gang unschedulable: pod group is not ready, 6 Pending, 6 minAvailable|
|Task6|Unschedulable| all nodes are unavailable: 1 plugin InterPodAffinity predicates failed node(s) didn't match pod affinity/anti-affinity, node(s) didn't match pod affinity rules, 2 node(s) resource fit failed.|
+ | Tasks | Reason | Message |
|-----------|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PodGroup | NotEnoughResources | 6/6 tasks in gang unschedulable: pod group is not ready, 6 Pending, 6 minAvailable |
| Task1 ~ 3 | Unschedulable | 6/6 tasks in gang unschedulable: pod group is not ready, 6 Pending, 6 minAvailable |
| Task4 ~ 6 | Unschedulable | all nodes are unavailable: 1 plugin InterPodAffinity predicates failed node(s) didn't match pod affinity/anti-affinity, node(s) didn't match pod affinity rules, 2 node(s) resource fit failed. |
- Improved information:
+ |Tasks|Reason|Message|
|-|-|-|
|PodGroup |(same)| **3/6** tasks in gang unschedulable: pod group is not ready, 6 Pending, 6 minAvailable; **Pending: 1 Unschedulable, 2 Undetermined, 3 Schedulable** |
|Task1 ~ 2|**Unschedulable**| **3/6** tasks in gang unschedulable: pod group is not ready, 6 Pending, 6 minAvailable; **Pending: 1 Unschedulable, 2 Undetermined, 3 Schedulable** |
|Task3|**Schedulable**| **Pod ns1/task-1 can possibly be assgined to node1** |
|Task4|**Schedulable**| **Pod ns1/task-2 can possibly be assgined to node2** |
|Task5|**Schedulable**| **Pod ns1/task-3 can possibly be assgined to node3** |
|Task6|Unschedulable| all nodes are unavailable: 1 plugin InterPodAffinity predicates failed node(s) didn't match pod affinity/anti-affinity, node(s) didn't match pod affinity rules, 2 node(s) resource fit failed.|
+ | Tasks | Reason | Message |
|-----------|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PodGroup | NotEnoughResources | **3/6** tasks in gang unschedulable: pod group is not ready, 6 Pending, 6 minAvailable; **Pending: 3 Unschedulable, 3 Schedulable** |
| Task1 | **Schedulable** | **Pod ns1/task-1 can possibly be assigned to node1, once minAvailable is satisfied** |
| Task2 | **Schedulable** | **Pod ns1/task-2 can possibly be assigned to node2, once minAvailable is satisfied** |
| Task3 | **Schedulable** | **Pod ns1/task-3 can possibly be assigned to node3, once minAvailable is satisfied** |
| Task4 | Unschedulable | all nodes are unavailable: 1 plugin InterPodAffinity predicates failed node(s) didn't match pod affinity/anti-affinity, node(s) didn't match pod affinity rules, 2 node(s) resource fit failed. |
| Task5 ~ 6 | Unschedulable | **3/6** tasks in gang unschedulable: pod group is not ready, 6 Pending, 6 minAvailable; **Pending: 3 Unschedulable, 3 Schedulable** |
- Note: Task1 & 2 are `Unschedulable` maybe because that this two locate after task6 by `TaskOrderFn`;
- Note: Task5 ~ 6 are probably `Unschedulable` because they are ordered after Task4 by `TaskOrderFn`;
- In the improved information, we can easily find the task that breaks the whole scheduling cycle and why that happens. Additionally, we can see a histogram of the reasons why tasks are pending.
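The histogram part of the improved message (for example `Pending: 3 Unschedulable, 3 Schedulable`) could be assembled roughly as shown below. This is only a sketch: `pendingHistogram` and `reasonOrder` are hypothetical names, and the fixed display order of the reasons is an assumption rather than something this document specifies.

```go
package main

import (
	"fmt"
	"strings"
)

// reasonOrder is an assumed display order for the per-reason counts.
var reasonOrder = []string{"Unschedulable", "Undetermined", "Schedulable"}

// pendingHistogram counts the scheduling reason of each pending task and
// renders the counts in reasonOrder, skipping reasons that do not occur.
// Reasons that are not listed in reasonOrder are ignored in this sketch.
func pendingHistogram(reasons []string) string {
	counts := map[string]int{}
	for _, r := range reasons {
		counts[r]++
	}
	var parts []string
	for _, r := range reasonOrder {
		if n := counts[r]; n > 0 {
			parts = append(parts, fmt.Sprintf("%d %s", n, r))
		}
	}
	return "Pending: " + strings.Join(parts, ", ")
}
```

For the case above (Task1 ~ 3 `Schedulable`, Task4 ~ 6 `Unschedulable`), the function returns `Pending: 3 Unschedulable, 3 Schedulable`.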

View File

@ -14,7 +14,7 @@ TDM plugin enhance the volcano time-sharing multiplexing resource ability. It wi
![tdmsolution](./images/tdmsolution.png)
1. Add `volcano.sh/preemptable` annotaion for Pod/PodGroup. For volcano job when add this annotaion in job level, Pod/PodGroup will inherit this annotation. The pod with `volcano.sh/preemptable: "true"` annotation can be dispatched to `revocable node`.
1. Add `volcano.sh/preemptable` annotation for Pod/PodGroup. For volcano job when add this annotation in job level, Pod/PodGroup will inherit this annotation. The pod with `volcano.sh/preemptable: "true"` annotation can be dispatched to `revocable node`.
Otherwise, the pod can not be dispatched to `revocable node`.
2. Add `tdm` plugin and config for volcano scheduler. `tdm.revocable-zone` is a const prefix(), `rz1` is the revocable zone name and the value is a time frame.

View File

@ -0,0 +1,39 @@
# Volume Binding
## Backgrounds
Currently, Volcano does not support the `WaitForFirstConsumer` mode for volume binding.
Volume binding operations are hard-wired into the scheduler cache and exposed as part of the cache's interface.
Like the other plugins adopted from kube-scheduler, the `VolumeBinding` plugin should instead be integrated into the `predicates` or `nodeorder` plugin,
so that volume binding logic no longer intrudes on the cache.
## Design Details
`BindVolumes`, `AllocateVolumes`, and the other volume-related methods that were added to the scheduler cache interface need to be removed.
They originally belong to the binder inside the `VolumeBinding` plugin and should not be exposed through the cache.
The extension points involved in the `VolumeBinding` plugin are `PreFilter`, `Filter`, `PreScore`, `Score`, `Reserve`, `Unreserve`, and `PreBind`,
so the most appropriate place for `VolumeBinding` is the `predicates` plugin, which is the core plugin.
`VolumeBinding` should be integrated into the `predicates` plugin as a feature that is enabled by default. However,
`predicates` does not yet implement extension points such as `Score`, `Reserve`, and `PreBind`, so the plugin can implement them as follows:
1. `Score`: For scoring, `predicates` can refer to and reuse the `AddBatchNodeOrderFn` logic of the
[nodeorder](https://github.com/volcano-sh/volcano/blob/7103c18de19821cd278f949fa24c13da350a8c5d/pkg/scheduler/plugins/nodeorder/nodeorder.go#L301-L335) plugin.
2. `Reserve` and `Unreserve`: The `predicates` plugin can implement these in the event handler's `AllocateFunc` and `DeallocateFunc`. For example,
add the following code for `Reserve`:
```go
func (pp *predicatesPlugin) runReservePlugins(ssn *framework.Session, event *framework.Event) {
	// Volume Binding Reserve
	if pp.volumeBindingPlugin != nil {
		status := pp.volumeBindingPlugin.Reserve(context.TODO(), state, event.Task.Pod, event.Task.Pod.Spec.NodeName)
		if !status.IsSuccess() {
			event.Err = status.AsError()
			return
		}
	}
	// other plugins' Reserve
}
```
3. `PreBind`: A new interface called `PreBinder` should be added and implemented by the `predicates` plugin. It contains
two methods, `PreBind` and `PreBindRollback`, and the `PreBind` method directly calls the `PreBind` extension point of each plugin.
The plugin lifecycle generally lives in the session, which pre-allocates nodes to pods, while `PreBind` and `Bind` run in the cache,
in a different goroutine. If a plugin needs to pass additional information to the `PreBind` method,
it can set it through the `BindContext` parameter of `PreBind`, a new struct containing a map that carries the information the plugin needs to pass.
For the implementation details of `PreBind`, see [PreBind](prebind.md).

View File

@ -1,6 +1,4 @@
This document helps you get started using the Volcano code base.
If you follow this guide and find some problem, please take
a few minutes to update this file.
This document helps you get started using the Volcano code base. If you follow this guide and find a problem, please take a few minutes to update this file.
- [Building the code](#building-the-code)
- [Building docker images](#building-docker-images)
@ -13,39 +11,37 @@ a few minutes to update this file.
- [Adding dependencies](#adding-dependencies)
- [About testing](#about-testing)
## Cloning the Code
## Cloning the code
You will need to clone the main `volcano` repo to `$GOPATH/src/volcano.sh/volcano` for the below commands to work correctly.
You will need to clone the main `volcano` repo to `$GOPATH/src/volcano.sh/volcano` for
the below commands to work correctly.
## Building the Code
## Building the code
To build volcano all components for your host architecture, go to
the source root and run:
To build all Volcano components for your host architecture, go to the source root and run:
```bash
make image_bins
```
the binaries will be generated at .../src/volcano.sh/volcano/_output/bin/linux/amd64/
but if we just make as below
The binaries will be generated at `.../src/volcano.sh/volcano/_output/bin/linux/amd64/`.
If we just run `make` as below:
```bash
make
```
then the binaries would be generated at .../src/volcano.sh/volcano/_output/bin/
To build a specific component for your host architecture, go to
the source root and run `make <component name>`:
Then the binaries would be generated at `.../src/volcano.sh/volcano/_output/bin/`.
To build a specific component for your host architecture, go to the source root and run `make <component name>`:
```bash
make vc-scheduler
```
## Building Docker Images
## Building docker images
Build the containers in your local docker cache:
Build the containers in your local Docker cache:
```bash
make images
@ -57,44 +53,44 @@ To build cross-platform images:
make images DOCKER_PLATFORMS="linux/amd64,linux/arm64" BUILDX_OUTPUT_TYPE=registry IMAGE_PREFIX=[yourregistry]
```
## Building a Specific Component
## Building a specific component
If you want to make a local change and test some component, say `vc-controller-manager`, you could do:
If you want to make a local change and test some component, say `vc-controller-manager`, you
could do:
Under volcano.sh/volcano repo
Under `volcano.sh/volcano` repo:
```bash
pwd
```
The path should be
The path should be:
```bash
.../src/volcano.sh/volcano
```
Set up environment variables HUB and TAG by
Set up environment variables HUB and TAG:
```bash
export HUB=docker.io/yourrepo
export TAG=citadel
```
Make some local change of the code, then build `vc-controller-manager`
Make some local change to the code, then build `vc-controller-manager`:
```bash
make vc-controller-manager
```
## Building the Volcano manifests
## Building the Volcano Manifests
Use the following command to build the deploy yaml files:
Use the following command to build the deploy YAML files:
```bash
make generate-yaml
```
## Cleaning outputs
## Cleaning Outputs
You can delete any build artifacts with:
@ -102,9 +98,9 @@ You can delete any build artifacts with:
make clean
```
## Running tests
## Running Tests
### Running unit tests
### Running Unit Tests
You can run all the available unit tests with:
@ -112,7 +108,7 @@ You can run all the available unit tests with:
make unit-test
```
### Running e2e tests
### Running E2E Tests
You can run all the available e2e tests with:
@ -122,23 +118,22 @@ make images
make e2e
```
If you want to run e2e test in a existing cluster with volcano deployed, run the following:
If you want to run e2e tests in an existing cluster with Volcano deployed, run the following:
```bash
export VC_BIN= need to set vcctl binary path (eg:.../src/volcano.sh/volcano/_output/bin/)
export VC_BIN=<path-to-vcctl-binary> (e.g., .../src/volcano.sh/volcano/_output/bin/)
KUBECONFIG=${KUBECONFIG} go test ./test/e2e
```
## Auto-formatting source code
## Auto-Formatting Source Code
You can automatically format the source code to follow our conventions by going to the
top of the repo and entering:
You can automatically format the source code to follow our conventions by going to the top of the repo and entering:
```bash
./hack/update-gofmt.sh
```
## Running the verification
## Running the Verification
You can run all the verification we require on your local repo by going to the top of the repo and entering:
@ -146,10 +141,9 @@ You can run all the verification we require on your local repo by going to the t
make verify
```
## Adding dependencies
## Adding Dependencies
Volcano uses [Go Modules](https://blog.golang.org/migrating-to-go-modules) to manage its dependencies.
If you want to add or update a dependency, running:
Volcano uses [Go Modules](https://blog.golang.org/migrating-to-go-modules) to manage its dependencies. If you want to add or update a dependency, run:
```bash
go get dependency-name@version
@ -157,19 +151,17 @@ go mod tidy
go mod vendor
```
Note: Go's module system, introduced in Go 1.11, provides an official dependency management solution built into the go command.
Make sure `GO111MODULE` env is not `off` before using it.
Note: Go's module system, introduced in Go 1.11, provides an official dependency management solution built into the `go` command. Make sure `GO111MODULE` env is not `off` before using it.
## About testing
## About Testing
Before sending pull requests you should at least make sure your changes have
passed both unit and the verification. We only merge pull requests when
**all** tests are passing.
Before sending pull requests, you should at least make sure your changes have passed both unit tests and verification. We only merge pull requests when **all** tests are passing.
- Unit tests should be fully hermetic
- Only access resources in the test binary.
- All packages and any significant files require unit tests.
- Unit tests are written using the standard Go testing package.
- The preferred method of testing multiple scenarios or input is
[table driven testing](https://github.com/golang/go/wiki/TableDrivenTests)
- The preferred method of testing multiple scenarios or input is [table-driven testing](https://github.com/golang/go/wiki/TableDrivenTests).
- Concurrent unit test runs must pass.

View File

@ -0,0 +1,76 @@
# How to Enable Dynamic Resource Allocation (DRA) in Volcano Scheduler
This document describes the steps required to enable Dynamic Resource Allocation (DRA) support in the Volcano scheduler.
## Prerequisites
Before proceeding with the configuration steps, ensure your cluster meets the following prerequisites:
### Configure Cluster Nodes (Containerd)
For nodes running containerd as the container runtime, you must enable the Container Device Interface (CDI) feature.
This is crucial for containerd to properly interact with DRA drivers and inject dynamic resources into Pods.
Modify the containerd configuration file on each node (typically `/etc/containerd/config.toml`) to ensure the following setting is present:
```toml
# Enable CDI as described in
# https://tags.cncf.io/container-device-interface#containerd-configuration
[plugins."io.containerd.grpc.v1.cri"]
enable_cdi = true
cdi_spec_dirs = ["/etc/cdi", "/var/run/cdi"]
```
After modifying the configuration, restart the containerd service on each node for the changes to take effect. For example: `sudo systemctl restart containerd`
> If you are using other container runtimes, please refer to: [how-to-configure-cdi](https://github.com/cncf-tags/container-device-interface?tab=readme-ov-file#how-to-configure-cdi)
## 1. Configure Kube-apiserver
DRA-related APIs are built-in Kubernetes resources rather than CRDs, and they are not registered by default in v1.32.
You therefore need to register them manually through the kube-apiserver startup parameters: add the following flag to your kube-apiserver manifest or configuration, or ensure it is already present:
```yaml
--runtime-config=resource.k8s.io/v1beta1=true
```
## 2. Install Volcano with DRA Feature Gates Enabled
When installing Volcano, you need to enable the DRA-related feature gates. `DynamicResourceAllocation` must be enabled to use DRA;
you can also enable the `DRAAdminAccess` feature gate to manage devices as needed.
If you install Volcano with Helm, use the following command to enable the DRA feature gates:
```bash
helm install volcano volcano/volcano --namespace volcano-system --create-namespace \
--set custom.scheduler_feature_gates="DynamicResourceAllocation=true" \
# Add other necessary Helm values for your installation
```
When you directly use `kubectl apply -f` to install Volcano, you need to add or ensure the following flag is present in your volcano-scheduler manifest:
```yaml
--feature-gates=DynamicResourceAllocation=true
```
## 3. Configure Volcano Scheduler Plugins
After installing Volcano, you need to update the Volcano scheduler configuration to enable the DRA plugin through the `predicates` plugin arguments.
Locate your Volcano scheduler configuration (stored in a ConfigMap), find the `predicates` plugin entry, and add or modify its arguments to enable the DRA plugin.
An example snippet of the scheduler configuration (within the `volcano-scheduler.conf` key of the ConfigMap) might look like this:
```yaml
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
  - name: priority
  - name: gang
- plugins:
  - name: drf
  - name: predicates
    arguments:
      predicate.DynamicResourceAllocationEnable: true
  - name: proportion
  - name: nodeorder
  - name: binpack
```
## 4. Deploy a DRA Driver
To use Dynamic Resource Allocation, you need to deploy a DRA driver in your cluster. The driver is responsible for managing the lifecycle of dynamic resources.
For example, you can use [kubernetes-sigs/dra-example-driver](https://github.com/kubernetes-sigs/dra-example-driver) to deploy an example DRA driver for testing.
For DRA drivers that are already used in production, see:
- [NVIDIA/k8s-dra-driver-gpu](https://github.com/NVIDIA/k8s-dra-driver-gpu)
- [intel/intel-resource-drivers-for-kubernetes](https://github.com/intel/intel-resource-drivers-for-kubernetes)

View File

@ -0,0 +1,113 @@
# Usage Document
## Introduction
This document describes how to use the HyperNode network topology auto-discovery feature in Volcano. This feature automatically discovers the network topology within the cluster and creates and maintains HyperNode custom resources (CRs) based on the discovered information. The Volcano scheduler leverages these HyperNode CRs for scheduling decisions, eliminating the need for users to manually maintain HyperNode information.
## Prerequisites
Please [Install Volcano](https://github.com/volcano-sh/volcano/tree/master?tab=readme-ov-file#quick-start-guide) with version >= v1.12.0 first.
## Configuration
The HyperNode network topology discovery feature is configured via a ConfigMap. The ConfigMap contains the configuration for the discovery sources, such as UFM, RoCE, and label. You can modify the configuration to suit your cluster environment.
Please note that you should replace `volcano-system` with your Volcano namespace if Volcano is not installed in the default namespace.
### Secret Configuration (Required First Step)
Before configuring the UFM discovery, you must first create a Kubernetes Secret to store your UFM credentials:
```bash
kubectl create secret generic ufm-credentials \
--from-literal=username='your-ufm-username' \
--from-literal=password='your-ufm-password' \
-n volcano-system
```
> Note: Replace `your-ufm-username` and `your-ufm-password` with your actual UFM credentials, and adjust the namespace if needed.
### Example ConfigMap
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: volcano-controller-configmap
  namespace: volcano-system # Replace with your Volcano namespace if Volcano is not installed in the default namespace.
data:
  volcano-controller.conf: |
    networkTopologyDiscovery:
      - source: ufm
        enabled: true
        interval: 10m
        credentials:
          secretRef:
            name: ufm-credentials # Replace with the secret name that stores the UFM credentials.
            namespace: volcano-system # Replace with the secret namespace that stores the UFM credentials.
        config:
          endpoint: https://ufm-server:8080
          insecureSkipVerify: true
      - source: roce
        enabled: false
        interval: 15m
        config:
          endpoint: https://roce-server:9090
      - source: label
        enabled: false
        config: {}
```
### Configuration Options
* `source`: The discovery source. Supported values are `ufm`, `roce`, and `label`.
* `enabled`: Whether the discovery source is enabled.
* `interval`: The interval between discovery operations. If not specified, the default value is 1 hour.
* `config`: The configuration for the discovery source. The configuration options vary depending on the discovery source.
* `credentials`: The credentials configuration for accessing the discovery source.
  * `secretRef`: Reference to a Kubernetes Secret containing credentials.
    * `name`: The name of the Secret.
    * `namespace`: The namespace of the Secret.
#### UFM Configuration Options
* `endpoint`: The UFM API endpoint.
* `insecureSkipVerify`: Whether to skip TLS certificate verification. This should only be used in development environments.
#### RoCE Configuration Options (Currently not supported)
* `endpoint`: The RoCE API endpoint.
* `token`: The RoCE API token.
#### Label Configuration Options (Currently not supported)
* No configuration options are currently supported for the label discovery source.
## Verification
1. Check the Volcano controller logs to ensure that the discovery sources are started successfully.
```bash
kubectl logs -n volcano-system -l app=volcano-controllers -c volcano-controllers | grep "Successfully started all network topology discoverers"
```
2. Check the created HyperNode resources.
```bash
kubectl get hypernodes -l volcano.sh/network-topology-source=<source>
```
Replace `<source>` with the discovery source you configured, such as `ufm`.
## Troubleshooting
* If the discovery sources are not started successfully, check the Volcano controller logs for errors.
* If the HyperNode resources are not created, check the discovery source configuration and ensure that the discovery source is able to connect to the network topology data source.
## Best Practices
* Volcano uses Kubernetes-standard Secrets to store sensitive credential information (username/password or token). For more stringent key encryption requirements, users should consider additional mechanisms like [Encrypting Secret Data at Rest](https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/).
* The credential Secrets can be placed in a specified namespace for better isolation.
* For UFM discoverer, the controller only needs read access to the specific Secret containing credentials.
* When deploying in production environments, proper RBAC policies should be configured to limit access to Secrets.
* TLS certificate verification should be enabled in production environments to prevent MITM attacks.
* Monitor the Volcano controller logs for errors.
* Set a reasonable discovery interval to avoid overloading the network topology data source.

View File

@ -11,7 +11,7 @@ event(events) happens, the target action will be triggered. If timeout is config
under `task.spec` only, it will only work for the task. If the policy is configured in both job and task level, it will obey
the task policy.
* Users can set multiple policy for a job or a task.
* Currently, Volcano provides **6 build-in events** for users. The details are as follows.
* Currently, Volcano provides **6 built-in events** for users. The details are as follows.
| ID | Event | Description |
|-----|----------------|-------------------------------------------------------------------------------------------------------------------|
@ -22,7 +22,7 @@ the task policy.
| 4 | `Unknown` | Check whether the status of a volcano job is `Unknown`. The most possible factor is task unschedulable. It is triggered when part pods can't be scheduled while some are already running in gang-scheduling case. |
| 5 | `*` | It means all the events, which is not so common used. |
* Currently, Volcano provides **5 build-in actions** for users. The details are as follows.
* Currently, Volcano provides **5 built-in actions** for users. The details are as follows.
| ID | Action | Description |
|-----|-------------------|------------------------------------------------------------------------------------------------------------------|

View File

@ -0,0 +1,190 @@
# Volcano vGPU User Guide
## Background Knowledge of GPU Sharing Modes in Volcano
Volcano supports **two GPU sharing modes** for virtual GPU (vGPU) scheduling:
### 1. HAMi-core (Software-based vGPU)
**Description**:
Leverages **VCUDA**, a CUDA API hijacking technique to enforce GPU core and memory usage limits, enabling **software-level virtual GPU slicing**.
**Use case**:
Ideal for environments requiring **fine-grained GPU sharing**. Compatible with all GPU types.
---
### 2. Dynamic MIG (Hardware-level GPU Slicing)
**Description**:
Utilizes **NVIDIA's MIG (Multi-Instance GPU)** technology to partition a physical GPU into isolated instances with **hardware-level performance guarantees**.
**Use case**:
Best for **performance-sensitive** workloads. Requires **MIG-capable GPUs** (e.g., A100, H100).
---
The GPU sharing mode is a per-node configuration. Volcano supports heterogeneous clusters (i.e., some nodes use HAMi-core while others use Dynamic MIG). See [volcano-vgpu-device-plugin](https://github.com/Project-HAMi/volcano-vgpu-device-plugin) for configuration and details.
## Installation
To enable vGPU scheduling, the following components must be set up based on the selected mode:
### Common Requirements
* **Prerequisites**:
  * NVIDIA driver > 440
  * nvidia-docker > 2.0
  * Docker configured with `nvidia` as the default runtime
  * Kubernetes >= 1.16
  * Volcano >= 1.9
* **Install Volcano**:
  * Follow instructions in [Volcano Installer Guide](https://github.com/volcano-sh/volcano?tab=readme-ov-file#quick-start-guide)
* **Install Device Plugin**:
  * Deploy [`volcano-vgpu-device-plugin`](https://github.com/Project-HAMi/volcano-vgpu-device-plugin)
**Note:** the [vgpu device plugin yaml](https://github.com/Project-HAMi/volcano-vgpu-device-plugin/blob/main/volcano-vgpu-device-plugin.yml) also includes the ***Node GPU mode*** and the ***MIG geometry*** configuration. Please refer to the [vgpu device plugin config](https://github.com/Project-HAMi/volcano-vgpu-device-plugin/blob/main/doc/config.md).
* **Validate Setup**:
Ensure node allocatable resources include:
```yaml
volcano.sh/vgpu-memory: "89424"
volcano.sh/vgpu-number: "8"
```
* **Scheduler Config Update**:
```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: volcano-scheduler-configmap
  namespace: volcano-system
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: predicates
      - name: deviceshare
        arguments:
          deviceshare.VGPUEnable: true # enable vgpu plugin
          deviceshare.SchedulePolicy: binpack # scheduling policy. binpack / spread
```
Check with:
```bash
kubectl get node {node-name} -o yaml
```
---
### HAMi-core Usage
* **Pod Spec**:
```yaml
metadata:
  name: hami-pod
  annotations:
    volcano.sh/vgpu-mode: "hami-core"
spec:
  schedulerName: volcano
  containers:
  - name: cuda-container
    image: nvidia/cuda:9.0-devel
    resources:
      limits:
        volcano.sh/vgpu-number: 1 # requesting 1 GPU card
        volcano.sh/vgpu-cores: 50 # (optional) each vGPU uses 50%
        volcano.sh/vgpu-memory: 3000 # (optional) each vGPU uses 3G GPU memory
```
---
### Dynamic MIG Usage
* **Enable MIG Mode**:
If you need to use MIG (Multi-Instance GPU), you must run the following command on the GPU node.
```bash
sudo nvidia-smi -mig 1
```
* **Geometry Config (Optional)**:
The volcano-vgpu-device-plugin automatically generates an initial MIG configuration, which is stored in the `volcano-vgpu-device-config` ConfigMap under the `kube-system` namespace. You can customize this configuration as needed. For more details, refer to the [vgpu device plugin yaml](https://github.com/Project-HAMi/volcano-vgpu-device-plugin/blob/main/volcano-vgpu-device-plugin.yml).
* **Pod Spec with MIG Annotation**:
```yaml
metadata:
  name: mig-pod
  annotations:
    volcano.sh/vgpu-mode: "mig"
spec:
  schedulerName: volcano
  containers:
  - name: cuda-container
    image: nvidia/cuda:9.0-devel
    resources:
      limits:
        volcano.sh/vgpu-number: 1
        volcano.sh/vgpu-memory: 3000
```
Note: Actual memory allocated depends on best-fit MIG slice (e.g., request 3GB → 5GB slice used).
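The best-fit behavior in the note can be pictured with a small sketch: pick the smallest available MIG slice whose memory is at least the requested amount. The `bestFitSlice` function is hypothetical, and the slice sizes used in the usage note below (5, 10, 20, and 40 GB) are assumed example values, not a statement about how the device plugin enumerates MIG profiles.

```go
package main

import "sort"

// bestFitSlice returns the smallest slice size (in GB) that can hold
// requestGB, and false when no slice is large enough.
func bestFitSlice(sliceSizesGB []int, requestGB int) (int, bool) {
	// Sort a copy so the caller's slice is left untouched.
	sizes := append([]int(nil), sliceSizesGB...)
	sort.Ints(sizes)
	for _, s := range sizes {
		if s >= requestGB {
			return s, true
		}
	}
	return 0, false
}
```

For example, with assumed slice sizes of 5, 10, 20, and 40 GB, a 3 GB request maps to the 5 GB slice, which matches the example in the note.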
---
## Scheduler Mode Selection
* **Explicit Mode**:
  * Use annotation `volcano.sh/vgpu-mode` to force HAMi-core or MIG mode.
  * If annotation is absent, scheduler selects mode based on resource fit and policy.
* **Scheduling Policy**:
  * Modes like `binpack` or `spread` influence node selection.
---
## Summary Table
| Mode | Isolation | MIG GPU Required | Annotation | Core/Memory Control | Recommended For |
| ----------- | ---------------- | ---------------- | ---------- | ------------------- | -------------------------- |
| HAMi-core   | Software (VCUDA) | No               | No         | Yes                 | General workloads          |
| Dynamic MIG | Hardware | Yes | Yes | MIG-controlled | Performance-sensitive jobs |
---
## Monitoring
* **Scheduler Metrics**:
```bash
curl http://<volcano-scheduler-ip>:8080/metrics
```
* **Device Plugin Metrics**:
```bash
curl http://<plugin-pod-ip>:9394/metrics
```
Metrics include GPU utilization, pod memory usage, and limits.
---
## Issues and Contributions
* File bugs: [Volcano Issues](https://github.com/volcano-sh/volcano/issues)
* Contribute: [Pull Requests Guide](https://help.github.com/articles/using-pull-requests/)

View File

@ -1,2 +1,33 @@
FROM volcanosh/vc-scheduler:latest
COPY plugins plugins
FROM golang:1.23.7 AS builder
WORKDIR /go/src/volcano.sh/
# Install musl
RUN apt-get update && \
apt-get install -y sudo
RUN wget http://musl.libc.org/releases/musl-latest.tar.gz && \
mkdir musl-latest && \
tar -xf musl-latest.tar.gz -C musl-latest --strip-components=1 && \
cd musl-latest && \
./configure && make && sudo make install
COPY go.mod go.sum ./
RUN go mod download
ADD . volcano
# Build plugin
RUN cd volcano && CC=/usr/local/musl/bin/musl-gcc CGO_ENABLED=1 \
go build -buildmode=plugin -ldflags '-linkmode=external' \
-o example/custom-plugin/magic.so example/custom-plugin/magic.go
# Build vc scheduler base image with plugin enabled
RUN cd volcano && SUPPORT_PLUGINS=yes make vc-scheduler
# Build vc scheduler image with plugin
FROM alpine:latest
COPY --from=builder /go/src/volcano.sh/volcano/_output/bin/vc-scheduler /vc-scheduler
COPY --from=builder /go/src/volcano.sh/volcano/example/custom-plugin/magic.so /plugins/magic.so
ENTRYPOINT ["/vc-scheduler"]

@ -1,29 +1,45 @@
# Build plugin
# Build image and plugin
## Use `musl-libc` build plugin
## Use `musl-libc` to build the image and plugin
The default `vc-scheduler` base image is `alpine`, which ships only `musl-libc`, so the plugin must be built
with `musl-gcc`.
### Build the plugin in `Docker`:
### Build the image and plugin in `Docker` (Recommended):
Please run this command at the root path of the project.
```bash
# You may need run this command at the root path of the project, cause this need the go.mod file.
docker run -v `pwd`:/volcano golang:1.20-alpine sh -c "cd /volcano && apk add musl-dev gcc && go build -buildmode=plugin -o example/custom-plugin/magic.so example/custom-plugin/magic.go"
docker build -t volcanosh/vc-scheduler:custom-plugins -f example/custom-plugin/Dockerfile .
```
Then replace the image and set the `--plugins-dir=plugins` parameter in the `vc-scheduler` deployment YAML file.
### Build the plugin in local:
### Build the image and plugin locally:
Please run this command at the root path of the project.
```bash
# install musl
wget http://musl.libc.org/releases/musl-1.2.1.tar.gz
tar -xf musl-1.2.1.tar.gz && cd musl-1.2.1
wget http://musl.libc.org/releases/musl-latest.tar.gz
mkdir musl-latest && tar -xf musl-latest.tar.gz -C musl-latest --strip-components=1 && cd musl-latest
./configure
make && sudo make install
# build plugin
CC=/usr/local/musl/bin/musl-gcc CGO_ENABLED=1 go build -buildmode=plugin magic.go
# build plugin .so file
CC=/usr/local/musl/bin/musl-gcc CGO_ENABLED=1 go build -buildmode=plugin -ldflags '-linkmode=external' \
-o example/custom-plugin/magic.so example/custom-plugin/magic.go
# build vc scheduler binary
SUPPORT_PLUGINS=yes make vc-scheduler
cat << EOF > Dockerfile
FROM alpine:latest
COPY _output/bin/vc-scheduler /vc-scheduler
COPY example/custom-plugin/magic.so /plugins/magic.so
ENTRYPOINT ["/vc-scheduler"]
EOF
# build vc scheduler image
docker build -t volcanosh/vc-scheduler:custom-plugins .
```
Then replace the image and set the `--plugins-dir=plugins` parameter in the `vc-scheduler` deployment YAML file.
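As a rough sketch of that last step, the helper below prints (but does not run) the `kubectl` commands that would swap the image and append the flag. The function name `print_scheduler_update` is hypothetical, and the namespace `volcano-system`, the Deployment name `volcano-scheduler`, the matching container name, and the container being first in the pod spec with an existing `args` list are all assumptions that should be checked against the actual deployment.

```bash
# Hypothetical helper: prints the kubectl commands for review instead of running them.
# $1 image (defaults to the tag built above), $2 namespace, $3 Deployment/container name.
# Assumed defaults: namespace "volcano-system", Deployment and container "volcano-scheduler".
print_scheduler_update() {
  image="${1:-volcanosh/vc-scheduler:custom-plugins}"
  ns="${2:-volcano-system}"
  deploy="${3:-volcano-scheduler}"
  cat <<EOF
kubectl -n ${ns} set image deployment/${deploy} ${deploy}=${image}
kubectl -n ${ns} patch deployment ${deploy} --type=json -p '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--plugins-dir=plugins"}]'
EOF
}

# Show the commands with the default values.
print_scheduler_update
```

After reviewing the printed commands, they could be executed with `print_scheduler_update | sh`. Editing the deployment YAML by hand, as described above, achieves the same result.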
## Use `gnu-libc` to build the plugin

@ -67,7 +67,7 @@ spec:
# of the resource (not just labels). Multiple AND conditions can be represented by comma
# delimited expressions.
# For more details: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
# argoexec will get the resource information by "kubectl get -o json -w resouce/name" and check if the conditions are match
# argoexec will get the resource information by "kubectl get -o json -w resource/name" and check if the conditions are match
# Completed is the phase in which all tasks of the Job have completed.
# Failed is the phase in which the Job has failed after reaching the maximum number of retries.
# change the successCondition or failureCondition according to the actual situation

@ -5,15 +5,6 @@ metadata:
spec:
minAvailable: 1
schedulerName: volcano
priorityClassName: high-priority
policies:
- event: PodEvicted
action: RestartJob
plugins:
ssh: []
env: []
svc: []
maxRetry: 5
queue: default
tasks:
- replicas: 1
@ -42,15 +33,6 @@ metadata:
spec:
minAvailable: 1
schedulerName: volcano
priorityClassName: high-priority
policies:
- event: PodEvicted
action: RestartJob
plugins:
ssh: []
env: []
svc: []
maxRetry: 5
queue: default
tasks:
- replicas: 1
@ -79,15 +61,6 @@ metadata:
spec:
minAvailable: 1
schedulerName: volcano
priorityClassName: high-priority
policies:
- event: PodEvicted
action: RestartJob
plugins:
ssh: []
env: []
svc: []
maxRetry: 5
queue: default
tasks:
- replicas: 1
@ -116,15 +89,6 @@ metadata:
spec:
minAvailable: 1
schedulerName: volcano
priorityClassName: high-priority
policies:
- event: PodEvicted
action: RestartJob
plugins:
ssh: []
env: []
svc: []
maxRetry: 5
queue: default
tasks:
- replicas: 1
@ -153,15 +117,6 @@ metadata:
spec:
minAvailable: 1
schedulerName: volcano
priorityClassName: high-priority
policies:
- event: PodEvicted
action: RestartJob
plugins:
ssh: []
env: []
svc: []
maxRetry: 5
queue: default
tasks:
- replicas: 1

@ -8,53 +8,16 @@ read design at [here](../../docs/design/jobflow).
### Prerequisites
- docker: `18.06`
- Kubernetes: >`1.17`
## startup steps
build image from local
```bash
# get volcano and jobflow source code from github
git clone http://github.com/volcano-sh/volcano.git
git clone https://github.com/BoCloud/JobFlow.git
# build image beyondcent/jobflow:v0.0.1 from local
cd JobFlow
make
make docker-build
```
##### deploy JobFlow from [here](https://github.com/BoCloud/JobFlow#deploy)
```bash
kubectl apply -f https://raw.githubusercontent.com/BoCloud/JobFlow/main/deploy/jobflow.yaml
```
##### deploy Volcano from [here](https://volcano.sh/en/docs/installation/#install-with-yaml-files)
```bash
kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/master/installer/volcano-development.yaml
```
if cert of `jobflow-webhook-service.kube-system.svc` has expired, generate one to replace it.
```bash
# delete expired cert in secrets
kubectl delete secret jobflow-webhook-server-cert -nkube-system
# use gen-admission-secret.sh register new secret
cd volcano
./installer/dockerfile/webhook-manager/gen-admission-secret.sh --service jobflow-webhook-service --namespace kube-system --secret jobflow-webhook-server-cert
# restart jobflow-controller-manager
kubectl delete pod/jobflow-controller-manager-67847d59dd-j8dmc -nkube-system
```
##### run jobflow example
### run jobflow example
```bash
# deploy jobTemplate first
cd volcano
kubectl apply -f example/jobflow/JobTemplate.yaml
kubectl apply -f JobTemplate.yaml
# deploy jobFlow second
kubectl apply -f example/jobflow/JobFlow.yaml
kubectl apply -f JobFlow.yaml
# check them
kubectl get jt

@ -10,7 +10,7 @@ spec:
plugins:
ssh: []
svc: []
# 如果有pod被 杀死,重启整个作业
# Restart the entire job if any pod is killed.
policies:
- event: PodEvicted
action: RestartJob

202
go.mod

@ -1,10 +1,11 @@
module volcano.sh/volcano
go 1.22.0
go 1.23.0
require (
github.com/AdaLogics/go-fuzz-headers v0.0.0-20240806141605-e8a1dd7889d6
github.com/agiledragon/gomonkey/v2 v2.11.0
github.com/cilium/ebpf v0.9.3
github.com/cilium/ebpf v0.16.0
github.com/containernetworking/cni v1.1.2
github.com/containernetworking/plugins v1.1.1
github.com/elastic/go-elasticsearch/v7 v7.17.7
@ -15,9 +16,9 @@ require (
github.com/hashicorp/go-multierror v1.1.1
github.com/imdario/mergo v0.3.16
github.com/mitchellh/mapstructure v1.5.0
github.com/onsi/ginkgo/v2 v2.19.0
github.com/onsi/gomega v1.33.1
github.com/opencontainers/runc v1.1.13
github.com/onsi/ginkgo/v2 v2.21.0
github.com/onsi/gomega v1.35.1
github.com/opencontainers/runc v1.2.1
github.com/pkg/errors v0.9.1
github.com/prometheus/client_golang v1.19.1
github.com/prometheus/common v0.55.0
@ -25,44 +26,64 @@ require (
github.com/spf13/cobra v1.8.1
github.com/spf13/pflag v1.0.5
github.com/stretchr/testify v1.9.0
github.com/vishvananda/netlink v1.1.1-0.20210330154013-f5de75959ad5
github.com/vishvananda/netlink v1.3.1-0.20240905180732-b1ce50cfa9be
go.uber.org/automaxprocs v1.5.1
golang.org/x/crypto v0.24.0
golang.org/x/sys v0.21.0
golang.org/x/time v0.3.0
golang.org/x/crypto v0.37.0
golang.org/x/sys v0.32.0
golang.org/x/time v0.7.0
gopkg.in/yaml.v2 v2.4.0
k8s.io/api v0.31.3
k8s.io/apimachinery v0.31.3
k8s.io/apiserver v0.31.3
k8s.io/client-go v0.31.3
k8s.io/code-generator v0.31.3
k8s.io/component-base v0.31.3
k8s.io/component-helpers v0.31.3
k8s.io/csi-translation-lib v0.31.3
k8s.io/api v0.32.2
k8s.io/apimachinery v0.32.2
k8s.io/apiserver v0.32.2
k8s.io/client-go v0.32.2
k8s.io/code-generator v0.32.2
k8s.io/component-base v0.32.2
k8s.io/component-helpers v0.32.2
k8s.io/csi-translation-lib v0.32.2
k8s.io/klog/v2 v2.130.1
k8s.io/kubernetes v1.31.3
k8s.io/metrics v0.0.0
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8
k8s.io/kubectl v0.0.0
k8s.io/kubernetes v1.32.2
k8s.io/metrics v0.32.2
k8s.io/pod-security-admission v0.0.0
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738
sigs.k8s.io/controller-runtime v0.13.0
sigs.k8s.io/yaml v1.4.0
stathat.com/c/consistent v1.0.0
volcano.sh/apis v1.10.0-alpha.0.0.20241210014034-bf27f4e986d0
volcano.sh/apis v1.12.1
)
require (
github.com/Microsoft/go-winio v0.6.0 // indirect
cel.dev/expr v0.18.0 // indirect
github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161 // indirect
github.com/JeffAshton/win_pdh v0.0.0-20161109143554-76bb4ee9f0ab // indirect
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/antlr4-go/antlr/v4 v4.13.0 // indirect
github.com/bits-and-blooms/bitset v1.2.0 // indirect
github.com/cyphar/filepath-securejoin v0.2.4 // indirect
github.com/container-storage-interface/spec v1.9.0 // indirect
github.com/containerd/containerd/api v1.7.19 // indirect
github.com/containerd/errdefs v0.1.0 // indirect
github.com/containerd/log v0.1.0 // indirect
github.com/containerd/ttrpc v1.2.5 // indirect
github.com/cyphar/filepath-securejoin v0.3.4 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/euank/go-kmsg-parser v2.0.0+incompatible // indirect
github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
github.com/godbus/dbus/v5 v5.1.0 // indirect
github.com/opencontainers/runtime-spec v1.0.3-0.20220909204839-494a5a6aca78 // indirect
github.com/google/btree v1.0.1 // indirect
github.com/gorilla/websocket v1.5.0 // indirect
github.com/karrick/godirwalk v1.17.0 // indirect
github.com/mistifyio/go-zfs v2.1.2-0.20190413222219-f784269be439+incompatible // indirect
github.com/moby/spdystream v0.5.0 // indirect
github.com/moby/sys/userns v0.1.0 // indirect
github.com/moby/term v0.5.0 // indirect
github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f // indirect
github.com/opencontainers/runtime-spec v1.2.0 // indirect
github.com/sirupsen/logrus v1.9.3 // indirect
github.com/vishvananda/netns v0.0.4 // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.53.0 // indirect
k8s.io/cri-api v0.31.3 // indirect
k8s.io/cri-api v0.32.2 // indirect
k8s.io/cri-client v0.0.0 // indirect
k8s.io/gengo/v2 v2.0.0-20240228010128-51d4e06bde70 // indirect
k8s.io/dynamic-resource-allocation v0.0.0 // indirect
k8s.io/gengo/v2 v2.0.0-20240911193312-2b36238f13e9 // indirect
)
require (
@ -75,24 +96,23 @@ require (
github.com/coreos/go-semver v0.3.1 // indirect
github.com/coreos/go-systemd/v22 v22.5.0 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/distribution/reference v0.5.0 // indirect
github.com/distribution/reference v0.6.0 // indirect
github.com/emicklei/go-restful/v3 v3.11.0 // indirect
github.com/evanphx/json-patch/v5 v5.6.0 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/fxamacker/cbor/v2 v2.7.0 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-openapi/jsonpointer v0.19.6 // indirect
github.com/go-openapi/jsonpointer v0.21.0 // indirect
github.com/go-openapi/jsonreference v0.20.2 // indirect
github.com/go-openapi/swag v0.22.4 // indirect
github.com/go-openapi/swag v0.23.0 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/google/cadvisor v0.49.0 // indirect
github.com/google/cel-go v0.20.1 // indirect
github.com/google/cadvisor v0.51.0 // indirect
github.com/google/cel-go v0.22.0 // indirect
github.com/google/gnostic-models v0.6.8 // indirect
github.com/google/gofuzz v1.2.0 // indirect
github.com/google/pprof v0.0.0-20240525223248-4bfdf5a9a2af // indirect
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.20.0 // indirect
@ -101,20 +121,20 @@ require (
github.com/josharian/intern v1.0.0 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/moby/sys/mountinfo v0.7.1 // indirect
github.com/moby/sys/mountinfo v0.7.2 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/opencontainers/go-digest v1.0.0 // indirect
github.com/opencontainers/selinux v1.11.0 // indirect
github.com/opencontainers/selinux v1.11.1 // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/stoewer/go-strcase v1.2.0 // indirect
github.com/stoewer/go-strcase v1.3.0 // indirect
github.com/x448/float16 v0.8.4 // indirect
go.etcd.io/etcd/api/v3 v3.5.14 // indirect
go.etcd.io/etcd/client/pkg/v3 v3.5.14 // indirect
go.etcd.io/etcd/client/v3 v3.5.14 // indirect
go.etcd.io/etcd/api/v3 v3.5.16 // indirect
go.etcd.io/etcd/client/pkg/v3 v3.5.16 // indirect
go.etcd.io/etcd/client/v3 v3.5.16 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.53.0 // indirect
go.opentelemetry.io/otel v1.28.0 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.28.0 // indirect
@ -124,70 +144,70 @@ require (
go.opentelemetry.io/otel/trace v1.28.0 // indirect
go.opentelemetry.io/proto/otlp v1.3.1 // indirect
go.uber.org/multierr v1.11.0 // indirect
go.uber.org/zap v1.26.0 // indirect
golang.org/x/exp v0.0.0-20230515195305-f3d0a9c9a5cc // indirect
golang.org/x/mod v0.17.0 // indirect
golang.org/x/net v0.26.0 // indirect
golang.org/x/oauth2 v0.21.0 // indirect
golang.org/x/sync v0.7.0 // indirect
golang.org/x/term v0.21.0 // indirect
golang.org/x/text v0.16.0 // indirect
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20240528184218-531527333157 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20240701130421-f6361c86f094 // indirect
go.uber.org/zap v1.27.0 // indirect
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 // indirect
golang.org/x/mod v0.21.0 // indirect
golang.org/x/net v0.38.0 // indirect
golang.org/x/oauth2 v0.23.0 // indirect
golang.org/x/sync v0.13.0 // indirect
golang.org/x/term v0.31.0 // indirect
golang.org/x/text v0.24.0 // indirect
golang.org/x/tools v0.26.0 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20240826202546-f6391c0de4c7 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20240826202546-f6391c0de4c7 // indirect
google.golang.org/grpc v1.65.0 // indirect
google.golang.org/protobuf v1.34.2 // indirect
google.golang.org/protobuf v1.35.1 // indirect
gopkg.in/evanphx/json-patch.v4 v4.12.0 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/natefinch/lumberjack.v2 v2.2.1 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
k8s.io/apiextensions-apiserver v0.25.0 // indirect
k8s.io/cloud-provider v0.0.0 // indirect
k8s.io/controller-manager v0.31.3
k8s.io/kms v0.31.3 // indirect
k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340 // indirect
k8s.io/controller-manager v0.32.2
k8s.io/kms v0.32.2 // indirect
k8s.io/kube-openapi v0.0.0-20241105132330-32ad38e42d3f // indirect
k8s.io/kube-scheduler v0.0.0 // indirect
k8s.io/kubelet v0.0.0 // indirect
k8s.io/kubelet v0.32.2 // indirect
k8s.io/mount-utils v0.0.0 // indirect
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.30.3 // indirect
sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd // indirect
sigs.k8s.io/structured-merge-diff/v4 v4.4.1 // indirect
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.31.0 // indirect
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 // indirect
sigs.k8s.io/structured-merge-diff/v4 v4.4.2 // indirect
)
replace (
cloud.google.com/go => cloud.google.com/go v0.100.2
github.com/opencontainers/runc => github.com/opencontainers/runc v1.0.3
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc => go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.19.0
google.golang.org/grpc => google.golang.org/grpc v1.57.0
k8s.io/api => k8s.io/api v0.31.1
k8s.io/apiextensions-apiserver => k8s.io/apiextensions-apiserver v0.31.3
k8s.io/apimachinery => k8s.io/apimachinery v0.31.3
k8s.io/apiserver => k8s.io/apiserver v0.31.3
k8s.io/cli-runtime => k8s.io/cli-runtime v0.31.3
k8s.io/client-go => k8s.io/client-go v0.31.3
k8s.io/cloud-provider => k8s.io/cloud-provider v0.31.3
k8s.io/cluster-bootstrap => k8s.io/cluster-bootstrap v0.31.3
k8s.io/code-generator => k8s.io/code-generator v0.31.3
k8s.io/component-base => k8s.io/component-base v0.31.3
k8s.io/component-helpers => k8s.io/component-helpers v0.31.3
k8s.io/controller-manager => k8s.io/controller-manager v0.31.3
k8s.io/cri-api => k8s.io/cri-api v0.31.3
k8s.io/cri-client => k8s.io/cri-client v0.31.3
k8s.io/csi-translation-lib => k8s.io/csi-translation-lib v0.31.3
k8s.io/dynamic-resource-allocation => k8s.io/dynamic-resource-allocation v0.31.3
k8s.io/endpointslice => k8s.io/endpointslice v0.31.3
k8s.io/kube-aggregator => k8s.io/kube-aggregator v0.31.3
k8s.io/kube-controller-manager => k8s.io/kube-controller-manager v0.31.3
k8s.io/kube-proxy => k8s.io/kube-proxy v0.31.3
k8s.io/kube-scheduler => k8s.io/kube-scheduler v0.31.3
k8s.io/kubectl => k8s.io/kubectl v0.31.3
k8s.io/kubelet => k8s.io/kubelet v0.31.3
k8s.io/legacy-cloud-providers => k8s.io/legacy-cloud-providers v0.31.3
k8s.io/metrics => k8s.io/metrics v0.31.3
k8s.io/mount-utils => k8s.io/mount-utils v0.31.3
k8s.io/node-api => k8s.io/node-api v0.31.3
k8s.io/pod-security-admission => k8s.io/pod-security-admission v0.31.3
k8s.io/sample-apiserver => k8s.io/sample-apiserver v0.31.3
k8s.io/sample-cli-plugin => k8s.io/sample-cli-plugin v0.31.3
k8s.io/sample-controller => k8s.io/sample-controller v0.31.3
k8s.io/api => k8s.io/api v0.32.2
k8s.io/apiextensions-apiserver => k8s.io/apiextensions-apiserver v0.32.2
k8s.io/apimachinery => k8s.io/apimachinery v0.32.2
k8s.io/apiserver => k8s.io/apiserver v0.32.2
k8s.io/cli-runtime => k8s.io/cli-runtime v0.32.2
k8s.io/client-go => k8s.io/client-go v0.32.2
k8s.io/cloud-provider => k8s.io/cloud-provider v0.32.2
k8s.io/cluster-bootstrap => k8s.io/cluster-bootstrap v0.32.2
k8s.io/code-generator => k8s.io/code-generator v0.32.2
k8s.io/component-base => k8s.io/component-base v0.32.2
k8s.io/component-helpers => k8s.io/component-helpers v0.32.2
k8s.io/controller-manager => k8s.io/controller-manager v0.32.2
k8s.io/cri-api => k8s.io/cri-api v0.32.2
k8s.io/cri-client => k8s.io/cri-client v0.32.2
k8s.io/csi-translation-lib => k8s.io/csi-translation-lib v0.32.2
k8s.io/dynamic-resource-allocation => k8s.io/dynamic-resource-allocation v0.32.2
k8s.io/endpointslice => k8s.io/endpointslice v0.32.2
k8s.io/externaljwt => k8s.io/externaljwt v0.32.2
k8s.io/kube-aggregator => k8s.io/kube-aggregator v0.32.2
k8s.io/kube-controller-manager => k8s.io/kube-controller-manager v0.32.2
k8s.io/kube-proxy => k8s.io/kube-proxy v0.32.2
k8s.io/kube-scheduler => k8s.io/kube-scheduler v0.32.2
k8s.io/kubectl => k8s.io/kubectl v0.32.2
k8s.io/kubelet => k8s.io/kubelet v0.32.2
k8s.io/legacy-cloud-providers => k8s.io/legacy-cloud-providers v0.32.2
k8s.io/metrics => k8s.io/metrics v0.32.2
k8s.io/mount-utils => k8s.io/mount-utils v0.32.2
k8s.io/node-api => k8s.io/node-api v0.32.2
k8s.io/pod-security-admission => k8s.io/pod-security-admission v0.32.2
k8s.io/sample-apiserver => k8s.io/sample-apiserver v0.32.2
k8s.io/sample-cli-plugin => k8s.io/sample-cli-plugin v0.32.2
k8s.io/sample-controller => k8s.io/sample-controller v0.32.2
)

380
go.sum

@ -1,67 +1,86 @@
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
github.com/Microsoft/go-winio v0.6.0 h1:slsWYD/zyx7lCXoZVlvQrj0hPTM1HI4+v1sIda2yDvg=
github.com/Microsoft/go-winio v0.6.0/go.mod h1:cTAf44im0RAYeL23bpB+fzCyDH2MJiz2BO69KH/soAE=
cel.dev/expr v0.18.0 h1:CJ6drgk+Hf96lkLikr4rFf19WrU0BOWEihyZnI2TAzo=
cel.dev/expr v0.18.0/go.mod h1:MrpN08Q+lEBs+bGYdLxxHkZoUSsCp0nSKTs0nTymJgw=
github.com/AdaLogics/go-fuzz-headers v0.0.0-20240806141605-e8a1dd7889d6 h1:He8afgbRMd7mFxO99hRNu+6tazq8nFF9lIwo9JFroBk=
github.com/AdaLogics/go-fuzz-headers v0.0.0-20240806141605-e8a1dd7889d6/go.mod h1:8o94RPi1/7XTJvwPpRSzSUedZrtlirdB3r9Z20bi2f8=
github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161 h1:L/gRVlceqvL25UVaW/CKtUDjefjrs0SPonmDGUVOYP0=
github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
github.com/JeffAshton/win_pdh v0.0.0-20161109143554-76bb4ee9f0ab h1:UKkYhof1njT1/xq4SEg5z+VpTgjmNeHwPGRQl7takDI=
github.com/JeffAshton/win_pdh v0.0.0-20161109143554-76bb4ee9f0ab/go.mod h1:3VYc5hodBMJ5+l/7J4xAyMeuM2PNuepvHlGs8yilUCA=
github.com/Microsoft/go-winio v0.6.2 h1:F2VQgta7ecxGYO8k3ZZz3RS8fVIXVxONVUPlNERoyfY=
github.com/Microsoft/go-winio v0.6.2/go.mod h1:yd8OoFMLzJbo9gZq8j5qaps8bJ9aShtEA8Ipt1oGCvU=
github.com/NYTimes/gziphandler v1.1.1 h1:ZUDjpQae29j0ryrS0u/B8HZfJBtBQHjqw2rQ2cqUQ3I=
github.com/NYTimes/gziphandler v1.1.1/go.mod h1:n/CVRwUEOgIxrgPvAQhUUr9oeUtvrhMomdKFjzJNB0c=
github.com/agiledragon/gomonkey/v2 v2.11.0 h1:5oxSgA+tC1xuGsrIorR+sYiziYltmJyEZ9qA25b6l5U=
github.com/agiledragon/gomonkey/v2 v2.11.0/go.mod h1:ap1AmDzcVOAz1YpeJ3TCzIgstoaWLA6jbbgxfB4w2iY=
github.com/antlr4-go/antlr/v4 v4.13.0 h1:lxCg3LAv+EUK6t1i0y1V6/SLeUi0eKEKdhQAlS8TVTI=
github.com/antlr4-go/antlr/v4 v4.13.0/go.mod h1:pfChB/xh/Unjila75QW7+VU4TSnWnnk9UTnmpPaOR2g=
github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5 h1:0CwZNZbxp69SHPdPJAN/hZIm0C4OItdklCFmMRWYpio=
github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5/go.mod h1:wHh0iHkYZB8zMSxRWpUBQtwG5a7fFgvEO+odwuTv2gs=
github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 h1:DklsrG3dyBCFEj5IhUbnKptjxatkF07cF2ak3yi77so=
github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2/go.mod h1:WaHUgvxTVq04UNunO+XhnAqY/wQc+bxr74GqbsZ/Jqw=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
github.com/bits-and-blooms/bitset v1.2.0 h1:Kn4yilvwNtMACtf1eYDlG8H77R07mZSPbMjLyS07ChA=
github.com/bits-and-blooms/bitset v1.2.0/go.mod h1:gIdJ4wp64HaoK2YrL1Q5/N7Y16edYb8uY+O0FJTyyDA=
github.com/blang/semver/v4 v4.0.0 h1:1PFHFE6yCCTv8C1TeyNNarDzntLi7wMI5i/pzqYIsAM=
github.com/blang/semver/v4 v4.0.0/go.mod h1:IbckMUScFkM3pff0VJDNKRiT6TG/YpiHIM2yvyW5YoQ=
github.com/cenkalti/backoff/v4 v4.3.0 h1:MyRJ/UdXutAwSAT+s3wNd7MfTIcy71VQueUuFK343L8=
github.com/cenkalti/backoff/v4 v4.3.0/go.mod h1:Y3VNntkOUPxTVeUxJ/G5vcM//AlwfmyYozVcomhLiZE=
github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
github.com/checkpoint-restore/go-criu/v5 v5.0.0/go.mod h1:cfwC0EG7HMUenopBsUf9d89JlCLQIfgVcNsNN0t6T2M=
github.com/chzyer/logex v1.1.10/go.mod h1:+Ywpsq7O8HXn0nuIou7OrIPyXbp3wmkHB+jjWRnGsAI=
github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e/go.mod h1:nSuG5e5PlCu98SY8svDHJxuZscDgtXS6KTTbou5AhLI=
github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1/go.mod h1:Q3SI9o4m/ZMnBNeIyt5eFwwo7qiLfzFZmjNmxjkiQlU=
github.com/cilium/ebpf v0.6.2/go.mod h1:4tRaxcgiL706VnOzHOdBlY8IEAIdxINsQBcU4xJJXRs=
github.com/cilium/ebpf v0.9.3 h1:5KtxXZU+scyERvkJMEm16TbScVvuuMrlhPly78ZMbSc=
github.com/cilium/ebpf v0.9.3/go.mod h1:w27N4UjpaQ9X/DGrSugxUG+H+NhgntDuPb5lCzxCn8A=
github.com/containerd/console v1.0.2/go.mod h1:ytZPjGgY2oeTkAONYafi2kSj0aYggsf8acV1PGKCbzQ=
github.com/cilium/ebpf v0.16.0 h1:+BiEnHL6Z7lXnlGUsXQPPAE7+kenAd4ES8MQ5min0Ok=
github.com/cilium/ebpf v0.16.0/go.mod h1:L7u2Blt2jMM/vLAVgjxluxtBKlz3/GWjB0dMOEngfwE=
github.com/container-storage-interface/spec v1.9.0 h1:zKtX4STsq31Knz3gciCYCi1SXtO2HJDecIjDVboYavY=
github.com/container-storage-interface/spec v1.9.0/go.mod h1:ZfDu+3ZRyeVqxZM0Ds19MVLkN2d1XJ5MAfi1L3VjlT0=
github.com/containerd/containerd/api v1.7.19 h1:VWbJL+8Ap4Ju2mx9c9qS1uFSB1OVYr5JJrW2yT5vFoA=
github.com/containerd/containerd/api v1.7.19/go.mod h1:fwGavl3LNwAV5ilJ0sbrABL44AQxmNjDRcwheXDb6Ig=
github.com/containerd/errdefs v0.1.0 h1:m0wCRBiu1WJT/Fr+iOoQHMQS/eP5myQ8lCv4Dz5ZURM=
github.com/containerd/errdefs v0.1.0/go.mod h1:YgWiiHtLmSeBrvpw+UfPijzbLaB77mEG1WwJTDETIV0=
github.com/containerd/log v0.1.0 h1:TCJt7ioM2cr/tfR8GPbGf9/VRAX8D2B4PjzCpfX540I=
github.com/containerd/log v0.1.0/go.mod h1:VRRf09a7mHDIRezVKTRCrOq78v577GXq3bSa3EhrzVo=
github.com/containerd/ttrpc v1.2.5 h1:IFckT1EFQoFBMG4c3sMdT8EP3/aKfumK1msY+Ze4oLU=
github.com/containerd/ttrpc v1.2.5/go.mod h1:YCXHsb32f+Sq5/72xHubdiJRQY9inL4a4ZQrAbN1q9o=
github.com/containerd/typeurl/v2 v2.2.0 h1:6NBDbQzr7I5LHgp34xAXYF5DOTQDn05X58lsPEmzLso=
github.com/containerd/typeurl/v2 v2.2.0/go.mod h1:8XOOxnyatxSWuG8OfsZXVnAF4iZfedjS/8UHSPJnX4g=
github.com/containernetworking/cni v1.1.2 h1:wtRGZVv7olUHMOqouPpn3cXJWpJgM6+EUl31EQbXALQ=
github.com/containernetworking/cni v1.1.2/go.mod h1:sDpYKmGVENF3s6uvMvGgldDWeG8dMxakj/u+i9ht9vw=
github.com/containernetworking/plugins v1.1.1 h1:+AGfFigZ5TiQH00vhR8qPeSatj53eNGz0C1d3wVYlHE=
github.com/containernetworking/plugins v1.1.1/go.mod h1:Sr5TH/eBsGLXK/h71HeLfX19sZPp3ry5uHSkI4LPxV8=
github.com/coreos/go-semver v0.3.1 h1:yi21YpKnrx1gt5R+la8n5WgS0kCrsPp33dmEyHReZr4=
github.com/coreos/go-semver v0.3.1/go.mod h1:irMmmIw/7yzSRPWryHsK7EYSg09caPQL03VsM8rvUec=
github.com/coreos/go-systemd/v22 v22.3.2/go.mod h1:Y58oyj3AT4RCenI/lSvhwexgC+NSVTIJ3seZv2GcEnc=
github.com/coreos/go-systemd/v22 v22.5.0 h1:RrqgGjYQKalulkV8NGVIfkXQf6YYmOyiJKk8iXXhfZs=
github.com/coreos/go-systemd/v22 v22.5.0/go.mod h1:Y58oyj3AT4RCenI/lSvhwexgC+NSVTIJ3seZv2GcEnc=
github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU=
github.com/cpuguy83/go-md2man/v2 v2.0.4/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/cyphar/filepath-securejoin v0.2.2/go.mod h1:FpkQEhXnPnOthhzymB7CGsFk2G9VLXONKD9G7QGMM+4=
github.com/cyphar/filepath-securejoin v0.2.4 h1:Ugdm7cg7i6ZK6x3xDF1oEu1nfkyfH53EtKeQYTC3kyg=
github.com/cyphar/filepath-securejoin v0.2.4/go.mod h1:aPGpWjXOXUn2NCNjFvBE6aRxGGx79pTxQpKOJNYHHl4=
github.com/creack/pty v1.1.18 h1:n56/Zwd5o6whRC5PMGretI4IdRLlmBXYNjScPaBgsbY=
github.com/creack/pty v1.1.18/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4=
github.com/cyphar/filepath-securejoin v0.3.4 h1:VBWugsJh2ZxJmLFSM06/0qzQyiQX2Qs0ViKrUAcqdZ8=
github.com/cyphar/filepath-securejoin v0.3.4/go.mod h1:8s/MCNJREmFK0H02MF6Ihv1nakJe4L/w3WZLHNkvlYM=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc h1:U9qPSI2PIWSS1VwoXQT9A3Wy9MM3WgvqSxFWenqJduM=
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/distribution/reference v0.5.0 h1:/FUIFXtfc/x2gpa5/VGfiGLuOIdYa1t65IKK2OFGvA0=
github.com/distribution/reference v0.5.0/go.mod h1:BbU0aIcezP1/5jX/8MP0YiH4SdvB5Y4f/wlDRiLyi3E=
github.com/docker/go-units v0.4.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/distribution/reference v0.6.0 h1:0IXCQ5g4/QMHHkarYzh5l+u8T3t73zM5QvfrDyIgxBk=
github.com/distribution/reference v0.6.0/go.mod h1:BbU0aIcezP1/5jX/8MP0YiH4SdvB5Y4f/wlDRiLyi3E=
github.com/docker/docker v26.1.4+incompatible h1:vuTpXDuoga+Z38m1OZHzl7NKisKWaWlhjQk7IDPSLsU=
github.com/docker/docker v26.1.4+incompatible/go.mod h1:eEKB0N0r5NX/I1kEveEz05bcu8tLC/8azJZsviup8Sk=
github.com/docker/go-connections v0.5.0 h1:USnMq7hx7gwdVZq1L49hLXaFtUdTADjXGp+uj1Br63c=
github.com/docker/go-connections v0.5.0/go.mod h1:ov60Kzw0kKElRwhNs9UlUHAE/F9Fe6GLaXnqyDdmEXc=
github.com/docker/go-units v0.5.0 h1:69rxXcBk27SvSaaxTtLh/8llcHD8vYHT7WSdRZ/jvr4=
github.com/docker/go-units v0.5.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY=
github.com/dustin/go-humanize v1.0.1/go.mod h1:Mu1zIs6XwVuF/gI1OepvI0qD18qycQx+mFykh5fBlto=
github.com/elastic/go-elasticsearch/v7 v7.17.7 h1:pcYNfITNPusl+cLwLN6OLmVT+F73Els0nbaWOmYachs=
github.com/elastic/go-elasticsearch/v7 v7.17.7/go.mod h1:OJ4wdbtDNk5g503kvlHLyErCgQwwzmDtaFC4XyOxXA4=
github.com/emicklei/go-restful/v3 v3.11.0 h1:rAQeMHw1c7zTmncogyy8VvRZwtkmkZ4FxERmMY4rD+g=
github.com/emicklei/go-restful/v3 v3.11.0/go.mod h1:6n3XBCmQQb25CM2LCACGz8ukIrRry+4bhvbpWn3mrbc=
github.com/euank/go-kmsg-parser v2.0.0+incompatible h1:cHD53+PLQuuQyLZeriD1V/esuG4MuU0Pjs5y6iknohY=
github.com/euank/go-kmsg-parser v2.0.0+incompatible/go.mod h1:MhmAMZ8V4CYH4ybgdRwPr2TU5ThnS43puaKEMpja1uw=
github.com/evanphx/json-patch/v5 v5.6.0 h1:b91NhWfaz02IuVxO9faSllyAtNXHMPkC5J8sJCLunww=
github.com/evanphx/json-patch/v5 v5.6.0/go.mod h1:G79N1coSVB93tBe7j6PhzjmR3/2VvlbKOFpnXhI9Bw4=
github.com/felixge/httpsnoop v1.0.4 h1:NFTV2Zj1bL4mc9sqWACXbQFVBBg2W3GPvqp8/ESS2Wg=
github.com/felixge/httpsnoop v1.0.4/go.mod h1:m8KPJKqk1gH5J9DgRY2ASl2lWCfGKXixSwevea8zH2U=
github.com/frankban/quicktest v1.11.3/go.mod h1:wRf/ReqHper53s+kmmSZizM8NamnL3IM0I9ntUbOk+k=
github.com/frankban/quicktest v1.14.0 h1:+cqqvzZV87b4adx/5ayVOaYZ2CrvM4ejQvUdBzPPUss=
github.com/frankban/quicktest v1.14.0/go.mod h1:NeW+ay9A/U67EYXNFA1nPE8e/tnQv/09mUdL/ijj8og=
github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo=
github.com/fsnotify/fsnotify v1.4.9/go.mod h1:znqG4EE+3YCdAaPaxE2ZRY/06pZUdp0tY4IgpuI1SZQ=
github.com/fsnotify/fsnotify v1.7.0 h1:8JEhPFa5W2WU7YfeZzPNqzMP6Lwt7L2715Ggo0nosvA=
@ -75,13 +94,16 @@ github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
github.com/go-logr/zapr v1.3.0 h1:XGdV8XW8zdwFiwOA2Dryh1gj2KRQyOOoNmBy4EplIcQ=
github.com/go-logr/zapr v1.3.0/go.mod h1:YKepepNBd1u/oyhd/yQmtjVXmm9uML4IXUgMOwR8/Gg=
github.com/go-openapi/jsonpointer v0.19.6 h1:eCs3fxoIi3Wh6vtgmLTOjdhSpiqphQ+DaPn38N2ZdrE=
github.com/go-openapi/jsonpointer v0.19.6/go.mod h1:osyAmYz/mB/C3I+WsTTSgw1ONzaLJoLCyoi6/zppojs=
github.com/go-openapi/jsonpointer v0.21.0 h1:YgdVicSA9vH5RiHs9TZW5oyafXZFc6+2Vc1rr/O9oNQ=
github.com/go-openapi/jsonpointer v0.21.0/go.mod h1:IUyH9l/+uyhIYQ/PXVA41Rexl+kOkAPDdXEYns6fzUY=
github.com/go-openapi/jsonreference v0.20.2 h1:3sVjiK66+uXK/6oQ8xgcRKcFgQ5KXa2KvnJRumpMGbE=
github.com/go-openapi/jsonreference v0.20.2/go.mod h1:Bl1zwGIM8/wsvqjsOQLJ/SH+En5Ap4rVB5KVcIDZG2k=
github.com/go-openapi/swag v0.22.3/go.mod h1:UzaqsxGiab7freDnrUUra0MwWfN/q7tE4j+VcZ0yl14=
github.com/go-openapi/swag v0.22.4 h1:QLMzNJnMGPRNDCbySlcj1x01tzU8/9LTTL9hZZZogBU=
github.com/go-openapi/swag v0.22.4/go.mod h1:UzaqsxGiab7freDnrUUra0MwWfN/q7tE4j+VcZ0yl14=
github.com/go-openapi/swag v0.23.0 h1:vsEVJDUo2hPJ2tu0/Xc+4noaxyEffXNIs3cOULZ+GrE=
github.com/go-openapi/swag v0.23.0/go.mod h1:esZ8ITTYEsH1V2trKHjAN8Ai7xHb8RV+YSZ577vPjgQ=
github.com/go-quicktest/qt v1.101.0 h1:O1K29Txy5P2OK0dGo59b7b0LR6wKfIhttaAhHUyn7eI=
github.com/go-quicktest/qt v1.101.0/go.mod h1:14Bz/f7NwaXPtdYEgzsx46kqSxVwTbzVZsDC26tQJow=
github.com/go-task/slim-sprig v0.0.0-20210107165309-348f09dbbbc0/go.mod h1:fyg7847qk6SyHyPtNmDHnmrv/HOrqktSC+C9fM+CJOE=
github.com/go-task/slim-sprig/v3 v3.0.0 h1:sUs3vkvUymDpBKi3qH1YSqBQk9+9D/8M2mN1vB6EwHI=
github.com/go-task/slim-sprig/v3 v3.0.0/go.mod h1:W848ghGpv3Qj3dhTPRyJypKRiqCdHZiAzKg9hl15HA8=
@ -92,8 +114,6 @@ github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q=
github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q=
github.com/golang-jwt/jwt/v4 v4.5.0 h1:7cYmW1XlMY7h7ii7UhUyChSgS5wUJEnm9uZVTGqOWzg=
github.com/golang-jwt/jwt/v4 v4.5.0/go.mod h1:m21LjoU+eqJr34lmDMbreY2eSTRJ1cv77w39/MY0Ch0=
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da h1:oI5xCqsCo564l8iNU+DwB5epxmsaqB+rhGL0m5jtYqE=
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc=
github.com/golang/mock v1.6.0 h1:ErTB+efbowRARo13NNdxyJji2egdxLGQhRaY+DUumQc=
github.com/golang/mock v1.6.0/go.mod h1:p6yTPP+5HYm5mzsMV8JkE6ZKdX+/wYM6Hr+LicevLPs=
github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
@@ -103,23 +123,21 @@ github.com/golang/protobuf v1.4.0-rc.2/go.mod h1:LlEzMj4AhA7rCAGe4KMBDvJI+AwstrU
github.com/golang/protobuf v1.4.0-rc.4.0.20200313231945-b860323f09d0/go.mod h1:WU3c8KckQ9AFe+yFwt9sWVRKCVIyN9cPHBJSNnbL67w=
github.com/golang/protobuf v1.4.0/go.mod h1:jodUvKwWbYaEsadDk5Fwe5c77LiNKVO9IDvqG2KuDX0=
github.com/golang/protobuf v1.4.2/go.mod h1:oDoupMAO8OvCJWAcko0GGGIgR6R6ocIYbsSw735rRwI=
github.com/golang/protobuf v1.4.3/go.mod h1:oDoupMAO8OvCJWAcko0GGGIgR6R6ocIYbsSw735rRwI=
github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk=
github.com/golang/protobuf v1.5.2/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY=
github.com/golang/protobuf v1.5.4 h1:i7eJL8qZTpSEXOPTxNKhASYpMn+8e5Q6AdndVa1dWek=
github.com/golang/protobuf v1.5.4/go.mod h1:lnTiLA8Wa4RWRcIUkrtSVa5nRhsEGBg48fD6rSs7xps=
github.com/google/btree v1.0.1 h1:gK4Kx5IaGY9CD5sPJ36FHiBJ6ZXl0kilRiiCj+jdYp4=
github.com/google/btree v1.0.1/go.mod h1:xXMiIv4Fb/0kKde4SpL7qlzvu5cMJDRkFDxJfI9uaxA=
github.com/google/cadvisor v0.49.0 h1:1PYeiORXmcFYi609M4Qvq5IzcvcVaWgYxDt78uH8jYA=
github.com/google/cadvisor v0.49.0/go.mod h1:s6Fqwb2KiWG6leCegVhw4KW40tf9f7m+SF1aXiE8Wsk=
github.com/google/cel-go v0.20.1 h1:nDx9r8S3L4pE61eDdt8igGj8rf5kjYR3ILxWIpWNi84=
github.com/google/cel-go v0.20.1/go.mod h1:kWcIzTsPX0zmQ+H3TirHstLLf9ep5QTsZBN9u4dOYLg=
github.com/google/cadvisor v0.51.0 h1:BspqSPdZoLKrnvuZNOvM/KiJ/A+RdixwagN20n+2H8k=
github.com/google/cadvisor v0.51.0/go.mod h1:czGE/c/P/i0QFpVNKTFrIEzord9Y10YfpwuaSWXELc0=
github.com/google/cel-go v0.22.0 h1:b3FJZxpiv1vTMo2/5RDUqAHPxkT8mmMfJIrq1llbf7g=
github.com/google/cel-go v0.22.0/go.mod h1:BuznPXXfQDpXKWQ9sPW3TzlAJN5zzFe+i9tIs0yC4s8=
github.com/google/gnostic-models v0.6.8 h1:yo/ABAfM5IMRsS1VnXjTBvUb61tFIHozhlYvRgGre9I=
github.com/google/gnostic-models v0.6.8/go.mod h1:5n7qKqH0f5wFt+aWF8CW6pZLLNOfYuF5OpfBSENuI8U=
github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.4.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.4/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
@@ -128,8 +146,8 @@ github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/
github.com/google/gofuzz v1.2.0 h1:xRy4A+RhZaiKjJ1bPfwQ8sedCA+YS2YcCHW6ec7JMi0=
github.com/google/gofuzz v1.2.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/pprof v0.0.0-20210407192527-94a9f03dee38/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/pprof v0.0.0-20240525223248-4bfdf5a9a2af h1:kmjWCqn2qkEml422C2Rrd27c3VGxi6a/6HNq8QmHRKM=
github.com/google/pprof v0.0.0-20240525223248-4bfdf5a9a2af/go.mod h1:K1liHPHnj73Fdn/EKuT8nrFqBihUSKXoLYU0BuatOYo=
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db h1:097atOisP2aRj7vFgYQBbFN4U4JNXUNYpxael3UzMyo=
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db/go.mod h1:vavhavw2zAxS5dIdcRluK6cSGGPlZynqzFM8NdvU144=
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 h1:El6M4kTTCOh6aBiKaUGG7oYTSPP8MxqL4YI3kZKwcP4=
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510/go.mod h1:pupxD2MaaD3pAXIBCelhxNneeOaAeabZDe5s4K6zSpQ=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
@@ -157,8 +175,8 @@ github.com/imdario/mergo v0.3.16/go.mod h1:WBLT9ZmE3lPoWsEzCh9LPo3TiwVN+ZKEjmz+h
github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8=
github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=
github.com/jessevdk/go-flags v1.4.0/go.mod h1:4FA24M0QyGHXBuZZK/XkWh8h0e1EYbRYJSGM75WSRxI=
github.com/jonboulle/clockwork v0.2.2 h1:UOGuzwb1PwsrDAObMuhUnj0p5ULPj8V/xJ7Kx9qUBdQ=
github.com/jonboulle/clockwork v0.2.2/go.mod h1:Pkfl5aHPm1nk2H9h0bjmnJD/BcgbGXUBGnn1kMkgxc8=
github.com/jonboulle/clockwork v0.4.0 h1:p4Cf1aMWXnXAUh8lVfewRBx1zaTSYKrKMF2g3ST4RZ4=
github.com/jonboulle/clockwork v0.4.0/go.mod h1:xgRqUGwRcjKCO1vbZUEtSLrqKoPSsUpK7fnezOII0kc=
github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
github.com/jpillora/backoff v1.0.0 h1:uvFg412JmmHBHw7iwprIxkPMI+sGQ4kzOWsMeHnm2EA=
@@ -166,6 +184,8 @@ github.com/jpillora/backoff v1.0.0/go.mod h1:J/6gKK9jxlEcS3zixgDgUAsiuZ7yrSoa/FX
github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM=
github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo=
github.com/jtolds/gls v4.20.0+incompatible/go.mod h1:QJZ7F/aHp+rZTRtaJ1ow/lLfFfVYBRgL+9YlvaHOwJU=
github.com/karrick/godirwalk v1.17.0 h1:b4kY7nqDdioR/6qnbHQyDvmA17u5G1cZ6J+CZXwSWoI=
github.com/karrick/godirwalk v1.17.0/go.mod h1:j4mkqPuvaLI8mp1DroR3P6ad7cyYd4c1qeJ3RV7ULlk=
github.com/kisielk/errcheck v1.5.0/go.mod h1:pFxgyoBC7bSaBwPgfKdkLd5X25qrDl4LWUI2bnpBCr8=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/kr/pretty v0.2.1/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
@@ -177,21 +197,31 @@ github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0=
github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
github.com/mistifyio/go-zfs v2.1.2-0.20190413222219-f784269be439+incompatible h1:aKW/4cBs+yK6gpqU3K/oIwk9Q/XICqd3zOX/UFuvqmk=
github.com/mistifyio/go-zfs v2.1.2-0.20190413222219-f784269be439+incompatible/go.mod h1:8AuVvqP/mXw1px98n46wfvcGfQ4ci2FwoAjKYxuo3Z4=
github.com/mitchellh/mapstructure v1.5.0 h1:jeMsZIYE/09sWLaz43PL7Gy6RuMjD2eJVyuac5Z2hdY=
github.com/mitchellh/mapstructure v1.5.0/go.mod h1:bFUtVrKA4DC2yAKiSyO/QUcy7e+RRV2QTWOzhPopBRo=
github.com/moby/sys/mountinfo v0.4.1/go.mod h1:rEr8tzG/lsIZHBtN/JjGG+LMYx9eXgW2JI+6q0qou+A=
github.com/moby/sys/mountinfo v0.7.1 h1:/tTvQaSJRr2FshkhXiIpux6fQ2Zvc4j7tAhMTStAG2g=
github.com/moby/sys/mountinfo v0.7.1/go.mod h1:IJb6JQeOklcdMU9F5xQ8ZALD+CUr5VlGpwtX+VE0rpI=
github.com/moby/docker-image-spec v1.3.1 h1:jMKff3w6PgbfSa69GfNg+zN/XLhfXJGnEx3Nl2EsFP0=
github.com/moby/docker-image-spec v1.3.1/go.mod h1:eKmb5VW8vQEh/BAr2yvVNvuiJuY6UIocYsFu/DxxRpo=
github.com/moby/spdystream v0.5.0 h1:7r0J1Si3QO/kjRitvSLVVFUjxMEb/YLj6S9FF62JBCU=
github.com/moby/spdystream v0.5.0/go.mod h1:xBAYlnt/ay+11ShkdFKNAG7LsyK/tmNBVvVOwrfMgdI=
github.com/moby/sys/mountinfo v0.7.2 h1:1shs6aH5s4o5H2zQLn796ADW1wMrIwHsyJ2v9KouLrg=
github.com/moby/sys/mountinfo v0.7.2/go.mod h1:1YOa8w8Ih7uW0wALDUgT1dTTSBrZ+HiBLGws92L2RU4=
github.com/moby/sys/userns v0.1.0 h1:tVLXkFOxVu9A64/yh59slHVv9ahO9UIev4JZusOLG/g=
github.com/moby/sys/userns v0.1.0/go.mod h1:IHUYgu/kao6N8YZlp9Cf444ySSvCmDlmzUcYfDHOl28=
github.com/moby/term v0.5.0 h1:xt8Q1nalod/v7BqbG21f8mQPqH+xAaC9C3N3wfWbVP0=
github.com/moby/term v0.5.0/go.mod h1:8FzsFHVUBGZdbDsJw/ot+X+d5HLUbvklYLJ9uGfcI3Y=
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg=
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M=
github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
github.com/mrunalp/fileutils v0.5.0/go.mod h1:M1WthSahJixYnrXQl/DFQuteStB1weuxD2QJNHXfbSQ=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f h1:KUppIJq7/+SVif2QVs3tOP0zanoHgBEVAwHxUSIzRqU=
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U=
github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f h1:y5//uYreIhSUg3J1GEMiLbxo1LJaP8RfCpH6pymGZus=
github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f/go.mod h1:ZdcZmHo+o7JKHSa8/e818NopupXU1YMK5fe1lsApnBw=
github.com/nxadm/tail v1.4.4/go.mod h1:kenIhsEOeOJmVchQTgglprH7qJGnHDVpk1VPCcaMI8A=
github.com/nxadm/tail v1.4.8 h1:nPr65rt6Y5JFSKQO7qToXr7pePgD6Gwiw05lkbyAQTE=
github.com/nxadm/tail v1.4.8/go.mod h1:+ncqLTQzXmGhMZNUePPaPqPvBxHAIsmXswZKocGu+AU=
@@ -201,23 +231,23 @@ github.com/onsi/ginkgo v1.16.4/go.mod h1:dX+/inL/fNMqNlz0e9LfyB9TswhZpCVdJM/Z6Vv
github.com/onsi/ginkgo v1.16.5 h1:8xi0RTUf59SOSfEtZMvwTvXYMzG4gV23XVHOZiXNtnE=
github.com/onsi/ginkgo v1.16.5/go.mod h1:+E8gABHa3K6zRBolWtd+ROzc/U5bkGt0FwiG042wbpU=
github.com/onsi/ginkgo/v2 v2.1.3/go.mod h1:vw5CSIxN1JObi/U8gcbwft7ZxR2dgaR70JSE3/PpL4c=
github.com/onsi/ginkgo/v2 v2.19.0 h1:9Cnnf7UHo57Hy3k6/m5k3dRfGTMXGvxhHFvkDTCTpvA=
github.com/onsi/ginkgo/v2 v2.19.0/go.mod h1:rlwLi9PilAFJ8jCg9UE1QP6VBpd6/xj3SRC0d6TU0To=
github.com/onsi/ginkgo/v2 v2.21.0 h1:7rg/4f3rB88pb5obDgNZrNHrQ4e6WpjonchcpuBRnZM=
github.com/onsi/ginkgo/v2 v2.21.0/go.mod h1:7Du3c42kxCUegi0IImZ1wUQzMBVecgIHjR1C+NkhLQo=
github.com/onsi/gomega v1.7.1/go.mod h1:XdKZgCCFLUoM/7CFJVPcG8C1xQ1AJ0vpAezJrB7JYyY=
github.com/onsi/gomega v1.10.1/go.mod h1:iN09h71vgCQne3DLsj+A5owkum+a2tYe+TOCB1ybHNo=
github.com/onsi/gomega v1.17.0/go.mod h1:HnhC7FXeEQY45zxNK3PPoIUhzk/80Xly9PcubAlGdZY=
github.com/onsi/gomega v1.33.1 h1:dsYjIxxSR755MDmKVsaFQTE22ChNBcuuTWgkUDSubOk=
github.com/onsi/gomega v1.33.1/go.mod h1:U4R44UsT+9eLIaYRB2a5qajjtQYn0hauxvRm16AVYg0=
github.com/onsi/gomega v1.35.1 h1:Cwbd75ZBPxFSuZ6T+rN/WCb/gOc6YgFBXLlZLhC7Ds4=
github.com/onsi/gomega v1.35.1/go.mod h1:PvZbdDc8J6XJEpDK4HCuRBm8a6Fzp9/DmhC9C7yFlog=
github.com/opencontainers/go-digest v1.0.0 h1:apOUWs51W5PlhuyGyz9FCeeBIOUDA/6nW8Oi/yOhh5U=
github.com/opencontainers/go-digest v1.0.0/go.mod h1:0JzlMkj0TRzQZfJkVvzbP0HBR3IKzErnv2BNG4W4MAM=
github.com/opencontainers/runc v1.0.3 h1:1hbqejyQWCJBvtKAfdO0b1FmaEf2z/bxnjqbARass5k=
github.com/opencontainers/runc v1.0.3/go.mod h1:aTaHFFwQXuA71CiyxOdFFIorAoemI04suvGRQFzWTD0=
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/runtime-spec v1.0.3-0.20220909204839-494a5a6aca78 h1:R5M2qXZiK/mWPMT4VldCOiSL9HIAMuxQZWdG0CSM5+4=
github.com/opencontainers/runtime-spec v1.0.3-0.20220909204839-494a5a6aca78/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/selinux v1.8.2/go.mod h1:MUIHuUEvKB1wtJjQdOyYRgOnLD2xAPP8dBsCoU0KuF8=
github.com/opencontainers/selinux v1.11.0 h1:+5Zbo97w3Lbmb3PeqQtpmTkMwsW5nRI3YaLpt7tQ7oU=
github.com/opencontainers/selinux v1.11.0/go.mod h1:E5dMC3VPuVvVHDYmi78qvhJp8+M586T4DlDRYpFkyec=
github.com/opencontainers/image-spec v1.1.0 h1:8SG7/vwALn54lVB/0yZ/MMwhFrPYtpEHQb2IpWsCzug=
github.com/opencontainers/image-spec v1.1.0/go.mod h1:W4s4sFTMaBeK1BQLXbG4AdM2szdn85PY75RI83NrTrM=
github.com/opencontainers/runc v1.2.1 h1:mQkmeFSUxqFaVmvIn1VQPeQIKpHFya5R07aJw0DKQa8=
github.com/opencontainers/runc v1.2.1/go.mod h1:/PXzF0h531HTMsYQnmxXkBD7YaGShm/2zcRB79dksUc=
github.com/opencontainers/runtime-spec v1.2.0 h1:z97+pHb3uELt/yiAWD691HNHQIF07bE7dzrbT927iTk=
github.com/opencontainers/runtime-spec v1.2.0/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/selinux v1.11.1 h1:nHFvthhM0qY8/m+vfhJylliSshm8G1jJ2jDMcgULaH8=
github.com/opencontainers/selinux v1.11.1/go.mod h1:E5dMC3VPuVvVHDYmi78qvhJp8+M586T4DlDRYpFkyec=
github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
@@ -238,13 +268,7 @@ github.com/prometheus/prometheus v0.39.1 h1:abZM6A+sKAv2eKTbRIaHq4amM/nT07MuxRm0
github.com/prometheus/prometheus v0.39.1/go.mod h1:GjQjgLhHMc0oo4Ko7qt/yBSJMY4hUoiAZwsYQgjaePA=
github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/seccomp/libseccomp-golang v0.9.1/go.mod h1:GbW5+tmTXfcxTToHLXlScSlAvWlF4P2Ca7zGrPiEpWo=
github.com/seccomp/libseccomp-golang v0.10.0 h1:aA4bp+/Zzi0BnWZ2F1wgNBs5gTpm+na2rWM6M9YjLpY=
github.com/seccomp/libseccomp-golang v0.10.0/go.mod h1:JA8cRccbGaA1s33RQf7Y1+q9gHmZX1yB/z9WDN1C6fg=
github.com/shurcooL/sanitized_anchor_name v1.0.0/go.mod h1:1NzhyTcUVG4SuEtjjoZeVRXNmyL/1OwPU0+IJeTBvfc=
github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
github.com/sirupsen/logrus v1.9.3 h1:dueUQJ1C2q9oE3F7wvmSGAaVtTmUizReu6fjN8uqzbQ=
github.com/sirupsen/logrus v1.9.3/go.mod h1:naHLuLoDiP4jHNo9R0sCBMtWGeIprob74mVsIT4qYEQ=
github.com/smartystreets/assertions v0.0.0-20180927180507-b2de0cb4f26d/go.mod h1:OnSkiWE9lh6wB0YB77sQom3nweQdgAjqCqsofrRNTgc=
@@ -255,12 +279,13 @@ github.com/spf13/cobra v1.8.1 h1:e5/vxKd/rZsfSJMUX1agtjeTDf+qv1/JdBF8gg5k9ZM=
github.com/spf13/cobra v1.8.1/go.mod h1:wHxEcudfqmLYa8iTfL+OuZPbBZkmvliBWKIezN3kD9Y=
github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
github.com/stoewer/go-strcase v1.2.0 h1:Z2iHWqGXH00XYgqDmNgQbIBxf3wrNq0F3feEy0ainaU=
github.com/stoewer/go-strcase v1.2.0/go.mod h1:IBiWB2sKIp3wVVQ3Y035++gc+knqhUQag1KpM8ahLw8=
github.com/stoewer/go-strcase v1.3.0 h1:g0eASXYtp+yvN9fK8sH94oCIk0fau9uV1/ZdJ0AVEzs=
github.com/stoewer/go-strcase v1.3.0/go.mod h1:fAH5hQ5pehh+j3nZfvwdk2RgEgQjAoM8wodgtPmh1xo=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/stretchr/objx v0.5.2 h1:xuMeJ0Sdp5ZMRXx/aWO6RZxdr3beISkG5/G/aIRr3pY=
github.com/stretchr/objx v0.5.2/go.mod h1:FRsXN1f5AsAjCGJKqEizvkpNtU+EGNCLh3NxZ/8L+MA=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA=
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
@@ -269,40 +294,35 @@ github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO
github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg=
github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/syndtr/gocapability v0.0.0-20200815063812-42c35b437635/go.mod h1:hkRG7XYTFWNJGYcbNJQlaLq0fg1yr4J4t/NcTQtrfww=
github.com/tmc/grpc-websocket-proxy v0.0.0-20220101234140-673ab2c3ae75 h1:6fotK7otjonDflCTK0BCfls4SPy3NcCVb5dqqmbRknE=
github.com/tmc/grpc-websocket-proxy v0.0.0-20220101234140-673ab2c3ae75/go.mod h1:KO6IkyS8Y3j8OdNO85qEYBsRPuteD+YciPomcXdrMnk=
github.com/urfave/cli v1.22.1/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0=
github.com/vishvananda/netlink v1.1.0/go.mod h1:cTgwzPIzzgDAYoQrMm0EdrjRUBkTqKYppBueQtXaqoE=
github.com/vishvananda/netlink v1.1.1-0.20210330154013-f5de75959ad5 h1:+UB2BJA852UkGH42H+Oee69djmxS3ANzl2b/JtT1YiA=
github.com/vishvananda/netlink v1.1.1-0.20210330154013-f5de75959ad5/go.mod h1:twkDnbuQxJYemMlGd4JFIcuhgX83tXhKS2B/PRMpOho=
github.com/vishvananda/netns v0.0.0-20191106174202-0a2b9b5464df/go.mod h1:JP3t17pCcGlemwknint6hfoeCVQrEMVwxRLRjXpq+BU=
github.com/vishvananda/netns v0.0.0-20200728191858-db3c7e526aae/go.mod h1:DD4vA1DwXk04H54A1oHXtwZmA0grkVMdPxx/VGLCah0=
github.com/vishvananda/netlink v1.3.1-0.20240905180732-b1ce50cfa9be h1:xdCMvyhnKzaepIUgVpUmTJo/+H1AQ7HuFYn1hv7/Neo=
github.com/vishvananda/netlink v1.3.1-0.20240905180732-b1ce50cfa9be/go.mod h1:i6NetklAujEcC6fK0JPjT8qSwWyO0HLn4UKG+hGqeJs=
github.com/vishvananda/netns v0.0.4 h1:Oeaw1EM2JMxD51g9uhtC0D7erkIjgmj8+JZc26m1YX8=
github.com/vishvananda/netns v0.0.4/go.mod h1:SpkAiCQRtJ6TvvxPnOSyH3BMl6unz3xZlaprSwhNNJM=
github.com/x448/float16 v0.8.4 h1:qLwI1I70+NjRFUR3zs1JPUCgaCXSh3SW62uAKT1mSBM=
github.com/x448/float16 v0.8.4/go.mod h1:14CWIYCyZA/cWjXOioeEpHeN/83MdbZDRQHoFcYsOfg=
github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2 h1:eY9dn8+vbi4tKz5Qo6v2eYzo7kUS51QINcR5jNpbZS8=
github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU=
github.com/xiang90/probing v0.0.0-20221125231312-a49e3df8f510 h1:S2dVYn90KE98chqDkyE9Z4N61UnQd+KOfgp5Iu53llk=
github.com/xiang90/probing v0.0.0-20221125231312-a49e3df8f510/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU=
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.3.5/go.mod h1:mwnBkeHKe2W/ZEtQ+71ViKU8L12m81fl3OWwC1Zlc8k=
go.etcd.io/bbolt v1.3.9 h1:8x7aARPEXiXbHmtUwAIv7eV2fQFHrLLavdiJ3uzJXoI=
go.etcd.io/bbolt v1.3.9/go.mod h1:zaO32+Ti0PK1ivdPtgMESzuzL2VPoIG1PCQNvOdo/dE=
go.etcd.io/etcd/api/v3 v3.5.14 h1:vHObSCxyB9zlF60w7qzAdTcGaglbJOpSj1Xj9+WGxq0=
go.etcd.io/etcd/api/v3 v3.5.14/go.mod h1:BmtWcRlQvwa1h3G2jvKYwIQy4PkHlDej5t7uLMUdJUU=
go.etcd.io/etcd/client/pkg/v3 v3.5.14 h1:SaNH6Y+rVEdxfpA2Jr5wkEvN6Zykme5+YnbCkxvuWxQ=
go.etcd.io/etcd/client/pkg/v3 v3.5.14/go.mod h1:8uMgAokyG1czCtIdsq+AGyYQMvpIKnSvPjFMunkgeZI=
go.etcd.io/etcd/client/v2 v2.305.13 h1:RWfV1SX5jTU0lbCvpVQe3iPQeAHETWdOTb6pxhd77C8=
go.etcd.io/etcd/client/v2 v2.305.13/go.mod h1:iQnL7fepbiomdXMb3om1rHq96htNNGv2sJkEcZGDRRg=
go.etcd.io/etcd/client/v3 v3.5.14 h1:CWfRs4FDaDoSz81giL7zPpZH2Z35tbOrAJkkjMqOupg=
go.etcd.io/etcd/client/v3 v3.5.14/go.mod h1:k3XfdV/VIHy/97rqWjoUzrj9tk7GgJGH9J8L4dNXmAk=
go.etcd.io/etcd/pkg/v3 v3.5.13 h1:st9bDWNsKkBNpP4PR1MvM/9NqUPfvYZx/YXegsYEH8M=
go.etcd.io/etcd/pkg/v3 v3.5.13/go.mod h1:N+4PLrp7agI/Viy+dUYpX7iRtSPvKq+w8Y14d1vX+m0=
go.etcd.io/etcd/raft/v3 v3.5.13 h1:7r/NKAOups1YnKcfro2RvGGo2PTuizF/xh26Z2CTAzA=
go.etcd.io/etcd/raft/v3 v3.5.13/go.mod h1:uUFibGLn2Ksm2URMxN1fICGhk8Wu96EfDQyuLhAcAmw=
go.etcd.io/etcd/server/v3 v3.5.13 h1:V6KG+yMfMSqWt+lGnhFpP5z5dRUj1BDRJ5k1fQ9DFok=
go.etcd.io/etcd/server/v3 v3.5.13/go.mod h1:K/8nbsGupHqmr5MkgaZpLlH1QdX1pcNQLAkODy44XcQ=
go.etcd.io/bbolt v1.3.11 h1:yGEzV1wPz2yVCLsD8ZAiGHhHVlczyC9d1rP43/VCRJ0=
go.etcd.io/bbolt v1.3.11/go.mod h1:dksAq7YMXoljX0xu6VF5DMZGbhYYoLUalEiSySYAS4I=
go.etcd.io/etcd/api/v3 v3.5.16 h1:WvmyJVbjWqK4R1E+B12RRHz3bRGy9XVfh++MgbN+6n0=
go.etcd.io/etcd/api/v3 v3.5.16/go.mod h1:1P4SlIP/VwkDmGo3OlOD7faPeP8KDIFhqvciH5EfN28=
go.etcd.io/etcd/client/pkg/v3 v3.5.16 h1:ZgY48uH6UvB+/7R9Yf4x574uCO3jIx0TRDyetSfId3Q=
go.etcd.io/etcd/client/pkg/v3 v3.5.16/go.mod h1:V8acl8pcEK0Y2g19YlOV9m9ssUe6MgiDSobSoaBAM0E=
go.etcd.io/etcd/client/v2 v2.305.16 h1:kQrn9o5czVNaukf2A2At43cE9ZtWauOtf9vRZuiKXow=
go.etcd.io/etcd/client/v2 v2.305.16/go.mod h1:h9YxWCzcdvZENbfzBTFCnoNumr2ax3F19sKMqHFmXHE=
go.etcd.io/etcd/client/v3 v3.5.16 h1:sSmVYOAHeC9doqi0gv7v86oY/BTld0SEFGaxsU9eRhE=
go.etcd.io/etcd/client/v3 v3.5.16/go.mod h1:X+rExSGkyqxvu276cr2OwPLBaeqFu1cIl4vmRjAD/50=
go.etcd.io/etcd/pkg/v3 v3.5.16 h1:cnavs5WSPWeK4TYwPYfmcr3Joz9BH+TZ6qoUtz6/+mc=
go.etcd.io/etcd/pkg/v3 v3.5.16/go.mod h1:+lutCZHG5MBBFI/U4eYT5yL7sJfnexsoM20Y0t2uNuY=
go.etcd.io/etcd/raft/v3 v3.5.16 h1:zBXA3ZUpYs1AwiLGPafYAKKl/CORn/uaxYDwlNwndAk=
go.etcd.io/etcd/raft/v3 v3.5.16/go.mod h1:P4UP14AxofMJ/54boWilabqqWoW9eLodl6I5GdGzazI=
go.etcd.io/etcd/server/v3 v3.5.16 h1:d0/SAdJ3vVsZvF8IFVb1k8zqMZ+heGcNfft71ul9GWE=
go.etcd.io/etcd/server/v3 v3.5.16/go.mod h1:ynhyZZpdDp1Gq49jkUg5mfkDWZwXnn3eIqCqtJnrD/s=
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.53.0 h1:9G6E0TXzGFVfTnawRzrPl83iHOAV7L8NJiR8RSGYV1g=
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.53.0/go.mod h1:azvtTADFQJA8mX80jIH/akaE7h+dbm/sVuaHqN13w74=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.53.0 h1:4K4tsIXefpVJtvA/8srF4V4y0akAoPHkIslgAkjixJA=
@@ -327,20 +347,20 @@ go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
go.uber.org/multierr v1.11.0 h1:blXXJkSxSSfBVBlC76pxqeO+LN3aDfLQo+309xJstO0=
go.uber.org/multierr v1.11.0/go.mod h1:20+QtiLqy0Nd6FdQB9TLXag12DsQkrbs3htMFfDN80Y=
go.uber.org/zap v1.26.0 h1:sI7k6L95XOKS281NhVKOFCUNIvv9e0w4BF8N3u+tCRo=
go.uber.org/zap v1.26.0/go.mod h1:dtElttAiwGvoJ/vj4IwHBS/gXsEu/pZ50mUIRWuG0so=
go.uber.org/zap v1.27.0 h1:aJMhYGrd5QSmlpLMr2MftRKl7t8J8PTZPA732ud/XR8=
go.uber.org/zap v1.27.0/go.mod h1:GB2qFLM7cTU87MWRP2mPIjqfIDnGu+VIO4V/SdhGo2E=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/crypto v0.24.0 h1:mnl8DM0o513X8fdIkmyFE/5hTYxbwYOjDS/+rK6qpRI=
golang.org/x/crypto v0.24.0/go.mod h1:Z1PMYSOR5nyMcyAVAIQSKCDwalqy85Aqn1x3Ws4L5DM=
golang.org/x/exp v0.0.0-20230515195305-f3d0a9c9a5cc h1:mCRnTeVUjcrhlRmO0VK8a6k6Rrf6TF9htwo2pJVSjIU=
golang.org/x/exp v0.0.0-20230515195305-f3d0a9c9a5cc/go.mod h1:V1LtkGg67GoY2N1AnLN78QLrzxkLyJw7RJb1gzOOz9w=
golang.org/x/crypto v0.37.0 h1:kJNSjF/Xp7kU0iB2Z+9viTPMW4EqqsrywMXLJOOsXSE=
golang.org/x/crypto v0.37.0/go.mod h1:vg+k43peMZ0pUMhYmVAWysMK35e6ioLh3wB8ZCAfbVc=
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 h1:2dVuKD2vS7b0QIHQbpyTISPd0LeHDbnYEryqj5Q1ug8=
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56/go.mod h1:M4RDyNAINzryxdtnbRXRL/OHtkFuWGRjvuhBJpk2IlY=
golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.4.2/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.17.0 h1:zY54UmvipHiNd+pm+m0x9KhZ9hl1/7QNMyxXbc6ICqA=
golang.org/x/mod v0.17.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.21.0 h1:vvrHzRwRfVKSiLrG+d4FMl/Qi4ukBCE6kZlTUkDYRT0=
golang.org/x/mod v0.21.0/go.mod h1:6SkKJ3Xj0I0BrPOZoBy3bdMptDDU9oJrpohJ3eWZ1fY=
golang.org/x/net v0.0.0-20180906233101-161cd47e91fd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
@@ -348,56 +368,49 @@ golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLL
golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200520004742-59133d7f0dd7/go.mod h1:qpuaurCH72eLCgpAm/N6yyVIVM9cpaDIP3A8BGJEC5A=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.0.0-20201224014010-6772e930b67b/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4/go.mod h1:p54w0d4576C0XHj96bSt6lcn1PtDYWL6XObtHCRCNQM=
golang.org/x/net v0.0.0-20210428140749-89ef3d95e781/go.mod h1:OJAsFXCWl8Ukc7SiCT/9KSuxbyM7479/AVlXFRxuMCk=
golang.org/x/net v0.26.0 h1:soB7SVo0PWrY4vPW/+ay0jKDNScG2X9wFeYlXIvJsOQ=
golang.org/x/net v0.26.0/go.mod h1:5YKkiSynbBIh3p6iOc/vibscux0x38BZDkn8sCUPxHE=
golang.org/x/oauth2 v0.21.0 h1:tsimM75w1tF/uws5rbeHzIWxEqElMehnc+iW793zsZs=
golang.org/x/oauth2 v0.21.0/go.mod h1:XYTD2NtWslqkgxebSiOHnXEap4TF09sJSc7H1sXbhtI=
golang.org/x/net v0.38.0 h1:vRMAPTMaeGqVhG5QyLJHqNDwecKTomGeqbnfZyKlBI8=
golang.org/x/net v0.38.0/go.mod h1:ivrbrMbzFq5J41QOQh0siUuly180yBYtLp+CKbEaFx8=
golang.org/x/oauth2 v0.23.0 h1:PbgcYx2W7i4LvjJWEbf0ngHV6qJYr86PkAV3bXdLEbs=
golang.org/x/oauth2 v0.23.0/go.mod h1:XYTD2NtWslqkgxebSiOHnXEap4TF09sJSc7H1sXbhtI=
golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.7.0 h1:YsImfSBoP9QPYL0xyKJPq0gcaJdG3rInoqxTWbfQu9M=
golang.org/x/sync v0.7.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.13.0 h1:AauUjRAJ9OSnvULf/ARrrVywoJDy0YS2AwQ98I37610=
golang.org/x/sync v0.13.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
golang.org/x/sys v0.0.0-20180909124046-d0be0721c37e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190606203320-7fc4e5ec1444/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190904154756-749cb33beabd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191005200804-aed5e4c7ecf9/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191115151921-52ab43148777/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191120155948-bd437916bb0e/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191204072324-ce4227a45e2e/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200217220822-9197077df867/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200728102440-3e129f6d46b1/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200909081042-eff7692f9009/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210112080510-489259a85091/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210124154548-22da62e12c0c/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210330210617-4fbd30eecc44/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210423082822-04245dca01da/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210426230700-d19ff857e887/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210510120138-977fb7262007/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20210616094352-59db8d763f22/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.21.0 h1:rF+pYz3DAGSQAxAu1CbC7catZg4ebC4UIeIhKxBZvws=
golang.org/x/sys v0.21.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.2.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.10.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.32.0 h1:s77OFDvIQeibCmezSnk/q6iAfkdiQaJi4VzroCFrN20=
golang.org/x/sys v0.32.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.21.0 h1:WVXCp+/EBEHOj53Rvu+7KiT/iElMrO8ACK16SMZ3jaA=
golang.org/x/term v0.21.0/go.mod h1:ooXLefLobQVslOqselCNF4SxFAaoS6KujMbsGzSDmX0=
golang.org/x/term v0.31.0 h1:erwDkOK1Msy6offm1mOgvspSkslFnIGsFnxOKoufg3o=
golang.org/x/term v0.31.0/go.mod h1:R4BeIy7D95HzImkxGkTW1UQTtP54tio2RyHz7PwK0aw=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.16.0 h1:a94ExnEXNtEwYLGJSIUxnWoxoRz/ZcCsV63ROupILh4=
golang.org/x/text v0.16.0/go.mod h1:GhwF1Be+LQoKShO3cGOHzqOgRrGaYc9AvblQOmPVHnI=
golang.org/x/time v0.3.0 h1:rg5rLMjNzMS1RkNLzCG38eapWhnYLFYXDXj2gOlr8j4=
golang.org/x/time v0.3.0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/text v0.24.0 h1:dd5Bzh4yt5KYA8f9CJHCP4FB4D51c2c6JvN37xJJkJ0=
golang.org/x/text v0.24.0/go.mod h1:L8rBsPeo2pSS+xqN0d5u2ikmjtmoJbDBT1b7nHvFCdU=
golang.org/x/time v0.7.0 h1:ntUhktv3OPE6TgYxXWv9vKvUSJyIFJlyohwbkEwPrKQ=
golang.org/x/time v0.7.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20190328211700-ab21143f2384/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
@@ -405,20 +418,20 @@ golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod h1:EkVYQZoAsY45+roY
golang.org/x/tools v0.0.0-20201224043029-2b0845dc783e/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.1.1/go.mod h1:o0xws9oXOQQZyjljx8fwUC0k7L1pTE6eaCbjGeHmOkk=
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d h1:vU5i/LfpvrRCpgM/VPfJLg5KjxD3E+hfT1SH+d9zLwg=
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d/go.mod h1:aiJjzUbINMkxbQROHiO6hDPo2LHcIPhhQsa9DLh0yGk=
golang.org/x/tools v0.26.0 h1:v/60pFQmzmT9ExmjDv2gGIfi3OqfKoEP6I5+umXlbnQ=
golang.org/x/tools v0.26.0/go.mod h1:TPVVj70c7JJ3WCazhD8OdXcZg/og+b9+tH/KxylGwH0=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
gomodules.xyz/jsonpatch/v2 v2.2.0 h1:4pT439QV83L+G9FkcCriY6EkpcK6r6bK+A5FBUMI7qY=
gomodules.xyz/jsonpatch/v2 v2.2.0/go.mod h1:WXp+iVDkoLQqPudfQ9GBlwB2eZ5DKOnjQZCYdOS8GPY=
google.golang.org/genproto v0.0.0-20230822172742-b8732ec3820d h1:VBu5YqKPv6XiJ199exd8Br+Aetz+o08F+PLMnwJQHAY=
google.golang.org/genproto v0.0.0-20230822172742-b8732ec3820d/go.mod h1:yZTlhN0tQnXo3h00fuXNCxJdLdIdnVFVBaRJ5LWBbw4=
google.golang.org/genproto/googleapis/api v0.0.0-20240528184218-531527333157 h1:7whR9kGa5LUwFtpLm2ArCEejtnxlGeLbAyjFY8sGNFw=
google.golang.org/genproto/googleapis/api v0.0.0-20240528184218-531527333157/go.mod h1:99sLkeliLXfdj2J75X3Ho+rrVCaJze0uwN7zDDkjPVU=
google.golang.org/genproto/googleapis/rpc v0.0.0-20240701130421-f6361c86f094 h1:BwIjyKYGsK9dMCBOorzRri8MQwmi7mT9rGHsCEinZkA=
google.golang.org/genproto/googleapis/rpc v0.0.0-20240701130421-f6361c86f094/go.mod h1:Ue6ibwXGpU+dqIcODieyLOcgj7z8+IcskoNIgZxtrFY=
google.golang.org/genproto v0.0.0-20240123012728-ef4313101c80 h1:KAeGQVN3M9nD0/bQXnr/ClcEMJ968gUXJQ9pwfSynuQ=
google.golang.org/genproto v0.0.0-20240123012728-ef4313101c80/go.mod h1:cc8bqMqtv9gMOr0zHg2Vzff5ULhhL2IXP4sbcn32Dro=
google.golang.org/genproto/googleapis/api v0.0.0-20240826202546-f6391c0de4c7 h1:YcyjlL1PRr2Q17/I0dPk2JmYS5CDXfcdb2Z3YRioEbw=
google.golang.org/genproto/googleapis/api v0.0.0-20240826202546-f6391c0de4c7/go.mod h1:OCdP9MfskevB/rbYvHTsXTtKC+3bHWajPdoKgjcYkfo=
google.golang.org/genproto/googleapis/rpc v0.0.0-20240826202546-f6391c0de4c7 h1:2035KHhUv+EpyB+hWgJnaWKJOdX1E95w2S8Rr4uWKTs=
google.golang.org/genproto/googleapis/rpc v0.0.0-20240826202546-f6391c0de4c7/go.mod h1:UqMtugtsSgubUsoxbuAoiCXvqvErP7Gf0so0mK9tHxU=
google.golang.org/grpc v1.57.0 h1:kfzNeI/klCGD2YPMUlaGNT3pxvYfga7smW3Vth8Zsiw=
google.golang.org/grpc v1.57.0/go.mod h1:Sd+9RMTACXwmub0zcNY2c4arhtrbBYD1AUHI/dt16Mo=
google.golang.org/protobuf v0.0.0-20200109180630-ec00e32a8dfd/go.mod h1:DFci5gLYBciE7Vtevhsrf46CRTquxDuWsQurQQe4oz8=
@ -429,8 +442,8 @@ google.golang.org/protobuf v1.21.0/go.mod h1:47Nbq4nVaFHyn7ilMalzfO3qCViNmqZ2kzi
google.golang.org/protobuf v1.23.0/go.mod h1:EGpADcykh3NcUnDUJcl1+ZksZNG86OlYog2l/sGQquU=
google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw=
google.golang.org/protobuf v1.26.0/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc=
google.golang.org/protobuf v1.34.2 h1:6xV6lTsCfpGD21XK49h7MhtcApnLqkfYgPcdHftf6hg=
google.golang.org/protobuf v1.34.2/go.mod h1:qYOHts0dSfpeUzUFpOMr/WGzszTmLH+DiWniOlNbLDw=
google.golang.org/protobuf v1.35.1 h1:m3LfL6/Ca+fqnjnlqQXNpFPABW1UD7mjh8KO2mKFytA=
google.golang.org/protobuf v1.35.1/go.mod h1:9fA7Ob0pmnwhb644+1+CVWFRbNajQ6iRojtC/QF5bRE=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
@ -445,70 +458,75 @@ gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 h1:uRGJdciOHaEIrze2W8Q3AKkep
gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7/go.mod h1:dt/ZhP58zS4L8KSrWDmTeBkI65Dw0HsyUHuEVlX15mw=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.3.0/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
k8s.io/api v0.31.1 h1:Xe1hX/fPW3PXYYv8BlozYqw63ytA92snr96zMW9gWTU=
k8s.io/api v0.31.1/go.mod h1:sbN1g6eY6XVLeqNsZGLnI5FwVseTrZX7Fv3O26rhAaI=
k8s.io/apiextensions-apiserver v0.31.3 h1:+GFGj2qFiU7rGCsA5o+p/rul1OQIq6oYpQw4+u+nciE=
k8s.io/apiextensions-apiserver v0.31.3/go.mod h1:2DSpFhUZZJmn/cr/RweH1cEVVbzFw9YBu4T+U3mf1e4=
k8s.io/apimachinery v0.31.3 h1:6l0WhcYgasZ/wk9ktLq5vLaoXJJr5ts6lkaQzgeYPq4=
k8s.io/apimachinery v0.31.3/go.mod h1:rsPdaZJfTfLsNJSQzNHQvYoTmxhoOEofxtOsF3rtsMo=
k8s.io/apiserver v0.31.3 h1:+1oHTtCB+OheqFEz375D0IlzHZ5VeQKX1KGXnx+TTuY=
k8s.io/apiserver v0.31.3/go.mod h1:PrxVbebxrxQPFhJk4powDISIROkNMKHibTg9lTRQ0Qg=
k8s.io/client-go v0.31.3 h1:CAlZuM+PH2cm+86LOBemaJI/lQ5linJ6UFxKX/SoG+4=
k8s.io/client-go v0.31.3/go.mod h1:2CgjPUTpv3fE5dNygAr2NcM8nhHzXvxB8KL5gYc3kJs=
k8s.io/cloud-provider v0.31.3 h1:7C3CHQUUwnv/HWWVIaibZH06iPg663RYQ6C6Zy4FnO8=
k8s.io/cloud-provider v0.31.3/go.mod h1:c7csKppoVb9Ej6upJ28AvHy4B3BtlRMzXfgezsDdPKw=
k8s.io/code-generator v0.31.3 h1:Pj0fYOBms+ZrsulLi4DMsCEx1jG8fWKRLy44onHsLBI=
k8s.io/code-generator v0.31.3/go.mod h1:/umCIlT84g1+Yu5ZXtP1KGSRTnGiIzzX5AzUAxsNlts=
k8s.io/component-base v0.31.3 h1:DMCXXVx546Rfvhj+3cOm2EUxhS+EyztH423j+8sOwhQ=
k8s.io/component-base v0.31.3/go.mod h1:xME6BHfUOafRgT0rGVBGl7TuSg8Z9/deT7qq6w7qjIU=
k8s.io/component-helpers v0.31.3 h1:0zGPD2PrekhFWgmz85XxlMEl7dfhlKC1tERZDe3onQc=
k8s.io/component-helpers v0.31.3/go.mod h1:HZ1HZx2TKXM7xSUV2cR9L5yDoyZPhhHQNaE3BPBLPUQ=
k8s.io/controller-manager v0.31.3 h1:TyUav69iNYwLGwA96JDhusoZoGRdh1sdrLjXmWTcPgs=
k8s.io/controller-manager v0.31.3/go.mod h1:yuhec+dbXmBz+4c32kxJxmcauB+1pjO2ttfYODWuv18=
k8s.io/cri-api v0.31.3 h1:dsZXzrGrCEwHjsTDlAV7rutEplpMLY8bfNRMIqrtXjo=
k8s.io/cri-api v0.31.3/go.mod h1:Po3TMAYH/+KrZabi7QiwQI4a692oZcUOUThd/rqwxrI=
k8s.io/cri-client v0.31.3 h1:9ZwddaNJomqkTBYQqSmB+Ccns3beY4HyYDwmRtWTCJM=
k8s.io/cri-client v0.31.3/go.mod h1:klbWiYkOatOQOkXOYZMZMGSTM8q9eC/efsYGuXcgPes=
k8s.io/csi-translation-lib v0.31.3 h1:hxcPRNdtEsk766jCXSKjgH1V8jUNx5tVqdooQ1Ars/M=
k8s.io/csi-translation-lib v0.31.3/go.mod h1:0B1gQwd868XUIDwJYy5gB2jDXWEwlcWvSsfcQEgzbRk=
k8s.io/gengo/v2 v2.0.0-20240228010128-51d4e06bde70 h1:NGrVE502P0s0/1hudf8zjgwki1X/TByhmAoILTarmzo=
k8s.io/gengo/v2 v2.0.0-20240228010128-51d4e06bde70/go.mod h1:VH3AT8AaQOqiGjMF9p0/IM1Dj+82ZwjfxUP1IxaHE+8=
k8s.io/api v0.32.2 h1:bZrMLEkgizC24G9eViHGOPbW+aRo9duEISRIJKfdJuw=
k8s.io/api v0.32.2/go.mod h1:hKlhk4x1sJyYnHENsrdCWw31FEmCijNGPJO5WzHiJ6Y=
k8s.io/apiextensions-apiserver v0.32.2 h1:2YMk285jWMk2188V2AERy5yDwBYrjgWYggscghPCvV4=
k8s.io/apiextensions-apiserver v0.32.2/go.mod h1:GPwf8sph7YlJT3H6aKUWtd0E+oyShk/YHWQHf/OOgCA=
k8s.io/apimachinery v0.32.2 h1:yoQBR9ZGkA6Rgmhbp/yuT9/g+4lxtsGYwW6dR6BDPLQ=
k8s.io/apimachinery v0.32.2/go.mod h1:GpHVgxoKlTxClKcteaeuF1Ul/lDVb74KpZcxcmLDElE=
k8s.io/apiserver v0.32.2 h1:WzyxAu4mvLkQxwD9hGa4ZfExo3yZZaYzoYvvVDlM6vw=
k8s.io/apiserver v0.32.2/go.mod h1:PEwREHiHNU2oFdte7BjzA1ZyjWjuckORLIK/wLV5goM=
k8s.io/client-go v0.32.2 h1:4dYCD4Nz+9RApM2b/3BtVvBHw54QjMFUl1OLcJG5yOA=
k8s.io/client-go v0.32.2/go.mod h1:fpZ4oJXclZ3r2nDOv+Ux3XcJutfrwjKTCHz2H3sww94=
k8s.io/cloud-provider v0.32.2 h1:8EC+fCYo0r0REczSjOZcVuQPCMxXxCKlgxDbYMrzC30=
k8s.io/cloud-provider v0.32.2/go.mod h1:2s8TeAXhVezp5VISaTxM6vW3yDonOZXoN4Aryz1p1PQ=
k8s.io/code-generator v0.32.2 h1:CIvyPrLWP7cMgrqval2qYT839YAwCDeSvGfXgWSNpHQ=
k8s.io/code-generator v0.32.2/go.mod h1:plh7bWk7JztAUkHM4zpbdy0KOMdrhsePcZL2HLWFH7Y=
k8s.io/component-base v0.32.2 h1:1aUL5Vdmu7qNo4ZsE+569PV5zFatM9hl+lb3dEea2zU=
k8s.io/component-base v0.32.2/go.mod h1:PXJ61Vx9Lg+P5mS8TLd7bCIr+eMJRQTyXe8KvkrvJq0=
k8s.io/component-helpers v0.32.2 h1:2usSAm3zNE5yu5DdAdrKBWLfSYNpU4OPjZywJY5ovP8=
k8s.io/component-helpers v0.32.2/go.mod h1:fvQAoiiOP7jUEUBc9qR0PXiBPuB0I56WTxTkkpcI8g8=
k8s.io/controller-manager v0.32.2 h1:/9XuHWEqofO2Aqa4l7KJGckJUcLVRWfx+qnVkdXoStI=
k8s.io/controller-manager v0.32.2/go.mod h1:o5uo2tLCQhuoMt0RfKcQd0eqaNmSKOKiT+0YELCqXOk=
k8s.io/cri-api v0.32.2 h1:7DuaOHpOcXweZeBUbRdK0iCroxctGp73VwgrA0u7kho=
k8s.io/cri-api v0.32.2/go.mod h1:DCzMuTh2padoinefWME0G678Mc3QFbLMF2vEweGzBAI=
k8s.io/cri-client v0.32.2 h1:vjowJUyu14IbmifqCKJHE9rK/BPSfkXvltqN42W1Zuo=
k8s.io/cri-client v0.32.2/go.mod h1:fRZhmmZW16Qviln8hfy+e8dd2wP/n9B6TiGxLE3zBe0=
k8s.io/csi-translation-lib v0.32.2 h1:aLzAyaoJUc5rgtLi8Xd4No1tet6UpvUsGIgRoGnPSSE=
k8s.io/csi-translation-lib v0.32.2/go.mod h1:PlOKan6Vc0G6a+giQbm36plJ+E1LH+GPRLAVMQMSMcY=
k8s.io/dynamic-resource-allocation v0.32.2 h1:6wP8/GGvhhvTJLrzwPSoMJDnspmosFj1CKmfrAH6m5U=
k8s.io/dynamic-resource-allocation v0.32.2/go.mod h1:+3qnQfvikLHVZrdZ0/gYkRiV96weUR9j7+Ph3Ui/hYU=
k8s.io/gengo/v2 v2.0.0-20240911193312-2b36238f13e9 h1:si3PfKm8dDYxgfbeA6orqrtLkvvIeH8UqffFJDl0bz4=
k8s.io/gengo/v2 v2.0.0-20240911193312-2b36238f13e9/go.mod h1:EJykeLsmFC60UQbYJezXkEsG2FLrt0GPNkU5iK5GWxU=
k8s.io/klog/v2 v2.130.1 h1:n9Xl7H1Xvksem4KFG4PYbdQCQxqc/tTUyrgXaOhHSzk=
k8s.io/klog/v2 v2.130.1/go.mod h1:3Jpz1GvMt720eyJH1ckRHK1EDfpxISzJ7I9OYgaDtPE=
k8s.io/kms v0.31.3 h1:XCFmiJn5CCKs8xoOLpCmu42Ubm/KW85wNHybGFcSAYc=
k8s.io/kms v0.31.3/go.mod h1:OZKwl1fan3n3N5FFxnW5C4V3ygrah/3YXeJWS3O6+94=
k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340 h1:BZqlfIlq5YbRMFko6/PM7FjZpUb45WallggurYhKGag=
k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340/go.mod h1:yD4MZYeKMBwQKVht279WycxKyM84kkAx2DPrTXaeb98=
k8s.io/kube-scheduler v0.31.3 h1:indE2jtvbwyyEYDrQMjPR2EmnoM1pd4L6cK8r2/KY0I=
k8s.io/kube-scheduler v0.31.3/go.mod h1:vhDtABcWycLi8139K5ScOg54WKlbEZJRgtQmLhW0Wjo=
k8s.io/kubelet v0.31.3 h1:DIXRAmvVGp42mV2vpA1GCLU6oO8who0/vp3Oq6kSpbI=
k8s.io/kubelet v0.31.3/go.mod h1:KSdbEfNy5VzqUlAHlytA/fH12s+sE1u8fb/8JY9sL/8=
k8s.io/kubernetes v1.31.3 h1:oqb7HdfnTelrGlZ6ziNugvQ/L/aJWR704114EAhUn9Q=
k8s.io/kubernetes v1.31.3/go.mod h1:9xmT2buyTYj8TRKwRae7FcuY8k5+xlxv7VivvO0KKfs=
k8s.io/metrics v0.31.3 h1:DkT9I3gFlb2/z+/4BMY7WrQ/PnbukuV4Yli82v/KBCM=
k8s.io/metrics v0.31.3/go.mod h1:2w9gpd8z+13oJmaPR6p3kDyrDqnxSyoKpnOw2qLIdhI=
k8s.io/mount-utils v0.31.3 h1:CANy3prUYvvDCc2X7ZKgpjpDhAidx4gjGh/WwDrCPq8=
k8s.io/mount-utils v0.31.3/go.mod h1:HV/VYBUGqYUj4vt82YltzpWvgv8FPg0G9ItyInT3NPU=
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8 h1:pUdcCO1Lk/tbT5ztQWOBi5HBgbBP1J8+AsQnQCKsi8A=
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.30.3 h1:2770sDpzrjjsAtVhSeUFseziht227YAWYHLGNM8QPwY=
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.30.3/go.mod h1:Ve9uj1L+deCXFrPOk1LpFXqTg7LCFzFso6PA48q/XZw=
k8s.io/kms v0.32.2 h1:7Ff23ht7W40gTcDwUC8G5WjX5W/nxD8WxbNhIYYNZCI=
k8s.io/kms v0.32.2/go.mod h1:Bk2evz/Yvk0oVrvm4MvZbgq8BD34Ksxs2SRHn4/UiOM=
k8s.io/kube-openapi v0.0.0-20241105132330-32ad38e42d3f h1:GA7//TjRY9yWGy1poLzYYJJ4JRdzg3+O6e8I+e+8T5Y=
k8s.io/kube-openapi v0.0.0-20241105132330-32ad38e42d3f/go.mod h1:R/HEjbvWI0qdfb8viZUeVZm0X6IZnxAydC7YU42CMw4=
k8s.io/kube-scheduler v0.32.2 h1:vBm6iIjWaD10OPmtkt/503LTKvrN8dWVceeBcpKj/ns=
k8s.io/kube-scheduler v0.32.2/go.mod h1:dD5yuYpnsCfgZmzvncUNPdvXGJXA1hw3gXq7DH3+aCQ=
k8s.io/kubectl v0.32.2 h1:TAkag6+XfSBgkqK9I7ZvwtF0WVtUAvK8ZqTt+5zi1Us=
k8s.io/kubectl v0.32.2/go.mod h1:+h/NQFSPxiDZYX/WZaWw9fwYezGLISP0ud8nQKg+3g8=
k8s.io/kubelet v0.32.2 h1:WFTSYdt3BB1aTApDuKNI16x/4MYqqX8WBBBBh3KupDg=
k8s.io/kubelet v0.32.2/go.mod h1:cC1ms5RS+lu0ckVr6AviCQXHLSPKEBC3D5oaCBdTGkI=
k8s.io/kubernetes v1.32.2 h1:mShetlA102UpjRVSGzB+5vjJwy8oPy8FMWrkTH5f37o=
k8s.io/kubernetes v1.32.2/go.mod h1:tiIKO63GcdPRBHW2WiUFm3C0eoLczl3f7qi56Dm1W8I=
k8s.io/metrics v0.32.2 h1:7t/rZzTHFrGa9f94XcgLlm3ToAuJtdlHANcJEHlYl9g=
k8s.io/metrics v0.32.2/go.mod h1:VL3nJpzcgB6L5nSljkkzoE0nilZhVgcjCfNRgoylaIQ=
k8s.io/mount-utils v0.32.2 h1:aDwp+ucWiVnDr/LpRg88/dsXf/vm6gI1VZkYH3+3+Vw=
k8s.io/mount-utils v0.32.2/go.mod h1:Kun5c2svjAPx0nnvJKYQWhfeNW+O0EpzHgRhDcYoSY0=
k8s.io/pod-security-admission v0.32.2 h1:zDfAb/t0LbNU3z0ZMHtCb1zp8x05gWCGhmBYpUptm9A=
k8s.io/pod-security-admission v0.32.2/go.mod h1:yxMPB3i1pGMLfxbe4BiWMuowMD7cdHR32y4nCj4wH+s=
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738 h1:M3sRQVHv7vB20Xc2ybTt7ODCeFj6JSWYFzOFnYeS6Ro=
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.31.0 h1:CPT0ExVicCzcpeN4baWEV2ko2Z/AsiZgEdwgcfwLgMo=
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.31.0/go.mod h1:Ve9uj1L+deCXFrPOk1LpFXqTg7LCFzFso6PA48q/XZw=
sigs.k8s.io/controller-runtime v0.13.0 h1:iqa5RNciy7ADWnIc8QxCbOX5FEKVR3uxVxKHRMc2WIQ=
sigs.k8s.io/controller-runtime v0.13.0/go.mod h1:Zbz+el8Yg31jubvAEyglRZGdLAjplZl+PgtYNI6WNTI=
sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd h1:EDPBXCAspyGV4jQlpZSudPeMmr1bNJefnuqLsRAsHZo=
sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd/go.mod h1:B8JuhiUyNFVKdsE8h686QcCxMaH6HrOAZj4vswFpcB0=
sigs.k8s.io/structured-merge-diff/v4 v4.4.1 h1:150L+0vs/8DA78h1u02ooW1/fFq/Lwr+sGiqlzvrtq4=
sigs.k8s.io/structured-merge-diff/v4 v4.4.1/go.mod h1:N8hJocpFajUSSeSJ9bOZ77VzejKZaXsTtZo4/u7Io08=
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 h1:/Rv+M11QRah1itp8VhT6HoVx1Ray9eB4DBr+K+/sCJ8=
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3/go.mod h1:18nIHnGi6636UCz6m8i4DhaJ65T6EruyzmoQqI2BVDo=
sigs.k8s.io/structured-merge-diff/v4 v4.4.2 h1:MdmvkGuXi/8io6ixD5wud3vOLwc1rj0aNqRlpuvjmwA=
sigs.k8s.io/structured-merge-diff/v4 v4.4.2/go.mod h1:N8f93tFZh9U6vpxwRArLiikrE5/2tiu1w1AGfACIGE4=
sigs.k8s.io/yaml v1.4.0 h1:Mk1wCc2gy/F0THH0TAp1QYyJNzRm2KCLy3o5ASXVI5E=
sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY=
stathat.com/c/consistent v1.0.0 h1:ezyc51EGcRPJUxfHGSgJjWzJdj3NiMU9pNfLNGiXV0c=
stathat.com/c/consistent v1.0.0/go.mod h1:QkzMWzcbB+yQBL2AttO6sgsQS/JSTapcDISJalmCDS0=
volcano.sh/apis v1.10.0-alpha.0.0.20241210014034-bf27f4e986d0 h1:qcQNg8mEsXU+7YYX6hff9JT+jDj2RJB4aEGwOoWwjBY=
volcano.sh/apis v1.10.0-alpha.0.0.20241210014034-bf27f4e986d0/go.mod h1:FOdmG++9+8lgENJ9XXDh+O3Jcb9YVRnlMSpgIh3NSVI=
volcano.sh/apis v1.12.1 h1:yq5dVj/g21vnWObCIKsJKPhMoThpzDrHDD/GMouYVxk=
volcano.sh/apis v1.12.1/go.mod h1:0XNNnIOevJSYNiXRmwhXUrYCcCcWcBeTY0nxrlkk03A=

@ -20,24 +20,25 @@ set -o pipefail
VK_ROOT=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/..
export RELEASE_FOLDER=${VK_ROOT}/${RELEASE_DIR}
export RELEASE_TAG=${RELEASE_TAG:-"latest"}
if ! diff ${VK_ROOT}/installer/volcano-development.yaml ${RELEASE_FOLDER}/volcano-latest.yaml ; then
if ! diff ${VK_ROOT}/installer/volcano-development.yaml ${RELEASE_FOLDER}/volcano-${RELEASE_TAG}.yaml ; then
{
echo
echo "The Generated yaml is different from the one in installer/volcano-development.yaml"
echo "please run 'make generate-yaml TAG=latest RELEASE_DIR=installer \
&& mv ${VK_ROOT}/installer/volcano-latest.yaml ${VK_ROOT}/installer/volcano-development.yaml' to update"
echo "please run 'make generate-yaml RELEASE_TAG=${RELEASE_TAG} RELEASE_DIR=installer \
&& mv ${VK_ROOT}/installer/volcano-${RELEASE_TAG}.yaml ${VK_ROOT}/installer/volcano-development.yaml' to update"
echo
} >&2
false
fi
if ! diff ${VK_ROOT}/installer/volcano-agent-development.yaml ${RELEASE_FOLDER}/volcano-agent-latest.yaml ; then
if ! diff ${VK_ROOT}/installer/volcano-agent-development.yaml ${RELEASE_FOLDER}/volcano-agent-${RELEASE_TAG}.yaml ; then
{
echo
echo "The Generated yaml is different from the one in installer/volcano-agent-development.yaml"
echo "please run 'make generate-yaml TAG=latest RELEASE_DIR=installer \
&& mv ${VK_ROOT}/installer/volcano-agent-latest.yaml ${VK_ROOT}/installer/volcano-agent-development.yaml' to update"
echo "please run 'make generate-yaml RELEASE_TAG=${RELEASE_TAG} RELEASE_DIR=installer \
&& mv ${VK_ROOT}/installer/volcano-agent-${RELEASE_TAG}.yaml ${VK_ROOT}/installer/volcano-agent-development.yaml' to update"
echo
} >&2
false

@ -1,15 +1,29 @@
# this config file contains all config fields with comments
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
DynamicResourceAllocation: true
DRAResourceClaimDeviceStatus: true
containerdConfigPatches:
# Enable CDI as described in
# https://tags.cncf.io/container-device-interface#containerd-configuration
- |-
[plugins."io.containerd.grpc.v1.cri"]
enable_cdi = true
# 1 control plane node and 4 workers
nodes:
# the control plane node config
- role: control-plane
kubeadmConfigPatches:
- |
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: "50Mi"
- |
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: "50Mi"
- |
kind: ClusterConfiguration
apiServer:
extraArgs:
runtime-config: "resource.k8s.io/v1beta1=true"
# the four workers
- role: worker
- role: worker

@ -27,6 +27,10 @@ export RELEASE_FOLDER=${VK_ROOT}/${RELEASE_DIR}
export HELM_VER=${HELM_VER:-v3.6.3}
export VOLCANO_CHART_VERSION=${TAG:-"latest"}
export VOLCANO_IMAGE_TAG=${VOLCANO_CHART_VERSION}
# Add a v prefix to VOLCANO_IMAGE_TAG if it doesn't have one because v is removed in .github/workflows/release.yaml.
if [[ ! ${VOLCANO_IMAGE_TAG} =~ ^v ]]; then
VOLCANO_IMAGE_TAG="v${VOLCANO_IMAGE_TAG}"
fi
LOCAL_OS=${OSTYPE}
case $LOCAL_OS in
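
The prefix check added in this hunk can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper name `normalize_image_tag` that does not exist in the repository; it uses a POSIX `case` in place of the `[[ ... =~ ^v ]]` test, which should behave the same for this purpose.

```shell
# Hypothetical helper, not part of the Volcano repository: it mirrors the
# prefix check from the hunk above using a POSIX `case` pattern.
normalize_image_tag() {
  tag="$1"
  case "$tag" in
    v*) ;;                 # already starts with "v": leave unchanged
    *)  tag="v${tag}" ;;   # otherwise prepend "v"
  esac
  printf '%s\n' "$tag"
}

normalize_image_tag "1.12.0"
normalize_image_tag "v1.12.0"
```

Under this check any value that does not begin with `v` gains the prefix, so a value such as `latest` would also be rewritten.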

@ -41,7 +41,7 @@ case $CRD_VERSION in
v1beta1)
;;
*)
echo Invaild CRD_VERSION $CRD_VERSION !!!
echo Invalid CRD_VERSION $CRD_VERSION !!!
echo CRD_VERSION only support \"bases\", \"v1\" and \"v1beta1\"
exit 1
;;
@ -93,6 +93,7 @@ tail -n +2 ${VOLCANO_CRD_DIR}/bases/bus.volcano.sh_commands.yaml > ${HELM_VOLCAN
tail -n +2 ${VOLCANO_CRD_DIR}/bases/scheduling.volcano.sh_podgroups.yaml > ${HELM_VOLCANO_CRD_DIR}/bases/scheduling.volcano.sh_podgroups.yaml
tail -n +2 ${VOLCANO_CRD_DIR}/bases/scheduling.volcano.sh_queues.yaml > ${HELM_VOLCANO_CRD_DIR}/bases/scheduling.volcano.sh_queues.yaml
tail -n +2 ${VOLCANO_CRD_DIR}/bases/nodeinfo.volcano.sh_numatopologies.yaml > ${HELM_VOLCANO_CRD_DIR}/bases/nodeinfo.volcano.sh_numatopologies.yaml
tail -n +2 ${VOLCANO_CRD_DIR}/bases/topology.volcano.sh_hypernodes.yaml > ${HELM_VOLCANO_CRD_DIR}/bases/topology.volcano.sh_hypernodes.yaml
# sync jobflow bases
tail -n +2 ${JOBFLOW_CRD_DIR}/bases/flow.volcano.sh_jobflows.yaml > ${HELM_JOBFLOW_CRD_DIR}/bases/flow.volcano.sh_jobflows.yaml
@ -136,6 +137,7 @@ ${HELM_BIN_DIR}/helm template ${VK_ROOT}/installer/helm/chart/volcano --namespac
-s templates/scheduling_v1beta1_podgroup.yaml \
-s templates/scheduling_v1beta1_queue.yaml \
-s templates/nodeinfo_v1alpha1_numatopologies.yaml \
-s templates/topology_v1alpha1_hypernodes.yaml \
-s templates/webhooks.yaml \
>> ${DEPLOYMENT_FILE}

@ -69,7 +69,7 @@ function check-kind {
which kind >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "Installing kind ..."
GOOS=${OS} go install sigs.k8s.io/kind@v0.24.0
GOOS=${OS} go install sigs.k8s.io/kind@v0.26.0
else
echo -n "Found kind, version: " && kind version
fi

hack/oss-fuzz-build.sh Executable file

@ -0,0 +1,6 @@
#!/bin/bash -eu
printf "package job\nimport _ \"github.com/AdamKorcz/go-118-fuzz-build/testing\"\n" >"$SRC"/volcano/pkg/controllers/job/register.go
go mod tidy
compile_native_go_fuzzer volcano.sh/volcano/pkg/controllers/job FuzzApplyPolicies FuzzApplyPolicies
compile_native_go_fuzzer volcano.sh/volcano/pkg/controllers/job FuzzCreateJobPod FuzzCreateJobPod

@ -62,20 +62,31 @@ basic:
crd_version: ${crd_version}
custom:
scheduler_log_level: 5
admission_tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
- key: "node-role.kubernetes.io/master"
operator: "Exists"
effect: "NoSchedule"
controller_tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
- key: "node-role.kubernetes.io/master"
operator: "Exists"
effect: "NoSchedule"
scheduler_tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
- key: "node-role.kubernetes.io/master"
operator: "Exists"
effect: "NoSchedule"
default_ns:
node-role.kubernetes.io/control-plane: ""
scheduler_feature_gates: ${FEATURE_GATES}
EOF
}
@ -141,30 +152,35 @@ case ${E2E_TYPE} in
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress ./test/e2e/schedulingbase/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress ./test/e2e/schedulingaction/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress ./test/e2e/vcctl/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress --focus="DRA E2E Test" ./test/e2e/dra/
;;
"JOBP")
echo "Running parallel job e2e suite..."
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --nodes=4 --compilers=4 --randomize-all --randomize-suites --fail-on-pending --cover --trace --race --slow-spec-threshold='30s' --progress ./test/e2e/jobp/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -v -r --nodes=4 --compilers=4 --randomize-all --randomize-suites --fail-on-pending --cover --trace --race --slow-spec-threshold='30s' --progress ./test/e2e/jobp/
;;
"JOBSEQ")
echo "Running sequence job e2e suite..."
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress ./test/e2e/jobseq/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -v -r --slow-spec-threshold='30s' --progress ./test/e2e/jobseq/
;;
"SCHEDULINGBASE")
echo "Running scheduling base e2e suite..."
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress ./test/e2e/schedulingbase/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -v -r --slow-spec-threshold='30s' --progress ./test/e2e/schedulingbase/
;;
"SCHEDULINGACTION")
echo "Running scheduling action e2e suite..."
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress ./test/e2e/schedulingaction/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -v -r --slow-spec-threshold='30s' --progress ./test/e2e/schedulingaction/
;;
"VCCTL")
echo "Running vcctl e2e suite..."
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress ./test/e2e/vcctl/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -v -r --slow-spec-threshold='30s' --progress ./test/e2e/vcctl/
;;
"STRESS")
echo "Running stress e2e suite..."
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -r --slow-spec-threshold='30s' --progress ./test/e2e/stress/
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -v -r --slow-spec-threshold='30s' --progress ./test/e2e/stress/
;;
"DRA")
echo "Running dra e2e suite..."
KUBECONFIG=${KUBECONFIG} GOOS=${OS} ginkgo -v -r --slow-spec-threshold='30s' --progress --focus="DRA E2E Test" ./test/e2e/dra/
;;
esac

@ -25,7 +25,7 @@ CHECK_GIT_REMOTE=${CHECK_GIT_REMOTE:-true}
# Useful Defaults
## Target github organization name
TARGET_ORG=${TARGET_ORG:-"volcano-sh"}
## main repo uptream configs
## main repo upstream configs
UPSTREAM=${UPSTREAM:-"upstream"}
UPSTREAM_HEAD=${UPSTREAM_HEAD:-"master"}
UPSTREAM_REPO_NAME=${UPSTREAM_REPO_NAME:-"volcano"}

@ -163,7 +163,7 @@ cd "${LICENSE_ROOT}"
kube::util::ensure-temp-dir
# Save the genreated LICENSE file for each package temporarily
# Save the generated LICENSE file for each package temporarily
TMP_LICENSE_FILE="${KUBE_TEMP}/LICENSES.$$"
# The directory to save all the LICENSE files

@ -27,7 +27,7 @@ function check_golangci-lint() {
command -v golangci-lint >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "installing golangci-lint ."
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.54.2
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.60.3
if [[ $? -ne 0 ]]; then
echo "golangci-lint installed failed, exiting."
exit 1

@ -100,6 +100,7 @@ The following are the list configurable parameters of Volcano Chart and their de
|`custom.leader_elect_enable`|Whether to Enable leader elect|`false`|
|`custom.admission_config_override`|Override admission configmap|`~`|
|`custom.scheduler_config_override`|Override scheduler configmap|`~`|
| `custom.controller_config_override`| Override controller configmap|`~`|
|`custom.default_affinity`|Default affinity for Admission/Controller/Scheduler pods|`~`|
|`custom.admission_affinity`|Affinity for Admission pods|`~`|
|`custom.controller_affinity`|Affinity for Controller pods|`~`|
@ -130,6 +131,7 @@ The following are the list configurable parameters of Volcano Chart and their de
|`custom.controller_log_level`|Settings log print level for Controller|`4`|
|`custom.scheduler_resources`|Resources for Scheduler pods|`~`|
|`custom.scheduler_log_level`|Settings log print level for Scheduler|`3`|
|`custom.scheduler_plugins_dir`| Settings dir for the Scheduler to load custom plugins|``|
|`custom.webhooks_namespace_selector_expressions`|Additional namespace selector expressions for Volcano admission webhooks|`~`|
|`service.ipFamilyPolicy`|Settings service the family policy|``|
|`service.ipFamilies`|Settings service the address families|`[]`|

@ -14,6 +14,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
VOLCANO_AGENT_LOG_DIR="/var/log/volcano/agent"
VOLCANO_AGENT_LOG_PATH="${VOLCANO_AGENT_LOG_DIR}/volcano-agent.log"
NETWORK_QOS_LOG_PATH="${VOLCANO_AGENT_LOG_DIR}/network-qos.log"
@ -53,5 +55,9 @@ touch ${VOLCANO_AGENT_LOG_PATH}
touch ${NETWORK_QOS_LOG_PATH}
touch ${NETWORK_QOS_TOOLS_LOG_PATH}
chmod 750 ${VOLCANO_AGENT_LOG_DIR}
chown -R 1000:1000 ${VOLCANO_AGENT_LOG_DIR}
chmod 640 ${VOLCANO_AGENT_LOG_DIR}/*.log
set_memory_qos_enabled
set_sched_prio_load_balance_enabled

@ -15,7 +15,7 @@
ARG OPEN_EULER_IMAGE_TAG
ARG BWM_RPM_NAME
FROM golang:1.22.2 AS builder
FROM golang:1.23.7 AS builder
WORKDIR /go/src/volcano.sh/
COPY go.mod go.sum ./
RUN go mod download
@ -29,10 +29,14 @@ RUN yum install -y cpio && \
rpm2cpio $(ls | grep oncn-bwm) | cpio -div
FROM alpine:latest
RUN apk add sudo
RUN apk add sudo libcap
COPY --from=builder /go/src/volcano.sh/volcano/_output/bin/vc-agent /vc-agent
COPY --from=builder /go/src/volcano.sh/volcano/_output/bin/network-qos \
/go/src/volcano.sh/volcano/installer/build/volcano-agent/install.sh /usr/local/bin/
COPY --from=repo /usr/share/bwmcli/bwm_tc.o /usr/local/bin/
RUN chmod +x /usr/local/bin/install.sh
RUN adduser -u 1000 -D appuser
RUN chmod +x /usr/local/bin/install.sh \
&& setcap "cap_dac_override=eip" /vc-agent \
&& setcap "cap_dac_override=eip" /usr/local/bin/network-qos \
&& echo -e '%appuser ALL=(root) NOPASSWD: /bin/cp -f /usr/local/bin/network-qos /opt/cni/bin\n%appuser ALL=(root) NOPASSWD: /bin/cp -f /usr/local/bin/bwm_tc.o /usr/share/bwmcli' >> /etc/sudoers

@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
FROM golang:1.22.2 AS builder
FROM golang:1.23.7 AS builder
WORKDIR /go/src/volcano.sh/
COPY go.mod go.sum ./
RUN go mod download

@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
FROM golang:1.22.2 AS builder
FROM golang:1.23.7 AS builder
WORKDIR /go/src/volcano.sh/
COPY go.mod go.sum ./
RUN go mod download

@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
FROM golang:1.22.2 AS builder
FROM golang:1.23.7 AS builder
WORKDIR /go/src/volcano.sh/
COPY go.mod go.sum ./
RUN go mod download
@ -20,14 +20,14 @@ ADD . volcano
RUN cd volcano && make vc-webhook-manager
FROM alpine:latest
ARG KUBE_VERSION="1.31.0"
ARG KUBE_VERSION="1.32.0"
ARG TARGETARCH
ARG APK_MIRROR
RUN if [[ -n "$APK_MIRROR" ]]; then sed -i "s@https://dl-cdn.alpinelinux.org@${APK_MIRROR}@g" /etc/apk/repositories ; fi && \
apk add --update ca-certificates && \
apk add --update openssl && \
apk add --update -t deps curl && \
curl -L https://storage.googleapis.com/kubernetes-release/release/v$KUBE_VERSION/bin/linux/$TARGETARCH/kubectl -o /usr/local/bin/kubectl && \
curl -L https://dl.k8s.io/release/v$KUBE_VERSION/bin/linux/$TARGETARCH/kubectl -o /usr/local/bin/kubectl && \
chmod +x /usr/local/bin/kubectl && \
apk del --purge deps && \
rm /var/cache/apk/*

@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: jobflows.flow.volcano.sh
spec:
group: flow.volcano.sh

@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: jobtemplates.flow.volcano.sh
spec:
group: flow.volcano.sh
@ -38,6 +38,18 @@ spec:
format: int32
minimum: 1
type: integer
networkTopology:
properties:
highestTierAllowed:
default: 1
type: integer
mode:
default: hard
enum:
- hard
- soft
type: string
type: object
plugins:
additionalProperties:
items:
@ -2798,6 +2810,39 @@ spec:
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
resources:
properties:
claims:
items:
properties:
name:
type: string
request:
type: string
required:
- name
type: object
type: array
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
limits:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
type: object
requests:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
type: object
type: object
restartPolicy:
type: string
runtimeClassName:
@ -2840,6 +2885,8 @@ spec:
runAsUser:
format: int64
type: integer
seLinuxChangePolicy:
type: string
seLinuxOptions:
properties:
level:

@ -11,6 +11,8 @@ tiers:
- name: drf
enablePreemptable: false
- name: predicates
arguments:
predicate.DynamicResourceAllocationEnable: true
- name: proportion
- name: nodeorder
- name: binpack

@ -2,7 +2,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.16.4
controller-gen.kubebuilder.io/version: v0.17.0
name: jobs.batch.volcano.sh
spec:
group: batch.volcano.sh
@ -56,6 +56,18 @@ spec:
format: int32
minimum: 1
type: integer
networkTopology:
properties:
highestTierAllowed:
default: 1
type: integer
mode:
default: hard
enum:
- hard
- soft
type: string
type: object
plugins:
additionalProperties:
items:
@ -2816,6 +2828,39 @@ spec:
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
resources:
properties:
claims:
items:
properties:
name:
type: string
request:
type: string
required:
- name
type: object
type: array
x-kubernetes-list-map-keys:
- name
x-kubernetes-list-type: map
limits:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
type: object
requests:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
type: object
type: object
restartPolicy:
type: string
runtimeClassName:
@ -2858,6 +2903,8 @@ spec:
runAsUser:
format: int64
type: integer
seLinuxChangePolicy:
type: string
seLinuxOptions:
properties:
level:

Some files were not shown because too many files have changed in this diff.