Compare commits

...

11 Commits

Author SHA1 Message Date
Yi Chen ef9a2a134b
Release v2.0.2 (#2233)
* FEATURE: add cli argument to modify controller workqueue ratelimiter (#2186)

* add cli argument to modify controller workqueue ratelimiter

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

* add cli argument to modify controller workqueue ratelimiter support to helm chart

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

---------

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
(cherry picked from commit d37a0e938a)
Signed-off-by: Yi Chen <github@chenyicn.net>
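
The new flags are also wired into the Helm chart; a minimal values.yaml sketch, with key names assumed from this PR rather than taken from the released chart:

# Hypothetical values.yaml fragment: tune the controller workqueue rate limiter.
controller:
  workqueueRateLimiter:
    bucketQPS: 50      # assumed key: token bucket refill rate
    bucketSize: 500    # assumed key: token bucket capacity
    maxDelay:
      enable: true
      duration: 6h     # assumed key: upper bound on requeue backoff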

* Fix ingress capability discovery (#2201)

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit 56b4974310)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump github.com/aws/aws-sdk-go-v2 from 1.30.5 to 1.31.0 (#2207)

Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.30.5 to 1.31.0.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.30.5...v1.31.0)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit faa0822ad0)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump golang.org/x/net from 0.28.0 to 0.29.0 (#2205)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.28.0 to 0.29.0.
- [Commits](https://github.com/golang/net/compare/v0.28.0...v0.29.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 6106178c5f)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump github.com/docker/docker from 27.0.3+incompatible to 27.1.1+incompatible (#2125)

Bumps [github.com/docker/docker](https://github.com/docker/docker) from 27.0.3+incompatible to 27.1.1+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v27.0.3...v27.1.1)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 316536f7b5)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.58.3 to 1.63.3 (#2206)

Bumps [github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2) from 1.58.3 to 1.63.3.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/service/s3/v1.58.3...service/s3/v1.63.3)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/s3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 1972fb75d2)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Update integration test workflow and add golangci lint check (#2197)

* Update integration test workflow

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update golangci lint config

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 143b16ee75)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump github.com/aws/aws-sdk-go-v2 from 1.31.0 to 1.32.0 (#2229)

Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.31.0 to 1.32.0.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.31.0...v1.32.0)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit a4dcfcb328)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump cloud.google.com/go/storage from 1.43.0 to 1.44.0 (#2228)

Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.43.0 to 1.44.0.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](https://github.com/googleapis/google-cloud-go/compare/pubsub/v1.43.0...spanner/v1.44.0)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/storage
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 254200977b)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump manusa/actions-setup-minikube from 2.11.0 to 2.12.0 (#2226)

Bumps [manusa/actions-setup-minikube](https://github.com/manusa/actions-setup-minikube) from 2.11.0 to 2.12.0.
- [Release notes](https://github.com/manusa/actions-setup-minikube/releases)
- [Commits](https://github.com/manusa/actions-setup-minikube/compare/v2.11.0...v2.12.0)

---
updated-dependencies:
- dependency-name: manusa/actions-setup-minikube
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 4358fd49bb)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump golang.org/x/time from 0.6.0 to 0.7.0 (#2227)

Bumps [golang.org/x/time](https://github.com/golang/time) from 0.6.0 to 0.7.0.
- [Commits](https://github.com/golang/time/compare/v0.6.0...v0.7.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 29ba4e72b0)
Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: imagePullPolicy was ignored (#2222)

Signed-off-by: xuqingtan <missedone@gmail.com>
(cherry picked from commit 7fb14e629e)
Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: spark-submission failed due to lack of permission for user `spark` (#2223)

error: Exception in thread "main" java.io.FileNotFoundException: /home/spark/.ivy2/cache/resolved-org.apache.spark-spark-submit-parent-511288aa-ce7c-4a38-9c8e-4869b71c68fa-1.0.xml (No such file or directory)

Signed-off-by: xuqingtan <missedone@gmail.com>
(cherry picked from commit d07821bcba)
Signed-off-by: Yi Chen <github@chenyicn.net>
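
A common workaround for this error when submissions run as the non-root `spark` user is to point the Ivy cache at a writable directory; an illustrative SparkApplication fragment (the path is an assumption, not part of this fix):

# Illustrative workaround: keep the Ivy cache in a directory the spark user can write to.
spec:
  sparkConf:
    spark.jars.ivy: /tmp/.ivy2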

* Bump github.com/aws/aws-sdk-go-v2/config from 1.27.33 to 1.27.42 (#2231)

Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.27.33 to 1.27.42.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.27.33...config/v1.27.42)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 9be8dceb48)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump github.com/prometheus/client_golang from 1.19.1 to 1.20.4 (#2204)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.19.1 to 1.20.4.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.19.1...v1.20.4)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit fe833fa127)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove `cap_net_bind_service` from image (#2216)

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit ac761ef511)
Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: webhook panics due to logging (#2232)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 247e834456)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Add check for generating manifests and code (#2234)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit c75d99f65b)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Spark Operator Official Release v2.0.2

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: xuqingtan <missedone@gmail.com>
Co-authored-by: Sébastien Maintrot <3097030+ImpSy@users.noreply.github.com>
Co-authored-by: Jacob Salway <jacob.salway@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nick Tan <missedone@gmail.com>
2024-10-10 15:13:09 +00:00
Yi Chen 94cdc526d9 Spark Operator Official Release v2.0.1
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-26 10:16:34 +08:00
Yi Chen 78c85f8740 Update controller RBAC for ConfigMap and PersistentVolumeClaim (#2187)
Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 73caefd0d3)
2024-09-26 10:15:56 +08:00
dependabot[bot] c2dcbed1a0 Bump github.com/onsi/gomega from 1.33.1 to 1.34.2 (#2189)
Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.33.1 to 1.34.2.
- [Release notes](https://github.com/onsi/gomega/releases)
- [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/gomega/compare/v1.33.1...v1.34.2)

---
updated-dependencies:
- dependency-name: github.com/onsi/gomega
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 68e3d9cac5)
2024-09-26 10:15:56 +08:00
dependabot[bot] 2987ffbd45 Bump github.com/onsi/ginkgo/v2 from 2.19.0 to 2.20.2 (#2188)
Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.19.0 to 2.20.2.
- [Release notes](https://github.com/onsi/ginkgo/releases)
- [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/ginkgo/compare/v2.19.0...v2.20.2)

---
updated-dependencies:
- dependency-name: github.com/onsi/ginkgo/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 4df6df21ae)
2024-09-26 10:15:56 +08:00
Sébastien Maintrot b105c93773 FEATURE: build operator image as non-root (#2171)
Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
(cherry picked from commit e2cc295204)
2024-09-26 10:15:56 +08:00
Yi Chen f1b8d468f6
Release v2.0.0 (#2182)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-23 03:01:31 +00:00
Yi Chen fab1c46957
Cherry pick commits for releasing v2.0.0 (#2156)
* Support gang scheduling with Yunikorn (#2107)

* Add Yunikorn scheduler and example

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Add test cases

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Add code comments

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Add license comment

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Inline mergeNodeSelector

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Fix initial number implementation

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit 8fcda12657)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Update Makefile for building sparkctl (#2119)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 4bc6e89708)
Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: Add default values for namespaces to match usage descriptions  (#2128)

* fix: Add default values for namespaces to match usage descriptions

Signed-off-by: pengfei4.li <pengfei4.li@ly.com>

* fix: remove incorrect cache settings

Signed-off-by: pengfei4.li <pengfei4.li@ly.com>

---------

Signed-off-by: pengfei4.li <pengfei4.li@ly.com>
Co-authored-by: pengfei4.li <pengfei4.li@ly.com>
(cherry picked from commit 52f818d535)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Fix: Spark role binding did not render properly when setting spark service account name (#2135)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit a1a38ea2f1)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Reintroduce option webhook.enable (#2142)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 9e88049af1)
Signed-off-by: Yi Chen <github@chenyicn.net>
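
The reinstated switch is a plain chart value; a minimal sketch, assuming the key kept its earlier name webhook.enable:

# Hypothetical values.yaml fragment: toggle the admission webhook.
webhook:
  enable: true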

* Add default batch scheduler argument (#2143)

* Add default batch scheduler argument

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Add helm unit test

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit 9cc1c02c64)
Signed-off-by: Yi Chen <github@chenyicn.net>
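
The default scheduler becomes a chart-level setting; a minimal values.yaml sketch, with the key layout assumed from this PR:

# Hypothetical values.yaml fragment: applications that omit spec.batchScheduler fall back to this default.
controller:
  batchScheduler:
    enable: true
    default: yunikorn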

* fix: unable to set controller/webhook replicas to zero (#2147)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 1afa72e7a0)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Adding support for setting spark job namespaces to all namespaces (#2123)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit c93b0ec0e7)
Signed-off-by: Yi Chen <github@chenyicn.net>
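
With this change an empty namespace list is treated as "watch every namespace"; a hedged values.yaml sketch (key name assumed to be spark.jobNamespaces):

# Hypothetical values.yaml fragment: an empty list makes the operator watch all namespaces.
spark:
  jobNamespaces: []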

* Support extended kube-scheduler as batch scheduler (#2136)

* Support coscheduling with kube-scheduler plugins

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add example for using kube-scheduler coscheduling

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit e8d3de9e1a)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Run e2e tests on Kind (#2148)

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit c810ece25b)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Set schedulerName to Yunikorn (#2153)

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit 62b4ca636d)
Signed-off-by: Yi Chen <github@chenyicn.net>
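
When Yunikorn is chosen as the batch scheduler, the operator now stamps the driver and executor pods with the matching schedulerName; an illustrative SparkApplication fragment:

# Illustrative SparkApplication fragment: request gang scheduling via Yunikorn.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  batchScheduler: yunikorn   # driver and executor pods get schedulerName set for Yunikorn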

* Create role and rolebinding for controller/webhook in every spark job namespace if not watching all namespaces (#2129)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 592b649917)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Fix: e2e test fails due to webhook not ready (#2149)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit dee91ba66c)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Upgrade to Go 1.23.1 (#2155)

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit 10fcb8e19a)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Upgrade to Spark 3.5.2 (#2154)

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit e1b7a27062)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump sigs.k8s.io/scheduler-plugins from 0.29.7 to 0.29.8 (#2159)

Bumps [sigs.k8s.io/scheduler-plugins](https://github.com/kubernetes-sigs/scheduler-plugins) from 0.29.7 to 0.29.8.
- [Release notes](https://github.com/kubernetes-sigs/scheduler-plugins/releases)
- [Changelog](https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/RELEASE.md)
- [Commits](https://github.com/kubernetes-sigs/scheduler-plugins/compare/v0.29.7...v0.29.8)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/scheduler-plugins
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 95d202e95c)
Signed-off-by: Yi Chen <github@chenyicn.net>

* feat: support driver and executor pod use different priority (#2146)

* feat: support driver and executor pod use different priority

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: if *app.Spec.Driver.PriorityClassName and *app.Spec.Executor.PriorityClassName are explicitly defined, they take precedence over spec.batchSchedulerOptions.priorityClassName

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: merge the logic of setPodPriorityClassName into addPriorityClassName

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: support driver and executor pod use different priority

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* feat: if *app.Spec.Driver.PriorityClassName and *app.Spec.Executor.PriorityClassName are explicitly defined, they take precedence over spec.batchSchedulerOptions.priorityClassName

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* feat: merge the logic of setPodPriorityClassName into addPriorityClassName

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* feat: add adjust pointer if is nil

Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* feat: remove the spec.batchSchedulerOptions.priorityClassName definition and split driver and executor pod priorityClass

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: remove the spec.batchSchedulerOptions.priorityClassName definition and split driver and executor pod priorityClass

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: Optimize code to avoid null pointer exceptions

Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* fix: remove backup crd files

Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* fix: remove BatchSchedulerOptions.PriorityClassName test code

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* fix: add driver and executor pod priorityClassName test code

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

---------

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>
Co-authored-by: Kevin Wu <kevin.wu@momenta.ai>
(cherry picked from commit 6ae1b2f69c)
Signed-off-by: Yi Chen <github@chenyicn.net>
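
The end result is that the driver and executor can carry different priority classes instead of sharing spec.batchSchedulerOptions.priorityClassName; an illustrative fragment (class names are placeholders):

# Illustrative SparkApplication fragment: per-role pod priority classes.
spec:
  driver:
    priorityClassName: high-priority
  executor:
    priorityClassName: low-priority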

* Bump gocloud.dev from 0.37.0 to 0.39.0 (#2160)

Bumps [gocloud.dev](https://github.com/google/go-cloud) from 0.37.0 to 0.39.0.
- [Release notes](https://github.com/google/go-cloud/releases)
- [Commits](https://github.com/google/go-cloud/compare/v0.37.0...v0.39.0)

---
updated-dependencies:
- dependency-name: gocloud.dev
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit e58023b90d)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Update e2e tests (#2161)

* Add sleep buffer to ensure the webhooks are ready before running the e2e tests

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove duplicate operator image build tasks

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update e2e tests

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update examples

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit e6a7805079)
Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: webhook not working when setting spark job namespaces to empty (#2163)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 7785107ec5)
Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: The logger had an odd number of arguments, making it panic (#2166)

Signed-off-by: tcassaert <tcassaert@inuits.eu>
(cherry picked from commit eb48b349a1)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Upgrade to Spark 3.5.2 (#2012) (#2157)

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

---------

Signed-off-by: HyukSangCho <a01045542949@gmail.com>
(cherry picked from commit 9f0c08a65e)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Feature: Add pprof endpoint (#2164)

* add pprof support to the operator Controller Manager

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

* add pprof support to helm chart

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

---------

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
(cherry picked from commit 75b926652b)
Signed-off-by: Yi Chen <github@chenyicn.net>
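
Both the operator flag and the chart wiring are new here; a hedged values.yaml sketch, with the key names and default port assumed rather than confirmed:

# Hypothetical values.yaml fragment: expose the controller's pprof HTTP endpoint.
controller:
  pprof:
    enable: true
    port: 6060   # assumed default port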

* fix the make kind-delete-cluster target to avoid accidental kubeconfig deletion (#2172)

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
(cherry picked from commit cbfefd57bb)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump github.com/aws/aws-sdk-go-v2/config from 1.27.27 to 1.27.33 (#2174)

Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.27.27 to 1.27.33.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.27.27...config/v1.27.33)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit b81833246f)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump helm.sh/helm/v3 from 3.15.3 to 3.16.1 (#2173)

Bumps [helm.sh/helm/v3](https://github.com/helm/helm) from 3.15.3 to 3.16.1.
- [Release notes](https://github.com/helm/helm/releases)
- [Commits](https://github.com/helm/helm/compare/v3.15.3...v3.16.1)

---
updated-dependencies:
- dependency-name: helm.sh/helm/v3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit f3f80d49b1)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Add specific error in log line when failed to create web UI service (#2170)

* Add specific error in log line when failed to create web UI service

Signed-off-by: tcassaert <tcassaert@inuits.eu>

* Update log to reflect correct resource that could not be created

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: tcassaert <tcassaert@protonmail.com>

---------

Signed-off-by: tcassaert <tcassaert@inuits.eu>
Signed-off-by: tcassaert <tcassaert@protonmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit ed3226ebe7)
Signed-off-by: Yi Chen <github@chenyicn.net>

* Account for spark.executor.pyspark.memory in Yunikorn gang scheduling (#2178)

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
(cherry picked from commit a2f71c6137)
Signed-off-by: Yi Chen <github@chenyicn.net>
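
The gang-scheduling reservation now also counts executor PySpark memory; an illustrative fragment showing the Spark property that is picked up:

# Illustrative SparkApplication fragment: PySpark memory is now included in the Yunikorn gang resource request.
spec:
  sparkConf:
    spark.executor.pyspark.memory: 2g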

* Fix: spark application does not respect time to live seconds (#2165)

* Add time to live seconds example spark application

Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: spark application does not respect time to live seconds

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit c855ee4c8b)
Signed-off-by: Yi Chen <github@chenyicn.net>
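
The field being honored is spec.timeToLiveSeconds; an illustrative fragment that garbage-collects a terminated application after one hour:

# Illustrative SparkApplication fragment: clean up the application 3600 seconds after it terminates.
spec:
  timeToLiveSeconds: 3600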

* Update release workflow and docs (#2121)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit bca6aa85cc)
Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: pengfei4.li <pengfei4.li@ly.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>
Signed-off-by: tcassaert <tcassaert@inuits.eu>
Signed-off-by: HyukSangCho <a01045542949@gmail.com>
Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
Signed-off-by: tcassaert <tcassaert@protonmail.com>
Co-authored-by: Jacob Salway <jacob.salway@gmail.com>
Co-authored-by: Neo <56439757+snappyyouth@users.noreply.github.com>
Co-authored-by: pengfei4.li <pengfei4.li@ly.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kevinz <ruoshuidba@gmail.com>
Co-authored-by: Kevin Wu <kevin.wu@momenta.ai>
Co-authored-by: tcassaert <tcassaert@protonmail.com>
Co-authored-by: ha2hi <56156892+ha2hi@users.noreply.github.com>
Co-authored-by: Sébastien Maintrot <3097030+ImpSy@users.noreply.github.com>
2024-09-23 01:58:32 +00:00
Yi Chen 74b345abe3
Release v2.0.0-rc.0 (#2115)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-09 05:17:37 +00:00
Yi Chen 53c6aeded3
Cherry pick #2089 #2109 #2111 (#2110)
* Update workflow and docs for releasing Spark operator (#2089)

* Update .helmignore

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add release docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update release workflow

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update integration test workflow

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add workflow for pushing tag when VERSION file changes

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove the leading 'v' from chart version

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update docker image tags

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit a3ec8f193f)

* Fix broken integration test CI (#2109)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 6ff204a6ab)

* Fix CI: environment variable BRANCH is missed (#2111)

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit e2693f19c5)
2024-08-06 23:14:39 +00:00
Yi Chen 9034cc4e9f
Cherry pick #2081 #2046 #2091 #2072 (#2108)
* Update helm docs (#2081)

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
(cherry picked from commit eca3fc8702)

* Update the process to build api-docs, generate CRD manifests and code (#2046)

* Update .gitignore

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update .dockerignore

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update Makefile

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update the process to generate api docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update the workflow to generate api docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Use controller-gen to generate CRD and deep copy related methods

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update helm chart CRDs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update workflow for building spark operator

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update README.md

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 779ea3debc)

* Add topologySpreadConstraints (#2091)

* Update README and documentation (#2047)

* Update docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove docs and update README

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add PodDisruptionBudget to chart (#2078)

* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Set topologySpreadConstraints

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README and increase patch version

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Revert replicaCount change

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README after master merge

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
(cherry picked from commit 4108f54937)
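
Both chart additions are exposed as plain values; a hedged values.yaml sketch, with the key layout assumed from the PR titles rather than the released chart:

# Hypothetical values.yaml fragment: spread operator pods across zones and protect them with a PodDisruptionBudget.
podDisruptionBudget:
  enable: true
  minAvailable: 1
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway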

* Use controller-runtime to reconstruct spark operator (#2072)

* Use controller-runtime to reconstruct spark operator

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update helm charts

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update examples

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
(cherry picked from commit 0dc641bd1d)

---------

Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Co-authored-by: jbhalodia-slack <jbhalodia@salesforce.com>
2024-08-01 22:06:06 +00:00
332 changed files with 70397 additions and 36938 deletions

.dockerignore

@@ -1 +1,31 @@
-vendor
+.github/
.idea/
.vscode/
bin/
charts/
docs/
config/
examples/
hack/
manifest/
spark-docker/
sparkctl/
test/
vendor/
.dockerignore
.DS_Store
.gitignore
.gitlab-ci.yaml
.golangci.yaml
.pre-commit-config.yaml
ADOPTERS.md
CODE_OF_CONDUCT.md
codecov.ymal
CONTRIBUTING.md
cover.out
Dockerfile
LICENSE
OWNERS
PROJECT
README.md
test.sh

.github/workflows/check-release.yaml (new file, 64 lines)

@@ -0,0 +1,64 @@
name: Check Release
on:
pull_request:
branches:
- release-*
paths:
- VERSION
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
SEMVER_PATTERN: '^v([0-9]+)\.([0-9]+)\.([0-9]+)(-rc\.([0-9]+))?$'
jobs:
check:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check whether version matches semver pattern
run: |
VERSION=$(cat VERSION)
if [[ ${VERSION} =~ ${{ env.SEMVER_PATTERN }} ]]; then
echo "Version '${VERSION}' matches semver pattern."
else
echo "Version '${VERSION}' does not match semver pattern."
exit 1
fi
echo "VERSION=${VERSION}" >> $GITHUB_ENV
- name: Check whether chart version and appVersion matches version
run: |
VERSION=${VERSION#v}
CHART_VERSION=$(cat charts/spark-operator-chart/Chart.yaml | grep version | awk '{print $2}')
CHART_APP_VERSION=$(cat charts/spark-operator-chart/Chart.yaml | grep appVersion | awk '{print $2}')
if [[ ${CHART_VERSION} == ${VERSION} ]]; then
echo "Chart version '${CHART_VERSION}' matches version '${VERSION}'."
else
echo "Chart version '${CHART_VERSION}' does not match version '${VERSION}'."
exit 1
fi
if [[ ${CHART_APP_VERSION} == ${VERSION} ]]; then
echo "Chart appVersion '${CHART_APP_VERSION}' matches version '${VERSION}'."
else
echo "Chart appVersion '${CHART_APP_VERSION}' does not match version '${VERSION}'."
exit 1
fi
- name: Check if tag exists
run: |
git fetch --tags
if git tag -l | grep -q "^${VERSION}$"; then
echo "Tag '${VERSION}' already exists."
exit 1
else
echo "Tag '${VERSION}' does not exist."
fi

.github/workflows/integration.yaml (new file, 231 lines)

@@ -0,0 +1,231 @@
name: Integration Test
on:
pull_request:
branches:
- master
- release-*
push:
branches:
- master
- release-*
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.actor }}
cancel-in-progress: true
jobs:
code-check:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Run go mod tidy
run: |
go mod tidy
if ! git diff --quiet; then
echo "Please run 'go mod tidy' and commit the changes."
git diff
false
fi
- name: Generate code
run: |
make generate
if ! git diff --quiet; then
echo "Need to re-run 'make generate' and commit the changes."
git diff
false
fi
- name: Run go fmt check
run: |
make go-fmt
if ! git diff --quiet; then
echo "Need to re-run 'make go-fmt' and commit the changes."
git diff
false
fi
- name: Run go vet check
run: |
make go-vet
if ! git diff --quiet; then
echo "Need to re-run 'make go-vet' and commit the changes."
git diff
false
fi
- name: Run golangci-lint
run: |
make go-lint
build-api-docs:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Build API docs
run: |
make build-api-docs
if ! git diff --quiet; then
echo "Need to re-run 'make build-api-docs' and commit the changes."
git diff
false
fi
build-spark-operator:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Run go unit tests
run: make unit-test
- name: Build Spark operator
run: make build-operator
build-sparkctl:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Build sparkctl
run: make build-sparkctl
build-helm-chart:
runs-on: ubuntu-latest
steps:
- name: Determine branch name
id: get_branch
run: |
BRANCH=""
if [ "${{ github.event_name }}" == "push" ]; then
BRANCH=${{ github.ref_name }}
elif [ "${{ github.event_name }}" == "pull_request" ]; then
BRANCH=${{ github.base_ref }}
fi
echo "Branch name: $BRANCH"
echo "BRANCH=$BRANCH" >> "$GITHUB_OUTPUT"
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install Helm
uses: azure/setup-helm@v4
with:
version: v3.14.3
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.6.1
- name: Generate manifests
run: |
make manifests
if ! git diff --quiet; then
echo "Need to re-run 'make manifests' and commit the changes."
git diff
false
fi
- name: Detect CRDs drift between chart and manifest
run: make detect-crds-drift
- name: Run chart-testing (list-changed)
id: list-changed
env:
BRANCH: ${{ steps.get_branch.outputs.BRANCH }}
run: |
changed=$(ct list-changed --target-branch $BRANCH)
if [[ -n "$changed" ]]; then
echo "changed=true" >> "$GITHUB_OUTPUT"
fi
- name: Run chart-testing (lint)
if: steps.list-changed.outputs.changed == 'true'
env:
BRANCH: ${{ steps.get_branch.outputs.BRANCH }}
run: ct lint --check-version-increment=false --target-branch $BRANCH
- name: Produce the helm documentation
if: steps.list-changed.outputs.changed == 'true'
run: |
make helm-docs
if ! git diff --quiet -- charts/spark-operator-chart/README.md; then
echo "Need to re-run 'make helm-docs' and commit the changes."
false
fi
- name: setup minikube
if: steps.list-changed.outputs.changed == 'true'
uses: manusa/actions-setup-minikube@v2.12.0
with:
minikube version: v1.33.0
kubernetes version: v1.30.0
start args: --memory 6g --cpus=2 --addons ingress
github token: ${{ inputs.github-token }}
- name: Run chart-testing (install)
if: steps.list-changed.outputs.changed == 'true'
run: |
docker build -t docker.io/kubeflow/spark-operator:local .
minikube image load docker.io/kubeflow/spark-operator:local
ct install
e2e-test:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Create a Kind cluster
run: make kind-create-cluster
- name: Build and load image to Kind cluster
run: |
make kind-load-image IMAGE_TAG=local
- name: Run e2e tests
run: make e2e-test


@@ -1,176 +0,0 @@
name: Pre-commit checks
on:
pull_request:
branches:
- master
push:
branches:
- master
jobs:
build-api-docs:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: The API documentation hasn't changed
run: |
make build-api-docs
if ! git diff --quiet -- docs/api-docs.md; then
echo "Need to re-run 'make build-api-docs' and commit the changes"
git diff -- docs/api-docs.md;
false
fi
build-sparkctl:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: "go.mod"
- name: build sparkctl
run: |
make all
build-spark-operator:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: "go.mod"
- name: Run gofmt check
run: make fmt-check
- name: Run static analysis
run: make static-analysis
- name: Run unit tests
run: make unit-test
- name: Build Spark-Operator Docker Image
run: |
docker build -t docker.io/kubeflow/spark-operator:latest .
- name: Check changes in resources used in docker file
run: |
DOCKERFILE_RESOURCES=$(cat Dockerfile | grep -P -o "COPY [a-zA-Z0-9].*? " | cut -c6-)
for resource in $DOCKERFILE_RESOURCES; do
# If the resource is different
if ! git diff --quiet origin/master -- $resource; then
## And the appVersion hasn't been updated
if ! git diff origin/master -- charts/spark-operator-chart/Chart.yaml | grep +appVersion; then
echo "resource used in docker.io/kubeflow/spark-operator has changed in $resource, need to update the appVersion in charts/spark-operator-chart/Chart.yaml"
git diff origin/master -- $resource;
echo "failing the build... " && false
fi
fi
done
build-helm-chart:
runs-on: ubuntu-20.04
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Install Helm
uses: azure/setup-helm@v4
with:
version: v3.14.3
- name: Produce the helm documentation
run: |
make helm-docs
if ! git diff --quiet -- charts/spark-operator-chart/README.md; then
echo "Need to re-run 'make helm-docs' and commit the changes"
false
fi
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.6.1
- name: Print chart-testing version information
run: ct version
- name: Run chart-testing (lint)
run: ct lint
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
fi
- name: Detect CRDs drift between chart and manifest
run: make detect-crds-drift
- name: setup minikube
uses: manusa/actions-setup-minikube@v2.11.0
with:
minikube version: v1.33.0
kubernetes version: v1.30.0
start args: --memory 6g --cpus=2 --addons ingress
github token: ${{ inputs.github-token }}
- name: Run chart-testing (install)
run: |
docker build -t docker.io/kubeflow/spark-operator:local .
minikube image load docker.io/kubeflow/spark-operator:local
ct install
integration-test:
runs-on: ubuntu-22.04
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: "go.mod"
- name: setup minikube
uses: manusa/actions-setup-minikube@v2.11.0
with:
minikube version: v1.33.0
kubernetes version: v1.30.0
start args: --memory 6g --cpus=2 --addons ingress
github token: ${{ inputs.github-token }}
- name: Build local spark-operator docker image for minikube testing
run: |
docker build -t docker.io/kubeflow/spark-operator:local .
minikube image load docker.io/kubeflow/spark-operator:local
# The integration tests are currently broken see: https://github.com/kubeflow/spark-operator/issues/1416
# - name: Run chart-testing (integration test)
# run: make integration-test
- name: Setup tmate session
if: failure()
uses: mxschmitt/action-tmate@v3
timeout-minutes: 15


@@ -0,0 +1,84 @@
name: Release Helm charts
on:
release:
types:
- published
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
HELM_REGISTRY: ghcr.io
HELM_REPOSITORY: ${{ github.repository_owner }}/helm-charts
jobs:
release_helm_charts:
permissions:
contents: write
packages: write
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Set up Helm
uses: azure/setup-helm@v4.2.0
with:
version: v3.14.4
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ${{ env.HELM_REGISTRY }}
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Read version from VERSION file
run: |
VERSION=$(cat VERSION)
echo "VERSION=${VERSION}" >> $GITHUB_ENV
- name: Package Helm charts
run: |
for chart in $(ls charts); do
helm package charts/${chart}
done
- name: Upload charts to GHCR
run: |
for pkg in $(ls *.tgz); do
helm push ${pkg} oci://${{ env.HELM_REGISTRY }}/${{ env.HELM_REPOSITORY }}
done
- name: Save packaged charts to temp directory
run: |
mkdir -p /tmp/charts
cp *.tgz /tmp/charts
- name: Checkout to branch gh-pages
uses: actions/checkout@v4
with:
ref: gh-pages
fetch-depth: 0
- name: Copy packaged charts
run: |
cp /tmp/charts/*.tgz .
- name: Update Helm charts repo index
env:
CHART_URL: https://github.com/${{ github.repository }}/releases/download/${{ github.ref_name }}
run: |
helm repo index --merge index.yaml --url ${CHART_URL} .
git add index.yaml
git commit -s -m "Add index for Spark operator chart ${VERSION}" || exit 0
git push


@@ -1,106 +1,136 @@
-name: Release Charts
+name: Release

 on:
   push:
     branches:
-      - master
+      - release-*
+    paths:
+      - VERSION
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true

 env:
-  REGISTRY_IMAGE: docker.io/kubeflow/spark-operator
+  SEMVER_PATTERN: '^v([0-9]+)\.([0-9]+)\.([0-9]+)(-rc\.([0-9]+))?$'
+  IMAGE_REGISTRY: docker.io
+  IMAGE_REPOSITORY: kubeflow/spark-operator

 jobs:
-  build-skip-check:
+  check-release:
     runs-on: ubuntu-latest
-    outputs:
-      image_changed: ${{ steps.skip-check.outputs.image_changed }}
-      chart_changed: ${{ steps.skip-check.outputs.chart_changed }}
-      app_version_tag: ${{ steps.skip-check.outputs.app_version_tag }}
-      chart_version_tag: ${{ steps.skip-check.outputs.chart_version_tag }}
     steps:
       - name: Checkout source code
         uses: actions/checkout@v4
         with:
           fetch-depth: 0
-      - name: Check if build should be skipped
-        id: skip-check
+      - name: Check whether version matches semver pattern
         run: |
-          app_version_tag=$(cat charts/spark-operator-chart/Chart.yaml | grep "appVersion: .*" | cut -c13-)
-          chart_version_tag=$(cat charts/spark-operator-chart/Chart.yaml | grep "version: .*" | cut -c10-)
-
-          # Initialize flags
-          image_changed=false
-          chart_changed=false
-
-          if ! git rev-parse -q --verify "refs/tags/$app_version_tag"; then
-            image_changed=true
-            git tag $app_version_tag
-            git push origin $app_version_tag
-            echo "Spark-Operator Docker Image new tag: $app_version_tag released"
-          fi
-
-          if ! git rev-parse -q --verify "refs/tags/spark-operator-chart-$chart_version_tag"; then
-            chart_changed=true
-            git tag spark-operator-chart-$chart_version_tag
-            git push origin spark-operator-chart-$chart_version_tag
-            echo "Spark-Operator Helm Chart new tag: spark-operator-chart-$chart_version_tag released"
-          fi
-
-          echo "image_changed=${image_changed}" >> "$GITHUB_OUTPUT"
-          echo "chart_changed=${chart_changed}" >> "$GITHUB_OUTPUT"
-          echo "app_version_tag=${app_version_tag}" >> "$GITHUB_OUTPUT"
-          echo "chart_version_tag=${chart_version_tag}" >> "$GITHUB_OUTPUT"
-  release:
-    runs-on: ubuntu-latest
+          VERSION=$(cat VERSION)
+          if [[ ${VERSION} =~ ${{ env.SEMVER_PATTERN }} ]]; then
+            echo "Version '${VERSION}' matches semver pattern."
+          else
+            echo "Version '${VERSION}' does not match semver pattern."
+            exit 1
+          fi
+          echo "VERSION=${VERSION}" >> $GITHUB_ENV
+      - name: Check whether chart version and appVersion matches version
+        run: |
+          VERSION=${VERSION#v}
+          CHART_VERSION=$(cat charts/spark-operator-chart/Chart.yaml | grep version | awk '{print $2}')
+          CHART_APP_VERSION=$(cat charts/spark-operator-chart/Chart.yaml | grep appVersion | awk '{print $2}')
+          if [[ ${CHART_VERSION} == ${VERSION} ]]; then
+            echo "Chart version '${CHART_VERSION}' matches version '${VERSION}'."
+          else
+            echo "Chart version '${CHART_VERSION}' does not match version '${VERSION}'."
+            exit 1
+          fi
+          if [[ ${CHART_APP_VERSION} == ${VERSION} ]]; then
+            echo "Chart appVersion '${CHART_APP_VERSION}' matches version '${VERSION}'."
+          else
+            echo "Chart appVersion '${CHART_APP_VERSION}' does not match version '${VERSION}'."
+            exit 1
+          fi
+      - name: Check if tag exists
+        run: |
+          git fetch --tags
+          if git tag -l | grep -q "^${VERSION}$"; then
+            echo "Tag '${VERSION}' already exists."
+            exit 1
+          else
+            echo "Tag '${VERSION}' does not exist."
+          fi
+
+  build_images:
     needs:
-      - build-skip-check
-    if: needs.build-skip-check.outputs.image_changed == 'true'
+      - check-release
+    runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
         platform:
           - linux/amd64
           - linux/arm64
     steps:
-      - name: Checkout
-        uses: actions/checkout@v4
-        with:
-          fetch-depth: 1
-      - name: Configure Git
+      - name: Prepare
         run: |
-          git config user.name "$GITHUB_ACTOR"
-          git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
           platform=${{ matrix.platform }}
           echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV
-          echo "SCOPE=${platform//\//-}" >> $GITHUB_ENV
-      - name: Set up QEMU
-        timeout-minutes: 1
-        uses: docker/setup-qemu-action@v3
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Install Helm
-        uses: azure/setup-helm@v4
-        with:
-          version: v3.14.3
-      - name: Login to Packages Container registry
+      - name: Checkout source code
+        uses: actions/checkout@v4
+      - name: Read version from VERSION file
+        run: |
+          VERSION=$(cat VERSION)
+          if [[ ! ${VERSION} =~ ${{ env.SEMVER_PATTERN }} ]]; then
+            echo "Version '${VERSION}' does not match semver pattern."
+            exit 1
+          fi
+          echo "VERSION=${VERSION}" >> $GITHUB_ENV
+      - name: Docker meta
+        id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }}
+          tags: |
+            type=semver,pattern={{version}},value=${{ env.VERSION }}
+      - name: Set up QEMU
+        uses: docker/setup-qemu-action@v3
+      - name: Set up Docker buildx
+        uses: docker/setup-buildx-action@v3
+      - name: Login to container registry
        uses: docker/login-action@v3
         with:
-          registry: docker.io
+          registry: ${{ env.IMAGE_REGISTRY }}
           username: ${{ secrets.DOCKERHUB_USERNAME }}
           password: ${{ secrets.DOCKERHUB_TOKEN }}
-      - name: Build and Push Spark-Operator Docker Image to Docker Hub
+      - name: Build and push by digest
         id: build
-        uses: docker/build-push-action@v5
+        uses: docker/build-push-action@v6
         with:
-          context: .
           platforms: ${{ matrix.platform }}
-          cache-to: type=gha,mode=max,scope=${{ env.SCOPE }}
-          cache-from: type=gha,scope=${{ env.SCOPE }}
-          push: true
-          outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
+          labels: ${{ steps.meta.outputs.labels }}
+          outputs: type=image,name=${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }},push-by-digest=true,name-canonical=true,push=true
       - name: Export digest
         run: |
           mkdir -p /tmp/digests
           digest="${{ steps.build.outputs.digest }}"
           touch "/tmp/digests/${digest#sha256:}"
       - name: Upload digest
         uses: actions/upload-artifact@v4
         with:
@@ -108,61 +138,127 @@ jobs:
           path: /tmp/digests/*
           if-no-files-found: error
           retention-days: 1
-  publish-image:
-    runs-on: ubuntu-latest
+
+  release_images:
     needs:
-      - release
-      - build-skip-check
-    if: needs.build-skip-check.outputs.image_changed == 'true'
+      - build_images
+    runs-on: ubuntu-latest
     steps:
-      - name: Download digests
-        uses: actions/download-artifact@v4
-        with:
-          pattern: digests-*
-          path: /tmp/digests
-          merge-multiple: true
-      - name: Setup Docker Buildx
-        uses: docker/setup-buildx-action@v3
+      - name: Checkout source code
+        uses: actions/checkout@v4
+      - name: Read version from VERSION file
+        run: |
+          VERSION=$(cat VERSION)
+          echo "VERSION=${VERSION}" >> $GITHUB_ENV
       - name: Docker meta
         id: meta
         uses: docker/metadata-action@v5
         with:
-          images: ${{ env.REGISTRY_IMAGE }}
-          tags: ${{ needs.build-skip-check.outputs.app_version_tag }}
-      - name: Login to Docker Hub
+          images: ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }}
+          tags: |
+            type=semver,pattern={{version}},value=${{ env.VERSION }}
+      - name: Download digests
+        uses: actions/download-artifact@v4
+        with:
+          path: /tmp/digests
+          pattern: digests-*
+          merge-multiple: true
+      - name: Set up Docker buildx
+        uses: docker/setup-buildx-action@v3
+      - name: Login to container registry
         uses: docker/login-action@v3
         with:
-          registry: docker.io
+          registry: ${{ env.IMAGE_REGISTRY }}
           username: ${{ secrets.DOCKERHUB_USERNAME }}
           password: ${{ secrets.DOCKERHUB_TOKEN }}
       - name: Create manifest list and push
         working-directory: /tmp/digests
         run: |
           docker buildx imagetools create $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$DOCKER_METADATA_OUTPUT_JSON") \
-            $(printf '${{ env.REGISTRY_IMAGE }}@sha256:%s ' *)
+            $(printf '${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }}@sha256:%s ' *)
       - name: Inspect image
         run: |
-          docker buildx imagetools inspect ${{ env.REGISTRY_IMAGE }}:${{ steps.meta.outputs.version }}
-  publish-chart:
-    runs-on: ubuntu-latest
-    if: needs.build-skip-check.outputs.chart_changed == 'true'
+          docker buildx imagetools inspect ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }}:${{ steps.meta.outputs.version }}
+
+  push_tag:
     needs:
-      - build-skip-check
+      - release_images
+    runs-on: ubuntu-latest
     steps:
-      - name: Checkout
+      - name: Checkout source code
         uses: actions/checkout@v4
         with:
           fetch-depth: 0
-      - name: Install Helm
-        uses: azure/setup-helm@v4
-        with:
-          version: v3.14.3
       - name: Configure Git
         run: |
           git config user.name "$GITHUB_ACTOR"
           git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
-      - name: Release Spark-Operator Helm Chart
-        uses: helm/chart-releaser-action@v1.6.0
-        env:
-          CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
-          CR_RELEASE_NAME_TEMPLATE: "spark-operator-chart-{{ .Version }}"
+      - name: Read version from VERSION file
+        run: |
+          VERSION=$(cat VERSION)
+          echo "VERSION=${VERSION}" >> $GITHUB_ENV
+      - name: Create and push tag
+        run: |
+          git tag -a "${VERSION}" -m "Spark Operator Official Release ${VERSION}"
+          git push origin "${VERSION}"
+
+  draft_release:
+    needs:
+      - push_tag
+    permissions:
+      contents: write
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Configure Git
+        run: |
+          git config user.name "$GITHUB_ACTOR"
+          git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
+      - name: Read version from VERSION file
+        run: |
+          VERSION=$(cat VERSION)
+          echo "VERSION=${VERSION}" >> $GITHUB_ENV
+      - name: Set up Helm
+        uses: azure/setup-helm@v4.2.0
+        with:
+          version: v3.14.4
+      - name: Package Helm charts
+        run: |
+          for chart in $(ls charts); do
+            helm package charts/${chart}
+          done
+      - name: Release
+        id: release
+        uses: softprops/action-gh-release@v2
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+          name: "Spark Operator ${{ env.VERSION }}"
+          tag_name: ${{ env.VERSION }}
+          prerelease: ${{ contains(env.VERSION, 'rc') }}
+          target_commitish: ${{ github.sha }}
+          draft: true
+          files: |
+            *.tgz

.gitignore

@@ -1,9 +1,11 @@
-.vscode/
+bin/
 vendor/
-spark-operator
-.idea/
-**/*.iml
+cover.out
 sparkctl/sparkctl
-spark-on-k8s-operator
 sparkctl/sparkctl-linux-amd64
 sparkctl/sparkctl-darwin-amd64
+**/*.iml
+
+# Various IDEs
+.idea/
+.vscode/

.golangci.yaml (new file, 66 lines)

@@ -0,0 +1,66 @@
run:
# Timeout for analysis, e.g. 30s, 5m.
# Default: 1m
timeout: 1m
linters:
# Enable specific linters.
# https://golangci-lint.run/usage/linters/#enabled-by-default
enable:
# Detects places where loop variables are copied.
- copyloopvar
# Checks for duplicate words in the source code.
- dupword
# Tool for detection of FIXME, TODO and other comment keywords.
# - godox
# Check import statements are formatted according to the 'goimport' command.
- goimports
# Enforces consistent import aliases.
- importas
# Find code that shadows one of Go's predeclared identifiers.
- predeclared
# Check that struct tags are well aligned.
- tagalign
# Remove unnecessary type conversions.
- unconvert
# Checks Go code for unused constants, variables, functions and types.
- unused
issues:
# Which dirs to exclude: issues from them won't be reported.
# Can use regexp here: `generated.*`, regexp is applied on full path,
# including the path prefix if one is set.
# Default dirs are skipped independently of this option's value (see exclude-dirs-use-default).
# "/" will be replaced by current OS file path separator to properly work on Windows.
# Default: []
exclude-dirs:
- sparkctl
# Maximum issues count per one linter.
# Set to 0 to disable.
# Default: 50
max-issues-per-linter: 50
# Maximum count of issues with the same text.
# Set to 0 to disable.
# Default: 3
max-same-issues: 3
linters-settings:
importas:
# List of aliases
alias:
- pkg: k8s.io/api/admissionregistration/v1
alias: admissionregistrationv1
- pkg: k8s.io/api/apps/v1
alias: appsv1
- pkg: k8s.io/api/batch/v1
alias: batchv1
- pkg: k8s.io/api/core/v1
alias: corev1
- pkg: k8s.io/api/extensions/v1beta1
alias: extensionsv1beta1
- pkg: k8s.io/api/networking/v1
alias: networkingv1
- pkg: k8s.io/apimachinery/pkg/apis/meta/v1
alias: metav1
- pkg: sigs.k8s.io/controller-runtime
alias: ctrl

.pre-commit-config.yaml

@@ -7,3 +7,4 @@ repos:
         # Make the tool search for charts only under the `charts` directory
         - --chart-search-root=charts
         - --template-files=README.md.gotmpl
+        - --sort-values-order=file

Dockerfile

@@ -14,35 +14,41 @@
 # limitations under the License.
 #
-ARG SPARK_IMAGE=spark:3.5.0
+ARG SPARK_IMAGE=spark:3.5.2

-FROM golang:1.22-alpine as builder
+FROM golang:1.23.1 AS builder

 WORKDIR /workspace

-# Copy the Go Modules manifests
-COPY go.mod go.mod
-COPY go.sum go.sum
-# Cache deps before building and copying source so that we don't need to re-download as much
-# and so that source changes don't invalidate our downloaded layer
-RUN go mod download
+RUN --mount=type=cache,target=/go/pkg/mod/ \
+    --mount=type=bind,source=go.mod,target=go.mod \
+    --mount=type=bind,source=go.sum,target=go.sum \
+    go mod download

-# Copy the go source code
-COPY main.go main.go
-COPY pkg/ pkg/
+COPY . .

-# Build
+ENV GOCACHE=/root/.cache/go-build
+
 ARG TARGETARCH
-RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} GO111MODULE=on go build -a -o /usr/bin/spark-operator main.go
+
+RUN --mount=type=cache,target=/go/pkg/mod/ \
+    --mount=type=cache,target="/root/.cache/go-build" \
+    CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} GO111MODULE=on make build-operator

 FROM ${SPARK_IMAGE}

 USER root
-COPY --from=builder /usr/bin/spark-operator /usr/bin/

-RUN apt-get update --allow-releaseinfo-change \
-    && apt-get update \
+RUN apt-get update \
     && apt-get install -y tini \
     && rm -rf /var/lib/apt/lists/*

+RUN mkdir -p /etc/k8s-webhook-server/serving-certs /home/spark && \
+    chmod -R g+rw /etc/k8s-webhook-server/serving-certs && \
+    chown -R spark /etc/k8s-webhook-server/serving-certs /home/spark
+
+USER spark
+
+COPY --from=builder /workspace/bin/spark-operator /usr/bin/spark-operator
+
 COPY entrypoint.sh /usr/bin/

 ENTRYPOINT ["/usr/bin/entrypoint.sh"]

Makefile (389 lines changed)

@ -1,86 +1,341 @@
.SILENT: .SILENT:
.PHONY: clean-sparkctl
SPARK_OPERATOR_GOPATH=/go/src/github.com/kubeflow/spark-operator # Get the currently used golang install path (in GOPATH/bin, unless GOBIN is set)
DEP_VERSION:=`grep DEP_VERSION= Dockerfile | awk -F\" '{print $$2}'` ifeq (,$(shell go env GOBIN))
BUILDER=`grep "FROM golang:" Dockerfile | awk '{print $$2}'` GOBIN=$(shell go env GOPATH)/bin
UNAME:=`uname | tr '[:upper:]' '[:lower:]'` else
REPO=github.com/kubeflow/spark-operator GOBIN=$(shell go env GOBIN)
endif
all: clean-sparkctl build-sparkctl install-sparkctl # Setting SHELL to bash allows bash commands to be executed by recipes.
# Options are set to exit when a recipe line exits non-zero or a piped command fails.
SHELL = /usr/bin/env bash -o pipefail
.SHELLFLAGS = -ec
build-sparkctl: # Version information.
[ ! -f "sparkctl/sparkctl-darwin-amd64" ] || [ ! -f "sparkctl/sparkctl-linux-amd64" ] && \ VERSION ?= $(shell cat VERSION | sed "s/^v//")
echo building using $(BUILDER) && \ BUILD_DATE := $(shell date -u +"%Y-%m-%dT%H:%M:%S%:z")
docker run -w $(SPARK_OPERATOR_GOPATH) \ GIT_COMMIT := $(shell git rev-parse HEAD)
-v $$(pwd):$(SPARK_OPERATOR_GOPATH) $(BUILDER) sh -c \ GIT_TAG := $(shell if [ -z "`git status --porcelain`" ]; then git describe --exact-match --tags HEAD 2>/dev/null; fi)
"apk add --no-cache bash git && \ GIT_TREE_STATE := $(shell if [ -z "`git status --porcelain`" ]; then echo "clean" ; else echo "dirty"; fi)
cd sparkctl && \ GIT_SHA := $(shell git rev-parse --short HEAD || echo "HEAD")
./build.sh" || true GIT_VERSION := ${VERSION}+${GIT_SHA}
clean-sparkctl: REPO := github.com/kubeflow/spark-operator
rm -f sparkctl/sparkctl-darwin-amd64 sparkctl/sparkctl-linux-amd64 SPARK_OPERATOR_GOPATH := /go/src/github.com/kubeflow/spark-operator
SPARK_OPERATOR_CHART_PATH := charts/spark-operator-chart
DEP_VERSION := `grep DEP_VERSION= Dockerfile | awk -F\" '{print $$2}'`
BUILDER := `grep "FROM golang:" Dockerfile | awk '{print $$2}'`
UNAME := `uname | tr '[:upper:]' '[:lower:]'`
install-sparkctl: | sparkctl/sparkctl-darwin-amd64 sparkctl/sparkctl-linux-amd64 # CONTAINER_TOOL defines the container tool to be used for building images.
@if [ "$(UNAME)" = "linux" ]; then \ # Be aware that the target commands are only tested with Docker which is
echo "installing linux binary to /usr/local/bin/sparkctl"; \ # scaffolded by default. However, you might want to replace it to use other
sudo cp sparkctl/sparkctl-linux-amd64 /usr/local/bin/sparkctl; \ # tools. (i.e. podman)
sudo chmod +x /usr/local/bin/sparkctl; \ CONTAINER_TOOL ?= docker
elif [ "$(UNAME)" = "darwin" ]; then \
echo "installing macOS binary to /usr/local/bin/sparkctl"; \
cp sparkctl/sparkctl-darwin-amd64 /usr/local/bin/sparkctl; \
chmod +x /usr/local/bin/sparkctl; \
else \
echo "$(UNAME) not supported"; \
fi
build-api-docs: # Image URL to use all building/pushing image targets
docker build -t temp-api-ref-docs hack/api-docs IMAGE_REGISTRY ?= docker.io
docker run -v $$(pwd):/repo/ temp-api-ref-docs \ IMAGE_REPOSITORY ?= kubeflow/spark-operator
sh -c "cd /repo/ && /go/bin/gen-crd-api-reference-docs \ IMAGE_TAG ?= $(VERSION)
-config /repo/hack/api-docs/api-docs-config.json \ IMAGE ?= $(IMAGE_REGISTRY)/$(IMAGE_REPOSITORY):$(IMAGE_TAG)
-api-dir github.com/kubeflow/spark-operator/pkg/apis/sparkoperator.k8s.io/v1beta2 \
-template-dir /repo/hack/api-docs/api-docs-template \
-out-file /repo/docs/api-docs.md"
helm-unittest: # Kind cluster
helm unittest charts/spark-operator-chart --strict --file "tests/**/*_test.yaml" KIND_CLUSTER_NAME ?= spark-operator
KIND_CONFIG_FILE ?= charts/spark-operator-chart/ci/kind-config.yaml
KIND_KUBE_CONFIG ?= $(HOME)/.kube/config
helm-lint: ## Location to install binaries
docker run --rm --workdir /workspace --volume "$$(pwd):/workspace" quay.io/helmpack/chart-testing:latest ct lint LOCALBIN ?= $(shell pwd)/bin
helm-docs: ## Versions
docker run --rm --volume "$$(pwd):/helm-docs" -u "$(id -u)" jnorwood/helm-docs:latest KUSTOMIZE_VERSION ?= v5.4.1
CONTROLLER_TOOLS_VERSION ?= v0.15.0
KIND_VERSION ?= v0.23.0
ENVTEST_VERSION ?= release-0.18
# ENVTEST_K8S_VERSION refers to the version of kubebuilder assets to be downloaded by envtest binary.
ENVTEST_K8S_VERSION ?= 1.29.3
GOLANGCI_LINT_VERSION ?= v1.61.0
GEN_CRD_API_REFERENCE_DOCS_VERSION ?= v0.3.0
HELM_VERSION ?= v3.15.3
HELM_UNITTEST_VERSION ?= 0.5.1
HELM_DOCS_VERSION ?= v1.14.2
fmt-check: clean ## Binaries
@echo "running fmt check"; cd "$(dirname $0)"; \ SPARK_OPERATOR ?= $(LOCALBIN)/spark-operator
if [ -n "$(go fmt ./...)" ]; \ SPARKCTL ?= $(LOCALBIN)/sparkctl
then \ KUBECTL ?= kubectl
echo "Go code is not formatted, please run 'go fmt ./...'." >&2; \ KUSTOMIZE ?= $(LOCALBIN)/kustomize-$(KUSTOMIZE_VERSION)
exit 1; \ CONTROLLER_GEN ?= $(LOCALBIN)/controller-gen-$(CONTROLLER_TOOLS_VERSION)
else \ KIND ?= $(LOCALBIN)/kind-$(KIND_VERSION)
echo "Go code is formatted"; \ ENVTEST ?= $(LOCALBIN)/setup-envtest-$(ENVTEST_VERSION)
fi GOLANGCI_LINT ?= $(LOCALBIN)/golangci-lint-$(GOLANGCI_LINT_VERSION)
GEN_CRD_API_REFERENCE_DOCS ?= $(LOCALBIN)/gen-crd-api-reference-docs-$(GEN_CRD_API_REFERENCE_DOCS_VERSION)
HELM ?= $(LOCALBIN)/helm-$(HELM_VERSION)
HELM_DOCS ?= $(LOCALBIN)/helm-docs-$(HELM_DOCS_VERSION)
detect-crds-drift: ##@ General
diff -q charts/spark-operator-chart/crds manifest/crds --exclude=kustomization.yaml
clean: # The help target prints out all targets with their descriptions organized
# beneath their categories. The categories are represented by '##@' and the
# target descriptions by '##'. The awk command is responsible for reading the
# entire set of makefiles included in this invocation, looking for lines of the
# file as xyz: ## something, and then pretty-format the target and help. Then,
# if there's a line with ##@ something, that gets pretty-printed as a category.
# More info on the usage of ANSI control characters for terminal formatting:
# https://en.wikipedia.org/wiki/ANSI_escape_code#SGR_parameters
# More info on the awk command:
# http://linuxcommand.org/lc3_adv_awk.php
.PHONY: help
help: ## Display this help.
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-30s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)
.PHONY: version
version: ## Print version information.
@echo "Version: ${VERSION}"
@echo "Build Date: ${BUILD_DATE}"
@echo "Git Commit: ${GIT_COMMIT}"
@echo "Git Tag: ${GIT_TAG}"
@echo "Git Tree State: ${GIT_TREE_STATE}"
@echo "Git SHA: ${GIT_SHA}"
@echo "Git Version: ${GIT_VERSION}"
##@ Development
.PHONY: manifests
manifests: controller-gen ## Generate CustomResourceDefinition, RBAC and WebhookConfiguration manifests.
$(CONTROLLER_GEN) crd rbac:roleName=spark-operator-controller webhook paths="./..." output:crd:artifacts:config=config/crd/bases
.PHONY: generate
generate: controller-gen ## Generate code containing DeepCopy, DeepCopyInto, and DeepCopyObject method implementations.
$(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths="./..."
.PHONY: update-crd
update-crd: manifests ## Update CRD files in the Helm chart.
cp config/crd/bases/* charts/spark-operator-chart/crds/
.PHONY: go-clean
go-clean: ## Clean up caches and output.
@echo "cleaning up caches and output" @echo "cleaning up caches and output"
go clean -cache -testcache -r -x 2>&1 >/dev/null go clean -cache -testcache -r -x 2>&1 >/dev/null
-rm -rf _output -rm -rf _output
unit-test: clean .PHONY: go-fmt
@echo "running unit tests" go-fmt: ## Run go fmt against code.
go test -v ./... -covermode=atomic @echo "Running go fmt..."
if [ -n "$(shell go fmt ./...)" ]; then \
echo "Go code is not formatted, need to run \"make go-fmt\" and commit the changes."; \
false; \
else \
echo "Go code is formatted."; \
fi
integration-test: clean .PHONY: go-vet
@echo "running integration tests" go-vet: ## Run go vet against code.
go test -v ./test/e2e/ --kubeconfig "$(HOME)/.kube/config" --operator-image=gcr.io/spark-operator/spark-operator:local @echo "Running go vet..."
go vet ./...
static-analysis: .PHONY: go-lint
@echo "running go vet" go-lint: golangci-lint ## Run golangci-lint linter.
# echo "Building using $(BUILDER)" @echo "Running golangci-lint run..."
# go vet ./... $(GOLANGCI_LINT) run
go vet $(REPO)...
.PHONY: go-lint-fix
go-lint-fix: golangci-lint ## Run golangci-lint linter and perform fixes.
@echo "Running golangci-lint run --fix..."
$(GOLANGCI_LINT) run --fix
.PHONY: unit-test
unit-test: envtest ## Run unit tests.
@echo "Running unit tests..."
KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)"
go test $(shell go list ./... | grep -v /e2e) -coverprofile cover.out
.PHONY: e2e-test
e2e-test: envtest ## Run the e2e tests against a Kind k8s instance that is spun up.
@echo "Running e2e tests..."
go test ./test/e2e/ -v -ginkgo.v -timeout 30m
##@ Build
override LDFLAGS += \
-X ${REPO}.version=${GIT_VERSION} \
-X ${REPO}.buildDate=${BUILD_DATE} \
-X ${REPO}.gitCommit=${GIT_COMMIT} \
-X ${REPO}.gitTreeState=${GIT_TREE_STATE} \
-extldflags "-static"
.PHONY: build-operator
build-operator: ## Build Spark operator.
echo "Building spark-operator binary..."
go build -o $(SPARK_OPERATOR) -ldflags '${LDFLAGS}' cmd/main.go
.PHONY: build-sparkctl
build-sparkctl: ## Build sparkctl binary.
echo "Building sparkctl binary..."
CGO_ENABLED=0 go build -o $(SPARKCTL) -buildvcs=false sparkctl/main.go
.PHONY: install-sparkctl
install-sparkctl: build-sparkctl ## Install sparkctl binary.
echo "Installing sparkctl binary to /usr/local/bin..."; \
sudo cp $(SPARKCTL) /usr/local/bin
.PHONY: clean
clean: ## Clean spark-operator and sparkctl binaries.
rm -f $(SPARK_OPERATOR)
rm -f $(SPARKCTL)
.PHONY: build-api-docs
build-api-docs: gen-crd-api-reference-docs ## Build API documentation.
$(GEN_CRD_API_REFERENCE_DOCS) \
-config hack/api-docs/config.json \
-api-dir github.com/kubeflow/spark-operator/api/v1beta2 \
-template-dir hack/api-docs/template \
-out-file docs/api-docs.md
# If you wish to build the operator image targeting other platforms you can use the --platform flag.
# (i.e. docker build --platform linux/arm64). However, you must enable docker buildKit for it.
# More info: https://docs.docker.com/develop/develop-images/build_enhancements/
.PHONY: docker-build
docker-build: ## Build docker image with the operator.
$(CONTAINER_TOOL) build -t ${IMAGE} .
.PHONY: docker-push
docker-push: ## Push docker image with the operator.
$(CONTAINER_TOOL) push ${IMAGE}
# PLATFORMS defines the target platforms for the operator image be built to provide support to multiple
# architectures. (i.e. make docker-buildx IMG=myregistry/mypoperator:0.0.1). To use this option you need to:
# - be able to use docker buildx. More info: https://docs.docker.com/build/buildx/
# - have enabled BuildKit. More info: https://docs.docker.com/develop/develop-images/build_enhancements/
# - be able to push the image to your registry (i.e. if you do not set a valid value via IMG=<myregistry/image:<tag>> then the export will fail)
# To adequately provide solutions that are compatible with multiple platforms, you should consider using this option.
PLATFORMS ?= linux/amd64,linux/arm64
.PHONY: docker-buildx
docker-buildx: ## Build and push docker image for the operator for cross-platform support
- $(CONTAINER_TOOL) buildx create --name spark-operator-builder
$(CONTAINER_TOOL) buildx use spark-operator-builder
- $(CONTAINER_TOOL) buildx build --push --platform=$(PLATFORMS) --tag ${IMAGE} -f Dockerfile .
- $(CONTAINER_TOOL) buildx rm spark-operator-builder
##@ Helm
.PHONY: detect-crds-drift
detect-crds-drift: manifests ## Detect CRD drift.
diff -q $(SPARK_OPERATOR_CHART_PATH)/crds config/crd/bases
.PHONY: helm-unittest
helm-unittest: helm-unittest-plugin ## Run Helm chart unittests.
$(HELM) unittest $(SPARK_OPERATOR_CHART_PATH) --strict --file "tests/**/*_test.yaml"
.PHONY: helm-lint
helm-lint: ## Run Helm chart lint test.
docker run --rm --workdir /workspace --volume "$$(pwd):/workspace" quay.io/helmpack/chart-testing:latest ct lint --target-branch master --validate-maintainers=false
.PHONY: helm-docs
helm-docs: helm-docs-plugin ## Generates markdown documentation for helm charts from requirements and values files.
$(HELM_DOCS) --sort-values-order=file
##@ Deployment
ifndef ignore-not-found
ignore-not-found = false
endif
.PHONY: kind-create-cluster
kind-create-cluster: kind ## Create a kind cluster for integration tests.
if ! $(KIND) get clusters 2>/dev/null | grep -q "^$(KIND_CLUSTER_NAME)$$"; then \
kind create cluster --name $(KIND_CLUSTER_NAME) --config $(KIND_CONFIG_FILE) --kubeconfig $(KIND_KUBE_CONFIG) --wait=1m; \
fi
.PHONY: kind-load-image
kind-load-image: kind-create-cluster docker-build ## Load the image into the kind cluster.
$(KIND) load docker-image --name $(KIND_CLUSTER_NAME) $(IMAGE)
.PHONY: kind-delete-custer
kind-delete-custer: kind ## Delete the created kind cluster.
$(KIND) delete cluster --name $(KIND_CLUSTER_NAME) --kubeconfig $(KIND_KUBE_CONFIG)
.PHONY: install
install-crd: manifests kustomize ## Install CRDs into the K8s cluster specified in ~/.kube/config.
$(KUSTOMIZE) build config/crd | $(KUBECTL) apply -f -
.PHONY: uninstall
uninstall-crd: manifests kustomize ## Uninstall CRDs from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
$(KUSTOMIZE) build config/crd | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -
.PHONY: deploy
deploy: manifests kustomize ## Deploy controller to the K8s cluster specified in ~/.kube/config.
cd config/manager && $(KUSTOMIZE) edit set image controller=${IMG}
$(KUSTOMIZE) build config/default | $(KUBECTL) apply -f -
.PHONY: undeploy
undeploy: kustomize ## Undeploy controller from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
$(KUSTOMIZE) build config/default | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -
##@ Dependencies
$(LOCALBIN):
mkdir -p $(LOCALBIN)
.PHONY: kustomize
kustomize: $(KUSTOMIZE) ## Download kustomize locally if necessary.
$(KUSTOMIZE): $(LOCALBIN)
$(call go-install-tool,$(KUSTOMIZE),sigs.k8s.io/kustomize/kustomize/v5,$(KUSTOMIZE_VERSION))
.PHONY: controller-gen
controller-gen: $(CONTROLLER_GEN) ## Download controller-gen locally if necessary.
$(CONTROLLER_GEN): $(LOCALBIN)
$(call go-install-tool,$(CONTROLLER_GEN),sigs.k8s.io/controller-tools/cmd/controller-gen,$(CONTROLLER_TOOLS_VERSION))
.PHONY: kind
kind: $(KIND) ## Download kind locally if necessary.
$(KIND): $(LOCALBIN)
$(call go-install-tool,$(KIND),sigs.k8s.io/kind,$(KIND_VERSION))
.PHONY: envtest
envtest: $(ENVTEST) ## Download setup-envtest locally if necessary.
$(ENVTEST): $(LOCALBIN)
$(call go-install-tool,$(ENVTEST),sigs.k8s.io/controller-runtime/tools/setup-envtest,$(ENVTEST_VERSION))
.PHONY: golangci-lint
golangci-lint: $(GOLANGCI_LINT) ## Download golangci-lint locally if necessary.
$(GOLANGCI_LINT): $(LOCALBIN)
$(call go-install-tool,$(GOLANGCI_LINT),github.com/golangci/golangci-lint/cmd/golangci-lint,${GOLANGCI_LINT_VERSION})
.PHONY: gen-crd-api-reference-docs
gen-crd-api-reference-docs: $(GEN_CRD_API_REFERENCE_DOCS) ## Download gen-crd-api-reference-docs locally if necessary.
$(GEN_CRD_API_REFERENCE_DOCS): $(LOCALBIN)
$(call go-install-tool,$(GEN_CRD_API_REFERENCE_DOCS),github.com/ahmetb/gen-crd-api-reference-docs,$(GEN_CRD_API_REFERENCE_DOCS_VERSION))
.PHONY: helm
helm: $(HELM) ## Download helm locally if necessary.
$(HELM): $(LOCALBIN)
$(call go-install-tool,$(HELM),helm.sh/helm/v3/cmd/helm,$(HELM_VERSION))
.PHONY: helm-unittest-plugin
helm-unittest-plugin: helm ## Download helm unittest plugin locally if necessary.
if [ -z "$(shell $(HELM) plugin list | grep unittest)" ]; then \
echo "Installing helm unittest plugin"; \
$(HELM) plugin install https://github.com/helm-unittest/helm-unittest.git --version $(HELM_UNITTEST_VERSION); \
fi
.PHONY: helm-docs-plugin
helm-docs-plugin: $(HELM_DOCS) ## Download helm-docs plugin locally if necessary.
$(HELM_DOCS): $(LOCALBIN)
$(call go-install-tool,$(HELM_DOCS),github.com/norwoodj/helm-docs/cmd/helm-docs,$(HELM_DOCS_VERSION))
# go-install-tool will 'go install' any package with custom target and name of binary, if it doesn't exist
# $1 - target path with name of binary (ideally with version)
# $2 - package url which can be installed
# $3 - specific version of package
define go-install-tool
@[ -f $(1) ] || { \
set -e; \
package=$(2)@$(3) ;\
echo "Downloading $${package}" ;\
GOBIN=$(LOCALBIN) go install $${package} ;\
mv "$$(echo "$(1)" | sed "s/-$(3)$$//")" $(1) ;\
}
endef
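
The `-X ${REPO}.<name>=<value>` linker flags in the LDFLAGS block above only take effect if the root package declares matching string variables. As a hedged sketch (the package name and the PrintVersion helper are assumptions; the variable names come from the Makefile), such a file could look like:

```go
// Package sparkoperator holds build metadata injected at link time via
// -ldflags "-X github.com/kubeflow/spark-operator.<name>=<value>".
package sparkoperator

import "fmt"

var (
	version      string // -X github.com/kubeflow/spark-operator.version=...
	buildDate    string // -X github.com/kubeflow/spark-operator.buildDate=...
	gitCommit    string // -X github.com/kubeflow/spark-operator.gitCommit=...
	gitTreeState string // -X github.com/kubeflow/spark-operator.gitTreeState=...
)

// PrintVersion is a hypothetical helper that mirrors the output of `make version`.
func PrintVersion() {
	fmt.Printf("Version: %s\nBuild Date: %s\nGit Commit: %s\nGit Tree State: %s\n",
		version, buildDate, gitCommit, gitTreeState)
}
```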

PROJECT (new file, 47 lines)

@ -0,0 +1,47 @@
# Code generated by tool. DO NOT EDIT.
# This file is used to track the info used to scaffold your project
# and allow the plugins properly work.
# More info: https://book.kubebuilder.io/reference/project-config.html
domain: sparkoperator.k8s.io
layout:
- go.kubebuilder.io/v4
projectName: spark-operator
repo: github.com/kubeflow/spark-operator
resources:
- api:
crdVersion: v1
namespaced: true
controller: true
domain: sparkoperator.k8s.io
kind: SparkApplication
path: github.com/kubeflow/spark-operator/api/v1beta1
version: v1beta1
- api:
crdVersion: v1
namespaced: true
controller: true
domain: sparkoperator.k8s.io
kind: ScheduledSparkApplication
path: github.com/kubeflow/spark-operator/api/v1beta1
version: v1beta1
- api:
crdVersion: v1
namespaced: true
controller: true
domain: sparkoperator.k8s.io
kind: SparkApplication
path: github.com/kubeflow/spark-operator/api/v1beta2
version: v1beta2
webhooks:
defaulting: true
validation: true
webhookVersion: v1
- api:
crdVersion: v1
namespaced: true
controller: true
domain: sparkoperator.k8s.io
kind: ScheduledSparkApplication
path: github.com/kubeflow/spark-operator/api/v1beta2
version: v1beta2
version: "3"

README.md

@ -31,7 +31,7 @@ The Kubernetes Operator for Apache Spark currently supports the following list o
**Current API version:** *`v1beta2`* **Current API version:** *`v1beta2`*
**If you are currently using the `v1beta1` version of the APIs in your manifests, please update them to use the `v1beta2` version by changing `apiVersion: "sparkoperator.k8s.io/<version>"` to `apiVersion: "sparkoperator.k8s.io/v1beta2"`. You will also need to delete the `previous` version of the CustomResourceDefinitions named `sparkapplications.sparkoperator.k8s.io` and `scheduledsparkapplications.sparkoperator.k8s.io`, and replace them with the `v1beta2` version either by installing the latest version of the operator or by running `kubectl create -f manifest/crds`.** **If you are currently using the `v1beta1` version of the APIs in your manifests, please update them to use the `v1beta2` version by changing `apiVersion: "sparkoperator.k8s.io/<version>"` to `apiVersion: "sparkoperator.k8s.io/v1beta2"`. You will also need to delete the `previous` version of the CustomResourceDefinitions named `sparkapplications.sparkoperator.k8s.io` and `scheduledsparkapplications.sparkoperator.k8s.io`, and replace them with the `v1beta2` version either by installing the latest version of the operator or by running `kubectl create -f config/crd/bases`.**
## Prerequisites ## Prerequisites

VERSION (new file, 1 line)

@ -0,0 +1 @@
v2.0.2


@ -0,0 +1,36 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package v1beta1 contains API Schema definitions for the v1beta1 API group
// +kubebuilder:object:generate=true
// +groupName=sparkoperator.k8s.io
package v1beta1
import (
"k8s.io/apimachinery/pkg/runtime/schema"
"sigs.k8s.io/controller-runtime/pkg/scheme"
)
var (
// GroupVersion is group version used to register these objects.
GroupVersion = schema.GroupVersion{Group: "sparkoperator.k8s.io", Version: "v1beta1"}
// SchemeBuilder is used to add go types to the GroupVersionKind scheme.
SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}
// AddToScheme adds the types in this group-version to the given scheme.
AddToScheme = SchemeBuilder.AddToScheme
)
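
A minimal sketch of how the `AddToScheme` helper defined above is typically consumed when assembling a runtime scheme. The import path follows the `api/v1beta1` layout recorded in the PROJECT file; the snippet is illustrative, not the operator's actual wiring:

```go
package example

import (
	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"

	sparkv1beta1 "github.com/kubeflow/spark-operator/api/v1beta1"
)

// NewScheme builds a scheme that knows both the built-in Kubernetes types
// and the v1beta1 Spark types registered through SchemeBuilder.
func NewScheme() *runtime.Scheme {
	scheme := runtime.NewScheme()
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))
	utilruntime.Must(sparkv1beta1.AddToScheme(scheme))
	return scheme
}
```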


@ -1,5 +1,5 @@
/* /*
Copyright 2017 Google LLC Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
@ -17,36 +17,18 @@ limitations under the License.
package v1beta1 package v1beta1
import ( import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema" "k8s.io/apimachinery/pkg/runtime/schema"
"github.com/kubeflow/spark-operator/pkg/apis/sparkoperator.k8s.io"
) )
const Version = "v1beta1" const (
Group = "sparkoperator.k8s.io"
var ( Version = "v1beta1"
SchemeBuilder = runtime.NewSchemeBuilder(addKnownTypes)
AddToScheme = SchemeBuilder.AddToScheme
) )
// SchemeGroupVersion is the group version used to register these objects. // SchemeGroupVersion is the group version used to register these objects.
var SchemeGroupVersion = schema.GroupVersion{Group: sparkoperator.GroupName, Version: Version} var SchemeGroupVersion = schema.GroupVersion{Group: Group, Version: Version}
// Resource takes an unqualified resource and returns a Group-qualified GroupResource. // Resource takes an unqualified resource and returns a Group-qualified GroupResource.
func Resource(resource string) schema.GroupResource { func Resource(resource string) schema.GroupResource {
return SchemeGroupVersion.WithResource(resource).GroupResource() return SchemeGroupVersion.WithResource(resource).GroupResource()
} }
// addKnownTypes adds the set of types defined in this package to the supplied scheme.
func addKnownTypes(scheme *runtime.Scheme) error {
scheme.AddKnownTypes(SchemeGroupVersion,
&SparkApplication{},
&SparkApplicationList{},
&ScheduledSparkApplication{},
&ScheduledSparkApplicationList{},
)
metav1.AddToGroupVersion(scheme, SchemeGroupVersion)
return nil
}
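
The `Resource` helper kept in the rewritten file above is the usual way to build group-qualified API errors; a short hedged example (the import path is again assumed to be `api/v1beta1`):

```go
package example

import (
	apierrors "k8s.io/apimachinery/pkg/api/errors"

	"github.com/kubeflow/spark-operator/api/v1beta1"
)

// SparkApplicationNotFound returns a standard NotFound API error for the
// named SparkApplication, using the group-qualified resource from Resource().
func SparkApplicationNotFound(name string) error {
	return apierrors.NewNotFound(v1beta1.Resource("sparkapplications"), name)
}
```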


@ -0,0 +1,104 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1beta1
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
// +kubebuilder:skip
func init() {
SchemeBuilder.Register(&ScheduledSparkApplication{}, &ScheduledSparkApplicationList{})
}
// ScheduledSparkApplicationSpec defines the desired state of ScheduledSparkApplication
type ScheduledSparkApplicationSpec struct {
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// Schedule is a cron schedule on which the application should run.
Schedule string `json:"schedule"`
// Template is a template from which SparkApplication instances can be created.
Template SparkApplicationSpec `json:"template"`
// Suspend is a flag telling the controller to suspend subsequent runs of the application if set to true.
// Optional.
// Defaults to false.
Suspend *bool `json:"suspend,omitempty"`
// ConcurrencyPolicy is the policy governing concurrent SparkApplication runs.
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// SuccessfulRunHistoryLimit is the number of past successful runs of the application to keep.
// Optional.
// Defaults to 1.
SuccessfulRunHistoryLimit *int32 `json:"successfulRunHistoryLimit,omitempty"`
// FailedRunHistoryLimit is the number of past failed runs of the application to keep.
// Optional.
// Defaults to 1.
FailedRunHistoryLimit *int32 `json:"failedRunHistoryLimit,omitempty"`
}
// ScheduledSparkApplicationStatus defines the observed state of ScheduledSparkApplication
type ScheduledSparkApplicationStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// LastRun is the time when the last run of the application started.
LastRun metav1.Time `json:"lastRun,omitempty"`
// NextRun is the time when the next run of the application will start.
NextRun metav1.Time `json:"nextRun,omitempty"`
// LastRunName is the name of the SparkApplication for the most recent run of the application.
LastRunName string `json:"lastRunName,omitempty"`
// PastSuccessfulRunNames keeps the names of SparkApplications for past successful runs.
PastSuccessfulRunNames []string `json:"pastSuccessfulRunNames,omitempty"`
// PastFailedRunNames keeps the names of SparkApplications for past failed runs.
PastFailedRunNames []string `json:"pastFailedRunNames,omitempty"`
// ScheduleState is the current scheduling state of the application.
ScheduleState ScheduleState `json:"scheduleState,omitempty"`
// Reason tells why the ScheduledSparkApplication is in the particular ScheduleState.
Reason string `json:"reason,omitempty"`
}
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// ScheduledSparkApplication is the Schema for the scheduledsparkapplications API
type ScheduledSparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ScheduledSparkApplicationSpec `json:"spec,omitempty"`
Status ScheduledSparkApplicationStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// ScheduledSparkApplicationList contains a list of ScheduledSparkApplication
type ScheduledSparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []ScheduledSparkApplication `json:"items"`
}
type ScheduleState string
const (
FailedValidationState ScheduleState = "FailedValidation"
ScheduledState ScheduleState = "Scheduled"
)
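
Putting the fields above together, a hedged sketch of a ScheduledSparkApplication object built from this v1beta1 API (all values are illustrative):

```go
package example

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	"github.com/kubeflow/spark-operator/api/v1beta1"
)

var suspend = false

// Nightly runs the templated SparkApplication at 02:00 every day and
// replaces a still-running instance instead of queueing a second one.
var Nightly = v1beta1.ScheduledSparkApplication{
	ObjectMeta: metav1.ObjectMeta{Name: "nightly-report", Namespace: "default"},
	Spec: v1beta1.ScheduledSparkApplicationSpec{
		Schedule:          "0 2 * * *",
		Suspend:           &suspend,
		ConcurrencyPolicy: v1beta1.ConcurrencyReplace,
		// Template is filled in exactly like a standalone SparkApplicationSpec.
		Template: v1beta1.SparkApplicationSpec{},
	},
}
```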


@ -1,5 +1,5 @@
/* /*
Copyright 2017 Google LLC Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
@ -14,13 +14,163 @@ See the License for the specific language governing permissions and
limitations under the License. limitations under the License.
*/ */
// +kubebuilder:skip
package v1beta1 package v1beta1
import ( import (
apiv1 "k8s.io/api/core/v1" corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
) )
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
// +kubebuilder:skip
func init() {
SchemeBuilder.Register(&SparkApplication{}, &SparkApplicationList{})
}
// SparkApplicationSpec defines the desired state of SparkApplication
type SparkApplicationSpec struct {
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// Type tells the type of the Spark application.
Type SparkApplicationType `json:"type"`
// SparkVersion is the version of Spark the application uses.
SparkVersion string `json:"sparkVersion"`
// Mode is the deployment mode of the Spark application.
Mode DeployMode `json:"mode,omitempty"`
// Image is the container image for the driver, executor, and init-container. Any custom container images for the
// driver, executor, or init-container takes precedence over this.
// Optional.
Image *string `json:"image,omitempty"`
// InitContainerImage is the image of the init-container to use. Overrides Spec.Image if set.
// Optional.
InitContainerImage *string `json:"initContainerImage,omitempty"`
// ImagePullPolicy is the image pull policy for the driver, executor, and init-container.
// Optional.
ImagePullPolicy *string `json:"imagePullPolicy,omitempty"`
// ImagePullSecrets is the list of image-pull secrets.
// Optional.
ImagePullSecrets []string `json:"imagePullSecrets,omitempty"`
// MainClass is the fully-qualified main class of the Spark application.
// This only applies to Java/Scala Spark applications.
// Optional.
MainClass *string `json:"mainClass,omitempty"`
// MainFile is the path to a bundled JAR, Python, or R file of the application.
// Optional.
MainApplicationFile *string `json:"mainApplicationFile"`
// Arguments is a list of arguments to be passed to the application.
// Optional.
Arguments []string `json:"arguments,omitempty"`
// SparkConf carries user-specified Spark configuration properties as they would use the "--conf" option in
// spark-submit.
// Optional.
SparkConf map[string]string `json:"sparkConf,omitempty"`
// HadoopConf carries user-specified Hadoop configuration properties as they would use the "--conf" option
// in spark-submit. The SparkApplication controller automatically adds prefix "spark.hadoop." to Hadoop
// configuration properties.
// Optional.
HadoopConf map[string]string `json:"hadoopConf,omitempty"`
// SparkConfigMap carries the name of the ConfigMap containing Spark configuration files such as log4j.properties.
// The controller will add environment variable SPARK_CONF_DIR to the path where the ConfigMap is mounted to.
// Optional.
SparkConfigMap *string `json:"sparkConfigMap,omitempty"`
// HadoopConfigMap carries the name of the ConfigMap containing Hadoop configuration files such as core-site.xml.
// The controller will add environment variable HADOOP_CONF_DIR to the path where the ConfigMap is mounted to.
// Optional.
HadoopConfigMap *string `json:"hadoopConfigMap,omitempty"`
// Volumes is the list of Kubernetes volumes that can be mounted by the driver and/or executors.
// Optional.
Volumes []corev1.Volume `json:"volumes,omitempty"`
// Driver is the driver specification.
Driver DriverSpec `json:"driver"`
// Executor is the executor specification.
Executor ExecutorSpec `json:"executor"`
// Deps captures all possible types of dependencies of a Spark application.
Deps Dependencies `json:"deps"`
// RestartPolicy defines the policy on if and in which conditions the controller should restart an application.
RestartPolicy RestartPolicy `json:"restartPolicy,omitempty"`
// NodeSelector is the Kubernetes node selector to be added to the driver and executor pods.
// This field is mutually exclusive with nodeSelector at podSpec level (driver or executor).
// This field will be deprecated in future versions (at SparkApplicationSpec level).
// Optional.
NodeSelector map[string]string `json:"nodeSelector,omitempty"`
// FailureRetries is the number of times to retry a failed application before giving up.
// This is best effort and actual retry attempts can be >= the value specified.
// Optional.
FailureRetries *int32 `json:"failureRetries,omitempty"`
// RetryInterval is the unit of intervals in seconds between submission retries.
// Optional.
RetryInterval *int64 `json:"retryInterval,omitempty"`
// This sets the major Python version of the docker
// image used to run the driver and executor containers. Can either be 2 or 3, default 2.
// Optional.
PythonVersion *string `json:"pythonVersion,omitempty"`
// This sets the Memory Overhead Factor that will allocate memory to non-JVM memory.
// For JVM-based jobs this value will default to 0.10, for non-JVM jobs 0.40. Value of this field will
// be overridden by `Spec.Driver.MemoryOverhead` and `Spec.Executor.MemoryOverhead` if they are set.
// Optional.
MemoryOverheadFactor *string `json:"memoryOverheadFactor,omitempty"`
// Monitoring configures how monitoring is handled.
// Optional.
Monitoring *MonitoringSpec `json:"monitoring,omitempty"`
// BatchScheduler configures which batch scheduler will be used for scheduling
// Optional.
BatchScheduler *string `json:"batchScheduler,omitempty"`
}
// SparkApplicationStatus defines the observed state of SparkApplication
type SparkApplicationStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// SparkApplicationID is set by the spark-distribution(via spark.app.id config) on the driver and executor pods
SparkApplicationID string `json:"sparkApplicationId,omitempty"`
// SubmissionID is a unique ID of the current submission of the application.
SubmissionID string `json:"submissionID,omitempty"`
// LastSubmissionAttemptTime is the time for the last application submission attempt.
LastSubmissionAttemptTime metav1.Time `json:"lastSubmissionAttemptTime,omitempty"`
// CompletionTime is the time when the application runs to completion if it does.
TerminationTime metav1.Time `json:"terminationTime,omitempty"`
// DriverInfo has information about the driver.
DriverInfo DriverInfo `json:"driverInfo"`
// AppState tells the overall application state.
AppState ApplicationState `json:"applicationState,omitempty"`
// ExecutorState records the state of executors by executor Pod names.
ExecutorState map[string]ExecutorState `json:"executorState,omitempty"`
// ExecutionAttempts is the total number of attempts to run a submitted application to completion.
// Incremented upon each attempted run of the application and reset upon invalidation.
ExecutionAttempts int32 `json:"executionAttempts,omitempty"`
// SubmissionAttempts is the total number of attempts to submit an application to run.
// Incremented upon each attempted submission of the application and reset upon invalidation and rerun.
SubmissionAttempts int32 `json:"submissionAttempts,omitempty"`
}
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// SparkApplication is the Schema for the sparkapplications API
type SparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec SparkApplicationSpec `json:"spec,omitempty"`
Status SparkApplicationStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// SparkApplicationList contains a list of SparkApplication
type SparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []SparkApplication `json:"items"`
}
// SparkApplicationType describes the type of a Spark application. // SparkApplicationType describes the type of a Spark application.
type SparkApplicationType string type SparkApplicationType string
@ -66,18 +216,6 @@ const (
Always RestartPolicyType = "Always" Always RestartPolicyType = "Always"
) )
// +genclient
// +genclient:noStatus
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +k8s:defaulter-gen=true
type ScheduledSparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec ScheduledSparkApplicationSpec `json:"spec"`
Status ScheduledSparkApplicationStatus `json:"status,omitempty"`
}
type ConcurrencyPolicy string type ConcurrencyPolicy string
const ( const (
@ -90,162 +228,6 @@ const (
ConcurrencyReplace ConcurrencyPolicy = "Replace" ConcurrencyReplace ConcurrencyPolicy = "Replace"
) )
type ScheduledSparkApplicationSpec struct {
// Schedule is a cron schedule on which the application should run.
Schedule string `json:"schedule"`
// Template is a template from which SparkApplication instances can be created.
Template SparkApplicationSpec `json:"template"`
// Suspend is a flag telling the controller to suspend subsequent runs of the application if set to true.
// Optional.
// Defaults to false.
Suspend *bool `json:"suspend,omitempty"`
// ConcurrencyPolicy is the policy governing concurrent SparkApplication runs.
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// SuccessfulRunHistoryLimit is the number of past successful runs of the application to keep.
// Optional.
// Defaults to 1.
SuccessfulRunHistoryLimit *int32 `json:"successfulRunHistoryLimit,omitempty"`
// FailedRunHistoryLimit is the number of past failed runs of the application to keep.
// Optional.
// Defaults to 1.
FailedRunHistoryLimit *int32 `json:"failedRunHistoryLimit,omitempty"`
}
type ScheduleState string
const (
FailedValidationState ScheduleState = "FailedValidation"
ScheduledState ScheduleState = "Scheduled"
)
type ScheduledSparkApplicationStatus struct {
// LastRun is the time when the last run of the application started.
LastRun metav1.Time `json:"lastRun,omitempty"`
// NextRun is the time when the next run of the application will start.
NextRun metav1.Time `json:"nextRun,omitempty"`
// LastRunName is the name of the SparkApplication for the most recent run of the application.
LastRunName string `json:"lastRunName,omitempty"`
// PastSuccessfulRunNames keeps the names of SparkApplications for past successful runs.
PastSuccessfulRunNames []string `json:"pastSuccessfulRunNames,omitempty"`
// PastFailedRunNames keeps the names of SparkApplications for past failed runs.
PastFailedRunNames []string `json:"pastFailedRunNames,omitempty"`
// ScheduleState is the current scheduling state of the application.
ScheduleState ScheduleState `json:"scheduleState,omitempty"`
// Reason tells why the ScheduledSparkApplication is in the particular ScheduleState.
Reason string `json:"reason,omitempty"`
}
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// ScheduledSparkApplicationList carries a list of ScheduledSparkApplication objects.
type ScheduledSparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []ScheduledSparkApplication `json:"items,omitempty"`
}
// +genclient
// +genclient:noStatus
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +k8s:defaulter-gen=true
// SparkApplication represents a Spark application running on and using Kubernetes as a cluster manager.
type SparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec SparkApplicationSpec `json:"spec"`
Status SparkApplicationStatus `json:"status,omitempty"`
}
// SparkApplicationSpec describes the specification of a Spark application using Kubernetes as a cluster manager.
// It carries every pieces of information a spark-submit command takes and recognizes.
type SparkApplicationSpec struct {
// Type tells the type of the Spark application.
Type SparkApplicationType `json:"type"`
// SparkVersion is the version of Spark the application uses.
SparkVersion string `json:"sparkVersion"`
// Mode is the deployment mode of the Spark application.
Mode DeployMode `json:"mode,omitempty"`
// Image is the container image for the driver, executor, and init-container. Any custom container images for the
// driver, executor, or init-container takes precedence over this.
// Optional.
Image *string `json:"image,omitempty"`
// InitContainerImage is the image of the init-container to use. Overrides Spec.Image if set.
// Optional.
InitContainerImage *string `json:"initContainerImage,omitempty"`
// ImagePullPolicy is the image pull policy for the driver, executor, and init-container.
// Optional.
ImagePullPolicy *string `json:"imagePullPolicy,omitempty"`
// ImagePullSecrets is the list of image-pull secrets.
// Optional.
ImagePullSecrets []string `json:"imagePullSecrets,omitempty"`
// MainClass is the fully-qualified main class of the Spark application.
// This only applies to Java/Scala Spark applications.
// Optional.
MainClass *string `json:"mainClass,omitempty"`
// MainFile is the path to a bundled JAR, Python, or R file of the application.
// Optional.
MainApplicationFile *string `json:"mainApplicationFile"`
// Arguments is a list of arguments to be passed to the application.
// Optional.
Arguments []string `json:"arguments,omitempty"`
// SparkConf carries user-specified Spark configuration properties as they would use the "--conf" option in
// spark-submit.
// Optional.
SparkConf map[string]string `json:"sparkConf,omitempty"`
// HadoopConf carries user-specified Hadoop configuration properties as they would use the the "--conf" option
// in spark-submit. The SparkApplication controller automatically adds prefix "spark.hadoop." to Hadoop
// configuration properties.
// Optional.
HadoopConf map[string]string `json:"hadoopConf,omitempty"`
// SparkConfigMap carries the name of the ConfigMap containing Spark configuration files such as log4j.properties.
// The controller will add environment variable SPARK_CONF_DIR to the path where the ConfigMap is mounted to.
// Optional.
SparkConfigMap *string `json:"sparkConfigMap,omitempty"`
// HadoopConfigMap carries the name of the ConfigMap containing Hadoop configuration files such as core-site.xml.
// The controller will add environment variable HADOOP_CONF_DIR to the path where the ConfigMap is mounted to.
// Optional.
HadoopConfigMap *string `json:"hadoopConfigMap,omitempty"`
// Volumes is the list of Kubernetes volumes that can be mounted by the driver and/or executors.
// Optional.
Volumes []apiv1.Volume `json:"volumes,omitempty"`
// Driver is the driver specification.
Driver DriverSpec `json:"driver"`
// Executor is the executor specification.
Executor ExecutorSpec `json:"executor"`
// Deps captures all possible types of dependencies of a Spark application.
Deps Dependencies `json:"deps"`
// RestartPolicy defines the policy on if and in which conditions the controller should restart an application.
RestartPolicy RestartPolicy `json:"restartPolicy,omitempty"`
// NodeSelector is the Kubernetes node selector to be added to the driver and executor pods.
// This field is mutually exclusive with nodeSelector at podSpec level (driver or executor).
// This field will be deprecated in future versions (at SparkApplicationSpec level).
// Optional.
NodeSelector map[string]string `json:"nodeSelector,omitempty"`
// FailureRetries is the number of times to retry a failed application before giving up.
// This is best effort and actual retry attempts can be >= the value specified.
// Optional.
FailureRetries *int32 `json:"failureRetries,omitempty"`
// RetryInterval is the unit of intervals in seconds between submission retries.
// Optional.
RetryInterval *int64 `json:"retryInterval,omitempty"`
// This sets the major Python version of the docker
// image used to run the driver and executor containers. Can either be 2 or 3, default 2.
// Optional.
PythonVersion *string `json:"pythonVersion,omitempty"`
// This sets the Memory Overhead Factor that will allocate memory to non-JVM memory.
// For JVM-based jobs this value will default to 0.10, for non-JVM jobs 0.40. Value of this field will
// be overridden by `Spec.Driver.MemoryOverhead` and `Spec.Executor.MemoryOverhead` if they are set.
// Optional.
MemoryOverheadFactor *string `json:"memoryOverheadFactor,omitempty"`
// Monitoring configures how monitoring is handled.
// Optional.
Monitoring *MonitoringSpec `json:"monitoring,omitempty"`
// BatchScheduler configures which batch scheduler will be used for scheduling
// Optional.
BatchScheduler *string `json:"batchScheduler,omitempty"`
}
// ApplicationStateType represents the type of the current state of an application. // ApplicationStateType represents the type of the current state of an application.
type ApplicationStateType string type ApplicationStateType string
@ -282,39 +264,6 @@ const (
ExecutorUnknownState ExecutorState = "UNKNOWN" ExecutorUnknownState ExecutorState = "UNKNOWN"
) )
// SparkApplicationStatus describes the current status of a Spark application.
type SparkApplicationStatus struct {
// SparkApplicationID is set by the spark-distribution(via spark.app.id config) on the driver and executor pods
SparkApplicationID string `json:"sparkApplicationId,omitempty"`
// SubmissionID is a unique ID of the current submission of the application.
SubmissionID string `json:"submissionID,omitempty"`
// LastSubmissionAttemptTime is the time for the last application submission attempt.
LastSubmissionAttemptTime metav1.Time `json:"lastSubmissionAttemptTime,omitempty"`
// CompletionTime is the time when the application runs to completion if it does.
TerminationTime metav1.Time `json:"terminationTime,omitempty"`
// DriverInfo has information about the driver.
DriverInfo DriverInfo `json:"driverInfo"`
// AppState tells the overall application state.
AppState ApplicationState `json:"applicationState,omitempty"`
// ExecutorState records the state of executors by executor Pod names.
ExecutorState map[string]ExecutorState `json:"executorState,omitempty"`
// ExecutionAttempts is the total number of attempts to run a submitted application to completion.
// Incremented upon each attempted run of the application and reset upon invalidation.
ExecutionAttempts int32 `json:"executionAttempts,omitempty"`
// SubmissionAttempts is the total number of attempts to submit an application to run.
// Incremented upon each attempted submission of the application and reset upon invalidation and rerun.
SubmissionAttempts int32 `json:"submissionAttempts,omitempty"`
}
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// SparkApplicationList carries a list of SparkApplication objects.
type SparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []SparkApplication `json:"items,omitempty"`
}
// Dependencies specifies all possible types of dependencies of a Spark application. // Dependencies specifies all possible types of dependencies of a Spark application.
type Dependencies struct { type Dependencies struct {
// Jars is a list of JAR files the Spark application depends on. // Jars is a list of JAR files the Spark application depends on.
@ -379,22 +328,22 @@ type SparkPodSpec struct {
Annotations map[string]string `json:"annotations,omitempty"` Annotations map[string]string `json:"annotations,omitempty"`
// VolumeMounts specifies the volumes listed in ".spec.volumes" to mount into the main container's filesystem. // VolumeMounts specifies the volumes listed in ".spec.volumes" to mount into the main container's filesystem.
// Optional. // Optional.
VolumeMounts []apiv1.VolumeMount `json:"volumeMounts,omitempty"` VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty"`
// Affinity specifies the affinity/anti-affinity settings for the pod. // Affinity specifies the affinity/anti-affinity settings for the pod.
// Optional. // Optional.
Affinity *apiv1.Affinity `json:"affinity,omitempty"` Affinity *corev1.Affinity `json:"affinity,omitempty"`
// Tolerations specifies the tolerations listed in ".spec.tolerations" to be applied to the pod. // Tolerations specifies the tolerations listed in ".spec.tolerations" to be applied to the pod.
// Optional. // Optional.
Tolerations []apiv1.Toleration `json:"tolerations,omitempty"` Tolerations []corev1.Toleration `json:"tolerations,omitempty"`
// SecurityContext specifies the PodSecurityContext to apply. // SecurityContext specifies the PodSecurityContext to apply.
// Optional. // Optional.
SecurityContext *apiv1.PodSecurityContext `json:"securityContext,omitempty"` SecurityContext *corev1.PodSecurityContext `json:"securityContext,omitempty"`
// SchedulerName specifies the scheduler that will be used for scheduling // SchedulerName specifies the scheduler that will be used for scheduling
// Optional. // Optional.
SchedulerName *string `json:"schedulerName,omitempty"` SchedulerName *string `json:"schedulerName,omitempty"`
// Sidecars is a list of sidecar containers that run along side the main Spark container. // Sidecars is a list of sidecar containers that run along side the main Spark container.
// Optional. // Optional.
Sidecars []apiv1.Container `json:"sidecars,omitempty"` Sidecars []corev1.Container `json:"sidecars,omitempty"`
// HostNetwork indicates whether to request host networking for the pod or not. // HostNetwork indicates whether to request host networking for the pod or not.
// Optional. // Optional.
HostNetwork *bool `json:"hostNetwork,omitempty"` HostNetwork *bool `json:"hostNetwork,omitempty"`
@ -404,12 +353,12 @@ type SparkPodSpec struct {
NodeSelector map[string]string `json:"nodeSelector,omitempty"` NodeSelector map[string]string `json:"nodeSelector,omitempty"`
// DnsConfig dns settings for the pod, following the Kubernetes specifications. // DnsConfig dns settings for the pod, following the Kubernetes specifications.
// Optional. // Optional.
DNSConfig *apiv1.PodDNSConfig `json:"dnsConfig,omitempty"` DNSConfig *corev1.PodDNSConfig `json:"dnsConfig,omitempty"`
} }
// DriverSpec is specification of the driver. // DriverSpec is specification of the driver.
type DriverSpec struct { type DriverSpec struct {
SparkPodSpec SparkPodSpec `json:",inline"`
// PodName is the name of the driver pod that the user creates. This is used for the // PodName is the name of the driver pod that the user creates. This is used for the
// in-cluster client mode in which the user creates a client pod where the driver of // in-cluster client mode in which the user creates a client pod where the driver of
// the user application runs. It's an error to set this field if Mode is not // the user application runs. It's an error to set this field if Mode is not
@ -426,7 +375,7 @@ type DriverSpec struct {
// ExecutorSpec is specification of the executor. // ExecutorSpec is specification of the executor.
type ExecutorSpec struct { type ExecutorSpec struct {
SparkPodSpec SparkPodSpec `json:",inline"`
// Instances is the number of executor instances. // Instances is the number of executor instances.
// Optional. // Optional.
Instances *int32 `json:"instances,omitempty"` Instances *int32 `json:"instances,omitempty"`
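
And a corresponding hedged sketch of a SparkApplication built from the same types; note the `corev1` alias for `k8s.io/api/core/v1`, matching both the rename from `apiv1` in the diff above and the importas rule in `.golangci.yaml` (values are illustrative only):

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	"github.com/kubeflow/spark-operator/api/v1beta1"
)

// strPtr is a small helper for the pointer-typed optional fields.
func strPtr(s string) *string { return &s }

// SparkPi describes a cluster-mode Scala job with one scratch volume.
var SparkPi = v1beta1.SparkApplication{
	ObjectMeta: metav1.ObjectMeta{Name: "spark-pi", Namespace: "default"},
	Spec: v1beta1.SparkApplicationSpec{
		Type:                v1beta1.SparkApplicationType("Scala"),
		Mode:                v1beta1.DeployMode("cluster"),
		SparkVersion:        "3.5.2",
		Image:               strPtr("spark:3.5.2"),
		MainClass:           strPtr("org.apache.spark.examples.SparkPi"),
		MainApplicationFile: strPtr("local:///opt/spark/examples/jars/spark-examples.jar"),
		Volumes: []corev1.Volume{
			{Name: "scratch", VolumeSource: corev1.VolumeSource{EmptyDir: &corev1.EmptyDirVolumeSource{}}},
		},
	},
}
```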


@ -1,10 +1,7 @@
//go:build !ignore_autogenerated //go:build !ignore_autogenerated
// +build !ignore_autogenerated
// Code generated by k8s code-generator DO NOT EDIT.
/* /*
Copyright 2018 Google LLC Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
@ -19,19 +16,18 @@ See the License for the specific language governing permissions and
limitations under the License. limitations under the License.
*/ */
// Code generated by deepcopy-gen. DO NOT EDIT. // Code generated by controller-gen. DO NOT EDIT.
package v1beta1 package v1beta1
import ( import (
v1 "k8s.io/api/core/v1" "k8s.io/api/core/v1"
runtime "k8s.io/apimachinery/pkg/runtime" runtime "k8s.io/apimachinery/pkg/runtime"
) )
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *ApplicationState) DeepCopyInto(out *ApplicationState) { func (in *ApplicationState) DeepCopyInto(out *ApplicationState) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ApplicationState. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ApplicationState.
@ -82,7 +78,6 @@ func (in *Dependencies) DeepCopyInto(out *Dependencies) {
*out = new(int32) *out = new(int32)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Dependencies. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Dependencies.
@ -98,7 +93,6 @@ func (in *Dependencies) DeepCopy() *Dependencies {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *DriverInfo) DeepCopyInto(out *DriverInfo) { func (in *DriverInfo) DeepCopyInto(out *DriverInfo) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverInfo. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverInfo.
@ -130,7 +124,6 @@ func (in *DriverSpec) DeepCopyInto(out *DriverSpec) {
*out = new(string) *out = new(string)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverSpec.
@ -162,7 +155,6 @@ func (in *ExecutorSpec) DeepCopyInto(out *ExecutorSpec) {
*out = new(string) *out = new(string)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ExecutorSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ExecutorSpec.
@ -178,7 +170,6 @@ func (in *ExecutorSpec) DeepCopy() *ExecutorSpec {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *GPUSpec) DeepCopyInto(out *GPUSpec) { func (in *GPUSpec) DeepCopyInto(out *GPUSpec) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GPUSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GPUSpec.
@ -204,7 +195,6 @@ func (in *MonitoringSpec) DeepCopyInto(out *MonitoringSpec) {
*out = new(PrometheusSpec) *out = new(PrometheusSpec)
(*in).DeepCopyInto(*out) (*in).DeepCopyInto(*out)
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MonitoringSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MonitoringSpec.
@ -220,7 +210,6 @@ func (in *MonitoringSpec) DeepCopy() *MonitoringSpec {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *NameKey) DeepCopyInto(out *NameKey) { func (in *NameKey) DeepCopyInto(out *NameKey) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NameKey. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NameKey.
@ -236,7 +225,6 @@ func (in *NameKey) DeepCopy() *NameKey {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *NamePath) DeepCopyInto(out *NamePath) { func (in *NamePath) DeepCopyInto(out *NamePath) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NamePath. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NamePath.
@ -267,7 +255,6 @@ func (in *PrometheusSpec) DeepCopyInto(out *PrometheusSpec) {
*out = new(string) *out = new(string)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PrometheusSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PrometheusSpec.
@ -303,7 +290,6 @@ func (in *RestartPolicy) DeepCopyInto(out *RestartPolicy) {
*out = new(int64) *out = new(int64)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RestartPolicy. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RestartPolicy.
@ -323,7 +309,6 @@ func (in *ScheduledSparkApplication) DeepCopyInto(out *ScheduledSparkApplication
in.ObjectMeta.DeepCopyInto(&out.ObjectMeta) in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
in.Spec.DeepCopyInto(&out.Spec) in.Spec.DeepCopyInto(&out.Spec)
in.Status.DeepCopyInto(&out.Status) in.Status.DeepCopyInto(&out.Status)
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplication. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplication.
@ -356,7 +341,6 @@ func (in *ScheduledSparkApplicationList) DeepCopyInto(out *ScheduledSparkApplica
(*in)[i].DeepCopyInto(&(*out)[i]) (*in)[i].DeepCopyInto(&(*out)[i])
} }
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationList. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationList.
@ -396,7 +380,6 @@ func (in *ScheduledSparkApplicationSpec) DeepCopyInto(out *ScheduledSparkApplica
*out = new(int32) *out = new(int32)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationSpec.
@ -424,7 +407,6 @@ func (in *ScheduledSparkApplicationStatus) DeepCopyInto(out *ScheduledSparkAppli
*out = make([]string, len(*in)) *out = make([]string, len(*in))
copy(*out, *in) copy(*out, *in)
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationStatus. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationStatus.
@ -440,7 +422,6 @@ func (in *ScheduledSparkApplicationStatus) DeepCopy() *ScheduledSparkApplication
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SecretInfo) DeepCopyInto(out *SecretInfo) { func (in *SecretInfo) DeepCopyInto(out *SecretInfo) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SecretInfo. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SecretInfo.
@ -460,7 +441,6 @@ func (in *SparkApplication) DeepCopyInto(out *SparkApplication) {
in.ObjectMeta.DeepCopyInto(&out.ObjectMeta) in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
in.Spec.DeepCopyInto(&out.Spec) in.Spec.DeepCopyInto(&out.Spec)
in.Status.DeepCopyInto(&out.Status) in.Status.DeepCopyInto(&out.Status)
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplication. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplication.
@ -493,7 +473,6 @@ func (in *SparkApplicationList) DeepCopyInto(out *SparkApplicationList) {
(*in)[i].DeepCopyInto(&(*out)[i]) (*in)[i].DeepCopyInto(&(*out)[i])
} }
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationList. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationList.
@ -624,7 +603,6 @@ func (in *SparkApplicationSpec) DeepCopyInto(out *SparkApplicationSpec) {
*out = new(string) *out = new(string)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationSpec.
@ -651,7 +629,6 @@ func (in *SparkApplicationStatus) DeepCopyInto(out *SparkApplicationStatus) {
(*out)[key] = val (*out)[key] = val
} }
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationStatus. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationStatus.
@ -788,7 +765,6 @@ func (in *SparkPodSpec) DeepCopyInto(out *SparkPodSpec) {
*out = new(v1.PodDNSConfig) *out = new(v1.PodDNSConfig)
(*in).DeepCopyInto(*out) (*in).DeepCopyInto(*out)
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkPodSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkPodSpec.

View File

@ -24,15 +24,19 @@ func SetSparkApplicationDefaults(app *SparkApplication) {
return return
} }
if app.Spec.Type == "" {
app.Spec.Type = SparkApplicationTypeScala
}
if app.Spec.Mode == "" { if app.Spec.Mode == "" {
app.Spec.Mode = ClusterMode app.Spec.Mode = DeployModeCluster
} }
if app.Spec.RestartPolicy.Type == "" { if app.Spec.RestartPolicy.Type == "" {
app.Spec.RestartPolicy.Type = Never app.Spec.RestartPolicy.Type = RestartPolicyNever
} }
if app.Spec.RestartPolicy.Type != Never { if app.Spec.RestartPolicy.Type != RestartPolicyNever {
// Default to 5 sec if the RestartPolicy is OnFailure or Always and these values aren't specified. // Default to 5 sec if the RestartPolicy is OnFailure or Always and these values aren't specified.
if app.Spec.RestartPolicy.OnFailureRetryInterval == nil { if app.Spec.RestartPolicy.OnFailureRetryInterval == nil {
app.Spec.RestartPolicy.OnFailureRetryInterval = new(int64) app.Spec.RestartPolicy.OnFailureRetryInterval = new(int64)
@ -50,7 +54,6 @@ func SetSparkApplicationDefaults(app *SparkApplication) {
} }
func setDriverSpecDefaults(spec *DriverSpec, sparkConf map[string]string) { func setDriverSpecDefaults(spec *DriverSpec, sparkConf map[string]string) {
if _, exists := sparkConf["spark.driver.cores"]; !exists && spec.Cores == nil { if _, exists := sparkConf["spark.driver.cores"]; !exists && spec.Cores == nil {
spec.Cores = new(int32) spec.Cores = new(int32)
*spec.Cores = 1 *spec.Cores = 1
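Not part of the diff: a minimal sketch of the new Type default added above, assuming the v1beta2 package import path used below (the real module layout may differ). The tests further down cover the Mode and RestartPolicy defaults.

package main

import (
	"fmt"

	// Assumed import path for the v1beta2 API package; the real module layout may differ.
	"github.com/kubeflow/spark-operator/api/v1beta2"
)

func main() {
	// A SparkApplication with no Type set...
	app := &v1beta2.SparkApplication{}

	// ...gets the Scala type (and cluster mode, Never restart policy) defaulted in.
	v1beta2.SetSparkApplicationDefaults(app)
	fmt.Println(app.Spec.Type == v1beta2.SparkApplicationTypeScala) // true
	fmt.Println(app.Spec.Mode == v1beta2.DeployModeCluster)         // true
}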

View File

@ -36,11 +36,11 @@ func TestSetSparkApplicationDefaultsEmptyModeShouldDefaultToClusterMode(t *testi
SetSparkApplicationDefaults(app) SetSparkApplicationDefaults(app)
assert.Equal(t, ClusterMode, app.Spec.Mode) assert.Equal(t, DeployModeCluster, app.Spec.Mode)
} }
func TestSetSparkApplicationDefaultsModeShouldNotChangeIfSet(t *testing.T) { func TestSetSparkApplicationDefaultsModeShouldNotChangeIfSet(t *testing.T) {
expectedMode := ClientMode expectedMode := DeployModeClient
app := &SparkApplication{ app := &SparkApplication{
Spec: SparkApplicationSpec{ Spec: SparkApplicationSpec{
Mode: expectedMode, Mode: expectedMode,
@ -59,21 +59,21 @@ func TestSetSparkApplicationDefaultsEmptyRestartPolicyShouldDefaultToNever(t *te
SetSparkApplicationDefaults(app) SetSparkApplicationDefaults(app)
assert.Equal(t, Never, app.Spec.RestartPolicy.Type) assert.Equal(t, RestartPolicyNever, app.Spec.RestartPolicy.Type)
} }
func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValues(t *testing.T) { func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValues(t *testing.T) {
app := &SparkApplication{ app := &SparkApplication{
Spec: SparkApplicationSpec{ Spec: SparkApplicationSpec{
RestartPolicy: RestartPolicy{ RestartPolicy: RestartPolicy{
Type: OnFailure, Type: RestartPolicyOnFailure,
}, },
}, },
} }
SetSparkApplicationDefaults(app) SetSparkApplicationDefaults(app)
assert.Equal(t, OnFailure, app.Spec.RestartPolicy.Type) assert.Equal(t, RestartPolicyOnFailure, app.Spec.RestartPolicy.Type)
assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval) assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.Equal(t, int64(5), *app.Spec.RestartPolicy.OnFailureRetryInterval) assert.Equal(t, int64(5), *app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval) assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval)
@ -85,7 +85,7 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
app := &SparkApplication{ app := &SparkApplication{
Spec: SparkApplicationSpec{ Spec: SparkApplicationSpec{
RestartPolicy: RestartPolicy{ RestartPolicy: RestartPolicy{
Type: OnFailure, Type: RestartPolicyOnFailure,
OnSubmissionFailureRetryInterval: &expectedOnSubmissionFailureRetryInterval, OnSubmissionFailureRetryInterval: &expectedOnSubmissionFailureRetryInterval,
}, },
}, },
@ -93,7 +93,7 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
SetSparkApplicationDefaults(app) SetSparkApplicationDefaults(app)
assert.Equal(t, OnFailure, app.Spec.RestartPolicy.Type) assert.Equal(t, RestartPolicyOnFailure, app.Spec.RestartPolicy.Type)
assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval) assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.Equal(t, int64(5), *app.Spec.RestartPolicy.OnFailureRetryInterval) assert.Equal(t, int64(5), *app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval) assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval)
@ -105,7 +105,7 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
app := &SparkApplication{ app := &SparkApplication{
Spec: SparkApplicationSpec{ Spec: SparkApplicationSpec{
RestartPolicy: RestartPolicy{ RestartPolicy: RestartPolicy{
Type: OnFailure, Type: RestartPolicyOnFailure,
OnFailureRetryInterval: &expectedOnFailureRetryInterval, OnFailureRetryInterval: &expectedOnFailureRetryInterval,
}, },
}, },
@ -113,7 +113,7 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
SetSparkApplicationDefaults(app) SetSparkApplicationDefaults(app)
assert.Equal(t, OnFailure, app.Spec.RestartPolicy.Type) assert.Equal(t, RestartPolicyOnFailure, app.Spec.RestartPolicy.Type)
assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval) assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.Equal(t, expectedOnFailureRetryInterval, *app.Spec.RestartPolicy.OnFailureRetryInterval) assert.Equal(t, expectedOnFailureRetryInterval, *app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval) assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval)
@ -121,7 +121,6 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
} }
func TestSetSparkApplicationDefaultsDriverSpecDefaults(t *testing.T) { func TestSetSparkApplicationDefaultsDriverSpecDefaults(t *testing.T) {
// Case 1: Driver config not set. // Case 1: Driver config not set.
app := &SparkApplication{ app := &SparkApplication{
Spec: SparkApplicationSpec{}, Spec: SparkApplicationSpec{},

View File

@ -15,7 +15,6 @@ limitations under the License.
*/ */
// +k8s:deepcopy-gen=package,register // +k8s:deepcopy-gen=package,register
// go:generate controller-gen crd:trivialVersions=true paths=. output:dir=.
// Package v1beta2 is the v1beta2 version of the API. // Package v1beta2 is the v1beta2 version of the API.
// +groupName=sparkoperator.k8s.io // +groupName=sparkoperator.k8s.io

View File

@ -0,0 +1,36 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package v1beta2 contains API Schema definitions for the v1beta2 API group
// +kubebuilder:object:generate=true
// +groupName=sparkoperator.k8s.io
package v1beta2
import (
"k8s.io/apimachinery/pkg/runtime/schema"
"sigs.k8s.io/controller-runtime/pkg/scheme"
)
var (
// GroupVersion is group version used to register these objects.
GroupVersion = schema.GroupVersion{Group: "sparkoperator.k8s.io", Version: "v1beta2"}
// SchemeBuilder is used to add go types to the GroupVersionKind scheme.
SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}
// AddToScheme adds the types in this group-version to the given scheme.
AddToScheme = SchemeBuilder.AddToScheme
)
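Not part of the diff: a minimal sketch of how the new SchemeBuilder/AddToScheme pair is typically wired into a controller-runtime manager. The import path and alias are assumptions; the rest is the standard kubebuilder pattern.

package main

import (
	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"

	// Assumed import path for this package.
	sparkv1beta2 "github.com/kubeflow/spark-operator/api/v1beta2"
)

func main() {
	scheme := runtime.NewScheme()
	// Register the built-in Kubernetes types plus the sparkoperator.k8s.io/v1beta2 types.
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))
	utilruntime.Must(sparkv1beta2.AddToScheme(scheme))

	// A manager built on this scheme can serialize, watch, and reconcile SparkApplications.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{Scheme: scheme})
	if err != nil {
		panic(err)
	}
	_ = mgr
}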

View File

@ -1,7 +1,5 @@
// Code generated by k8s code-generator DO NOT EDIT.
/* /*
Copyright 2018 Google LLC Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
@ -15,3 +13,5 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and See the License for the specific language governing permissions and
limitations under the License. limitations under the License.
*/ */
package v1beta2

View File

@ -1,5 +1,5 @@
/* /*
Copyright 2017 Google LLC Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
@ -17,36 +17,18 @@ limitations under the License.
package v1beta2 package v1beta2
import ( import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema" "k8s.io/apimachinery/pkg/runtime/schema"
"github.com/kubeflow/spark-operator/pkg/apis/sparkoperator.k8s.io"
) )
const Version = "v1beta2" const (
Group = "sparkoperator.k8s.io"
var ( Version = "v1beta2"
SchemeBuilder = runtime.NewSchemeBuilder(addKnownTypes)
AddToScheme = SchemeBuilder.AddToScheme
) )
// SchemeGroupVersion is the group version used to register these objects. // SchemeGroupVersion is the group version used to register these objects.
var SchemeGroupVersion = schema.GroupVersion{Group: sparkoperator.GroupName, Version: Version} var SchemeGroupVersion = schema.GroupVersion{Group: Group, Version: Version}
// Resource takes an unqualified resource and returns a Group-qualified GroupResource. // Resource takes an unqualified resource and returns a Group-qualified GroupResource.
func Resource(resource string) schema.GroupResource { func Resource(resource string) schema.GroupResource {
return SchemeGroupVersion.WithResource(resource).GroupResource() return SchemeGroupVersion.WithResource(resource).GroupResource()
} }
// addKnownTypes adds the set of types defined in this package to the supplied scheme.
func addKnownTypes(scheme *runtime.Scheme) error {
scheme.AddKnownTypes(SchemeGroupVersion,
&SparkApplication{},
&SparkApplicationList{},
&ScheduledSparkApplication{},
&ScheduledSparkApplicationList{},
)
metav1.AddToGroupVersion(scheme, SchemeGroupVersion)
return nil
}
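Not part of the diff: a small sketch of what the retained Resource helper returns, assuming the import path below and the conventional plural resource name.

package main

import (
	"fmt"

	// Assumed import path for this package.
	"github.com/kubeflow/spark-operator/api/v1beta2"
)

func main() {
	// Resource qualifies a plural resource name with the API group defined above.
	gr := v1beta2.Resource("sparkapplications")
	fmt.Println(gr.String()) // sparkapplications.sparkoperator.k8s.io
}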

View File

@ -0,0 +1,125 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1beta2
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
func init() {
SchemeBuilder.Register(&ScheduledSparkApplication{}, &ScheduledSparkApplicationList{})
}
// ScheduledSparkApplicationSpec defines the desired state of ScheduledSparkApplication.
type ScheduledSparkApplicationSpec struct {
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// Schedule is a cron schedule on which the application should run.
Schedule string `json:"schedule"`
// Template is a template from which SparkApplication instances can be created.
Template SparkApplicationSpec `json:"template"`
// Suspend is a flag telling the controller to suspend subsequent runs of the application if set to true.
// +optional
// Defaults to false.
Suspend *bool `json:"suspend,omitempty"`
// ConcurrencyPolicy is the policy governing concurrent SparkApplication runs.
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// SuccessfulRunHistoryLimit is the number of past successful runs of the application to keep.
// +optional
// Defaults to 1.
SuccessfulRunHistoryLimit *int32 `json:"successfulRunHistoryLimit,omitempty"`
// FailedRunHistoryLimit is the number of past failed runs of the application to keep.
// +optional
// Defaults to 1.
FailedRunHistoryLimit *int32 `json:"failedRunHistoryLimit,omitempty"`
}
// ScheduledSparkApplicationStatus defines the observed state of ScheduledSparkApplication.
type ScheduledSparkApplicationStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// LastRun is the time when the last run of the application started.
// +nullable
LastRun metav1.Time `json:"lastRun,omitempty"`
// NextRun is the time when the next run of the application will start.
// +nullable
NextRun metav1.Time `json:"nextRun,omitempty"`
// LastRunName is the name of the SparkApplication for the most recent run of the application.
LastRunName string `json:"lastRunName,omitempty"`
// PastSuccessfulRunNames keeps the names of SparkApplications for past successful runs.
PastSuccessfulRunNames []string `json:"pastSuccessfulRunNames,omitempty"`
// PastFailedRunNames keeps the names of SparkApplications for past failed runs.
PastFailedRunNames []string `json:"pastFailedRunNames,omitempty"`
// ScheduleState is the current scheduling state of the application.
ScheduleState ScheduleState `json:"scheduleState,omitempty"`
// Reason tells why the ScheduledSparkApplication is in the particular ScheduleState.
Reason string `json:"reason,omitempty"`
}
// +kubebuilder:object:root=true
// +kubebuilder:metadata:annotations="api-approved.kubernetes.io=https://github.com/kubeflow/spark-operator/pull/1298"
// +kubebuilder:resource:scope=Namespaced,shortName=scheduledsparkapp,singular=scheduledsparkapplication
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=.spec.schedule,name=Schedule,type=string
// +kubebuilder:printcolumn:JSONPath=.spec.suspend,name=Suspend,type=string
// +kubebuilder:printcolumn:JSONPath=.status.lastRun,name=Last Run,type=date
// +kubebuilder:printcolumn:JSONPath=.status.lastRunName,name=Last Run Name,type=string
// +kubebuilder:printcolumn:JSONPath=.metadata.creationTimestamp,name=Age,type=date
// ScheduledSparkApplication is the Schema for the scheduledsparkapplications API.
type ScheduledSparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec ScheduledSparkApplicationSpec `json:"spec,omitempty"`
Status ScheduledSparkApplicationStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// ScheduledSparkApplicationList contains a list of ScheduledSparkApplication.
type ScheduledSparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []ScheduledSparkApplication `json:"items"`
}
type ConcurrencyPolicy string
const (
// ConcurrencyAllow allows SparkApplications to run concurrently.
ConcurrencyAllow ConcurrencyPolicy = "Allow"
// ConcurrencyForbid forbids concurrent runs of SparkApplications, skipping the next run if the previous
// one hasn't finished yet.
ConcurrencyForbid ConcurrencyPolicy = "Forbid"
// ConcurrencyReplace kills the currently running SparkApplication instance and replaces it with a new one.
ConcurrencyReplace ConcurrencyPolicy = "Replace"
)
type ScheduleState string
const (
ScheduleStateNew ScheduleState = ""
ScheduleStateValidating ScheduleState = "Validating"
ScheduleStateScheduled ScheduleState = "Scheduled"
ScheduleStateFailedValidation ScheduleState = "FailedValidation"
)
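Not part of the diff: a hedged sketch of building a ScheduledSparkApplication with the fields defined above. The import path, object name, and schedule are illustrative; the template is left empty for brevity.

package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	// Assumed import path for this package.
	"github.com/kubeflow/spark-operator/api/v1beta2"
)

func main() {
	suspend := false
	app := &v1beta2.ScheduledSparkApplication{
		ObjectMeta: metav1.ObjectMeta{Name: "nightly-report", Namespace: "default"},
		Spec: v1beta2.ScheduledSparkApplicationSpec{
			// Run every night at midnight; skip a run if the previous one is still going.
			Schedule:          "0 0 * * *",
			ConcurrencyPolicy: v1beta2.ConcurrencyForbid,
			Suspend:           &suspend,
			Template:          v1beta2.SparkApplicationSpec{},
		},
	}
	_ = app
}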

View File

@ -1,5 +1,5 @@
/* /*
Copyright 2017 Google LLC Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
@ -17,168 +17,24 @@ limitations under the License.
package v1beta2 package v1beta2
import ( import (
apiv1 "k8s.io/api/core/v1" corev1 "k8s.io/api/core/v1"
networkingv1 "k8s.io/api/networking/v1" networkingv1 "k8s.io/api/networking/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
) )
// SparkApplicationType describes the type of a Spark application. // EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
type SparkApplicationType string // NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
// Different types of Spark applications. func init() {
const ( SchemeBuilder.Register(&SparkApplication{}, &SparkApplicationList{})
JavaApplicationType SparkApplicationType = "Java"
ScalaApplicationType SparkApplicationType = "Scala"
PythonApplicationType SparkApplicationType = "Python"
RApplicationType SparkApplicationType = "R"
)
// DeployMode describes the type of deployment of a Spark application.
type DeployMode string
// Different types of deployments.
const (
ClusterMode DeployMode = "cluster"
ClientMode DeployMode = "client"
InClusterClientMode DeployMode = "in-cluster-client"
)
// RestartPolicy is the policy of if and in which conditions the controller should restart a terminated application.
// This completely defines actions to be taken on any kind of Failures during an application run.
type RestartPolicy struct {
// Type specifies the RestartPolicyType.
// +kubebuilder:validation:Enum={Never,Always,OnFailure}
Type RestartPolicyType `json:"type,omitempty"`
// OnSubmissionFailureRetries is the number of times to retry submitting an application before giving up.
// This is best effort and actual retry attempts can be >= the value specified due to caching.
// These are required if RestartPolicy is OnFailure.
// +kubebuilder:validation:Minimum=0
// +optional
OnSubmissionFailureRetries *int32 `json:"onSubmissionFailureRetries,omitempty"`
// OnFailureRetries the number of times to retry running an application before giving up.
// +kubebuilder:validation:Minimum=0
// +optional
OnFailureRetries *int32 `json:"onFailureRetries,omitempty"`
// OnSubmissionFailureRetryInterval is the interval in seconds between retries on failed submissions.
// +kubebuilder:validation:Minimum=1
// +optional
OnSubmissionFailureRetryInterval *int64 `json:"onSubmissionFailureRetryInterval,omitempty"`
// OnFailureRetryInterval is the interval in seconds between retries on failed runs.
// +kubebuilder:validation:Minimum=1
// +optional
OnFailureRetryInterval *int64 `json:"onFailureRetryInterval,omitempty"`
} }
type RestartPolicyType string // SparkApplicationSpec defines the desired state of SparkApplication
const (
Never RestartPolicyType = "Never"
OnFailure RestartPolicyType = "OnFailure"
Always RestartPolicyType = "Always"
)
// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +k8s:defaulter-gen=true
// +kubebuilder:subresource:status
// +kubebuilder:resource:scope=Namespaced,shortName=scheduledsparkapp,singular=scheduledsparkapplication
type ScheduledSparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec ScheduledSparkApplicationSpec `json:"spec"`
Status ScheduledSparkApplicationStatus `json:"status,omitempty"`
}
type ConcurrencyPolicy string
const (
// ConcurrencyAllow allows SparkApplications to run concurrently.
ConcurrencyAllow ConcurrencyPolicy = "Allow"
// ConcurrencyForbid forbids concurrent runs of SparkApplications, skipping the next run if the previous
// one hasn't finished yet.
ConcurrencyForbid ConcurrencyPolicy = "Forbid"
// ConcurrencyReplace kills the currently running SparkApplication instance and replaces it with a new one.
ConcurrencyReplace ConcurrencyPolicy = "Replace"
)
type ScheduledSparkApplicationSpec struct {
// Schedule is a cron schedule on which the application should run.
Schedule string `json:"schedule"`
// Template is a template from which SparkApplication instances can be created.
Template SparkApplicationSpec `json:"template"`
// Suspend is a flag telling the controller to suspend subsequent runs of the application if set to true.
// +optional
// Defaults to false.
Suspend *bool `json:"suspend,omitempty"`
// ConcurrencyPolicy is the policy governing concurrent SparkApplication runs.
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// SuccessfulRunHistoryLimit is the number of past successful runs of the application to keep.
// +optional
// Defaults to 1.
SuccessfulRunHistoryLimit *int32 `json:"successfulRunHistoryLimit,omitempty"`
// FailedRunHistoryLimit is the number of past failed runs of the application to keep.
// +optional
// Defaults to 1.
FailedRunHistoryLimit *int32 `json:"failedRunHistoryLimit,omitempty"`
}
type ScheduleState string
const (
FailedValidationState ScheduleState = "FailedValidation"
ScheduledState ScheduleState = "Scheduled"
)
type ScheduledSparkApplicationStatus struct {
// LastRun is the time when the last run of the application started.
// +nullable
LastRun metav1.Time `json:"lastRun,omitempty"`
// NextRun is the time when the next run of the application will start.
// +nullable
NextRun metav1.Time `json:"nextRun,omitempty"`
// LastRunName is the name of the SparkApplication for the most recent run of the application.
LastRunName string `json:"lastRunName,omitempty"`
// PastSuccessfulRunNames keeps the names of SparkApplications for past successful runs.
PastSuccessfulRunNames []string `json:"pastSuccessfulRunNames,omitempty"`
// PastFailedRunNames keeps the names of SparkApplications for past failed runs.
PastFailedRunNames []string `json:"pastFailedRunNames,omitempty"`
// ScheduleState is the current scheduling state of the application.
ScheduleState ScheduleState `json:"scheduleState,omitempty"`
// Reason tells why the ScheduledSparkApplication is in the particular ScheduleState.
Reason string `json:"reason,omitempty"`
}
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// ScheduledSparkApplicationList carries a list of ScheduledSparkApplication objects.
type ScheduledSparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []ScheduledSparkApplication `json:"items,omitempty"`
}
// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +k8s:defaulter-gen=true
// +kubebuilder:subresource:status
// +kubebuilder:resource:scope=Namespaced,shortName=sparkapp,singular=sparkapplication
// SparkApplication represents a Spark application running on and using Kubernetes as a cluster manager.
type SparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec SparkApplicationSpec `json:"spec"`
Status SparkApplicationStatus `json:"status,omitempty"`
}
// SparkApplicationSpec describes the specification of a Spark application using Kubernetes as a cluster manager.
// It carries every piece of information a spark-submit command takes and recognizes. // It carries every piece of information a spark-submit command takes and recognizes.
type SparkApplicationSpec struct { type SparkApplicationSpec struct {
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// Type tells the type of the Spark application. // Type tells the type of the Spark application.
// +kubebuilder:validation:Enum={Java,Python,Scala,R} // +kubebuilder:validation:Enum={Java,Python,Scala,R}
Type SparkApplicationType `json:"type"` Type SparkApplicationType `json:"type"`
@ -215,7 +71,7 @@ type SparkApplicationSpec struct {
// spark-submit. // spark-submit.
// +optional // +optional
SparkConf map[string]string `json:"sparkConf,omitempty"` SparkConf map[string]string `json:"sparkConf,omitempty"`
// HadoopConf carries user-specified Hadoop configuration properties as they would use the the "--conf" option // HadoopConf carries user-specified Hadoop configuration properties as they would use the "--conf" option
// in spark-submit. The SparkApplication controller automatically adds prefix "spark.hadoop." to Hadoop // in spark-submit. The SparkApplication controller automatically adds prefix "spark.hadoop." to Hadoop
// configuration properties. // configuration properties.
// +optional // +optional
@ -230,7 +86,7 @@ type SparkApplicationSpec struct {
HadoopConfigMap *string `json:"hadoopConfigMap,omitempty"` HadoopConfigMap *string `json:"hadoopConfigMap,omitempty"`
// Volumes is the list of Kubernetes volumes that can be mounted by the driver and/or executors. // Volumes is the list of Kubernetes volumes that can be mounted by the driver and/or executors.
// +optional // +optional
Volumes []apiv1.Volume `json:"volumes,omitempty"` Volumes []corev1.Volume `json:"volumes,omitempty"`
// Driver is the driver specification. // Driver is the driver specification.
Driver DriverSpec `json:"driver"` Driver DriverSpec `json:"driver"`
// Executor is the executor specification. // Executor is the executor specification.
@ -289,124 +145,11 @@ type SparkApplicationSpec struct {
DynamicAllocation *DynamicAllocation `json:"dynamicAllocation,omitempty"` DynamicAllocation *DynamicAllocation `json:"dynamicAllocation,omitempty"`
} }
// BatchSchedulerConfiguration used to configure how to batch scheduling Spark Application // SparkApplicationStatus defines the observed state of SparkApplication
type BatchSchedulerConfiguration struct {
// Queue stands for the resource queue which the application belongs to, it's being used in Volcano batch scheduler.
// +optional
Queue *string `json:"queue,omitempty"`
// PriorityClassName stands for the name of k8s PriorityClass resource, it's being used in Volcano batch scheduler.
// +optional
PriorityClassName *string `json:"priorityClassName,omitempty"`
// Resources stands for the resource list custom request for. Usually it is used to define the lower-bound limit.
// If specified, volcano scheduler will consider it as the resources requested.
// +optional
Resources apiv1.ResourceList `json:"resources,omitempty"`
}
// SparkUIConfiguration is for driver UI specific configuration parameters.
type SparkUIConfiguration struct {
// ServicePort allows configuring the port at service level that might be different from the targetPort.
// TargetPort should be the same as the one defined in spark.ui.port
// +optional
ServicePort *int32 `json:"servicePort"`
// ServicePortName allows configuring the name of the service port.
// This may be useful for sidecar proxies like Envoy injected by Istio which require specific ports names to treat traffic as proper HTTP.
// Defaults to spark-driver-ui-port.
// +optional
ServicePortName *string `json:"servicePortName"`
// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *apiv1.ServiceType `json:"serviceType"`
// ServiceAnnotations is a map of key,value pairs of annotations that might be added to the service object.
// +optional
ServiceAnnotations map[string]string `json:"serviceAnnotations,omitempty"`
// ServiceLables is a map of key,value pairs of labels that might be added to the service object.
// +optional
ServiceLabels map[string]string `json:"serviceLabels,omitempty"`
// IngressAnnotations is a map of key,value pairs of annotations that might be added to the ingress object. i.e. specify nginx as ingress.class
// +optional
IngressAnnotations map[string]string `json:"ingressAnnotations,omitempty"`
// TlsHosts is useful If we need to declare SSL certificates to the ingress object
// +optional
IngressTLS []networkingv1.IngressTLS `json:"ingressTLS,omitempty"`
}
// DriverIngressConfiguration is for driver ingress specific configuration parameters.
type DriverIngressConfiguration struct {
// ServicePort allows configuring the port at service level that might be different from the targetPort.
ServicePort *int32 `json:"servicePort"`
// ServicePortName allows configuring the name of the service port.
// This may be useful for sidecar proxies like Envoy injected by Istio which require specific ports names to treat traffic as proper HTTP.
ServicePortName *string `json:"servicePortName"`
// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *apiv1.ServiceType `json:"serviceType"`
// ServiceAnnotations is a map of key,value pairs of annotations that might be added to the service object.
// +optional
ServiceAnnotations map[string]string `json:"serviceAnnotations,omitempty"`
// ServiceLables is a map of key,value pairs of labels that might be added to the service object.
// +optional
ServiceLabels map[string]string `json:"serviceLabels,omitempty"`
// IngressURLFormat is the URL for the ingress.
IngressURLFormat string `json:"ingressURLFormat,omitempty"`
// IngressAnnotations is a map of key,value pairs of annotations that might be added to the ingress object. i.e. specify nginx as ingress.class
// +optional
IngressAnnotations map[string]string `json:"ingressAnnotations,omitempty"`
// TlsHosts is useful If we need to declare SSL certificates to the ingress object
// +optional
IngressTLS []networkingv1.IngressTLS `json:"ingressTLS,omitempty"`
}
// ApplicationStateType represents the type of the current state of an application.
type ApplicationStateType string
// Different states an application may have.
const (
NewState ApplicationStateType = ""
SubmittedState ApplicationStateType = "SUBMITTED"
RunningState ApplicationStateType = "RUNNING"
CompletedState ApplicationStateType = "COMPLETED"
FailedState ApplicationStateType = "FAILED"
FailedSubmissionState ApplicationStateType = "SUBMISSION_FAILED"
PendingRerunState ApplicationStateType = "PENDING_RERUN"
InvalidatingState ApplicationStateType = "INVALIDATING"
SucceedingState ApplicationStateType = "SUCCEEDING"
FailingState ApplicationStateType = "FAILING"
UnknownState ApplicationStateType = "UNKNOWN"
)
// ApplicationState tells the current state of the application and an error message in case of failures.
type ApplicationState struct {
State ApplicationStateType `json:"state"`
ErrorMessage string `json:"errorMessage,omitempty"`
}
// DriverState tells the current state of a spark driver.
type DriverState string
// Different states a spark driver may have.
const (
DriverPendingState DriverState = "PENDING"
DriverRunningState DriverState = "RUNNING"
DriverCompletedState DriverState = "COMPLETED"
DriverFailedState DriverState = "FAILED"
DriverUnknownState DriverState = "UNKNOWN"
)
// ExecutorState tells the current state of an executor.
type ExecutorState string
// Different states an executor may have.
const (
ExecutorPendingState ExecutorState = "PENDING"
ExecutorRunningState ExecutorState = "RUNNING"
ExecutorCompletedState ExecutorState = "COMPLETED"
ExecutorFailedState ExecutorState = "FAILED"
ExecutorUnknownState ExecutorState = "UNKNOWN"
)
// SparkApplicationStatus describes the current status of a Spark application.
type SparkApplicationStatus struct { type SparkApplicationStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// SparkApplicationID is set by the spark-distribution (via spark.app.id config) on the driver and executor pods // SparkApplicationID is set by the spark-distribution (via spark.app.id config) on the driver and executor pods
SparkApplicationID string `json:"sparkApplicationId,omitempty"` SparkApplicationID string `json:"sparkApplicationId,omitempty"`
// SubmissionID is a unique ID of the current submission of the application. // SubmissionID is a unique ID of the current submission of the application.
@ -431,15 +174,209 @@ type SparkApplicationStatus struct {
SubmissionAttempts int32 `json:"submissionAttempts,omitempty"` SubmissionAttempts int32 `json:"submissionAttempts,omitempty"`
} }
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object // +kubebuilder:object:root=true
// +kubebuilder:metadata:annotations="api-approved.kubernetes.io=https://github.com/kubeflow/spark-operator/pull/1298"
// +kubebuilder:resource:scope=Namespaced,shortName=sparkapp,singular=sparkapplication
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=.status.applicationState.state,name=Status,type=string
// +kubebuilder:printcolumn:JSONPath=.status.executionAttempts,name=Attempts,type=string
// +kubebuilder:printcolumn:JSONPath=.status.lastSubmissionAttemptTime,name=Start,type=string
// +kubebuilder:printcolumn:JSONPath=.status.terminationTime,name=Finish,type=string
// +kubebuilder:printcolumn:JSONPath=.metadata.creationTimestamp,name=Age,type=date
// SparkApplicationList carries a list of SparkApplication objects. // SparkApplication is the Schema for the sparkapplications API
type SparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec SparkApplicationSpec `json:"spec,omitempty"`
Status SparkApplicationStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// SparkApplicationList contains a list of SparkApplication
type SparkApplicationList struct { type SparkApplicationList struct {
metav1.TypeMeta `json:",inline"` metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"` metav1.ListMeta `json:"metadata,omitempty"`
Items []SparkApplication `json:"items,omitempty"` Items []SparkApplication `json:"items"`
} }
// SparkApplicationType describes the type of a Spark application.
type SparkApplicationType string
// Different types of Spark applications.
const (
SparkApplicationTypeJava SparkApplicationType = "Java"
SparkApplicationTypeScala SparkApplicationType = "Scala"
SparkApplicationTypePython SparkApplicationType = "Python"
SparkApplicationTypeR SparkApplicationType = "R"
)
// DeployMode describes the type of deployment of a Spark application.
type DeployMode string
// Different types of deployments.
const (
DeployModeCluster DeployMode = "cluster"
DeployModeClient DeployMode = "client"
DeployModeInClusterClient DeployMode = "in-cluster-client"
)
// RestartPolicy is the policy of whether and under which conditions the controller should restart a terminated application.
// This completely defines the actions to be taken on any kind of failure during an application run.
type RestartPolicy struct {
// Type specifies the RestartPolicyType.
// +kubebuilder:validation:Enum={Never,Always,OnFailure}
Type RestartPolicyType `json:"type,omitempty"`
// OnSubmissionFailureRetries is the number of times to retry submitting an application before giving up.
// This is best effort and actual retry attempts can be >= the value specified due to caching.
// These are required if RestartPolicy is OnFailure.
// +kubebuilder:validation:Minimum=0
// +optional
OnSubmissionFailureRetries *int32 `json:"onSubmissionFailureRetries,omitempty"`
// OnFailureRetries is the number of times to retry running an application before giving up.
// +kubebuilder:validation:Minimum=0
// +optional
OnFailureRetries *int32 `json:"onFailureRetries,omitempty"`
// OnSubmissionFailureRetryInterval is the interval in seconds between retries on failed submissions.
// +kubebuilder:validation:Minimum=1
// +optional
OnSubmissionFailureRetryInterval *int64 `json:"onSubmissionFailureRetryInterval,omitempty"`
// OnFailureRetryInterval is the interval in seconds between retries on failed runs.
// +kubebuilder:validation:Minimum=1
// +optional
OnFailureRetryInterval *int64 `json:"onFailureRetryInterval,omitempty"`
}
type RestartPolicyType string
const (
RestartPolicyNever RestartPolicyType = "Never"
RestartPolicyOnFailure RestartPolicyType = "OnFailure"
RestartPolicyAlways RestartPolicyType = "Always"
)
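Not part of the diff: a minimal sketch tying the renamed constants to the SparkApplication type, assuming the import path below. Only fields visible in this diff are set; retry intervals are later defaulted to 5 seconds by SetSparkApplicationDefaults when unset.

package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	// Assumed import path for this package.
	"github.com/kubeflow/spark-operator/api/v1beta2"
)

func main() {
	retries := int32(3)
	app := &v1beta2.SparkApplication{
		ObjectMeta: metav1.ObjectMeta{Name: "spark-pi", Namespace: "default"},
		Spec: v1beta2.SparkApplicationSpec{
			Type: v1beta2.SparkApplicationTypeScala,
			Mode: v1beta2.DeployModeCluster,
			RestartPolicy: v1beta2.RestartPolicy{
				Type:             v1beta2.RestartPolicyOnFailure,
				OnFailureRetries: &retries,
			},
		},
	}
	_ = app
}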
// BatchSchedulerConfiguration is used to configure how to batch-schedule a Spark application.
type BatchSchedulerConfiguration struct {
// Queue stands for the resource queue to which the application belongs; it is used by the Volcano batch scheduler.
// +optional
Queue *string `json:"queue,omitempty"`
// PriorityClassName stands for the name of the k8s PriorityClass resource; it is used by the Volcano batch scheduler.
// +optional
PriorityClassName *string `json:"priorityClassName,omitempty"`
// Resources stands for the custom resource list to request. It is usually used to define a lower-bound limit.
// If specified, the Volcano scheduler will consider it the resources requested.
// +optional
Resources corev1.ResourceList `json:"resources,omitempty"`
}
// SparkUIConfiguration is for driver UI specific configuration parameters.
type SparkUIConfiguration struct {
// ServicePort allows configuring the port at service level that might be different from the targetPort.
// TargetPort should be the same as the one defined in spark.ui.port
// +optional
ServicePort *int32 `json:"servicePort"`
// ServicePortName allows configuring the name of the service port.
// This may be useful for sidecar proxies like Envoy injected by Istio, which require specific port names to treat traffic as proper HTTP.
// Defaults to spark-driver-ui-port.
// +optional
ServicePortName *string `json:"servicePortName"`
// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *corev1.ServiceType `json:"serviceType"`
// ServiceAnnotations is a map of key,value pairs of annotations that might be added to the service object.
// +optional
ServiceAnnotations map[string]string `json:"serviceAnnotations,omitempty"`
// ServiceLabels is a map of key,value pairs of labels that might be added to the service object.
// +optional
ServiceLabels map[string]string `json:"serviceLabels,omitempty"`
// IngressAnnotations is a map of key,value pairs of annotations that might be added to the ingress object. i.e. specify nginx as ingress.class
// +optional
IngressAnnotations map[string]string `json:"ingressAnnotations,omitempty"`
// IngressTLS is useful if we need to declare SSL certificates for the ingress object.
// +optional
IngressTLS []networkingv1.IngressTLS `json:"ingressTLS,omitempty"`
}
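Not part of the diff: an illustrative sketch of filling in SparkUIConfiguration, assuming the import path below; the port and annotation values are placeholders.

package main

import (
	corev1 "k8s.io/api/core/v1"

	// Assumed import path for this package.
	"github.com/kubeflow/spark-operator/api/v1beta2"
)

func main() {
	port := int32(4040) // service-level port; the service target port must still match spark.ui.port
	portName := "spark-driver-ui-port"
	svcType := corev1.ServiceTypeClusterIP
	ui := v1beta2.SparkUIConfiguration{
		ServicePort:     &port,
		ServicePortName: &portName,
		ServiceType:     &svcType,
		IngressAnnotations: map[string]string{
			"kubernetes.io/ingress.class": "nginx",
		},
	}
	_ = ui
}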
// DriverIngressConfiguration is for driver ingress specific configuration parameters.
type DriverIngressConfiguration struct {
// ServicePort allows configuring the port at service level that might be different from the targetPort.
ServicePort *int32 `json:"servicePort"`
// ServicePortName allows configuring the name of the service port.
// This may be useful for sidecar proxies like Envoy injected by Istio, which require specific port names to treat traffic as proper HTTP.
ServicePortName *string `json:"servicePortName"`
// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *corev1.ServiceType `json:"serviceType"`
// ServiceAnnotations is a map of key,value pairs of annotations that might be added to the service object.
// +optional
ServiceAnnotations map[string]string `json:"serviceAnnotations,omitempty"`
// ServiceLabels is a map of key,value pairs of labels that might be added to the service object.
// +optional
ServiceLabels map[string]string `json:"serviceLabels,omitempty"`
// IngressURLFormat is the URL format for the ingress.
IngressURLFormat string `json:"ingressURLFormat,omitempty"`
// IngressAnnotations is a map of key,value pairs of annotations that might be added to the ingress object. i.e. specify nginx as ingress.class
// +optional
IngressAnnotations map[string]string `json:"ingressAnnotations,omitempty"`
// IngressTLS is useful if we need to declare SSL certificates for the ingress object.
// +optional
IngressTLS []networkingv1.IngressTLS `json:"ingressTLS,omitempty"`
}
// ApplicationStateType represents the type of the current state of an application.
type ApplicationStateType string
// Different states an application may have.
const (
ApplicationStateNew ApplicationStateType = ""
ApplicationStateSubmitted ApplicationStateType = "SUBMITTED"
ApplicationStateRunning ApplicationStateType = "RUNNING"
ApplicationStateCompleted ApplicationStateType = "COMPLETED"
ApplicationStateFailed ApplicationStateType = "FAILED"
ApplicationStateFailedSubmission ApplicationStateType = "SUBMISSION_FAILED"
ApplicationStatePendingRerun ApplicationStateType = "PENDING_RERUN"
ApplicationStateInvalidating ApplicationStateType = "INVALIDATING"
ApplicationStateSucceeding ApplicationStateType = "SUCCEEDING"
ApplicationStateFailing ApplicationStateType = "FAILING"
ApplicationStateUnknown ApplicationStateType = "UNKNOWN"
)
// ApplicationState tells the current state of the application and an error message in case of failures.
type ApplicationState struct {
State ApplicationStateType `json:"state"`
ErrorMessage string `json:"errorMessage,omitempty"`
}
// DriverState tells the current state of a spark driver.
type DriverState string
// Different states a spark driver may have.
const (
DriverStatePending DriverState = "PENDING"
DriverStateRunning DriverState = "RUNNING"
DriverStateCompleted DriverState = "COMPLETED"
DriverStateFailed DriverState = "FAILED"
DriverStateUnknown DriverState = "UNKNOWN"
)
// ExecutorState tells the current state of an executor.
type ExecutorState string
// Different states an executor may have.
const (
ExecutorStatePending ExecutorState = "PENDING"
ExecutorStateRunning ExecutorState = "RUNNING"
ExecutorStateCompleted ExecutorState = "COMPLETED"
ExecutorStateFailed ExecutorState = "FAILED"
ExecutorStateUnknown ExecutorState = "UNKNOWN"
)
// Dependencies specifies all possible types of dependencies of a Spark application. // Dependencies specifies all possible types of dependencies of a Spark application.
type Dependencies struct { type Dependencies struct {
// Jars is a list of JAR files the Spark application depends on. // Jars is a list of JAR files the Spark application depends on.
@ -497,14 +434,14 @@ type SparkPodSpec struct {
Secrets []SecretInfo `json:"secrets,omitempty"` Secrets []SecretInfo `json:"secrets,omitempty"`
// Env carries the environment variables to add to the pod. // Env carries the environment variables to add to the pod.
// +optional // +optional
Env []apiv1.EnvVar `json:"env,omitempty"` Env []corev1.EnvVar `json:"env,omitempty"`
// EnvVars carries the environment variables to add to the pod. // EnvVars carries the environment variables to add to the pod.
// Deprecated. Consider using `env` instead. // Deprecated. Consider using `env` instead.
// +optional // +optional
EnvVars map[string]string `json:"envVars,omitempty"` EnvVars map[string]string `json:"envVars,omitempty"`
// EnvFrom is a list of sources to populate environment variables in the container. // EnvFrom is a list of sources to populate environment variables in the container.
// +optional // +optional
EnvFrom []apiv1.EnvFromSource `json:"envFrom,omitempty"` EnvFrom []corev1.EnvFromSource `json:"envFrom,omitempty"`
// EnvSecretKeyRefs holds a mapping from environment variable names to SecretKeyRefs. // EnvSecretKeyRefs holds a mapping from environment variable names to SecretKeyRefs.
// Deprecated. Consider using `env` instead. // Deprecated. Consider using `env` instead.
// +optional // +optional
@ -517,28 +454,28 @@ type SparkPodSpec struct {
Annotations map[string]string `json:"annotations,omitempty"` Annotations map[string]string `json:"annotations,omitempty"`
// VolumeMounts specifies the volumes listed in ".spec.volumes" to mount into the main container's filesystem. // VolumeMounts specifies the volumes listed in ".spec.volumes" to mount into the main container's filesystem.
// +optional // +optional
VolumeMounts []apiv1.VolumeMount `json:"volumeMounts,omitempty"` VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty"`
// Affinity specifies the affinity/anti-affinity settings for the pod. // Affinity specifies the affinity/anti-affinity settings for the pod.
// +optional // +optional
Affinity *apiv1.Affinity `json:"affinity,omitempty"` Affinity *corev1.Affinity `json:"affinity,omitempty"`
// Tolerations specifies the tolerations listed in ".spec.tolerations" to be applied to the pod. // Tolerations specifies the tolerations listed in ".spec.tolerations" to be applied to the pod.
// +optional // +optional
Tolerations []apiv1.Toleration `json:"tolerations,omitempty"` Tolerations []corev1.Toleration `json:"tolerations,omitempty"`
// PodSecurityContext specifies the PodSecurityContext to apply. // PodSecurityContext specifies the PodSecurityContext to apply.
// +optional // +optional
PodSecurityContext *apiv1.PodSecurityContext `json:"podSecurityContext,omitempty"` PodSecurityContext *corev1.PodSecurityContext `json:"podSecurityContext,omitempty"`
// SecurityContext specifies the container's SecurityContext to apply. // SecurityContext specifies the container's SecurityContext to apply.
// +optional // +optional
SecurityContext *apiv1.SecurityContext `json:"securityContext,omitempty"` SecurityContext *corev1.SecurityContext `json:"securityContext,omitempty"`
// SchedulerName specifies the scheduler that will be used for scheduling // SchedulerName specifies the scheduler that will be used for scheduling
// +optional // +optional
SchedulerName *string `json:"schedulerName,omitempty"` SchedulerName *string `json:"schedulerName,omitempty"`
// Sidecars is a list of sidecar containers that run along side the main Spark container. // Sidecars is a list of sidecar containers that run along side the main Spark container.
// +optional // +optional
Sidecars []apiv1.Container `json:"sidecars,omitempty"` Sidecars []corev1.Container `json:"sidecars,omitempty"`
// InitContainers is a list of init-containers that run to completion before the main Spark container. // InitContainers is a list of init-containers that run to completion before the main Spark container.
// +optional // +optional
InitContainers []apiv1.Container `json:"initContainers,omitempty"` InitContainers []corev1.Container `json:"initContainers,omitempty"`
// HostNetwork indicates whether to request host networking for the pod or not. // HostNetwork indicates whether to request host networking for the pod or not.
// +optional // +optional
HostNetwork *bool `json:"hostNetwork,omitempty"` HostNetwork *bool `json:"hostNetwork,omitempty"`
@ -548,7 +485,7 @@ type SparkPodSpec struct {
NodeSelector map[string]string `json:"nodeSelector,omitempty"` NodeSelector map[string]string `json:"nodeSelector,omitempty"`
// DnsConfig dns settings for the pod, following the Kubernetes specifications. // DnsConfig dns settings for the pod, following the Kubernetes specifications.
// +optional // +optional
DNSConfig *apiv1.PodDNSConfig `json:"dnsConfig,omitempty"` DNSConfig *corev1.PodDNSConfig `json:"dnsConfig,omitempty"`
// Termination grace period seconds for the pod // Termination grace period seconds for the pod
// +optional // +optional
TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"` TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"`
@ -557,7 +494,7 @@ type SparkPodSpec struct {
ServiceAccount *string `json:"serviceAccount,omitempty"` ServiceAccount *string `json:"serviceAccount,omitempty"`
// HostAliases settings for the pod, following the Kubernetes specifications. // HostAliases settings for the pod, following the Kubernetes specifications.
// +optional // +optional
HostAliases []apiv1.HostAlias `json:"hostAliases,omitempty"` HostAliases []corev1.HostAlias `json:"hostAliases,omitempty"`
// ShareProcessNamespace settings for the pod, following the Kubernetes specifications. // ShareProcessNamespace settings for the pod, following the Kubernetes specifications.
// +optional // +optional
ShareProcessNamespace *bool `json:"shareProcessNamespace,omitempty"` ShareProcessNamespace *bool `json:"shareProcessNamespace,omitempty"`
@ -583,7 +520,7 @@ type DriverSpec struct {
JavaOptions *string `json:"javaOptions,omitempty"` JavaOptions *string `json:"javaOptions,omitempty"`
// Lifecycle for running preStop or postStart commands // Lifecycle for running preStop or postStart commands
// +optional // +optional
Lifecycle *apiv1.Lifecycle `json:"lifecycle,omitempty"` Lifecycle *corev1.Lifecycle `json:"lifecycle,omitempty"`
// KubernetesMaster is the URL of the Kubernetes master used by the driver to manage executor pods and // KubernetesMaster is the URL of the Kubernetes master used by the driver to manage executor pods and
// other Kubernetes resources. Default to https://kubernetes.default.svc. // other Kubernetes resources. Default to https://kubernetes.default.svc.
// +optional // +optional
@ -599,6 +536,9 @@ type DriverSpec struct {
// Ports settings for the pods, following the Kubernetes specifications. // Ports settings for the pods, following the Kubernetes specifications.
// +optional // +optional
Ports []Port `json:"ports,omitempty"` Ports []Port `json:"ports,omitempty"`
// PriorityClassName is the name of the PriorityClass for the driver pod.
// +optional
PriorityClassName *string `json:"priorityClassName,omitempty"`
} }
// ExecutorSpec is specification of the executor. // ExecutorSpec is specification of the executor.
@ -618,7 +558,7 @@ type ExecutorSpec struct {
JavaOptions *string `json:"javaOptions,omitempty"` JavaOptions *string `json:"javaOptions,omitempty"`
// Lifecycle for running preStop or postStart commands // Lifecycle for running preStop or postStart commands
// +optional // +optional
Lifecycle *apiv1.Lifecycle `json:"lifecycle,omitempty"` Lifecycle *corev1.Lifecycle `json:"lifecycle,omitempty"`
// DeleteOnTermination specify whether executor pods should be deleted in case of failure or normal termination. // DeleteOnTermination specify whether executor pods should be deleted in case of failure or normal termination.
// Maps to `spark.kubernetes.executor.deleteOnTermination` that is available since Spark 3.0. // Maps to `spark.kubernetes.executor.deleteOnTermination` that is available since Spark 3.0.
// +optional // +optional
@ -626,6 +566,9 @@ type ExecutorSpec struct {
// Ports settings for the pods, following the Kubernetes specifications. // Ports settings for the pods, following the Kubernetes specifications.
// +optional // +optional
Ports []Port `json:"ports,omitempty"` Ports []Port `json:"ports,omitempty"`
// PriorityClassName is the name of the PriorityClass for the executor pod.
// +optional
PriorityClassName *string `json:"priorityClassName,omitempty"`
} }
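Not part of the diff: a brief sketch of the new PriorityClassName fields on DriverSpec and ExecutorSpec, assuming the import path below; the PriorityClass names are placeholders.

package main

import (
	// Assumed import path for this package.
	"github.com/kubeflow/spark-operator/api/v1beta2"
)

func main() {
	driverPriority := "spark-driver-priority"     // placeholder PriorityClass name
	executorPriority := "spark-executor-priority" // placeholder PriorityClass name

	spec := v1beta2.SparkApplicationSpec{
		Driver:   v1beta2.DriverSpec{PriorityClassName: &driverPriority},
		Executor: v1beta2.ExecutorSpec{PriorityClassName: &executorPriority},
	}
	_ = spec
}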
// NamePath is a pair of a name and a path to which the named objects should be mounted to. // NamePath is a pair of a name and a path to which the named objects should be mounted to.
@ -639,22 +582,22 @@ type SecretType string
// An enumeration of secret types supported. // An enumeration of secret types supported.
const ( const (
// GCPServiceAccountSecret is for secrets from a GCP service account Json key file that needs // SecretTypeGCPServiceAccount is for secrets from a GCP service account Json key file that needs
// the environment variable GOOGLE_APPLICATION_CREDENTIALS. // the environment variable GOOGLE_APPLICATION_CREDENTIALS.
GCPServiceAccountSecret SecretType = "GCPServiceAccount" SecretTypeGCPServiceAccount SecretType = "GCPServiceAccount"
// HadoopDelegationTokenSecret is for secrets from an Hadoop delegation token that needs the // SecretTypeHadoopDelegationToken is for secrets from an Hadoop delegation token that needs the
// environment variable HADOOP_TOKEN_FILE_LOCATION. // environment variable HADOOP_TOKEN_FILE_LOCATION.
HadoopDelegationTokenSecret SecretType = "HadoopDelegationToken" SecretTypeHadoopDelegationToken SecretType = "HadoopDelegationToken"
// GenericType is for secrets that need no special handling. // SecretTypeGeneric is for secrets that need no special handling.
GenericType SecretType = "Generic" SecretTypeGeneric SecretType = "Generic"
) )
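As a sketch of how these secret types are typically consumed, assuming the driver/executor pod spec exposes a `secrets` list with `name`, `path`, and `secretType` fields (the secret names and mount paths below are placeholders):

```yaml
# Illustrative only: secret names and mount paths are placeholders.
driver:
  secrets:
    - name: gcp-svc-account          # Kubernetes Secret holding a GCP JSON key
      path: /mnt/secrets/gcp
      secretType: GCPServiceAccount  # sets GOOGLE_APPLICATION_CREDENTIALS
    - name: app-credentials
      path: /mnt/secrets/app
      secretType: Generic            # mounted as-is, no extra env handling
```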
// DriverInfo captures information about the driver. // DriverInfo captures information about the driver.
type DriverInfo struct { type DriverInfo struct {
WebUIServiceName string `json:"webUIServiceName,omitempty"` WebUIServiceName string `json:"webUIServiceName,omitempty"`
// UI Details for the UI created via ClusterIP service accessible from within the cluster. // UI Details for the UI created via ClusterIP service accessible from within the cluster.
WebUIPort int32 `json:"webUIPort,omitempty"`
WebUIAddress string `json:"webUIAddress,omitempty"` WebUIAddress string `json:"webUIAddress,omitempty"`
WebUIPort int32 `json:"webUIPort,omitempty"`
// Ingress Details if an ingress for the UI was created. // Ingress Details if an ingress for the UI was created.
WebUIIngressName string `json:"webUIIngressName,omitempty"` WebUIIngressName string `json:"webUIIngressName,omitempty"`
WebUIIngressAddress string `json:"webUIIngressAddress,omitempty"` WebUIIngressAddress string `json:"webUIIngressAddress,omitempty"`
@ -752,39 +695,3 @@ type DynamicAllocation struct {
// +optional // +optional
ShuffleTrackingTimeout *int64 `json:"shuffleTrackingTimeout,omitempty"` ShuffleTrackingTimeout *int64 `json:"shuffleTrackingTimeout,omitempty"`
} }
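A hedged sketch of the corresponding `dynamicAllocation` block in a SparkApplication spec, assuming the usual enable/initial/min/max fields alongside the `shuffleTrackingTimeout` shown above; the executor counts and the 30-second timeout are illustrative values only:

```yaml
# Illustrative values; shuffleTrackingTimeout is in seconds.
spec:
  dynamicAllocation:
    enabled: true
    initialExecutors: 2
    minExecutors: 1
    maxExecutors: 10
    shuffleTrackingTimeout: 30
```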
// PrometheusMonitoringEnabled returns if Prometheus monitoring is enabled or not.
func (s *SparkApplication) PrometheusMonitoringEnabled() bool {
return s.Spec.Monitoring != nil && s.Spec.Monitoring.Prometheus != nil
}
// HasPrometheusConfigFile returns if Prometheus monitoring uses a configuration file in the container.
func (s *SparkApplication) HasPrometheusConfigFile() bool {
return s.PrometheusMonitoringEnabled() &&
s.Spec.Monitoring.Prometheus.ConfigFile != nil &&
*s.Spec.Monitoring.Prometheus.ConfigFile != ""
}
// HasPrometheusConfig returns if Prometheus monitoring defines metricsProperties in the spec.
func (s *SparkApplication) HasMetricsProperties() bool {
return s.PrometheusMonitoringEnabled() &&
s.Spec.Monitoring.MetricsProperties != nil &&
*s.Spec.Monitoring.MetricsProperties != ""
}
// HasPrometheusConfigFile returns if Monitoring defines metricsPropertiesFile in the spec.
func (s *SparkApplication) HasMetricsPropertiesFile() bool {
return s.PrometheusMonitoringEnabled() &&
s.Spec.Monitoring.MetricsPropertiesFile != nil &&
*s.Spec.Monitoring.MetricsPropertiesFile != ""
}
// ExposeDriverMetrics returns if driver metrics should be exposed.
func (s *SparkApplication) ExposeDriverMetrics() bool {
return s.Spec.Monitoring != nil && s.Spec.Monitoring.ExposeDriverMetrics
}
// ExposeExecutorMetrics returns if executor metrics should be exposed.
func (s *SparkApplication) ExposeExecutorMetrics() bool {
return s.Spec.Monitoring != nil && s.Spec.Monitoring.ExposeExecutorMetrics
}
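The helper methods removed here inspect the `monitoring` section of the SparkApplication spec. A minimal sketch of that section, with an assumed JMX exporter jar path, port, and config file location:

```yaml
# Illustrative only: jar path, port, and config file are assumptions.
spec:
  monitoring:
    exposeDriverMetrics: true
    exposeExecutorMetrics: true
    prometheus:
      jmxExporterJar: /prometheus/jmx_prometheus_javaagent.jar
      port: 8090
      configFile: /etc/metrics/conf/prometheus.yaml
```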


@ -1,10 +1,7 @@
//go:build !ignore_autogenerated //go:build !ignore_autogenerated
// +build !ignore_autogenerated
// Code generated by k8s code-generator DO NOT EDIT.
/* /*
Copyright 2018 Google LLC Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
@ -19,12 +16,12 @@ See the License for the specific language governing permissions and
limitations under the License. limitations under the License.
*/ */
// Code generated by deepcopy-gen. DO NOT EDIT. // Code generated by controller-gen. DO NOT EDIT.
package v1beta2 package v1beta2
import ( import (
v1 "k8s.io/api/core/v1" "k8s.io/api/core/v1"
networkingv1 "k8s.io/api/networking/v1" networkingv1 "k8s.io/api/networking/v1"
runtime "k8s.io/apimachinery/pkg/runtime" runtime "k8s.io/apimachinery/pkg/runtime"
) )
@ -32,7 +29,6 @@ import (
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *ApplicationState) DeepCopyInto(out *ApplicationState) { func (in *ApplicationState) DeepCopyInto(out *ApplicationState) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ApplicationState. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ApplicationState.
@ -65,7 +61,6 @@ func (in *BatchSchedulerConfiguration) DeepCopyInto(out *BatchSchedulerConfigura
(*out)[key] = val.DeepCopy() (*out)[key] = val.DeepCopy()
} }
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BatchSchedulerConfiguration. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BatchSchedulerConfiguration.
@ -111,7 +106,6 @@ func (in *Dependencies) DeepCopyInto(out *Dependencies) {
*out = make([]string, len(*in)) *out = make([]string, len(*in))
copy(*out, *in) copy(*out, *in)
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Dependencies. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Dependencies.
@ -127,7 +121,6 @@ func (in *Dependencies) DeepCopy() *Dependencies {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *DriverInfo) DeepCopyInto(out *DriverInfo) { func (in *DriverInfo) DeepCopyInto(out *DriverInfo) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverInfo. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverInfo.
@ -140,6 +133,64 @@ func (in *DriverInfo) DeepCopy() *DriverInfo {
return out return out
} }
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *DriverIngressConfiguration) DeepCopyInto(out *DriverIngressConfiguration) {
*out = *in
if in.ServicePort != nil {
in, out := &in.ServicePort, &out.ServicePort
*out = new(int32)
**out = **in
}
if in.ServicePortName != nil {
in, out := &in.ServicePortName, &out.ServicePortName
*out = new(string)
**out = **in
}
if in.ServiceType != nil {
in, out := &in.ServiceType, &out.ServiceType
*out = new(v1.ServiceType)
**out = **in
}
if in.ServiceAnnotations != nil {
in, out := &in.ServiceAnnotations, &out.ServiceAnnotations
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.ServiceLabels != nil {
in, out := &in.ServiceLabels, &out.ServiceLabels
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.IngressAnnotations != nil {
in, out := &in.IngressAnnotations, &out.IngressAnnotations
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.IngressTLS != nil {
in, out := &in.IngressTLS, &out.IngressTLS
*out = make([]networkingv1.IngressTLS, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverIngressConfiguration.
func (in *DriverIngressConfiguration) DeepCopy() *DriverIngressConfiguration {
if in == nil {
return nil
}
out := new(DriverIngressConfiguration)
in.DeepCopyInto(out)
return out
}
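The structure copied here holds per-application driver ingress settings. Assuming the spec exposes it through a field such as `driverIngressOptions`, an entry might look like the following sketch; the field name, host pattern, port, and annotations are all assumptions for illustration:

```yaml
# Illustrative only: the field name and all values are assumptions.
spec:
  driverIngressOptions:
    - servicePort: 4040
      servicePortName: driver-ui
      serviceType: ClusterIP
      serviceLabels:
        app: spark-driver-ui
      ingressURLFormat: "{{$appName}}.ui.example.com"
      ingressAnnotations:
        nginx.ingress.kubernetes.io/rewrite-target: /
```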
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *DriverSpec) DeepCopyInto(out *DriverSpec) { func (in *DriverSpec) DeepCopyInto(out *DriverSpec) {
*out = *in *out = *in
@ -176,12 +227,23 @@ func (in *DriverSpec) DeepCopyInto(out *DriverSpec) {
(*out)[key] = val (*out)[key] = val
} }
} }
if in.ServiceLabels != nil {
in, out := &in.ServiceLabels, &out.ServiceLabels
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.Ports != nil { if in.Ports != nil {
in, out := &in.Ports, &out.Ports in, out := &in.Ports, &out.Ports
*out = make([]Port, len(*in)) *out = make([]Port, len(*in))
copy(*out, *in) copy(*out, *in)
} }
return if in.PriorityClassName != nil {
in, out := &in.PriorityClassName, &out.PriorityClassName
*out = new(string)
**out = **in
}
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverSpec.
@ -217,7 +279,6 @@ func (in *DynamicAllocation) DeepCopyInto(out *DynamicAllocation) {
*out = new(int64) *out = new(int64)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DynamicAllocation. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DynamicAllocation.
@ -249,6 +310,11 @@ func (in *ExecutorSpec) DeepCopyInto(out *ExecutorSpec) {
*out = new(string) *out = new(string)
**out = **in **out = **in
} }
if in.Lifecycle != nil {
in, out := &in.Lifecycle, &out.Lifecycle
*out = new(v1.Lifecycle)
(*in).DeepCopyInto(*out)
}
if in.DeleteOnTermination != nil { if in.DeleteOnTermination != nil {
in, out := &in.DeleteOnTermination, &out.DeleteOnTermination in, out := &in.DeleteOnTermination, &out.DeleteOnTermination
*out = new(bool) *out = new(bool)
@ -259,7 +325,11 @@ func (in *ExecutorSpec) DeepCopyInto(out *ExecutorSpec) {
*out = make([]Port, len(*in)) *out = make([]Port, len(*in))
copy(*out, *in) copy(*out, *in)
} }
return if in.PriorityClassName != nil {
in, out := &in.PriorityClassName, &out.PriorityClassName
*out = new(string)
**out = **in
}
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ExecutorSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ExecutorSpec.
@ -275,7 +345,6 @@ func (in *ExecutorSpec) DeepCopy() *ExecutorSpec {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *GPUSpec) DeepCopyInto(out *GPUSpec) { func (in *GPUSpec) DeepCopyInto(out *GPUSpec) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GPUSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new GPUSpec.
@ -306,7 +375,6 @@ func (in *MonitoringSpec) DeepCopyInto(out *MonitoringSpec) {
*out = new(PrometheusSpec) *out = new(PrometheusSpec)
(*in).DeepCopyInto(*out) (*in).DeepCopyInto(*out)
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MonitoringSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MonitoringSpec.
@ -322,7 +390,6 @@ func (in *MonitoringSpec) DeepCopy() *MonitoringSpec {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *NameKey) DeepCopyInto(out *NameKey) { func (in *NameKey) DeepCopyInto(out *NameKey) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NameKey. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NameKey.
@ -338,7 +405,6 @@ func (in *NameKey) DeepCopy() *NameKey {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *NamePath) DeepCopyInto(out *NamePath) { func (in *NamePath) DeepCopyInto(out *NamePath) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NamePath. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new NamePath.
@ -354,7 +420,6 @@ func (in *NamePath) DeepCopy() *NamePath {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *Port) DeepCopyInto(out *Port) { func (in *Port) DeepCopyInto(out *Port) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Port. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Port.
@ -390,7 +455,6 @@ func (in *PrometheusSpec) DeepCopyInto(out *PrometheusSpec) {
*out = new(string) *out = new(string)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PrometheusSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PrometheusSpec.
@ -426,7 +490,6 @@ func (in *RestartPolicy) DeepCopyInto(out *RestartPolicy) {
*out = new(int64) *out = new(int64)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RestartPolicy. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new RestartPolicy.
@ -446,7 +509,6 @@ func (in *ScheduledSparkApplication) DeepCopyInto(out *ScheduledSparkApplication
in.ObjectMeta.DeepCopyInto(&out.ObjectMeta) in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
in.Spec.DeepCopyInto(&out.Spec) in.Spec.DeepCopyInto(&out.Spec)
in.Status.DeepCopyInto(&out.Status) in.Status.DeepCopyInto(&out.Status)
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplication. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplication.
@ -479,7 +541,6 @@ func (in *ScheduledSparkApplicationList) DeepCopyInto(out *ScheduledSparkApplica
(*in)[i].DeepCopyInto(&(*out)[i]) (*in)[i].DeepCopyInto(&(*out)[i])
} }
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationList. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationList.
@ -519,7 +580,6 @@ func (in *ScheduledSparkApplicationSpec) DeepCopyInto(out *ScheduledSparkApplica
*out = new(int32) *out = new(int32)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationSpec.
@ -547,7 +607,6 @@ func (in *ScheduledSparkApplicationStatus) DeepCopyInto(out *ScheduledSparkAppli
*out = make([]string, len(*in)) *out = make([]string, len(*in))
copy(*out, *in) copy(*out, *in)
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationStatus. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ScheduledSparkApplicationStatus.
@ -563,7 +622,6 @@ func (in *ScheduledSparkApplicationStatus) DeepCopy() *ScheduledSparkApplication
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SecretInfo) DeepCopyInto(out *SecretInfo) { func (in *SecretInfo) DeepCopyInto(out *SecretInfo) {
*out = *in *out = *in
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SecretInfo. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SecretInfo.
@ -583,7 +641,6 @@ func (in *SparkApplication) DeepCopyInto(out *SparkApplication) {
in.ObjectMeta.DeepCopyInto(&out.ObjectMeta) in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
in.Spec.DeepCopyInto(&out.Spec) in.Spec.DeepCopyInto(&out.Spec)
in.Status.DeepCopyInto(&out.Status) in.Status.DeepCopyInto(&out.Status)
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplication. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplication.
@ -616,7 +673,6 @@ func (in *SparkApplicationList) DeepCopyInto(out *SparkApplicationList) {
(*in)[i].DeepCopyInto(&(*out)[i]) (*in)[i].DeepCopyInto(&(*out)[i])
} }
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationList. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationList.
@ -774,7 +830,6 @@ func (in *SparkApplicationSpec) DeepCopyInto(out *SparkApplicationSpec) {
*out = new(DynamicAllocation) *out = new(DynamicAllocation)
(*in).DeepCopyInto(*out) (*in).DeepCopyInto(*out)
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationSpec.
@ -801,7 +856,6 @@ func (in *SparkApplicationStatus) DeepCopyInto(out *SparkApplicationStatus) {
(*out)[key] = val (*out)[key] = val
} }
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationStatus. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkApplicationStatus.
@ -986,7 +1040,6 @@ func (in *SparkPodSpec) DeepCopyInto(out *SparkPodSpec) {
*out = new(bool) *out = new(bool)
**out = **in **out = **in
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkPodSpec. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkPodSpec.
@ -1024,6 +1077,13 @@ func (in *SparkUIConfiguration) DeepCopyInto(out *SparkUIConfiguration) {
(*out)[key] = val (*out)[key] = val
} }
} }
if in.ServiceLabels != nil {
in, out := &in.ServiceLabels, &out.ServiceLabels
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.IngressAnnotations != nil { if in.IngressAnnotations != nil {
in, out := &in.IngressAnnotations, &out.IngressAnnotations in, out := &in.IngressAnnotations, &out.IngressAnnotations
*out = make(map[string]string, len(*in)) *out = make(map[string]string, len(*in))
@ -1038,7 +1098,6 @@ func (in *SparkUIConfiguration) DeepCopyInto(out *SparkUIConfiguration) {
(*in)[i].DeepCopyInto(&(*out)[i]) (*in)[i].DeepCopyInto(&(*out)[i])
} }
} }
return
} }
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkUIConfiguration. // DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkUIConfiguration.
@ -1050,56 +1109,3 @@ func (in *SparkUIConfiguration) DeepCopy() *SparkUIConfiguration {
in.DeepCopyInto(out) in.DeepCopyInto(out)
return out return out
} }
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *DriverIngressConfiguration) DeepCopyInto(out *DriverIngressConfiguration) {
*out = *in
if in.ServicePort != nil {
in, out := &in.ServicePort, &out.ServicePort
*out = new(int32)
**out = **in
}
if in.ServicePortName != nil {
in, out := &in.ServicePortName, &out.ServicePortName
*out = new(string)
**out = **in
}
if in.ServiceType != nil {
in, out := &in.ServiceType, &out.ServiceType
*out = new(v1.ServiceType)
**out = **in
}
if in.ServiceAnnotations != nil {
in, out := &in.ServiceAnnotations, &out.ServiceAnnotations
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
out.IngressURLFormat = in.IngressURLFormat
if in.IngressAnnotations != nil {
in, out := &in.IngressAnnotations, &out.IngressAnnotations
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.IngressTLS != nil {
in, out := &in.IngressTLS, &out.IngressTLS
*out = make([]networkingv1.IngressTLS, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
return
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverIngressConfiguration.
func (in *DriverIngressConfiguration) DeepCopy() *DriverIngressConfiguration {
if in == nil {
return nil
}
out := new(DriverIngressConfiguration)
in.DeepCopyInto(out)
return out
}


@ -3,6 +3,7 @@
# negation (prefixed with !). Only one pattern per line. # negation (prefixed with !). Only one pattern per line.
ci/ ci/
.helmignore
# Common VCS dirs # Common VCS dirs
.git/ .git/
@ -21,16 +22,16 @@ ci/
*~ *~
# Various IDEs # Various IDEs
*.tmproj
.project .project
.idea/ .idea/
*.tmproj
.vscode/ .vscode/
# MacOS # MacOS
.DS_Store .DS_Store
# helm-unittest # helm-unittest
./tests tests
.debug .debug
__snapshot__ __snapshot__


@ -1,11 +1,39 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
apiVersion: v2 apiVersion: v2
name: spark-operator name: spark-operator
description: A Helm chart for Spark on Kubernetes operator
version: 1.4.3 description: A Helm chart for Spark on Kubernetes operator.
appVersion: v1beta2-1.6.1-3.5.0
version: 2.0.2
appVersion: 2.0.2
keywords: keywords:
- spark - apache spark
- big data
home: https://github.com/kubeflow/spark-operator home: https://github.com/kubeflow/spark-operator
maintainers: maintainers:
- name: yuchaoran2011 - name: yuchaoran2011
email: yuchaoran2011@gmail.com email: yuchaoran2011@gmail.com
url: https://github.com/yuchaoran2011
- name: ChenYi015
email: github@chenyicn.net
url: https://github.com/ChenYi015


@ -1,8 +1,8 @@
# spark-operator # spark-operator
![Version: 1.4.2](https://img.shields.io/badge/Version-1.4.2-informational?style=flat-square) ![AppVersion: v1beta2-1.6.1-3.5.0](https://img.shields.io/badge/AppVersion-v1beta2--1.6.1--3.5.0-informational?style=flat-square) ![Version: 2.0.2](https://img.shields.io/badge/Version-2.0.2-informational?style=flat-square) ![AppVersion: 2.0.2](https://img.shields.io/badge/AppVersion-2.0.2-informational?style=flat-square)
A Helm chart for Spark on Kubernetes operator A Helm chart for Spark on Kubernetes operator.
**Homepage:** <https://github.com/kubeflow/spark-operator> **Homepage:** <https://github.com/kubeflow/spark-operator>
@ -41,13 +41,7 @@ See [helm repo](https://helm.sh/docs/helm/helm_repo) for command documentation.
helm install [RELEASE_NAME] spark-operator/spark-operator helm install [RELEASE_NAME] spark-operator/spark-operator
``` ```
For example, if you want to create a release with name `spark-operator` in the `default` namespace: For example, if you want to create a release with name `spark-operator` in the `spark-operator` namespace:
```shell
helm install spark-operator spark-operator/spark-operator
```
Note that `helm` will fail to install if the namespace doesn't exist. Either create the namespace beforehand or pass the `--create-namespace` flag to the `helm install` command.
```shell ```shell
helm install spark-operator spark-operator/spark-operator \ helm install spark-operator spark-operator/spark-operator \
@ -55,6 +49,8 @@ helm install spark-operator spark-operator/spark-operator \
--create-namespace --create-namespace
``` ```
Note that by passing the `--create-namespace` flag to the `helm install` command, `helm` will create the release namespace if it does not exist.
See [helm install](https://helm.sh/docs/helm/helm_install) for command documentation. See [helm install](https://helm.sh/docs/helm/helm_install) for command documentation.
### Upgrade the chart ### Upgrade the chart
@ -78,71 +74,102 @@ See [helm uninstall](https://helm.sh/docs/helm/helm_uninstall) for command docum
## Values ## Values
| Key | Type | Default | Description | | Key | Type | Default | Description |
|-------------------------------------------|--------|------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |-----|------|---------|-------------|
| affinity | object | `{}` | Affinity for pod assignment | | nameOverride | string | `""` | String to partially override release name. |
| batchScheduler.enable | bool | `false` | Enable batch scheduler for spark jobs scheduling. If enabled, users can specify batch scheduler name in spark application | | fullnameOverride | string | `""` | String to fully override release name. |
| commonLabels | object | `{}` | Common labels to add to the resources | | commonLabels | object | `{}` | Common labels to add to the resources. |
| controllerThreads | int | `10` | Operator concurrency, higher values might increase memory usage | | image.registry | string | `"docker.io"` | Image registry. |
| envFrom | list | `[]` | Pod environment variable sources | | image.repository | string | `"kubeflow/spark-operator"` | Image repository. |
| fullnameOverride | string | `""` | String to override release name | | image.tag | string | If not set, the chart appVersion will be used. | Image tag. |
| image.pullPolicy | string | `"IfNotPresent"` | Image pull policy | | image.pullPolicy | string | `"IfNotPresent"` | Image pull policy. |
| image.repository | string | `"docker.io/kubeflow/spark-operator"` | Image repository | | image.pullSecrets | list | `[]` | Image pull secrets for private image registry. |
| image.tag | string | `""` | if set, override the image tag whose default is the chart appVersion. | | controller.replicas | int | `1` | Number of replicas of controller. |
| imagePullSecrets | list | `[]` | Image pull secrets | | controller.workers | int | `10` | Reconcile concurrency, higher values might increase memory usage. |
| ingressUrlFormat | string | `""` | Ingress URL format. Requires the UI service to be enabled by setting `uiService.enable` to true. | | controller.logLevel | string | `"info"` | Configure the verbosity of logging, can be one of `debug`, `info`, `error`. |
| istio.enabled | bool | `false` | When using `istio`, spark jobs need to run without a sidecar to properly terminate | | controller.uiService.enable | bool | `true` | Specifies whether to create service for Spark web UI. |
| labelSelectorFilter | string | `""` | A comma-separated list of key=value, or key labels to filter resources during watch and list based on the specified labels. | | controller.uiIngress.enable | bool | `false` | Specifies whether to create ingress for Spark web UI. `controller.uiService.enable` must be `true` to enable ingress. |
| leaderElection.lockName | string | `"spark-operator-lock"` | Leader election lock name. Ref: https://github.com/kubeflow/spark-operator/blob/master/docs/user-guide.md#enabling-leader-election-for-high-availability. | | controller.uiIngress.urlFormat | string | `""` | Ingress URL format. Required if `controller.uiIngress.enable` is true. |
| leaderElection.lockNamespace | string | `""` | Optionally store the lock in another namespace. Defaults to operator's namespace | | controller.batchScheduler.enable | bool | `false` | Specifies whether to enable batch scheduler for spark jobs scheduling. If enabled, users can specify batch scheduler name in spark application. |
| logLevel | int | `2` | Set higher levels for more verbose logging | | controller.batchScheduler.kubeSchedulerNames | list | `[]` | Specifies a list of kube-scheduler names for scheduling Spark pods. |
| metrics.enable | bool | `true` | Enable prometheus metric scraping | | controller.batchScheduler.default | string | `""` | Default batch scheduler to be used if not specified by the user. If specified, this value must be either "volcano" or "yunikorn". Specifying any other value will cause the controller to error on startup. |
| metrics.endpoint | string | `"/metrics"` | Metrics serving endpoint | | controller.serviceAccount.create | bool | `true` | Specifies whether to create a service account for the controller. |
| metrics.port | int | `10254` | Metrics port | | controller.serviceAccount.name | string | `""` | Optional name for the controller service account. |
| metrics.portName | string | `"metrics"` | Metrics port name | | controller.serviceAccount.annotations | object | `{}` | Extra annotations for the controller service account. |
| metrics.prefix | string | `""` | Metric prefix, will be added to all exported metrics | | controller.rbac.create | bool | `true` | Specifies whether to create RBAC resources for the controller. |
| nameOverride | string | `""` | String to partially override `spark-operator.fullname` template (will maintain the release name) | | controller.rbac.annotations | object | `{}` | Extra annotations for the controller RBAC resources. |
| nodeSelector | object | `{}` | Node labels for pod assignment | | controller.labels | object | `{}` | Extra labels for controller pods. |
| podDisruptionBudget.enabled | bool | `false` | Whether to deploy a PodDisruptionBudget | | controller.annotations | object | `{}` | Extra annotations for controller pods. |
| podDisruptionBudget.minAvailable | int | `1` | An eviction is allowed if at least "minAvailable" pods selected by "selector" will still be available after the eviction | | controller.volumes | list | `[]` | Volumes for controller pods. |
| podAnnotations | object | `{}` | Additional annotations to add to the pod | | controller.nodeSelector | object | `{}` | Node selector for controller pods. |
| podLabels | object | `{}` | Additional labels to add to the pod | | controller.affinity | object | `{}` | Affinity for controller pods. |
| podMonitor | object | `{"enable":false,"jobLabel":"spark-operator-podmonitor","labels":{},"podMetricsEndpoint":{"interval":"5s","scheme":"http"}}` | Prometheus pod monitor for operator's pod. | | controller.tolerations | list | `[]` | List of node taints to tolerate for controller pods. |
| podMonitor.enable | bool | `false` | If enabled, a pod monitor for operator's pod will be submitted. Note that prometheus metrics should be enabled as well. | | controller.priorityClassName | string | `""` | Priority class for controller pods. |
| podMonitor.jobLabel | string | `"spark-operator-podmonitor"` | The label to use to retrieve the job name from | | controller.podSecurityContext | object | `{}` | Security context for controller pods. |
| podMonitor.labels | object | `{}` | Pod monitor labels | | controller.topologySpreadConstraints | list | `[]` | Topology spread constraints rely on node labels to identify the topology domain(s) that each Node is in. Ref: [Pod Topology Spread Constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/). The labelSelector field in topology spread constraint will be set to the selector labels for controller pods if not specified. |
| podMonitor.podMetricsEndpoint | object | `{"interval":"5s","scheme":"http"}` | Prometheus metrics endpoint properties. `metrics.portName` will be used as a port | | controller.env | list | `[]` | Environment variables for controller containers. |
| podSecurityContext | object | `{}` | Pod security context | | controller.envFrom | list | `[]` | Environment variable sources for controller containers. |
| priorityClassName | string | `""` | A priority class to be used for running spark-operator pod. | | controller.volumeMounts | list | `[]` | Volume mounts for controller containers. |
| rbac.annotations | object | `{}` | Optional annotations for rbac | | controller.resources | object | `{}` | Pod resource requests and limits for controller containers. Note, that each job submission will spawn a JVM within the controller pods using "/usr/local/openjdk-11/bin/java -Xmx128m". Kubernetes may kill these Java processes at will to enforce resource limits. When that happens, you will see the following error: 'failed to run spark-submit for SparkApplication [...]: signal: killed' - when this happens, you may want to increase memory limits. |
| rbac.create | bool | `false` | **DEPRECATED** use `createRole` and `createClusterRole` | | controller.securityContext | object | `{}` | Security context for controller containers. |
| rbac.createClusterRole | bool | `true` | Create and use RBAC `ClusterRole` resources | | controller.sidecars | list | `[]` | Sidecar containers for controller pods. |
| rbac.createRole | bool | `true` | Create and use RBAC `Role` resources | | controller.podDisruptionBudget.enable | bool | `false` | Specifies whether to create pod disruption budget for controller. Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) |
| replicaCount | int | `1` | Desired number of pods, leaderElection will be enabled if this is greater than 1 | | controller.podDisruptionBudget.minAvailable | int | `1` | The number of pods that must be available. Require `controller.replicas` to be greater than 1 |
| resourceQuotaEnforcement.enable | bool | `false` | Whether to enable the ResourceQuota enforcement for SparkApplication resources. Requires the webhook to be enabled by setting `webhook.enable` to true. Ref: https://github.com/kubeflow/spark-operator/blob/master/docs/user-guide.md#enabling-resource-quota-enforcement. | | controller.pprof.enable | bool | `false` | Specifies whether to enable pprof. |
| resources | object | `{}` | Pod resource requests and limits Note, that each job submission will spawn a JVM within the Spark Operator Pod using "/usr/local/openjdk-11/bin/java -Xmx128m". Kubernetes may kill these Java processes at will to enforce resource limits. When that happens, you will see the following error: 'failed to run spark-submit for SparkApplication [...]: signal: killed' - when this happens, you may want to increase memory limits. | | controller.pprof.port | int | `6060` | Specifies pprof port. |
| resyncInterval | int | `30` | Operator resync interval. Note that the operator will respond to events (e.g. create, update) unrelated to this setting | | controller.pprof.portName | string | `"pprof"` | Specifies pprof service port name. |
| securityContext | object | `{}` | Operator container security context | | controller.workqueueRateLimiter.bucketQPS | int | `50` | Specifies the average rate of items process by the workqueue rate limiter. |
| serviceAccounts.spark.annotations | object | `{}` | Optional annotations for the spark service account | | controller.workqueueRateLimiter.bucketSize | int | `500` | Specifies the maximum number of items that can be in the workqueue at any given time. |
| serviceAccounts.spark.create | bool | `true` | Create a service account for spark apps | | controller.workqueueRateLimiter.maxDelay.enable | bool | `true` | Specifies whether to enable max delay for the workqueue rate limiter. This is useful to avoid losing events when the workqueue is full. |
| serviceAccounts.spark.name | string | `""` | Optional name for the spark service account | | controller.workqueueRateLimiter.maxDelay.duration | string | `"6h"` | Specifies the maximum delay duration for the workqueue rate limiter. |
| serviceAccounts.sparkoperator.annotations | object | `{}` | Optional annotations for the operator service account | | webhook.enable | bool | `true` | Specifies whether to enable webhook. |
| serviceAccounts.sparkoperator.create | bool | `true` | Create a service account for the operator | | webhook.replicas | int | `1` | Number of replicas of webhook server. |
| serviceAccounts.sparkoperator.name | string | `""` | Optional name for the operator service account | | webhook.logLevel | string | `"info"` | Configure the verbosity of logging, can be one of `debug`, `info`, `error`. |
| sidecars | list | `[]` | Sidecar containers | | webhook.port | int | `9443` | Specifies webhook port. |
| sparkJobNamespaces | list | `[""]` | List of namespaces where to run spark jobs | | webhook.portName | string | `"webhook"` | Specifies webhook service port name. |
| tolerations | list | `[]` | List of node taints to tolerate | | webhook.failurePolicy | string | `"Fail"` | Specifies how unrecognized errors are handled. Available options are `Ignore` or `Fail`. |
| uiService.enable | bool | `true` | Enable UI service creation for Spark application | | webhook.timeoutSeconds | int | `10` | Specifies the timeout seconds of the webhook, the value must be between 1 and 30. |
| volumeMounts | list | `[]` | | | webhook.resourceQuotaEnforcement.enable | bool | `false` | Specifies whether to enable the ResourceQuota enforcement for SparkApplication resources. |
| volumes | list | `[]` | | | webhook.serviceAccount.create | bool | `true` | Specifies whether to create a service account for the webhook. |
| webhook.enable | bool | `false` | Enable webhook server | | webhook.serviceAccount.name | string | `""` | Optional name for the webhook service account. |
| webhook.namespaceSelector | string | `""` | The webhook server will only operate on namespaces with this label, specified in the form key1=value1,key2=value2. Empty string (default) will operate on all namespaces | | webhook.serviceAccount.annotations | object | `{}` | Extra annotations for the webhook service account. |
| webhook.objectSelector | string | `""` | The webhook will only operate on resources with this label/s, specified in the form key1=value1,key2=value2, OR key in (value1,value2). Empty string (default) will operate on all objects | | webhook.rbac.create | bool | `true` | Specifies whether to create RBAC resources for the webhook. |
| webhook.port | int | `8080` | Webhook service port | | webhook.rbac.annotations | object | `{}` | Extra annotations for the webhook RBAC resources. |
| webhook.portName | string | `"webhook"` | Webhook container port name and service target port name | | webhook.labels | object | `{}` | Extra labels for webhook pods. |
| webhook.timeout | int | `30` | The annotations applied to init job, required to restore certs deleted by the cleanup job during upgrade | | webhook.annotations | object | `{}` | Extra annotations for webhook pods. |
| webhook.sidecars | list | `[]` | Sidecar containers for webhook pods. |
| webhook.volumes | list | `[]` | Volumes for webhook pods. |
| webhook.nodeSelector | object | `{}` | Node selector for webhook pods. |
| webhook.affinity | object | `{}` | Affinity for webhook pods. |
| webhook.tolerations | list | `[]` | List of node taints to tolerate for webhook pods. |
| webhook.priorityClassName | string | `""` | Priority class for webhook pods. |
| webhook.podSecurityContext | object | `{}` | Security context for webhook pods. |
| webhook.topologySpreadConstraints | list | `[]` | Topology spread constraints rely on node labels to identify the topology domain(s) that each Node is in. Ref: [Pod Topology Spread Constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/). The labelSelector field in topology spread constraint will be set to the selector labels for webhook pods if not specified. |
| webhook.env | list | `[]` | Environment variables for webhook containers. |
| webhook.envFrom | list | `[]` | Environment variable sources for webhook containers. |
| webhook.volumeMounts | list | `[]` | Volume mounts for webhook containers. |
| webhook.resources | object | `{}` | Pod resource requests and limits for webhook pods. |
| webhook.securityContext | object | `{}` | Security context for webhook containers. |
| webhook.podDisruptionBudget.enable | bool | `false` | Specifies whether to create pod disruption budget for webhook. Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) |
| webhook.podDisruptionBudget.minAvailable | int | `1` | The number of pods that must be available. Requires `webhook.replicas` to be greater than 1. |
| spark.jobNamespaces | list | `["default"]` | List of namespaces in which to run Spark jobs. If an empty string is included, all namespaces are allowed. Make sure the namespaces already exist. |
| spark.serviceAccount.create | bool | `true` | Specifies whether to create a service account for spark applications. |
| spark.serviceAccount.name | string | `""` | Optional name for the spark service account. |
| spark.serviceAccount.annotations | object | `{}` | Optional annotations for the spark service account. |
| spark.rbac.create | bool | `true` | Specifies whether to create RBAC resources for spark applications. |
| spark.rbac.annotations | object | `{}` | Optional annotations for the spark application RBAC resources. |
| prometheus.metrics.enable | bool | `true` | Specifies whether to enable prometheus metrics scraping. |
| prometheus.metrics.port | int | `8080` | Metrics port. |
| prometheus.metrics.portName | string | `"metrics"` | Metrics port name. |
| prometheus.metrics.endpoint | string | `"/metrics"` | Metrics serving endpoint. |
| prometheus.metrics.prefix | string | `""` | Metrics prefix, will be added to all exported metrics. |
| prometheus.podMonitor.create | bool | `false` | Specifies whether to create pod monitor. Note that prometheus metrics should be enabled as well. |
| prometheus.podMonitor.labels | object | `{}` | Pod monitor labels |
| prometheus.podMonitor.jobLabel | string | `"spark-operator-podmonitor"` | The label to use to retrieve the job name from |
| prometheus.podMonitor.podMetricsEndpoint | object | `{"interval":"5s","scheme":"http"}` | Prometheus metrics endpoint properties. `metrics.portName` will be used as a port |
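To make the table above concrete, a minimal `values.yaml` override sketch touching a few of the documented keys; the numbers and the `spark-jobs` namespace are examples, not recommended defaults:

```yaml
# Example overrides only; all values are illustrative.
controller:
  replicas: 1
  workers: 20                     # reconcile concurrency
  logLevel: debug
  workqueueRateLimiter:
    bucketQPS: 50
    bucketSize: 500
    maxDelay:
      enable: true
      duration: 6h
webhook:
  enable: true
  port: 9443
spark:
  jobNamespaces:
    - default
    - spark-jobs                  # assumed namespace, must already exist
prometheus:
  metrics:
    enable: true
    port: 8080
```

Such a file would be applied with something like `helm upgrade --install spark-operator spark-operator/spark-operator -f values.yaml`.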
## Maintainers ## Maintainers
| Name | Email | Url | | Name | Email | Url |
| ---- | ------ | --- | | ---- | ------ | --- |
| yuchaoran2011 | <yuchaoran2011@gmail.com> | | | yuchaoran2011 | <yuchaoran2011@gmail.com> | <https://github.com/yuchaoran2011> |
| ChenYi015 | <github@chenyicn.net> | <https://github.com/ChenYi015> |


@ -43,13 +43,7 @@ See [helm repo](https://helm.sh/docs/helm/helm_repo) for command documentation.
helm install [RELEASE_NAME] spark-operator/spark-operator helm install [RELEASE_NAME] spark-operator/spark-operator
``` ```
For example, if you want to create a release with name `spark-operator` in the `default` namespace: For example, if you want to create a release with name `spark-operator` in the `spark-operator` namespace:
```shell
helm install spark-operator spark-operator/spark-operator
```
Note that `helm` will fail to install if the namespace doesn't exist. Either create the namespace beforehand or pass the `--create-namespace` flag to the `helm install` command.
```shell ```shell
helm install spark-operator spark-operator/spark-operator \ helm install spark-operator spark-operator/spark-operator \
@ -57,6 +51,8 @@ helm install spark-operator spark-operator/spark-operator \
--create-namespace --create-namespace
``` ```
Note that by passing the `--create-namespace` flag to the `helm install` command, `helm` will create the release namespace if it does not exist.
See [helm install](https://helm.sh/docs/helm/helm_install) for command documentation. See [helm install](https://helm.sh/docs/helm/helm_install) for command documentation.
### Upgrade the chart ### Upgrade the chart


@ -1,2 +1,2 @@
image: image:
tag: "local" tag: local


@ -0,0 +1,7 @@
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
image: kindest/node:v1.29.2
- role: worker
image: kindest/node:v1.29.2


@ -1,3 +1,19 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/* vim: set filetype=mustache: */}} {{/* vim: set filetype=mustache: */}}
{{/* {{/*
Expand the name of the chart. Expand the name of the chart.
@ -37,13 +53,13 @@ Common labels
{{- define "spark-operator.labels" -}} {{- define "spark-operator.labels" -}}
helm.sh/chart: {{ include "spark-operator.chart" . }} helm.sh/chart: {{ include "spark-operator.chart" . }}
{{ include "spark-operator.selectorLabels" . }} {{ include "spark-operator.selectorLabels" . }}
{{- if .Values.commonLabels }}
{{ toYaml .Values.commonLabels }}
{{- end }}
{{- if .Chart.AppVersion }} {{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }} app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }} {{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }} app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- with .Values.commonLabels }}
{{ toYaml . }}
{{- end }}
{{- end }} {{- end }}
{{/* {{/*
@ -55,25 +71,8 @@ app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }} {{- end }}
{{/* {{/*
Create the name of the service account to be used by the operator Spark Operator image
*/}} */}}
{{- define "spark-operator.serviceAccountName" -}} {{- define "spark-operator.image" -}}
{{- if .Values.serviceAccounts.sparkoperator.create -}} {{ printf "%s/%s:%s" .Values.image.registry .Values.image.repository (.Values.image.tag | default .Chart.AppVersion) }}
{{ default (include "spark-operator.fullname" .) .Values.serviceAccounts.sparkoperator.name }}
{{- else -}}
{{ default "default" .Values.serviceAccounts.sparkoperator.name }}
{{- end -}} {{- end -}}
{{- end -}}
{{/*
Create the name of the service account to be used by spark apps
*/}}
{{- define "spark.serviceAccountName" -}}
{{- if .Values.serviceAccounts.spark.create -}}
{{- $sparkServiceaccount := printf "%s-%s" .Release.Name "spark" -}}
{{ default $sparkServiceaccount .Values.serviceAccounts.spark.name }}
{{- else -}}
{{ default "default" .Values.serviceAccounts.spark.name }}
{{- end -}}
{{- end -}}
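As a quick illustration of the new `spark-operator.image` helper, the (assumed) image values below:

```yaml
# Illustrative values.yaml fragment.
image:
  registry: docker.io
  repository: kubeflow/spark-operator
  tag: ""   # empty: falls back to .Chart.AppVersion
```

would render as `docker.io/kubeflow/spark-operator:2.0.2`, since an unset tag defaults to the chart's `appVersion`.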


@ -0,0 +1,204 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/*
Create the name of controller component
*/}}
{{- define "spark-operator.controller.name" -}}
{{- include "spark-operator.fullname" . }}-controller
{{- end -}}
{{/*
Common labels for the controller
*/}}
{{- define "spark-operator.controller.labels" -}}
{{ include "spark-operator.labels" . }}
app.kubernetes.io/component: controller
{{- end -}}
{{/*
Selector labels for the controller
*/}}
{{- define "spark-operator.controller.selectorLabels" -}}
{{ include "spark-operator.selectorLabels" . }}
app.kubernetes.io/component: controller
{{- end -}}
{{/*
Create the name of the service account to be used by the controller
*/}}
{{- define "spark-operator.controller.serviceAccountName" -}}
{{- if .Values.controller.serviceAccount.create -}}
{{ .Values.controller.serviceAccount.name | default (include "spark-operator.controller.name" .) }}
{{- else -}}
{{ .Values.controller.serviceAccount.name | default "default" }}
{{- end -}}
{{- end -}}
{{/*
Create the name of the cluster role to be used by the controller
*/}}
{{- define "spark-operator.controller.clusterRoleName" -}}
{{ include "spark-operator.controller.name" . }}
{{- end }}
{{/*
Create the name of the cluster role binding to be used by the controller
*/}}
{{- define "spark-operator.controller.clusterRoleBindingName" -}}
{{ include "spark-operator.controller.clusterRoleName" . }}
{{- end }}
{{/*
Create the name of the role to be used by the controller
*/}}
{{- define "spark-operator.controller.roleName" -}}
{{ include "spark-operator.controller.name" . }}
{{- end }}
{{/*
Create the name of the role binding to be used by the controller
*/}}
{{- define "spark-operator.controller.roleBindingName" -}}
{{ include "spark-operator.controller.roleName" . }}
{{- end }}
{{/*
Create the name of the deployment to be used by controller
*/}}
{{- define "spark-operator.controller.deploymentName" -}}
{{ include "spark-operator.controller.name" . }}
{{- end -}}
{{/*
Create the name of the lease resource to be used by leader election
*/}}
{{- define "spark-operator.controller.leaderElectionName" -}}
{{ include "spark-operator.controller.name" . }}-lock
{{- end -}}
{{/*
Create the name of the pod disruption budget to be used by controller
*/}}
{{- define "spark-operator.controller.podDisruptionBudgetName" -}}
{{ include "spark-operator.controller.name" . }}-pdb
{{- end -}}
{{/*
Create the name of the service used by controller
*/}}
{{- define "spark-operator.controller.serviceName" -}}
{{ include "spark-operator.controller.name" . }}-svc
{{- end -}}
{{/*
Create the role policy rules for the controller in every Spark job namespace
*/}}
{{- define "spark-operator.controller.policyRules" -}}
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- deletecollection
- apiGroups:
- ""
resources:
- configmaps
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- ""
resources:
- persistentvolumeclaims
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- extensions
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- create
- delete
- apiGroups:
- sparkoperator.k8s.io
resources:
- sparkapplications
- scheduledsparkapplications
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- sparkoperator.k8s.io
resources:
- sparkapplications/status
- sparkapplications/finalizers
- scheduledsparkapplications/status
- scheduledsparkapplications/finalizers
verbs:
- get
- update
- patch
{{- if .Values.controller.batchScheduler.enable }}
{{/* required for the `volcano` batch scheduler */}}
- apiGroups:
- scheduling.incubator.k8s.io
- scheduling.sigs.dev
- scheduling.volcano.sh
resources:
- podgroups
verbs:
- "*"
{{- end }}
{{- end -}}
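Assuming a release (and therefore fullname) of `spark-operator`, the naming helpers above would resolve roughly as follows; this is a sketch, not generated output:

```yaml
# Assumed release name: spark-operator
controllerName: spark-operator-controller            # deployment, service account, RBAC
leaderElectionLease: spark-operator-controller-lock  # leader election lock
podDisruptionBudget: spark-operator-controller-pdb
service: spark-operator-controller-svc
```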


@ -0,0 +1,184 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "spark-operator.controller.deploymentName" . }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.controller.replicas }}
selector:
matchLabels:
{{- include "spark-operator.controller.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "spark-operator.controller.selectorLabels" . | nindent 8 }}
{{- with .Values.controller.labels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if or .Values.controller.annotations .Values.prometheus.metrics.enable }}
annotations:
{{- if .Values.prometheus.metrics.enable }}
prometheus.io/scrape: "true"
prometheus.io/port: {{ .Values.prometheus.metrics.port | quote }}
prometheus.io/path: {{ .Values.prometheus.metrics.endpoint }}
{{- end }}
{{- with .Values.controller.annotations }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
spec:
containers:
- name: spark-operator-controller
image: {{ include "spark-operator.image" . }}
{{- with .Values.image.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
args:
- controller
- start
{{- with .Values.controller.logLevel }}
- --zap-log-level={{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if has "" . }}
- --namespaces=""
{{- else }}
- --namespaces={{ . | join "," }}
{{- end }}
{{- end }}
- --controller-threads={{ .Values.controller.workers }}
{{- with .Values.controller.uiService.enable }}
- --enable-ui-service=true
{{- end }}
{{- if .Values.controller.uiIngress.enable }}
{{- with .Values.controller.uiIngress.urlFormat }}
- --ingress-url-format={{ . }}
{{- end }}
{{- end }}
{{- if .Values.controller.batchScheduler.enable }}
- --enable-batch-scheduler=true
{{- with .Values.controller.batchScheduler.kubeSchedulerNames }}
- --kube-scheduler-names={{ . | join "," }}
{{- end }}
{{- with .Values.controller.batchScheduler.default }}
- --default-batch-scheduler={{ . }}
{{- end }}
{{- end }}
{{- if .Values.prometheus.metrics.enable }}
- --enable-metrics=true
- --metrics-bind-address=:{{ .Values.prometheus.metrics.port }}
- --metrics-endpoint={{ .Values.prometheus.metrics.endpoint }}
- --metrics-prefix={{ .Values.prometheus.metrics.prefix }}
- --metrics-labels=app_type
{{- end }}
- --leader-election=true
- --leader-election-lock-name={{ include "spark-operator.controller.leaderElectionName" . }}
- --leader-election-lock-namespace={{ .Release.Namespace }}
{{- if .Values.controller.pprof.enable }}
- --pprof-bind-address=:{{ .Values.controller.pprof.port }}
{{- end }}
- --workqueue-ratelimiter-bucket-qps={{ .Values.controller.workqueueRateLimiter.bucketQPS }}
- --workqueue-ratelimiter-bucket-size={{ .Values.controller.workqueueRateLimiter.bucketSize }}
{{- if .Values.controller.workqueueRateLimiter.maxDelay.enable }}
- --workqueue-ratelimiter-max-delay={{ .Values.controller.workqueueRateLimiter.maxDelay.duration }}
{{- end }}
{{- if or .Values.prometheus.metrics.enable .Values.controller.pprof.enable }}
ports:
{{- if .Values.controller.pprof.enable }}
- name: {{ .Values.controller.pprof.portName | quote }}
containerPort: {{ .Values.controller.pprof.port }}
{{- end }}
{{- if .Values.prometheus.metrics.enable }}
- name: {{ .Values.prometheus.metrics.portName | quote }}
containerPort: {{ .Values.prometheus.metrics.port }}
{{- end }}
{{- end }}
{{- with .Values.controller.env }}
env:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.envFrom }}
envFrom:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.volumeMounts }}
volumeMounts:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.resources }}
resources:
{{- toYaml . | nindent 10 }}
{{- end }}
livenessProbe:
httpGet:
port: 8081
scheme: HTTP
path: /healthz
readinessProbe:
httpGet:
port: 8081
scheme: HTTP
path: /readyz
{{- with .Values.controller.securityContext }}
securityContext:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.controller.sidecars }}
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.image.pullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.controller.volumes }}
volumes:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.controller.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.tolerations }}
tolerations:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.controller.priorityClassName }}
priorityClassName: {{ . }}
{{- end }}
serviceAccountName: {{ include "spark-operator.controller.serviceAccountName" . }}
{{- with .Values.controller.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.controller.topologySpreadConstraints }}
{{- if le (int .Values.controller.replicas) 1 }}
{{- fail "controller.replicas must be greater than 1 to enable topology spread constraints for controller pods"}}
{{- end }}
{{- $selectorLabels := include "spark-operator.controller.selectorLabels" . | fromYaml }}
{{- $labelSelectorDict := dict "labelSelector" ( dict "matchLabels" $selectorLabels ) }}
topologySpreadConstraints:
{{- range .Values.controller.topologySpreadConstraints }}
- {{ mergeOverwrite . $labelSelectorDict | toYaml | nindent 8 | trim }}
{{- end }}
{{- end }}
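
For reference, a values.yaml fragment that exercises the controller flags rendered above might look like the sketch below. The concrete numbers, the namespace name, and the ingress URL format are illustrative assumptions rather than chart defaults.

controller:
  replicas: 2
  workers: 10
  logLevel: info
  uiService:
    enable: true
  uiIngress:
    enable: true
    urlFormat: "{{$appName}}.example.com/{{$appNamespace}}/{{$appName}}"
  workqueueRateLimiter:
    bucketQPS: 50        # illustrative; rendered into --workqueue-ratelimiter-bucket-qps
    bucketSize: 500      # illustrative; rendered into --workqueue-ratelimiter-bucket-size
    maxDelay:
      enable: true
      duration: 6h       # illustrative; rendered into --workqueue-ratelimiter-max-delay
spark:
  jobNamespaces:
  - default              # illustrative; rendered into --namespaces=default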


@ -0,0 +1,34 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.controller.podDisruptionBudget.enable }}
{{- if le (int .Values.controller.replicas) 1 }}
{{- fail "controller.replicas must be greater than 1 to enable pod disruption budget for controller" }}
{{- end -}}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.controller.podDisruptionBudgetName" . }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.controller.selectorLabels" . | nindent 6 }}
{{- with .Values.controller.podDisruptionBudget.minAvailable }}
minAvailable: {{ . }}
{{- end }}
{{- end }}
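
The guard above makes rendering fail unless controller.replicas is greater than 1, so a minimal values fragment for enabling the budget would look roughly like this (the minAvailable value is illustrative):

controller:
  replicas: 2
  podDisruptionBudget:
    enable: true
    minAvailable: 1   # keep at least one controller pod during voluntary disruptions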


@ -0,0 +1,170 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.controller.rbac.create -}}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "spark-operator.controller.clusterRoleName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- events
verbs:
- create
- update
- patch
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- get
{{- if not .Values.spark.jobNamespaces | or (has "" .Values.spark.jobNamespaces) }}
{{ include "spark-operator.controller.policyRules" . }}
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "spark-operator.controller.clusterRoleBindingName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "spark-operator.controller.clusterRoleName" . }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.controller.roleName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- create
- apiGroups:
- coordination.k8s.io
resources:
- leases
resourceNames:
- {{ include "spark-operator.controller.leaderElectionName" . }}
verbs:
- get
- update
{{- if has .Release.Namespace .Values.spark.jobNamespaces }}
{{ include "spark-operator.controller.policyRules" . }}
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.controller.roleBindingName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.controller.roleName" . }}
{{- if and .Values.spark.jobNamespaces (not (has "" .Values.spark.jobNamespaces)) }}
{{- range $jobNamespace := .Values.spark.jobNamespaces }}
{{- if ne $jobNamespace $.Release.Namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.controller.roleName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.controller.labels" $ | nindent 4 }}
{{- with $.Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
{{ include "spark-operator.controller.policyRules" $ }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.controller.roleBindingName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.controller.labels" $ | nindent 4 }}
{{- with $.Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.controller.serviceAccountName" $ }}
namespace: {{ $.Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.controller.roleName" $ }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
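
The scoping logic above grants the shared controller policy rules cluster-wide only when spark.jobNamespaces is empty or contains the empty string; otherwise it stamps out an extra Role/RoleBinding in each listed job namespace (the release namespace is covered by the Role above). A hedged example with two explicitly listed namespaces:

spark:
  jobNamespaces:
  - spark-jobs       # illustrative namespace names
  - spark-team-a

With jobNamespaces set to [""] instead, the same policy rules land in the ClusterRole, letting the controller operate across all namespaces.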


@ -0,0 +1,31 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.controller.pprof.enable }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "spark-operator.controller.serviceName" . }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
spec:
selector:
{{- include "spark-operator.controller.selectorLabels" . | nindent 4 }}
ports:
- port: {{ .Values.controller.pprof.port }}
targetPort: {{ .Values.controller.pprof.portName | quote }}
name: {{ .Values.controller.pprof.portName }}
{{- end }}
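
A sketch of the values needed to render this Service; the port number and port name are assumptions, not verified chart defaults:

controller:
  pprof:
    enable: true
    port: 6060        # illustrative pprof port
    portName: pprof

Once deployed, the profiling endpoints would typically be reached by port-forwarding a controller pod to this port; that workflow is outside the chart itself.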


@ -0,0 +1,29 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.controller.serviceAccount.create }}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "spark-operator.controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}
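
A common reason to keep serviceAccount.create enabled is to attach cloud IAM annotations. The annotation key below follows the usual EKS IRSA convention, and the ARN is purely a placeholder:

controller:
  serviceAccount:
    create: true
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/spark-operator   # placeholder ARN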


@ -1,140 +0,0 @@
# If the admission webhook is enabled, then a post-install step is required
# to generate and install the secret in the operator namespace.
# In the post-install hook, the token corresponding to the operator service account
# is used to authenticate with the Kubernetes API server to install the secret bundle.
{{- $jobNamespaces := .Values.sparkJobNamespaces | default list }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "spark-operator.fullname" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
strategy:
type: Recreate
template:
metadata:
{{- if or .Values.podAnnotations .Values.metrics.enable }}
annotations:
{{- if .Values.metrics.enable }}
prometheus.io/scrape: "true"
prometheus.io/port: "{{ .Values.metrics.port }}"
prometheus.io/path: {{ .Values.metrics.endpoint }}
{{- end }}
{{- if .Values.podAnnotations }}
{{- toYaml .Values.podAnnotations | trim | nindent 8 }}
{{- end }}
{{- end }}
labels:
{{- include "spark-operator.selectorLabels" . | nindent 8 }}
{{- with .Values.podLabels }}
{{- toYaml . | trim | nindent 8 }}
{{- end }}
spec:
serviceAccountName: {{ include "spark-operator.serviceAccountName" . }}
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Chart.Name }}
image: {{ .Values.image.repository }}:{{ default .Chart.AppVersion .Values.image.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
{{- if gt (int .Values.replicaCount) 1 }}
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
{{- end }}
envFrom:
{{- toYaml .Values.envFrom | nindent 10 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 10 }}
{{- if or .Values.metrics.enable .Values.webhook.enable }}
ports:
{{ if .Values.metrics.enable -}}
- name: {{ .Values.metrics.portName | quote }}
containerPort: {{ .Values.metrics.port }}
{{- end }}
{{ if .Values.webhook.enable -}}
- name: {{ .Values.webhook.portName | quote }}
containerPort: {{ .Values.webhook.port }}
{{- end }}
{{ end -}}
args:
- -v={{ .Values.logLevel }}
- -logtostderr
{{- if eq (len $jobNamespaces) 1 }}
- -namespace={{ index $jobNamespaces 0 }}
{{- end }}
- -enable-ui-service={{ .Values.uiService.enable}}
- -ingress-url-format={{ .Values.ingressUrlFormat }}
- -controller-threads={{ .Values.controllerThreads }}
- -resync-interval={{ .Values.resyncInterval }}
- -enable-batch-scheduler={{ .Values.batchScheduler.enable }}
- -label-selector-filter={{ .Values.labelSelectorFilter }}
{{- if .Values.metrics.enable }}
- -enable-metrics=true
- -metrics-labels=app_type
- -metrics-port={{ .Values.metrics.port }}
- -metrics-endpoint={{ .Values.metrics.endpoint }}
- -metrics-prefix={{ .Values.metrics.prefix }}
{{- end }}
{{- if .Values.webhook.enable }}
- -enable-webhook=true
- -webhook-secret-name={{ include "spark-operator.webhookSecretName" . }}
- -webhook-secret-namespace={{ .Release.Namespace }}
- -webhook-svc-name={{ include "spark-operator.webhookServiceName" . }}
- -webhook-svc-namespace={{ .Release.Namespace }}
- -webhook-config-name={{ include "spark-operator.fullname" . }}-webhook-config
- -webhook-port={{ .Values.webhook.port }}
- -webhook-timeout={{ .Values.webhook.timeout }}
- -webhook-namespace-selector={{ .Values.webhook.namespaceSelector }}
- -webhook-object-selector={{ .Values.webhook.objectSelector }}
{{- end }}
- -enable-resource-quota-enforcement={{ .Values.resourceQuotaEnforcement.enable }}
{{- if gt (int .Values.replicaCount) 1 }}
- -leader-election=true
- -leader-election-lock-namespace={{ default .Release.Namespace .Values.leaderElection.lockNamespace }}
- -leader-election-lock-name={{ .Values.leaderElection.lockName }}
{{- end }}
{{- with .Values.resources }}
resources:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.volumeMounts }}
volumeMounts:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.sidecars }}
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.volumes }}
volumes:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.priorityClassName }}
priorityClassName: {{ .Values.priorityClassName }}
{{- end }}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}


@ -1,17 +0,0 @@
{{- if $.Values.podDisruptionBudget.enable }}
{{- if (gt (int $.Values.replicaCount) 1) }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.fullname" . }}-pdb
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
minAvailable: {{ $.Values.podDisruptionBudget.minAvailable }}
{{- else }}
{{- fail "replicaCount must be greater than 1 to enable PodDisruptionBudget" }}
{{- end }}
{{- end }}


@ -1,19 +0,0 @@
{{ if and .Values.metrics.enable .Values.podMonitor.enable }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
name: {{ include "spark-operator.name" . -}}-podmonitor
labels: {{ toYaml .Values.podMonitor.labels | nindent 4 }}
spec:
podMetricsEndpoints:
- interval: {{ .Values.podMonitor.podMetricsEndpoint.interval }}
port: {{ .Values.metrics.portName | quote }}
scheme: {{ .Values.podMonitor.podMetricsEndpoint.scheme }}
jobLabel: {{ .Values.podMonitor.jobLabel }}
namespaceSelector:
matchNames:
- {{ .Release.Namespace }}
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
{{ end }}


@ -0,0 +1,22 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/*
Create the name of pod monitor
*/}}
{{- define "spark-operator.prometheus.podMonitorName" -}}
{{- include "spark-operator.fullname" . }}-podmonitor
{{- end -}}


@ -0,0 +1,44 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.prometheus.podMonitor.create -}}
{{- if not .Values.prometheus.metrics.enable }}
{{- fail "`metrics.enable` must be set to true when `podMonitor.create` is true." }}
{{- end }}
{{- if not (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1/PodMonitor") }}
{{- fail "The cluster does not support the required API version `monitoring.coreos.com/v1` for `PodMonitor`." }}
{{- end }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
name: {{ include "spark-operator.prometheus.podMonitorName" . }}
{{- with .Values.prometheus.podMonitor.labels }}
labels:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
podMetricsEndpoints:
- interval: {{ .Values.prometheus.podMonitor.podMetricsEndpoint.interval }}
port: {{ .Values.prometheus.metrics.portName | quote }}
scheme: {{ .Values.prometheus.podMonitor.podMetricsEndpoint.scheme }}
jobLabel: {{ .Values.prometheus.podMonitor.jobLabel }}
namespaceSelector:
matchNames:
- {{ .Release.Namespace }}
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
{{- end }}
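
Because the template fails fast when metrics are disabled or the PodMonitor API is absent, a values fragment that renders it cleanly might look like the following; the label, interval, and port values are illustrative:

prometheus:
  metrics:
    enable: true
    port: 10254
    portName: metrics
  podMonitor:
    create: true
    labels:
      release: prometheus          # illustrative; match your Prometheus operator's selector
    jobLabel: spark-operator
    podMetricsEndpoint:
      interval: 30s
      scheme: http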


@ -1,148 +0,0 @@
{{- if or .Values.rbac.create .Values.rbac.createClusterRole -}}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "spark-operator.fullname" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
{{- with .Values.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- pods
- persistentvolumeclaims
verbs:
- "*"
- apiGroups:
- ""
resources:
- services
- configmaps
- secrets
verbs:
- create
- get
- delete
- update
- patch
- apiGroups:
- extensions
- networking.k8s.io
resources:
- ingresses
verbs:
- create
- get
- delete
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- events
verbs:
- create
- update
- patch
- apiGroups:
- ""
resources:
- resourcequotas
verbs:
- get
- list
- watch
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- get
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
verbs:
- create
- get
- update
- delete
- apiGroups:
- sparkoperator.k8s.io
resources:
- sparkapplications
- sparkapplications/status
- sparkapplications/finalizers
- scheduledsparkapplications
- scheduledsparkapplications/status
- scheduledsparkapplications/finalizers
verbs:
- "*"
{{- if .Values.batchScheduler.enable }}
# required for the `volcano` batch scheduler
- apiGroups:
- scheduling.incubator.k8s.io
- scheduling.sigs.dev
- scheduling.volcano.sh
resources:
- podgroups
verbs:
- "*"
{{- end }}
{{ if .Values.webhook.enable }}
- apiGroups:
- batch
resources:
- jobs
verbs:
- delete
{{- end }}
{{- if gt (int .Values.replicaCount) 1 }}
- apiGroups:
- coordination.k8s.io
resources:
- leases
resourceNames:
- {{ .Values.leaderElection.lockName }}
verbs:
- get
- update
- patch
- delete
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- create
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "spark-operator.fullname" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
{{- with .Values.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: ClusterRole
name: {{ include "spark-operator.fullname" . }}
apiGroup: rbac.authorization.k8s.io
{{- end }}


@ -1,12 +0,0 @@
{{- if .Values.serviceAccounts.sparkoperator.create }}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "spark-operator.serviceAccountName" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
{{- with .Values.serviceAccounts.sparkoperator.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}


@ -1,39 +0,0 @@
{{- if or .Values.rbac.create .Values.rbac.createRole }}
{{- $jobNamespaces := .Values.sparkJobNamespaces | default list }}
{{- range $jobNamespace := $jobNamespaces }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: spark-role
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.labels" $ | nindent 4 }}
rules:
- apiGroups:
- ""
resources:
- pods
- services
- configmaps
- persistentvolumeclaims
verbs:
- "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: spark
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.labels" $ | nindent 4 }}
subjects:
- kind: ServiceAccount
name: {{ include "spark.serviceAccountName" $ }}
namespace: {{ $jobNamespace }}
roleRef:
kind: Role
name: spark-role
apiGroup: rbac.authorization.k8s.io
{{- end }}
{{- end }}


@ -1,14 +0,0 @@
{{- if .Values.serviceAccounts.spark.create }}
{{- range $sparkJobNamespace := .Values.sparkJobNamespaces | default (list .Release.Namespace) }}
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "spark.serviceAccountName" $ }}
namespace: {{ $sparkJobNamespace }}
{{- with $.Values.serviceAccounts.spark.annotations }}
annotations: {{ toYaml . | nindent 4 }}
{{- end }}
labels: {{ include "spark-operator.labels" $ | nindent 4 }}
{{- end }}
{{- end }}


@ -0,0 +1,47 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/*
Create the name of spark component
*/}}
{{- define "spark-operator.spark.name" -}}
{{- include "spark-operator.fullname" . }}-spark
{{- end -}}
{{/*
Create the name of the service account to be used by spark applications
*/}}
{{- define "spark-operator.spark.serviceAccountName" -}}
{{- if .Values.spark.serviceAccount.create -}}
{{- .Values.spark.serviceAccount.name | default (include "spark-operator.spark.name" .) -}}
{{- else -}}
{{- .Values.spark.serviceAccount.name | default "default" -}}
{{- end -}}
{{- end -}}
{{/*
Create the name of the role to be used by spark service account
*/}}
{{- define "spark-operator.spark.roleName" -}}
{{- include "spark-operator.spark.serviceAccountName" . }}
{{- end -}}
{{/*
Create the name of the role binding to be used by spark service account
*/}}
{{- define "spark-operator.spark.roleBindingName" -}}
{{- include "spark-operator.spark.serviceAccountName" . }}
{{- end -}}


@ -0,0 +1,73 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.spark.rbac.create -}}
{{- range $jobNamespace := .Values.spark.jobNamespaces | default list }}
{{- if ne $jobNamespace "" }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.spark.roleName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.labels" $ | nindent 4 }}
{{- with $.Values.spark.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- pods
- configmaps
- persistentvolumeclaims
- services
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- deletecollection
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.spark.roleBindingName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.labels" $ | nindent 4 }}
{{- with $.Values.spark.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.spark.serviceAccountName" $ }}
namespace: {{ $jobNamespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.spark.roleName" $ }}
{{- end }}
{{- end }}
{{- end }}
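
These Role/RoleBinding pairs are stamped out once per non-empty entry in spark.jobNamespaces, alongside the Spark service account in the next template. A hedged values sketch:

spark:
  serviceAccount:
    create: true
    name: spark        # optional; falls back to "<release fullname>-spark" when omitted
  rbac:
    create: true
  jobNamespaces:
  - spark-jobs         # illustrative namespace names
  - spark-team-a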


@ -0,0 +1,33 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.spark.serviceAccount.create }}
{{- range $jobNamespace := .Values.spark.jobNamespaces | default list }}
{{- if ne $jobNamespace "" }}
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "spark-operator.spark.serviceAccountName" $ }}
namespace: {{ $jobNamespace }}
labels: {{ include "spark-operator.labels" $ | nindent 4 }}
{{- with $.Values.spark.serviceAccount.annotations }}
annotations: {{ toYaml . | nindent 4 }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}


@ -1,14 +1,166 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/*
Create the name of webhook component
*/}}
{{- define "spark-operator.webhook.name" -}}
{{- include "spark-operator.fullname" . }}-webhook
{{- end -}}
{{/*
Common labels for the webhook
*/}}
{{- define "spark-operator.webhook.labels" -}}
{{ include "spark-operator.labels" . }}
app.kubernetes.io/component: webhook
{{- end -}}
{{/*
Selector labels for the webhook
*/}}
{{- define "spark-operator.webhook.selectorLabels" -}}
{{ include "spark-operator.selectorLabels" . }}
app.kubernetes.io/component: webhook
{{- end -}}
{{/*
Create the name of service account to be used by webhook
*/}}
{{- define "spark-operator.webhook.serviceAccountName" -}}
{{- if .Values.webhook.serviceAccount.create -}}
{{ .Values.webhook.serviceAccount.name | default (include "spark-operator.webhook.name" .) }}
{{- else -}}
{{ .Values.webhook.serviceAccount.name | default "default" }}
{{- end -}}
{{- end -}}
{{/*
Create the name of the cluster role to be used by the webhook
*/}}
{{- define "spark-operator.webhook.clusterRoleName" -}}
{{ include "spark-operator.webhook.name" . }}
{{- end }}
{{/*
Create the name of the cluster role binding to be used by the webhook
*/}}
{{- define "spark-operator.webhook.clusterRoleBindingName" -}}
{{ include "spark-operator.webhook.clusterRoleName" . }}
{{- end }}
{{/*
Create the name of the role to be used by the webhook
*/}}
{{- define "spark-operator.webhook.roleName" -}}
{{ include "spark-operator.webhook.name" . }}
{{- end }}
{{/*
Create the name of the role binding to be used by the webhook
*/}}
{{- define "spark-operator.webhook.roleBindingName" -}}
{{ include "spark-operator.webhook.roleName" . }}
{{- end }}
{{/*
Create the name of the secret to be used by webhook
*/}}
{{- define "spark-operator.webhook.secretName" -}}
{{ include "spark-operator.webhook.name" . }}-certs
{{- end -}}
{{/*
Create the name of the service to be used by webhook
*/}}
{{- define "spark-operator.webhook.serviceName" -}}
{{ include "spark-operator.webhook.name" . }}-svc
{{- end -}}
{{/*
Create the name of mutating webhook configuration
*/}}
{{- define "spark-operator.mutatingWebhookConfigurationName" -}}
webhook.sparkoperator.k8s.io
{{- end -}}
{{/*
Create the name of mutating webhook configuration
*/}}
{{- define "spark-operator.validatingWebhookConfigurationName" -}}
quotaenforcer.sparkoperator.k8s.io
{{- end -}}
{{/*
Create the name of the deployment to be used by webhook
*/}}
{{- define "spark-operator.webhook.deploymentName" -}}
{{ include "spark-operator.webhook.name" . }}
{{- end -}}
{{/*
Create the name of the lease resource to be used by leader election
*/}}
{{- define "spark-operator.webhook.leaderElectionName" -}}
{{ include "spark-operator.webhook.name" . }}-lock
{{- end -}}
{{/*
Create the name of the pod disruption budget to be used by webhook
*/}}
{{- define "spark-operator.webhook.podDisruptionBudgetName" -}}
{{ include "spark-operator.webhook.name" . }}-pdb
{{- end -}}
{{/*
Create the role policy rules for the webhook in every Spark job namespace
*/}}
{{- define "spark-operator.webhook.policyRules" -}}
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- resourcequotas
verbs:
- get
- list
- watch
- apiGroups:
- sparkoperator.k8s.io
resources:
- sparkapplications
- sparkapplications/status
- sparkapplications/finalizers
- scheduledsparkapplications
- scheduledsparkapplications/status
- scheduledsparkapplications/finalizers
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
{{- end -}}


@ -0,0 +1,159 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "spark-operator.webhook.deploymentName" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.webhook.replicas }}
selector:
matchLabels:
{{- include "spark-operator.webhook.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "spark-operator.webhook.selectorLabels" . | nindent 8 }}
{{- with .Values.webhook.labels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.annotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
spec:
containers:
- name: spark-operator-webhook
image: {{ include "spark-operator.image" . }}
{{- with .Values.image.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
args:
- webhook
- start
{{- with .Values.webhook.logLevel }}
- --zap-log-level={{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if has "" . }}
- --namespaces=""
{{- else }}
- --namespaces={{ . | join "," }}
{{- end }}
{{- end }}
- --webhook-secret-name={{ include "spark-operator.webhook.secretName" . }}
- --webhook-secret-namespace={{ .Release.Namespace }}
- --webhook-svc-name={{ include "spark-operator.webhook.serviceName" . }}
- --webhook-svc-namespace={{ .Release.Namespace }}
- --webhook-port={{ .Values.webhook.port }}
- --mutating-webhook-name={{ include "spark-operator.webhook.name" . }}
- --validating-webhook-name={{ include "spark-operator.webhook.name" . }}
{{- with .Values.webhook.resourceQuotaEnforcement.enable }}
- --enable-resource-quota-enforcement=true
{{- end }}
{{- if .Values.prometheus.metrics.enable }}
- --enable-metrics=true
- --metrics-bind-address=:{{ .Values.prometheus.metrics.port }}
- --metrics-endpoint={{ .Values.prometheus.metrics.endpoint }}
- --metrics-prefix={{ .Values.prometheus.metrics.prefix }}
- --metrics-labels=app_type
{{- end }}
- --leader-election=true
- --leader-election-lock-name={{ include "spark-operator.webhook.leaderElectionName" . }}
- --leader-election-lock-namespace={{ .Release.Namespace }}
ports:
- name: {{ .Values.webhook.portName | quote }}
containerPort: {{ .Values.webhook.port }}
{{- if .Values.prometheus.metrics.enable }}
- name: {{ .Values.prometheus.metrics.portName | quote }}
containerPort: {{ .Values.prometheus.metrics.port }}
{{- end }}
{{- with .Values.webhook.env }}
env:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.envFrom }}
envFrom:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.volumeMounts }}
volumeMounts:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.webhook.resources }}
resources:
{{- toYaml . | nindent 10 }}
{{- end }}
livenessProbe:
httpGet:
port: 8081
scheme: HTTP
path: /healthz
readinessProbe:
httpGet:
port: 8081
scheme: HTTP
path: /readyz
{{- with .Values.webhook.securityContext }}
securityContext:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.webhook.sidecars }}
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.image.pullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.volumes }}
volumes:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.priorityClassName }}
priorityClassName: {{ . }}
{{- end }}
serviceAccountName: {{ include "spark-operator.webhook.serviceAccountName" . }}
{{- with .Values.webhook.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.webhook.topologySpreadConstraints }}
{{- if le (int .Values.webhook.replicas) 1 }}
{{- fail "webhook.replicas must be greater than 1 to enable topology spread constraints for webhook pods"}}
{{- end }}
{{- $selectorLabels := include "spark-operator.webhook.selectorLabels" . | fromYaml }}
{{- $labelSelectorDict := dict "labelSelector" ( dict "matchLabels" $selectorLabels ) }}
topologySpreadConstraints:
{{- range .Values.webhook.topologySpreadConstraints }}
- {{ mergeOverwrite . $labelSelectorDict | toYaml | nindent 8 | trim }}
{{- end }}
{{- end }}
{{- end }}
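
A values fragment driving the webhook arguments above might look like this sketch; the port, timeout, and replica numbers are illustrative rather than confirmed defaults:

webhook:
  enable: true
  replicas: 2
  logLevel: info
  port: 9443           # illustrative; rendered into --webhook-port
  portName: webhook
  failurePolicy: Fail
  timeoutSeconds: 10
  resourceQuotaEnforcement:
    enable: true       # adds --enable-resource-quota-enforcement=true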


@ -0,0 +1,124 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
name: {{ include "spark-operator.webhook.name" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
webhooks:
- name: mutate--v1-pod.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /mutate--v1-pod
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
objectSelector:
matchLabels:
sparkoperator.k8s.io/launched-by-spark-operator: "true"
rules:
- apiGroups: [""]
apiVersions: ["v1"]
resources: ["pods"]
operations: ["CREATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
- name: mutate-sparkoperator-k8s-io-v1beta2-sparkapplication.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /mutate-sparkoperator-k8s-io-v1beta2-sparkapplication
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
rules:
- apiGroups: ["sparkoperator.k8s.io"]
apiVersions: ["v1beta2"]
resources: ["sparkapplications"]
operations: ["CREATE", "UPDATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
- name: mutate-sparkoperator-k8s-io-v1beta2-scheduledsparkapplication.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /mutate-sparkoperator-k8s-io-v1beta2-scheduledsparkapplication
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
rules:
- apiGroups: ["sparkoperator.k8s.io"]
apiVersions: ["v1beta2"]
resources: ["scheduledsparkapplications"]
operations: ["CREATE", "UPDATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
{{- end }}
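
To make the namespaceSelector logic above concrete: with spark.jobNamespaces set to ["spark-jobs"], each webhook entry would render a selector roughly like the snippet below, while a list containing the empty string omits the selector entirely so the webhook applies in every namespace.

namespaceSelector:
  matchExpressions:
  - key: kubernetes.io/metadata.name
    operator: In
    values:
    - spark-jobs      # illustrative namespace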


@ -0,0 +1,36 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
{{- if .Values.webhook.podDisruptionBudget.enable }}
{{- if le (int .Values.webhook.replicas) 1 }}
{{- fail "webhook.replicas must be greater than 1 to enable pod disruption budget for webhook" }}
{{- end -}}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.webhook.podDisruptionBudgetName" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.webhook.selectorLabels" . | nindent 6 }}
{{- with .Values.webhook.podDisruptionBudget.minAvailable }}
minAvailable: {{ . }}
{{- end }}
{{- end }}
{{- end }}


@ -0,0 +1,193 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
{{- if .Values.webhook.rbac.create }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "spark-operator.webhook.clusterRoleName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- events
verbs:
- create
- update
- patch
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
verbs:
- list
- watch
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
resourceNames:
- {{ include "spark-operator.webhook.name" . }}
verbs:
- get
- update
{{- if not .Values.spark.jobNamespaces | or (has "" .Values.spark.jobNamespaces) }}
{{ include "spark-operator.webhook.policyRules" . }}
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "spark-operator.webhook.clusterRoleBindingName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.webhook.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "spark-operator.webhook.clusterRoleName" . }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.webhook.roleName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- secrets
verbs:
- create
- apiGroups:
- ""
resources:
- secrets
resourceNames:
- {{ include "spark-operator.webhook.secretName" . }}
verbs:
- get
- update
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- create
- apiGroups:
- coordination.k8s.io
resources:
- leases
resourceNames:
- {{ include "spark-operator.webhook.leaderElectionName" . }}
verbs:
- get
- update
{{- if has .Release.Namespace .Values.spark.jobNamespaces }}
{{ include "spark-operator.webhook.policyRules" . }}
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.webhook.roleBindingName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.webhook.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.webhook.roleName" . }}
{{- if and .Values.spark.jobNamespaces (not (has "" .Values.spark.jobNamespaces)) }}
{{- range $jobNamespace := .Values.spark.jobNamespaces }}
{{- if ne $jobNamespace $.Release.Namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.webhook.roleName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.webhook.labels" $ | nindent 4 }}
{{- with $.Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
{{ include "spark-operator.webhook.policyRules" $ }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.webhook.roleBindingName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.webhook.labels" $ | nindent 4 }}
{{- with $.Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.webhook.serviceAccountName" $ }}
namespace: {{ $.Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.webhook.roleName" $ }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}


@ -1,13 +0,0 @@
{{- if .Values.webhook.enable -}}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "spark-operator.webhookSecretName" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
data:
ca-key.pem: ""
ca-cert.pem: ""
server-key.pem: ""
server-cert.pem: ""
{{- end }}


@ -1,15 +1,31 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "spark-operator.webhook.serviceName" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
spec:
selector:
{{- include "spark-operator.webhook.selectorLabels" . | nindent 4 }}
ports:
- port: {{ .Values.webhook.port }}
targetPort: {{ .Values.webhook.portName | quote }}
name: {{ .Values.webhook.portName }}
{{- end }}


@ -0,0 +1,31 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
{{- if .Values.webhook.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "spark-operator.webhook.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}
{{- end }}


@ -0,0 +1,89 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
name: {{ include "spark-operator.webhook.name" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
webhooks:
- name: validate-sparkoperator-k8s-io-v1beta2-sparkapplication.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /validate-sparkoperator-k8s-io-v1beta2-sparkapplication
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
rules:
- apiGroups: ["sparkoperator.k8s.io"]
apiVersions: ["v1beta2"]
resources: ["sparkapplications"]
operations: ["CREATE", "UPDATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
- name: validate-sparkoperator-k8s-io-v1beta2-scheduledsparkapplication.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /validate-sparkoperator-k8s-io-v1beta2-scheduledsparkapplication
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
rules:
- apiGroups: ["sparkoperator.k8s.io"]
apiVersions: ["v1beta2"]
resources: ["scheduledsparkapplications"]
operations: ["CREATE", "UPDATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
{{- end }}
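
The chart's helm-unittest suites follow, starting with the controller deployment tests. Assuming the helm-unittest plugin is installed, they are typically run with `helm unittest <chart directory>`; the exact chart path depends on the repository layout. As a sketch of the same pattern applied to the webhook deployment (the template path and values shown are assumptions), an additional case could look like:

suite: Test webhook deployment           # hypothetical companion suite
templates:
- webhook/deployment.yaml                # assumed template path
release:
  name: spark-operator
  namespace: spark-operator
tests:
- it: Should contain `--enable-resource-quota-enforcement=true` when `webhook.resourceQuotaEnforcement.enable` is true
  set:
    webhook:
      resourceQuotaEnforcement:
        enable: true
  asserts:
  - contains:
      path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
      content: --enable-resource-quota-enforcement=true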


@ -0,0 +1,626 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller deployment
templates:
- controller/deployment.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should use the specified image repository if `image.registry`, `image.repository` and `image.tag` are set
set:
image:
registry: test-registry
repository: test-repository
tag: test-tag
asserts:
- equal:
path: spec.template.spec.containers[0].image
value: test-registry/test-repository:test-tag
- it: Should use the specified image pull policy if `image.pullPolicy` is set
set:
image:
pullPolicy: Always
asserts:
- equal:
path: spec.template.spec.containers[*].imagePullPolicy
value: Always
- it: Should set replicas if `controller.replicas` is set
set:
controller:
replicas: 10
asserts:
- equal:
path: spec.replicas
value: 10
- it: Should set replicas to 0 if `controller.replicas` is set to 0
set:
controller:
replicas: 0
asserts:
- equal:
path: spec.replicas
value: 0
- it: Should add pod labels if `controller.labels` is set
set:
controller:
labels:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.metadata.labels.key1
value: value1
- equal:
path: spec.template.metadata.labels.key2
value: value2
- it: Should add prometheus annotations if `prometheus.metrics.enable` is true
set:
prometheus:
metrics:
enable: true
port: 10254
endpoint: /metrics
asserts:
- equal:
path: spec.template.metadata.annotations["prometheus.io/scrape"]
value: "true"
- equal:
path: spec.template.metadata.annotations["prometheus.io/port"]
value: "10254"
- equal:
path: spec.template.metadata.annotations["prometheus.io/path"]
value: /metrics
- it: Should add pod annotations if `controller.annotations` is set
set:
controller:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.metadata.annotations.key1
value: value1
- equal:
path: spec.template.metadata.annotations.key2
value: value2
- it: Should contain `--zap-log-level` arg if `controller.logLevel` is set
set:
controller:
logLevel: debug
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --zap-log-level=debug
- it: Should contain `--namespaces` arg if `spark.jobNamespaces` is set
set:
spark:
jobNamespaces:
- ns1
- ns2
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --namespaces=ns1,ns2
- it: Should set namespaces to all namespaces (`""`) if `spark.jobNamespaces` contains empty string
set:
spark:
jobNamespaces:
- ""
- default
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --namespaces=""
- it: Should contain `--controller-threads` arg if `controller.workers` is set
set:
controller:
workers: 30
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --controller-threads=30
- it: Should contain `--enable-ui-service` arg if `controller.uiService.enable` is set to `true`
set:
controller:
uiService:
enable: true
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --enable-ui-service=true
- it: Should contain `--ingress-url-format` arg if `controller.uiIngress.enable` is set to `true` and `controller.uiIngress.urlFormat` is set
set:
controller:
uiService:
enable: true
uiIngress:
enable: true
urlFormat: "{{$appName}}.example.com/{{$appNamespace}}/{{$appName}}"
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --ingress-url-format={{$appName}}.example.com/{{$appNamespace}}/{{$appName}}
- it: Should contain `--enable-batch-scheduler` arg if `controller.batchScheduler.enable` is `true`
set:
controller:
batchScheduler:
enable: true
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --enable-batch-scheduler=true
- it: Should contain `--default-batch-scheduler` arg if `controller.batchScheduler.default` is set
set:
controller:
batchScheduler:
enable: true
default: yunikorn
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --default-batch-scheduler=yunikorn
- it: Should contain `--enable-metrics` arg if `prometheus.metrics.enable` is set to `true`
set:
prometheus:
metrics:
enable: true
port: 12345
portName: test-port
endpoint: /test-endpoint
prefix: test-prefix
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --enable-metrics=true
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-bind-address=:12345
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-endpoint=/test-endpoint
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-prefix=test-prefix
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-labels=app_type
- it: Should enable leader election by default
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --leader-election=true
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --leader-election-lock-name=spark-operator-controller-lock
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --leader-election-lock-namespace=spark-operator
- it: Should add metric ports if `prometheus.metrics.enable` is true
set:
prometheus:
metrics:
enable: true
port: 10254
portName: metrics
asserts:
- contains:
path: spec.template.spec.containers[0].ports
content:
name: metrics
containerPort: 10254
count: 1
- it: Should add environment variables if `controller.env` is set
set:
controller:
env:
- name: ENV_NAME_1
value: ENV_VALUE_1
- name: ENV_NAME_2
valueFrom:
configMapKeyRef:
name: test-configmap
key: test-key
optional: false
asserts:
- contains:
path: spec.template.spec.containers[0].env
content:
name: ENV_NAME_1
value: ENV_VALUE_1
- contains:
path: spec.template.spec.containers[0].env
content:
name: ENV_NAME_2
valueFrom:
configMapKeyRef:
name: test-configmap
key: test-key
optional: false
- it: Should add environment variable sources if `controller.envFrom` is set
set:
controller:
envFrom:
- configMapRef:
name: test-configmap
optional: false
- secretRef:
name: test-secret
optional: false
asserts:
- contains:
path: spec.template.spec.containers[0].envFrom
content:
configMapRef:
name: test-configmap
optional: false
- contains:
path: spec.template.spec.containers[0].envFrom
content:
secretRef:
name: test-secret
optional: false
- it: Should add volume mounts if `controller.volumeMounts` is set
set:
controller:
volumeMounts:
- name: volume1
mountPath: /volume1
- name: volume2
mountPath: /volume2
asserts:
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume1
mountPath: /volume1
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume2
mountPath: /volume2
- it: Should add resources if `controller.resources` is set
set:
controller:
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
asserts:
- equal:
path: spec.template.spec.containers[0].resources
value:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
- it: Should add container securityContext if `controller.securityContext` is set
set:
controller:
securityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.containers[0].securityContext
value:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
- it: Should add sidecars if `controller.sidecars` is set
set:
controller:
sidecars:
- name: sidecar1
image: sidecar-image1
- name: sidecar2
image: sidecar-image2
asserts:
- contains:
path: spec.template.spec.containers
content:
name: sidecar1
image: sidecar-image1
- contains:
path: spec.template.spec.containers
content:
name: sidecar2
image: sidecar-image2
- it: Should add secrets if `image.pullSecrets` is set
set:
image:
pullSecrets:
- name: test-secret1
- name: test-secret2
asserts:
- equal:
path: spec.template.spec.imagePullSecrets[0].name
value: test-secret1
- equal:
path: spec.template.spec.imagePullSecrets[1].name
value: test-secret2
- it: Should add volumes if `controller.volumes` is set
set:
controller:
volumes:
- name: volume1
emptyDir: {}
- name: volume2
emptyDir: {}
asserts:
- contains:
path: spec.template.spec.volumes
content:
name: volume1
emptyDir: {}
count: 1
- contains:
path: spec.template.spec.volumes
content:
name: volume2
emptyDir: {}
count: 1
- it: Should add nodeSelector if `controller.nodeSelector` is set
set:
controller:
nodeSelector:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.spec.nodeSelector.key1
value: value1
- equal:
path: spec.template.spec.nodeSelector.key2
value: value2
- it: Should add affinity if `controller.affinity` is set
set:
controller:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
asserts:
- equal:
path: spec.template.spec.affinity
value:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
- it: Should add tolerations if `controller.tolerations` is set
set:
controller:
tolerations:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule
asserts:
- equal:
path: spec.template.spec.tolerations
value:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule
- it: Should add priorityClassName if `controller.priorityClassName` is set
set:
controller:
priorityClassName: test-priority-class
asserts:
- equal:
path: spec.template.spec.priorityClassName
value: test-priority-class
- it: Should add pod securityContext if `controller.podSecurityContext` is set
set:
controller:
podSecurityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.securityContext
value:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
- it: Should not contain topologySpreadConstraints if `controller.topologySpreadConstraints` is not set
set:
controller:
topologySpreadConstraints: []
asserts:
- notExists:
path: spec.template.spec.topologySpreadConstraints
- it: Should add topologySpreadConstraints if `controller.topologySpreadConstraints` is set and `controller.replicas` is greater than 1
set:
controller:
replicas: 2
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
asserts:
- equal:
path: spec.template.spec.topologySpreadConstraints
value:
- labelSelector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: spark-operator
app.kubernetes.io/name: spark-operator
maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- labelSelector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: spark-operator
app.kubernetes.io/name: spark-operator
maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
- it: Should fail if `controller.topologySpreadConstraints` is set and `controller.replicas` is not greater than 1
set:
controller:
replicas: 1
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
asserts:
- failedTemplate:
errorMessage: "controller.replicas must be greater than 1 to enable topology spread constraints for controller pods"
- it: Should contain `--pprof-bind-address` arg if `controller.pprof.enable` is set to `true`
set:
controller:
pprof:
enable: true
port: 12345
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --pprof-bind-address=:12345
- it: Should add pprof ports if `controller.pprof.enable` is set to `true`
set:
controller:
pprof:
enable: true
port: 12345
portName: pprof-test
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].ports
content:
name: pprof-test
containerPort: 12345
count: 1
- it: Should contain `--workqueue-ratelimiter-max-delay` arg if `controller.workqueueRateLimiter.maxDelay.enable` is set to `true`
set:
controller:
workqueueRateLimiter:
bucketQPS: 1
bucketSize: 2
maxDelay:
enable: true
duration: 3h
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --workqueue-ratelimiter-bucket-qps=1
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --workqueue-ratelimiter-bucket-size=2
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --workqueue-ratelimiter-max-delay=3h
  - it: Should not contain `--workqueue-ratelimiter-max-delay` arg if `controller.workqueueRateLimiter.maxDelay.enable` is set to `false`
    set:
      controller:
        workqueueRateLimiter:
          maxDelay:
            enable: false
            duration: 1h
asserts:
- notContains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --workqueue-ratelimiter-max-delay=1h
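
The two rate-limiter tests and the pprof tests above pin down how `controller.pprof` and `controller.workqueueRateLimiter` values surface as container args. A minimal values sketch that would exercise both groups of flags, reusing the illustrative numbers from the `set` blocks above (not suggested production settings):

controller:
  pprof:
    enable: true
    port: 12345
    portName: pprof-test
  workqueueRateLimiter:
    bucketQPS: 1
    bucketSize: 2
    maxDelay:
      enable: true
      duration: 3h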

View File

@@ -0,0 +1,68 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller pod disruption budget
templates:
- controller/poddisruptionbudget.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not render podDisruptionBudget if `controller.podDisruptionBudget.enable` is false
set:
controller:
podDisruptionBudget:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should fail if `controller.replicas` is less than 2 when `controller.podDisruptionBudget.enable` is true
set:
controller:
replicas: 1
podDisruptionBudget:
enable: true
asserts:
- failedTemplate:
errorMessage: "controller.replicas must be greater than 1 to enable pod disruption budget for controller"
- it: Should render spark operator podDisruptionBudget if `controller.podDisruptionBudget.enable` is true
set:
controller:
replicas: 2
podDisruptionBudget:
enable: true
asserts:
- containsDocument:
apiVersion: policy/v1
kind: PodDisruptionBudget
name: spark-operator-controller-pdb
- it: Should set minAvailable if `controller.podDisruptionBudget.minAvailable` is specified
set:
controller:
replicas: 2
podDisruptionBudget:
enable: true
minAvailable: 3
asserts:
- equal:
path: spec.minAvailable
value: 3
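
Per the assertions above, the controller PodDisruptionBudget template refuses to render with a single replica and passes `controller.podDisruptionBudget.minAvailable` straight through to `spec.minAvailable` when set. A values sketch for the passing case, mirroring the tests:

controller:
  replicas: 2
  podDisruptionBudget:
    enable: true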

View File

@@ -0,0 +1,165 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller rbac
templates:
- controller/rbac.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create controller RBAC resources if `controller.rbac.create` is false
set:
controller:
rbac:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should create controller ClusterRole by default
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
name: spark-operator-controller
- it: Should create controller ClusterRoleBinding by default
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
name: spark-operator-controller
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-controller
namespace: spark-operator
count: 1
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: spark-operator-controller
- it: Should add extra annotations to controller ClusterRole if `controller.rbac.annotations` is set
set:
controller:
rbac:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2
- it: Should create role and rolebinding for controller in release namespace
documentIndex: 2
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-controller
namespace: spark-operator
- it: Should create role and rolebinding for controller in release namespace
documentIndex: 3
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-controller
namespace: spark-operator
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-controller
namespace: spark-operator
count: 1
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: spark-operator-controller
- it: Should create roles and rolebindings for controller in every spark job namespace if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 4
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-controller
namespace: default
- it: Should create roles and rolebindings for controller in every spark job namespace if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 5
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-controller
namespace: default
- it: Should create roles and rolebindings for controller in every spark job namespace if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 6
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-controller
namespace: spark
- it: Should create roles and rolebindings for controller in every spark job namespace if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 7
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-controller
namespace: spark

View File

@@ -0,0 +1,44 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller service
templates:
- controller/service.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should create the pprof service correctly
set:
controller:
pprof:
enable: true
port: 12345
portName: pprof-test
asserts:
- containsDocument:
apiVersion: v1
kind: Service
name: spark-operator-controller-svc
- equal:
path: spec.ports[0]
value:
port: 12345
targetPort: pprof-test
name: pprof-test

View File

@@ -0,0 +1,67 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller service account
templates:
- controller/serviceaccount.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create controller service account if `controller.serviceAccount.create` is false
set:
controller:
serviceAccount:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should create controller service account by default
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark-operator-controller
- it: Should use the specified service account name if `controller.serviceAccount.name` is set
set:
controller:
serviceAccount:
name: custom-service-account
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: custom-service-account
- it: Should add extra annotations if `controller.serviceAccount.annotations` is set
set:
controller:
serviceAccount:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2
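
These tests cover the three service-account knobs for the controller: disabling creation, overriding the name, and attaching annotations. A combined values sketch, using the placeholder name and annotation keys from the tests above:

controller:
  serviceAccount:
    create: true
    name: custom-service-account
    annotations:
      key1: value1
      key2: value2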

View File

@@ -1,301 +0,0 @@
suite: Test spark operator deployment
templates:
- deployment.yaml
release:
name: spark-operator
tests:
- it: Should contain namespace arg when sparkJobNamespaces is equal to 1
set:
sparkJobNamespaces:
- ns1
asserts:
- contains:
path: spec.template.spec.containers[0].args
content: -namespace=ns1
- it: Should add pod annotations if podAnnotations is set
set:
podAnnotations:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.metadata.annotations.key1
value: value1
- equal:
path: spec.template.metadata.annotations.key2
value: value2
- it: Should add prometheus annotations if metrics.enable is true
set:
metrics:
enable: true
port: 10254
endpoint: /metrics
asserts:
- equal:
path: spec.template.metadata.annotations["prometheus.io/scrape"]
value: "true"
- equal:
path: spec.template.metadata.annotations["prometheus.io/port"]
value: "10254"
- equal:
path: spec.template.metadata.annotations["prometheus.io/path"]
value: /metrics
- it: Should add secrets if imagePullSecrets is set
set:
imagePullSecrets:
- name: test-secret1
- name: test-secret2
asserts:
- equal:
path: spec.template.spec.imagePullSecrets[0].name
value: test-secret1
- equal:
path: spec.template.spec.imagePullSecrets[1].name
value: test-secret2
- it: Should add pod securityContext if podSecurityContext is set
set:
podSecurityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.securityContext.runAsUser
value: 1000
- equal:
path: spec.template.spec.securityContext.runAsGroup
value: 2000
- equal:
path: spec.template.spec.securityContext.fsGroup
value: 3000
- it: Should use the specified image repository if image.repository and image.tag is set
set:
image:
repository: test-repository
tag: test-tag
asserts:
- equal:
path: spec.template.spec.containers[0].image
value: test-repository:test-tag
- it: Should use the specified image pull policy if image.pullPolicy is set
set:
image:
pullPolicy: Always
asserts:
- equal:
path: spec.template.spec.containers[0].imagePullPolicy
value: Always
- it: Should add container securityContext if securityContext is set
set:
securityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.containers[0].securityContext.runAsUser
value: 1000
- equal:
path: spec.template.spec.containers[0].securityContext.runAsGroup
value: 2000
- equal:
path: spec.template.spec.containers[0].securityContext.fsGroup
value: 3000
- it: Should add metric ports if metrics.enable is true
set:
metrics:
enable: true
port: 10254
portName: metrics
asserts:
- contains:
path: spec.template.spec.containers[0].ports
content:
name: metrics
containerPort: 10254
count: 1
- it: Should add webhook ports if webhook.enable is true
set:
webhook:
enable: true
port: 8080
portName: webhook
asserts:
- contains:
path: spec.template.spec.containers[0].ports
content:
name: webhook
containerPort: 8080
count: 1
- it: Should add resources if resources is set
set:
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
asserts:
- equal:
path: spec.template.spec.containers[0].resources
value:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
- it: Should add sidecars if sidecars is set
set:
sidecars:
- name: sidecar1
image: sidecar-image1
- name: sidecar2
image: sidecar-image2
asserts:
- contains:
path: spec.template.spec.containers
content:
name: sidecar1
image: sidecar-image1
count: 1
- contains:
path: spec.template.spec.containers
content:
name: sidecar2
image: sidecar-image2
count: 1
- it: Should add volumes if volumes is set
set:
volumes:
- name: volume1
emptyDir: {}
- name: volume2
emptyDir: {}
asserts:
- contains:
path: spec.template.spec.volumes
content:
name: volume1
emptyDir: {}
count: 1
- contains:
path: spec.template.spec.volumes
content:
name: volume2
emptyDir: {}
count: 1
- it: Should add volume mounts if volumeMounts is set
set:
volumeMounts:
- name: volume1
mountPath: /volume1
- name: volume2
mountPath: /volume2
asserts:
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume1
mountPath: /volume1
count: 1
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume2
mountPath: /volume2
count: 1
- it: Should add nodeSelector if nodeSelector is set
set:
nodeSelector:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.spec.nodeSelector.key1
value: value1
- equal:
path: spec.template.spec.nodeSelector.key2
value: value2
- it: Should add affinity if affinity is set
set:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
asserts:
- equal:
path: spec.template.spec.affinity
value:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
- it: Should add tolerations if tolerations is set
set:
tolerations:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule
asserts:
- equal:
path: spec.template.spec.tolerations
value:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule

View File

@@ -1,37 +0,0 @@
suite: Test spark operator podDisruptionBudget
templates:
- poddisruptionbudget.yaml
release:
name: spark-operator
tests:
- it: Should not render spark operator podDisruptionBudget if podDisruptionBudget.enable is false
set:
podDisruptionBudget:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should render spark operator podDisruptionBudget if podDisruptionBudget.enable is true
set:
podDisruptionBudget:
enable: true
documentIndex: 0
asserts:
- containsDocument:
apiVersion: policy/v1
kind: PodDisruptionBudget
name: spark-operator-podDisruptionBudget
- it: Should set minAvailable from values
set:
podDisruptionBudget:
enable: true
minAvailable: 3
asserts:
- equal:
path: spec.template.minAvailable
value: 3

View File

@@ -0,0 +1,102 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test prometheus pod monitor
templates:
- prometheus/podmonitor.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create pod monitor by default
asserts:
- hasDocuments:
count: 0
- it: Should fail if `prometheus.podMonitor.create` is true and `prometheus.metrics.enable` is false
set:
prometheus:
metrics:
enable: false
podMonitor:
create: true
asserts:
- failedTemplate:
errorMessage: "`metrics.enable` must be set to true when `podMonitor.create` is true."
  - it: Should fail if the cluster does not support `monitoring.coreos.com/v1/PodMonitor` even if `prometheus.podMonitor.create` and `prometheus.metrics.enable` are both true
set:
prometheus:
metrics:
enable: true
podMonitor:
create: true
asserts:
- failedTemplate:
errorMessage: "The cluster does not support the required API version `monitoring.coreos.com/v1` for `PodMonitor`."
  - it: Should create pod monitor if the cluster supports `monitoring.coreos.com/v1/PodMonitor` and `prometheus.podMonitor.create` and `prometheus.metrics.enable` are both true
capabilities:
apiVersions:
- monitoring.coreos.com/v1/PodMonitor
set:
prometheus:
metrics:
enable: true
podMonitor:
create: true
asserts:
- containsDocument:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
name: spark-operator-podmonitor
- it: Should use the specified labels, jobLabel and podMetricsEndpoint
capabilities:
apiVersions:
- monitoring.coreos.com/v1/PodMonitor
set:
prometheus:
metrics:
enable: true
portName: custom-port
podMonitor:
create: true
labels:
key1: value1
key2: value2
jobLabel: custom-job-label
podMetricsEndpoint:
scheme: https
interval: 10s
asserts:
- equal:
path: metadata.labels
value:
key1: value1
key2: value2
- equal:
path: spec.podMetricsEndpoints[0]
value:
port: custom-port
scheme: https
interval: 10s
- equal:
path: spec.jobLabel
value: custom-job-label
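
As the failure-path tests above show, the PodMonitor only renders when metrics are enabled and the cluster advertises the `monitoring.coreos.com/v1` API. A values sketch matching the happy-path test, with the same illustrative port name, labels, and endpoint settings used above:

prometheus:
  metrics:
    enable: true
    portName: custom-port
  podMonitor:
    create: true
    labels:
      key1: value1
      key2: value2
    jobLabel: custom-job-label
    podMetricsEndpoint:
      scheme: https
      interval: 10s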

View File

@@ -1,90 +0,0 @@
suite: Test spark operator rbac
templates:
- rbac.yaml
release:
name: spark-operator
tests:
- it: Should not render spark operator rbac resources if rbac.create is false and rbac.createClusterRole is false
set:
rbac:
create: false
createClusterRole: false
asserts:
- hasDocuments:
count: 0
- it: Should render spark operator cluster role if rbac.create is true
set:
rbac:
create: true
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
name: spark-operator
- it: Should render spark operator cluster role if rbac.createClusterRole is true
set:
rbac:
createClusterRole: true
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
name: spark-operator
- it: Should render spark operator cluster role binding if rbac.create is true
set:
rbac:
create: true
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
name: spark-operator
- it: Should render spark operator cluster role binding correctly if rbac.createClusterRole is true
set:
rbac:
createClusterRole: true
release:
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
name: spark-operator
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator
namespace: NAMESPACE
count: 1
- equal:
path: roleRef
value:
kind: ClusterRole
name: spark-operator
apiGroup: rbac.authorization.k8s.io
- it: Should add extra annotations to spark operator cluster role if rbac.annotations is set
set:
rbac:
annotations:
key1: value1
key2: value2
documentIndex: 0
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2

View File

@@ -1,54 +0,0 @@
suite: Test spark operator service account
templates:
- serviceaccount.yaml
release:
name: spark-operator
tests:
- it: Should not render service account if serviceAccounts.sparkoperator.create is false
set:
serviceAccounts:
sparkoperator:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should render service account if serviceAccounts.sparkoperator.create is true
set:
serviceAccounts:
sparkoperator:
create: true
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark-operator
- it: Should use the specified service account name if serviceAccounts.sparkoperator.name is set
set:
serviceAccounts:
sparkoperator:
name: custom-service-account
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: custom-service-account
- it: Should add extra annotations if serviceAccounts.sparkoperator.annotations is set
set:
serviceAccounts:
sparkoperator:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2

View File

@@ -1,133 +0,0 @@
suite: Test spark rbac
templates:
- spark-rbac.yaml
release:
name: spark-operator
tests:
- it: Should not render spark rbac resources if rbac.create is false and rbac.createRole is false
set:
rbac:
create: false
createRole: false
asserts:
- hasDocuments:
count: 0
- it: Should render spark role if rbac.create is true
set:
rbac:
create: true
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-role
- it: Should render spark role if rbac.createRole is true
set:
rbac:
createRole: true
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-role
- it: Should render spark role binding if rbac.create is true
set:
rbac:
create: true
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark
- it: Should render spark role binding if rbac.createRole is true
set:
rbac:
createRole: true
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark
- it: Should create a single spark role with namespace "" by default
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-role
namespace: ""
- it: Should create a single spark role binding with namespace "" by default
values:
- ../values.yaml
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark
namespace: ""
- it: Should render multiple spark roles if sparkJobNamespaces is set with multiple values
set:
sparkJobNamespaces:
- ns1
- ns2
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-role
namespace: ns1
- it: Should render multiple spark role bindings if sparkJobNamespaces is set with multiple values
set:
sparkJobNamespaces:
- ns1
- ns2
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark
namespace: ns1
- it: Should render multiple spark roles if sparkJobNamespaces is set with multiple values
set:
sparkJobNamespaces:
- ns1
- ns2
documentIndex: 2
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-role
namespace: ns2
- it: Should render multiple spark role bindings if sparkJobNamespaces is set with multiple values
set:
sparkJobNamespaces:
- ns1
- ns2
documentIndex: 3
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark
namespace: ns2

View File

@@ -1,112 +0,0 @@
suite: Test spark service account
templates:
- spark-serviceaccount.yaml
release:
name: spark-operator
tests:
- it: Should not render service account if serviceAccounts.spark.create is false
set:
serviceAccounts:
spark:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should render service account if serviceAccounts.spark.create is true
set:
serviceAccounts:
spark:
create: true
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark-operator-spark
- it: Should use the specified service account name if serviceAccounts.spark.name is set
set:
serviceAccounts:
spark:
name: spark
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark
- it: Should add extra annotations if serviceAccounts.spark.annotations is set
set:
serviceAccounts:
spark:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2
- it: Should create multiple service accounts if sparkJobNamespaces is set
set:
serviceAccounts:
spark:
name: spark
sparkJobNamespaces:
- ns1
- ns2
- ns3
documentIndex: 0
asserts:
- hasDocuments:
count: 3
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark
namespace: ns1
- it: Should create multiple service accounts if sparkJobNamespaces is set
set:
serviceAccounts:
spark:
name: spark
sparkJobNamespaces:
- ns1
- ns2
- ns3
documentIndex: 1
asserts:
- hasDocuments:
count: 3
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark
namespace: ns2
- it: Should create multiple service accounts if sparkJobNamespaces is set
set:
serviceAccounts:
spark:
name: spark
sparkJobNamespaces:
- ns1
- ns2
- ns3
documentIndex: 2
asserts:
- hasDocuments:
count: 3
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark
namespace: ns3

View File

@@ -0,0 +1,182 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test Spark RBAC
templates:
- spark/rbac.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create RBAC resources for Spark if `spark.rbac.create` is false
set:
spark:
rbac:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should create RBAC resources for Spark in namespace `default` by default
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-spark
namespace: default
- it: Should create RBAC resources for Spark in namespace `default` by default
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-spark
namespace: default
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-spark
namespace: default
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: spark-operator-spark
- it: Should create RBAC resources for Spark in every Spark job namespace
set:
spark:
jobNamespaces:
- ns1
- ns2
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-spark
namespace: ns1
- it: Should create RBAC resources for Spark in every Spark job namespace
set:
spark:
jobNamespaces:
- ns1
- ns2
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-spark
namespace: ns1
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-spark
namespace: ns1
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: spark-operator-spark
- it: Should create RBAC resources for Spark in every Spark job namespace
set:
spark:
jobNamespaces:
- ns1
- ns2
documentIndex: 2
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-spark
namespace: ns2
- it: Should create RBAC resources for Spark in every Spark job namespace
set:
spark:
jobNamespaces:
- ns1
- ns2
documentIndex: 3
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-spark
namespace: ns2
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-spark
namespace: ns2
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: spark-operator-spark
- it: Should use the specified service account name if `spark.serviceAccount.name` is set
set:
spark:
serviceAccount:
name: spark
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark
namespace: default
- it: Should use the specified service account name if `spark.serviceAccount.name` is set
set:
spark:
serviceAccount:
name: spark
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark
namespace: default
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark
namespace: default
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: spark

View File

@@ -0,0 +1,101 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test spark service account
templates:
- spark/serviceaccount.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create service account if `spark.serviceAccount.create` is false
set:
spark:
serviceAccount:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should create service account by default
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark-operator-spark
- it: Should use the specified service account name if `spark.serviceAccount.name` is set
set:
spark:
serviceAccount:
name: spark
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark
- it: Should add extra annotations if `spark.serviceAccount.annotations` is set
set:
spark:
serviceAccount:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2
- it: Should create service account for every non-empty spark job namespace if `spark.jobNamespaces` is set with multiple values
set:
spark:
jobNamespaces:
- ""
- ns1
- ns2
documentIndex: 0
asserts:
- hasDocuments:
count: 2
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark-operator-spark
namespace: ns1
- it: Should create service account for every non-empty spark job namespace if `spark.jobNamespaces` is set with multiple values
set:
spark:
jobNamespaces:
- ""
- ns1
- ns2
documentIndex: 1
asserts:
- hasDocuments:
count: 2
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark-operator-spark
namespace: ns2
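
The last two tests above show that an empty-string entry in `spark.jobNamespaces` is skipped when creating per-namespace service accounts, so only the named namespaces get one. A values sketch for the multi-namespace case, using the placeholder namespaces and annotation keys from the tests:

spark:
  jobNamespaces:
    - ns1
    - ns2
  serviceAccount:
    create: true
    annotations:
      key1: value1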

View File

@@ -0,0 +1,532 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test webhook deployment
templates:
- webhook/deployment.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should create webhook deployment by default
asserts:
- containsDocument:
apiVersion: apps/v1
kind: Deployment
name: spark-operator-webhook
- it: Should not create webhook deployment if `webhook.enable` is `false`
set:
webhook:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should set replicas if `webhook.replicas` is set
set:
webhook:
replicas: 10
asserts:
- equal:
path: spec.replicas
value: 10
- it: Should set replicas if `webhook.replicas` is set
set:
webhook:
replicas: 0
asserts:
- equal:
path: spec.replicas
value: 0
- it: Should add pod labels if `webhook.labels` is set
set:
webhook:
labels:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.metadata.labels.key1
value: value1
- equal:
path: spec.template.metadata.labels.key2
value: value2
- it: Should add pod annotations if `webhook.annotations` is set
set:
webhook:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.metadata.annotations.key1
value: value1
- equal:
path: spec.template.metadata.annotations.key2
value: value2
- it: Should use the specified image repository if `image.registry`, `image.repository` and `image.tag` are set
set:
image:
registry: test-registry
repository: test-repository
tag: test-tag
asserts:
- equal:
path: spec.template.spec.containers[0].image
value: test-registry/test-repository:test-tag
- it: Should use the specified image pull policy if `image.pullPolicy` is set
set:
image:
pullPolicy: Always
asserts:
- equal:
path: spec.template.spec.containers[0].imagePullPolicy
value: Always
- it: Should contain `--zap-log-level` arg if `webhook.logLevel` is set
set:
webhook:
logLevel: debug
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --zap-log-level=debug
- it: Should contain `--namespaces` arg if `spark.jobNamespaces` is set
set:
spark.jobNamespaces:
- ns1
- ns2
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --namespaces=ns1,ns2
- it: Should set namespaces to all namespaces (`""`) if `spark.jobNamespaces` contains empty string
set:
spark:
jobNamespaces:
- ""
- default
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --namespaces=""
- it: Should contain `--enable-metrics` arg if `prometheus.metrics.enable` is set to `true`
set:
prometheus:
metrics:
enable: true
port: 12345
portName: test-port
endpoint: /test-endpoint
prefix: test-prefix
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --enable-metrics=true
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --metrics-bind-address=:12345
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --metrics-endpoint=/test-endpoint
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --metrics-prefix=test-prefix
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --metrics-labels=app_type
- it: Should enable leader election by default
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --leader-election=true
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --leader-election-lock-name=spark-operator-webhook-lock
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-webhook")].args
content: --leader-election-lock-namespace=spark-operator
- it: Should add webhook port
set:
webhook:
port: 12345
portName: test-port
asserts:
- contains:
path: spec.template.spec.containers[0].ports
content:
name: test-port
containerPort: 12345
- it: Should add metric port if `prometheus.metrics.enable` is true
set:
prometheus:
metrics:
enable: true
port: 10254
portName: metrics
asserts:
- contains:
path: spec.template.spec.containers[0].ports
content:
name: metrics
containerPort: 10254
count: 1
- it: Should add environment variables if `webhook.env` is set
set:
webhook:
env:
- name: ENV_NAME_1
value: ENV_VALUE_1
- name: ENV_NAME_2
valueFrom:
configMapKeyRef:
name: test-configmap
key: test-key
optional: false
asserts:
- contains:
path: spec.template.spec.containers[0].env
content:
name: ENV_NAME_1
value: ENV_VALUE_1
- contains:
path: spec.template.spec.containers[0].env
content:
name: ENV_NAME_2
valueFrom:
configMapKeyRef:
name: test-configmap
key: test-key
optional: false
- it: Should add environment variable sources if `webhook.envFrom` is set
set:
webhook:
envFrom:
- configMapRef:
name: test-configmap
optional: false
- secretRef:
name: test-secret
optional: false
asserts:
- contains:
path: spec.template.spec.containers[0].envFrom
content:
configMapRef:
name: test-configmap
optional: false
- contains:
path: spec.template.spec.containers[0].envFrom
content:
secretRef:
name: test-secret
optional: false
- it: Should add volume mounts if `webhook.volumeMounts` is set
set:
webhook:
volumeMounts:
- name: volume1
mountPath: /volume1
- name: volume2
mountPath: /volume2
asserts:
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume1
mountPath: /volume1
count: 1
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume2
mountPath: /volume2
count: 1
- it: Should add resources if `webhook.resources` is set
set:
webhook:
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
asserts:
- equal:
path: spec.template.spec.containers[0].resources
value:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
- it: Should add container securityContext if `webhook.securityContext` is set
set:
webhook:
securityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.containers[0].securityContext.runAsUser
value: 1000
- equal:
path: spec.template.spec.containers[0].securityContext.runAsGroup
value: 2000
- equal:
path: spec.template.spec.containers[0].securityContext.fsGroup
value: 3000
- it: Should add sidecars if `webhook.sidecars` is set
set:
webhook:
sidecars:
- name: sidecar1
image: sidecar-image1
- name: sidecar2
image: sidecar-image2
asserts:
- contains:
path: spec.template.spec.containers
content:
name: sidecar1
image: sidecar-image1
- contains:
path: spec.template.spec.containers
content:
name: sidecar2
image: sidecar-image2
- it: Should add secrets if `image.pullSecrets` is set
set:
image:
pullSecrets:
- name: test-secret1
- name: test-secret2
asserts:
- equal:
path: spec.template.spec.imagePullSecrets[0].name
value: test-secret1
- equal:
path: spec.template.spec.imagePullSecrets[1].name
value: test-secret2
- it: Should add volumes if `webhook.volumes` is set
set:
webhook:
volumes:
- name: volume1
emptyDir: {}
- name: volume2
emptyDir: {}
asserts:
- contains:
path: spec.template.spec.volumes
content:
name: volume1
emptyDir: {}
count: 1
- contains:
path: spec.template.spec.volumes
content:
name: volume2
emptyDir: {}
count: 1
- it: Should add nodeSelector if `webhook.nodeSelector` is set
set:
webhook:
nodeSelector:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.spec.nodeSelector.key1
value: value1
- equal:
path: spec.template.spec.nodeSelector.key2
value: value2
- it: Should add affinity if `webhook.affinity` is set
set:
webhook:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
asserts:
- equal:
path: spec.template.spec.affinity
value:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
- it: Should add tolerations if `webhook.tolerations` is set
set:
webhook:
tolerations:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule
asserts:
- equal:
path: spec.template.spec.tolerations
value:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule
- it: Should add priorityClassName if `webhook.priorityClassName` is set
set:
webhook:
priorityClassName: test-priority-class
asserts:
- equal:
path: spec.template.spec.priorityClassName
value: test-priority-class
- it: Should add pod securityContext if `webhook.podSecurityContext` is set
set:
webhook:
podSecurityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.securityContext.runAsUser
value: 1000
- equal:
path: spec.template.spec.securityContext.runAsGroup
value: 2000
- equal:
path: spec.template.spec.securityContext.fsGroup
value: 3000
- it: Should not contain topologySpreadConstraints if `webhook.topologySpreadConstraints` is not set
set:
webhook:
topologySpreadConstraints: []
asserts:
- notExists:
path: spec.template.spec.topologySpreadConstraints
- it: Should add topologySpreadConstraints if `webhook.topologySpreadConstraints` is set and `webhook.replicas` is greater than 1
set:
webhook:
replicas: 2
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
asserts:
- equal:
path: spec.template.spec.topologySpreadConstraints
value:
- labelSelector:
matchLabels:
app.kubernetes.io/component: webhook
app.kubernetes.io/instance: spark-operator
app.kubernetes.io/name: spark-operator
maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- labelSelector:
matchLabels:
app.kubernetes.io/component: webhook
app.kubernetes.io/instance: spark-operator
app.kubernetes.io/name: spark-operator
maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
- it: Should fail if `webhook.topologySpreadConstraints` is set and `webhook.replicas` is not greater than 1
set:
webhook:
replicas: 1
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
asserts:
- failedTemplate:
errorMessage: "webhook.replicas must be greater than 1 to enable topology spread constraints for webhook pods"

View File

@@ -0,0 +1,99 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test mutating webhook configuration
templates:
- webhook/mutatingwebhookconfiguration.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should create the mutating webhook configuration by default
asserts:
- containsDocument:
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
name: spark-operator-webhook
- it: Should not create the mutating webhook configuration if `webhook.enable` is `false`
set:
webhook:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should use the specified webhook port
set:
webhook:
port: 12345
asserts:
- equal:
path: webhooks[*].clientConfig.service.port
value: 12345
- it: Should use the specified failure policy
set:
webhook:
failurePolicy: Fail
asserts:
- equal:
path: webhooks[*].failurePolicy
value: Fail
- it: Should set namespaceSelector if `spark.jobNamespaces` is set with non-empty strings
set:
spark:
jobNamespaces:
- ns1
- ns2
- ns3
asserts:
- equal:
path: webhooks[*].namespaceSelector
value:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
- ns1
- ns2
- ns3
- it: Should not set namespaceSelector if `spark.jobNamespaces` contains empty string
set:
spark:
jobNamespaces:
- ""
- ns1
- ns2
- ns3
asserts:
- notExists:
path: webhooks[*].namespaceSelector
  - it: Should use the specified timeoutSeconds
set:
webhook:
timeoutSeconds: 5
asserts:
- equal:
path: webhooks[*].timeoutSeconds
value: 5
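
Taken together, these tests map `webhook.port`, `webhook.failurePolicy`, and `webhook.timeoutSeconds` onto the webhook clientConfig, and derive the `namespaceSelector` from `spark.jobNamespaces` (dropping it entirely when an empty string is present). A values sketch for a namespace-scoped configuration, with the illustrative values from the tests:

webhook:
  port: 12345
  failurePolicy: Fail
  timeoutSeconds: 5
spark:
  jobNamespaces:
    - ns1
    - ns2
    - ns3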

View File

@@ -0,0 +1,76 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test webhook pod disruption budget
templates:
- webhook/poddisruptionbudget.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not render podDisruptionBudget if `webhook.enable` is `false`
set:
webhook:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should not render podDisruptionBudget if `webhook.podDisruptionBudget.enable` is false
set:
webhook:
podDisruptionBudget:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should fail if `webhook.replicas` is less than 2 when `webhook.podDisruptionBudget.enable` is true
set:
webhook:
replicas: 1
podDisruptionBudget:
enable: true
asserts:
- failedTemplate:
errorMessage: "webhook.replicas must be greater than 1 to enable pod disruption budget for webhook"
- it: Should render spark operator podDisruptionBudget if `webhook.podDisruptionBudget.enable` is true
set:
webhook:
replicas: 2
podDisruptionBudget:
enable: true
asserts:
- containsDocument:
apiVersion: policy/v1
kind: PodDisruptionBudget
name: spark-operator-webhook-pdb
- it: Should set minAvailable if `webhook.podDisruptionBudget.minAvailable` is specified
set:
webhook:
replicas: 2
podDisruptionBudget:
enable: true
minAvailable: 3
asserts:
- equal:
path: spec.minAvailable
value: 3

View File

@@ -0,0 +1,165 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test webhook rbac
templates:
- webhook/rbac.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create webhook RBAC resources if `webhook.rbac.create` is false
set:
webhook:
rbac:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should create webhook ClusterRole by default
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
name: spark-operator-webhook
- it: Should create webhook ClusterRoleBinding by default
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
name: spark-operator-webhook
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-webhook
namespace: spark-operator
count: 1
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: spark-operator-webhook
- it: Should add extra annotations to webhook ClusterRole if `webhook.rbac.annotations` is set
set:
webhook:
rbac:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2
- it: Should create role and rolebinding for webhook in release namespace
documentIndex: 2
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-webhook
namespace: spark-operator
- it: Should create role and rolebinding for webhook in release namespace
documentIndex: 3
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-webhook
namespace: spark-operator
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-webhook
namespace: spark-operator
count: 1
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: spark-operator-webhook
- it: Should create roles and rolebindings for webhook in every spark job namespace if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 4
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-webhook
namespace: default
- it: Should create roles and rolebindings for webhook in every spark job namespace if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 5
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-webhook
namespace: default
- it: Should create roles and rolebindings for webhook in every spark job namespace if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 6
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-webhook
namespace: spark
- it: Should create roles and rolebindings for webhook in every spark job namespace if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 7
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-webhook
namespace: spark
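
The documentIndex values used above correspond to the order in which webhook/rbac.yaml renders its documents. As a summary derived from the assertions above (not part of the chart itself), with spark.jobNamespaces set to [default, spark] the expected documents are:

# documentIndex -> expected kind, name and namespace, per the assertions above
- {documentIndex: 0, kind: ClusterRole, name: spark-operator-webhook}
- {documentIndex: 1, kind: ClusterRoleBinding, name: spark-operator-webhook}
- {documentIndex: 2, kind: Role, namespace: spark-operator}
- {documentIndex: 3, kind: RoleBinding, namespace: spark-operator}
- {documentIndex: 4, kind: Role, namespace: default}
- {documentIndex: 5, kind: RoleBinding, namespace: default}
- {documentIndex: 6, kind: Role, namespace: spark}
- {documentIndex: 7, kind: RoleBinding, namespace: spark}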


@ -1,31 +0,0 @@
suite: Test spark operator webhook secret
templates:
- webhook/secret.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not render the webhook secret if webhook.enable is false
asserts:
- hasDocuments:
count: 0
- it: Should render the webhook secret with empty data fields
set:
webhook:
enable: true
asserts:
- containsDocument:
apiVersion: v1
kind: Secret
name: spark-operator-webhook-certs
- equal:
path: data
value:
ca-key.pem: ""
ca-cert.pem: ""
server-key.pem: ""
server-cert.pem: ""


@ -1,13 +1,30 @@
-suite: Test spark operator webhook service
+#
+# Copyright 2024 The Kubeflow authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+suite: Test webhook service
 templates:
   - webhook/service.yaml
 release:
   name: spark-operator
+  namespace: spark-operator
 tests:
-  - it: Should not render the webhook service if webhook.enable is false
+  - it: Should not create webhook service if `webhook.enable` is `false`
     set:
       webhook:
         enable: false
@ -15,10 +32,9 @@ tests:
       - hasDocuments:
           count: 0
-  - it: Should render the webhook service correctly if webhook.enable is true
+  - it: Should create the webhook service correctly
     set:
       webhook:
-        enable: true
         portName: webhook
     asserts:
       - containsDocument:
@ -28,6 +44,6 @@ tests:
       - equal:
           path: spec.ports[0]
           value:
-            port: 443
+            port: 9443
             targetPort: webhook
             name: webhook


@ -0,0 +1,97 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test validating webhook configuration
templates:
- webhook/validatingwebhookconfiguration.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should create the validating webhook configuration by default
asserts:
- containsDocument:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
name: spark-operator-webhook
- it: Should not create the validating webhook configuration if `webhook.enable` is `false`
set:
webhook:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should use the specified webhook port
set:
webhook:
port: 12345
asserts:
- equal:
path: webhooks[*].clientConfig.service.port
value: 12345
- it: Should use the specified failure policy
set:
webhook:
failurePolicy: Fail
asserts:
- equal:
path: webhooks[*].failurePolicy
value: Fail
- it: Should set namespaceSelector if `spark.jobNamespaces` is set with non-empty strings
set:
spark.jobNamespaces:
- ns1
- ns2
- ns3
asserts:
- equal:
path: webhooks[*].namespaceSelector
value:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
- ns1
- ns2
- ns3
- it: Should not set namespaceSelector if `spark.jobNamespaces` contains empty string
set:
spark:
jobNamespaces:
- ""
- ns1
- ns2
- ns3
asserts:
- notExists:
path: webhooks[*].namespaceSelector
- it: Should use the specified timeoutSeconds
set:
webhook:
timeoutSeconds: 5
asserts:
- equal:
path: webhooks[*].timeoutSeconds
value: 5
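
The set: blocks above are equivalent to a user-supplied values override. A minimal sketch covering the same knobs (the values are the illustrative ones used by the tests):

webhook:
  port: 12345          # rendered into webhooks[*].clientConfig.service.port
  failurePolicy: Fail  # rendered into webhooks[*].failurePolicy
  timeoutSeconds: 5    # rendered into webhooks[*].timeoutSeconds
spark:
  jobNamespaces:       # non-empty entries become a kubernetes.io/metadata.name In selector;
    - ns1              # including "" drops the namespaceSelector entirely
    - ns2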


@ -1,130 +1,354 @@
+#
+# Copyright 2024 The Kubeflow authors.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
 # Default values for spark-operator.
 # This is a YAML-formatted file.
 # Declare variables to be passed into your templates.
-# -- Common labels to add to the resources
-commonLabels: {}
-# replicaCount -- Desired number of pods, leaderElection will be enabled
-# if this is greater than 1
-replicaCount: 1
-image:
-  # -- Image repository
-  repository: docker.io/kubeflow/spark-operator
-  # -- Image pull policy
-  pullPolicy: IfNotPresent
-  # -- if set, override the image tag whose default is the chart appVersion.
-  tag: ""
-# -- Image pull secrets
-imagePullSecrets: []
-# -- String to partially override `spark-operator.fullname` template (will maintain the release name)
-nameOverride: ""
-# -- String to override release name
-fullnameOverride: ""
-rbac:
-  # -- **DEPRECATED** use `createRole` and `createClusterRole`
-  create: false
-  # -- Create and use RBAC `Role` resources
-  createRole: true
-  # -- Create and use RBAC `ClusterRole` resources
-  createClusterRole: true
-  # -- Optional annotations for rbac
-  annotations: {}
-serviceAccounts:
-  spark:
-    # -- Create a service account for spark apps
-    create: true
-    # -- Optional name for the spark service account
-    name: ""
-    # -- Optional annotations for the spark service account
-    annotations: {}
-  sparkoperator:
-    # -- Create a service account for the operator
-    create: true
-    # -- Optional name for the operator service account
-    name: ""
-    # -- Optional annotations for the operator service account
-    annotations: {}
-# -- List of namespaces where to run spark jobs
-sparkJobNamespaces:
-  - ""
-# - ns1
-# -- Operator concurrency, higher values might increase memory usage
-controllerThreads: 10
-# -- Operator resync interval. Note that the operator will respond to events (e.g. create, update)
-# unrelated to this setting
-resyncInterval: 30
-uiService:
-  # -- Enable UI service creation for Spark application
-  enable: true
-# -- Ingress URL format.
-# Requires the UI service to be enabled by setting `uiService.enable` to true.
-ingressUrlFormat: ""
-# -- Set higher levels for more verbose logging
-logLevel: 2
-# -- Pod environment variable sources
-envFrom: []
-# podSecurityContext -- Pod security context
-podSecurityContext: {}
-# securityContext -- Operator container security context
-securityContext: {}
-# sidecars -- Sidecar containers
-sidecars: []
-# volumes - Operator volumes
-volumes: []
-# volumeMounts - Operator volumeMounts
-volumeMounts: []
-webhook:
-  # -- Enable webhook server
-  enable: false
-  # -- Webhook service port
-  port: 8080
-  # -- Webhook container port name and service target port name
-  portName: webhook
-  # -- The webhook server will only operate on namespaces with this label, specified in the form key1=value1,key2=value2.
-  # Empty string (default) will operate on all namespaces
-  namespaceSelector: ""
-  # -- The webhook will only operate on resources with this label/s, specified in the form key1=value1,key2=value2, OR key in (value1,value2).
-  # Empty string (default) will operate on all objects
-  objectSelector: ""
-  # -- The annotations applied to init job, required to restore certs deleted by the cleanup job during upgrade
-  timeout: 30
-metrics:
-  # -- Enable prometheus metric scraping
-  enable: true
-  # -- Metrics port
-  port: 10254
-  # -- Metrics port name
-  portName: metrics
-  # -- Metrics serving endpoint
-  endpoint: /metrics
-  # -- Metric prefix, will be added to all exported metrics
-  prefix: ""
-# -- Prometheus pod monitor for operator's pod.
-podMonitor:
-  # -- If enabled, a pod monitor for operator's pod will be submitted. Note that prometheus metrics should be enabled as well.
-  enable: false
-  # -- Pod monitor labels
-  labels: {}
-  # -- The label to use to retrieve the job name from
+# -- String to partially override release name.
+nameOverride: ""
+# -- String to fully override release name.
+fullnameOverride: ""
+# -- Common labels to add to the resources.
+commonLabels: {}
+image:
+  # -- Image registry.
+  registry: docker.io
+  # -- Image repository.
+  repository: kubeflow/spark-operator
+  # -- Image tag.
+  # @default -- If not set, the chart appVersion will be used.
+  tag: ""
+  # -- Image pull policy.
+  pullPolicy: IfNotPresent
+  # -- Image pull secrets for private image registry.
+  pullSecrets: []
+  # - name: <secret-name>
+controller:
+  # -- Number of replicas of controller.
+  replicas: 1
+  # -- Reconcile concurrency, higher values might increase memory usage.
+  workers: 10
+  # -- Configure the verbosity of logging, can be one of `debug`, `info`, `error`.
+  logLevel: info
+  uiService:
+    # -- Specifies whether to create service for Spark web UI.
+    enable: true
+  uiIngress:
+    # -- Specifies whether to create ingress for Spark web UI.
+    # `controller.uiService.enable` must be `true` to enable ingress.
+    enable: false
+    # -- Ingress URL format.
+    # Required if `controller.uiIngress.enable` is true.
+    urlFormat: ""
+  batchScheduler:
+    # -- Specifies whether to enable batch scheduler for spark jobs scheduling.
+    # If enabled, users can specify batch scheduler name in spark application.
+    enable: false
+    # -- Specifies a list of kube-scheduler names for scheduling Spark pods.
+    kubeSchedulerNames: []
+    # - default-scheduler
+    # -- Default batch scheduler to be used if not specified by the user.
+    # If specified, this value must be either "volcano" or "yunikorn". Specifying any other
+    # value will cause the controller to error on startup.
+    default: ""
+  serviceAccount:
+    # -- Specifies whether to create a service account for the controller.
+    create: true
+    # -- Optional name for the controller service account.
+    name: ""
+    # -- Extra annotations for the controller service account.
+    annotations: {}
+  rbac:
+    # -- Specifies whether to create RBAC resources for the controller.
+    create: true
+    # -- Extra annotations for the controller RBAC resources.
+    annotations: {}
+  # -- Extra labels for controller pods.
+  labels: {}
+  # key1: value1
+  # key2: value2
+  # -- Extra annotations for controller pods.
+  annotations: {}
+  # key1: value1
+  # key2: value2
+  # -- Volumes for controller pods.
+  volumes: []
+  # -- Node selector for controller pods.
+  nodeSelector: {}
+  # -- Affinity for controller pods.
+  affinity: {}
+  # -- List of node taints to tolerate for controller pods.
+  tolerations: []
+  # -- Priority class for controller pods.
+  priorityClassName: ""
+  # -- Security context for controller pods.
+  podSecurityContext: {}
+  # runAsUser: 1000
+  # runAsGroup: 2000
+  # fsGroup: 3000
+  # -- Topology spread constraints rely on node labels to identify the topology domain(s) that each Node is in.
+  # Ref: [Pod Topology Spread Constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/).
+  # The labelSelector field in topology spread constraint will be set to the selector labels for controller pods if not specified.
+  topologySpreadConstraints: []
+  # - maxSkew: 1
+  #   topologyKey: topology.kubernetes.io/zone
+  #   whenUnsatisfiable: ScheduleAnyway
+  # - maxSkew: 1
+  #   topologyKey: kubernetes.io/hostname
+  #   whenUnsatisfiable: DoNotSchedule
+  # -- Environment variables for controller containers.
+  env: []
+  # -- Environment variable sources for controller containers.
+  envFrom: []
+  # -- Volume mounts for controller containers.
+  volumeMounts: []
+  # -- Pod resource requests and limits for controller containers.
+  # Note, that each job submission will spawn a JVM within the controller pods using "/usr/local/openjdk-11/bin/java -Xmx128m".
+  # Kubernetes may kill these Java processes at will to enforce resource limits. When that happens, you will see the following error:
+  # 'failed to run spark-submit for SparkApplication [...]: signal: killed' - when this happens, you may want to increase memory limits.
+  resources: {}
+  # limits:
+  #   cpu: 100m
+  #   memory: 300Mi
+  # requests:
+  #   cpu: 100m
+  #   memory: 300Mi
+  # -- Security context for controller containers.
+  securityContext: {}
+  # runAsUser: 1000
+  # runAsGroup: 2000
+  # fsGroup: 3000
+  # -- Sidecar containers for controller pods.
+  sidecars: []
+  # Pod disruption budget for controller to avoid service degradation.
+  podDisruptionBudget:
+    # -- Specifies whether to create pod disruption budget for controller.
+    # Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/)
+    enable: false
+    # -- The number of pods that must be available.
+    # Require `controller.replicas` to be greater than 1
+    minAvailable: 1
+  pprof:
+    # -- Specifies whether to enable pprof.
+    enable: false
+    # -- Specifies pprof port.
+    port: 6060
+    # -- Specifies pprof service port name.
+    portName: pprof
+  # Workqueue rate limiter configuration forwarded to the controller-runtime Reconciler.
+  workqueueRateLimiter:
+    # -- Specifies the average rate of items process by the workqueue rate limiter.
+    bucketQPS: 50
+    # -- Specifies the maximum number of items that can be in the workqueue at any given time.
+    bucketSize: 500
+    maxDelay:
+      # -- Specifies whether to enable max delay for the workqueue rate limiter.
+      # This is useful to avoid losing events when the workqueue is full.
+      enable: true
+      # -- Specifies the maximum delay duration for the workqueue rate limiter.
+      duration: 6h
+webhook:
+  # -- Specifies whether to enable webhook.
+  enable: true
+  # -- Number of replicas of webhook server.
+  replicas: 1
+  # -- Configure the verbosity of logging, can be one of `debug`, `info`, `error`.
+  logLevel: info
+  # -- Specifies webhook port.
+  port: 9443
+  # -- Specifies webhook service port name.
+  portName: webhook
+  # -- Specifies how unrecognized errors are handled.
+  # Available options are `Ignore` or `Fail`.
+  failurePolicy: Fail
+  # -- Specifies the timeout seconds of the webhook, the value must be between 1 and 30.
+  timeoutSeconds: 10
+  resourceQuotaEnforcement:
+    # -- Specifies whether to enable the ResourceQuota enforcement for SparkApplication resources.
+    enable: false
+  serviceAccount:
+    # -- Specifies whether to create a service account for the webhook.
+    create: true
+    # -- Optional name for the webhook service account.
+    name: ""
+    # -- Extra annotations for the webhook service account.
+    annotations: {}
+  rbac:
+    # -- Specifies whether to create RBAC resources for the webhook.
+    create: true
+    # -- Extra annotations for the webhook RBAC resources.
+    annotations: {}
+  # -- Extra labels for webhook pods.
+  labels: {}
+  # key1: value1
+  # key2: value2
+  # -- Extra annotations for webhook pods.
+  annotations: {}
+  # key1: value1
+  # key2: value2
+  # -- Sidecar containers for webhook pods.
+  sidecars: []
+  # -- Volumes for webhook pods.
+  volumes: []
+  # -- Node selector for webhook pods.
+  nodeSelector: {}
+  # -- Affinity for webhook pods.
+  affinity: {}
+  # -- List of node taints to tolerate for webhook pods.
+  tolerations: []
+  # -- Priority class for webhook pods.
+  priorityClassName: ""
+  # -- Security context for webhook pods.
+  podSecurityContext: {}
+  # runAsUser: 1000
+  # runAsGroup: 2000
+  # fsGroup: 3000
+  # -- Topology spread constraints rely on node labels to identify the topology domain(s) that each Node is in.
+  # Ref: [Pod Topology Spread Constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/).
+  # The labelSelector field in topology spread constraint will be set to the selector labels for webhook pods if not specified.
+  topologySpreadConstraints: []
+  # - maxSkew: 1
+  #   topologyKey: topology.kubernetes.io/zone
+  #   whenUnsatisfiable: ScheduleAnyway
+  # - maxSkew: 1
+  #   topologyKey: kubernetes.io/hostname
+  #   whenUnsatisfiable: DoNotSchedule
+  # -- Environment variables for webhook containers.
+  env: []
+  # -- Environment variable sources for webhook containers.
+  envFrom: []
+  # -- Volume mounts for webhook containers.
+  volumeMounts: []
+  # -- Pod resource requests and limits for webhook pods.
+  resources: {}
+  # limits:
+  #   cpu: 100m
+  #   memory: 300Mi
+  # requests:
+  #   cpu: 100m
+  #   memory: 300Mi
+  # -- Security context for webhook containers.
+  securityContext: {}
+  # runAsUser: 1000
+  # runAsGroup: 2000
+  # fsGroup: 3000
+  # Pod disruption budget for webhook to avoid service degradation.
+  podDisruptionBudget:
+    # -- Specifies whether to create pod disruption budget for webhook.
+    # Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/)
+    enable: false
+    # -- The number of pods that must be available.
+    # Require `webhook.replicas` to be greater than 1
+    minAvailable: 1
+spark:
+  # -- List of namespaces where to run spark jobs.
+  # If empty string is included, all namespaces will be allowed.
+  # Make sure the namespaces have already existed.
+  jobNamespaces:
+    - default
+  serviceAccount:
+    # -- Specifies whether to create a service account for spark applications.
+    create: true
+    # -- Optional name for the spark service account.
+    name: ""
+    # -- Optional annotations for the spark service account.
+    annotations: {}
+  rbac:
+    # -- Specifies whether to create RBAC resources for spark applications.
+    create: true
+    # -- Optional annotations for the spark application RBAC resources.
+    annotations: {}
+prometheus:
+  metrics:
+    # -- Specifies whether to enable prometheus metrics scraping.
+    enable: true
+    # -- Metrics port.
+    port: 8080
+    # -- Metrics port name.
+    portName: metrics
+    # -- Metrics serving endpoint.
+    endpoint: /metrics
+    # -- Metrics prefix, will be added to all exported metrics.
+    prefix: ""
+  # Prometheus pod monitor for controller pods
+  podMonitor:
+    # -- Specifies whether to create pod monitor.
+    # Note that prometheus metrics should be enabled as well.
+    create: false
+    # -- Pod monitor labels
+    labels: {}
+    # -- The label to use to retrieve the job name from
@ -133,66 +357,3 @@ podMonitor:
     podMetricsEndpoint:
       scheme: http
       interval: 5s
-# -- podDisruptionBudget to avoid service degradation
-podDisruptionBudget:
-  # -- Specifies whether to enable pod disruption budget.
-  # Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/)
-  enable: false
-  # -- The number of pods that must be available.
-  # Require `replicaCount` to be greater than 1
-  minAvailable: 1
-# nodeSelector -- Node labels for pod assignment
-nodeSelector: {}
-# tolerations -- List of node taints to tolerate
-tolerations: []
-# affinity -- Affinity for pod assignment
-affinity: {}
-# podAnnotations -- Additional annotations to add to the pod
-podAnnotations: {}
-# podLabels -- Additional labels to add to the pod
-podLabels: {}
-# resources -- Pod resource requests and limits
-# Note, that each job submission will spawn a JVM within the Spark Operator Pod using "/usr/local/openjdk-11/bin/java -Xmx128m".
-# Kubernetes may kill these Java processes at will to enforce resource limits. When that happens, you will see the following error:
-# 'failed to run spark-submit for SparkApplication [...]: signal: killed' - when this happens, you may want to increase memory limits.
-resources: {}
-# limits:
-#   cpu: 100m
-#   memory: 300Mi
-# requests:
-#   cpu: 100m
-#   memory: 300Mi
-batchScheduler:
-  # -- Enable batch scheduler for spark jobs scheduling. If enabled, users can specify batch scheduler name in spark application
-  enable: false
-resourceQuotaEnforcement:
-  # -- Whether to enable the ResourceQuota enforcement for SparkApplication resources.
-  # Requires the webhook to be enabled by setting `webhook.enable` to true.
-  # Ref: https://github.com/kubeflow/spark-operator/blob/master/docs/user-guide.md#enabling-resource-quota-enforcement.
-  enable: false
-leaderElection:
-  # -- Leader election lock name.
-  # Ref: https://github.com/kubeflow/spark-operator/blob/master/docs/user-guide.md#enabling-leader-election-for-high-availability.
-  lockName: "spark-operator-lock"
-  # -- Optionally store the lock in another namespace. Defaults to operator's namespace
-  lockNamespace: ""
-istio:
-  # -- When using `istio`, spark jobs need to run without a sidecar to properly terminate
-  enabled: false
-# labelSelectorFilter -- A comma-separated list of key=value, or key labels to filter resources during watch and list based on the specified labels.
-labelSelectorFilter: ""
-# priorityClassName -- A priority class to be used for running spark-operator pod.
-priorityClassName: ""
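
For users moving from the removed flat layout to the new nested one, a minimal override sketch (key names are taken from the diff above; the values shown are the new chart defaults, adjust as needed):

controller:
  replicas: 1          # was replicaCount
  workers: 10          # was controllerThreads
  uiService:
    enable: true
  uiIngress:
    enable: false
    urlFormat: ""      # was ingressUrlFormat
webhook:
  enable: true         # previously disabled by default
  port: 9443           # was 8080
spark:
  jobNamespaces:       # was sparkJobNamespaces
    - default
prometheus:
  metrics:
    enable: true       # was metrics.enable
    port: 8080         # was 10254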

cmd/main.go Normal file

@ -0,0 +1,31 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package main
import (
"fmt"
"os"
"github.com/kubeflow/spark-operator/cmd/operator"
)
func main() {
if err := operator.NewCommand().Execute(); err != nil {
fmt.Fprintf(os.Stderr, "%v\n", err)
os.Exit(1)
}
}


@ -1,5 +1,5 @@
 /*
-Copyright 2019 Google LLC
+Copyright 2024 The Kubeflow authors.
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@ -14,16 +14,20 @@ See the License for the specific language governing permissions and
 limitations under the License.
 */
-package schedulerinterface
+package controller
 import (
-    "github.com/kubeflow/spark-operator/pkg/apis/sparkoperator.k8s.io/v1beta2"
+    "github.com/spf13/cobra"
 )
-type BatchScheduler interface {
-    Name() string
-    ShouldSchedule(app *v1beta2.SparkApplication) bool
-    DoBatchSchedulingOnSubmission(app *v1beta2.SparkApplication) error
-    CleanupOnCompletion(app *v1beta2.SparkApplication) error
+func NewCommand() *cobra.Command {
+    command := &cobra.Command{
+        Use:   "controller",
+        Short: "Spark operator controller",
+        RunE: func(cmd *cobra.Command, _ []string) error {
+            return cmd.Help()
+        },
+    }
+    command.AddCommand(NewStartCommand())
+    return command
 }


@ -0,0 +1,414 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package controller
import (
"crypto/tls"
"flag"
"os"
"slices"
"time"
// Import all Kubernetes client auth plugins (e.g. Azure, GCP, OIDC, etc.)
// to ensure that exec-entrypoint and run can make use of them.
_ "k8s.io/client-go/plugin/pkg/client/auth"
"github.com/spf13/cobra"
"github.com/spf13/viper"
"go.uber.org/zap"
"go.uber.org/zap/zapcore"
"golang.org/x/time/rate"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/runtime"
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
"k8s.io/client-go/kubernetes"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
"k8s.io/utils/clock"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/cache"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller"
"sigs.k8s.io/controller-runtime/pkg/healthz"
logzap "sigs.k8s.io/controller-runtime/pkg/log/zap"
metricsserver "sigs.k8s.io/controller-runtime/pkg/metrics/server"
ctrlwebhook "sigs.k8s.io/controller-runtime/pkg/webhook"
schedulingv1alpha1 "sigs.k8s.io/scheduler-plugins/apis/scheduling/v1alpha1"
sparkoperator "github.com/kubeflow/spark-operator"
"github.com/kubeflow/spark-operator/api/v1beta1"
"github.com/kubeflow/spark-operator/api/v1beta2"
"github.com/kubeflow/spark-operator/internal/controller/scheduledsparkapplication"
"github.com/kubeflow/spark-operator/internal/controller/sparkapplication"
"github.com/kubeflow/spark-operator/internal/metrics"
"github.com/kubeflow/spark-operator/internal/scheduler"
"github.com/kubeflow/spark-operator/internal/scheduler/kubescheduler"
"github.com/kubeflow/spark-operator/internal/scheduler/volcano"
"github.com/kubeflow/spark-operator/internal/scheduler/yunikorn"
"github.com/kubeflow/spark-operator/pkg/common"
"github.com/kubeflow/spark-operator/pkg/util"
// +kubebuilder:scaffold:imports
)
var (
scheme = runtime.NewScheme()
logger = ctrl.Log.WithName("")
)
var (
namespaces []string
// Controller
controllerThreads int
cacheSyncTimeout time.Duration
//WorkQueue
workqueueRateLimiterBucketQPS int
workqueueRateLimiterBucketSize int
workqueueRateLimiterMaxDelay time.Duration
// Batch scheduler
enableBatchScheduler bool
kubeSchedulerNames []string
defaultBatchScheduler string
// Spark web UI service and ingress
enableUIService bool
ingressClassName string
ingressURLFormat string
// Leader election
enableLeaderElection bool
leaderElectionLockName string
leaderElectionLockNamespace string
leaderElectionLeaseDuration time.Duration
leaderElectionRenewDeadline time.Duration
leaderElectionRetryPeriod time.Duration
// Metrics
enableMetrics bool
metricsBindAddress string
metricsEndpoint string
metricsPrefix string
metricsLabels []string
metricsJobStartLatencyBuckets []float64
healthProbeBindAddress string
pprofBindAddress string
secureMetrics bool
enableHTTP2 bool
development bool
zapOptions = logzap.Options{}
)
func init() {
utilruntime.Must(clientgoscheme.AddToScheme(scheme))
utilruntime.Must(schedulingv1alpha1.AddToScheme(scheme))
utilruntime.Must(v1beta1.AddToScheme(scheme))
utilruntime.Must(v1beta2.AddToScheme(scheme))
// +kubebuilder:scaffold:scheme
}
func NewStartCommand() *cobra.Command {
var command = &cobra.Command{
Use: "start",
Short: "Start controller and webhook",
PreRun: func(_ *cobra.Command, args []string) {
development = viper.GetBool("development")
},
Run: func(_ *cobra.Command, args []string) {
sparkoperator.PrintVersion(false)
start()
},
}
command.Flags().IntVar(&controllerThreads, "controller-threads", 10, "Number of worker threads used by the SparkApplication controller.")
command.Flags().StringSliceVar(&namespaces, "namespaces", []string{}, "The Kubernetes namespace to manage. Will manage custom resource objects of the managed CRD types for the whole cluster if unset or contains empty string.")
command.Flags().DurationVar(&cacheSyncTimeout, "cache-sync-timeout", 30*time.Second, "Informer cache sync timeout.")
command.Flags().IntVar(&workqueueRateLimiterBucketQPS, "workqueue-ratelimiter-bucket-qps", 10, "QPS of the bucket rate of the workqueue.")
command.Flags().IntVar(&workqueueRateLimiterBucketSize, "workqueue-ratelimiter-bucket-size", 100, "The token bucket size of the workqueue.")
command.Flags().DurationVar(&workqueueRateLimiterMaxDelay, "workqueue-ratelimiter-max-delay", rate.InfDuration, "The maximum delay of the workqueue.")
command.Flags().BoolVar(&enableBatchScheduler, "enable-batch-scheduler", false, "Enable batch schedulers.")
command.Flags().StringSliceVar(&kubeSchedulerNames, "kube-scheduler-names", []string{}, "The kube-scheduler names for scheduling Spark applications.")
command.Flags().StringVar(&defaultBatchScheduler, "default-batch-scheduler", "", "Default batch scheduler.")
command.Flags().BoolVar(&enableUIService, "enable-ui-service", true, "Enable Spark Web UI service.")
command.Flags().StringVar(&ingressClassName, "ingress-class-name", "", "Set ingressClassName for ingress resources created.")
command.Flags().StringVar(&ingressURLFormat, "ingress-url-format", "", "Ingress URL format.")
command.Flags().BoolVar(&enableLeaderElection, "leader-election", false, "Enable leader election for controller manager. "+
"Enabling this will ensure there is only one active controller manager.")
command.Flags().StringVar(&leaderElectionLockName, "leader-election-lock-name", "spark-operator-lock", "Name of the ConfigMap for leader election.")
command.Flags().StringVar(&leaderElectionLockNamespace, "leader-election-lock-namespace", "spark-operator", "Namespace in which to create the ConfigMap for leader election.")
command.Flags().DurationVar(&leaderElectionLeaseDuration, "leader-election-lease-duration", 15*time.Second, "Leader election lease duration.")
command.Flags().DurationVar(&leaderElectionRenewDeadline, "leader-election-renew-deadline", 14*time.Second, "Leader election renew deadline.")
command.Flags().DurationVar(&leaderElectionRetryPeriod, "leader-election-retry-period", 4*time.Second, "Leader election retry period.")
command.Flags().BoolVar(&enableMetrics, "enable-metrics", false, "Enable metrics.")
command.Flags().StringVar(&metricsBindAddress, "metrics-bind-address", "0", "The address the metric endpoint binds to. "+
"Use the port :8080. If not set, it will be 0 in order to disable the metrics server")
command.Flags().StringVar(&metricsEndpoint, "metrics-endpoint", "/metrics", "Metrics endpoint.")
command.Flags().StringVar(&metricsPrefix, "metrics-prefix", "", "Prefix for the metrics.")
command.Flags().StringSliceVar(&metricsLabels, "metrics-labels", []string{}, "Labels to be added to the metrics.")
command.Flags().Float64SliceVar(&metricsJobStartLatencyBuckets, "metrics-job-start-latency-buckets", []float64{30, 60, 90, 120, 150, 180, 210, 240, 270, 300}, "Buckets for the job start latency histogram.")
command.Flags().StringVar(&healthProbeBindAddress, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.")
command.Flags().BoolVar(&secureMetrics, "secure-metrics", false, "If set the metrics endpoint is served securely")
command.Flags().BoolVar(&enableHTTP2, "enable-http2", false, "If set, HTTP/2 will be enabled for the metrics and webhook servers")
command.Flags().StringVar(&pprofBindAddress, "pprof-bind-address", "0", "The address the pprof endpoint binds to. "+
"If not set, it will be 0 in order to disable the pprof server")
flagSet := flag.NewFlagSet("controller", flag.ExitOnError)
ctrl.RegisterFlags(flagSet)
zapOptions.BindFlags(flagSet)
command.Flags().AddGoFlagSet(flagSet)
return command
}
func start() {
setupLog()
// Create the client rest config. Use kubeConfig if given, otherwise assume in-cluster.
cfg, err := ctrl.GetConfig()
if err != nil {
logger.Error(err, "failed to get kube config")
os.Exit(1)
}
// Create the manager.
tlsOptions := newTLSOptions()
mgr, err := ctrl.NewManager(cfg, ctrl.Options{
Scheme: scheme,
Cache: newCacheOptions(),
Metrics: metricsserver.Options{
BindAddress: metricsBindAddress,
SecureServing: secureMetrics,
TLSOpts: tlsOptions,
},
WebhookServer: ctrlwebhook.NewServer(ctrlwebhook.Options{
TLSOpts: tlsOptions,
}),
HealthProbeBindAddress: healthProbeBindAddress,
PprofBindAddress: pprofBindAddress,
LeaderElection: enableLeaderElection,
LeaderElectionID: leaderElectionLockName,
LeaderElectionNamespace: leaderElectionLockNamespace,
// LeaderElectionReleaseOnCancel defines if the leader should step down voluntarily
// when the Manager ends. This requires the binary to immediately end when the
// Manager is stopped, otherwise, this setting is unsafe. Setting this significantly
// speeds up voluntary leader transitions as the new leader doesn't have to wait
// the LeaseDuration time first.
//
// In the default scaffold provided, the program ends immediately after
// the manager stops, so it would be fine to enable this option. However,
// if you are doing or intend to do any operation such as performing cleanups
// after the manager stops, then its usage might be unsafe.
// LeaderElectionReleaseOnCancel: true,
})
if err != nil {
logger.Error(err, "failed to create manager")
os.Exit(1)
}
clientset, err := kubernetes.NewForConfig(cfg)
if err != nil {
logger.Error(err, "failed to create clientset")
os.Exit(1)
}
if err = util.InitializeIngressCapabilities(clientset); err != nil {
logger.Error(err, "failed to retrieve cluster ingress capabilities")
os.Exit(1)
}
var registry *scheduler.Registry
if enableBatchScheduler {
registry = scheduler.GetRegistry()
_ = registry.Register(common.VolcanoSchedulerName, volcano.Factory)
_ = registry.Register(yunikorn.SchedulerName, yunikorn.Factory)
// Register kube-schedulers.
for _, name := range kubeSchedulerNames {
_ = registry.Register(name, kubescheduler.Factory)
}
schedulerNames := registry.GetRegisteredSchedulerNames()
if defaultBatchScheduler != "" && !slices.Contains(schedulerNames, defaultBatchScheduler) {
logger.Error(nil, "Failed to find default batch scheduler in registered schedulers")
os.Exit(1)
}
}
// Setup controller for SparkApplication.
if err = sparkapplication.NewReconciler(
mgr,
mgr.GetScheme(),
mgr.GetClient(),
mgr.GetEventRecorderFor("spark-application-controller"),
registry,
newSparkApplicationReconcilerOptions(),
).SetupWithManager(mgr, newControllerOptions()); err != nil {
logger.Error(err, "Failed to create controller", "controller", "SparkApplication")
os.Exit(1)
}
// Setup controller for ScheduledSparkApplication.
if err = scheduledsparkapplication.NewReconciler(
mgr.GetScheme(),
mgr.GetClient(),
mgr.GetEventRecorderFor("scheduled-spark-application-controller"),
clock.RealClock{},
newScheduledSparkApplicationReconcilerOptions(),
).SetupWithManager(mgr, newControllerOptions()); err != nil {
logger.Error(err, "Failed to create controller", "controller", "ScheduledSparkApplication")
os.Exit(1)
}
// +kubebuilder:scaffold:builder
if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
logger.Error(err, "Failed to set up health check")
os.Exit(1)
}
if err := mgr.AddReadyzCheck("readyz", healthz.Ping); err != nil {
logger.Error(err, "Failed to set up ready check")
os.Exit(1)
}
logger.Info("Starting manager")
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
logger.Error(err, "Failed to start manager")
os.Exit(1)
}
}
// setupLog Configures the logging system
func setupLog() {
ctrl.SetLogger(logzap.New(
logzap.UseFlagOptions(&zapOptions),
func(o *logzap.Options) {
o.Development = development
}, func(o *logzap.Options) {
o.ZapOpts = append(o.ZapOpts, zap.AddCaller())
}, func(o *logzap.Options) {
var config zapcore.EncoderConfig
if !development {
config = zap.NewProductionEncoderConfig()
} else {
config = zap.NewDevelopmentEncoderConfig()
}
config.EncodeLevel = zapcore.CapitalColorLevelEncoder
config.EncodeTime = zapcore.ISO8601TimeEncoder
config.EncodeCaller = zapcore.ShortCallerEncoder
o.Encoder = zapcore.NewConsoleEncoder(config)
}),
)
}
func newTLSOptions() []func(c *tls.Config) {
// if the enable-http2 flag is false (the default), http/2 should be disabled
// due to its vulnerabilities. More specifically, disabling http/2 will
// prevent from being vulnerable to the HTTP/2 Stream Cancellation and
// Rapid Reset CVEs. For more information see:
// - https://github.com/advisories/GHSA-qppj-fm5r-hxr3
// - https://github.com/advisories/GHSA-4374-p667-p6c8
disableHTTP2 := func(c *tls.Config) {
logger.Info("disabling http/2")
c.NextProtos = []string{"http/1.1"}
}
tlsOpts := []func(*tls.Config){}
if !enableHTTP2 {
tlsOpts = append(tlsOpts, disableHTTP2)
}
return tlsOpts
}
// newCacheOptions creates and returns a cache.Options instance configured with default namespaces and object caching settings.
func newCacheOptions() cache.Options {
defaultNamespaces := make(map[string]cache.Config)
if !util.ContainsString(namespaces, cache.AllNamespaces) {
for _, ns := range namespaces {
defaultNamespaces[ns] = cache.Config{}
}
}
options := cache.Options{
Scheme: scheme,
DefaultNamespaces: defaultNamespaces,
ByObject: map[client.Object]cache.ByObject{
&corev1.Pod{}: {
Label: labels.SelectorFromSet(labels.Set{
common.LabelLaunchedBySparkOperator: "true",
}),
},
&corev1.ConfigMap{}: {},
&corev1.PersistentVolumeClaim{}: {},
&corev1.Service{}: {},
&v1beta2.SparkApplication{}: {},
},
}
return options
}
// newControllerOptions creates and returns a controller.Options instance configured with the given options.
func newControllerOptions() controller.Options {
options := controller.Options{
MaxConcurrentReconciles: controllerThreads,
CacheSyncTimeout: cacheSyncTimeout,
RateLimiter: util.NewRateLimiter(workqueueRateLimiterBucketQPS, workqueueRateLimiterBucketSize, workqueueRateLimiterMaxDelay),
}
return options
}
func newSparkApplicationReconcilerOptions() sparkapplication.Options {
var sparkApplicationMetrics *metrics.SparkApplicationMetrics
var sparkExecutorMetrics *metrics.SparkExecutorMetrics
if enableMetrics {
sparkApplicationMetrics = metrics.NewSparkApplicationMetrics(metricsPrefix, metricsLabels, metricsJobStartLatencyBuckets)
sparkApplicationMetrics.Register()
sparkExecutorMetrics = metrics.NewSparkExecutorMetrics(metricsPrefix, metricsLabels)
sparkExecutorMetrics.Register()
}
options := sparkapplication.Options{
Namespaces: namespaces,
EnableUIService: enableUIService,
IngressClassName: ingressClassName,
IngressURLFormat: ingressURLFormat,
DefaultBatchScheduler: defaultBatchScheduler,
SparkApplicationMetrics: sparkApplicationMetrics,
SparkExecutorMetrics: sparkExecutorMetrics,
}
if enableBatchScheduler {
options.KubeSchedulerNames = kubeSchedulerNames
}
return options
}
func newScheduledSparkApplicationReconcilerOptions() scheduledsparkapplication.Options {
options := scheduledsparkapplication.Options{
Namespaces: namespaces,
}
return options
}
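
The --workqueue-ratelimiter-* flags registered above are what the chart's controller.workqueueRateLimiter values are intended to feed. The Helm deployment template itself is not part of this excerpt, so treat the mapping below as a sketch rather than a quote of the chart:

controller:
  workqueueRateLimiter:
    bucketQPS: 50      # --workqueue-ratelimiter-bucket-qps (flag default: 10)
    bucketSize: 500    # --workqueue-ratelimiter-bucket-size (flag default: 100)
    maxDelay:
      enable: true
      duration: 6h     # --workqueue-ratelimiter-max-delay (flag default: no maximum delay)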

cmd/operator/root.go Normal file

@ -0,0 +1,39 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package operator
import (
"github.com/spf13/cobra"
"github.com/kubeflow/spark-operator/cmd/operator/controller"
"github.com/kubeflow/spark-operator/cmd/operator/version"
"github.com/kubeflow/spark-operator/cmd/operator/webhook"
)
func NewCommand() *cobra.Command {
command := &cobra.Command{
Use: "spark-operator",
Short: "Spark operator",
RunE: func(cmd *cobra.Command, _ []string) error {
return cmd.Help()
},
}
command.AddCommand(controller.NewCommand())
command.AddCommand(webhook.NewCommand())
command.AddCommand(version.NewCommand())
return command
}


@ -0,0 +1,40 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package version
import (
"github.com/spf13/cobra"
sparkoperator "github.com/kubeflow/spark-operator"
)
var (
short bool
)
func NewCommand() *cobra.Command {
command := &cobra.Command{
Use: "version",
Short: "Print version information",
RunE: func(cmd *cobra.Command, args []string) error {
sparkoperator.PrintVersion(short)
return nil
},
}
command.Flags().BoolVar(&short, "short", false, "Print just the version string.")
return command
}


@ -0,0 +1,33 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package webhook
import (
"github.com/spf13/cobra"
)
func NewCommand() *cobra.Command {
command := &cobra.Command{
Use: "webhook",
Short: "Spark operator webhook",
RunE: func(cmd *cobra.Command, _ []string) error {
return cmd.Help()
},
}
command.AddCommand(NewStartCommand())
return command
}


@ -0,0 +1,409 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package webhook
import (
"context"
"crypto/tls"
"flag"
"os"
"time"
// Import all Kubernetes client auth plugins (e.g. Azure, GCP, OIDC, etc.)
// to ensure that exec-entrypoint and run can make use of them.
_ "k8s.io/client-go/plugin/pkg/client/auth"
"github.com/spf13/cobra"
"github.com/spf13/viper"
"go.uber.org/zap"
"go.uber.org/zap/zapcore"
admissionregistrationv1 "k8s.io/api/admissionregistration/v1"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/fields"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/runtime"
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
"k8s.io/apimachinery/pkg/util/wait"
clientgoscheme "k8s.io/client-go/kubernetes/scheme"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/cache"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller"
logzap "sigs.k8s.io/controller-runtime/pkg/log/zap"
metricsserver "sigs.k8s.io/controller-runtime/pkg/metrics/server"
ctrlwebhook "sigs.k8s.io/controller-runtime/pkg/webhook"
sparkoperator "github.com/kubeflow/spark-operator"
"github.com/kubeflow/spark-operator/api/v1beta1"
"github.com/kubeflow/spark-operator/api/v1beta2"
"github.com/kubeflow/spark-operator/internal/controller/mutatingwebhookconfiguration"
"github.com/kubeflow/spark-operator/internal/controller/validatingwebhookconfiguration"
"github.com/kubeflow/spark-operator/internal/webhook"
"github.com/kubeflow/spark-operator/pkg/certificate"
"github.com/kubeflow/spark-operator/pkg/common"
"github.com/kubeflow/spark-operator/pkg/util"
// +kubebuilder:scaffold:imports
)
var (
scheme = runtime.NewScheme()
logger = ctrl.Log.WithName("")
)
var (
namespaces []string
labelSelectorFilter string
// Controller
controllerThreads int
cacheSyncTimeout time.Duration
// Webhook
enableResourceQuotaEnforcement bool
webhookCertDir string
webhookCertName string
webhookKeyName string
mutatingWebhookName string
validatingWebhookName string
webhookPort int
webhookSecretName string
webhookSecretNamespace string
webhookServiceName string
webhookServiceNamespace string
// Leader election
enableLeaderElection bool
leaderElectionLockName string
leaderElectionLockNamespace string
leaderElectionLeaseDuration time.Duration
leaderElectionRenewDeadline time.Duration
leaderElectionRetryPeriod time.Duration
// Metrics
enableMetrics bool
metricsBindAddress string
metricsEndpoint string
metricsPrefix string
metricsLabels []string
healthProbeBindAddress string
secureMetrics bool
enableHTTP2 bool
development bool
zapOptions = logzap.Options{}
)
func init() {
utilruntime.Must(clientgoscheme.AddToScheme(scheme))
utilruntime.Must(v1beta1.AddToScheme(scheme))
utilruntime.Must(v1beta2.AddToScheme(scheme))
// +kubebuilder:scaffold:scheme
}
func NewStartCommand() *cobra.Command {
var command = &cobra.Command{
Use: "start",
Short: "Start controller and webhook",
PreRun: func(_ *cobra.Command, args []string) {
development = viper.GetBool("development")
},
Run: func(cmd *cobra.Command, args []string) {
sparkoperator.PrintVersion(false)
start()
},
}
command.Flags().IntVar(&controllerThreads, "controller-threads", 10, "Number of worker threads used by the SparkApplication controller.")
command.Flags().StringSliceVar(&namespaces, "namespaces", []string{}, "The Kubernetes namespace to manage. Will manage custom resource objects of the managed CRD types for the whole cluster if unset or contains empty string.")
command.Flags().StringVar(&labelSelectorFilter, "label-selector-filter", "", "A comma-separated list of key=value, or key labels to filter resources during watch and list based on the specified labels.")
command.Flags().DurationVar(&cacheSyncTimeout, "cache-sync-timeout", 30*time.Second, "Informer cache sync timeout.")
command.Flags().StringVar(&webhookCertDir, "webhook-cert-dir", "/etc/k8s-webhook-server/serving-certs", "The directory that contains the webhook server key and certificate. "+
"When running as nonRoot, you must create and own this directory before running this command.")
command.Flags().StringVar(&webhookCertName, "webhook-cert-name", "tls.crt", "The file name of webhook server certificate.")
command.Flags().StringVar(&webhookKeyName, "webhook-key-name", "tls.key", "The file name of webhook server key.")
command.Flags().StringVar(&mutatingWebhookName, "mutating-webhook-name", "spark-operator-webhook", "The name of the mutating webhook.")
command.Flags().StringVar(&validatingWebhookName, "validating-webhook-name", "spark-operator-webhook", "The name of the validating webhook.")
command.Flags().IntVar(&webhookPort, "webhook-port", 9443, "Service port of the webhook server.")
command.Flags().StringVar(&webhookSecretName, "webhook-secret-name", "spark-operator-webhook-certs", "The name of the secret that contains the webhook server's TLS certificate and key.")
command.Flags().StringVar(&webhookSecretNamespace, "webhook-secret-namespace", "spark-operator", "The namespace of the secret that contains the webhook server's TLS certificate and key.")
command.Flags().StringVar(&webhookServiceName, "webhook-svc-name", "spark-webhook", "The name of the Service for the webhook server.")
command.Flags().StringVar(&webhookServiceNamespace, "webhook-svc-namespace", "spark-webhook", "The namespace of the Service for the webhook server.")
command.Flags().BoolVar(&enableResourceQuotaEnforcement, "enable-resource-quota-enforcement", false, "Whether to enable ResourceQuota enforcement for SparkApplication resources. Requires the webhook to be enabled.")
command.Flags().BoolVar(&enableLeaderElection, "leader-election", false, "Enable leader election for controller manager. "+
"Enabling this will ensure there is only one active controller manager.")
command.Flags().StringVar(&leaderElectionLockName, "leader-election-lock-name", "spark-operator-lock", "Name of the ConfigMap for leader election.")
command.Flags().StringVar(&leaderElectionLockNamespace, "leader-election-lock-namespace", "spark-operator", "Namespace in which to create the ConfigMap for leader election.")
command.Flags().DurationVar(&leaderElectionLeaseDuration, "leader-election-lease-duration", 15*time.Second, "Leader election lease duration.")
command.Flags().DurationVar(&leaderElectionRenewDeadline, "leader-election-renew-deadline", 14*time.Second, "Leader election renew deadline.")
command.Flags().DurationVar(&leaderElectionRetryPeriod, "leader-election-retry-period", 4*time.Second, "Leader election retry period.")
command.Flags().BoolVar(&enableMetrics, "enable-metrics", false, "Enable metrics.")
command.Flags().StringVar(&metricsBindAddress, "metrics-bind-address", "0", "The address the metric endpoint binds to. "+
"Use the port :8080. If not set, it will be 0 in order to disable the metrics server")
command.Flags().StringVar(&metricsEndpoint, "metrics-endpoint", "/metrics", "Metrics endpoint.")
command.Flags().StringVar(&metricsPrefix, "metrics-prefix", "", "Prefix for the metrics.")
command.Flags().StringSliceVar(&metricsLabels, "metrics-labels", []string{}, "Labels to be added to the metrics.")
command.Flags().StringVar(&healthProbeBindAddress, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.")
command.Flags().BoolVar(&secureMetrics, "secure-metrics", false, "If set the metrics endpoint is served securely")
command.Flags().BoolVar(&enableHTTP2, "enable-http2", false, "If set, HTTP/2 will be enabled for the metrics and webhook servers")
flagSet := flag.NewFlagSet("controller", flag.ExitOnError)
ctrl.RegisterFlags(flagSet)
zapOptions.BindFlags(flagSet)
command.Flags().AddGoFlagSet(flagSet)
return command
}
func start() {
setupLog()
// Create the client rest config. Use kubeConfig if given, otherwise assume in-cluster.
cfg, err := ctrl.GetConfig()
if err != nil {
logger.Error(err, "failed to get kube config")
os.Exit(1)
}
// Create the manager.
tlsOptions := newTLSOptions()
mgr, err := ctrl.NewManager(cfg, ctrl.Options{
Scheme: scheme,
Cache: newCacheOptions(),
Metrics: metricsserver.Options{
BindAddress: metricsBindAddress,
SecureServing: secureMetrics,
TLSOpts: tlsOptions,
},
WebhookServer: ctrlwebhook.NewServer(ctrlwebhook.Options{
Port: webhookPort,
CertDir: webhookCertDir,
CertName: webhookCertName,
TLSOpts: tlsOptions,
}),
HealthProbeBindAddress: healthProbeBindAddress,
LeaderElection: enableLeaderElection,
LeaderElectionID: leaderElectionLockName,
LeaderElectionNamespace: leaderElectionLockNamespace,
// LeaderElectionReleaseOnCancel defines if the leader should step down voluntarily
// when the Manager ends. This requires the binary to immediately end when the
// Manager is stopped, otherwise, this setting is unsafe. Setting this significantly
// speeds up voluntary leader transitions as the new leader doesn't have to wait
// the LeaseDuration time first.
//
// In the default scaffold provided, the program ends immediately after
// the manager stops, so it would be fine to enable this option. However,
// if you are doing or intend to do any operation such as performing cleanups
// after the manager stops, then its usage might be unsafe.
// LeaderElectionReleaseOnCancel: true,
})
if err != nil {
logger.Error(err, "Failed to create manager")
os.Exit(1)
}
client, err := client.New(cfg, client.Options{Scheme: mgr.GetScheme()})
if err != nil {
logger.Error(err, "Failed to create client")
os.Exit(1)
}
certProvider := certificate.NewProvider(
client,
webhookServiceName,
webhookServiceNamespace,
)
if err := wait.ExponentialBackoff(
wait.Backoff{
Steps: 5,
Duration: 1 * time.Second,
Factor: 2.0,
Jitter: 0.1,
},
func() (bool, error) {
logger.Info("Syncing webhook secret", "name", webhookSecretName, "namespace", webhookSecretNamespace)
if err := certProvider.SyncSecret(context.TODO(), webhookSecretName, webhookSecretNamespace); err != nil {
if errors.IsAlreadyExists(err) || errors.IsConflict(err) {
return false, nil
}
return false, err
}
return true, nil
},
); err != nil {
logger.Error(err, "Failed to sync webhook secret")
os.Exit(1)
}
logger.Info("Writing certificates", "path", webhookCertDir, "certificate name", webhookCertName, "key name", webhookKeyName)
if err := certProvider.WriteFile(webhookCertDir, webhookCertName, webhookKeyName); err != nil {
logger.Error(err, "Failed to save certificate")
os.Exit(1)
}
if err := mutatingwebhookconfiguration.NewReconciler(
mgr.GetClient(),
certProvider,
mutatingWebhookName,
).SetupWithManager(mgr, controller.Options{}); err != nil {
logger.Error(err, "Failed to create controller", "controller", "MutatingWebhookConfiguration")
os.Exit(1)
}
if err := validatingwebhookconfiguration.NewReconciler(
mgr.GetClient(),
certProvider,
validatingWebhookName,
).SetupWithManager(mgr, controller.Options{}); err != nil {
logger.Error(err, "Failed to create controller", "controller", "ValidatingWebhookConfiguration")
os.Exit(1)
}
if err := ctrl.NewWebhookManagedBy(mgr).
For(&v1beta2.SparkApplication{}).
WithDefaulter(webhook.NewSparkApplicationDefaulter()).
WithValidator(webhook.NewSparkApplicationValidator(mgr.GetClient(), enableResourceQuotaEnforcement)).
Complete(); err != nil {
logger.Error(err, "Failed to create mutating webhook for Spark application")
os.Exit(1)
}
if err := ctrl.NewWebhookManagedBy(mgr).
For(&v1beta2.ScheduledSparkApplication{}).
WithDefaulter(webhook.NewScheduledSparkApplicationDefaulter()).
WithValidator(webhook.NewScheduledSparkApplicationValidator()).
Complete(); err != nil {
logger.Error(err, "Failed to create mutating webhook for Scheduled Spark application")
os.Exit(1)
}
if err := ctrl.NewWebhookManagedBy(mgr).
For(&corev1.Pod{}).
WithDefaulter(webhook.NewSparkPodDefaulter(mgr.GetClient(), namespaces)).
Complete(); err != nil {
logger.Error(err, "Failed to create mutating webhook for Spark pod")
os.Exit(1)
}
// +kubebuilder:scaffold:builder
if err := mgr.AddHealthzCheck("healthz", mgr.GetWebhookServer().StartedChecker()); err != nil {
logger.Error(err, "Failed to set up health check")
os.Exit(1)
}
if err := mgr.AddReadyzCheck("readyz", mgr.GetWebhookServer().StartedChecker()); err != nil {
logger.Error(err, "Failed to set up ready check")
os.Exit(1)
}
logger.Info("Starting manager")
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
logger.Error(err, "Failed to start manager")
os.Exit(1)
}
}
// setupLog Configures the logging system
func setupLog() {
ctrl.SetLogger(logzap.New(
logzap.UseFlagOptions(&zapOptions),
func(o *logzap.Options) {
o.Development = development
}, func(o *logzap.Options) {
o.ZapOpts = append(o.ZapOpts, zap.AddCaller())
}, func(o *logzap.Options) {
var config zapcore.EncoderConfig
if !development {
config = zap.NewProductionEncoderConfig()
} else {
config = zap.NewDevelopmentEncoderConfig()
}
config.EncodeLevel = zapcore.CapitalColorLevelEncoder
config.EncodeTime = zapcore.ISO8601TimeEncoder
config.EncodeCaller = zapcore.ShortCallerEncoder
o.Encoder = zapcore.NewConsoleEncoder(config)
}),
)
}
func newTLSOptions() []func(c *tls.Config) {
// if the enable-http2 flag is false (the default), http/2 should be disabled
// due to its vulnerabilities. More specifically, disabling http/2 will
// prevent from being vulnerable to the HTTP/2 Stream Cancellation and
// Rapid Reset CVEs. For more information see:
// - https://github.com/advisories/GHSA-qppj-fm5r-hxr3
// - https://github.com/advisories/GHSA-4374-p667-p6c8
disableHTTP2 := func(c *tls.Config) {
logger.Info("disabling http/2")
c.NextProtos = []string{"http/1.1"}
}
tlsOpts := []func(*tls.Config){}
if !enableHTTP2 {
tlsOpts = append(tlsOpts, disableHTTP2)
}
return tlsOpts
}
// newCacheOptions creates and returns a cache.Options instance configured with default namespaces and object caching settings.
func newCacheOptions() cache.Options {
defaultNamespaces := make(map[string]cache.Config)
if !util.ContainsString(namespaces, cache.AllNamespaces) {
for _, ns := range namespaces {
defaultNamespaces[ns] = cache.Config{}
}
}
byObject := map[client.Object]cache.ByObject{
&corev1.Pod{}: {
Label: labels.SelectorFromSet(labels.Set{
common.LabelLaunchedBySparkOperator: "true",
}),
},
&v1beta2.SparkApplication{}: {},
&v1beta2.ScheduledSparkApplication{}: {},
&admissionregistrationv1.MutatingWebhookConfiguration{}: {
Field: fields.SelectorFromSet(fields.Set{
"metadata.name": mutatingWebhookName,
}),
},
&admissionregistrationv1.ValidatingWebhookConfiguration{}: {
Field: fields.SelectorFromSet(fields.Set{
"metadata.name": validatingWebhookName,
}),
},
}
if enableResourceQuotaEnforcement {
byObject[&corev1.ResourceQuota{}] = cache.ByObject{}
}
options := cache.Options{
Scheme: scheme,
DefaultNamespaces: defaultNamespaces,
ByObject: byObject,
}
return options
}
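
As with the controller, the webhook server is configured through the flags registered above. The chart values most relevant to this file map onto them roughly as follows (the deployment template is not shown in this diff, so this is a sketch):

webhook:
  port: 9443                  # --webhook-port
  resourceQuotaEnforcement:
    enable: false             # --enable-resource-quota-enforcement
spark:
  jobNamespaces:              # --namespaces
    - default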

codecov.yaml Normal file

@ -0,0 +1,10 @@
coverage:
  status:
    project:
      default:
        threshold: 0.1%
    patch:
      default:
        target: 60%
ignore:
  - "**/*_generated.*"


@ -0,0 +1,35 @@
# The following manifests contain a self-signed issuer CR and a certificate CR.
# More documentation can be found at https://docs.cert-manager.io
# WARNING: Targets CertManager v1.0. Check https://cert-manager.io/docs/installation/upgrading/ for breaking changes.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  labels:
    app.kubernetes.io/name: spark-operator
    app.kubernetes.io/managed-by: kustomize
  name: selfsigned-issuer
  namespace: system
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  labels:
    app.kubernetes.io/name: certificate
    app.kubernetes.io/instance: serving-cert
    app.kubernetes.io/component: certificate
    app.kubernetes.io/created-by: spark-operator
    app.kubernetes.io/part-of: spark-operator
    app.kubernetes.io/managed-by: kustomize
  name: serving-cert # this name should match the one that appears in kustomizeconfig.yaml
  namespace: system
spec:
  # SERVICE_NAME and SERVICE_NAMESPACE will be substituted by kustomize
  dnsNames:
    - SERVICE_NAME.SERVICE_NAMESPACE.svc
    - SERVICE_NAME.SERVICE_NAMESPACE.svc.cluster.local
  issuerRef:
    kind: Issuer
    name: selfsigned-issuer
  secretName: webhook-server-cert # this secret will not be prefixed, since it's not managed by kustomize
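
When this manifest is built with kustomize, SERVICE_NAME and SERVICE_NAMESPACE are replaced with the webhook Service's name and namespace. Assuming a Service named spark-operator-webhook-svc in namespace spark-operator (illustrative names, not taken from this diff), the substituted dnsNames would look like:

dnsNames:
  - spark-operator-webhook-svc.spark-operator.svc
  - spark-operator-webhook-svc.spark-operator.svc.cluster.local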


@ -0,0 +1,5 @@
resources:
- certificate.yaml
configurations:
- kustomizeconfig.yaml


@ -0,0 +1,8 @@
# This configuration is for teaching kustomize how to update name ref substitution
nameReference:
  - kind: Issuer
    group: cert-manager.io
    fieldSpecs:
      - kind: Certificate
        group: cert-manager.io
        path: spec/issuerRef/name

