Compare commits

...

206 Commits

Author SHA1 Message Date
dependabot[bot] 920772e065
Bump github.com/go-viper/mapstructure/v2 from 2.2.1 to 2.3.0 (#2572)
Bumps [github.com/go-viper/mapstructure/v2](https://github.com/go-viper/mapstructure) from 2.2.1 to 2.3.0.
- [Release notes](https://github.com/go-viper/mapstructure/releases)
- [Changelog](https://github.com/go-viper/mapstructure/blob/main/CHANGELOG.md)
- [Commits](https://github.com/go-viper/mapstructure/compare/v2.2.1...v2.3.0)

---
updated-dependencies:
- dependency-name: github.com/go-viper/mapstructure/v2
  dependency-version: 2.3.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-10 09:17:53 +00:00
Mat Schaffer 799d915a44
Include pod.Status.Message in recordExecutorEvent (#2589)
This is especially useful for ephemeral-storage exhaustion.

Signed-off-by: Mat Schaffer <Mat.Schaffer@roblox.com>
2025-07-10 02:57:52 +00:00
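
For context, a hedged sketch of what including the pod status message in an executor event can look like, using client-go's EventRecorder; the helper name, reason string, and message format are illustrative, not the controller's actual code:

```go
package sketch

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/record"
)

// recordExecutorEvent is a hypothetical stand-in for the controller's helper.
// Appending pod.Status.Message surfaces kubelet detail (for example, eviction
// due to ephemeral-storage exhaustion) that the pod phase alone does not carry.
func recordExecutorEvent(recorder record.EventRecorder, app runtime.Object, pod *corev1.Pod) {
	msg := fmt.Sprintf("executor %s failed", pod.Name)
	if pod.Status.Message != "" {
		msg = fmt.Sprintf("%s: %s", msg, pod.Status.Message)
	}
	recorder.Event(app, corev1.EventTypeWarning, "SparkExecutorFailed", msg)
}
```
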
Yi Chen 45cb1a1277
fix: should add executor env when driver env is empty (#2586)
* should check the length of executor env when adding env to executor pods

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add unit tests for addEnvVars

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-07-10 01:47:52 +00:00
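
The shape of the fix, as an illustrative sketch (function and parameter names assumed, not the actual `addEnvVars` implementation): each role's env list must be checked on its own, rather than gating executor mutation on the driver's list:

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// addEnvVars appends the env vars configured for one role to a pod's
// containers. The bug class fixed by #2586: gating the executor branch on the
// driver's env list meant executors received no env vars whenever the driver
// had none. Each role's list must be checked independently.
func addEnvVars(pod *corev1.Pod, env []corev1.EnvVar) {
	if len(env) == 0 {
		return // nothing configured for this role
	}
	for i := range pod.Spec.Containers {
		pod.Spec.Containers[i].Env = append(pod.Spec.Containers[i].Env, env...)
	}
}
```
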
dependabot[bot] 8635b2e84b
Bump aquasecurity/trivy-action from 0.31.0 to 0.32.0 (#2585)
Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.31.0 to 0.32.0.
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](https://github.com/aquasecurity/trivy-action/compare/0.31.0...0.32.0)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-version: 0.32.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-08 01:57:50 +00:00
Yi Chen 107b457296
Make logging encoder configurable (#2580)
Signed-off-by: Yi Chen <github@chenyicn.net>
2025-07-08 01:28:50 +00:00
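
controller-runtime's zap integration already exposes console and JSON encoders, so the wiring is roughly the following sketch; `--zap-encoder` is the stock controller-runtime flag name and may differ from the flag #2580 actually adds:

```go
package main

import (
	"flag"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func main() {
	encoder := flag.String("zap-encoder", "console", "log encoding: console or json")
	flag.Parse()

	// Pick the zapcore encoder based on the flag, then install the logger
	// globally so all controllers inherit it.
	opts := []zap.Opts{}
	if *encoder == "json" {
		opts = append(opts, zap.JSONEncoder())
	} else {
		opts = append(opts, zap.ConsoleEncoder())
	}
	ctrl.SetLogger(zap.New(opts...))
}
```
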
Yi Chen fd52169b25
Add SparkConnect e2e test (#2578)
Signed-off-by: Yi Chen <github@chenyicn.net>
2025-07-08 01:05:50 +00:00
Mat Schaffer dee20c86c4
Splat recordExecutorEvent args for cleaner event messages (#2582)
Signed-off-by: Mat Schaffer <Mat.Schaffer@roblox.com>
2025-07-07 02:39:21 +00:00
Manabu McCloskey 191ac52820
upgrade to Spark 4.0.0 (#2564)
* upgrade to Spark 4.0.0

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* correct docker address

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* fix docker FQDN

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

---------

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-07-04 02:11:18 +00:00
Yi Chen 9773369b6d
Add support for Spark Connect (#2569)
* Add v1alpha1 version of CRD SparkConnect

Signed-off-by: Yi Chen <github@chenyicn.net>

* Generate SparkConnect CRD manifests

Signed-off-by: Yi Chen <github@chenyicn.net>

* Implement SparkConnect controller

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update Helm chart CRDs and RBAC resources

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add SparkConnect example

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add pre stop lifecycle handler

Signed-off-by: Yi Chen <github@chenyicn.net>

* Rename LabelSparkConnName to LabelSparkConnectName

Signed-off-by: Yi Chen <github@chenyicn.net>

* Rename SparkConnectStateUnready to SparkConnectStateNotReady

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add constant SparkConfigMapVolumeMountName

Signed-off-by: Yi Chen <github@chenyicn.net>

* Use server pod status message as SparkConnect condition message

Signed-off-by: Yi Chen <github@chenyicn.net>

* Define SparkConnect condition reasons

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add prefix to Hadoop properties if it does not start with 'spark.hadoop.'

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-07-02 03:21:18 +00:00
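
The Hadoop-property rule from the last bullet above is small enough to show directly; this sketch relies only on Spark's documented behavior of forwarding `spark.hadoop.*` keys into the Hadoop Configuration:

```go
package main

import (
	"fmt"
	"strings"
)

// withSparkHadoopPrefix normalizes a Hadoop property key for spark-submit:
// Spark forwards "spark.hadoop.*" keys into the Hadoop Configuration, so a
// bare Hadoop key gets the prefix added, and an already-prefixed key is kept.
func withSparkHadoopPrefix(key string) string {
	if strings.HasPrefix(key, "spark.hadoop.") {
		return key
	}
	return "spark.hadoop." + key
}

func main() {
	fmt.Println(withSparkHadoopPrefix("fs.s3a.endpoint")) // spark.hadoop.fs.s3a.endpoint
	fmt.Println(withSparkHadoopPrefix("spark.hadoop.fs.s3a.path.style.access")) // unchanged
}
```
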
Francisco Arceo 2a51f5a426
chore: Adding OpenSSF Badge (#2571)
Signed-off-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-07-01 02:40:16 +00:00
Yi Chen 9d10f2f67a
Add changelog for v2.2.1 (#2570)
* Release v2.2.1 (#2568)

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add changelog for v2.2.1

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-06-27 01:06:17 +00:00
dependabot[bot] 5148c4e81a
Bump github.com/go-logr/logr from 1.4.2 to 1.4.3 (#2567)
Bumps [github.com/go-logr/logr](https://github.com/go-logr/logr) from 1.4.2 to 1.4.3.
- [Release notes](https://github.com/go-logr/logr/releases)
- [Changelog](https://github.com/go-logr/logr/blob/master/CHANGELOG.md)
- [Commits](https://github.com/go-logr/logr/compare/v1.4.2...v1.4.3)

---
updated-dependencies:
- dependency-name: github.com/go-logr/logr
  dependency-version: 1.4.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-25 06:13:15 +00:00
dependabot[bot] 1b8df900a8
Bump golang.org/x/mod from 0.24.0 to 0.25.0 (#2566)
Bumps [golang.org/x/mod](https://github.com/golang/mod) from 0.24.0 to 0.25.0.
- [Commits](https://github.com/golang/mod/compare/v0.24.0...v0.25.0)

---
updated-dependencies:
- dependency-name: golang.org/x/mod
  dependency-version: 0.25.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-25 06:12:17 +00:00
dependabot[bot] 78bb172fa1
Bump sigs.k8s.io/scheduler-plugins from 0.30.6 to 0.31.8 (#2549)
Bumps [sigs.k8s.io/scheduler-plugins](https://github.com/kubernetes-sigs/scheduler-plugins) from 0.30.6 to 0.31.8.
- [Release notes](https://github.com/kubernetes-sigs/scheduler-plugins/releases)
- [Changelog](https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/RELEASE.md)
- [Commits](https://github.com/kubernetes-sigs/scheduler-plugins/compare/v0.30.6...v0.31.8)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/scheduler-plugins
  dependency-version: 0.31.8
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-19 13:50:10 +00:00
dependabot[bot] 6a32e32432
Bump github.com/prometheus/client_golang from 1.21.1 to 1.22.0 (#2548)
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.21.1 to 1.22.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.21.1...v1.22.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-version: 1.22.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-19 13:49:11 +00:00
jbhalodia-slack ca11a8f55d
Use code-generator for clientset, informers, listers (#2563)
* Use code-generator for clientset, informers, listers

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add README in hack/ for code-generator

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Run verify-codegen.sh in tests

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* make generate

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* make build-api-docs

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* update makefile

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* update makefile

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* make go-fmt

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* make generate

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* run tests

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Remove deepcopy-gen since its conflicting with controller-gen

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Revert some changes

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Generate packages in pkg/client/

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update year to 2025 in boilerplate.go.txt

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update year to 2025 in types.go

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
2025-06-19 13:37:10 +00:00
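
For orientation, a code-generator-produced typed clientset is consumed roughly as below; the import path and group accessor name are assumptions based on the repo's `pkg/client/` layout and code-generator's conventions:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	ctrl "sigs.k8s.io/controller-runtime"

	// Assumed import path; the generated code lives under pkg/client/.
	"github.com/kubeflow/spark-operator/v2/pkg/client/clientset/versioned"
)

func main() {
	// Typed clientset built from the usual rest.Config.
	clientset, err := versioned.NewForConfig(ctrl.GetConfigOrDie())
	if err != nil {
		panic(err)
	}
	// Group/version accessor and resource names follow code-generator's
	// conventions for the sparkoperator.k8s.io/v1beta2 API group.
	apps, err := clientset.SparkoperatorV1beta2().SparkApplications("default").
		List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("found %d SparkApplications\n", len(apps.Items))
}
```
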
Joshua Cuellar 818e34a36a
Update golangci lint (#2560)
* Update Golangci-lint version to 2.1.6

Signed-off-by: Joshua Cuellar <joshuac.cuellar@outlook.com>

* Fix Golangci-lint code issues

Signed-off-by: Joshua Cuellar <joshuac.cuellar@outlook.com>

* Simplify struct paths

Signed-off-by: Joshua Cuellar <joshuac.cuellar@outlook.com>

* Catch returned errors in test

Signed-off-by: Joshua Cuellar <joshuac.cuellar@outlook.com>

---------

Signed-off-by: Joshua Cuellar <joshuac.cuellar@outlook.com>
2025-06-17 02:40:08 +00:00
Yi Chen d14c901d12
Get logger from context (#2551)
Signed-off-by: Yi Chen <github@chenyicn.net>
2025-06-12 02:04:50 +00:00
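
The pattern in question, sketched with controller-runtime's `log` package: take the request-scoped logger from the context instead of a package-level logger:

```go
package sketch

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/log"
)

// reconcile shows the pattern: pulling the logger from the context means each
// log line carries the values (request name/namespace) that controller-runtime
// attached upstream, instead of losing them to a global logger.
func reconcile(ctx context.Context) {
	logger := log.FromContext(ctx)
	logger.Info("reconciling SparkApplication")
}
```
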
Thomas Newton 8dd8db45a3
Make default ingress TLS and annotations configurable in the Helm config (#2513)
* Initial attempt at passing the info through the CLI

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Get ingressTLS config from Helm YAML into a `[]networkingv1.IngressTLS`

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Vaguely correct adding TLS options to UI ingress

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* First test passing

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Pass through argument where I had forgotten

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Implement IngressAnnotations

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add "default" to some variable names

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update helm and CLI parsing, including adding annotations

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Sufficient unit tests

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tests and documentation for helm

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Minor adjustments to test strings

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* PR comments

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix rebase

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Avoid manually constructing the expected ingress name in tests

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Quote and rename on the helm side

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Avoid using pointers

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* More renaming to remove "default"

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Revert helm quote on json strings

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy imports

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix tests after #2554 moved the ingress creation to be on submitted spark applications

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Re-generate helm docs

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

---------

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
2025-06-10 16:02:49 +00:00
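
One plausible shape for that plumbing, as a sketch: the chart serializes the TLS config to JSON and the operator unmarshals it into `[]networkingv1.IngressTLS`; the flag name here is hypothetical:

```go
package main

import (
	"encoding/json"
	"flag"
	"fmt"

	networkingv1 "k8s.io/api/networking/v1"
)

func main() {
	// Hypothetical flag: the chart renders its ingressTLS values to JSON.
	raw := flag.String("ingress-tls", "", "JSON-encoded []networkingv1.IngressTLS for UI ingresses")
	flag.Parse()

	var tls []networkingv1.IngressTLS
	if *raw != "" {
		if err := json.Unmarshal([]byte(*raw), &tls); err != nil {
			panic(fmt.Errorf("parsing --ingress-tls: %w", err))
		}
	}
	fmt.Printf("parsed %d TLS entries\n", len(tls))
}
```
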
Yi Chen 718e3a004e
Customize ingress URL with Spark application ID (#2554)
* Customize ingress URL with Spark application ID

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update policy rules for ingress

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-06-10 09:33:48 +00:00
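
A sketch of the substitution step, with template variable names modeled on the operator's ingress-url-format style but not verified against the actual code; the appId variable is the illustrative addition:

```go
package main

import (
	"fmt"
	"strings"
)

// renderIngressURL expands template variables in an ingress URL format string.
func renderIngressURL(format, namespace, name, appID string) string {
	return strings.NewReplacer(
		"{{$appNamespace}}", namespace,
		"{{$appName}}", name,
		"{{$appId}}", appID,
	).Replace(format)
}

func main() {
	fmt.Println(renderIngressURL(
		"{{$appName}}.spark.example.com/{{$appId}}",
		"default", "spark-pi", "spark-a1b2c3"))
}
```
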
dependabot[bot] 84a7749680
Bump aquasecurity/trivy-action from 0.30.0 to 0.31.0 (#2557)
Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.30.0 to 0.31.0.
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](https://github.com/aquasecurity/trivy-action/compare/0.30.0...0.31.0)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-version: 0.31.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-10 09:29:50 +00:00
Manabu McCloskey 53299362e5
add driver ingress unit tests (#2552)
* add driver ingress unit tests

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* fix lint

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* fix lint

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

---------

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-06-06 09:07:15 +00:00
Yi Chen 5a1932490a
Add changelog for v2.2.0 (#2547)
* Release v2.2.0 (#2546)

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add changelog for v2.2.0

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update README.md

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-06-03 12:12:13 +00:00
Yi Chen 08dfbc5fc9
Pass the correct LDFLAGS when building the operator image (#2541)
* Update module path in Makefile

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update .dockerignore and .gitignore

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-05-29 05:14:18 +00:00
Hossein Torabi 7b819728ff
#2525: Spark metrics depend on Prometheus (#2529)
Signed-off-by: Hossein Torabi <blcksrx@pm.me>
2025-05-29 04:33:18 +00:00
Yi Chen ec83779094
Bump k8s.io dependencies to v0.32.5 (#2540)
Signed-off-by: Yi Chen <github@chenyicn.net>
2025-05-29 04:10:19 +00:00
Yi Chen 45efb0cf68
Add v2 to module path (#2515)
* Add v2 to module path

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update imports

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update API docs

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-05-28 15:46:18 +00:00
Yi Chen 61510642af
Add support for using cert manager to generate webhook certificates (#2373)
* Add support for using cert manager to generate webhook certificates

Signed-off-by: Yi Chen <github@chenyicn.net>

* update certificate provider unit tests

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add a newline at the end of file

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add examples for configuring duration and renewBefore

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-05-28 05:33:18 +00:00
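
A hedged example of the duration/renewBefore configuration mentioned above, written with the cert-manager Go types; all names and values are illustrative:

```go
package sketch

import (
	"time"

	certv1 "github.com/cert-manager/cert-manager/pkg/apis/certmanager/v1"
	cmmeta "github.com/cert-manager/cert-manager/pkg/apis/meta/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// webhookCertificate builds a Certificate for the webhook server that is
// rotated well before expiry: valid for 90 days, renewed 15 days early.
func webhookCertificate(namespace string) *certv1.Certificate {
	return &certv1.Certificate{
		ObjectMeta: metav1.ObjectMeta{Name: "spark-operator-webhook-cert", Namespace: namespace},
		Spec: certv1.CertificateSpec{
			SecretName:  "spark-operator-webhook-certs",
			DNSNames:    []string{"spark-operator-webhook-svc." + namespace + ".svc"},
			Duration:    &metav1.Duration{Duration: 90 * 24 * time.Hour},
			RenewBefore: &metav1.Duration{Duration: 15 * 24 * time.Hour},
			IssuerRef:   cmmeta.ObjectReference{Kind: "Issuer", Name: "spark-operator-selfsigned"},
		},
	}
}
```
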
Yi Chen 8c7a4949d0
Define SparkApplicationSubmitter interface to allow customizing submitting mechanism (#2500)
* Define SparkApplicationSubmitter interface to allow customizing submitting mechanism

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update comments

Co-authored-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Co-authored-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-05-28 04:46:18 +00:00
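
A minimal sketch of such an interface (import path and method set assumed; #2500 may define more hooks):

```go
package sketch

import (
	"context"

	// Assumed import path for the API types.
	"github.com/kubeflow/spark-operator/v2/api/v1beta2"
)

// SparkApplicationSubmitter abstracts how an application is handed to the
// cluster, so the built-in spark-submit path can be replaced, e.g. by a
// submission service. The method set here is a guess at the minimal surface.
type SparkApplicationSubmitter interface {
	Submit(ctx context.Context, app *v1beta2.SparkApplication) error
}
```
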
dependabot[bot] 0de3502c9e
Bump manusa/actions-setup-minikube from 2.13.1 to 2.14.0 (#2523)
Bumps [manusa/actions-setup-minikube](https://github.com/manusa/actions-setup-minikube) from 2.13.1 to 2.14.0.
- [Release notes](https://github.com/manusa/actions-setup-minikube/releases)
- [Commits](https://github.com/manusa/actions-setup-minikube/compare/v2.13.1...v2.14.0)

---
updated-dependencies:
- dependency-name: manusa/actions-setup-minikube
  dependency-version: 2.14.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-28 02:24:18 +00:00
Vara Bonthu 8998abd881
Adding Manabu to the reviewers (#2522)
* Adding Manabu to the approvers

Signed-off-by: Vara Bonthu <vara.bonthu@gmail.com>

* Adding Manabu to the reviewers

Signed-off-by: Vara Bonthu <vara.bonthu@gmail.com>

---------

Signed-off-by: Vara Bonthu <vara.bonthu@gmail.com>
2025-05-28 02:23:18 +00:00
dependabot[bot] 81cac03725
Bump golang.org/x/mod from 0.23.0 to 0.24.0 (#2495)
Bumps [golang.org/x/mod](https://github.com/golang/mod) from 0.23.0 to 0.24.0.
- [Commits](https://github.com/golang/mod/compare/v0.23.0...v0.24.0)

---
updated-dependencies:
- dependency-name: golang.org/x/mod
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-27 09:21:27 +00:00
dependabot[bot] 486f5a7eee
Bump github.com/spf13/cobra from 1.8.1 to 1.9.1 (#2497)
Bumps [github.com/spf13/cobra](https://github.com/spf13/cobra) from 1.8.1 to 1.9.1.
- [Release notes](https://github.com/spf13/cobra/releases)
- [Commits](https://github.com/spf13/cobra/compare/v1.8.1...v1.9.1)

---
updated-dependencies:
- dependency-name: github.com/spf13/cobra
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-27 09:19:03 +00:00
Manabu McCloskey 7536c0739f
fix volcano tests (#2533)
Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-05-27 08:26:24 +00:00
Manabu McCloskey 851668f7ca
fix and add back unit tests (#2532)
* fix and add back unit tests

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* add more tests

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

---------

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-05-17 11:05:24 +00:00
jbhalodia-slack ca37f6b7b3
Add ShuffleTrackingEnabled to DynamicAllocation struct to allow disabling shuffle tracking (#2511)
* Add ShuffleTrackingEnabled *bool to DynamicAllocation struct to allow disabling shuffle tracking

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Run make generate

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* make manifests

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* make update-crd && make build-api-docs

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update internal/controller/sparkapplication/submission.go

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Go fmt

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Refactor defaultExecutorSpec func

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Refactor dynamicAllocationOption func

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add IsDynamicAllocationEnabled func

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
2025-05-14 05:49:22 +00:00
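
A sketch of the resulting API shape; field names beyond `ShuffleTrackingEnabled` are illustrative rather than copied from `v1beta2`:

```go
package sketch

// DynamicAllocation sketches the spec shape after #2511. ShuffleTrackingEnabled
// is a *bool so "unset" (nil, keeping shuffle tracking on for backward
// compatibility) stays distinguishable from an explicit false, which disables
// shuffle tracking, e.g. when an external shuffle service is available.
type DynamicAllocation struct {
	Enabled                bool   `json:"enabled,omitempty"`
	ShuffleTrackingEnabled *bool  `json:"shuffleTrackingEnabled,omitempty"`
	MinExecutors           *int32 `json:"minExecutors,omitempty"`
	MaxExecutors           *int32 `json:"maxExecutors,omitempty"`
}

// IsDynamicAllocationEnabled mirrors the helper added in the PR (illustrative).
func IsDynamicAllocationEnabled(da *DynamicAllocation) bool {
	return da != nil && da.Enabled
}
```
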
Tarek Abouzeid 0cb98f79cf
Adding securityContext to spark examples (#2530)
* Adding securityContext to spark examples

Signed-off-by: Tarek Abouzeid <tarek.abouzeid91@gmail.com>

* Fix indentation and newlines

Signed-off-by: Tarek Abouzeid <tarek.abouzeid91@gmail.com>

---------

Signed-off-by: Tarek Abouzeid <tarek.abouzeid91@gmail.com>
2025-05-13 02:40:20 +00:00
Manabu McCloskey e071e5f4a6
add unit tests for driver and executor configs (#2521)
* add unit tests for driver and executor configs

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* fix import

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

---------

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-05-06 21:30:38 +00:00
Yi Chen 85b209ec20
Remove v1beta1 API (#2516)
Signed-off-by: Yi Chen <github@chenyicn.net>
2025-05-03 00:00:04 +00:00
Yi Chen 5002c08dce
Remove clientset, informer and listers generated by code-generator (#2506)
* Remove clientset, informer and listers generated by code-generator

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update imports

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-04-27 03:20:59 +00:00
dependabot[bot] 634b0c1a9f
Bump golang.org/x/net from 0.37.0 to 0.38.0 (#2505)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.37.0 to 0.38.0.
- [Commits](https://github.com/golang/net/compare/v0.37.0...v0.38.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-version: 0.38.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-21 08:32:26 +00:00
dependabot[bot] 6e0623cc7e
Bump github.com/spf13/viper from 1.19.0 to 1.20.1 (#2496)
Bumps [github.com/spf13/viper](https://github.com/spf13/viper) from 1.19.0 to 1.20.1.
- [Release notes](https://github.com/spf13/viper/releases)
- [Commits](https://github.com/spf13/viper/compare/v1.19.0...v1.20.1)

---
updated-dependencies:
- dependency-name: github.com/spf13/viper
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-21 07:25:26 +00:00
Daniel Freitas 5a97ca4daa
Enable the override of MemoryLimit through webhook (#2478)
* Documentation and interface definition

Signed-off-by: danielrs <danielrs@ibm.com>

* addMemoryLimit and conversion methods

Signed-off-by: danielrs <danielrs@ibm.com>

* Unit tests

Signed-off-by: danielrs <danielrs@ibm.com>

* Deepcopy

Signed-off-by: danielrs <danielrs@ibm.com>

* Adjustments after make command

Signed-off-by: danielrs <danielrs@ibm.com>

* Address comments

Signed-off-by: danielrs <danielrs@ibm.com>

---------

Signed-off-by: danielrs <danielrs@ibm.com>
Co-authored-by: danielrs <danielrs@ibm.com>
2025-04-21 07:12:26 +00:00
Yi Chen 836a9186e5
Remove sparkctl (#2466)
Signed-off-by: Yi Chen <github@chenyicn.net>
2025-04-21 06:46:26 +00:00
Yi Chen 32017d2f41
Add changelog for v2.1.1 (#2504)
Signed-off-by: Yi Chen <github@chenyicn.net>
2025-04-15 10:51:22 +00:00
dependabot[bot] f38ae18abe
Bump helm.sh/helm/v3 from 3.16.2 to 3.17.3 (#2503)
Bumps [helm.sh/helm/v3](https://github.com/helm/helm) from 3.16.2 to 3.17.3.
- [Release notes](https://github.com/helm/helm/releases)
- [Commits](https://github.com/helm/helm/compare/v3.16.2...v3.17.3)

---
updated-dependencies:
- dependency-name: helm.sh/helm/v3
  dependency-version: 3.17.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-14 08:01:09 +00:00
Jacob Salway 3c4ebc7235
Upgrade Golang to 1.24.1 and golangci-lint to 1.64.8 (#2494)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2025-03-31 04:01:30 +00:00
Jacob Salway 50ae7a0062
Add timeZone to ScheduledSparkApplication (#2471)
* Add timeZone to ScheduledSparkApplication

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Update api/v1beta2/scheduledsparkapplication_types.go

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
2025-03-31 02:12:30 +00:00
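
A sketch of what evaluating a schedule in a configured time zone involves, using robfig/cron (a library commonly used for cron parsing in Go controllers); the field wiring is assumed:

```go
package main

import (
	"fmt"
	"time"

	"github.com/robfig/cron/v3"
)

// nextRun evaluates a cron schedule in an explicit time zone, which is the
// effect a timeZone field has on a ScheduledSparkApplication-style schedule:
// without it, the controller's local time applies.
func nextRun(schedule, timeZone string) (time.Time, error) {
	loc, err := time.LoadLocation(timeZone) // e.g. "Australia/Sydney"
	if err != nil {
		return time.Time{}, err
	}
	sched, err := cron.ParseStandard(schedule)
	if err != nil {
		return time.Time{}, err
	}
	return sched.Next(time.Now().In(loc)), nil
}

func main() {
	next, err := nextRun("0 2 * * *", "Australia/Sydney")
	if err != nil {
		panic(err)
	}
	fmt.Println("next run:", next)
}
```
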
Vikas Saxena 7668a1c551
Changing image repo from docker.io to ghcr.io (#2483)
* modified image.registry value

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* modified registry value to ghcr.io

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* modified image registry value to ghcr.io

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* modified image registry_value to ghcr.io

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* replaced docker.io with ghcr.io

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* updated container registry credentials

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* corrected registry username and password

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* changed image_repo value to include controller

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* removed unwanted space

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Vikas Saxena <90456542+vikas-saxena02@users.noreply.github.com>

* Update Makefile

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Vikas Saxena <90456542+vikas-saxena02@users.noreply.github.com>

* updated charts/spark-operator-chart/README.md by running make helm-docs

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

---------

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>
Signed-off-by: Vikas Saxena <90456542+vikas-saxena02@users.noreply.github.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
2025-03-30 09:13:30 +00:00
TJ Miller d089a43836
fix: add webhook cert validity checking (#2489)
Signed-off-by: TJ Miller <millert@us.ibm.com>
2025-03-27 05:08:22 +00:00
Jacob Salway 2c0cac198c
Upgrade to Spark 3.5.5 (#2490)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2025-03-26 11:22:03 +00:00
Vikas Saxena 5a1fc7ba16
Deprecating sparkctl (#2484)
* Added deprecation warning in cmd/sparkctl/main.go

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* added deprecation warning in Makefile

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* added recommendation to use kubectl in Makefile

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* updated README for sparkctl to show it's being deprecated

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

* adding deprecated comment as well to main.go

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>

---------

Signed-off-by: Vikas Saxena <Vikas.Saxena.2006@gmail.com>
2025-03-25 12:52:51 +00:00
dependabot[bot] 1768eb1ede
Bump sigs.k8s.io/controller-runtime from 0.20.1 to 0.20.4 (#2486)
Bumps [sigs.k8s.io/controller-runtime](https://github.com/kubernetes-sigs/controller-runtime) from 0.20.1 to 0.20.4.
- [Release notes](https://github.com/kubernetes-sigs/controller-runtime/releases)
- [Changelog](https://github.com/kubernetes-sigs/controller-runtime/blob/main/RELEASE.md)
- [Commits](https://github.com/kubernetes-sigs/controller-runtime/compare/v0.20.1...v0.20.4)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/controller-runtime
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:59:50 +00:00
dependabot[bot] 791d4ab9c8
Bump github.com/prometheus/client_golang from 1.20.5 to 1.21.1 (#2487)
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.20.5 to 1.21.1.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.20.5...v1.21.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:42:50 +00:00
dependabot[bot] c0403dd777
Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 (#2488)
Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify) from 1.9.0 to 1.10.0.
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.9.0...v1.10.0)

---
updated-dependencies:
- dependency-name: github.com/stretchr/testify
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-25 09:30:51 +00:00
Shu 6f4102d3c2
Add APRA AMCOS to adopters (#2485)
Signed-off-by: Shu <57744345+shuch3ng@users.noreply.github.com>
2025-03-23 15:51:05 +00:00
dependabot[bot] 9f69b3a922
Bump sigs.k8s.io/scheduler-plugins from 0.29.8 to 0.30.6 (#2444)
Bumps [sigs.k8s.io/scheduler-plugins](https://github.com/kubernetes-sigs/scheduler-plugins) from 0.29.8 to 0.30.6.
- [Release notes](https://github.com/kubernetes-sigs/scheduler-plugins/releases)
- [Changelog](https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/RELEASE.md)
- [Commits](https://github.com/kubernetes-sigs/scheduler-plugins/compare/v0.29.8...v0.30.6)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/scheduler-plugins
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 06:38:26 +00:00
dependabot[bot] 544a342702
Bump github.com/aws/aws-sdk-go-v2/config from 1.28.0 to 1.29.9 (#2463)
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.28.0 to 1.29.9.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/changelog-template.json)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.28.0...config/v1.29.9)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 05:36:26 +00:00
dependabot[bot] 1676651ca0
Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.66.0 to 1.78.2 (#2473)
Bumps [github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2) from 1.66.0 to 1.78.2.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/changelog-template.json)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/service/s3/v1.66.0...service/s3/v1.78.2)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/s3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 05:12:27 +00:00
dependabot[bot] 9a76f6ad80
Bump k8s.io/apimachinery from 0.32.0 to 0.32.3 (#2474)
Bumps [k8s.io/apimachinery](https://github.com/kubernetes/apimachinery) from 0.32.0 to 0.32.3.
- [Commits](https://github.com/kubernetes/apimachinery/compare/v0.32.0...v0.32.3)

---
updated-dependencies:
- dependency-name: k8s.io/apimachinery
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 04:48:27 +00:00
dependabot[bot] c90a93aadf
Bump github.com/containerd/containerd from 1.7.19 to 1.7.27 (#2476)
Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.7.19 to 1.7.27.
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](https://github.com/containerd/containerd/compare/v1.7.19...v1.7.27)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 03:51:27 +00:00
dependabot[bot] e70b01a087
Bump golang.org/x/net from 0.35.0 to 0.37.0 (#2472)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.35.0 to 0.37.0.
- [Commits](https://github.com/golang/net/compare/v0.35.0...v0.37.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 03:02:26 +00:00
dependabot[bot] 520f7a6fc8
Bump aquasecurity/trivy-action from 0.29.0 to 0.30.0 (#2475)
Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.29.0 to 0.30.0.
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](https://github.com/aquasecurity/trivy-action/compare/0.29.0...0.30.0)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 03:01:27 +00:00
Tan Qi 68e6c1d586
change env in executorSecretOption (#2467)
* change env in executorSecretOption

Signed-off-by: Qi Tan <16416018+TQJADE@users.noreply.github.com>

* Use spark.executorEnv instead

Signed-off-by: Qi Tan <16416018+TQJADE@users.noreply.github.com>

* Remove V2 and update SparkExecutorEnvTemplate

Signed-off-by: Qi Tan <16416018+TQJADE@users.noreply.github.com>

---------

Signed-off-by: Qi Tan <16416018+TQJADE@users.noreply.github.com>
2025-03-20 02:01:20 +00:00
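
The relevant Spark property, shown in a small sketch: `spark.executorEnv.*` is applied to executor processes, unlike the driver-only `spark.kubernetes.driverEnv.*`:

```go
package main

import "fmt"

// executorEnvConf renders an executor environment variable as a spark-submit
// property. Sketch of the idea behind #2467; the surrounding plumbing in
// executorSecretOption is not reproduced here.
func executorEnvConf(name, value string) string {
	return fmt.Sprintf("spark.executorEnv.%s=%s", name, value)
}

func main() {
	fmt.Println("--conf", executorEnvConf("AWS_REGION", "us-east-1"))
}
```
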
dependabot[bot] 092e41ad58
Bump golang.org/x/net from 0.35.0 to 0.36.0 (#2470)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.35.0 to 0.36.0.
- [Commits](https://github.com/golang/net/compare/v0.35.0...v0.36.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-19 06:25:45 +00:00
pvbouwel 3128c7f157
fix: webhook fails to add lifecycle to Spark 3 executor pods (#2458)
* bugfix: a lifecycle on a Spark 3 executor should not make the webhook fail

Before this fix, if a Spark 3.x spec gave the executor a lifecycle, the webhook would fail to identify the correct container, as described in issue [2457](https://github.com/kubeflow/spark-operator/issues/2457).

Signed-off-by: pvbouwel <463976+pvbouwel@users.noreply.github.com>

* tests: Add coverage for spark3 executor with a lifecycle

Signed-off-by: pvbouwel <463976+pvbouwel@users.noreply.github.com>

* make go-fmt

Signed-off-by: pvbouwel <463976+pvbouwel@users.noreply.github.com>

---------

Signed-off-by: pvbouwel <463976+pvbouwel@users.noreply.github.com>
2025-03-06 14:15:15 +00:00
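
A sketch of the container-identification logic such a fix needs; the names match Spark's documented defaults ("executor" in 2.x, "spark-kubernetes-executor" in 3.x), but treat the function itself as illustrative:

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// findExecutorContainer locates the executor's main container by name rather
// than assuming a fixed position, so extra fields such as a lifecycle on the
// container spec do not throw the lookup off.
func findExecutorContainer(pod *corev1.Pod) int {
	for i, c := range pod.Spec.Containers {
		if c.Name == "executor" || c.Name == "spark-kubernetes-executor" {
			return i
		}
	}
	return -1 // no recognizable executor container
}
```
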
Manabu McCloskey 939218c85f
add support for metrics-job-start-latency-buckets flag in helm (#2450)
Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-03-04 08:21:33 +00:00
Manabu McCloskey fc7c697c61
specify branch name in chart testing (#2451)
Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-03-01 01:04:29 +00:00
jbhalodia-slack 79264a4ac5
Support non-standard Spark container names (#2441)
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
2025-02-20 18:36:43 +00:00
jbhalodia-slack d10b8f5f3a
Make image optional (#2439)
* Make app.Spec.Driver.Image and app.Spec.Image optional

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Make app.Spec.Executor.Image optional

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
2025-02-20 04:03:42 +00:00
Manabu McCloskey bd197c6f8c
use cmd context in sparkctl (#2447)
Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-02-20 03:52:42 +00:00
Anish Asthana 405ae51de4
docs: Add information about KEP process (#2440)
Signed-off-by: Anish Asthana <anishasthana1@gmail.com>
2025-02-15 00:03:37 +00:00
Jacob Salway 25ca90cb07
Support Kubernetes 1.32 (#2416)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
Signed-off-by: Jacob Salway <jacob.salway@rokt.com>
2025-02-12 12:02:29 +00:00
dependabot[bot] 8892dd4b32
Bump golang.org/x/net from 0.32.0 to 0.35.0 (#2428)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.32.0 to 0.35.0.
- [Commits](https://github.com/golang/net/compare/v0.32.0...v0.35.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 10:26:29 +00:00
dependabot[bot] 30c15f2db5
Bump github.com/golang/glog from 1.2.2 to 1.2.4 (#2411)
Bumps [github.com/golang/glog](https://github.com/golang/glog) from 1.2.2 to 1.2.4.
- [Release notes](https://github.com/golang/glog/releases)
- [Commits](https://github.com/golang/glog/compare/v1.2.2...v1.2.4)

---
updated-dependencies:
- dependency-name: github.com/golang/glog
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 10:12:29 +00:00
dependabot[bot] 53b2292025
Bump golang.org/x/mod from 0.21.0 to 0.23.0 (#2427)
Bumps [golang.org/x/mod](https://github.com/golang/mod) from 0.21.0 to 0.23.0.
- [Commits](https://github.com/golang/mod/compare/v0.21.0...v0.23.0)

---
updated-dependencies:
- dependency-name: golang.org/x/mod
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 10:11:29 +00:00
dependabot[bot] d73e209ea2
Bump helm/chart-testing-action from 2.6.1 to 2.7.0 (#2391)
Bumps [helm/chart-testing-action](https://github.com/helm/chart-testing-action) from 2.6.1 to 2.7.0.
- [Release notes](https://github.com/helm/chart-testing-action/releases)
- [Commits](https://github.com/helm/chart-testing-action/compare/v2.6.1...v2.7.0)

---
updated-dependencies:
- dependency-name: helm/chart-testing-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 10:01:29 +00:00
dependabot[bot] 46ec8e64f4
Bump manusa/actions-setup-minikube from 2.13.0 to 2.13.1 (#2390)
Bumps [manusa/actions-setup-minikube](https://github.com/manusa/actions-setup-minikube) from 2.13.0 to 2.13.1.
- [Release notes](https://github.com/manusa/actions-setup-minikube/releases)
- [Commits](https://github.com/manusa/actions-setup-minikube/compare/v2.13.0...v2.13.1)

---
updated-dependencies:
- dependency-name: manusa/actions-setup-minikube
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-12 10:00:29 +00:00
Yi Chen 54fb0b0305
Controller should only be granted event permissions in spark job namespaces (#2426)
Signed-off-by: Yi Chen <github@chenyicn.net>
2025-02-12 09:57:29 +00:00
Manabu McCloskey 2995a0a963
ensure passed context is used (#2432)
Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-02-12 08:44:29 +00:00
Yi Chen 1f2cfbcae7
Add option for disabling leader election (#2423)
* Add option for disabling leader election

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove related RBAC rules when disabling leader election

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-02-10 23:15:28 +00:00
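
The knob maps onto controller-runtime's manager options roughly as follows; flag and lock names are illustrative:

```go
package main

import (
	"flag"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Illustrative flag; single-replica deployments can skip leader election
	// entirely, which also lets the related RBAC rules (leases) be dropped.
	enable := flag.Bool("leader-election", true, "enable leader election")
	flag.Parse()

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:          *enable,
		LeaderElectionID:        "spark-operator-lock", // illustrative lock name
		LeaderElectionNamespace: "spark-operator",
	})
	if err != nil {
		panic(err)
	}
	_ = mgr // controllers would be registered here, then mgr.Start(ctrl.SetupSignalHandler())
}
```
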
Yi Chen ae85466a52
Add helm unittest step to integration test workflow (#2424)
* chore: add helm unittest step to integration test workflow

Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: container security context unit test

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2025-02-09 03:27:04 +00:00
Manabu McCloskey a348b9218f
fix make deploy and install (#2412)
* fix make deploy and install

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* fix install-crd

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* build local image

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

---------

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-02-05 11:30:37 +00:00
Jacob Salway ad30d15bbb
Remove dependency on `k8s.io/kubernetes` (#2398)
* Remove dependency on `k8s.io/kubernetes`

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Make code compliant with Apache 2.0 license

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2025-01-28 15:36:51 +00:00
Manabu McCloskey 6e15770f83
add an example of using prometheus servlet (#2403)
* add an example of using prometheus servlet

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

* add k8s annotations

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>

---------

Signed-off-by: Manabu McCloskey <manabu.mccloskey@gmail.com>
2025-01-27 13:50:50 +00:00
Tarek Abouzeid b2411033f0
Adding seccompProfile RuntimeDefault (#2397)
* Adding seccompProfile RuntimeDefault

Signed-off-by: Tarek Abouzeid <tarek.abouzeid@teliacompany.com>

* updating helm docs

Signed-off-by: Tarek Abouzeid <tarek.abouzeid@teliacompany.com>

---------

Signed-off-by: Tarek Abouzeid <tarek.abouzeid@teliacompany.com>
2025-01-21 12:58:35 +00:00
hongshaoyang e6c2337e02
Add Ninja Van to adopters (#2377)
* Add Ninja Van to adopters

Signed-off-by: hongshaoyang <hongsy2006@gmail.com>

* Fix typo

Signed-off-by: hongshaoyang <hongsy2006@gmail.com>

---------

Signed-off-by: hongshaoyang <hongsy2006@gmail.com>
2025-01-09 03:23:21 +00:00
dependabot[bot] 92deff0be9
Bump golang.org/x/crypto from 0.30.0 to 0.31.0 (#2365)
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.30.0 to 0.31.0.
- [Commits](https://github.com/golang/crypto/compare/v0.30.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-18 16:55:10 +00:00
dependabot[bot] 7f5b5edea5
Bump golang.org/x/net from 0.30.0 to 0.32.0 (#2350)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.30.0 to 0.32.0.
- [Commits](https://github.com/golang/net/compare/v0.30.0...v0.32.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-18 16:28:11 +00:00
Chaoran Yu 413d05eb22
Promoted jacobsalway from reviewer to approver (#2361)
Discussed offline on Slack and during community Zoom call today with fellow maintainers

Signed-off-by: Chaoran Yu <yuchaoran2011@gmail.com>
2024-12-15 14:55:08 +00:00
Yi Chen 71821733a6
Add changelog for v2.1.0 (#2355)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-12-11 09:58:04 +00:00
Yi Chen 2375a306f9
Move sparkctl to cmd directory (#2347)
* Move spark-operator

Signed-off-by: Yi Chen <github@chenyicn.net>

* Move sparkctl to cmd directory

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove unnecessary app package/directory

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-12-06 05:08:00 +00:00
Aakcht 5dd91c4bf2
Use NSS_WRAPPER_PASSWD instead of /etc/passwd in the spark-operator image entrypoint.sh (#2312)
Signed-off-by: Aakcht <aakcht@gmail.com>
2024-12-04 13:07:00 +00:00
Thomas Newton d815e78c21
Robustness to driver pod taking time to create (#2315)
* Retry after driver pod not found if recent submission

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add a test

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Make grace period configurable

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update test

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add an extra test with the driver pod

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Separate context to create and delete the driver pod

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Autoformat

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update error message

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add helm parameter

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update internal/controller/sparkapplication/controller.go

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Newlines between helm tests

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

---------

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
2024-12-04 12:58:59 +00:00
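
A hedged sketch of the grace-period check: when the driver pod is not found but the submission is recent, requeue rather than fail (assumes the `LastSubmissionAttemptTime` status field; helper name invented):

```go
package sketch

import (
	"time"

	ctrl "sigs.k8s.io/controller-runtime"

	// Assumed import path for the API types.
	"github.com/kubeflow/spark-operator/v2/api/v1beta2"
)

// requeueIfDriverPending: when the driver pod is not found yet but submission
// happened within the grace period, requeue instead of failing the app, since
// the API server may simply not have created the pod yet.
func requeueIfDriverPending(app *v1beta2.SparkApplication, gracePeriod time.Duration) (ctrl.Result, bool) {
	elapsed := time.Since(app.Status.LastSubmissionAttemptTime.Time)
	if elapsed < gracePeriod {
		return ctrl.Result{RequeueAfter: gracePeriod - elapsed}, true
	}
	return ctrl.Result{}, false
}
```
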
C. H. Afzal a261523144
The webhook-key-name command-line param isn't taking effect (#2344)
Signed-off-by: C. H. Afzal <c-h-afzal@outlook.com>
2024-12-04 09:18:01 +00:00
dependabot[bot] 40423d5501
Bump github.com/onsi/ginkgo/v2 from 2.20.2 to 2.22.0 (#2335)
Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.20.2 to 2.22.0.
- [Release notes](https://github.com/onsi/ginkgo/releases)
- [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/ginkgo/compare/v2.20.2...v2.22.0)

---
updated-dependencies:
- dependency-name: github.com/onsi/ginkgo/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-04 09:12:59 +00:00
dependabot[bot] 270b09e4c7
Bump aquasecurity/trivy-action from 0.28.0 to 0.29.0 (#2332)
Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.28.0 to 0.29.0.
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](https://github.com/aquasecurity/trivy-action/compare/0.28.0...0.29.0)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-04 09:11:00 +00:00
Jacob Salway 43c1888c9d
Truncate UI service name if over 63 characters (#2311)
* Truncate UI service name if over 63 characters

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Also truncate ingress name

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-11-18 14:17:23 +00:00
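
The 63-character cap comes from Kubernetes' RFC 1035 name limit for Services; a minimal sketch (note that plain prefix truncation can collide for names sharing a long prefix, which real code may want to mitigate with a hash suffix):

```go
package main

import "fmt"

// serviceNameMax is the RFC 1035 label limit that Service names must satisfy.
const serviceNameMax = 63

// truncateName caps a generated Service/Ingress name so applications with
// long names still produce valid objects.
func truncateName(name string) string {
	if len(name) <= serviceNameMax {
		return name
	}
	return name[:serviceNameMax]
}

func main() {
	fmt.Println(truncateName("a-very-long-spark-application-name-that-overflows-the-kubernetes-limit-ui-svc"))
}
```
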
Jacob Salway 22e4fb8e48
Bump `volcano.sh/apis` to 1.10.0 (#2320)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-11-11 11:35:16 +00:00
Cian (Keen) Gallagher 2999546dc6
Fix: should not add emptyDir sizeLimit conf on executor pods if it is nil (#2316)
Signed-off-by: Cian Gallagher <cian@ciangallagher.net>
2024-11-11 02:13:15 +00:00
Nicholas Gretzon 72107fd7b8
Allow the Controller and Webhook Containers to run with the securityContext readOnlyRootFilesystem: true (#2282)
* create a tmp dir for the controller to write Spark artifacts to and set the controller to readOnlyRootFilesystem

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* mount a dir for the webhook container to generate its certificates in and set readOnlyRootFilesystem: true for the webhook pod

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* update the securityContext in the controller deployment test

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* update securityContext of the webhook container in the deployment_test

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* update README

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* remove -- so comments are not rendered in the README.md

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* recreate README.md after removal of comments for volumes and volumeMounts

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* make indentation for volumes and volumeMounts consistent with rest of values.yaml

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* Revert "make indentation for volumes and volumeMounts consistent with rest of values.yaml"

This reverts commit dba97fc3d9458e5addfff79d021d23b30938cbb9.

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* fix indentation in webhook and controller deployment templates for volumes and volumeMounts

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>

* Update charts/spark-operator-chart/templates/controller/deployment.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>

* Update charts/spark-operator-chart/templates/controller/deployment.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>

* Update charts/spark-operator-chart/templates/webhook/deployment.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>

* Update charts/spark-operator-chart/templates/webhook/deployment.yaml

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>

* add additional securityContext to the controller deployment_test.yaml

Signed-off-by: Nick Gretzon <npgretz@gmail.com>

---------

Signed-off-by: Nick Gretzon <npgretz@gmail.com>
Signed-off-by: Nicholas Gretzon <50811947+npgretz@users.noreply.github.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
2024-11-07 03:10:12 +00:00
Yi Chen 763682dfe6
Fix: should not add emptyDir sizeLimit conf if it is nil (#2305)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-11-04 11:17:15 +00:00
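
The two sizeLimit fixes (#2305 for the driver, #2316 for executors) share one idea, sketched here with an illustrative property layout: only emit the option when the quantity is actually set:

```go
package sketch

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// emptyDirConf renders spark.kubernetes.*.volumes options for an emptyDir.
// The nil guard is the point of the fix: emit the sizeLimit option only when
// it is set, so spark-submit never receives an empty or bogus value.
func emptyDirConf(role, volumeName string, emptyDir *corev1.EmptyDirVolumeSource) []string {
	prefix := fmt.Sprintf("spark.kubernetes.%s.volumes.emptyDir.%s", role, volumeName)
	conf := []string{fmt.Sprintf("%s.mount.path=/tmp/%s", prefix, volumeName)}
	if emptyDir.SizeLimit != nil {
		conf = append(conf, fmt.Sprintf("%s.options.sizeLimit=%s", prefix, emptyDir.SizeLimit.String()))
	}
	return conf
}
```
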
Yi Chen 171e429706
Fix: executor container security context does not work (#2306)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-11-04 11:03:15 +00:00
Aran Shavit 515d805b8a
Allow setting automountServiceAccountToken (#2298)
* Allow setting automountServiceAccountToken on workloads and serviceAccounts

Signed-off-by: Aran Shavit <Aranshavit@gmail.com>

* update helm docs

Signed-off-by: Aran Shavit <Aranshavit@gmail.com>

---------

Signed-off-by: Aran Shavit <Aranshavit@gmail.com>
2024-11-04 07:50:16 +00:00
Yi Chen 85f0ed039d
Update issue and pull request templates (#2287)
* Update issue templates

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update pull request template

Signed-off-by: Yi Chen <github@chenyicn.net>

* Referring to the operator as the Spark operator

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add link to the Spark operator slack channel

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-26 00:08:31 +00:00
Yi Chen 1e864c8b91
Add workflow for releasing sparkctl binary (#2264)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-24 08:27:30 +00:00
Yi Chen d0daf2fd17
Support pod template for Spark 3.x applications (#2141)
* Update API definition to support pod template

Signed-off-by: Yi Chen <github@chenyicn.net>

* Mark pod template field as schemaless

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add kubebuilder marker to preserve unknown fields

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add example for using pod template

Signed-off-by: Yi Chen <github@chenyicn.net>

* Support pod template

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-24 02:23:30 +00:00
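
A sketch of how a pod template field can be surfaced on the CRD; the struct fragment below illustrates the "schemaless" and "preserve unknown fields" bullets rather than quoting the actual API types:

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// SparkPodSpec fragment. The kubebuilder markers keep the full
// PodTemplateSpec out of the generated OpenAPI schema (it is too large and
// recursive to embed) while telling the API server not to prune its
// unknown fields.
type SparkPodSpec struct {
	// Template is a pod template the operator merges into driver/executor pods.
	// +kubebuilder:validation:Schemaless
	// +kubebuilder:pruning:PreserveUnknownFields
	Template *corev1.PodTemplateSpec `json:"template,omitempty"`
}
```
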
Yi Chen fd2e1251d8
Update default container security context (#2265)
* Update default container security context

Signed-off-by: Yi Chen <github@chenyicn.net>

* Push user and group directives into Dockerfile

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add allowPrivilegeEscalation to container security context

Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: fsGroup should be moved to pod security context

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-24 01:52:30 +00:00
Thomas Newton 735c7fc9e5
Fix retries (#2241)
* Attempt to requeue after correct period

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Syntactically correct

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* I think correct requeueing

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Same treatment for the other retries

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Requeue after deleting resources

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Try to fix submission status updates

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Correct usage of submitSparkApplication

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix error logging

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Bring back ExecutionAttempts increment that I forgot about

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Log after reconcile complete

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix setting submission ID

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy logging

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update comment

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Start a new test

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Working "Fails submission and retries until retries are exhausted" test

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add "Application fails and retries until retries are exhausted" test

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Comments

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Move fail configs out of the examples directory

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix lint

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Move TimeUntilNextRetryDue to `pkg/util/sparkapplication.go`

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update internal/controller/sparkapplication/controller.go

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Update test/e2e/sparkapplication_test.go

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* camelCase

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* make go-fmt

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* PR comments

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

---------

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
2024-10-23 13:13:30 +00:00
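
The core of the retry fix above is requeueing after the remaining wait rather than immediately. A sketch of the idea behind `TimeUntilNextRetryDue`, with an assumed signature (the real helper in `pkg/util/sparkapplication.go` works off the application status):

```go
package main

import (
	"fmt"
	"time"
)

// timeUntilNextRetryDue computes how long until the configured retry
// interval has elapsed since the last failure. Illustrative only.
func timeUntilNextRetryDue(lastFailure time.Time, retryInterval time.Duration) (time.Duration, error) {
	if retryInterval <= 0 {
		return 0, fmt.Errorf("retry interval must be positive, got %v", retryInterval)
	}
	remaining := retryInterval - time.Since(lastFailure)
	if remaining < 0 {
		remaining = 0 // retry is already due
	}
	return remaining, nil
}

func main() {
	due, err := timeUntilNextRetryDue(time.Now().Add(-30*time.Second), 2*time.Minute)
	if err != nil {
		panic(err)
	}
	// A controller-runtime reconciler would return this as
	// ctrl.Result{RequeueAfter: due} instead of requeueing immediately.
	fmt.Println("requeue after:", due)
}
```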
dependabot[bot] d130b08fa5
Bump github.com/aws/aws-sdk-go-v2/config from 1.27.43 to 1.28.0 (#2272)
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.27.43 to 1.28.0.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.27.43...v1.28.0)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-23 12:23:30 +00:00
dependabot[bot] 117d5f05ef
Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.65.3 to 1.66.0 (#2271)
Bumps [github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2) from 1.65.3 to 1.66.0.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/service/s3/v1.65.3...service/s3/v1.66.0)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/s3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-23 11:46:30 +00:00
Jacob Salway 345d611810
Allow --ingress-class-name to be specified in chart (#2278)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-10-23 09:41:29 +00:00
dependabot[bot] 641007cf08
Bump aquasecurity/trivy-action from 0.27.0 to 0.28.0 (#2270)
Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.27.0 to 0.28.0.
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](https://github.com/aquasecurity/trivy-action/compare/0.27.0...0.28.0)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-22 14:00:29 +00:00
Jacob Salway 9f83e2a87a
Run e2e tests with Kubernetes version matrix (#2266)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-10-22 06:19:29 +00:00
dependabot[bot] f56ba30d5c
Bump cloud.google.com/go/storage from 1.44.0 to 1.45.0 (#2273)
Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.44.0 to 1.45.0.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](https://github.com/googleapis/google-cloud-go/compare/pubsub/v1.44.0...spanner/v1.45.0)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/storage
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-22 03:03:29 +00:00
dependabot[bot] 2a9b278318
Bump github.com/prometheus/client_golang from 1.20.4 to 1.20.5 (#2274)
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.20.4 to 1.20.5.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.20.4...v1.20.5)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-22 02:43:28 +00:00
dependabot[bot] d0f62854b3
Bump helm.sh/helm/v3 from 3.16.1 to 3.16.2 (#2275)
Bumps [helm.sh/helm/v3](https://github.com/helm/helm) from 3.16.1 to 3.16.2.
- [Release notes](https://github.com/helm/helm/releases)
- [Commits](https://github.com/helm/helm/compare/v3.16.1...v3.16.2)

---
updated-dependencies:
- dependency-name: helm.sh/helm/v3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-22 02:04:28 +00:00
Jacob Salway 1509b341d4
Add release badge to README (#2263)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-10-18 11:33:20 +00:00
Thomas Newton 5ff8dcf350
`omitempty` corrections (#2255)
* Still working on tests

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Maybe progress

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* First working validation

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Lots of cleanup needed but it actually reproduced

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Working but ugly get schema from CRD

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Satisfactory test

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add missing omitempty for optional values

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Remove omitempty on required fields

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Run update-crd

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Remove temp schema config

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Tidy

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* go import

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Cover more test cases

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add tests that spec and metadata are required

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add tests against error content

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* `go mod tidy`

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Fix lint

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Remove test - hopefully we can add a better test as a follow up

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Make `mainApplicationFile` required

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Regenerated api-docs

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

* Add `MainApplicationFile` in tests

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>

---------

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
2024-10-18 11:06:20 +00:00
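
The `omitempty` rules that #2255 enforces, in miniature: optional fields are pointers tagged `omitempty`, while required fields such as `mainApplicationFile` drop the tag so they always serialize and the generated schema can mark them required. These structs are a stand-in, not the actual v1beta2 API types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sparkAppSpec is an illustrative CRD-style struct.
type sparkAppSpec struct {
	// Required: no omitempty, so the field is always present in JSON.
	MainApplicationFile string `json:"mainApplicationFile"`
	// Optional: pointer plus omitempty keeps it out of the JSON when unset.
	MemoryOverheadFactor *string `json:"memoryOverheadFactor,omitempty"`
}

func main() {
	b, err := json.Marshal(sparkAppSpec{MainApplicationFile: "local:///opt/spark/examples.jar"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // {"mainApplicationFile":"local:///opt/spark/examples.jar"}
}
```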
Roberto Devesa 1491550391
Make sure enable-ui-service flag is set to false when controller.uiService.enable is set to false (#2261)
Signed-off-by: Roberto Devesa <15369573+Roberdvs@users.noreply.github.com>
2024-10-18 00:54:20 +00:00
dependabot[bot] ff2b26b669
Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.63.3 to 1.65.3 (#2249)
Bumps [github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2) from 1.63.3 to 1.65.3.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/service/s3/v1.63.3...service/s3/v1.65.3)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/s3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-17 16:54:21 +00:00
Jacob Salway 1da78d6d3d
Add Quick Start guide to README (#2259)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-10-17 16:25:20 +00:00
dependabot[bot] 0165798e04
Bump gocloud.dev from 0.39.0 to 0.40.0 (#2250)
Bumps [gocloud.dev](https://github.com/google/go-cloud) from 0.39.0 to 0.40.0.
- [Release notes](https://github.com/google/go-cloud/releases)
- [Commits](https://github.com/google/go-cloud/compare/v0.39.0...v0.40.0)

---
updated-dependencies:
- dependency-name: gocloud.dev
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-16 07:16:18 +00:00
dependabot[bot] 85d8901f39
Bump aquasecurity/trivy-action from 0.24.0 to 0.27.0 (#2248)
Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.24.0 to 0.27.0.
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](https://github.com/aquasecurity/trivy-action/compare/0.24.0...0.27.0)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-16 06:55:19 +00:00
bo a1de26dd31
feat: support archives param for spark-submit (#2256)
Signed-off-by: kaka-zb <sin19990111@gmail.com>
2024-10-16 06:20:18 +00:00
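
`--archives` is a standard spark-submit flag. The helper below is a hypothetical sketch of how `spec.archives` entries could be folded into the submission arguments; it is not the operator's actual submission code.

```go
package main

import (
	"fmt"
	"strings"
)

// buildArchivesArgs joins archive URIs into a single --archives flag,
// the form spark-submit expects. Hypothetical helper.
func buildArchivesArgs(archives []string) []string {
	if len(archives) == 0 {
		return nil
	}
	return []string{"--archives", strings.Join(archives, ",")}
}

func main() {
	args := buildArchivesArgs([]string{"hdfs:///data/env.tar.gz#env", "s3a://bucket/deps.zip"})
	fmt.Println(strings.Join(args, " "))
	// Output: --archives hdfs:///data/env.tar.gz#env,s3a://bucket/deps.zip
}
```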
dependabot[bot] b1092e01f9
Bump golang.org/x/net from 0.29.0 to 0.30.0 (#2251)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.29.0 to 0.30.0.
- [Commits](https://github.com/golang/net/compare/v0.29.0...v0.30.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 16:38:18 +00:00
dependabot[bot] 73a095094a
Bump manusa/actions-setup-minikube from 2.12.0 to 2.13.0 (#2247)
Bumps [manusa/actions-setup-minikube](https://github.com/manusa/actions-setup-minikube) from 2.12.0 to 2.13.0.
- [Release notes](https://github.com/manusa/actions-setup-minikube/releases)
- [Commits](https://github.com/manusa/actions-setup-minikube/compare/v2.12.0...v2.13.0)

---
updated-dependencies:
- dependency-name: manusa/actions-setup-minikube
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 16:12:18 +00:00
dependabot[bot] 60abc2ae44
Bump github.com/aws/aws-sdk-go-v2/config from 1.27.42 to 1.27.43 (#2252)
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.27.42 to 1.27.43.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.27.42...config/v1.27.43)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 15:48:18 +00:00
tcassaert f75097449d
Add permissions to controller serviceaccount to list and watch ingresses (#2246)
Signed-off-by: Thomas Cassaert <tcassaert@inuits.eu>
2024-10-14 10:40:17 +00:00
Yi Chen 3acd0f1a90
remove redundant test.sh file (#2243)
* Fix go lint error

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove test.sh

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-14 01:23:16 +00:00
Guangyu Yang 8342f2cb3f
Added off heap memory to calculation for YuniKorn gang scheduling (#2209)
Signed-off-by: Guangyu Yang <guangyu.yang@rokt.com>
2024-10-14 00:26:15 +00:00
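
The gang-scheduling math that #2209 adjusts: the placeholder pod's memory ask should cover executor memory, memory overhead, and off-heap memory (#2178, further down the log, applies the same idea to spark.executor.pyspark.memory). A sketch using k8s.io/apimachinery quantities; the helper and values are illustrative.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// executorMemoryAsk sums the memory components an executor placeholder
// should reserve. Illustrative helper, not the operator's calculation.
func executorMemoryAsk(memory, overhead, offHeap string) *resource.Quantity {
	total := resource.MustParse(memory)
	total.Add(resource.MustParse(overhead))
	total.Add(resource.MustParse(offHeap))
	return &total
}

func main() {
	// 4Gi heap + 1Gi overhead + 2Gi spark.memory.offHeap.size = 7Gi
	fmt.Println(executorMemoryAsk("4Gi", "1Gi", "2Gi").String())
}
```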
Thomas Newton 0607a6dc06
Minor fixes to e2e test `make` targets (#2242)
Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
2024-10-13 16:29:16 +00:00
Jacob Salway 718e2444a4
Upgrade to Spark 3.5.3 (#2202)
* Upgrade to Spark 3.5.3

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Check result of err

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-10-13 15:43:15 +00:00
Sébastien Maintrot a8b5d644b5
implement an upper bound limit to the number of tracked executors (#2181)
* implement an upper bound limit to the number of tracked executors

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

* add upper bound limit to the number of tracked executors to helm chart

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

---------

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
2024-10-11 05:54:10 +00:00
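
A sketch of what an upper bound on tracked executors means in practice: once the cap is reached, further executor IDs are no longer recorded in the application status, keeping the status object bounded. The constant mirrors the new flag in spirit only; its name and default are assumptions.

```go
package main

import "fmt"

const maxTrackedExecutors = 1000 // assumed default, not the actual flag value

// trackExecutor records an executor's phase unless the cap is hit.
func trackExecutor(state map[string]string, id, phase string) bool {
	if _, seen := state[id]; !seen && len(state) >= maxTrackedExecutors {
		return false // over the cap: skip tracking to bound status size
	}
	state[id] = phase
	return true
}

func main() {
	state := map[string]string{}
	fmt.Println(trackExecutor(state, "exec-1", "RUNNING")) // true
}
```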
Yi Chen cc57f1cc41
Add changelog for v2.0.2 (#2239)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-11 02:50:10 +00:00
Yi Chen c75d99f65b
Add check for generating manifests and code (#2234)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-10 11:17:09 +00:00
Yi Chen 247e834456
fix: webhook panics due to logging (#2232)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-10 09:40:09 +00:00
Yi Chen dee62b8d68
Add @ImpSy as reviewer (#2235)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-10 01:47:09 +00:00
Jacob Salway ac761ef511
Remove `cap_net_bind_service` from image (#2216)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-10-08 11:24:07 +00:00
dependabot[bot] fe833fa127
Bump github.com/prometheus/client_golang from 1.19.1 to 1.20.4 (#2204)
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.19.1 to 1.20.4.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.19.1...v1.20.4)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 06:29:08 +00:00
dependabot[bot] 9be8dceb48
Bump github.com/aws/aws-sdk-go-v2/config from 1.27.33 to 1.27.42 (#2231)
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.27.33 to 1.27.42.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.27.33...config/v1.27.42)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 03:34:07 +00:00
Nick Tan d07821bcba
fix: spark-submission failed due to lack of permissions for user `spark` (#2223)
error: Exception in thread "main" java.io.FileNotFoundException: /home/spark/.ivy2/cache/resolved-org.apache.spark-spark-submit-parent-511288aa-ce7c-4a38-9c8e-4869b71c68fa-1.0.xml (No such file or directory)

Signed-off-by: xuqingtan <missedone@gmail.com>
2024-10-08 03:19:07 +00:00
Nick Tan 7fb14e629e
fix: imagePullPolicy was ignored (#2222)
Signed-off-by: xuqingtan <missedone@gmail.com>
2024-10-08 02:41:07 +00:00
dependabot[bot] 29ba4e72b0
Bump golang.org/x/time from 0.6.0 to 0.7.0 (#2227)
Bumps [golang.org/x/time](https://github.com/golang/time) from 0.6.0 to 0.7.0.
- [Commits](https://github.com/golang/time/compare/v0.6.0...v0.7.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 02:29:07 +00:00
dependabot[bot] 4358fd49bb
Bump manusa/actions-setup-minikube from 2.11.0 to 2.12.0 (#2226)
Bumps [manusa/actions-setup-minikube](https://github.com/manusa/actions-setup-minikube) from 2.11.0 to 2.12.0.
- [Release notes](https://github.com/manusa/actions-setup-minikube/releases)
- [Commits](https://github.com/manusa/actions-setup-minikube/compare/v2.11.0...v2.12.0)

---
updated-dependencies:
- dependency-name: manusa/actions-setup-minikube
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 02:28:06 +00:00
dependabot[bot] 254200977b
Bump cloud.google.com/go/storage from 1.43.0 to 1.44.0 (#2228)
Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.43.0 to 1.44.0.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](https://github.com/googleapis/google-cloud-go/compare/pubsub/v1.43.0...spanner/v1.44.0)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/storage
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 02:26:07 +00:00
dependabot[bot] a4dcfcb328
Bump github.com/aws/aws-sdk-go-v2 from 1.31.0 to 1.32.0 (#2229)
Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.31.0 to 1.32.0.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.31.0...v1.32.0)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-08 02:15:07 +00:00
Yi Chen 143b16ee75
Update integration test workflow and add golangci lint check (#2197)
* Update integration test workflow

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update golangci lint config

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-10-08 02:06:07 +00:00
dependabot[bot] 1972fb75d2
Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.58.3 to 1.63.3 (#2206)
Bumps [github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2) from 1.58.3 to 1.63.3.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/service/s3/v1.58.3...service/s3/v1.63.3)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/s3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 08:53:39 +00:00
dependabot[bot] 316536f7b5
Bump github.com/docker/docker from 27.0.3+incompatible to 27.1.1+incompatible (#2125)
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 27.0.3+incompatible to 27.1.1+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v27.0.3...v27.1.1)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-29 14:20:37 +00:00
dependabot[bot] 6106178c5f
Bump golang.org/x/net from 0.28.0 to 0.29.0 (#2205)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.28.0 to 0.29.0.
- [Commits](https://github.com/golang/net/compare/v0.28.0...v0.29.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-29 14:12:38 +00:00
dependabot[bot] faa0822ad0
Bump github.com/aws/aws-sdk-go-v2 from 1.30.5 to 1.31.0 (#2207)
Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.30.5 to 1.31.0.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.30.5...v1.31.0)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-29 13:49:37 +00:00
Jacob Salway b50703737b
Revert "Group monthly dependabot updates for gomod (#2190)" (#2203)
This reverts commit bae7c27e10.

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-09-29 13:29:38 +00:00
Jacob Salway 56b4974310
Fix ingress capability discovery (#2201)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-09-29 07:07:37 +00:00
Sébastien Maintrot d37a0e938a
FEATURE: add cli argument to modify controller workqueue ratelimiter (#2186)
* add cli argument to modify controller workqueue ratelimiter

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

* add cli argument to modify controller workqueue ratelimiter support to helm chart

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

---------

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
2024-09-29 01:47:37 +00:00
ha2hi 01278834a4
Add yaml file to create serviceaccount (#2191)
* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* create service account

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* create service account

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

---------

Signed-off-by: HyukSangCho <a01045542949@gmail.com>
2024-09-28 07:13:37 +00:00
Yi Chen 443843d683
Add changelog for v2.0.1 and update README (#2195)
* Add changelog for v2.0.0

Signed-off-by: Yi Chen <github@chenyicn.net>

* Bump VERSION to v2.0.1

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update version matrix for v2.0.x

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add integration test status badge

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-27 06:17:35 +00:00
Jacob Salway bae7c27e10
Group monthly dependabot updates for gomod (#2190)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-09-27 05:37:35 +00:00
Yi Chen 73caefd0d3
Update controller RBAC for ConfigMap and PersistentVolumeClaim (#2187)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-24 08:06:33 +00:00
dependabot[bot] 68e3d9cac5
Bump github.com/onsi/gomega from 1.33.1 to 1.34.2 (#2189)
Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.33.1 to 1.34.2.
- [Release notes](https://github.com/onsi/gomega/releases)
- [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/gomega/compare/v1.33.1...v1.34.2)

---
updated-dependencies:
- dependency-name: github.com/onsi/gomega
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 03:16:32 +00:00
dependabot[bot] 4df6df21ae
Bump github.com/onsi/ginkgo/v2 from 2.19.0 to 2.20.2 (#2188)
Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.19.0 to 2.20.2.
- [Release notes](https://github.com/onsi/ginkgo/releases)
- [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/ginkgo/compare/v2.19.0...v2.20.2)

---
updated-dependencies:
- dependency-name: github.com/onsi/ginkgo/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-24 02:54:32 +00:00
Sébastien Maintrot e2cc295204
FEATURE: build operator image as non-root (#2171)
Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
2024-09-23 03:18:31 +00:00
Yi Chen c855ee4c8b
Fix: spark application does not respect time to live seconds (#2165)
* Add time to live seconds example spark application

Signed-off-by: Yi Chen <github@chenyicn.net>

* fix: spark application does not respect time to live seconds

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-19 12:40:29 +00:00
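
`spec.timeToLiveSeconds` marks a terminated SparkApplication as eligible for garbage collection once enough time has passed. A sketch of the condition the fix makes effective, with illustrative parameter names:

```go
package main

import (
	"fmt"
	"time"
)

// isTTLExpired reports whether a terminated application has outlived its
// time-to-live. Illustrative only.
func isTTLExpired(terminationTime time.Time, ttlSeconds int64) bool {
	if ttlSeconds <= 0 || terminationTime.IsZero() {
		return false // TTL not configured, or app not terminated yet
	}
	return time.Since(terminationTime) > time.Duration(ttlSeconds)*time.Second
}

func main() {
	terminated := time.Now().Add(-2 * time.Hour)
	fmt.Println(isTTLExpired(terminated, 3600)) // true: 2h old, 1h TTL
}
```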
Jacob Salway a2f71c6137
Account for spark.executor.pyspark.memory in Yunikorn gang scheduling (#2178)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-09-19 08:59:28 +00:00
tcassaert ed3226ebe7
Add specific error in log line when failed to create web UI service (#2170)
* Add specific error in log line when failed to create web UI service

Signed-off-by: tcassaert <tcassaert@inuits.eu>

* Update log to reflect correct resource that could not be created

Co-authored-by: Yi Chen <github@chenyicn.net>
Signed-off-by: tcassaert <tcassaert@protonmail.com>

---------

Signed-off-by: tcassaert <tcassaert@inuits.eu>
Signed-off-by: tcassaert <tcassaert@protonmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
2024-09-19 08:11:28 +00:00
Sébastien Maintrot 59a8ca4493
implement workflow to scan latest released docker image (#2177)
Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
2024-09-18 15:15:28 +00:00
dependabot[bot] f3f80d49b1
Bump helm.sh/helm/v3 from 3.15.3 to 3.16.1 (#2173)
Bumps [helm.sh/helm/v3](https://github.com/helm/helm) from 3.15.3 to 3.16.1.
- [Release notes](https://github.com/helm/helm/releases)
- [Commits](https://github.com/helm/helm/compare/v3.15.3...v3.16.1)

---
updated-dependencies:
- dependency-name: helm.sh/helm/v3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 01:51:26 +00:00
dependabot[bot] b81833246f
Bump github.com/aws/aws-sdk-go-v2/config from 1.27.27 to 1.27.33 (#2174)
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.27.27 to 1.27.33.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.27.27...config/v1.27.33)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-18 01:47:26 +00:00
Sébastien Maintrot cbfefd57bb
fix the make kind-delete-cluster target to avoid accidental kubeconfig deletion (#2172)
Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
2024-09-18 01:44:26 +00:00
Sébastien Maintrot 75b926652b
Feature: Add pprof endpoint (#2164)
* add pprof support to the operator Controller Manager

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

* add pprof support to helm chart

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>

---------

Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
2024-09-16 10:53:25 +00:00
ha2hi 9f0c08a65e
Upgrade to Spark 3.5.2 (#2012) (#2157)
* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

* Upgrade to Spark 3.5.2

Signed-off-by: HyukSangCho <a01045542949@gmail.com>

---------

Signed-off-by: HyukSangCho <a01045542949@gmail.com>
2024-09-13 15:14:23 +00:00
Sébastien Maintrot 6680f356b7
replace datamechanics -> spot.io as adopter (acquisition) (#2168)
Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
2024-09-13 15:07:23 +00:00
tcassaert eb48b349a1
fix: The logger had an odd number of arguments, making it panic (#2166)
Signed-off-by: tcassaert <tcassaert@inuits.eu>
2024-09-13 09:24:23 +00:00
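
The panic fixed here comes from structured logging's key/value contract: after the message, arguments must arrive in (key, value) pairs, and an odd count can make strict sinks panic. A minimal illustration of the invariant, not the operator's logging code:

```go
package main

import "fmt"

// checkKeysAndValues enforces the even-pairing rule that logr-style
// loggers expect after the message argument.
func checkKeysAndValues(keysAndValues ...interface{}) error {
	if len(keysAndValues)%2 != 0 {
		return fmt.Errorf("odd number of key-value args: %d", len(keysAndValues))
	}
	return nil
}

func main() {
	// Buggy shape: a lone key with no value, like logger.Info("created", "name").
	fmt.Println(checkKeysAndValues("name"))
	// Fixed shape: every key has a value.
	fmt.Println(checkKeysAndValues("name", "spark-pi"))
}
```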
Yi Chen 7785107ec5
fix: webhook not working when settings spark job namespaces to empty (#2163)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-13 05:06:22 +00:00
Yi Chen e6a7805079
Update e2e tests (#2161)
* Add sleep buffer to ensure the webhooks are ready before running the e2e tests

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove duplicate operator image build tasks

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update e2e tests

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update examples

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-13 03:24:22 +00:00
dependabot[bot] e58023b90d
Bump gocloud.dev from 0.37.0 to 0.39.0 (#2160)
Bumps [gocloud.dev](https://github.com/google/go-cloud) from 0.37.0 to 0.39.0.
- [Release notes](https://github.com/google/go-cloud/releases)
- [Commits](https://github.com/google/go-cloud/compare/v0.37.0...v0.39.0)

---
updated-dependencies:
- dependency-name: gocloud.dev
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 06:37:19 +00:00
Kevinz 6ae1b2f69c
feat: support driver and executor pods using different priorities (#2146)
* feat: support driver and executor pods using different priorities

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: if *app.Spec.Driver.PriorityClassName and *app.Spec.Executor.PriorityClassName are explicitly defined, they take precedence over spec.batchSchedulerOptions.priorityClassName

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: merge the logic of setPodPriorityClassName into addPriorityClassName

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: support driver and executor pods using different priorities

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* feat: if *app.Spec.Driver.PriorityClassName and *app.Spec.Executor.PriorityClassName are explicitly defined, they take precedence over spec.batchSchedulerOptions.priorityClassName

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* feat: merge the logic of setPodPriorityClassName into addPriorityClassName

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* feat: adjust the pointer if it is nil

Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* feat: remove the spec.batchSchedulerOptions.priorityClassName definition, split driver and executor pod priorityClass

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: remove the spec.batchSchedulerOptions.priorityClassName definition, split driver and executor pod priorityClass

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* feat: Optimize code to avoid null pointer exceptions

Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* fix: remove backup crd files

Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>

* fix: remove BatchSchedulerOptions.PriorityClassName test code

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

* fix: add driver and executor pod priorityClassName test code

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>

---------

Signed-off-by: Kevin Wu <kevin.wu@momenta.ai>
Signed-off-by: Kevin.Wu <kevin.wu@momenta.ai>
Co-authored-by: Kevin Wu <kevin.wu@momenta.ai>
2024-09-10 06:27:19 +00:00
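
A sketch of the precedence rule described in the intermediate commits above (the final commits remove the batch-scheduler-level field entirely): a driver- or executor-level priorityClassName wins over the scheduler-wide one. Types are illustrative stand-ins for the operator's API structs.

```go
package main

import "fmt"

type podSpec struct {
	PriorityClassName *string // nil means "not set"
}

// effectivePriorityClass prefers the per-pod value over the
// batch-scheduler default. Illustrative helper.
func effectivePriorityClass(pod podSpec, schedulerDefault string) string {
	if pod.PriorityClassName != nil && *pod.PriorityClassName != "" {
		return *pod.PriorityClassName
	}
	return schedulerDefault
}

func main() {
	high := "high-priority"
	fmt.Println(effectivePriorityClass(podSpec{PriorityClassName: &high}, "batch-default")) // high-priority
	fmt.Println(effectivePriorityClass(podSpec{}, "batch-default"))                         // batch-default
}
```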
dependabot[bot] 95d202e95c
Bump sigs.k8s.io/scheduler-plugins from 0.29.7 to 0.29.8 (#2159)
Bumps [sigs.k8s.io/scheduler-plugins](https://github.com/kubernetes-sigs/scheduler-plugins) from 0.29.7 to 0.29.8.
- [Release notes](https://github.com/kubernetes-sigs/scheduler-plugins/releases)
- [Changelog](https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/RELEASE.md)
- [Commits](https://github.com/kubernetes-sigs/scheduler-plugins/compare/v0.29.7...v0.29.8)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/scheduler-plugins
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-10 03:17:19 +00:00
Jacob Salway e1b7a27062
Upgrade to Spark 3.5.2 (#2154)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-09-09 05:06:19 +00:00
Jacob Salway 10fcb8e19a
Upgrade to Go 1.23.1 (#2155)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-09-09 04:57:19 +00:00
Yi Chen dee91ba66c
Fix: e2e test fails due to webhook not ready (#2149)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-09 03:05:18 +00:00
Yi Chen 592b649917
Create role and rolebinding for controller/webhook in every spark job namespace if not watching all namespaces (#2129)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-08 05:22:17 +00:00
Jacob Salway 62b4ca636d
Set schedulerName to Yunikorn (#2153)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-09-08 05:15:17 +00:00
Jacob Salway c810ece25b
Run e2e tests on Kind (#2148)
Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-09-03 07:26:14 +00:00
Yi Chen e8d3de9e1a
Support extended kube-scheduler as batch scheduler (#2136)
* Support coscheduling with kube-scheduler plugins

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add example for using kube-scheduler coscheduling

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-03 03:23:13 +00:00
Yi Chen c93b0ec0e7
Adding support for setting spark job namespaces to all namespaces (#2123)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-09-03 02:55:14 +00:00
Yi Chen 1afa72e7a0
fix: unable to set controller/webhook replicas to zero (#2147)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-30 14:26:05 +00:00
Yi Chen bca6aa85cc
Update release workflow and docs (#2121)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-29 13:07:05 +00:00
Jacob Salway 9cc1c02c64
Add default batch scheduler argument (#2143)
* Add default batch scheduler argument

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Add helm unit test

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-08-28 02:53:03 +00:00
Yi Chen 9e88049af1
Reintroduce option webhook.enable (#2142)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-24 17:59:59 +00:00
Chaoran Yu 63bbe4f35f
Added jacobsalway as a reviewer (#2140)
Signed-off-by: Chaoran Yu <chaoran_yu@apple.com>
2024-08-22 21:18:58 +00:00
Yi Chen a1a38ea2f1
Fix: Spark role binding did not render properly when setting spark service account name (#2135)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-22 16:28:58 +00:00
Yi Chen ac14169a4f
Add changelog for v2.0.0-rc.0 (#2126)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-22 16:25:57 +00:00
Neo 52f818d535
fix: Add default values for namespaces to match usage descriptions (#2128)
* fix: Add default values for namespaces to match usage descriptions

Signed-off-by: pengfei4.li <pengfei4.li@ly.com>

* fix: remove incorrect cache settings

Signed-off-by: pengfei4.li <pengfei4.li@ly.com>

---------

Signed-off-by: pengfei4.li <pengfei4.li@ly.com>
Co-authored-by: pengfei4.li <pengfei4.li@ly.com>
2024-08-22 16:24:58 +00:00
Yi Chen 4bc6e89708
Update Makefile for building sparkctl (#2119)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-22 16:00:58 +00:00
Jacob Salway 8fcda12657
Support gang scheduling with Yunikorn (#2107)
* Add Yunikorn scheduler and example

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Add test cases

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Add code comments

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Add license comment

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Inline mergeNodeSelector

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

* Fix initial number implementation

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>

---------

Signed-off-by: Jacob Salway <jacob.salway@gmail.com>
2024-08-22 04:15:57 +00:00
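
Gang scheduling with YuniKorn is conventionally expressed through pod annotations that describe task groups and their minimum resource asks. The keys below follow upstream YuniKorn conventions; treat the exact keys and JSON shape as assumptions rather than the operator's verbatim output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// taskGroup mirrors the JSON shape YuniKorn reads from the
// task-groups annotation. Illustrative struct.
type taskGroup struct {
	Name        string            `json:"name"`
	MinMember   int32             `json:"minMember"`
	MinResource map[string]string `json:"minResource"`
}

func main() {
	groups := []taskGroup{
		{Name: "spark-driver", MinMember: 1, MinResource: map[string]string{"cpu": "1", "memory": "1Gi"}},
		{Name: "spark-executor", MinMember: 2, MinResource: map[string]string{"cpu": "1", "memory": "4Gi"}},
	}
	b, err := json.Marshal(groups)
	if err != nil {
		panic(err)
	}
	annotations := map[string]string{
		"yunikorn.apache.org/task-group-name": "spark-driver", // assumed key
		"yunikorn.apache.org/task-groups":     string(b),      // assumed key
	}
	fmt.Println(annotations["yunikorn.apache.org/task-groups"])
}
```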
Andrey Velichkevich 5972482ca4
Increase frequency for Stale bot (#2124)
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
2024-08-14 02:49:26 +00:00
Yi Chen e2693f19c5
Fix CI: environment variable BRANCH is missed (#2111)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-02 03:34:06 +00:00
Yi Chen 6ff204a6ab
Fix broken integration test CI (#2109)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-01 21:27:06 +00:00
Yi Chen 0dc641bd1d
Use controller-runtime to reconstruct spark operator (#2072)
* Use controller-runtime to reconstruct spark operator

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update helm charts

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update examples

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-01 12:29:06 +00:00
Yi Chen a3ec8f193f
Update workflow and docs for releasing Spark operator (#2089)
* Update .helmignore

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add release docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update release workflow

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update integration test workflow

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add workflow for pushing tag when VERSION file changes

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove the leading 'v' from chart version

Signed-off-by: Yi Chen <github@chenyicn.net>

* Update docker image tags

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
2024-08-01 12:26:06 +00:00
jbhalodia-slack 4108f54937
Add topologySpreadConstraints (#2091)
* Update README and documentation (#2047)

* Update docs

Signed-off-by: Yi Chen <github@chenyicn.net>

* Remove docs and update README

Signed-off-by: Yi Chen <github@chenyicn.net>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <github@chenyicn.net>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Add PodDisruptionBudget to chart (#2078)

* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

* PR comments

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>

---------

Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Set topologySpreadConstraints

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README and increase patch version

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Revert replicaCount change

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README after master merger

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

* Update README

Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>

---------

Signed-off-by: Yi Chen <github@chenyicn.net>
Signed-off-by: jbhalodia-slack <jbhalodia@salesforce.com>
Signed-off-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
Signed-off-by: Carlos Sánchez Páez <sanchezpaezcarlos33@gmail.com>
Co-authored-by: Yi Chen <github@chenyicn.net>
Co-authored-by: Carlos Sánchez Páez <karlossanpa@gmail.com>
2024-07-26 08:39:55 +00:00
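
A minimal corev1.TopologySpreadConstraint of the kind the new chart value renders into the operator Deployment: spread replicas across zones with at most one pod of skew. The label values are examples only.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// exampleSpreadConstraint builds a zone-spread constraint; selector
// labels are illustrative.
func exampleSpreadConstraint() corev1.TopologySpreadConstraint {
	return corev1.TopologySpreadConstraint{
		MaxSkew:           1,
		TopologyKey:       "topology.kubernetes.io/zone",
		WhenUnsatisfiable: corev1.ScheduleAnyway,
		LabelSelector: &metav1.LabelSelector{
			MatchLabels: map[string]string{"app.kubernetes.io/name": "spark-operator"},
		},
	}
}

func main() {
	c := exampleSpreadConstraint()
	fmt.Printf("maxSkew=%d key=%s\n", c.MaxSkew, c.TopologyKey)
}
```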
Yi Chen 51e4886953
Add Alibaba Cloud to adopters (#2097)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-07-24 15:40:53 +00:00
Andrey Velichkevich 461ddc906e
Update Stale bot settings (#2095)
Signed-off-by: Andrey Velichkevich <andrey.velichkevich@gmail.com>
2024-07-24 15:39:53 +00:00
Yi Chen b27717d95d
Add @ChenYi015 to approvers (#2096)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-07-24 15:38:53 +00:00
Yi Chen bc9dcc2dfb
Add CHANGELOG.md file and use python script to generate it automatically (#2087)
Signed-off-by: Yi Chen <github@chenyicn.net>
2024-07-22 05:18:01 +00:00
388 changed files with 35239 additions and 25856 deletions


@@ -1,31 +1,7 @@
.github/
.idea/
.vscode/
bin/
charts/
docs/
config/
examples/
hack/
manifest/
spark-docker/
sparkctl/
test/
vendor/
.dockerignore
.DS_Store
.gitignore
.gitlab-ci.yaml
.golangci.yaml
.pre-commit-config.yaml
ADOPTERS.md
CODE_OF_CONDUCT.md
codecov.ymal
CONTRIBUTING.md
codecov.yaml
cover.out
Dockerfile
LICENSE
OWNERS
PROJECT
README.md
test.sh
.DS_Store
*.iml


@@ -1,46 +0,0 @@
---
name: Bug report
about: Create a report to help us improve
title: '[BUG] Brief description of the issue'
labels: bug
---
## Description
Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration.
If your request is for a new feature, please use the `Feature request` template.
- [ ] ✋ I have searched the open/closed issues and my issue is not listed.
## Reproduction Code [Required]
<!-- REQUIRED -->
Steps to reproduce the behavior:
## Expected behavior
<!-- A clear and concise description of what you expected to happen -->
## Actual behavior
<!-- A clear and concise description of what actually happened -->
### Terminal Output Screenshot(s)
<!-- Optional but helpful -->
## Environment & Versions
- Spark Operator App version:
- Helm Chart Version:
- Kubernetes Version:
- Apache Spark version:
## Additional context
<!-- Add any other context about the problem here -->

.github/ISSUE_TEMPLATE/bug_report.yaml (new file)

@@ -0,0 +1,54 @@
name: Bug Report
description: Tell us about a problem you are experiencing with the Spark operator.
labels:
- kind/bug
- lifecycle/needs-triage
body:
- type: markdown
attributes:
value: |
Thanks for taking the time to fill out this Spark operator bug report!
- type: textarea
id: problem
attributes:
label: What happened?
description: |
Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration.
If your request is for a new feature, please use the `Feature request` template.
value: |
- [ ] ✋ I have searched the open/closed issues and my issue is not listed.
validations:
required: true
- type: textarea
id: reproduce
attributes:
label: Reproduction Code
description: Steps to reproduce the behavior.
- type: textarea
id: expected
attributes:
label: Expected behavior
description: A clear and concise description of what you expected to happen.
- type: textarea
id: actual
attributes:
label: Actual behavior
description: A clear and concise description of what actually happened.
- type: textarea
id: environment
attributes:
label: Environment & Versions
value: |
- Kubernetes Version:
- Spark Operator Version:
- Apache Spark Version:
- type: textarea
id: context
attributes:
label: Additional context
description: Add any other context about the problem here.
- type: input
id: votes
attributes:
label: Impacted by this bug?
value: Give it a 👍 We prioritize the issues with most 👍

.github/ISSUE_TEMPLATE/config.yaml (new file)

@@ -0,0 +1,9 @@
blank_issues_enabled: true
contact_links:
- name: Spark Operator Documentation
url: https://www.kubeflow.org/docs/components/spark-operator
about: Much help can be found in the docs
- name: Spark Operator Slack Channel
url: https://app.slack.com/client/T08PSQ7BQ/C074588U7EG
about: Ask questions about the Spark Operator


@@ -1,32 +0,0 @@
---
name: Feature request
about: Suggest an idea for this project
title: '[FEATURE] Brief description of the feature'
labels: enhancement
---
<!--- Please keep this note for the community --->
### Community Note
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the community and maintainers prioritize this request
* Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
* If you are interested in working on this issue or have submitted a pull request, please leave a comment
<!--- Thank you for keeping this note for the community --->
#### What is the outcome that you are trying to reach?
<!-- A clear and concise description of what the problem is. -->
#### Describe the solution you would like
<!-- A clear and concise description of what you want to happen. -->
#### Describe alternatives you have considered
<!-- A clear and concise description of any alternative solutions or features you've considered. -->
#### Additional context
<!-- Add any other context or screenshots about the feature request here. -->


@@ -0,0 +1,47 @@
name: Feature Request
description: Suggest an idea for the Spark operator.
labels:
- kind/feature
- lifecycle/needs-triage
body:
- type: markdown
attributes:
value: |
Thanks for taking the time to fill out this Spark operator feature request!
- type: markdown
attributes:
value: |
- Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the community and maintainers prioritize this request.
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
- type: textarea
id: feature
attributes:
label: What feature you would like to be added?
description: |
A clear and concise description of what you want to add to the Spark operator.
Please consider to write a Spark operator enhancement proposal if it is a large feature request.
validations:
required: true
- type: textarea
id: rationale
attributes:
label: Why is this needed?
- type: textarea
id: solution
attributes:
label: Describe the solution you would like
- type: textarea
id: alternatives
attributes:
label: Describe alternatives you have considered
- type: textarea
id: context
attributes:
label: Additional context
description: Add any other context or screenshots about the feature request here.
- type: input
id: votes
attributes:
label: Love this feature?
value: Give it a 👍 We prioritize the features with most 👍


@@ -1,20 +0,0 @@
---
name: Question
about: I have a Question
title: '[QUESTION] Brief description of the Question'
labels: question
---
- [ ] ✋ I have searched the open/closed issues and my issue is not listed.
#### Please describe your question here
<!-- Provide as much information as possible to explain your question -->
#### Provide a link to the example/module related to the question
<!-- Please provide the link to the example related to this question from this repo -->
#### Additional context
<!-- Add any other context or screenshots about the question here -->

.github/ISSUE_TEMPLATE/question.yaml (new file)

@@ -0,0 +1,30 @@
name: Question
description: Ask question about the Spark operator.
labels:
- kind/question
- lifecycle/needs-triage
body:
- type: markdown
attributes:
value: |
Thanks for taking the time to fill out this question!
- type: textarea
id: feature
attributes:
label: What question do you want to ask?
description: |
A clear and concise description of what you want to ask about the Spark operator.
value: |
- [ ] ✋ I have searched the open/closed issues and my issue is not listed.
validations:
required: true
- type: textarea
id: rationale
attributes:
label: Additional context
description: Add any other context or screenshots about the question here.
- type: input
id: votes
attributes:
label: Have the same question?
value: Give it a 👍 We prioritize the question with most 👍


@@ -1,18 +1,23 @@
### 🛑 Important:
Please open an issue to discuss significant work before you start. We appreciate your contributions and don't want your efforts to go to waste!
For guidelines on how to contribute, please review the [CONTRIBUTING.md](CONTRIBUTING.md) document.
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, check our contributor guidelines: https://www.kubeflow.org/docs/about/contributing
2. To know more about how to develop with the Spark operator, check the developer guide: https://www.kubeflow.org/docs/components/spark-operator/developer-guide/
3. If you want *faster* PR reviews, check how: https://git.k8s.io/community/contributors/guide/pull-requests.md#best-practices-for-faster-reviews
4. Please open an issue to discuss significant work before you start. We appreciate your contributions and don't want your efforts to go to waste!
-->
## Purpose of this PR
Provide a clear and concise description of the changes. Explain the motivation behind these changes and link to relevant issues or discussions.
<!-- Provide a clear and concise description of the changes. Explain the motivation behind these changes and link to relevant issues or discussions. -->
**Proposed changes:**
- <Change 1>
- <Change 2>
- <Change 3>
## Change Category
Indicate the type of change by marking the applicable boxes:
<!-- Indicate the type of change by marking the applicable boxes. -->
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] Feature (non-breaking change which adds functionality)
@@ -23,9 +28,9 @@ Indicate the type of change by marking the applicable boxes:
<!-- Provide reasoning for the changes if not already covered in the description above. -->
## Checklist
Before submitting your PR, please review the following:
<!-- Before submitting your PR, please review the following: -->
- [ ] I have conducted a self-review of my own code.
- [ ] I have updated documentation accordingly.
@@ -35,4 +40,3 @@ Before submitting your PR, please review the following:
### Additional Notes
<!-- Include any additional notes or context that could be helpful for the reviewers here. -->

.github/workflows/check-release.yaml (new file)

@@ -0,0 +1,64 @@
name: Check Release
on:
pull_request:
branches:
- release-*
paths:
- VERSION
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
SEMVER_PATTERN: '^v([0-9]+)\.([0-9]+)\.([0-9]+)(-rc\.([0-9]+))?$'
jobs:
check:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check whether version matches semver pattern
run: |
VERSION=$(cat VERSION)
if [[ ${VERSION} =~ ${{ env.SEMVER_PATTERN }} ]]; then
echo "Version '${VERSION}' matches semver pattern."
else
echo "Version '${VERSION}' does not match semver pattern."
exit 1
fi
echo "VERSION=${VERSION}" >> $GITHUB_ENV
- name: Check whether chart version and appVersion matches version
run: |
VERSION=${VERSION#v}
CHART_VERSION=$(cat charts/spark-operator-chart/Chart.yaml | grep version | awk '{print $2}')
CHART_APP_VERSION=$(cat charts/spark-operator-chart/Chart.yaml | grep appVersion | awk '{print $2}')
if [[ ${CHART_VERSION} == ${VERSION} ]]; then
echo "Chart version '${CHART_VERSION}' matches version '${VERSION}'."
else
echo "Chart version '${CHART_VERSION}' does not match version '${VERSION}'."
exit 1
fi
if [[ ${CHART_APP_VERSION} == ${VERSION} ]]; then
echo "Chart appVersion '${CHART_APP_VERSION}' matches version '${VERSION}'."
else
echo "Chart appVersion '${CHART_APP_VERSION}' does not match version '${VERSION}'."
exit 1
fi
- name: Check if tag exists
run: |
git fetch --tags
if git tag -l | grep -q "^${VERSION}$"; then
echo "Tag '${VERSION}' already exists."
exit 1
else
echo "Tag '${VERSION}' does not exist."
fi

.github/workflows/integration.yaml (new file)

@@ -0,0 +1,233 @@
name: Integration Test
on:
pull_request:
branches:
- master
- release-*
push:
branches:
- master
- release-*
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.actor }}
cancel-in-progress: true
jobs:
code-check:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Run go mod tidy
run: |
go mod tidy
if ! git diff --quiet; then
echo "Please run 'go mod tidy' and commit the changes."
git diff
false
fi
- name: Generate code
run: |
make generate
if ! git diff --quiet; then
echo "Need to re-run 'make generate' and commit the changes."
git diff
false
fi
- name: Verify Codegen
run: |
make verify-codegen
- name: Run go fmt check
run: |
make go-fmt
if ! git diff --quiet; then
echo "Need to re-run 'make go-fmt' and commit the changes."
git diff
false
fi
- name: Run go vet check
run: |
make go-vet
if ! git diff --quiet; then
echo "Need to re-run 'make go-vet' and commit the changes."
git diff
false
fi
- name: Run golangci-lint
run: |
make go-lint
build-api-docs:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Build API docs
run: |
make build-api-docs
if ! git diff --quiet; then
echo "Need to re-run 'make build-api-docs' and commit the changes."
git diff
false
fi
build-spark-operator:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Run go unit tests
run: make unit-test
- name: Build Spark operator
run: make build-operator
build-helm-chart:
runs-on: ubuntu-latest
steps:
- name: Determine branch name
id: get_branch
run: |
BRANCH=""
if [ "${{ github.event_name }}" == "push" ]; then
BRANCH=${{ github.ref_name }}
elif [ "${{ github.event_name }}" == "pull_request" ]; then
BRANCH=${{ github.base_ref }}
fi
echo "Branch name: $BRANCH"
echo "BRANCH=$BRANCH" >> "$GITHUB_OUTPUT"
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install Helm
uses: azure/setup-helm@v4
with:
version: v3.14.3
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.7.0
- name: Generate manifests
run: |
make manifests
if ! git diff --quiet; then
echo "Need to re-run 'make manifests' and commit the changes."
git diff
false
fi
- name: Detect CRDs drift between chart and manifest
run: make detect-crds-drift
- name: Run helm unittest
run: make helm-unittest
- name: Run chart-testing (list-changed)
id: list-changed
env:
BRANCH: ${{ steps.get_branch.outputs.BRANCH }}
run: |
changed=$(ct list-changed --target-branch $BRANCH)
if [[ -n "$changed" ]]; then
echo "changed=true" >> "$GITHUB_OUTPUT"
fi
- name: Run chart-testing (lint)
if: steps.list-changed.outputs.changed == 'true'
env:
BRANCH: ${{ steps.get_branch.outputs.BRANCH }}
run: ct lint --check-version-increment=false --target-branch $BRANCH
- name: Produce the helm documentation
if: steps.list-changed.outputs.changed == 'true'
run: |
make helm-docs
if ! git diff --quiet -- charts/spark-operator-chart/README.md; then
echo "Need to re-run 'make helm-docs' and commit the changes."
false
fi
- name: setup minikube
if: steps.list-changed.outputs.changed == 'true'
uses: manusa/actions-setup-minikube@v2.14.0
with:
minikube version: v1.33.0
kubernetes version: v1.30.0
start args: --memory 6g --cpus=2 --addons ingress
github token: ${{ inputs.github-token }}
- name: Run chart-testing (install)
if: steps.list-changed.outputs.changed == 'true'
run: |
docker build -t ghcr.io/kubeflow/spark-operator/controller:local .
minikube image load ghcr.io/kubeflow/spark-operator/controller:local
ct install --target-branch ${{ steps.get_branch.outputs.BRANCH }}
e2e-test:
runs-on: ubuntu-latest
strategy:
matrix:
k8s_version:
- v1.24.17
- v1.25.16
- v1.26.15
- v1.27.16
- v1.28.15
- v1.29.12
- v1.30.8
- v1.31.4
- v1.32.0
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: Create a Kind cluster
run: make kind-create-cluster KIND_K8S_VERSION=${{ matrix.k8s_version }}
- name: Build and load image to Kind cluster
run: make kind-load-image IMAGE_TAG=local
- name: Run e2e tests
run: make e2e-test
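Every generated-artifact job above uses the same drift-check idiom: regenerate, then fail the step if the working tree changed. A minimal standalone sketch of that pattern, with an illustrative make target name:

# Hypothetical drift check: regenerate files, then fail if anything changed.
make generate          # stand-in for go-fmt, manifests, build-api-docs, etc.
if ! git diff --quiet; then
  echo "Generated files are out of date; re-run 'make generate' and commit."
  git diff --stat
  exit 1                # the workflows use 'false', which has the same effect
fi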

@ -1,180 +0,0 @@
name: Pre-commit checks
on:
pull_request:
branches:
- master
push:
branches:
- master
jobs:
build-api-docs:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.mod
- name: The API documentation hasn't changed
run: |
make build-api-docs
if ! git diff --quiet -- docs/api-docs.md; then
echo "Need to re-run 'make build-api-docs' and commit the changes"
git diff -- docs/api-docs.md;
false
fi
build-sparkctl:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: "go.mod"
- name: build sparkctl
run: |
make build-sparkctl
build-spark-operator:
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: "go.mod"
- name: Run go fmt check
run: make go-fmt
- name: Run go vet
run: make go-vet
- name: Run unit tests
run: make unit-test
- name: Build Spark-Operator Docker Image
run: make docker-build IMAGE_TAG=latest
- name: Check changes in resources used in docker file
run: |
DOCKERFILE_RESOURCES=$(cat Dockerfile | grep -P -o "COPY [a-zA-Z0-9].*? " | cut -c6-)
for resource in $DOCKERFILE_RESOURCES; do
# If the resource is different
if ! git diff --quiet origin/master -- $resource; then
## And the appVersion hasn't been updated
if ! git diff origin/master -- charts/spark-operator-chart/Chart.yaml | grep +appVersion; then
echo "resource used in docker.io/kubeflow/spark-operator has changed in $resource, need to update the appVersion in charts/spark-operator-chart/Chart.yaml"
git diff origin/master -- $resource;
echo "failing the build... " && false
fi
fi
done
build-helm-chart:
runs-on: ubuntu-20.04
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Install Helm
uses: azure/setup-helm@v4
with:
version: v3.14.3
- name: Produce the helm documentation
run: |
make helm-docs
if ! git diff --quiet -- charts/spark-operator-chart/README.md; then
echo "Need to re-run 'make helm-docs' and commit the changes"
false
fi
- name: Set up chart-testing
uses: helm/chart-testing-action@v2.6.1
- name: Print chart-testing version information
run: ct version
- name: Run chart-testing (lint)
run: ct lint
- name: Run chart-testing (list-changed)
id: list-changed
run: |
changed=$(ct list-changed)
if [[ -n "$changed" ]]; then
echo "::set-output name=changed::true"
fi
- name: Detect CRDs drift between chart and manifest
run: make detect-crds-drift
- name: setup minikube
uses: manusa/actions-setup-minikube@v2.11.0
with:
minikube version: v1.33.0
kubernetes version: v1.30.0
start args: --memory 6g --cpus=2 --addons ingress
github token: ${{ inputs.github-token }}
- name: Run chart-testing (install)
run: |
docker build -t docker.io/kubeflow/spark-operator:local .
minikube image load docker.io/kubeflow/spark-operator:local
ct install
integration-test:
runs-on: ubuntu-22.04
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: "0"
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: "go.mod"
- name: setup minikube
uses: manusa/actions-setup-minikube@v2.11.0
with:
minikube version: v1.33.0
kubernetes version: v1.30.0
start args: --memory 6g --cpus=2 --addons ingress
github token: ${{ inputs.github-token }}
- name: Build local spark-operator docker image for minikube testing
run: |
docker build -t docker.io/kubeflow/spark-operator:local .
minikube image load docker.io/kubeflow/spark-operator:local
# The integration tests are currently broken see: https://github.com/kubeflow/spark-operator/issues/1416
# - name: Run chart-testing (integration test)
# run: make integration-test
- name: Setup tmate session
if: failure()
uses: mxschmitt/action-tmate@v3
timeout-minutes: 15

@ -0,0 +1,84 @@
name: Release Helm charts
on:
release:
types:
- published
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
HELM_REGISTRY: ghcr.io
HELM_REPOSITORY: ${{ github.repository_owner }}/helm-charts
jobs:
release_helm_charts:
permissions:
contents: write
packages: write
runs-on: ubuntu-latest
steps:
- name: Checkout source code
uses: actions/checkout@v4
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Set up Helm
uses: azure/setup-helm@v4.2.0
with:
version: v3.14.4
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ${{ env.HELM_REGISTRY }}
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Read version from VERSION file
run: |
VERSION=$(cat VERSION)
echo "VERSION=${VERSION}" >> $GITHUB_ENV
- name: Package Helm charts
run: |
for chart in $(ls charts); do
helm package charts/${chart}
done
- name: Upload charts to GHCR
run: |
for pkg in $(ls *.tgz); do
helm push ${pkg} oci://${{ env.HELM_REGISTRY }}/${{ env.HELM_REPOSITORY }}
done
- name: Save packaged charts to temp directory
run: |
mkdir -p /tmp/charts
cp *.tgz /tmp/charts
- name: Checkout to branch gh-pages
uses: actions/checkout@v4
with:
ref: gh-pages
fetch-depth: 0
- name: Copy packaged charts
run: |
cp /tmp/charts/*.tgz .
- name: Update Helm charts repo index
env:
CHART_URL: https://github.com/${{ github.repository }}/releases/download/${{ github.ref_name }}
run: |
helm repo index --merge index.yaml --url ${CHART_URL} .
git add index.yaml
git commit -s -m "Add index for Spark operator chart ${VERSION}" || exit 0
git push
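Once published, the charts are consumable from either the OCI registry or the gh-pages index. A usage sketch; the chart name, owner, and version below are illustrative assumptions:

# Pull the packaged chart from GHCR (OCI):
helm pull oci://ghcr.io/kubeflow/helm-charts/spark-operator --version 2.2.1
# Or install from the gh-pages repository index:
helm repo add spark-operator https://kubeflow.github.io/spark-operator
helm install spark-operator spark-operator/spark-operator --version 2.2.1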

@ -1,106 +1,132 @@
name: Release Charts
name: Release
on:
push:
branches:
- master
- release-*
paths:
- VERSION
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
REGISTRY_IMAGE: docker.io/kubeflow/spark-operator
SEMVER_PATTERN: '^v([0-9]+)\.([0-9]+)\.([0-9]+)(-rc\.([0-9]+))?$'
IMAGE_REGISTRY: ghcr.io
IMAGE_REPOSITORY: kubeflow/spark-operator/controller
jobs:
build-skip-check:
check-release:
runs-on: ubuntu-latest
outputs:
image_changed: ${{ steps.skip-check.outputs.image_changed }}
chart_changed: ${{ steps.skip-check.outputs.chart_changed }}
app_version_tag: ${{ steps.skip-check.outputs.app_version_tag }}
chart_version_tag: ${{ steps.skip-check.outputs.chart_version_tag }}
steps:
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check if build should be skipped
id: skip-check
- name: Check whether version matches semver pattern
run: |
app_version_tag=$(cat charts/spark-operator-chart/Chart.yaml | grep "appVersion: .*" | cut -c13-)
chart_version_tag=$(cat charts/spark-operator-chart/Chart.yaml | grep "version: .*" | cut -c10-)
VERSION=$(cat VERSION)
if [[ ${VERSION} =~ ${{ env.SEMVER_PATTERN }} ]]; then
echo "Version '${VERSION}' matches semver pattern."
else
echo "Version '${VERSION}' does not match semver pattern."
exit 1
fi
echo "VERSION=${VERSION}" >> $GITHUB_ENV
# Initialize flags
image_changed=false
chart_changed=false
if ! git rev-parse -q --verify "refs/tags/$app_version_tag"; then
image_changed=true
git tag $app_version_tag
git push origin $app_version_tag
echo "Spark-Operator Docker Image new tag: $app_version_tag released"
- name: Check whether chart version and appVersion matches version
run: |
VERSION=${VERSION#v}
CHART_VERSION=$(cat charts/spark-operator-chart/Chart.yaml | grep version | awk '{print $2}')
CHART_APP_VERSION=$(cat charts/spark-operator-chart/Chart.yaml | grep appVersion | awk '{print $2}')
if [[ ${CHART_VERSION} == ${VERSION} ]]; then
echo "Chart version '${CHART_VERSION}' matches version '${VERSION}'."
else
echo "Chart version '${CHART_VERSION}' does not match version '${VERSION}'."
exit 1
fi
if [[ ${CHART_APP_VERSION} == ${VERSION} ]]; then
echo "Chart appVersion '${CHART_APP_VERSION}' matches version '${VERSION}'."
else
echo "Chart appVersion '${CHART_APP_VERSION}' does not match version '${VERSION}'."
exit 1
fi
if ! git rev-parse -q --verify "refs/tags/spark-operator-chart-$chart_version_tag"; then
chart_changed=true
git tag spark-operator-chart-$chart_version_tag
git push origin spark-operator-chart-$chart_version_tag
echo "Spark-Operator Helm Chart new tag: spark-operator-chart-$chart_version_tag released"
- name: Check if tag exists
run: |
git fetch --tags
if git tag -l | grep -q "^${VERSION}$"; then
echo "Tag '${VERSION}' already exists."
exit 1
else
echo "Tag '${VERSION}' does not exist."
fi
echo "image_changed=${image_changed}" >> "$GITHUB_OUTPUT"
echo "chart_changed=${chart_changed}" >> "$GITHUB_OUTPUT"
echo "app_version_tag=${app_version_tag}" >> "$GITHUB_OUTPUT"
echo "chart_version_tag=${chart_version_tag}" >> "$GITHUB_OUTPUT"
release:
runs-on: ubuntu-latest
build_images:
needs:
- build-skip-check
if: needs.build-skip-check.outputs.image_changed == 'true'
- check-release
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
platform:
- linux/amd64
- linux/arm64
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 1
- name: Configure Git
- name: Prepare
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
platform=${{ matrix.platform }}
echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV
echo "SCOPE=${platform//\//-}" >> $GITHUB_ENV
- name: Set up QEMU
timeout-minutes: 1
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Install Helm
uses: azure/setup-helm@v4
- name: Checkout source code
uses: actions/checkout@v4
- name: Read version from VERSION file
run: |
VERSION=$(cat VERSION)
echo "VERSION=${VERSION}" >> $GITHUB_ENV
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
version: v3.14.3
- name: Login to Packages Container registry
images: ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }}
tags: |
type=semver,pattern={{version}},value=${{ env.VERSION }}
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker buildx
uses: docker/setup-buildx-action@v3
- name: Login to container registry
uses: docker/login-action@v3
with:
registry: docker.io
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build and Push Spark-Operator Docker Image to Docker Hub
registry: ${{ env.IMAGE_REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push by digest
id: build
uses: docker/build-push-action@v5
uses: docker/build-push-action@v6
with:
context: .
platforms: ${{ matrix.platform }}
cache-to: type=gha,mode=max,scope=${{ env.SCOPE }}
cache-from: type=gha,scope=${{ env.SCOPE }}
push: true
outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
labels: ${{ steps.meta.outputs.labels }}
outputs: type=image,name=${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }},push-by-digest=true,name-canonical=true,push=true
- name: Export digest
run: |
mkdir -p /tmp/digests
digest="${{ steps.build.outputs.digest }}"
touch "/tmp/digests/${digest#sha256:}"
- name: Upload digest
uses: actions/upload-artifact@v4
with:
@ -108,61 +134,127 @@ jobs:
path: /tmp/digests/*
if-no-files-found: error
retention-days: 1
publish-image:
runs-on: ubuntu-latest
release_images:
needs:
- release
- build-skip-check
if: needs.build-skip-check.outputs.image_changed == 'true'
- build_images
runs-on: ubuntu-latest
steps:
- name: Download digests
uses: actions/download-artifact@v4
with:
pattern: digests-*
path: /tmp/digests
merge-multiple: true
- name: Setup Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Checkout source code
uses: actions/checkout@v4
- name: Read version from VERSION file
run: |
VERSION=$(cat VERSION)
echo "VERSION=${VERSION}" >> $GITHUB_ENV
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: ${{ env.REGISTRY_IMAGE }}
tags: ${{ needs.build-skip-check.outputs.app_version_tag }}
- name: Login to Docker Hub
images: ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }}
tags: |
type=semver,pattern={{version}},value=${{ env.VERSION }}
- name: Download digests
uses: actions/download-artifact@v4
with:
path: /tmp/digests
pattern: digests-*
merge-multiple: true
- name: Set up Docker buildx
uses: docker/setup-buildx-action@v3
- name: Login to container registry
uses: docker/login-action@v3
with:
registry: docker.io
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
registry: ${{ env.IMAGE_REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Create manifest list and push
working-directory: /tmp/digests
run: |
docker buildx imagetools create $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$DOCKER_METADATA_OUTPUT_JSON") \
$(printf '${{ env.REGISTRY_IMAGE }}@sha256:%s ' *)
$(printf '${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }}@sha256:%s ' *)
- name: Inspect image
run: |
docker buildx imagetools inspect ${{ env.REGISTRY_IMAGE }}:${{ steps.meta.outputs.version }}
publish-chart:
runs-on: ubuntu-latest
if: needs.build-skip-check.outputs.chart_changed == 'true'
docker buildx imagetools inspect ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REPOSITORY }}:${{ steps.meta.outputs.version }}
push_tag:
needs:
- build-skip-check
- release_images
runs-on: ubuntu-latest
steps:
- name: Checkout
- name: Checkout source code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Install Helm
uses: azure/setup-helm@v4
with:
version: v3.14.3
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Release Spark-Operator Helm Chart
uses: helm/chart-releaser-action@v1.6.0
env:
CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
CR_RELEASE_NAME_TEMPLATE: "spark-operator-chart-{{ .Version }}"
- name: Read version from VERSION file
run: |
VERSION=$(cat VERSION)
echo "VERSION=${VERSION}" >> $GITHUB_ENV
- name: Create and push tag
run: |
git tag -a "${VERSION}" -m "Spark Operator Official Release ${VERSION}"
git push origin "${VERSION}"
draft_release:
needs:
- push_tag
permissions:
contents: write
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Read version from VERSION file
run: |
VERSION=$(cat VERSION)
echo "VERSION=${VERSION}" >> $GITHUB_ENV
- name: Set up Helm
uses: azure/setup-helm@v4.2.0
with:
version: v3.14.4
- name: Package Helm charts
run: |
for chart in $(ls charts); do
helm package charts/${chart}
done
- name: Release
id: release
uses: softprops/action-gh-release@v2
with:
token: ${{ secrets.GITHUB_TOKEN }}
name: "Spark Operator ${{ env.VERSION }}"
tag_name: ${{ env.VERSION }}
prerelease: ${{ contains(env.VERSION, 'rc') }}
target_commitish: ${{ github.sha }}
draft: true
files: |
*.tgz
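The check-release job gates the whole pipeline on the VERSION file matching SEMVER_PATTERN; the same check can be reproduced locally. A sketch with an illustrative version value:

# Reproduce the semver gate from check-release:
SEMVER_PATTERN='^v([0-9]+)\.([0-9]+)\.([0-9]+)(-rc\.([0-9]+))?$'
VERSION=v2.3.0-rc.1   # example value, normally read from the VERSION file
if [[ ${VERSION} =~ ${SEMVER_PATTERN} ]]; then
  echo "Version '${VERSION}' matches semver pattern."
else
  echo "Version '${VERSION}' does not match semver pattern." && exit 1
fi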

@ -1,8 +1,8 @@
name: Close stale issues and PRs
name: Mark stale issues and pull requests
on:
schedule:
- cron: "0 1 * * *"
- cron: "0 */2 * * *"
jobs:
stale:
@ -15,21 +15,25 @@ jobs:
steps:
- uses: actions/stale@v9
with:
days-before-issue-stale: 60
days-before-issue-close: 30
days-before-pr-stale: 60
days-before-pr-close: 30
repo-token: ${{ secrets.GITHUB_TOKEN }}
days-before-stale: 90
days-before-close: 20
operations-per-run: 200
stale-issue-message: >
This issue has been automatically marked as stale because it has been open 60 days with no activity.
Remove stale label or comment or this will be closed in 30 days.
Thank you for your contributions.
This issue has been automatically marked as stale because it has not had
recent activity. It will be closed if no further activity occurs. Thank you
for your contributions.
close-issue-message: >
This issue has been automatically closed because it has been stalled for 30 days with no activity.
Please comment "/reopen" to reopen it.
This issue has been automatically closed because it has not had recent
activity. Please comment "/reopen" to reopen it.
stale-issue-label: lifecycle/stale
exempt-issue-labels: lifecycle/frozen
stale-pr-message: >
This pull request has been automatically marked as stale because it has been open 60 days with no activity.
Remove stale label or comment or this will be closed in 30 days.
Thank you for your contributions.
This pull request has been automatically marked as stale because it has not had
recent activity. It will be closed if no further activity occurs. Thank you
for your contributions.
close-pr-message: >
This pull request has been automatically closed because it has been stalled for 30 days with no activity.
Please comment "/reopen" to reopen it.
This pull request has been automatically closed because it has not had recent
activity. Please comment "/reopen" to reopen it.
stale-pr-label: lifecycle/stale
exempt-pr-labels: lifecycle/frozen

@ -0,0 +1,32 @@
name: Trivy image scanning
on:
workflow_dispatch:
schedule:
- cron: '0 0 * * 1' # Every Monday at 00:00
jobs:
image-scanning:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Add image to environment
run: make print-IMAGE >> $GITHUB_ENV
- name: trivy scan for github security tab
uses: aquasecurity/trivy-action@0.32.0
with:
image-ref: '${{ env.IMAGE }}'
format: 'sarif'
ignore-unfixed: true
vuln-type: 'os,library'
severity: 'CRITICAL,HIGH'
output: 'trivy-results.sarif'
timeout: 30m0s
- name: Upload Trivy scan results to GitHub Security tab
uses: github/codeql-action/upload-sarif@v3
if: always()
with:
sarif_file: 'trivy-results.sarif'
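The scheduled scan can be approximated locally with the trivy CLI; a sketch assuming trivy is installed, with an illustrative image reference standing in for whatever `make print-IMAGE` resolves to:

# Roughly equivalent local scan:
trivy image \
  --severity CRITICAL,HIGH \
  --ignore-unfixed \
  --format sarif \
  --output trivy-results.sarif \
  ghcr.io/kubeflow/spark-operator/controller:2.2.1   # example image ref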

.gitignore

@ -1,11 +1,7 @@
bin/
vendor/
cover.out
sparkctl/sparkctl
sparkctl/sparkctl-linux-amd64
sparkctl/sparkctl-darwin-amd64
**/*.iml
# Various IDEs
.idea/
.vscode/
.vscode/
bin/
codecov.yaml
cover.out
.DS_Store
*.iml

.golangci.yaml

@ -0,0 +1,65 @@
version: "2"
run:
# Timeout for total work, e.g. 30s, 5m, 5m30s.
# If the value is lower or equal to 0, the timeout is disabled.
# Default: 0 (disabled)
timeout: 2m
linters:
# Enable specific linters.
# https://golangci-lint.run/usage/linters/#enabled-by-default
enable:
# Detects places where loop variables are copied.
- copyloopvar
# Checks for duplicate words in the source code.
- dupword
# Tool for detection of FIXME, TODO and other comment keywords.
# - godox
# Enforces consistent import aliases.
- importas
# Find code that shadows one of Go's predeclared identifiers.
- predeclared
# Check that struct tags are well aligned.
- tagalign
# Remove unnecessary type conversions.
- unconvert
# Checks Go code for unused constants, variables, functions and types.
- unused
settings:
importas:
# List of aliases
alias:
- pkg: k8s.io/api/admissionregistration/v1
alias: admissionregistrationv1
- pkg: k8s.io/api/apps/v1
alias: appsv1
- pkg: k8s.io/api/batch/v1
alias: batchv1
- pkg: k8s.io/api/core/v1
alias: corev1
- pkg: k8s.io/api/extensions/v1beta1
alias: extensionsv1beta1
- pkg: k8s.io/api/networking/v1
alias: networkingv1
- pkg: k8s.io/apimachinery/pkg/apis/meta/v1
alias: metav1
- pkg: sigs.k8s.io/controller-runtime
alias: ctrl
issues:
# Maximum issues count per one linter.
# Set to 0 to disable.
# Default: 50
max-issues-per-linter: 50
# Maximum count of issues with the same text.
# Set to 0 to disable.
# Default: 3
max-same-issues: 3
formatters:
enable:
# Check import statements are formatted according to the 'goimport' command.
- goimports


@ -7,3 +7,4 @@ repos:
# Make the tool search for charts only under the `charts` directory
- --chart-search-root=charts
- --template-files=README.md.gotmpl
- --sort-values-order=file

@ -4,6 +4,8 @@ Below are the adopters of project Spark Operator. If you are using Spark Operato
| Organization | Contact (GitHub User Name) | Environment | Description of Use |
| ------------- | ------------- | ------------- | ------------- |
| [Alibaba Cloud](https://www.alibabacloud.com) | [@ChenYi015](https://github.com/ChenYi015) | Production | AI & Data Infrastructure |
| [APRA AMCOS](https://www.apraamcos.com.au/) | @shuch3ng | Production | Data Platform |
| [Beeline](https://beeline.ru) | @spestua | Evaluation | ML & Data Infrastructure |
| Bringg | @EladDolev | Production | ML & Analytics Data Platform |
| [Caicloud](https://intl.caicloud.io/) | @gaocegege | Production | Cloud-Native AI Platform |
@ -13,7 +15,7 @@ Below are the adopters of project Spark Operator. If you are using Spark Operato
| CloudZone | @iftachsc | Evaluation | Big Data Analytics Consultancy |
| Cyren | @avnerl | Evaluation | Data pipelines |
| [C2FO](https://www.c2fo.com/) | @vanhoale | Production | Data Platform / Data Infrastructure |
| [Data Mechanics](https://www.datamechanics.co) | @jrj-d | Production | Managed Spark Platform |
| [Spot by Netapp](https://spot.io/product/ocean-apache-spark/) | @ImpSy | Production | Managed Spark Platform |
| [DeepCure](https://www.deepcure.ai) | @mschroering | Production | Spark / ML |
| [DiDi](https://www.didiglobal.com) | @Run-Lin | Evaluation | Data Infrastructure |
| Exacaster | @minutis | Evaluation | Data pipelines |
@ -31,6 +33,7 @@ Below are the adopters of project Spark Operator. If you are using Spark Operato
| [Molex](https://www.molex.com/) | @AshishPushpSingh | Evaluation/Production | Data Platform |
| [MongoDB](https://www.mongodb.com) | @chickenpopcorn | Production | Data Infrastructure |
| Nielsen Identity Engine | @roitvt | Evaluation | Data pipelines |
| [Ninja Van](https://tech.ninjavan.co/) | @hongshaoyang | Production | Data Infrastructure |
| [PUBG](https://careers.pubg.com/#/en/) | @jacobhjkim | Production | ML & Data Infrastructure |
| [Qualytics](https://www.qualytics.co/) | @josecsotomorales | Production | Data Quality Platform |
| Riskified | @henbh | Evaluation | Analytics Data Platform |

CHANGELOG.md

@ -0,0 +1,723 @@
# Changelog
## [v2.2.1](https://github.com/kubeflow/spark-operator/tree/v2.2.1) (2025-06-27)
### Features
- Customize ingress URL with Spark application ID ([#2554](https://github.com/kubeflow/spark-operator/pull/2554) by [@ChenYi015](https://github.com/ChenYi015))
- Make default ingress TLS and annotations configurable in the helm config ([#2513](https://github.com/kubeflow/spark-operator/pull/2513) by [@Tom-Newton](https://github.com/Tom-Newton))
- Use code-generator for clientset, informers, listers ([#2563](https://github.com/kubeflow/spark-operator/pull/2563) by [@jbhalodia-slack](https://github.com/jbhalodia-slack))
### Misc
- add driver ingress unit tests ([#2552](https://github.com/kubeflow/spark-operator/pull/2552) by [@nabuskey](https://github.com/nabuskey))
- Get logger from context ([#2551](https://github.com/kubeflow/spark-operator/pull/2551) by [@ChenYi015](https://github.com/ChenYi015))
- Update golangci lint ([#2560](https://github.com/kubeflow/spark-operator/pull/2560) by [@joshuacuellar1](https://github.com/joshuacuellar1))
### Dependencies
- Bump aquasecurity/trivy-action from 0.30.0 to 0.31.0 ([#2557](https://github.com/kubeflow/spark-operator/pull/2557) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/prometheus/client_golang from 1.21.1 to 1.22.0 ([#2548](https://github.com/kubeflow/spark-operator/pull/2548) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump sigs.k8s.io/scheduler-plugins from 0.30.6 to 0.31.8 ([#2549](https://github.com/kubeflow/spark-operator/pull/2549) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/mod from 0.24.0 to 0.25.0 ([#2566](https://github.com/kubeflow/spark-operator/pull/2566) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/go-logr/logr from 1.4.2 to 1.4.3 ([#2567](https://github.com/kubeflow/spark-operator/pull/2567) by [@dependabot[bot]](https://github.com/apps/dependabot))
## [v2.2.0](https://github.com/kubeflow/spark-operator/tree/v2.2.0) (2025-05-29)
### Features
- Upgrade to Spark 3.5.5 ([#2490](https://github.com/kubeflow/spark-operator/pull/2490) by [@jacobsalway](https://github.com/jacobsalway))
- Add timeZone to ScheduledSparkApplication ([#2471](https://github.com/kubeflow/spark-operator/pull/2471) by [@jacobsalway](https://github.com/jacobsalway))
- Enable the override of MemoryLimit through webhook ([#2478](https://github.com/kubeflow/spark-operator/pull/2478) by [@danielrsfreitas](https://github.com/danielrsfreitas))
- Add ShuffleTrackingEnabled to DynamicAllocation struct to allow disabling shuffle tracking ([#2511](https://github.com/kubeflow/spark-operator/pull/2511) by [@jbhalodia-slack](https://github.com/jbhalodia-slack))
- Define SparkApplicationSubmitter interface to allow customizing submitting mechanism ([#2500](https://github.com/kubeflow/spark-operator/pull/2500) by [@ChenYi015](https://github.com/ChenYi015))
- Add support for using cert manager to generate webhook certificates ([#2373](https://github.com/kubeflow/spark-operator/pull/2373) by [@ChenYi015](https://github.com/ChenYi015))
### Bug Fixes
- fix: add webhook cert validity checking ([#2489](https://github.com/kubeflow/spark-operator/pull/2489) by [@teejaded](https://github.com/teejaded))
- fix and add back unit tests ([#2532](https://github.com/kubeflow/spark-operator/pull/2532) by [@nabuskey](https://github.com/nabuskey))
- fix volcano tests ([#2533](https://github.com/kubeflow/spark-operator/pull/2533) by [@nabuskey](https://github.com/nabuskey))
- Add v2 to module path ([#2515](https://github.com/kubeflow/spark-operator/pull/2515) by [@ChenYi015](https://github.com/ChenYi015))
- #2525 spark metrics in depends on prometheus ([#2529](https://github.com/kubeflow/spark-operator/pull/2529) by [@blcksrx](https://github.com/blcksrx))
### Misc
- Add APRA AMCOS to adopters ([#2485](https://github.com/kubeflow/spark-operator/pull/2485) by [@shuch3ng](https://github.com/shuch3ng))
- Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 ([#2488](https://github.com/kubeflow/spark-operator/pull/2488) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/prometheus/client_golang from 1.20.5 to 1.21.1 ([#2487](https://github.com/kubeflow/spark-operator/pull/2487) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump sigs.k8s.io/controller-runtime from 0.20.1 to 0.20.4 ([#2486](https://github.com/kubeflow/spark-operator/pull/2486) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Deprecating sparkctl ([#2484](https://github.com/kubeflow/spark-operator/pull/2484) by [@vikas-saxena02](https://github.com/vikas-saxena02))
- Changing image repo from docker.io to ghcr.io ([#2483](https://github.com/kubeflow/spark-operator/pull/2483) by [@vikas-saxena02](https://github.com/vikas-saxena02))
- Upgrade Golang to 1.24.1 and golangci-lint to 1.64.8 ([#2494](https://github.com/kubeflow/spark-operator/pull/2494) by [@jacobsalway](https://github.com/jacobsalway))
- Bump helm.sh/helm/v3 from 3.16.2 to 3.17.3 ([#2503](https://github.com/kubeflow/spark-operator/pull/2503) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Add changelog for v2.1.1 ([#2504](https://github.com/kubeflow/spark-operator/pull/2504) by [@ChenYi015](https://github.com/ChenYi015))
- Remove sparkctl ([#2466](https://github.com/kubeflow/spark-operator/pull/2466) by [@ChenYi015](https://github.com/ChenYi015))
- Bump github.com/spf13/viper from 1.19.0 to 1.20.1 ([#2496](https://github.com/kubeflow/spark-operator/pull/2496) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/net from 0.37.0 to 0.38.0 ([#2505](https://github.com/kubeflow/spark-operator/pull/2505) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Remove clientset, informer and listers generated by code-generator ([#2506](https://github.com/kubeflow/spark-operator/pull/2506) by [@ChenYi015](https://github.com/ChenYi015))
- Remove v1beta1 API ([#2516](https://github.com/kubeflow/spark-operator/pull/2516) by [@ChenYi015](https://github.com/ChenYi015))
- add unit tests for driver and executor configs ([#2521](https://github.com/kubeflow/spark-operator/pull/2521) by [@nabuskey](https://github.com/nabuskey))
- Adding securityContext to spark examples ([#2530](https://github.com/kubeflow/spark-operator/pull/2530) by [@tarekabouzeid](https://github.com/tarekabouzeid))
- Bump github.com/spf13/cobra from 1.8.1 to 1.9.1 ([#2497](https://github.com/kubeflow/spark-operator/pull/2497) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/mod from 0.23.0 to 0.24.0 ([#2495](https://github.com/kubeflow/spark-operator/pull/2495) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Adding Manabu to the reviewers ([#2522](https://github.com/kubeflow/spark-operator/pull/2522) by [@vara-bonthu](https://github.com/vara-bonthu))
- Bump manusa/actions-setup-minikube from 2.13.1 to 2.14.0 ([#2523](https://github.com/kubeflow/spark-operator/pull/2523) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump k8s.io dependencies to v0.32.5 ([#2540](https://github.com/kubeflow/spark-operator/pull/2540) by [@ChenYi015](https://github.com/ChenYi015))
- Pass the correct LDFLAGS when building the operator image ([#2541](https://github.com/kubeflow/spark-operator/pull/2541) by [@ChenYi015](https://github.com/ChenYi015))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/v2.1.1...v2.2.0)
## [v2.1.1](https://github.com/kubeflow/spark-operator/tree/v2.1.1) (2025-03-21)
### Features
- Adding seccompProfile RuntimeDefault ([#2397](https://github.com/kubeflow/spark-operator/pull/2397) by [@tarekabouzeid](https://github.com/tarekabouzeid))
- Add option for disabling leader election ([#2423](https://github.com/kubeflow/spark-operator/pull/2423) by [@ChenYi015](https://github.com/ChenYi015))
- Controller should only be granted event permissions in spark job namespaces ([#2426](https://github.com/kubeflow/spark-operator/pull/2426) by [@ChenYi015](https://github.com/ChenYi015))
- Make image optional ([#2439](https://github.com/kubeflow/spark-operator/pull/2439) by [@jbhalodia-slack](https://github.com/jbhalodia-slack))
- Support non-standard Spark container names ([#2441](https://github.com/kubeflow/spark-operator/pull/2441) by [@jbhalodia-slack](https://github.com/jbhalodia-slack))
- add support for metrics-job-start-latency-buckets flag in helm ([#2450](https://github.com/kubeflow/spark-operator/pull/2450) by [@nabuskey](https://github.com/nabuskey))
### Bug Fixes
- fix: webhook fail to add lifecycle to Spark3 executor pods ([#2458](https://github.com/kubeflow/spark-operator/pull/2458) by [@pvbouwel](https://github.com/pvbouwel))
- change env in executorSecretOption ([#2467](https://github.com/kubeflow/spark-operator/pull/2467) by [@TQJADE](https://github.com/TQJADE))
### Misc
- Move sparkctl to cmd directory ([#2347](https://github.com/kubeflow/spark-operator/pull/2347) by [@ChenYi015](https://github.com/ChenYi015))
- Bump golang.org/x/net from 0.30.0 to 0.32.0 ([#2350](https://github.com/kubeflow/spark-operator/pull/2350) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/crypto from 0.30.0 to 0.31.0 ([#2365](https://github.com/kubeflow/spark-operator/pull/2365) by [@dependabot[bot]](https://github.com/apps/dependabot))
- add an example of using prometheus servlet ([#2403](https://github.com/kubeflow/spark-operator/pull/2403) by [@nabuskey](https://github.com/nabuskey))
- Remove dependency on `k8s.io/kubernetes` ([#2398](https://github.com/kubeflow/spark-operator/pull/2398) by [@jacobsalway](https://github.com/jacobsalway))
- fix make deploy and install ([#2412](https://github.com/kubeflow/spark-operator/pull/2412) by [@nabuskey](https://github.com/nabuskey))
- Add helm unittest step to integration test workflow ([#2424](https://github.com/kubeflow/spark-operator/pull/2424) by [@ChenYi015](https://github.com/ChenYi015))
- ensure passed context is used ([#2432](https://github.com/kubeflow/spark-operator/pull/2432) by [@nabuskey](https://github.com/nabuskey))
- Bump manusa/actions-setup-minikube from 2.13.0 to 2.13.1 ([#2390](https://github.com/kubeflow/spark-operator/pull/2390) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump helm/chart-testing-action from 2.6.1 to 2.7.0 ([#2391](https://github.com/kubeflow/spark-operator/pull/2391) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/mod from 0.21.0 to 0.23.0 ([#2427](https://github.com/kubeflow/spark-operator/pull/2427) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/golang/glog from 1.2.2 to 1.2.4 ([#2411](https://github.com/kubeflow/spark-operator/pull/2411) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/net from 0.32.0 to 0.35.0 ([#2428](https://github.com/kubeflow/spark-operator/pull/2428) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Support Kubernetes 1.32 ([#2416](https://github.com/kubeflow/spark-operator/pull/2416) by [@jacobsalway](https://github.com/jacobsalway))
- use cmd context in sparkctl ([#2447](https://github.com/kubeflow/spark-operator/pull/2447) by [@nabuskey](https://github.com/nabuskey))
- Bump golang.org/x/net from 0.35.0 to 0.36.0 ([#2470](https://github.com/kubeflow/spark-operator/pull/2470) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump aquasecurity/trivy-action from 0.29.0 to 0.30.0 ([#2475](https://github.com/kubeflow/spark-operator/pull/2475) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/net from 0.35.0 to 0.37.0 ([#2472](https://github.com/kubeflow/spark-operator/pull/2472) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/containerd/containerd from 1.7.19 to 1.7.27 ([#2476](https://github.com/kubeflow/spark-operator/pull/2476) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump k8s.io/apimachinery from 0.32.0 to 0.32.3 ([#2474](https://github.com/kubeflow/spark-operator/pull/2474) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.66.0 to 1.78.2 ([#2473](https://github.com/kubeflow/spark-operator/pull/2473) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/aws/aws-sdk-go-v2/config from 1.28.0 to 1.29.9 ([#2463](https://github.com/kubeflow/spark-operator/pull/2463) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump sigs.k8s.io/scheduler-plugins from 0.29.8 to 0.30.6 ([#2444](https://github.com/kubeflow/spark-operator/pull/2444) by [@dependabot[bot]](https://github.com/apps/dependabot))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/v2.1.0...v2.1.1)
## [v2.1.0](https://github.com/kubeflow/spark-operator/tree/v2.1.0) (2024-12-06)
### New Features
- Upgrade to Spark 3.5.3 ([#2202](https://github.com/kubeflow/spark-operator/pull/2202) by [@jacobsalway](https://github.com/jacobsalway))
- feat: support archives param for spark-submit ([#2256](https://github.com/kubeflow/spark-operator/pull/2256) by [@kaka-zb](https://github.com/kaka-zb))
- Allow --ingress-class-name to be specified in chart ([#2278](https://github.com/kubeflow/spark-operator/pull/2278) by [@jacobsalway](https://github.com/jacobsalway))
- Update default container security context ([#2265](https://github.com/kubeflow/spark-operator/pull/2265) by [@ChenYi015](https://github.com/ChenYi015))
- Support pod template for Spark 3.x applications ([#2141](https://github.com/kubeflow/spark-operator/pull/2141) by [@ChenYi015](https://github.com/ChenYi015))
- Allow setting automountServiceAccountToken ([#2298](https://github.com/kubeflow/spark-operator/pull/2298) by [@Aransh](https://github.com/Aransh))
- Allow the Controller and Webhook Containers to run with the securityContext: readOnlyRootfilesystem: true ([#2282](https://github.com/kubeflow/spark-operator/pull/2282) by [@npgretz](https://github.com/npgretz))
- Use NSS_WRAPPER_PASSWD instead of /etc/passwd as in spark-operator image entrypoint.sh ([#2312](https://github.com/kubeflow/spark-operator/pull/2312) by [@Aakcht](https://github.com/Aakcht))
### Bug Fixes
- Minor fixes to e2e test `make` targets ([#2242](https://github.com/kubeflow/spark-operator/pull/2242) by [@Tom-Newton](https://github.com/Tom-Newton))
- Added off heap memory to calculation for YuniKorn gang scheduling ([#2209](https://github.com/kubeflow/spark-operator/pull/2209) by [@guangyu-yang-rokt](https://github.com/guangyu-yang-rokt))
- Add permissions to controller serviceaccount to list and watch ingresses ([#2246](https://github.com/kubeflow/spark-operator/pull/2246) by [@tcassaert](https://github.com/tcassaert))
- Make sure enable-ui-service flag is set to false when controller.uiService.enable is set to false ([#2261](https://github.com/kubeflow/spark-operator/pull/2261) by [@Roberdvs](https://github.com/Roberdvs))
- `omitempty` corrections ([#2255](https://github.com/kubeflow/spark-operator/pull/2255) by [@Tom-Newton](https://github.com/Tom-Newton))
- Fix retries ([#2241](https://github.com/kubeflow/spark-operator/pull/2241) by [@Tom-Newton](https://github.com/Tom-Newton))
- Fix: executor container security context does not work ([#2306](https://github.com/kubeflow/spark-operator/pull/2306) by [@ChenYi015](https://github.com/ChenYi015))
- Fix: should not add emptyDir sizeLimit conf if it is nil ([#2305](https://github.com/kubeflow/spark-operator/pull/2305) by [@ChenYi015](https://github.com/ChenYi015))
- Fix: should not add emptyDir sizeLimit conf on executor pods if it is nil ([#2316](https://github.com/kubeflow/spark-operator/pull/2316) by [@Cian911](https://github.com/Cian911))
- Truncate UI service name if over 63 characters ([#2311](https://github.com/kubeflow/spark-operator/pull/2311) by [@jacobsalway](https://github.com/jacobsalway))
- The webhook-key-name command-line param isn't taking effect ([#2344](https://github.com/kubeflow/spark-operator/pull/2344) by [@c-h-afzal](https://github.com/c-h-afzal))
- Robustness to driver pod taking time to create ([#2315](https://github.com/kubeflow/spark-operator/pull/2315) by [@Tom-Newton](https://github.com/Tom-Newton))
### Misc
- remove redundant test.sh file ([#2243](https://github.com/kubeflow/spark-operator/pull/2243) by [@ChenYi015](https://github.com/ChenYi015))
- Bump github.com/aws/aws-sdk-go-v2/config from 1.27.42 to 1.27.43 ([#2252](https://github.com/kubeflow/spark-operator/pull/2252) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump manusa/actions-setup-minikube from 2.12.0 to 2.13.0 ([#2247](https://github.com/kubeflow/spark-operator/pull/2247) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/net from 0.29.0 to 0.30.0 ([#2251](https://github.com/kubeflow/spark-operator/pull/2251) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump aquasecurity/trivy-action from 0.24.0 to 0.27.0 ([#2248](https://github.com/kubeflow/spark-operator/pull/2248) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump gocloud.dev from 0.39.0 to 0.40.0 ([#2250](https://github.com/kubeflow/spark-operator/pull/2250) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Add Quick Start guide to README ([#2259](https://github.com/kubeflow/spark-operator/pull/2259) by [@jacobsalway](https://github.com/jacobsalway))
- Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.63.3 to 1.65.3 ([#2249](https://github.com/kubeflow/spark-operator/pull/2249) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Add release badge to README ([#2263](https://github.com/kubeflow/spark-operator/pull/2263) by [@jacobsalway](https://github.com/jacobsalway))
- Bump helm.sh/helm/v3 from 3.16.1 to 3.16.2 ([#2275](https://github.com/kubeflow/spark-operator/pull/2275) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/prometheus/client_golang from 1.20.4 to 1.20.5 ([#2274](https://github.com/kubeflow/spark-operator/pull/2274) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump cloud.google.com/go/storage from 1.44.0 to 1.45.0 ([#2273](https://github.com/kubeflow/spark-operator/pull/2273) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Run e2e tests with Kubernetes version matrix ([#2266](https://github.com/kubeflow/spark-operator/pull/2266) by [@jacobsalway](https://github.com/jacobsalway))
- Bump aquasecurity/trivy-action from 0.27.0 to 0.28.0 ([#2270](https://github.com/kubeflow/spark-operator/pull/2270) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.65.3 to 1.66.0 ([#2271](https://github.com/kubeflow/spark-operator/pull/2271) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/aws/aws-sdk-go-v2/config from 1.27.43 to 1.28.0 ([#2272](https://github.com/kubeflow/spark-operator/pull/2272) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Add workflow for releasing sparkctl binary ([#2264](https://github.com/kubeflow/spark-operator/pull/2264) by [@ChenYi015](https://github.com/ChenYi015))
- Bump `volcano.sh/apis` to 1.10.0 ([#2320](https://github.com/kubeflow/spark-operator/pull/2320) by [@jacobsalway](https://github.com/jacobsalway))
- Bump aquasecurity/trivy-action from 0.28.0 to 0.29.0 ([#2332](https://github.com/kubeflow/spark-operator/pull/2332) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/onsi/ginkgo/v2 from 2.20.2 to 2.22.0 ([#2335](https://github.com/kubeflow/spark-operator/pull/2335) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Move sparkctl to cmd directory ([#2347](https://github.com/kubeflow/spark-operator/pull/2347) by [@ChenYi015](https://github.com/ChenYi015))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/a8b5d6...v2.1.0)
## [v2.0.2](https://github.com/kubeflow/spark-operator/tree/v2.0.2) (2024-10-10)
### Bug Fixes
- Fix ingress capability discovery ([#2201](https://github.com/kubeflow/spark-operator/pull/2201) by [@jacobsalway](https://github.com/jacobsalway))
- fix: imagePullPolicy was ignored ([#2222](https://github.com/kubeflow/spark-operator/pull/2222) by [@missedone](https://github.com/missedone))
- fix: spark-submission failed due to lack of permission by user `spark` ([#2223](https://github.com/kubeflow/spark-operator/pull/2223) by [@missedone](https://github.com/missedone))
- Remove `cap_net_bind_service` from image ([#2216](https://github.com/kubeflow/spark-operator/pull/2216) by [@jacobsalway](https://github.com/jacobsalway))
- fix: webhook panics due to logging ([#2232](https://github.com/kubeflow/spark-operator/pull/2232) by [@ChenYi015](https://github.com/ChenYi015))
### Misc
- Bump github.com/aws/aws-sdk-go-v2 from 1.30.5 to 1.31.0 ([#2207](https://github.com/kubeflow/spark-operator/pull/2207) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/net from 0.28.0 to 0.29.0 ([#2205](https://github.com/kubeflow/spark-operator/pull/2205) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/docker/docker from 27.0.3+incompatible to 27.1.1+incompatible ([#2125](https://github.com/kubeflow/spark-operator/pull/2125) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.58.3 to 1.63.3 ([#2206](https://github.com/kubeflow/spark-operator/pull/2206) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Update integration test workflow and add golangci lint check ([#2197](https://github.com/kubeflow/spark-operator/pull/2197) by [@ChenYi015](https://github.com/ChenYi015))
- Bump github.com/aws/aws-sdk-go-v2 from 1.31.0 to 1.32.0 ([#2229](https://github.com/kubeflow/spark-operator/pull/2229) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump cloud.google.com/go/storage from 1.43.0 to 1.44.0 ([#2228](https://github.com/kubeflow/spark-operator/pull/2228) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump manusa/actions-setup-minikube from 2.11.0 to 2.12.0 ([#2226](https://github.com/kubeflow/spark-operator/pull/2226) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump golang.org/x/time from 0.6.0 to 0.7.0 ([#2227](https://github.com/kubeflow/spark-operator/pull/2227) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/aws/aws-sdk-go-v2/config from 1.27.33 to 1.27.42 ([#2231](https://github.com/kubeflow/spark-operator/pull/2231) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/prometheus/client_golang from 1.19.1 to 1.20.4 ([#2204](https://github.com/kubeflow/spark-operator/pull/2204) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Add check for generating manifests and code ([#2234](https://github.com/kubeflow/spark-operator/pull/2234) by [@ChenYi015](https://github.com/ChenYi015))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/v2.0.1...v2.0.2)
## [v2.0.1](https://github.com/kubeflow/spark-operator/tree/v2.0.1) (2024-09-26)
### New Features
- FEATURE: build operator image as non-root ([#2171](https://github.com/kubeflow/spark-operator/pull/2171) by [@ImpSy](https://github.com/ImpSy))
### Bug Fixes
- Update controller RBAC for ConfigMap and PersistentVolumeClaim ([#2187](https://github.com/kubeflow/spark-operator/pull/2187) by [@ChenYi015](https://github.com/ChenYi015))
### Misc
- Bump github.com/onsi/ginkgo/v2 from 2.19.0 to 2.20.2 ([#2188](https://github.com/kubeflow/spark-operator/pull/2188) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump github.com/onsi/gomega from 1.33.1 to 1.34.2 ([#2189](https://github.com/kubeflow/spark-operator/pull/2189) by [@dependabot[bot]](https://github.com/apps/dependabot))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/v2.0.0...v2.0.1)
## [v2.0.0](https://github.com/kubeflow/spark-operator/tree/v2.0.0) (2024-09-23)
### Breaking Changes
- Use controller-runtime to reconstruct spark operator ([#2072](https://github.com/kubeflow/spark-operator/pull/2072) by [@ChenYi015](https://github.com/ChenYi015))
- feat: support driver and executor pod use different priority ([#2146](https://github.com/kubeflow/spark-operator/pull/2146) by [@Kevinz857](https://github.com/Kevinz857))
### New Features
- Support gang scheduling with Yunikorn ([#2107](https://github.com/kubeflow/spark-operator/pull/2107)) by [@jacobsalway](https://github.com/jacobsalway)
- Reintroduce option webhook.enable ([#2142](https://github.com/kubeflow/spark-operator/pull/2142) by [@ChenYi015](https://github.com/ChenYi015))
- Add default batch scheduler argument ([#2143](https://github.com/kubeflow/spark-operator/pull/2143) by [@jacobsalway](https://github.com/jacobsalway))
- Support extended kube-scheduler as batch scheduler ([#2136](https://github.com/kubeflow/spark-operator/pull/2136) by [@ChenYi015](https://github.com/ChenYi015))
- Set schedulerName to Yunikorn ([#2153](https://github.com/kubeflow/spark-operator/pull/2153) by [@jacobsalway](https://github.com/jacobsalway))
- Feature: Add pprof endpoint ([#2164](https://github.com/kubeflow/spark-operator/pull/2164) by [@ImpSy](https://github.com/ImpSy))
### Bug Fixes
- fix: Add default values for namespaces to match usage descriptions ([#2128](https://github.com/kubeflow/spark-operator/pull/2128) by [@snappyyouth](https://github.com/snappyyouth))
- Fix: Spark role binding did not render properly when setting spark service account name ([#2135](https://github.com/kubeflow/spark-operator/pull/2135) by [@ChenYi015](https://github.com/ChenYi015))
- fix: unable to set controller/webhook replicas to zero ([#2147](https://github.com/kubeflow/spark-operator/pull/2147) by [@ChenYi015](https://github.com/ChenYi015))
- Adding support for setting spark job namespaces to all namespaces ([#2123](https://github.com/kubeflow/spark-operator/pull/2123) by [@ChenYi015](https://github.com/ChenYi015))
- Fix: e2e test fails due to webhook not ready ([#2149](https://github.com/kubeflow/spark-operator/pull/2149) by [@ChenYi015](https://github.com/ChenYi015))
- fix: webhook not working when settings spark job namespaces to empty ([#2163](https://github.com/kubeflow/spark-operator/pull/2163) by [@ChenYi015](https://github.com/ChenYi015))
- fix: The logger had an odd number of arguments, making it panic ([#2166](https://github.com/kubeflow/spark-operator/pull/2166) by [@tcassaert](https://github.com/tcassaert))
- fix the make kind-delete-cluster target to avoid accidental kubeconfig deletion ([#2172](https://github.com/kubeflow/spark-operator/pull/2172) by [@ImpSy](https://github.com/ImpSy))
- Add specific error in log line when failed to create web UI service ([#2170](https://github.com/kubeflow/spark-operator/pull/2170) by [@tcassaert](https://github.com/tcassaert))
- Account for spark.executor.pyspark.memory in Yunikorn gang scheduling ([#2178](https://github.com/kubeflow/spark-operator/pull/2178) by [@jacobsalway](https://github.com/jacobsalway))
- Fix: spark application does not respect time to live seconds ([#2165](https://github.com/kubeflow/spark-operator/pull/2165) by [@ChenYi015](https://github.com/ChenYi015))
### Misc
- Update workflow and docs for releasing Spark operator ([#2089](https://github.com/kubeflow/spark-operator/pull/2089) by [@ChenYi015](https://github.com/ChenYi015))
- Fix broken integration test CI ([#2109](https://github.com/kubeflow/spark-operator/pull/2109) by [@ChenYi015](https://github.com/ChenYi015))
- Fix CI: environment variable BRANCH is missed ([#2111](https://github.com/kubeflow/spark-operator/pull/2111) by [@ChenYi015](https://github.com/ChenYi015))
- Update Makefile for building sparkctl ([#2119](https://github.com/kubeflow/spark-operator/pull/2119) by [@ChenYi015](https://github.com/ChenYi015))
- Update release workflow and docs ([#2121](https://github.com/kubeflow/spark-operator/pull/2121) by [@ChenYi015](https://github.com/ChenYi015))
- Run e2e tests on Kind ([#2148](https://github.com/kubeflow/spark-operator/pull/2148) by [@jacobsalway](https://github.com/jacobsalway))
- Upgrade to Go 1.23.1 ([#2155](https://github.com/kubeflow/spark-operator/pull/2155) by [@jacobsalway](https://github.com/jacobsalway))
- Upgrade to Spark 3.5.2 ([#2154](https://github.com/kubeflow/spark-operator/pull/2154) by [@jacobsalway](https://github.com/jacobsalway))
- Bump sigs.k8s.io/scheduler-plugins from 0.29.7 to 0.29.8 ([#2159](https://github.com/kubeflow/spark-operator/pull/2159) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump gocloud.dev from 0.37.0 to 0.39.0 ([#2160](https://github.com/kubeflow/spark-operator/pull/2160) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Update e2e tests ([#2161](https://github.com/kubeflow/spark-operator/pull/2161) by [@ChenYi015](https://github.com/ChenYi015))
- Upgrade to Spark 3.5.2 (#2012) ([#2157](https://github.com/kubeflow/spark-operator/pull/2157) by [@ha2hi](https://github.com/ha2hi))
- Bump github.com/aws/aws-sdk-go-v2/config from 1.27.27 to 1.27.33 ([#2174](https://github.com/kubeflow/spark-operator/pull/2174) by [@dependabot[bot]](https://github.com/apps/dependabot))
- Bump helm.sh/helm/v3 from 3.15.3 to 3.16.1 ([#2173](https://github.com/kubeflow/spark-operator/pull/2173) by [@dependabot[bot]](https://github.com/apps/dependabot))
- implement workflow to scan latest released docker image ([#2177](https://github.com/kubeflow/spark-operator/pull/2177) by [@ImpSy](https://github.com/ImpSy))
## What's Changed
- Cherry pick #2081 #2046 #2091 #2072 by @ChenYi015 in <https://github.com/kubeflow/spark-operator/pull/2108>
- Cherry pick #2089 #2109 #2111 by @ChenYi015 in <https://github.com/kubeflow/spark-operator/pull/2110>
- Release v2.0.0-rc.0 by @ChenYi015 in <https://github.com/kubeflow/spark-operator/pull/2115>
- Cherry pick commits for releasing v2.0.0 by @ChenYi015 in <https://github.com/kubeflow/spark-operator/pull/2156>
- Release v2.0.0 by @ChenYi015 in <https://github.com/kubeflow/spark-operator/pull/2182>
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/v1beta2-1.6.2-3.5.0...v2.0.0)
## [v2.0.0-rc.0](https://github.com/kubeflow/spark-operator/tree/v2.0.0-rc.0) (2024-08-09)
### Breaking Changes
- Use controller-runtime to reconstruct spark operator ([#2072](https://github.com/kubeflow/spark-operator/pull/2072) by [@ChenYi015](https://github.com/ChenYi015))
### Misc
- Fix CI: environment variable BRANCH is missed ([#2111](https://github.com/kubeflow/spark-operator/pull/2111) by [@ChenYi015](https://github.com/ChenYi015))
- Fix broken integration test CI ([#2109](https://github.com/kubeflow/spark-operator/pull/2109) by [@ChenYi015](https://github.com/ChenYi015))
- Update workflow and docs for releasing Spark operator ([#2089](https://github.com/kubeflow/spark-operator/pull/2089) by [@ChenYi015](https://github.com/ChenYi015))
### What's Changed
- Release v2.0.0-rc.0 ([#2115](https://github.com/kubeflow/spark-operator/pull/2115) by [@ChenYi015](https://github.com/ChenYi015))
- Cherry pick #2089 #2109 #2111 ([#2110](https://github.com/kubeflow/spark-operator/pull/2110) by [@ChenYi015](https://github.com/ChenYi015))
- Cherry pick #2081 #2046 #2091 #2072 ([#2108](https://github.com/kubeflow/spark-operator/pull/2108) by [@ChenYi015](https://github.com/ChenYi015))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.4.3...v2.0.0-rc.0)
## [spark-operator-chart-1.4.6](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.4.6) (2024-07-26)
- Add topologySpreadConstraints ([#2091](https://github.com/kubeflow/spark-operator/pull/2091) by [@jbhalodia-slack](https://github.com/jbhalodia-slack))
- Add Alibaba Cloud to adopters ([#2097](https://github.com/kubeflow/spark-operator/pull/2097) by [@ChenYi015](https://github.com/ChenYi015))
- Update Stale bot settings ([#2095](https://github.com/kubeflow/spark-operator/pull/2095) by [@andreyvelich](https://github.com/andreyvelich))
- Add @ChenYi015 to approvers ([#2096](https://github.com/kubeflow/spark-operator/pull/2096) by [@ChenYi015](https://github.com/ChenYi015))
- Add CHANGELOG.md file and use python script to generate it automatically ([#2087](https://github.com/kubeflow/spark-operator/pull/2087) by [@ChenYi015](https://github.com/ChenYi015))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.4.5...spark-operator-chart-1.4.6)
## [spark-operator-chart-1.4.5](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.4.5) (2024-07-22)
- Update the process to build api-docs, generate CRD manifests and code ([#2046](https://github.com/kubeflow/spark-operator/pull/2046) by [@ChenYi015](https://github.com/ChenYi015))
- Add workflow for closing stale issues and PRs ([#2073](https://github.com/kubeflow/spark-operator/pull/2073) by [@ChenYi015](https://github.com/ChenYi015))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.4.4...spark-operator-chart-1.4.5)
## [spark-operator-chart-1.4.4](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.4.4) (2024-07-22)
- Update helm docs ([#2081](https://github.com/kubeflow/spark-operator/pull/2081) by [@csp33](https://github.com/csp33))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.4.3...spark-operator-chart-1.4.4)
## [spark-operator-chart-1.4.3](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.4.3) (2024-07-03)
- Add PodDisruptionBudget to chart ([#2078](https://github.com/kubeflow/spark-operator/pull/2078) by [@csp33](https://github.com/csp33))
- Update README and documentation ([#2047](https://github.com/kubeflow/spark-operator/pull/2047) by [@ChenYi015](https://github.com/ChenYi015))
- Add code of conduct and update contributor guide ([#2074](https://github.com/kubeflow/spark-operator/pull/2074) by [@ChenYi015](https://github.com/ChenYi015))
- Remove .gitlab-ci.yml ([#2069](https://github.com/kubeflow/spark-operator/pull/2069) by [@jacobsalway](https://github.com/jacobsalway))
- Modified README.MD as per changes discussed on <https://github.com/kubeflow/spark-operator/pull/2062> ([#2066](https://github.com/kubeflow/spark-operator/pull/2066) by [@vikas-saxena02](https://github.com/vikas-saxena02))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.4.2...spark-operator-chart-1.4.3)
## [spark-operator-chart-1.4.2](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.4.2) (2024-06-17)
- Support objectSelector on mutating webhook ([#2058](https://github.com/kubeflow/spark-operator/pull/2058) by [@Cian911](https://github.com/Cian911))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.4.1...spark-operator-chart-1.4.2)
## [spark-operator-chart-1.4.1](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.4.1) (2024-06-15)
- Adding an option to set the priority class for spark-operator pod ([#2043](https://github.com/kubeflow/spark-operator/pull/2043) by [@pkgajulapalli](https://github.com/pkgajulapalli))
- Update minikube version in CI ([#2059](https://github.com/kubeflow/spark-operator/pull/2059) by [@Cian911](https://github.com/Cian911))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.4.0...spark-operator-chart-1.4.1)
## [spark-operator-chart-1.4.0](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.4.0) (2024-06-05)
- Certificates are generated by the operator rather than gencerts.sh ([#2016](https://github.com/kubeflow/spark-operator/pull/2016) by [@ChenYi015](https://github.com/ChenYi015))
- Add ChenYi015 as spark-operator reviewer ([#2045](https://github.com/kubeflow/spark-operator/pull/2045) by [@ChenYi015](https://github.com/ChenYi015))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.3.2...spark-operator-chart-1.4.0)
## [spark-operator-chart-1.3.2](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.3.2) (2024-06-05)
- Bump appVersion to v1beta2-1.5.0-3.5.0 ([#2044](https://github.com/kubeflow/spark-operator/pull/2044) by [@ChenYi015](https://github.com/ChenYi015))
- Add restartPolicy field to SparkApplication Driver/Executor initContainers CRDs ([#2022](https://github.com/kubeflow/spark-operator/pull/2022) by [@mschroering](https://github.com/mschroering))
- :memo: Add Inter&Co to who-is-using.md ([#2040](https://github.com/kubeflow/spark-operator/pull/2040) by [@ignitz](https://github.com/ignitz))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.3.1...spark-operator-chart-1.3.2)
## [spark-operator-chart-1.3.1](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.3.1) (2024-05-31)
- Chart: add POD_NAME env for leader election ([#2039](https://github.com/kubeflow/spark-operator/pull/2039) by [@Aakcht](https://github.com/Aakcht))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.3.0...spark-operator-chart-1.3.1)
## [spark-operator-chart-1.3.0](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.3.0) (2024-05-20)
- Support exposing extra TCP ports in Spark Driver via K8s Ingress ([#1998](https://github.com/kubeflow/spark-operator/pull/1998) by [@hiboyang](https://github.com/hiboyang))
- Fixes a bug with dynamic allocation forcing the executor count to be 1 even when minExecutors is set to 0 ([#1979](https://github.com/kubeflow/spark-operator/pull/1979) by [@peter-mcclonski](https://github.com/peter-mcclonski))
- Remove outdated PySpark experimental warning in example ([#2014](https://github.com/kubeflow/spark-operator/pull/2014) by [@andrejpk](https://github.com/andrejpk))
- Update Spark Job Namespace docs ([#2000](https://github.com/kubeflow/spark-operator/pull/2000) by [@matthewrossi](https://github.com/matthewrossi))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.2.15...spark-operator-chart-1.3.0)
## [spark-operator-chart-1.2.15](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.2.15) (2024-05-07)
- Fix examples ([#2010](https://github.com/kubeflow/spark-operator/pull/2010) by [@peter-mcclonski](https://github.com/peter-mcclonski))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.2.14...spark-operator-chart-1.2.15)
## [spark-operator-chart-1.2.14](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.2.14) (2024-04-26)
- feat: add support for service labels on driver-svc ([#1985](https://github.com/kubeflow/spark-operator/pull/1985) by [@Cian911](https://github.com/Cian911))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.2.13...spark-operator-chart-1.2.14)
## [spark-operator-chart-1.2.13](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.2.13) (2024-04-24)
- fix(chart): remove operator namespace default for job namespaces value ([#1989](https://github.com/kubeflow/spark-operator/pull/1989) by [@t3mi](https://github.com/t3mi))
- Fix Docker Hub Credentials in CI ([#2003](https://github.com/kubeflow/spark-operator/pull/2003) by [@andreyvelich](https://github.com/andreyvelich))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.2.12...spark-operator-chart-1.2.13)
## [spark-operator-chart-1.2.12](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.2.12) (2024-04-19)
- Add emptyDir sizeLimit support for local dirs ([#1993](https://github.com/kubeflow/spark-operator/pull/1993) by [@jacobsalway](https://github.com/jacobsalway))
- fix: Removed `publish-image` dependency on publishing the helm chart ([#1995](https://github.com/kubeflow/spark-operator/pull/1995) by [@vara-bonthu](https://github.com/vara-bonthu))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.2.11...spark-operator-chart-1.2.12)
## [spark-operator-chart-1.2.11](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.2.11) (2024-04-19)
- fix: Update Github workflow to publish Helm charts on chart changes, irrespective of image change ([#1992](https://github.com/kubeflow/spark-operator/pull/1992) by [@vara-bonthu](https://github.com/vara-bonthu))
- chore: Add Timo to user list ([#1615](https://github.com/kubeflow/spark-operator/pull/1615) by [@vanducng](https://github.com/vanducng))
- Update spark operator permissions for CRD ([#1973](https://github.com/kubeflow/spark-operator/pull/1973) by [@ChenYi015](https://github.com/ChenYi015))
- fix spark-rbac ([#1986](https://github.com/kubeflow/spark-operator/pull/1986) by [@Aransh](https://github.com/Aransh))
- Use Kubeflow Docker Hub for Spark Operator Image ([#1974](https://github.com/kubeflow/spark-operator/pull/1974) by [@andreyvelich](https://github.com/andreyvelich))
- fix: fixed serviceaccount annotations ([#1972](https://github.com/kubeflow/spark-operator/pull/1972) by [@AndrewChubatiuk](https://github.com/AndrewChubatiuk))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.2.7...spark-operator-chart-1.2.11)
## [spark-operator-chart-1.2.7](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.2.7) (2024-04-16)
- fix: upgraded k8s deps ([#1983](https://github.com/kubeflow/spark-operator/pull/1983) by [@AndrewChubatiuk](https://github.com/AndrewChubatiuk))
- chore: remove k8s.io/kubernetes replaces and adapt to v1.29.3 apis ([#1968](https://github.com/kubeflow/spark-operator/pull/1968) by [@ajayk](https://github.com/ajayk))
- Add some helm chart unit tests and fix spark service account render failure when extra annotations are specified ([#1967](https://github.com/kubeflow/spark-operator/pull/1967) by [@ChenYi015](https://github.com/ChenYi015))
- feat: Doc updates, Issue and PR templates are added ([#1970](https://github.com/kubeflow/spark-operator/pull/1970) by [@vara-bonthu](https://github.com/vara-bonthu))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.2.5...spark-operator-chart-1.2.7)
## [spark-operator-chart-1.2.5](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.2.5) (2024-04-14)
- fixed docker image tag and updated chart docs ([#1969](https://github.com/kubeflow/spark-operator/pull/1969) by [@AndrewChubatiuk](https://github.com/AndrewChubatiuk))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.2.4...spark-operator-chart-1.2.5)
## [spark-operator-chart-1.2.4](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.2.4) (2024-04-13)
- publish chart independently, incremented both chart and image versions to trigger build of both ([#1964](https://github.com/kubeflow/spark-operator/pull/1964) by [@AndrewChubatiuk](https://github.com/AndrewChubatiuk))
- Update helm chart README ([#1958](https://github.com/kubeflow/spark-operator/pull/1958) by [@ChenYi015](https://github.com/ChenYi015))
- fix: add containerPort declaration for webhook in helm chart ([#1961](https://github.com/kubeflow/spark-operator/pull/1961) by [@zevisert](https://github.com/zevisert))
- added id for a build job to fix digests artifact creation ([#1963](https://github.com/kubeflow/spark-operator/pull/1963) by [@AndrewChubatiuk](https://github.com/AndrewChubatiuk))
- support multiple namespaces ([#1955](https://github.com/kubeflow/spark-operator/pull/1955) by [@AndrewChubatiuk](https://github.com/AndrewChubatiuk))
- chore: replace GoogleCloudPlatform/spark-on-k8s-operator with kubeflow/spark-operator ([#1937](https://github.com/kubeflow/spark-operator/pull/1937) by [@zevisert](https://github.com/zevisert))
- Chart: add patch permissions for spark operator SA to support spark 3.5.0 ([#1884](https://github.com/kubeflow/spark-operator/pull/1884) by [@Aakcht](https://github.com/Aakcht))
- Cleanup after golang upgrade ([#1956](https://github.com/kubeflow/spark-operator/pull/1956) by [@AndrewChubatiuk](https://github.com/AndrewChubatiuk))
- feat: add support for custom service labels ([#1952](https://github.com/kubeflow/spark-operator/pull/1952) by [@Cian911](https://github.com/Cian911))
- upgraded golang and dependencies ([#1954](https://github.com/kubeflow/spark-operator/pull/1954) by [@AndrewChubatiuk](https://github.com/AndrewChubatiuk))
- README for installing operator using kustomize with custom namespace and service name ([#1778](https://github.com/kubeflow/spark-operator/pull/1778) by [@shahsiddharth08](https://github.com/shahsiddharth08))
- BUGFIX: Added cancel method to fix context leak ([#1917](https://github.com/kubeflow/spark-operator/pull/1917) by [@fazledyn-or](https://github.com/fazledyn-or))
- remove unmatched quotes from user-guide.md ([#1584](https://github.com/kubeflow/spark-operator/pull/1584) by [@taeyeopkim1](https://github.com/taeyeopkim1))
- Add PVC permission to Operator role ([#1889](https://github.com/kubeflow/spark-operator/pull/1889) by [@wyangsun](https://github.com/wyangsun))
- Allow to set webhook job resource limits (#1429,#1300) ([#1946](https://github.com/kubeflow/spark-operator/pull/1946) by [@karbyshevds](https://github.com/karbyshevds))
- Create OWNERS ([#1927](https://github.com/kubeflow/spark-operator/pull/1927) by [@zijianjoy](https://github.com/zijianjoy))
- fix: fix issue #1723 about spark-operator not working with volcano on OCP ([#1724](https://github.com/kubeflow/spark-operator/pull/1724) by [@disaster37](https://github.com/disaster37))
- Add Rokt to who-is-using.md ([#1867](https://github.com/kubeflow/spark-operator/pull/1867) by [@jacobsalway](https://github.com/jacobsalway))
- Handle invalid API resources in discovery ([#1758](https://github.com/kubeflow/spark-operator/pull/1758) by [@wiltonsr](https://github.com/wiltonsr))
- Fix docs for Volcano integration ([#1719](https://github.com/kubeflow/spark-operator/pull/1719) by [@VVKot](https://github.com/VVKot))
- Added qualytics to who is using ([#1736](https://github.com/kubeflow/spark-operator/pull/1736) by [@josecsotomorales](https://github.com/josecsotomorales))
- Allowing optional annotation on rbac ([#1770](https://github.com/kubeflow/spark-operator/pull/1770) by [@cxfcxf](https://github.com/cxfcxf))
- Support `seccompProfile` in Spark application CRD and fix pre-commit jobs ([#1768](https://github.com/kubeflow/spark-operator/pull/1768) by [@ordukhanian](https://github.com/ordukhanian))
- Updating webhook docs to also mention eks ([#1763](https://github.com/kubeflow/spark-operator/pull/1763) by [@JunaidChaudry](https://github.com/JunaidChaudry))
- Link to helm docs fixed ([#1783](https://github.com/kubeflow/spark-operator/pull/1783) by [@haron](https://github.com/haron))
- Improve getMasterURL() to add [] to IPv6 if needed ([#1825](https://github.com/kubeflow/spark-operator/pull/1825) by [@LittleWat](https://github.com/LittleWat))
- Add envFrom to operator deployment ([#1785](https://github.com/kubeflow/spark-operator/pull/1785) by [@matschaffer-roblox](https://github.com/matschaffer-roblox))
- Expand ingress docs a bit ([#1806](https://github.com/kubeflow/spark-operator/pull/1806) by [@matschaffer-roblox](https://github.com/matschaffer-roblox))
- Optional sidecars for operator pod ([#1754](https://github.com/kubeflow/spark-operator/pull/1754) by [@qq157755587](https://github.com/qq157755587))
- Add Roblox to who-is ([#1784](https://github.com/kubeflow/spark-operator/pull/1784) by [@matschaffer-roblox](https://github.com/matschaffer-roblox))
- Molex started using spark K8 operator. ([#1714](https://github.com/kubeflow/spark-operator/pull/1714) by [@AshishPushpSingh](https://github.com/AshishPushpSingh))
- Extra helm chart labels ([#1669](https://github.com/kubeflow/spark-operator/pull/1669) by [@kvanzuijlen](https://github.com/kvanzuijlen))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.27...spark-operator-chart-1.2.4)
## [spark-operator-chart-1.1.27](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.27) (2023-03-17)
- Added permissions for leader election #1635 ([#1647](https://github.com/kubeflow/spark-operator/pull/1647) by [@ordukhanian](https://github.com/ordukhanian))
- Fix #1393: fix tolerations block in wrong segment for webhook jobs ([#1633](https://github.com/kubeflow/spark-operator/pull/1633) by [@zhiminglim](https://github.com/zhiminglim))
- add dependabot ([#1629](https://github.com/kubeflow/spark-operator/pull/1629) by [@monotek](https://github.com/monotek))
- Add support for `ephemeral.volumeClaimTemplate` in helm chart CRDs ([#1661](https://github.com/kubeflow/spark-operator/pull/1661) by [@ArshiAAkhavan](https://github.com/ArshiAAkhavan))
- Add Kognita to "Who is using" ([#1637](https://github.com/kubeflow/spark-operator/pull/1637) by [@claudino-kognita](https://github.com/claudino-kognita))
- add lifecycle to executor ([#1674](https://github.com/kubeflow/spark-operator/pull/1674) by [@tiechengsu](https://github.com/tiechengsu))
- Fix signal handling for non-leader processes ([#1680](https://github.com/kubeflow/spark-operator/pull/1680) by [@antonipp](https://github.com/antonipp))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.26...spark-operator-chart-1.1.27)
## [spark-operator-chart-1.1.26](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.26) (2022-10-25)
- update go to 1.19 + k8s.io libs to v0.25.3 ([#1630](https://github.com/kubeflow/spark-operator/pull/1630) by [@ImpSy](https://github.com/ImpSy))
- Update README - secrets and sidecars need mutating webhooks ([#1550](https://github.com/kubeflow/spark-operator/pull/1550) by [@djdillon](https://github.com/djdillon))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.25...spark-operator-chart-1.1.26)
## [spark-operator-chart-1.1.25](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.25) (2022-06-08)
- Webhook init and cleanup should respect nodeSelector ([#1545](https://github.com/kubeflow/spark-operator/pull/1545) by [@erikcw](https://github.com/erikcw))
- rename unit tests to integration tests in Makefile#integration-test ([#1539](https://github.com/kubeflow/spark-operator/pull/1539) by [@dcoliversun](https://github.com/dcoliversun))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.24...spark-operator-chart-1.1.25)
## [spark-operator-chart-1.1.24](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.24) (2022-06-01)
- Fix: use V1 api for CRDs for volcano integration ([#1540](https://github.com/kubeflow/spark-operator/pull/1540) by [@Aakcht](https://github.com/Aakcht))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.23...spark-operator-chart-1.1.24)
## [spark-operator-chart-1.1.23](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.23) (2022-05-18)
- fix: add pre-upgrade hook to rbac resources ([#1511](https://github.com/kubeflow/spark-operator/pull/1511) by [@cwyl02](https://github.com/cwyl02))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.22...spark-operator-chart-1.1.23)
## [spark-operator-chart-1.1.22](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.22) (2022-05-16)
- Fixes issue #1467 (issue when deleting SparkApplication without metrics server) ([#1530](https://github.com/kubeflow/spark-operator/pull/1530) by [@aneagoe](https://github.com/aneagoe))
- Implement --logs and --delete flags on 'sparkctl create' and a timeout on 'sparkctl log' to wait for pod startup ([#1506](https://github.com/kubeflow/spark-operator/pull/1506) by [@alaurentinoofficial](https://github.com/alaurentinoofficial))
- Fix Spark UI URL in app status ([#1518](https://github.com/kubeflow/spark-operator/pull/1518) by [@gtopper](https://github.com/gtopper))
- remove quotes from yaml file ([#1524](https://github.com/kubeflow/spark-operator/pull/1524) by [@zencircle](https://github.com/zencircle))
- Added missing manifest yaml, point the manifest to the right direction ([#1504](https://github.com/kubeflow/spark-operator/pull/1504) by [@RonZhang724](https://github.com/RonZhang724))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.21...spark-operator-chart-1.1.22)
## [spark-operator-chart-1.1.21](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.21) (2022-05-12)
- Ensure that driver is deleted prior to sparkapplication resubmission ([#1521](https://github.com/kubeflow/spark-operator/pull/1521) by [@khorshuheng](https://github.com/khorshuheng))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.20...spark-operator-chart-1.1.21)
## [spark-operator-chart-1.1.20](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.20) (2022-04-11)
- Add ingress-class-name controller flag ([#1482](https://github.com/kubeflow/spark-operator/pull/1482) by [@voyvodov](https://github.com/voyvodov))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.19...spark-operator-chart-1.1.20)
## [spark-operator-chart-1.1.19](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.19) (2022-02-14)
- Add Operator volumes and volumeMounts in chart ([#1475](https://github.com/kubeflow/spark-operator/pull/1475) by [@ocworld](https://github.com/ocworld))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.18...spark-operator-chart-1.1.19)
## [spark-operator-chart-1.1.18](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.18) (2022-02-13)
- Updated default registry to ghcr.io ([#1454](https://github.com/kubeflow/spark-operator/pull/1454) by [@aneagoe](https://github.com/aneagoe))
- Github actions workflow fix for Helm chart deployment ([#1456](https://github.com/kubeflow/spark-operator/pull/1456) by [@vara-bonthu](https://github.com/vara-bonthu))
- Kubernetes v1.22 extensions/v1beta1 API removal ([#1427](https://github.com/kubeflow/spark-operator/pull/1427) by [@aneagoe](https://github.com/aneagoe))
- Fixes an issue with github action in job build-spark-operator ([#1452](https://github.com/kubeflow/spark-operator/pull/1452) by [@aneagoe](https://github.com/aneagoe))
- use github container registry instead of gcr.io for releases ([#1422](https://github.com/kubeflow/spark-operator/pull/1422) by [@TomHellier](https://github.com/TomHellier))
- Fixes an error that was preventing the pods from being mutated ([#1421](https://github.com/kubeflow/spark-operator/pull/1421) by [@ssullivan](https://github.com/ssullivan))
- Make github actions more feature complete ([#1418](https://github.com/kubeflow/spark-operator/pull/1418) by [@TomHellier](https://github.com/TomHellier))
- Resolves an error when deploying the webhook where the k8s api indica… ([#1413](https://github.com/kubeflow/spark-operator/pull/1413) by [@ssullivan](https://github.com/ssullivan))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.15...spark-operator-chart-1.1.18)
## [spark-operator-chart-1.1.15](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.15) (2021-12-02)
- Add docker build to github action ([#1415](https://github.com/kubeflow/spark-operator/pull/1415) by [@TomHellier](https://github.com/TomHellier))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.14...spark-operator-chart-1.1.15)
## [spark-operator-chart-1.1.14](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.14) (2021-11-30)
- Updating API version of admissionregistration.k8s.io ([#1401](https://github.com/kubeflow/spark-operator/pull/1401) by [@sairamankumar2](https://github.com/sairamankumar2))
- Add C2FO to who is using ([#1391](https://github.com/kubeflow/spark-operator/pull/1391) by [@vanhoale](https://github.com/vanhoale))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.13...spark-operator-chart-1.1.14)
## [spark-operator-chart-1.1.13](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.13) (2021-11-18)
- delete-service-accounts-and-roles-before-creation ([#1384](https://github.com/kubeflow/spark-operator/pull/1384) by [@TiansuYu](https://github.com/TiansuYu))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.12...spark-operator-chart-1.1.13)
## [spark-operator-chart-1.1.12](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.12) (2021-11-14)
- webhook timeout variable ([#1387](https://github.com/kubeflow/spark-operator/pull/1387) by [@sairamankumar2](https://github.com/sairamankumar2))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.11...spark-operator-chart-1.1.12)
## [spark-operator-chart-1.1.11](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.11) (2021-11-12)
- [FIX] add service account access to persistentvolumeclaims ([#1390](https://github.com/kubeflow/spark-operator/pull/1390) by [@mschroering](https://github.com/mschroering))
- Add DeepCure to who is using ([#1389](https://github.com/kubeflow/spark-operator/pull/1389) by [@mschroering](https://github.com/mschroering))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.10...spark-operator-chart-1.1.11)
## [spark-operator-chart-1.1.10](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.10) (2021-11-09)
- Add custom toleration support for webhook jobs ([#1383](https://github.com/kubeflow/spark-operator/pull/1383) by [@korjek](https://github.com/korjek))
- fix container name in addsecuritycontext patch ([#1377](https://github.com/kubeflow/spark-operator/pull/1377) by [@lybavsky](https://github.com/lybavsky))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.9...spark-operator-chart-1.1.10)
## [spark-operator-chart-1.1.9](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.9) (2021-11-01)
- `Role` and `RoleBinding` not installed for `webhook-init` in Helm `pre-hook` ([#1379](https://github.com/kubeflow/spark-operator/pull/1379) by [@zzvara](https://github.com/zzvara))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.8...spark-operator-chart-1.1.9)
## [spark-operator-chart-1.1.8](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.8) (2021-10-26)
- Regenerate deleted cert after upgrade ([#1373](https://github.com/kubeflow/spark-operator/pull/1373) by [@simplylizz](https://github.com/simplylizz))
- Make manifests usable by Kustomize ([#1367](https://github.com/kubeflow/spark-operator/pull/1367) by [@karpoftea](https://github.com/karpoftea))
- #1329 update the operator to allow subpaths to be used with the spark ui ingress. ([#1330](https://github.com/kubeflow/spark-operator/pull/1330) by [@TomHellier](https://github.com/TomHellier))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.7...spark-operator-chart-1.1.8)
## [spark-operator-chart-1.1.7](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.7) (2021-10-21)
- serviceAccount annotations ([#1350](https://github.com/kubeflow/spark-operator/pull/1350) by [@moskitone](https://github.com/moskitone))
- Update Dockerfile ([#1369](https://github.com/kubeflow/spark-operator/pull/1369) by [@Sadagopan88](https://github.com/Sadagopan88))
- [FIX] tolerations are not directly present in Driver(/Executor)Spec ([#1365](https://github.com/kubeflow/spark-operator/pull/1365) by [@s-pedamallu](https://github.com/s-pedamallu))
- fix running metrics for application deletion ([#1358](https://github.com/kubeflow/spark-operator/pull/1358) by [@Aakcht](https://github.com/Aakcht))
- Update who-is-using.md ([#1338](https://github.com/kubeflow/spark-operator/pull/1338) by [@Juandavi1](https://github.com/Juandavi1))
- Update who-is-using.md ([#1082](https://github.com/kubeflow/spark-operator/pull/1082) by [@Juandavi1](https://github.com/Juandavi1))
- Add support for executor service account ([#1322](https://github.com/kubeflow/spark-operator/pull/1322) by [@bbenzikry](https://github.com/bbenzikry))
- fix NPE introduce on #1280 ([#1325](https://github.com/kubeflow/spark-operator/pull/1325) by [@ImpSy](https://github.com/ImpSy))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.6...spark-operator-chart-1.1.7)
## [spark-operator-chart-1.1.6](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.6) (2021-08-04)
- Add hook deletion policy for spark-operator service account ([#1313](https://github.com/kubeflow/spark-operator/pull/1313) by [@pdrastil](https://github.com/pdrastil))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.5...spark-operator-chart-1.1.6)
## [spark-operator-chart-1.1.5](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.5) (2021-07-28)
- Add user defined pod labels ([#1288](https://github.com/kubeflow/spark-operator/pull/1288) by [@pdrastil](https://github.com/pdrastil))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.4...spark-operator-chart-1.1.5)
## [spark-operator-chart-1.1.4](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.4) (2021-07-25)
- Migrate CRDs from v1beta1 to v1. Add additionalPrinterColumns ([#1298](https://github.com/kubeflow/spark-operator/pull/1298) by [@drazul](https://github.com/drazul))
- Explain "signal: kill" errors during submission ([#1292](https://github.com/kubeflow/spark-operator/pull/1292) by [@zzvara](https://github.com/zzvara))
- fix the invalid repo address ([#1291](https://github.com/kubeflow/spark-operator/pull/1291) by [@william-wang](https://github.com/william-wang))
- add failure context to recordExecutorEvent ([#1280](https://github.com/kubeflow/spark-operator/pull/1280) by [@ImpSy](https://github.com/ImpSy))
- Update pythonVersion to fix example ([#1284](https://github.com/kubeflow/spark-operator/pull/1284) by [@stratus](https://github.com/stratus))
- add crds drift check between chart/ and manifest/ ([#1272](https://github.com/kubeflow/spark-operator/pull/1272) by [@ImpSy](https://github.com/ImpSy))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.3...spark-operator-chart-1.1.4)
## [spark-operator-chart-1.1.3](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.3) (2021-05-25)
- Allow user to specify service annotation on Spark UI service ([#1264](https://github.com/kubeflow/spark-operator/pull/1264) by [@khorshuheng](https://github.com/khorshuheng))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.2...spark-operator-chart-1.1.3)
## [spark-operator-chart-1.1.2](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.2) (2021-05-25)
- implement shareProcessNamespace in SparkPodSpec ([#1262](https://github.com/kubeflow/spark-operator/pull/1262) by [@ImpSy](https://github.com/ImpSy))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.1...spark-operator-chart-1.1.2)
## [spark-operator-chart-1.1.1](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.1) (2021-05-19)
- Enable UI service flag for disabling UI service ([#1261](https://github.com/kubeflow/spark-operator/pull/1261) by [@sairamankumar2](https://github.com/sairamankumar2))
- Add DiDi to who-is-using.md ([#1255](https://github.com/kubeflow/spark-operator/pull/1255) by [@Run-Lin](https://github.com/Run-Lin))
- doc: update who is using page ([#1251](https://github.com/kubeflow/spark-operator/pull/1251) by [@luizm](https://github.com/luizm))
- Add Tongdun under who-is-using ([#1249](https://github.com/kubeflow/spark-operator/pull/1249) by [@lomoJG](https://github.com/lomoJG))
- [#1239] Custom service port name for spark application UI ([#1240](https://github.com/kubeflow/spark-operator/pull/1240) by [@marcozov](https://github.com/marcozov))
- fix: do not remove preemptionPolicy in patcher when not present ([#1246](https://github.com/kubeflow/spark-operator/pull/1246) by [@HHK1](https://github.com/HHK1))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.1.0...spark-operator-chart-1.1.1)
## [spark-operator-chart-1.1.0](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.1.0) (2021-04-28)
- Updating Spark version from 3.0 to 3.1.1 ([#1153](https://github.com/kubeflow/spark-operator/pull/1153) by [@chethanuk](https://github.com/chethanuk))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.10...spark-operator-chart-1.1.0)
## [spark-operator-chart-1.0.10](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.0.10) (2021-04-28)
- Add support for blue/green deployments ([#1230](https://github.com/kubeflow/spark-operator/pull/1230) by [@flupke](https://github.com/flupke))
- Update who-is-using.md: Fossil is using Spark Operator for Production ([#1244](https://github.com/kubeflow/spark-operator/pull/1244) by [@duyet](https://github.com/duyet))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.9...spark-operator-chart-1.0.10)
## [spark-operator-chart-1.0.9](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.0.9) (2021-04-23)
- Link to Kubernetes Slack ([#1234](https://github.com/kubeflow/spark-operator/pull/1234) by [@jsoref](https://github.com/jsoref))
- fix: remove preemptionPolicy when priority class name is used ([#1236](https://github.com/kubeflow/spark-operator/pull/1236) by [@HHK1](https://github.com/HHK1))
- Spelling ([#1231](https://github.com/kubeflow/spark-operator/pull/1231) by [@jsoref](https://github.com/jsoref))
- Add support to expose custom ports ([#1205](https://github.com/kubeflow/spark-operator/pull/1205) by [@luizm](https://github.com/luizm))
- Fix the error of hostAliases when there are more than 2 hostnames ([#1209](https://github.com/kubeflow/spark-operator/pull/1209) by [@cdmikechen](https://github.com/cdmikechen))
- remove multiple prefixes for 'p' ([#1210](https://github.com/kubeflow/spark-operator/pull/1210) by [@chaudhryfaisal](https://github.com/chaudhryfaisal))
- added --s3-force-path-style to force path style URLs for S3 objects ([#1206](https://github.com/kubeflow/spark-operator/pull/1206) by [@chaudhryfaisal](https://github.com/chaudhryfaisal))
- Allow custom bucket path ([#1207](https://github.com/kubeflow/spark-operator/pull/1207) by [@bribroder](https://github.com/bribroder))
- fix: Remove priority from the spec when using priority class ([#1203](https://github.com/kubeflow/spark-operator/pull/1203) by [@HHK1](https://github.com/HHK1))
- Fix go get issue with "unknown revision v0.0.0" ([#1198](https://github.com/kubeflow/spark-operator/pull/1198) by [@hongshaoyang](https://github.com/hongshaoyang))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.8...spark-operator-chart-1.0.9)
## [spark-operator-chart-1.0.8](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.0.8) (2021-03-07)
- Helm: Put service account into pre-install hook. ([#1155](https://github.com/kubeflow/spark-operator/pull/1155) by [@tandrup](https://github.com/tandrup))
- correct hook annotation for webhook job ([#1193](https://github.com/kubeflow/spark-operator/pull/1193) by [@chaudhryfaisal](https://github.com/chaudhryfaisal))
- Update who-is-using.md ([#1174](https://github.com/kubeflow/spark-operator/pull/1174) by [@tarek-izemrane](https://github.com/tarek-izemrane))
- add Carrefour as adopter and contributor ([#1156](https://github.com/kubeflow/spark-operator/pull/1156) by [@AliGouta](https://github.com/AliGouta))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.7...spark-operator-chart-1.0.8)
## [spark-operator-chart-1.0.7](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.0.7) (2021-02-05)
- fix issue #1131 ([#1142](https://github.com/kubeflow/spark-operator/pull/1142) by [@kz33](https://github.com/kz33))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.6...spark-operator-chart-1.0.7)
## [spark-operator-chart-1.0.6](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.0.6) (2021-02-04)
- Add Fossil to who-is-using.md ([#1152](https://github.com/kubeflow/spark-operator/pull/1152) by [@duyet](https://github.com/duyet))
- #1143 Helm issues while deploying using argocd ([#1145](https://github.com/kubeflow/spark-operator/pull/1145) by [@TomHellier](https://github.com/TomHellier))
- Include Gojek in who-is-using.md ([#1146](https://github.com/kubeflow/spark-operator/pull/1146) by [@pradithya](https://github.com/pradithya))
- add hostAliases for SparkPodSpec ([#1133](https://github.com/kubeflow/spark-operator/pull/1133) by [@ImpSy](https://github.com/ImpSy))
- Adding MavenCode ([#1128](https://github.com/kubeflow/spark-operator/pull/1128) by [@charlesa101](https://github.com/charlesa101))
- Add MongoDB to who-is-using.md ([#1123](https://github.com/kubeflow/spark-operator/pull/1123) by [@chickenPopcorn](https://github.com/chickenPopcorn))
- update go version to 1.15 and k8s deps to v0.19.6 ([#1119](https://github.com/kubeflow/spark-operator/pull/1119) by [@stpabhi](https://github.com/stpabhi))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.5...spark-operator-chart-1.0.6)
## [spark-operator-chart-1.0.5](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.0.5) (2020-12-15)
- Add prometheus container port name ([#1099](https://github.com/kubeflow/spark-operator/pull/1099) by [@nicholas-fwang](https://github.com/nicholas-fwang))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.4...spark-operator-chart-1.0.5)
## [spark-operator-chart-1.0.4](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.0.4) (2020-12-12)
- Upgrade the Chart version to 1.0.4 ([#1113](https://github.com/kubeflow/spark-operator/pull/1113) by [@ordukhanian](https://github.com/ordukhanian))
- Support Prometheus PodMonitor Deployment (#1106) ([#1112](https://github.com/kubeflow/spark-operator/pull/1112) by [@ordukhanian](https://github.com/ordukhanian))
- update executor status if pod is lost while app is still running ([#1111](https://github.com/kubeflow/spark-operator/pull/1111) by [@ImpSy](https://github.com/ImpSy))
- Add scheduler func for clearing batch scheduling on completed ([#1079](https://github.com/kubeflow/spark-operator/pull/1079) by [@nicholas-fwang](https://github.com/nicholas-fwang))
- Add configuration for SparkUI service type ([#1100](https://github.com/kubeflow/spark-operator/pull/1100) by [@jutley](https://github.com/jutley))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.3...spark-operator-chart-1.0.4)
## [spark-operator-chart-1.0.3](https://github.com/kubeflow/spark-operator/tree/spark-operator-chart-1.0.3) (2020-12-07)
- Update docs with new helm instructions ([#1105](https://github.com/kubeflow/spark-operator/pull/1105) by [@hagaibarel](https://github.com/hagaibarel))
[Full Changelog](https://github.com/kubeflow/spark-operator/compare/spark-operator-chart-1.0.2...spark-operator-chart-1.0.3)

Dockerfile

@@ -14,35 +14,47 @@
# limitations under the License.
#
ARG SPARK_IMAGE=spark:3.5.0
ARG SPARK_IMAGE=docker.io/library/spark:4.0.0
FROM golang:1.22-alpine as builder
FROM golang:1.24.1 AS builder
WORKDIR /workspace
# Copy the Go Modules manifests
COPY go.mod go.mod
COPY go.sum go.sum
# Cache deps before building and copying source so that we don't need to re-download as much
# and so that source changes don't invalidate our downloaded layer
RUN go mod download
RUN --mount=type=cache,target=/go/pkg/mod/ \
--mount=type=bind,source=go.mod,target=go.mod \
--mount=type=bind,source=go.sum,target=go.sum \
go mod download
# Copy the go source code
COPY main.go main.go
COPY pkg/ pkg/
COPY . .
ENV GOCACHE=/root/.cache/go-build
# Build
ARG TARGETARCH
RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} GO111MODULE=on go build -a -o /usr/bin/spark-operator main.go
RUN --mount=type=cache,target=/go/pkg/mod/ \
--mount=type=cache,target="/root/.cache/go-build" \
CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} GO111MODULE=on make build-operator
FROM ${SPARK_IMAGE}
ARG SPARK_UID=185
ARG SPARK_GID=185
USER root
COPY --from=builder /usr/bin/spark-operator /usr/bin/
RUN apt-get update --allow-releaseinfo-change \
&& apt-get update \
RUN apt-get update \
&& apt-get install -y tini \
&& rm -rf /var/lib/apt/lists/*
RUN mkdir -p /etc/k8s-webhook-server/serving-certs /home/spark && \
chmod -R g+rw /etc/k8s-webhook-server/serving-certs && \
chown -R spark /etc/k8s-webhook-server/serving-certs /home/spark
USER ${SPARK_UID}:${SPARK_GID}
COPY --from=builder /workspace/bin/spark-operator /usr/bin/spark-operator
COPY entrypoint.sh /usr/bin/
ENTRYPOINT ["/usr/bin/entrypoint.sh"]

Makefile

@@ -12,13 +12,21 @@ endif
SHELL = /usr/bin/env bash -o pipefail
.SHELLFLAGS = -ec
REPO=github.com/kubeflow/spark-operator
SPARK_OPERATOR_GOPATH=/go/src/github.com/kubeflow/spark-operator
SPARK_OPERATOR_CHART_PATH=charts/spark-operator-chart
OPERATOR_VERSION ?= $$(grep appVersion $(SPARK_OPERATOR_CHART_PATH)/Chart.yaml | awk '{print $$2}')
DEP_VERSION:=`grep DEP_VERSION= Dockerfile | awk -F\" '{print $$2}'`
BUILDER=`grep "FROM golang:" Dockerfile | awk '{print $$2}'`
UNAME:=`uname | tr '[:upper:]' '[:lower:]'`
# Version information.
VERSION ?= $(shell cat VERSION | sed "s/^v//")
BUILD_DATE := $(shell date -u +"%Y-%m-%dT%H:%M:%S%:z")
GIT_COMMIT := $(shell git rev-parse HEAD)
GIT_TAG := $(shell if [ -z "`git status --porcelain`" ]; then git describe --exact-match --tags HEAD 2>/dev/null; fi)
GIT_TREE_STATE := $(shell if [ -z "`git status --porcelain`" ]; then echo "clean" ; else echo "dirty"; fi)
GIT_SHA := $(shell git rev-parse --short HEAD || echo "HEAD")
GIT_VERSION := ${VERSION}+${GIT_SHA}
MODULE_PATH := $(shell awk '/^module/{print $$2; exit}' go.mod)
SPARK_OPERATOR_GOPATH := /go/src/github.com/kubeflow/spark-operator
SPARK_OPERATOR_CHART_PATH := charts/spark-operator-chart
DEP_VERSION := `grep DEP_VERSION= Dockerfile | awk -F\" '{print $$2}'`
BUILDER := `grep "FROM golang:" Dockerfile | awk '{print $$2}'`
UNAME := `uname | tr '[:upper:]' '[:lower:]'`
# CONTAINER_TOOL defines the container tool to be used for building images.
# Be aware that the target commands are only tested with Docker which is
@@ -27,9 +35,45 @@ UNAME:=`uname | tr '[:upper:]' '[:lower:]'`
CONTAINER_TOOL ?= docker
# Image URL to use all building/pushing image targets
IMAGE_REPOSITORY ?= docker.io/kubeflow/spark-operator
IMAGE_TAG ?= $(OPERATOR_VERSION)
OPERATOR_IMAGE ?= $(IMAGE_REPOSITORY):$(IMAGE_TAG)
IMAGE_REGISTRY ?= ghcr.io
IMAGE_REPOSITORY ?= kubeflow/spark-operator/controller
IMAGE_TAG ?= $(VERSION)
IMAGE ?= $(IMAGE_REGISTRY)/$(IMAGE_REPOSITORY):$(IMAGE_TAG)
# Kind cluster
KIND_CLUSTER_NAME ?= spark-operator
KIND_CONFIG_FILE ?= charts/spark-operator-chart/ci/kind-config.yaml
KIND_KUBE_CONFIG ?= $(HOME)/.kube/config
## Location to install binaries
LOCALBIN ?= $(shell pwd)/bin
## Versions
KUSTOMIZE_VERSION ?= v5.4.1
CONTROLLER_TOOLS_VERSION ?= v0.17.1
KIND_VERSION ?= v0.23.0
KIND_K8S_VERSION ?= v1.32.0
ENVTEST_VERSION ?= release-0.20
# ENVTEST_K8S_VERSION refers to the version of kubebuilder assets to be downloaded by envtest binary.
ENVTEST_K8S_VERSION ?= 1.32.0
GOLANGCI_LINT_VERSION ?= v2.1.6
GEN_CRD_API_REFERENCE_DOCS_VERSION ?= v0.3.0
HELM_VERSION ?= v3.15.3
HELM_UNITTEST_VERSION ?= 0.5.1
HELM_DOCS_VERSION ?= v1.14.2
CODE_GENERATOR_VERSION ?= v0.33.1
## Binaries
SPARK_OPERATOR ?= $(LOCALBIN)/spark-operator
KUBECTL ?= kubectl
KUSTOMIZE ?= $(LOCALBIN)/kustomize-$(KUSTOMIZE_VERSION)
CONTROLLER_GEN ?= $(LOCALBIN)/controller-gen-$(CONTROLLER_TOOLS_VERSION)
KIND ?= $(LOCALBIN)/kind-$(KIND_VERSION)
ENVTEST ?= $(LOCALBIN)/setup-envtest-$(ENVTEST_VERSION)
GOLANGCI_LINT ?= $(LOCALBIN)/golangci-lint-$(GOLANGCI_LINT_VERSION)
GEN_CRD_API_REFERENCE_DOCS ?= $(LOCALBIN)/gen-crd-api-reference-docs-$(GEN_CRD_API_REFERENCE_DOCS_VERSION)
HELM ?= $(LOCALBIN)/helm-$(HELM_VERSION)
HELM_DOCS ?= $(LOCALBIN)/helm-docs-$(HELM_DOCS_VERSION)
##@ General
@@ -46,13 +90,26 @@ OPERATOR_IMAGE ?= $(IMAGE_REPOSITORY):$(IMAGE_TAG)
.PHONY: help
help: ## Display this help.
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-15s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-30s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)
.PHONY: version
version: ## Print version information.
@echo "Version: ${VERSION}"
@echo "Build Date: ${BUILD_DATE}"
@echo "Git Commit: ${GIT_COMMIT}"
@echo "Git Tag: ${GIT_TAG}"
@echo "Git Tree State: ${GIT_TREE_STATE}"
@echo "Git SHA: ${GIT_SHA}"
@echo "Git Version: ${GIT_VERSION}"
.PHONY: print-%
print-%: ; @echo $*=$($*)
##@ Development
.PHONY: manifests
manifests: controller-gen ## Generate CustomResourceDefinition, RBAC and WebhookConfiguration manifests.
$(CONTROLLER_GEN) crd rbac:roleName=spark-operator-controller webhook paths="./..." output:crd:artifacts:config=config/crd/bases
$(CONTROLLER_GEN) crd:generateEmbeddedObjectMeta=true rbac:roleName=spark-operator-controller webhook paths="./..." output:crd:artifacts:config=config/crd/bases
.PHONY: generate
generate: controller-gen ## Generate code containing DeepCopy, DeepCopyInto, and DeepCopyObject method implementations.
@@ -62,8 +119,16 @@ generate: controller-gen ## Generate code containing DeepCopy, DeepCopyInto, and
update-crd: manifests ## Update CRD files in the Helm chart.
cp config/crd/bases/* charts/spark-operator-chart/crds/
.PHONY: clean
clean: ## Clean up caches and output.
.PHONY: verify-codegen
verify-codegen: $(LOCALBIN) ## Install code-generator commands and verify changes
$(call go-install-tool,$(LOCALBIN)/register-gen-$(CODE_GENERATOR_VERSION),k8s.io/code-generator/cmd/register-gen,$(CODE_GENERATOR_VERSION))
$(call go-install-tool,$(LOCALBIN)/client-gen-$(CODE_GENERATOR_VERSION),k8s.io/code-generator/cmd/client-gen,$(CODE_GENERATOR_VERSION))
$(call go-install-tool,$(LOCALBIN)/lister-gen-$(CODE_GENERATOR_VERSION),k8s.io/code-generator/cmd/lister-gen,$(CODE_GENERATOR_VERSION))
$(call go-install-tool,$(LOCALBIN)/informer-gen-$(CODE_GENERATOR_VERSION),k8s.io/code-generator/cmd/informer-gen,$(CODE_GENERATOR_VERSION))
./hack/verify-codegen.sh
.PHONY: go-clean
go-clean: ## Clean up caches and output.
@echo "cleaning up caches and output"
go clean -cache -testcache -r -x 2>&1 >/dev/null
-rm -rf _output
@@ -83,65 +148,50 @@ go-vet: ## Run go vet against code.
@echo "Running go vet..."
go vet ./...
.PHONY: lint
lint: golangci-lint ## Run golangci-lint linter.
.PHONY: go-lint
go-lint: golangci-lint ## Run golangci-lint linter.
@echo "Running golangci-lint run..."
$(GOLANGCI_LINT) run
.PHONY: lint-fix
lint-fix: golangci-lint ## Run golangci-lint linter and perform fixes.
.PHONY: go-lint-fix
go-lint-fix: golangci-lint ## Run golangci-lint linter and perform fixes.
@echo "Running golangci-lint run --fix..."
$(GOLANGCI_LINT) run --fix
.PHONY: unit-test
unit-test: clean ## Run go unit tests.
@echo "running unit tests"
unit-test: envtest ## Run unit tests.
@echo "Running unit tests..."
KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)"
go test $(shell go list ./... | grep -v /e2e) -coverprofile cover.out
.PHONY: e2e-test
e2e-test: clean ## Run go integration tests.
@echo "running integration tests"
go test -v ./test/e2e/ --kubeconfig "$(HOME)/.kube/config" --operator-image=docker.io/spark-operator/spark-operator:local
e2e-test: envtest ## Run the e2e tests against a Kind k8s instance that is spun up.
@echo "Running e2e tests..."
go test ./test/e2e/ -v -ginkgo.v -timeout 30m
##@ Build
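# Embed version metadata into the binary via -X and produce a statically linked executable.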
override LDFLAGS += \
-X ${MODULE_PATH}.version=${GIT_VERSION} \
-X ${MODULE_PATH}.buildDate=${BUILD_DATE} \
-X ${MODULE_PATH}.gitCommit=${GIT_COMMIT} \
-X ${MODULE_PATH}.gitTreeState=${GIT_TREE_STATE} \
-extldflags "-static"
.PHONY: build-operator
build-operator: ## Build spark-operator binary.
go build -o bin/spark-operator main.go
build-operator: ## Build Spark operator.
echo "Building spark-operator binary..."
CGO_ENABLED=0 go build -o $(SPARK_OPERATOR) -ldflags '${LDFLAGS}' cmd/operator/main.go
.PHONY: build-sparkctl
build-sparkctl: ## Build sparkctl binary.
[ ! -f "sparkctl/sparkctl-darwin-amd64" ] || [ ! -f "sparkctl/sparkctl-linux-amd64" ] && \
echo building using $(BUILDER) && \
docker run -w $(SPARK_OPERATOR_GOPATH) \
-v $$(pwd):$(SPARK_OPERATOR_GOPATH) $(BUILDER) sh -c \
"apk add --no-cache bash git && \
cd sparkctl && \
./build.sh" || true
.PHONY: install-sparkctl
install-sparkctl: | sparkctl/sparkctl-darwin-amd64 sparkctl/sparkctl-linux-amd64 ## Install sparkctl binary.
@if [ "$(UNAME)" = "linux" ]; then \
echo "installing linux binary to /usr/local/bin/sparkctl"; \
sudo cp sparkctl/sparkctl-linux-amd64 /usr/local/bin/sparkctl; \
sudo chmod +x /usr/local/bin/sparkctl; \
elif [ "$(UNAME)" = "darwin" ]; then \
echo "installing macOS binary to /usr/local/bin/sparkctl"; \
cp sparkctl/sparkctl-darwin-amd64 /usr/local/bin/sparkctl; \
chmod +x /usr/local/bin/sparkctl; \
else \
echo "$(UNAME) not supported"; \
fi
.PHONY: clean-sparkctl
clean-sparkctl: ## Clean sparkctl binary.
rm -f sparkctl/sparkctl-darwin-amd64 sparkctl/sparkctl-linux-amd64
.PHONY: clean
clean: ## Clean binaries.
rm -f $(SPARK_OPERATOR)
.PHONY: build-api-docs
build-api-docs: gen-crd-api-reference-docs ## Build api documentaion.
build-api-docs: gen-crd-api-reference-docs ## Build api documentation.
$(GEN_CRD_API_REFERENCE_DOCS) \
-config hack/api-docs/config.json \
-api-dir github.com/kubeflow/spark-operator/pkg/apis/sparkoperator.k8s.io/v1beta2 \
-api-dir github.com/kubeflow/spark-operator/v2/api/v1beta2 \
-template-dir hack/api-docs/template \
-out-file docs/api-docs.md
@@ -150,11 +200,11 @@ build-api-docs: gen-crd-api-reference-docs ## Build api documentaion.
# More info: https://docs.docker.com/develop/develop-images/build_enhancements/
.PHONY: docker-build
docker-build: ## Build docker image with the operator.
$(CONTAINER_TOOL) build -t ${IMAGE_REPOSITORY}:${IMAGE_TAG} .
$(CONTAINER_TOOL) build -t ${IMAGE} .
.PHONY: docker-push
docker-push: ## Push docker image with the operator.
$(CONTAINER_TOOL) push ${IMAGE_REPOSITORY}:${IMAGE_TAG}
$(CONTAINER_TOOL) push ${IMAGE}
# PLATFORMS defines the target platforms the operator image is built for, providing support for multiple
# architectures (e.g. make docker-buildx IMAGE=myregistry/myoperator:0.0.1). To use this option you need to:
@@ -164,32 +214,29 @@ docker-push: ## Push docker image with the operator.
# To adequately provide solutions that are compatible with multiple platforms, you should consider using this option.
PLATFORMS ?= linux/amd64,linux/arm64
.PHONY: docker-buildx
docker-buildx: ## Build and push docker image for the operator for cross-platform support.
# copy existing Dockerfile and insert --platform=${BUILDPLATFORM} into Dockerfile.cross, and preserve the original Dockerfile
sed -e '1 s/\(^FROM\)/FROM --platform=\$$\{BUILDPLATFORM\}/; t' -e ' 1,// s//FROM --platform=\$$\{BUILDPLATFORM\}/' Dockerfile > Dockerfile.cross
docker-buildx: ## Build and push docker image for the operator for cross-platform support
- $(CONTAINER_TOOL) buildx create --name spark-operator-builder
$(CONTAINER_TOOL) buildx use spark-operator-builder
- $(CONTAINER_TOOL) buildx build --push --platform=$(PLATFORMS) --tag ${IMAGE_REPOSITORY}:${IMAGE_TAG} -f Dockerfile.cross .
- $(CONTAINER_TOOL) buildx build --push --platform=$(PLATFORMS) --tag ${IMAGE} -f Dockerfile .
- $(CONTAINER_TOOL) buildx rm spark-operator-builder
rm Dockerfile.cross
##@ Helm
.PHONY: detect-crds-drift
detect-crds-drift:
diff -q charts/spark-operator-chart/crds config/crd/bases
detect-crds-drift: manifests ## Detect CRD drift.
diff -q $(SPARK_OPERATOR_CHART_PATH)/crds config/crd/bases
.PHONY: helm-unittest
helm-unittest: helm-unittest-plugin ## Run Helm chart unittests.
helm unittest charts/spark-operator-chart --strict --file "tests/**/*_test.yaml"
$(HELM) unittest $(SPARK_OPERATOR_CHART_PATH) --strict --file "tests/**/*_test.yaml"
.PHONY: helm-lint
helm-lint: ## Run Helm chart lint test.
docker run --rm --workdir /workspace --volume "$$(pwd):/workspace" quay.io/helmpack/chart-testing:latest ct lint --target-branch master
docker run --rm --workdir /workspace --volume "$$(pwd):/workspace" quay.io/helmpack/chart-testing:latest ct lint --target-branch master --validate-maintainers=false
.PHONY: helm-docs
helm-docs: ## Generates markdown documentation for helm charts from requirements and values files.
docker run --rm --volume "$$(pwd):/helm-docs" -u "$(id -u)" jnorwood/helm-docs:latest
helm-docs: helm-docs-plugin ## Generates markdown documentation for helm charts from requirements and values files.
$(HELM_DOCS) --sort-values-order=file
##@ Deployment
@@ -197,50 +244,47 @@ ifndef ignore-not-found
ignore-not-found = false
endif
.PHONY: install-crds
install-crds: manifests kustomize ## Install CRDs into the K8s cluster specified in ~/.kube/config.
$(KUSTOMIZE) build config/crd | $(KUBECTL) create -f -
.PHONY: kind-create-cluster
kind-create-cluster: kind ## Create a kind cluster for integration tests.
if ! $(KIND) get clusters 2>/dev/null | grep -q "^$(KIND_CLUSTER_NAME)$$"; then \
$(KIND) create cluster \
--name $(KIND_CLUSTER_NAME) \
--config $(KIND_CONFIG_FILE) \
--image kindest/node:$(KIND_K8S_VERSION) \
--kubeconfig $(KIND_KUBE_CONFIG) \
--wait=1m; \
fi
.PHONY: uninstall-crds
uninstall-crds: manifests kustomize ## Uninstall CRDs from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
.PHONY: kind-load-image
kind-load-image: kind-create-cluster docker-build ## Load the image into the kind cluster.
$(KIND) load docker-image --name $(KIND_CLUSTER_NAME) $(IMAGE)
.PHONY: kind-delete-cluster
kind-delete-cluster: kind ## Delete the created kind cluster.
$(KIND) delete cluster --name $(KIND_CLUSTER_NAME) --kubeconfig $(KIND_KUBE_CONFIG)
.PHONY: install
install-crd: manifests kustomize ## Install CRDs into the K8s cluster specified in ~/.kube/config.
$(KUSTOMIZE) build config/crd | $(KUBECTL) create -f - 2>/dev/null || $(KUSTOMIZE) build config/crd | $(KUBECTL) replace -f -
.PHONY: uninstall
uninstall-crd: manifests kustomize ## Uninstall CRDs from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
$(KUSTOMIZE) build config/crd | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -
.PHONY: deploy
deploy: manifests kustomize ## Deploy controller to the K8s cluster specified in ~/.kube/config.
cd config/manager && $(KUSTOMIZE) edit set image controller=${IMG}
$(KUSTOMIZE) build config/default | $(KUBECTL) apply -f -
deploy: IMAGE_TAG=local
deploy: helm manifests update-crd kind-load-image ## Deploy controller to the K8s cluster specified in ~/.kube/config.
$(HELM) upgrade --install -f charts/spark-operator-chart/ci/ci-values.yaml spark-operator ./charts/spark-operator-chart/
.PHONY: undeploy
undeploy: kustomize ## Undeploy controller from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
$(KUSTOMIZE) build config/default | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -
undeploy: helm ## Uninstall spark-operator
$(HELM) uninstall spark-operator
##@ Dependencies
## Location to install dependencies to
LOCALBIN ?= $(shell pwd)/bin
$(LOCALBIN):
mkdir -p $(LOCALBIN)
## Tool Binaries
KUBECTL ?= kubectl
KUSTOMIZE ?= $(LOCALBIN)/kustomize-$(KUSTOMIZE_VERSION)
CONTROLLER_GEN ?= $(LOCALBIN)/controller-gen-$(CONTROLLER_TOOLS_VERSION)
KIND ?= $(LOCALBIN)/kind-$(KIND_VERSION)
ENVTEST ?= $(LOCALBIN)/setup-envtest-$(ENVTEST_VERSION)
GOLANGCI_LINT = $(LOCALBIN)/golangci-lint-$(GOLANGCI_LINT_VERSION)
GEN_CRD_API_REFERENCE_DOCS ?= $(LOCALBIN)/gen-crd-api-reference-docs-$(GEN_CRD_API_REFERENCE_DOCS_VERSION)
HELM ?= helm
HELM_UNITTEST ?= unittest
## Tool Versions
KUSTOMIZE_VERSION ?= v5.4.1
CONTROLLER_TOOLS_VERSION ?= v0.15.0
KIND_VERSION ?= v0.23.0
ENVTEST_VERSION ?= release-0.18
GOLANGCI_LINT_VERSION ?= v1.57.2
GEN_CRD_API_REFERENCE_DOCS_VERSION ?= v0.3.0
HELM_UNITTEST_VERSION ?= 0.5.1
.PHONY: kustomize
kustomize: $(KUSTOMIZE) ## Download kustomize locally if necessary.
$(KUSTOMIZE): $(LOCALBIN)
@@ -264,20 +308,30 @@ $(ENVTEST): $(LOCALBIN)
.PHONY: golangci-lint
golangci-lint: $(GOLANGCI_LINT) ## Download golangci-lint locally if necessary.
$(GOLANGCI_LINT): $(LOCALBIN)
$(call go-install-tool,$(GOLANGCI_LINT),github.com/golangci/golangci-lint/cmd/golangci-lint,${GOLANGCI_LINT_VERSION})
$(call go-install-tool,$(GOLANGCI_LINT),github.com/golangci/golangci-lint/v2/cmd/golangci-lint,${GOLANGCI_LINT_VERSION})
.PHONY: gen-crd-api-reference-docs
gen-crd-api-reference-docs: $(GEN_CRD_API_REFERENCE_DOCS) ## Download gen-crd-api-reference-docs locally if necessary.
$(GEN_CRD_API_REFERENCE_DOCS): $(LOCALBIN)
$(call go-install-tool,$(GEN_CRD_API_REFERENCE_DOCS),github.com/ahmetb/gen-crd-api-reference-docs,$(GEN_CRD_API_REFERENCE_DOCS_VERSION))
.PHONY: helm
helm: $(HELM) ## Download helm locally if necessary.
$(HELM): $(LOCALBIN)
$(call go-install-tool,$(HELM),helm.sh/helm/v3/cmd/helm,$(HELM_VERSION))
.PHONY: helm-unittest-plugin
helm-unittest-plugin: ## Download helm unittest plugin locally if necessary.
if [ -z "$(shell helm plugin list | grep unittest)" ]; then \
echo "Installing helm unittest plugin..."; \
helm plugin install https://github.com/helm-unittest/helm-unittest.git --version $(HELM_UNITTEST_VERSION); \
helm-unittest-plugin: helm ## Download helm unittest plugin locally if necessary.
if [ -z "$(shell $(HELM) plugin list | grep unittest)" ]; then \
echo "Installing helm unittest plugin"; \
$(HELM) plugin install https://github.com/helm-unittest/helm-unittest.git --version $(HELM_UNITTEST_VERSION); \
fi
.PHONY: helm-docs-plugin
helm-docs-plugin: $(HELM_DOCS) ## Download helm-docs plugin locally if necessary.
$(HELM_DOCS): $(LOCALBIN)
$(call go-install-tool,$(HELM_DOCS),github.com/norwoodj/helm-docs/cmd/helm-docs,$(HELM_DOCS_VERSION))
# go-install-tool will 'go install' any package with custom target and name of binary, if it doesn't exist
# $1 - target path with name of binary (ideally with version)
# $2 - package url which can be installed

OWNERS

@@ -1,7 +1,10 @@
approvers:
- andreyvelich
- mwielgus
- yuchaoran2011
- vara-bonthu
reviewers:
- ChenYi015
- jacobsalway
- mwielgus
- vara-bonthu
- yuchaoran2011
reviewers:
- ImpSy
- nabuskey

PROJECT (new file)

@@ -0,0 +1,39 @@
# Code generated by tool. DO NOT EDIT.
# This file is used to track the info used to scaffold your project
# and allow the plugins properly work.
# More info: https://book.kubebuilder.io/reference/project-config.html
domain: sparkoperator.k8s.io
layout:
- go.kubebuilder.io/v4
projectName: spark-operator
repo: github.com/kubeflow/spark-operator
resources:
- api:
crdVersion: v1
namespaced: true
controller: true
domain: sparkoperator.k8s.io
kind: SparkConnect
path: github.com/kubeflow/spark-operator/api/v1alpha1
version: v1alpha1
- api:
crdVersion: v1
namespaced: true
controller: true
domain: sparkoperator.k8s.io
kind: SparkApplication
path: github.com/kubeflow/spark-operator/api/v1beta2
version: v1beta2
webhooks:
defaulting: true
validation: true
webhookVersion: v1
- api:
crdVersion: v1
namespaced: true
controller: true
domain: sparkoperator.k8s.io
kind: ScheduledSparkApplication
path: github.com/kubeflow/spark-operator/api/v1beta2
version: v1beta2
version: "3"

README.md

@@ -1,12 +1,39 @@
# Kubeflow Spark Operator
[![Integration Test](https://github.com/kubeflow/spark-operator/actions/workflows/integration.yaml/badge.svg)](https://github.com/kubeflow/spark-operator/actions/workflows/integration.yaml)
[![Go Report Card](https://goreportcard.com/badge/github.com/kubeflow/spark-operator)](https://goreportcard.com/report/github.com/kubeflow/spark-operator)
[![GitHub release](https://img.shields.io/github/v/release/kubeflow/spark-operator)](https://github.com/kubeflow/spark-operator/releases)
[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/10524/badge)](https://www.bestpractices.dev/projects/10524)
## What is Spark Operator?
The Kubernetes Operator for Apache Spark aims to make specifying and running [Spark](https://github.com/apache/spark) applications as easy and idiomatic as running other workloads on Kubernetes. It uses
[Kubernetes custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) for specifying, running, and surfacing status of Spark applications.
## Quick Start
For a more detailed guide, please refer to the [Getting Started guide](https://www.kubeflow.org/docs/components/spark-operator/getting-started/).
```bash
# Add the Helm repository
helm repo add --force-update spark-operator https://kubeflow.github.io/spark-operator
# Install the operator into the spark-operator namespace and wait for deployments to be ready
helm install spark-operator spark-operator/spark-operator \
--namespace spark-operator \
--create-namespace \
--wait
# Create an example application in the default namespace
kubectl apply -f https://raw.githubusercontent.com/kubeflow/spark-operator/refs/heads/master/examples/spark-pi.yaml
# Get the status of the application
kubectl get sparkapp spark-pi
# Delete the application
kubectl delete sparkapp spark-pi
```
## Overview
For a complete reference of the custom resource definitions, please refer to the [API Definition](docs/api-docs.md). For details on its design, please refer to the [Architecture](https://www.kubeflow.org/docs/components/spark-operator/overview/#architecture). It requires Spark 2.3 or above, which supports Kubernetes as a native scheduler backend.
@@ -21,8 +48,6 @@ The Kubernetes Operator for Apache Spark currently supports the following list o
* Supports automatic application re-submission for updated `SparkApplication` objects with updated specification.
* Supports automatic application restart with a configurable restart policy.
* Supports automatic retries of failed submissions with optional linear back-off.
* Supports mounting local Hadoop configuration as a Kubernetes ConfigMap automatically via `sparkctl`.
* Supports automatically staging local application dependencies to Google Cloud Storage (GCS) via `sparkctl`.
* Supports collecting and exporting application-level metrics and driver/executor metrics to Prometheus.
## Project Status
@@ -53,18 +78,21 @@ If you are running Spark operator on Google Kubernetes Engine (GKE) and want to
The following table lists the most recent few versions of the operator.
| Operator Version | API Version | Kubernetes Version | Base Spark Version |
| ------------- | ------------- | ------------- | ------------- |
| `v1beta2-1.6.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` |
| `v1beta2-1.5.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` |
| `v1beta2-1.4.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` |
| `v1beta2-1.3.x-3.1.1` | `v1beta2` | 1.16+ | `3.1.1` |
| `v1beta2-1.2.3-3.1.1` | `v1beta2` | 1.13+ | `3.1.1` |
| `v1beta2-1.2.2-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` |
| `v1beta2-1.2.1-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` |
| `v1beta2-1.2.0-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` |
| `v1beta2-1.1.x-2.4.5` | `v1beta2` | 1.13+ | `2.4.5` |
| `v1beta2-1.0.x-2.4.4` | `v1beta2` | 1.13+ | `2.4.4` |
| Operator Version | API Version | Kubernetes Version | Base Spark Version |
|-----------------------|-------------|--------------------|--------------------|
| `v2.2.x` | `v1beta2` | 1.16+ | `3.5.5` |
| `v2.1.x` | `v1beta2` | 1.16+ | `3.5.3` |
| `v2.0.x` | `v1beta2` | 1.16+ | `3.5.2` |
| `v1beta2-1.6.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` |
| `v1beta2-1.5.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` |
| `v1beta2-1.4.x-3.5.0` | `v1beta2` | 1.16+ | `3.5.0` |
| `v1beta2-1.3.x-3.1.1` | `v1beta2` | 1.16+ | `3.1.1` |
| `v1beta2-1.2.3-3.1.1` | `v1beta2` | 1.13+ | `3.1.1` |
| `v1beta2-1.2.2-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` |
| `v1beta2-1.2.1-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` |
| `v1beta2-1.2.0-3.0.0` | `v1beta2` | 1.13+ | `3.0.0` |
| `v1beta2-1.1.x-2.4.5` | `v1beta2` | 1.13+ | `2.4.5` |
| `v1beta2-1.0.x-2.4.4` | `v1beta2` | 1.13+ | `2.4.4` |
## Developer Guide

VERSION (new file)

@@ -0,0 +1 @@
v2.2.1


@@ -0,0 +1,82 @@
/*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1alpha1
// DeployMode describes the type of deployment of a Spark application.
type DeployMode string
// Different types of deployments.
const (
DeployModeCluster DeployMode = "cluster"
DeployModeClient DeployMode = "client"
)
// DriverState tells the current state of a spark driver.
type DriverState string
// Different states a spark driver may have.
const (
DriverStatePending DriverState = "PENDING"
DriverStateRunning DriverState = "RUNNING"
DriverStateCompleted DriverState = "COMPLETED"
DriverStateFailed DriverState = "FAILED"
DriverStateUnknown DriverState = "UNKNOWN"
)
// ExecutorState tells the current state of an executor.
type ExecutorState string
// Different states an executor may have.
const (
ExecutorStatePending ExecutorState = "PENDING"
ExecutorStateRunning ExecutorState = "RUNNING"
ExecutorStateCompleted ExecutorState = "COMPLETED"
ExecutorStateFailed ExecutorState = "FAILED"
ExecutorStateUnknown ExecutorState = "UNKNOWN"
)
// DynamicAllocation contains configuration options for dynamic allocation.
type DynamicAllocation struct {
// Enabled controls whether dynamic allocation is enabled or not.
// +optional
Enabled bool `json:"enabled,omitempty"`
// InitialExecutors is the initial number of executors to request. If .spec.executor.instances
// is also set, the initial number of executors is set to the bigger of that and this option.
// +optional
InitialExecutors *int32 `json:"initialExecutors,omitempty"`
// MinExecutors is the lower bound for the number of executors if dynamic allocation is enabled.
// +optional
MinExecutors *int32 `json:"minExecutors,omitempty"`
// MaxExecutors is the upper bound for the number of executors if dynamic allocation is enabled.
// +optional
MaxExecutors *int32 `json:"maxExecutors,omitempty"`
// ShuffleTrackingEnabled enables shuffle file tracking for executors, which allows dynamic allocation without
// the need for an external shuffle service. This option will try to keep alive executors that are storing
// shuffle data for active jobs. If external shuffle service is enabled, set ShuffleTrackingEnabled to false.
// ShuffleTrackingEnabled is true by default if dynamicAllocation.enabled is true.
// +optional
ShuffleTrackingEnabled *bool `json:"shuffleTrackingEnabled,omitempty"`
// ShuffleTrackingTimeout controls the timeout in milliseconds for executors that are holding
// shuffle data if shuffle tracking is enabled (true by default if dynamic allocation is enabled).
// +optional
ShuffleTrackingTimeout *int64 `json:"shuffleTrackingTimeout,omitempty"`
}
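The struct above mirrors Spark's standard dynamic-allocation properties. As an illustration only (a hedged sketch, not the operator's actual translation code; the helper name is hypothetical), such a struct could be flattened into `spark.dynamicAllocation.*` configuration keys like this:

```go
package v1alpha1

import "strconv"

// dynamicAllocationConf is a hypothetical helper that flattens a
// DynamicAllocation into standard spark.dynamicAllocation.* properties.
// Only non-nil fields are emitted, matching the +optional markers above.
func dynamicAllocationConf(da *DynamicAllocation) map[string]string {
	if da == nil {
		return nil
	}
	conf := map[string]string{
		"spark.dynamicAllocation.enabled": strconv.FormatBool(da.Enabled),
	}
	if da.InitialExecutors != nil {
		conf["spark.dynamicAllocation.initialExecutors"] = strconv.Itoa(int(*da.InitialExecutors))
	}
	if da.MinExecutors != nil {
		conf["spark.dynamicAllocation.minExecutors"] = strconv.Itoa(int(*da.MinExecutors))
	}
	if da.MaxExecutors != nil {
		conf["spark.dynamicAllocation.maxExecutors"] = strconv.Itoa(int(*da.MaxExecutors))
	}
	if da.ShuffleTrackingEnabled != nil {
		conf["spark.dynamicAllocation.shuffleTracking.enabled"] = strconv.FormatBool(*da.ShuffleTrackingEnabled)
	}
	if da.ShuffleTrackingTimeout != nil {
		// Documented above as a timeout in milliseconds.
		conf["spark.dynamicAllocation.shuffleTracking.timeout"] = strconv.FormatInt(*da.ShuffleTrackingTimeout, 10)
	}
	return conf
}
```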

api/v1alpha1/defaults.go (new file)

@@ -0,0 +1,21 @@
/*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1alpha1
// SetSparkConnectDefaults sets default values for certain fields of a SparkConnect.
func SetSparkConnectDefaults(conn *SparkConnect) {
}


@@ -1,5 +1,5 @@
/*
Copyright 2017 Google LLC
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -16,6 +16,7 @@ limitations under the License.
// +k8s:deepcopy-gen=package,register
// Package v1beta1 is the v1beta1 version of the API.
// Package v1alpha1 is the v1alpha1 version of the API.
// +groupName=sparkoperator.k8s.io
package v1beta1
// +versionName=v1alpha1
package v1alpha1


@@ -0,0 +1,36 @@
/*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package v1alpha1 contains API Schema definitions for the v1alpha1 API group
// +kubebuilder:object:generate=true
// +groupName=sparkoperator.k8s.io
package v1alpha1
import (
"k8s.io/apimachinery/pkg/runtime/schema"
"sigs.k8s.io/controller-runtime/pkg/scheme"
)
var (
// GroupVersion is group version used to register these objects.
GroupVersion = schema.GroupVersion{Group: "sparkoperator.k8s.io", Version: "v1alpha1"}
// SchemeBuilder is used to add go types to the GroupVersionKind scheme.
SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}
// AddToScheme adds the types in this group-version to the given scheme.
AddToScheme = SchemeBuilder.AddToScheme
)
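With the scheme builder above, consumers register the v1alpha1 types before encoding or decoding them with a client. A minimal sketch using only names shown here (the import path is taken from the PROJECT file):

```go
package main

import (
	"k8s.io/apimachinery/pkg/runtime"

	"github.com/kubeflow/spark-operator/api/v1alpha1"
)

func main() {
	scheme := runtime.NewScheme()
	// AddToScheme registers the v1alpha1 types (e.g. SparkConnect) under
	// the sparkoperator.k8s.io/v1alpha1 group version.
	if err := v1alpha1.AddToScheme(scheme); err != nil {
		panic(err)
	}
}
```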


@@ -1,5 +1,5 @@
/*
Copyright 2017 Google LLC
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -14,39 +14,21 @@ See the License for the specific language governing permissions and
limitations under the License.
*/
package v1beta1
package v1alpha1
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
"github.com/kubeflow/spark-operator/pkg/apis/sparkoperator.k8s.io"
)
const Version = "v1beta1"
var (
SchemeBuilder = runtime.NewSchemeBuilder(addKnownTypes)
AddToScheme = SchemeBuilder.AddToScheme
const (
Group = "sparkoperator.k8s.io"
Version = "v1alpha1"
)
// SchemeGroupVersion is the group version used to register these objects.
var SchemeGroupVersion = schema.GroupVersion{Group: sparkoperator.GroupName, Version: Version}
var SchemeGroupVersion = schema.GroupVersion{Group: Group, Version: Version}
// Resource takes an unqualified resource and returns a Group-qualified GroupResource.
func Resource(resource string) schema.GroupResource {
return SchemeGroupVersion.WithResource(resource).GroupResource()
}
// addKnownTypes adds the set of types defined in this package to the supplied scheme.
func addKnownTypes(scheme *runtime.Scheme) error {
scheme.AddKnownTypes(SchemeGroupVersion,
&SparkApplication{},
&SparkApplicationList{},
&ScheduledSparkApplication{},
&ScheduledSparkApplicationList{},
)
metav1.AddToGroupVersion(scheme, SchemeGroupVersion)
return nil
}


@@ -0,0 +1,185 @@
/*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1alpha1
import (
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
func init() {
SchemeBuilder.Register(&SparkConnect{}, &SparkConnectList{})
}
// +kubebuilder:object:root=true
// +kubebuilder:metadata:annotations="api-approved.kubernetes.io=https://github.com/kubeflow/spark-operator/pull/1298"
// +kubebuilder:resource:scope=Namespaced,shortName=sparkconn,singular=sparkconnect
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=.metadata.creationTimestamp,name=Age,type=date
// SparkConnect is the Schema for the sparkconnects API.
type SparkConnect struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec SparkConnectSpec `json:"spec"`
Status SparkConnectStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// SparkConnectList contains a list of SparkConnect.
type SparkConnectList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []SparkConnect `json:"items"`
}
// SparkConnectSpec defines the desired state of SparkConnect.
type SparkConnectSpec struct {
// SparkVersion is the version of Spark that the Spark Connect server uses.
SparkVersion string `json:"sparkVersion"`
// Image is the container image for the driver, executor, and init-container. Any custom container images for the
// driver, executor, or init-container take precedence over this.
// +optional
Image *string `json:"image,omitempty"`
// HadoopConf carries user-specified Hadoop configuration properties as they would use the "--conf" option
// in spark-submit. The SparkApplication controller automatically adds prefix "spark.hadoop." to Hadoop
// configuration properties.
// +optional
HadoopConf map[string]string `json:"hadoopConf,omitempty"`
// SparkConf carries user-specified Spark configuration properties as they would use the "--conf" option in
// spark-submit.
// +optional
SparkConf map[string]string `json:"sparkConf,omitempty"`
// Server is the Spark connect server specification.
Server ServerSpec `json:"server"`
// Executor is the Spark executor specification.
Executor ExecutorSpec `json:"executor"`
// DynamicAllocation configures dynamic allocation that becomes available for the Kubernetes
// scheduler backend since Spark 3.0.
// +optional
DynamicAllocation *DynamicAllocation `json:"dynamicAllocation,omitempty"`
}
// ServerSpec is specification of the Spark connect server.
type ServerSpec struct {
SparkPodSpec `json:",inline"`
}
// ExecutorSpec is specification of the executor.
type ExecutorSpec struct {
SparkPodSpec `json:",inline"`
// Instances is the number of executor instances.
// +optional
// +kubebuilder:validation:Minimum=0
Instances *int32 `json:"instances,omitempty"`
}
// SparkPodSpec defines common things that can be customized for a Spark driver or executor pod.
type SparkPodSpec struct {
// Cores maps to `spark.driver.cores` or `spark.executor.cores` for the driver and executors, respectively.
// +optional
// +kubebuilder:validation:Minimum=1
Cores *int32 `json:"cores,omitempty"`
// Memory is the amount of memory to request for the pod.
// +optional
Memory *string `json:"memory,omitempty"`
// Template is a pod template that can be used to define the driver or executor pod configurations that Spark configurations do not support.
// Spark version >= 3.0.0 is required.
// Ref: https://spark.apache.org/docs/latest/running-on-kubernetes.html#pod-template.
// +optional
// +kubebuilder:validation:Schemaless
// +kubebuilder:validation:Type:=object
// +kubebuilder:pruning:PreserveUnknownFields
Template *corev1.PodTemplateSpec `json:"template,omitempty"`
}
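Taken together, `ServerSpec`, `ExecutorSpec`, and `SparkPodSpec` let a `SparkConnectSpec` be assembled directly in Go. A minimal sketch within the package (all field values are illustrative, not recommended defaults):

```go
// exampleSparkConnectSpec is a hypothetical constructor showing how the
// inline SparkPodSpec embeds into the server and executor specifications.
func exampleSparkConnectSpec() SparkConnectSpec {
	cores := int32(1)
	memory := "2g"
	instances := int32(2)
	return SparkConnectSpec{
		SparkVersion: "4.0.0",
		Server: ServerSpec{
			SparkPodSpec: SparkPodSpec{Cores: &cores, Memory: &memory},
		},
		Executor: ExecutorSpec{
			SparkPodSpec: SparkPodSpec{Cores: &cores, Memory: &memory},
			Instances:    &instances,
		},
	}
}
```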
// SparkConnectStatus defines the observed state of SparkConnect.
type SparkConnectStatus struct {
// Represents the latest available observations of a SparkConnect's current state.
// +patchMergeKey=type
// +patchStrategy=merge
// +listType=map
// +listMapKey=type
// +optional
Conditions []metav1.Condition `json:"conditions,omitempty" patchMergeKey:"type" patchStrategy:"merge"`
// State represents the current state of the SparkConnect.
State SparkConnectState `json:"state,omitempty"`
// Server represents the current state of the SparkConnect server.
Server SparkConnectServerStatus `json:"server,omitempty"`
// Executors represents the current state of the SparkConnect executors.
Executors map[string]int `json:"executors,omitempty"`
// StartTime is the time at which the SparkConnect controller started processing the SparkConnect.
StartTime metav1.Time `json:"startTime,omitempty"`
// LastUpdateTime is the time at which the SparkConnect controller last updated the SparkConnect.
LastUpdateTime metav1.Time `json:"lastUpdateTime,omitempty"`
}
// SparkConnectConditionType represents the condition types of the SparkConnect.
type SparkConnectConditionType string
// All possible condition types of the SparkConnect.
const (
SparkConnectConditionServerPodReady SparkConnectConditionType = "ServerPodReady"
)
// SparkConnectConditionReason represents the reason of SparkConnect conditions.
type SparkConnectConditionReason string
// All possible reasons of SparkConnect conditions.
const (
SparkConnectConditionReasonServerPodReady SparkConnectConditionReason = "ServerPodReady"
SparkConnectConditionReasonServerPodNotReady SparkConnectConditionReason = "ServerPodNotReady"
)
// SparkConnectState represents the current state of the SparkConnect.
type SparkConnectState string
// All possible states of the SparkConnect.
const (
SparkConnectStateNew SparkConnectState = ""
SparkConnectStateProvisioning SparkConnectState = "Provisioning"
SparkConnectStateReady SparkConnectState = "Ready"
SparkConnectStateNotReady SparkConnectState = "NotReady"
SparkConnectStateFailed SparkConnectState = "Failed"
)
type SparkConnectServerStatus struct {
// PodName is the name of the pod that is running the Spark Connect server.
PodName string `json:"podName,omitempty"`
// PodIP is the IP address of the pod that is running the Spark Connect server.
PodIP string `json:"podIp,omitempty"`
// ServiceName is the name of the service that is exposing the Spark Connect server.
ServiceName string `json:"serviceName,omitempty"`
}
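The condition and state constants above compose with the standard apimachinery condition helpers. A sketch of how a controller might mark the server pod ready (the helper name and message are assumptions, not the controller's actual code):

```go
package controller

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	"github.com/kubeflow/spark-operator/api/v1alpha1"
)

// markServerReady is a hypothetical helper: it records the ServerPodReady
// condition via meta.SetStatusCondition and moves the object to Ready.
func markServerReady(conn *v1alpha1.SparkConnect) {
	meta.SetStatusCondition(&conn.Status.Conditions, metav1.Condition{
		Type:    string(v1alpha1.SparkConnectConditionServerPodReady),
		Status:  metav1.ConditionTrue,
		Reason:  string(v1alpha1.SparkConnectConditionReasonServerPodReady),
		Message: "Spark Connect server pod is ready",
	})
	conn.Status.State = v1alpha1.SparkConnectStateReady
	conn.Status.LastUpdateTime = metav1.Now()
}
```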


@@ -0,0 +1,281 @@
//go:build !ignore_autogenerated
/*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Code generated by controller-gen. DO NOT EDIT.
package v1alpha1
import (
"k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
runtime "k8s.io/apimachinery/pkg/runtime"
)
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *DynamicAllocation) DeepCopyInto(out *DynamicAllocation) {
*out = *in
if in.InitialExecutors != nil {
in, out := &in.InitialExecutors, &out.InitialExecutors
*out = new(int32)
**out = **in
}
if in.MinExecutors != nil {
in, out := &in.MinExecutors, &out.MinExecutors
*out = new(int32)
**out = **in
}
if in.MaxExecutors != nil {
in, out := &in.MaxExecutors, &out.MaxExecutors
*out = new(int32)
**out = **in
}
if in.ShuffleTrackingEnabled != nil {
in, out := &in.ShuffleTrackingEnabled, &out.ShuffleTrackingEnabled
*out = new(bool)
**out = **in
}
if in.ShuffleTrackingTimeout != nil {
in, out := &in.ShuffleTrackingTimeout, &out.ShuffleTrackingTimeout
*out = new(int64)
**out = **in
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DynamicAllocation.
func (in *DynamicAllocation) DeepCopy() *DynamicAllocation {
if in == nil {
return nil
}
out := new(DynamicAllocation)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *ExecutorSpec) DeepCopyInto(out *ExecutorSpec) {
*out = *in
in.SparkPodSpec.DeepCopyInto(&out.SparkPodSpec)
if in.Instances != nil {
in, out := &in.Instances, &out.Instances
*out = new(int32)
**out = **in
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ExecutorSpec.
func (in *ExecutorSpec) DeepCopy() *ExecutorSpec {
if in == nil {
return nil
}
out := new(ExecutorSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *ServerSpec) DeepCopyInto(out *ServerSpec) {
*out = *in
in.SparkPodSpec.DeepCopyInto(&out.SparkPodSpec)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ServerSpec.
func (in *ServerSpec) DeepCopy() *ServerSpec {
if in == nil {
return nil
}
out := new(ServerSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SparkConnect) DeepCopyInto(out *SparkConnect) {
*out = *in
out.TypeMeta = in.TypeMeta
in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
in.Spec.DeepCopyInto(&out.Spec)
in.Status.DeepCopyInto(&out.Status)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkConnect.
func (in *SparkConnect) DeepCopy() *SparkConnect {
if in == nil {
return nil
}
out := new(SparkConnect)
in.DeepCopyInto(out)
return out
}
// DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
func (in *SparkConnect) DeepCopyObject() runtime.Object {
if c := in.DeepCopy(); c != nil {
return c
}
return nil
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SparkConnectList) DeepCopyInto(out *SparkConnectList) {
*out = *in
out.TypeMeta = in.TypeMeta
in.ListMeta.DeepCopyInto(&out.ListMeta)
if in.Items != nil {
in, out := &in.Items, &out.Items
*out = make([]SparkConnect, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkConnectList.
func (in *SparkConnectList) DeepCopy() *SparkConnectList {
if in == nil {
return nil
}
out := new(SparkConnectList)
in.DeepCopyInto(out)
return out
}
// DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
func (in *SparkConnectList) DeepCopyObject() runtime.Object {
if c := in.DeepCopy(); c != nil {
return c
}
return nil
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SparkConnectServerStatus) DeepCopyInto(out *SparkConnectServerStatus) {
*out = *in
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkConnectServerStatus.
func (in *SparkConnectServerStatus) DeepCopy() *SparkConnectServerStatus {
if in == nil {
return nil
}
out := new(SparkConnectServerStatus)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SparkConnectSpec) DeepCopyInto(out *SparkConnectSpec) {
*out = *in
if in.Image != nil {
in, out := &in.Image, &out.Image
*out = new(string)
**out = **in
}
if in.HadoopConf != nil {
in, out := &in.HadoopConf, &out.HadoopConf
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.SparkConf != nil {
in, out := &in.SparkConf, &out.SparkConf
*out = make(map[string]string, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
in.Server.DeepCopyInto(&out.Server)
in.Executor.DeepCopyInto(&out.Executor)
if in.DynamicAllocation != nil {
in, out := &in.DynamicAllocation, &out.DynamicAllocation
*out = new(DynamicAllocation)
(*in).DeepCopyInto(*out)
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkConnectSpec.
func (in *SparkConnectSpec) DeepCopy() *SparkConnectSpec {
if in == nil {
return nil
}
out := new(SparkConnectSpec)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SparkConnectStatus) DeepCopyInto(out *SparkConnectStatus) {
*out = *in
if in.Conditions != nil {
in, out := &in.Conditions, &out.Conditions
*out = make([]metav1.Condition, len(*in))
for i := range *in {
(*in)[i].DeepCopyInto(&(*out)[i])
}
}
out.Server = in.Server
if in.Executors != nil {
in, out := &in.Executors, &out.Executors
*out = make(map[string]int, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
in.StartTime.DeepCopyInto(&out.StartTime)
in.LastUpdateTime.DeepCopyInto(&out.LastUpdateTime)
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkConnectStatus.
func (in *SparkConnectStatus) DeepCopy() *SparkConnectStatus {
if in == nil {
return nil
}
out := new(SparkConnectStatus)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SparkPodSpec) DeepCopyInto(out *SparkPodSpec) {
*out = *in
if in.Cores != nil {
in, out := &in.Cores, &out.Cores
*out = new(int32)
**out = **in
}
if in.Memory != nil {
in, out := &in.Memory, &out.Memory
*out = new(string)
**out = **in
}
if in.Template != nil {
in, out := &in.Template, &out.Template
*out = new(v1.PodTemplateSpec)
(*in).DeepCopyInto(*out)
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new SparkPodSpec.
func (in *SparkPodSpec) DeepCopy() *SparkPodSpec {
if in == nil {
return nil
}
out := new(SparkPodSpec)
in.DeepCopyInto(out)
return out
}
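These generated DeepCopy methods are what make it safe to mutate objects read from an informer cache. A minimal sketch of the usual pattern (the helper is hypothetical):

```go
// prepareStatusUpdate copies a cached object before touching it, so the
// shared informer cache is never mutated in place.
func prepareStatusUpdate(cached *SparkConnect) *SparkConnect {
	conn := cached.DeepCopy()
	conn.Status.State = SparkConnectStateProvisioning
	return conn
}
```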


@@ -24,15 +24,19 @@ func SetSparkApplicationDefaults(app *SparkApplication) {
return
}
if app.Spec.Type == "" {
app.Spec.Type = SparkApplicationTypeScala
}
if app.Spec.Mode == "" {
app.Spec.Mode = ClusterMode
app.Spec.Mode = DeployModeCluster
}
if app.Spec.RestartPolicy.Type == "" {
app.Spec.RestartPolicy.Type = Never
app.Spec.RestartPolicy.Type = RestartPolicyNever
}
if app.Spec.RestartPolicy.Type != Never {
if app.Spec.RestartPolicy.Type != RestartPolicyNever {
// Default to 5 sec if the RestartPolicy is OnFailure or Always and these values aren't specified.
if app.Spec.RestartPolicy.OnFailureRetryInterval == nil {
app.Spec.RestartPolicy.OnFailureRetryInterval = new(int64)
@@ -50,7 +54,6 @@ func SetSparkApplicationDefaults(app *SparkApplication) {
}
func setDriverSpecDefaults(spec *DriverSpec, sparkConf map[string]string) {
if _, exists := sparkConf["spark.driver.cores"]; !exists && spec.Cores == nil {
spec.Cores = new(int32)
*spec.Cores = 1


@@ -36,11 +36,11 @@ func TestSetSparkApplicationDefaultsEmptyModeShouldDefaultToClusterMode(t *testi
SetSparkApplicationDefaults(app)
assert.Equal(t, ClusterMode, app.Spec.Mode)
assert.Equal(t, DeployModeCluster, app.Spec.Mode)
}
func TestSetSparkApplicationDefaultsModeShouldNotChangeIfSet(t *testing.T) {
expectedMode := ClientMode
expectedMode := DeployModeClient
app := &SparkApplication{
Spec: SparkApplicationSpec{
Mode: expectedMode,
@@ -59,21 +59,21 @@ func TestSetSparkApplicationDefaultsEmptyRestartPolicyShouldDefaultToNever(t *te
SetSparkApplicationDefaults(app)
assert.Equal(t, Never, app.Spec.RestartPolicy.Type)
assert.Equal(t, RestartPolicyNever, app.Spec.RestartPolicy.Type)
}
func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValues(t *testing.T) {
app := &SparkApplication{
Spec: SparkApplicationSpec{
RestartPolicy: RestartPolicy{
Type: OnFailure,
Type: RestartPolicyOnFailure,
},
},
}
SetSparkApplicationDefaults(app)
assert.Equal(t, OnFailure, app.Spec.RestartPolicy.Type)
assert.Equal(t, RestartPolicyOnFailure, app.Spec.RestartPolicy.Type)
assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.Equal(t, int64(5), *app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval)
@@ -85,7 +85,7 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
app := &SparkApplication{
Spec: SparkApplicationSpec{
RestartPolicy: RestartPolicy{
Type: OnFailure,
Type: RestartPolicyOnFailure,
OnSubmissionFailureRetryInterval: &expectedOnSubmissionFailureRetryInterval,
},
},
@@ -93,7 +93,7 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
SetSparkApplicationDefaults(app)
assert.Equal(t, OnFailure, app.Spec.RestartPolicy.Type)
assert.Equal(t, RestartPolicyOnFailure, app.Spec.RestartPolicy.Type)
assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.Equal(t, int64(5), *app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval)
@@ -105,7 +105,7 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
app := &SparkApplication{
Spec: SparkApplicationSpec{
RestartPolicy: RestartPolicy{
Type: OnFailure,
Type: RestartPolicyOnFailure,
OnFailureRetryInterval: &expectedOnFailureRetryInterval,
},
},
@@ -113,7 +113,7 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
SetSparkApplicationDefaults(app)
assert.Equal(t, OnFailure, app.Spec.RestartPolicy.Type)
assert.Equal(t, RestartPolicyOnFailure, app.Spec.RestartPolicy.Type)
assert.NotNil(t, app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.Equal(t, expectedOnFailureRetryInterval, *app.Spec.RestartPolicy.OnFailureRetryInterval)
assert.NotNil(t, app.Spec.RestartPolicy.OnSubmissionFailureRetryInterval)
@@ -121,7 +121,6 @@ func TestSetSparkApplicationDefaultsOnFailureRestartPolicyShouldSetDefaultValueF
}
func TestSetSparkApplicationDefaultsDriverSpecDefaults(t *testing.T) {
// Case 1: Driver config not set.
app := &SparkApplication{
Spec: SparkApplicationSpec{},


@@ -15,7 +15,6 @@ limitations under the License.
*/
// +k8s:deepcopy-gen=package,register
// go:generate controller-gen crd:trivialVersions=true paths=. output:dir=.
// Package v1beta2 is the v1beta2 version of the API.
// +groupName=sparkoperator.k8s.io


@@ -0,0 +1,36 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package v1beta2 contains API Schema definitions for the v1beta2 API group
// +kubebuilder:object:generate=true
// +groupName=sparkoperator.k8s.io
package v1beta2
import (
"k8s.io/apimachinery/pkg/runtime/schema"
"sigs.k8s.io/controller-runtime/pkg/scheme"
)
var (
// GroupVersion is group version used to register these objects.
GroupVersion = schema.GroupVersion{Group: "sparkoperator.k8s.io", Version: "v1beta2"}
// SchemeBuilder is used to add go types to the GroupVersionKind scheme.
SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion}
// AddToScheme adds the types in this group-version to the given scheme.
AddToScheme = SchemeBuilder.AddToScheme
)


@@ -1,5 +1,5 @@
/*
Copyright 2017 Google LLC
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -14,8 +14,4 @@ See the License for the specific language governing permissions and
limitations under the License.
*/
package sparkoperator
const (
GroupName = "sparkoperator.k8s.io"
)
package v1beta2


@@ -1,5 +1,5 @@
/*
Copyright 2017 Google LLC
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -17,36 +17,18 @@ limitations under the License.
package v1beta2
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
"github.com/kubeflow/spark-operator/pkg/apis/sparkoperator.k8s.io"
)
const Version = "v1beta2"
var (
SchemeBuilder = runtime.NewSchemeBuilder(addKnownTypes)
AddToScheme = SchemeBuilder.AddToScheme
const (
Group = "sparkoperator.k8s.io"
Version = "v1beta2"
)
// SchemeGroupVersion is the group version used to register these objects.
var SchemeGroupVersion = schema.GroupVersion{Group: sparkoperator.GroupName, Version: Version}
var SchemeGroupVersion = schema.GroupVersion{Group: Group, Version: Version}
// Resource takes an unqualified resource and returns a Group-qualified GroupResource.
func Resource(resource string) schema.GroupResource {
return SchemeGroupVersion.WithResource(resource).GroupResource()
}
// addKnownTypes adds the set of types defined in this package to the supplied scheme.
func addKnownTypes(scheme *runtime.Scheme) error {
scheme.AddKnownTypes(SchemeGroupVersion,
&SparkApplication{},
&SparkApplicationList{},
&ScheduledSparkApplication{},
&ScheduledSparkApplicationList{},
)
metav1.AddToGroupVersion(scheme, SchemeGroupVersion)
return nil
}


@@ -0,0 +1,135 @@
/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1beta2
import (
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
func init() {
SchemeBuilder.Register(&ScheduledSparkApplication{}, &ScheduledSparkApplicationList{})
}
// ScheduledSparkApplicationSpec defines the desired state of ScheduledSparkApplication.
type ScheduledSparkApplicationSpec struct {
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// Schedule is a cron schedule on which the application should run.
Schedule string `json:"schedule"`
// TimeZone is the time zone in which the cron schedule will be interpreted.
// This value is passed to time.LoadLocation, so it must be either "Local", "UTC",
// or a valid IANA location name e.g. "America/New_York".
// +optional
// Defaults to "Local".
TimeZone string `json:"timeZone,omitempty"`
// Template is a template from which SparkApplication instances can be created.
Template SparkApplicationSpec `json:"template"`
// Suspend is a flag telling the controller to suspend subsequent runs of the application if set to true.
// +optional
// Defaults to false.
Suspend *bool `json:"suspend,omitempty"`
// ConcurrencyPolicy is the policy governing concurrent SparkApplication runs.
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// SuccessfulRunHistoryLimit is the number of past successful runs of the application to keep.
// +optional
// Defaults to 1.
SuccessfulRunHistoryLimit *int32 `json:"successfulRunHistoryLimit,omitempty"`
// FailedRunHistoryLimit is the number of past failed runs of the application to keep.
// +optional
// Defaults to 1.
FailedRunHistoryLimit *int32 `json:"failedRunHistoryLimit,omitempty"`
}
// ScheduledSparkApplicationStatus defines the observed state of ScheduledSparkApplication.
type ScheduledSparkApplicationStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// LastRun is the time when the last run of the application started.
// +nullable
LastRun metav1.Time `json:"lastRun,omitempty"`
// NextRun is the time when the next run of the application will start.
// +nullable
NextRun metav1.Time `json:"nextRun,omitempty"`
// LastRunName is the name of the SparkApplication for the most recent run of the application.
LastRunName string `json:"lastRunName,omitempty"`
// PastSuccessfulRunNames keeps the names of SparkApplications for past successful runs.
PastSuccessfulRunNames []string `json:"pastSuccessfulRunNames,omitempty"`
// PastFailedRunNames keeps the names of SparkApplications for past failed runs.
PastFailedRunNames []string `json:"pastFailedRunNames,omitempty"`
// ScheduleState is the current scheduling state of the application.
ScheduleState ScheduleState `json:"scheduleState,omitempty"`
// Reason tells why the ScheduledSparkApplication is in the particular ScheduleState.
Reason string `json:"reason,omitempty"`
}
// +kubebuilder:object:root=true
// +kubebuilder:metadata:annotations="api-approved.kubernetes.io=https://github.com/kubeflow/spark-operator/pull/1298"
// +kubebuilder:resource:scope=Namespaced,shortName=scheduledsparkapp,singular=scheduledsparkapplication
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=.spec.schedule,name=Schedule,type=string
// +kubebuilder:printcolumn:JSONPath=.spec.timeZone,name=TimeZone,type=string
// +kubebuilder:printcolumn:JSONPath=.spec.suspend,name=Suspend,type=string
// +kubebuilder:printcolumn:JSONPath=.status.lastRun,name=Last Run,type=date
// +kubebuilder:printcolumn:JSONPath=.status.lastRunName,name=Last Run Name,type=string
// +kubebuilder:printcolumn:JSONPath=.metadata.creationTimestamp,name=Age,type=date
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +genclient
// ScheduledSparkApplication is the Schema for the scheduledsparkapplications API.
type ScheduledSparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec ScheduledSparkApplicationSpec `json:"spec"`
Status ScheduledSparkApplicationStatus `json:"status,omitempty"`
}
// +kubebuilder:object:root=true
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// ScheduledSparkApplicationList contains a list of ScheduledSparkApplication.
type ScheduledSparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []ScheduledSparkApplication `json:"items"`
}
type ConcurrencyPolicy string
const (
// ConcurrencyAllow allows SparkApplications to run concurrently.
ConcurrencyAllow ConcurrencyPolicy = "Allow"
// ConcurrencyForbid forbids concurrent runs of SparkApplications, skipping the next run if the previous
// one hasn't finished yet.
ConcurrencyForbid ConcurrencyPolicy = "Forbid"
// ConcurrencyReplace kills the currently running SparkApplication instance and replaces it with a new one.
ConcurrencyReplace ConcurrencyPolicy = "Replace"
)
type ScheduleState string
const (
ScheduleStateNew ScheduleState = ""
ScheduleStateValidating ScheduleState = "Validating"
ScheduleStateScheduled ScheduleState = "Scheduled"
ScheduleStateFailedValidation ScheduleState = "FailedValidation"
)
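A `ScheduledSparkApplicationSpec` can then be populated directly; a hedged sketch with illustrative values (the cron expression and limits are examples, not defaults):

```go
// exampleScheduledSpec is a hypothetical constructor for a nightly run
// that forbids overlapping executions and keeps one run of history.
func exampleScheduledSpec() ScheduledSparkApplicationSpec {
	suspend := false
	keep := int32(1)
	return ScheduledSparkApplicationSpec{
		Schedule:                  "0 2 * * *", // 02:00 every day
		TimeZone:                  "UTC",
		ConcurrencyPolicy:         ConcurrencyForbid,
		Suspend:                   &suspend,
		SuccessfulRunHistoryLimit: &keep,
		FailedRunHistoryLimit:     &keep,
		Template:                  SparkApplicationSpec{}, // filled in like a normal SparkApplication spec
	}
}
```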


@@ -1,5 +1,5 @@
/*
Copyright 2017 Google LLC
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -17,180 +17,24 @@ limitations under the License.
package v1beta2
import (
apiv1 "k8s.io/api/core/v1"
corev1 "k8s.io/api/core/v1"
networkingv1 "k8s.io/api/networking/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// SparkApplicationType describes the type of a Spark application.
type SparkApplicationType string
// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.
// Different types of Spark applications.
const (
JavaApplicationType SparkApplicationType = "Java"
ScalaApplicationType SparkApplicationType = "Scala"
PythonApplicationType SparkApplicationType = "Python"
RApplicationType SparkApplicationType = "R"
)
// DeployMode describes the type of deployment of a Spark application.
type DeployMode string
// Different types of deployments.
const (
ClusterMode DeployMode = "cluster"
ClientMode DeployMode = "client"
InClusterClientMode DeployMode = "in-cluster-client"
)
// RestartPolicy is the policy of if and in which conditions the controller should restart a terminated application.
// This completely defines actions to be taken on any kind of Failures during an application run.
type RestartPolicy struct {
// Type specifies the RestartPolicyType.
// +kubebuilder:validation:Enum={Never,Always,OnFailure}
Type RestartPolicyType `json:"type,omitempty"`
// OnSubmissionFailureRetries is the number of times to retry submitting an application before giving up.
// This is best effort and actual retry attempts can be >= the value specified due to caching.
// These are required if RestartPolicy is OnFailure.
// +kubebuilder:validation:Minimum=0
// +optional
OnSubmissionFailureRetries *int32 `json:"onSubmissionFailureRetries,omitempty"`
// OnFailureRetries the number of times to retry running an application before giving up.
// +kubebuilder:validation:Minimum=0
// +optional
OnFailureRetries *int32 `json:"onFailureRetries,omitempty"`
// OnSubmissionFailureRetryInterval is the interval in seconds between retries on failed submissions.
// +kubebuilder:validation:Minimum=1
// +optional
OnSubmissionFailureRetryInterval *int64 `json:"onSubmissionFailureRetryInterval,omitempty"`
// OnFailureRetryInterval is the interval in seconds between retries on failed runs.
// +kubebuilder:validation:Minimum=1
// +optional
OnFailureRetryInterval *int64 `json:"onFailureRetryInterval,omitempty"`
func init() {
SchemeBuilder.Register(&SparkApplication{}, &SparkApplicationList{})
}
type RestartPolicyType string
const (
Never RestartPolicyType = "Never"
OnFailure RestartPolicyType = "OnFailure"
Always RestartPolicyType = "Always"
)
// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +k8s:defaulter-gen=true
// +kubebuilder:metadata:annotations="api-approved.kubernetes.io=https://github.com/kubeflow/spark-operator/pull/1298"
// +kubebuilder:resource:scope=Namespaced,shortName=scheduledsparkapp,singular=scheduledsparkapplication
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=.spec.schedule,name=Schedule,type=string
// +kubebuilder:printcolumn:JSONPath=.spec.suspend,name=Suspend,type=string
// +kubebuilder:printcolumn:JSONPath=.status.lastRun,name=Last Run,type=date
// +kubebuilder:printcolumn:JSONPath=.status.lastRunName,name=Last Run Name,type=string
// +kubebuilder:printcolumn:JSONPath=.metadata.creationTimestamp,name=Age,type=date
type ScheduledSparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec ScheduledSparkApplicationSpec `json:"spec"`
Status ScheduledSparkApplicationStatus `json:"status,omitempty"`
}
type ConcurrencyPolicy string
const (
// ConcurrencyAllow allows SparkApplications to run concurrently.
ConcurrencyAllow ConcurrencyPolicy = "Allow"
// ConcurrencyForbid forbids concurrent runs of SparkApplications, skipping the next run if the previous
// one hasn't finished yet.
ConcurrencyForbid ConcurrencyPolicy = "Forbid"
// ConcurrencyReplace kills the currently running SparkApplication instance and replaces it with a new one.
ConcurrencyReplace ConcurrencyPolicy = "Replace"
)
type ScheduledSparkApplicationSpec struct {
// Schedule is a cron schedule on which the application should run.
Schedule string `json:"schedule"`
// Template is a template from which SparkApplication instances can be created.
Template SparkApplicationSpec `json:"template"`
// Suspend is a flag telling the controller to suspend subsequent runs of the application if set to true.
// +optional
// Defaults to false.
Suspend *bool `json:"suspend,omitempty"`
// ConcurrencyPolicy is the policy governing concurrent SparkApplication runs.
ConcurrencyPolicy ConcurrencyPolicy `json:"concurrencyPolicy,omitempty"`
// SuccessfulRunHistoryLimit is the number of past successful runs of the application to keep.
// +optional
// Defaults to 1.
SuccessfulRunHistoryLimit *int32 `json:"successfulRunHistoryLimit,omitempty"`
// FailedRunHistoryLimit is the number of past failed runs of the application to keep.
// +optional
// Defaults to 1.
FailedRunHistoryLimit *int32 `json:"failedRunHistoryLimit,omitempty"`
}
type ScheduleState string
const (
FailedValidationState ScheduleState = "FailedValidation"
ScheduledState ScheduleState = "Scheduled"
)
type ScheduledSparkApplicationStatus struct {
// LastRun is the time when the last run of the application started.
// +nullable
LastRun metav1.Time `json:"lastRun,omitempty"`
// NextRun is the time when the next run of the application will start.
// +nullable
NextRun metav1.Time `json:"nextRun,omitempty"`
// LastRunName is the name of the SparkApplication for the most recent run of the application.
LastRunName string `json:"lastRunName,omitempty"`
// PastSuccessfulRunNames keeps the names of SparkApplications for past successful runs.
PastSuccessfulRunNames []string `json:"pastSuccessfulRunNames,omitempty"`
// PastFailedRunNames keeps the names of SparkApplications for past failed runs.
PastFailedRunNames []string `json:"pastFailedRunNames,omitempty"`
// ScheduleState is the current scheduling state of the application.
ScheduleState ScheduleState `json:"scheduleState,omitempty"`
// Reason tells why the ScheduledSparkApplication is in the particular ScheduleState.
Reason string `json:"reason,omitempty"`
}
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// ScheduledSparkApplicationList carries a list of ScheduledSparkApplication objects.
type ScheduledSparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []ScheduledSparkApplication `json:"items,omitempty"`
}
// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +k8s:defaulter-gen=true
// +kubebuilder:metadata:annotations="api-approved.kubernetes.io=https://github.com/kubeflow/spark-operator/pull/1298"
// +kubebuilder:resource:scope=Namespaced,shortName=sparkapp,singular=sparkapplication
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=.status.applicationState.state,name=Status,type=string
// +kubebuilder:printcolumn:JSONPath=.status.executionAttempts,name=Attempts,type=string
// +kubebuilder:printcolumn:JSONPath=.status.lastSubmissionAttemptTime,name=Start,type=string
// +kubebuilder:printcolumn:JSONPath=.status.terminationTime,name=Finish,type=string
// +kubebuilder:printcolumn:JSONPath=.metadata.creationTimestamp,name=Age,type=date
// SparkApplication represents a Spark application running on and using Kubernetes as a cluster manager.
type SparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec SparkApplicationSpec `json:"spec"`
Status SparkApplicationStatus `json:"status,omitempty"`
}
// SparkApplicationSpec describes the specification of a Spark application using Kubernetes as a cluster manager.
// SparkApplicationSpec defines the desired state of SparkApplication
// It carries every piece of information a spark-submit command takes and recognizes.
type SparkApplicationSpec struct {
// INSERT ADDITIONAL SPEC FIELDS - desired state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// Type tells the type of the Spark application.
// +kubebuilder:validation:Enum={Java,Python,Scala,R}
Type SparkApplicationType `json:"type"`
@@ -218,7 +62,6 @@ type SparkApplicationSpec struct {
// +optional
MainClass *string `json:"mainClass,omitempty"`
// MainFile is the path to a bundled JAR, Python, or R file of the application.
// +optional
MainApplicationFile *string `json:"mainApplicationFile"`
// Arguments is a list of arguments to be passed to the application.
// +optional
@@ -227,8 +70,8 @@ type SparkApplicationSpec struct {
// spark-submit.
// +optional
SparkConf map[string]string `json:"sparkConf,omitempty"`
// HadoopConf carries user-specified Hadoop configuration properties as they would use the the "--conf" option
// in spark-submit. The SparkApplication controller automatically adds prefix "spark.hadoop." to Hadoop
// HadoopConf carries user-specified Hadoop configuration properties as they would use the "--conf" option
// in spark-submit. The SparkApplication controller automatically adds prefix "spark.hadoop." to Hadoop
// configuration properties.
// +optional
HadoopConf map[string]string `json:"hadoopConf,omitempty"`
@@ -242,7 +85,7 @@ type SparkApplicationSpec struct {
HadoopConfigMap *string `json:"hadoopConfigMap,omitempty"`
// Volumes is the list of Kubernetes volumes that can be mounted by the driver and/or executors.
// +optional
Volumes []apiv1.Volume `json:"volumes,omitempty"`
Volumes []corev1.Volume `json:"volumes,omitempty"`
// Driver is the driver specification.
Driver DriverSpec `json:"driver"`
// Executor is the executor specification.
@@ -301,124 +144,11 @@ type SparkApplicationSpec struct {
DynamicAllocation *DynamicAllocation `json:"dynamicAllocation,omitempty"`
}
// BatchSchedulerConfiguration used to configure how to batch scheduling Spark Application
type BatchSchedulerConfiguration struct {
// Queue stands for the resource queue which the application belongs to, it's being used in Volcano batch scheduler.
// +optional
Queue *string `json:"queue,omitempty"`
// PriorityClassName stands for the name of k8s PriorityClass resource, it's being used in Volcano batch scheduler.
// +optional
PriorityClassName *string `json:"priorityClassName,omitempty"`
// Resources stands for the resource list custom request for. Usually it is used to define the lower-bound limit.
// If specified, volcano scheduler will consider it as the resources requested.
// +optional
Resources apiv1.ResourceList `json:"resources,omitempty"`
}
// SparkUIConfiguration is for driver UI specific configuration parameters.
type SparkUIConfiguration struct {
// ServicePort allows configuring the port at service level that might be different from the targetPort.
// TargetPort should be the same as the one defined in spark.ui.port
// +optional
ServicePort *int32 `json:"servicePort"`
// ServicePortName allows configuring the name of the service port.
// This may be useful for sidecar proxies like Envoy injected by Istio which require specific ports names to treat traffic as proper HTTP.
// Defaults to spark-driver-ui-port.
// +optional
ServicePortName *string `json:"servicePortName"`
// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *apiv1.ServiceType `json:"serviceType"`
// ServiceAnnotations is a map of key,value pairs of annotations that might be added to the service object.
// +optional
ServiceAnnotations map[string]string `json:"serviceAnnotations,omitempty"`
// ServiceLables is a map of key,value pairs of labels that might be added to the service object.
// +optional
ServiceLabels map[string]string `json:"serviceLabels,omitempty"`
// IngressAnnotations is a map of key,value pairs of annotations that might be added to the ingress object. i.e. specify nginx as ingress.class
// +optional
IngressAnnotations map[string]string `json:"ingressAnnotations,omitempty"`
// TlsHosts is useful If we need to declare SSL certificates to the ingress object
// +optional
IngressTLS []networkingv1.IngressTLS `json:"ingressTLS,omitempty"`
}
// DriverIngressConfiguration is for driver ingress specific configuration parameters.
type DriverIngressConfiguration struct {
// ServicePort allows configuring the port at service level that might be different from the targetPort.
ServicePort *int32 `json:"servicePort"`
// ServicePortName allows configuring the name of the service port.
// This may be useful for sidecar proxies like Envoy injected by Istio which require specific ports names to treat traffic as proper HTTP.
ServicePortName *string `json:"servicePortName"`
// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *apiv1.ServiceType `json:"serviceType"`
// ServiceAnnotations is a map of key,value pairs of annotations that might be added to the service object.
// +optional
ServiceAnnotations map[string]string `json:"serviceAnnotations,omitempty"`
// ServiceLables is a map of key,value pairs of labels that might be added to the service object.
// +optional
ServiceLabels map[string]string `json:"serviceLabels,omitempty"`
// IngressURLFormat is the URL for the ingress.
IngressURLFormat string `json:"ingressURLFormat,omitempty"`
// IngressAnnotations is a map of key,value pairs of annotations that might be added to the ingress object. i.e. specify nginx as ingress.class
// +optional
IngressAnnotations map[string]string `json:"ingressAnnotations,omitempty"`
// TlsHosts is useful If we need to declare SSL certificates to the ingress object
// +optional
IngressTLS []networkingv1.IngressTLS `json:"ingressTLS,omitempty"`
}
// ApplicationStateType represents the type of the current state of an application.
type ApplicationStateType string
// Different states an application may have.
const (
NewState ApplicationStateType = ""
SubmittedState ApplicationStateType = "SUBMITTED"
RunningState ApplicationStateType = "RUNNING"
CompletedState ApplicationStateType = "COMPLETED"
FailedState ApplicationStateType = "FAILED"
FailedSubmissionState ApplicationStateType = "SUBMISSION_FAILED"
PendingRerunState ApplicationStateType = "PENDING_RERUN"
InvalidatingState ApplicationStateType = "INVALIDATING"
SucceedingState ApplicationStateType = "SUCCEEDING"
FailingState ApplicationStateType = "FAILING"
UnknownState ApplicationStateType = "UNKNOWN"
)
// ApplicationState tells the current state of the application and an error message in case of failures.
type ApplicationState struct {
State ApplicationStateType `json:"state"`
ErrorMessage string `json:"errorMessage,omitempty"`
}
// DriverState tells the current state of a spark driver.
type DriverState string
// Different states a spark driver may have.
const (
DriverPendingState DriverState = "PENDING"
DriverRunningState DriverState = "RUNNING"
DriverCompletedState DriverState = "COMPLETED"
DriverFailedState DriverState = "FAILED"
DriverUnknownState DriverState = "UNKNOWN"
)
// ExecutorState tells the current state of an executor.
type ExecutorState string
// Different states an executor may have.
const (
ExecutorPendingState ExecutorState = "PENDING"
ExecutorRunningState ExecutorState = "RUNNING"
ExecutorCompletedState ExecutorState = "COMPLETED"
ExecutorFailedState ExecutorState = "FAILED"
ExecutorUnknownState ExecutorState = "UNKNOWN"
)
// SparkApplicationStatus describes the current status of a Spark application.
// SparkApplicationStatus defines the observed state of SparkApplication
type SparkApplicationStatus struct {
// INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
// Important: Run "make generate" to regenerate code after modifying this file
// SparkApplicationID is set by the spark-distribution(via spark.app.id config) on the driver and executor pods
SparkApplicationID string `json:"sparkApplicationId,omitempty"`
// SubmissionID is a unique ID of the current submission of the application.
@ -443,15 +173,212 @@ type SparkApplicationStatus struct {
SubmissionAttempts int32 `json:"submissionAttempts,omitempty"`
}
// +kubebuilder:object:root=true
// +kubebuilder:metadata:annotations="api-approved.kubernetes.io=https://github.com/kubeflow/spark-operator/pull/1298"
// +kubebuilder:resource:scope=Namespaced,shortName=sparkapp,singular=sparkapplication
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:JSONPath=.status.applicationState.state,name=Status,type=string
// +kubebuilder:printcolumn:JSONPath=.status.executionAttempts,name=Attempts,type=string
// +kubebuilder:printcolumn:JSONPath=.status.lastSubmissionAttemptTime,name=Start,type=string
// +kubebuilder:printcolumn:JSONPath=.status.terminationTime,name=Finish,type=string
// +kubebuilder:printcolumn:JSONPath=.metadata.creationTimestamp,name=Age,type=date
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// +genclient
// SparkApplication is the Schema for the sparkapplications API
type SparkApplication struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata"`
Spec SparkApplicationSpec `json:"spec"`
Status SparkApplicationStatus `json:"status,omitempty"`
}
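As a quick orientation to these types, here is a minimal Go sketch of constructing a `SparkApplication` object in code; the import path and the `spark-pi` name are assumptions for illustration, and a real spec would also fill in `Spec` fields:

```go
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

func main() {
	// A bare SparkApplication; a real spec would also set fields such as
	// the container image and the main application file.
	app := v1beta2.SparkApplication{
		TypeMeta: metav1.TypeMeta{
			APIVersion: "sparkoperator.k8s.io/v1beta2",
			Kind:       "SparkApplication",
		},
		ObjectMeta: metav1.ObjectMeta{Name: "spark-pi", Namespace: "default"},
	}
	fmt.Println(app.Name)
}
```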
// +kubebuilder:object:root=true
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
// SparkApplicationList contains a list of SparkApplication
type SparkApplicationList struct {
metav1.TypeMeta `json:",inline"`
metav1.ListMeta `json:"metadata,omitempty"`
Items []SparkApplication `json:"items"`
}
// SparkApplicationType describes the type of a Spark application.
type SparkApplicationType string
// Different types of Spark applications.
const (
SparkApplicationTypeJava SparkApplicationType = "Java"
SparkApplicationTypeScala SparkApplicationType = "Scala"
SparkApplicationTypePython SparkApplicationType = "Python"
SparkApplicationTypeR SparkApplicationType = "R"
)
// DeployMode describes the type of deployment of a Spark application.
type DeployMode string
// Different types of deployments.
const (
DeployModeCluster DeployMode = "cluster"
DeployModeClient DeployMode = "client"
DeployModeInClusterClient DeployMode = "in-cluster-client"
)
// RestartPolicy is the policy of whether and under which conditions the controller should restart a terminated application.
// It completely defines the actions to be taken on any kind of failure during an application run.
type RestartPolicy struct {
// Type specifies the RestartPolicyType.
// +kubebuilder:validation:Enum={Never,Always,OnFailure}
Type RestartPolicyType `json:"type,omitempty"`
// OnSubmissionFailureRetries is the number of times to retry submitting an application before giving up.
// This is best-effort; due to caching, the actual number of retry attempts can be >= the value specified.
// This field is required if RestartPolicy is OnFailure.
// +kubebuilder:validation:Minimum=0
// +optional
OnSubmissionFailureRetries *int32 `json:"onSubmissionFailureRetries,omitempty"`
// OnFailureRetries is the number of times to retry running an application before giving up.
// +kubebuilder:validation:Minimum=0
// +optional
OnFailureRetries *int32 `json:"onFailureRetries,omitempty"`
// OnSubmissionFailureRetryInterval is the interval in seconds between retries on failed submissions.
// +kubebuilder:validation:Minimum=1
// +optional
OnSubmissionFailureRetryInterval *int64 `json:"onSubmissionFailureRetryInterval,omitempty"`
// OnFailureRetryInterval is the interval in seconds between retries on failed runs.
// +kubebuilder:validation:Minimum=1
// +optional
OnFailureRetryInterval *int64 `json:"onFailureRetryInterval,omitempty"`
}
type RestartPolicyType string
const (
RestartPolicyNever RestartPolicyType = "Never"
RestartPolicyOnFailure RestartPolicyType = "OnFailure"
RestartPolicyAlways RestartPolicyType = "Always"
)
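To make the retry semantics concrete, a minimal sketch (assuming the same import path as above) of an `OnFailure` policy that retries runs three times at 10-second intervals and failed submissions five times at 20-second intervals:

```go
package main

import (
	"fmt"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

func main() {
	runRetries := int32(3)
	runInterval := int64(10) // seconds between retries of failed runs
	subRetries := int32(5)
	subInterval := int64(20) // seconds between retries of failed submissions
	policy := v1beta2.RestartPolicy{
		Type:                             v1beta2.RestartPolicyOnFailure,
		OnFailureRetries:                 &runRetries,
		OnFailureRetryInterval:           &runInterval,
		OnSubmissionFailureRetries:       &subRetries,
		OnSubmissionFailureRetryInterval: &subInterval,
	}
	fmt.Printf("%+v\n", policy)
}
```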
// BatchSchedulerConfiguration is used to configure how to batch schedule Spark applications.
type BatchSchedulerConfiguration struct {
// Queue stands for the resource queue to which the application belongs; it is used by the Volcano batch scheduler.
// +optional
Queue *string `json:"queue,omitempty"`
// PriorityClassName stands for the name of the k8s PriorityClass resource; it is used by the Volcano batch scheduler.
// +optional
PriorityClassName *string `json:"priorityClassName,omitempty"`
// Resources stands for the custom resource list requested for the application. Usually it is used to define the lower-bound resource limit.
// If specified, the Volcano scheduler will consider it as the resources requested.
// +optional
Resources corev1.ResourceList `json:"resources,omitempty"`
}
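For example, a sketch of a Volcano-oriented configuration; the queue and PriorityClass names are hypothetical:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

func main() {
	queue := "spark-queue" // hypothetical Volcano queue
	pc := "high-priority"  // hypothetical PriorityClass name
	cfg := v1beta2.BatchSchedulerConfiguration{
		Queue:             &queue,
		PriorityClassName: &pc,
		// Lower-bound resources that Volcano treats as the requested amount.
		Resources: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("4"),
			corev1.ResourceMemory: resource.MustParse("8Gi"),
		},
	}
	fmt.Printf("%+v\n", cfg)
}
```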
// SparkUIConfiguration is for driver UI specific configuration parameters.
type SparkUIConfiguration struct {
// ServicePort allows configuring the port at service level that might be different from the targetPort.
// TargetPort should be the same as the one defined in spark.ui.port
// +optional
ServicePort *int32 `json:"servicePort,omitempty"`
// ServicePortName allows configuring the name of the service port.
// This may be useful for sidecar proxies like Envoy injected by Istio which require specific port names to treat traffic as proper HTTP.
// Defaults to spark-driver-ui-port.
// +optional
ServicePortName *string `json:"servicePortName,omitempty"`
// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *corev1.ServiceType `json:"serviceType,omitempty"`
// ServiceAnnotations is a map of key,value pairs of annotations that might be added to the service object.
// +optional
ServiceAnnotations map[string]string `json:"serviceAnnotations,omitempty"`
// ServiceLabels is a map of key,value pairs of labels that might be added to the service object.
// +optional
ServiceLabels map[string]string `json:"serviceLabels,omitempty"`
// IngressAnnotations is a map of key,value pairs of annotations that might be added to the ingress object, e.g. to specify nginx as the ingress.class.
// +optional
IngressAnnotations map[string]string `json:"ingressAnnotations,omitempty"`
// IngressTLS is useful if we need to declare SSL certificates for the ingress object.
// +optional
IngressTLS []networkingv1.IngressTLS `json:"ingressTLS,omitempty"`
}
// DriverIngressConfiguration is for driver ingress specific configuration parameters.
type DriverIngressConfiguration struct {
// ServicePort allows configuring the port at service level that might be different from the targetPort.
ServicePort *int32 `json:"servicePort"`
// ServicePortName allows configuring the name of the service port.
// This may be useful for sidecar proxies like Envoy injected by Istio which require specific port names to treat traffic as proper HTTP.
ServicePortName *string `json:"servicePortName"`
// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *corev1.ServiceType `json:"serviceType,omitempty"`
// ServiceAnnotations is a map of key,value pairs of annotations that might be added to the service object.
// +optional
ServiceAnnotations map[string]string `json:"serviceAnnotations,omitempty"`
// ServiceLabels is a map of key,value pairs of labels that might be added to the service object.
// +optional
ServiceLabels map[string]string `json:"serviceLabels,omitempty"`
// IngressURLFormat is the URL for the ingress.
IngressURLFormat string `json:"ingressURLFormat,omitempty"`
// IngressAnnotations is a map of key,value pairs of annotations that might be added to the ingress object, e.g. to specify nginx as the ingress.class.
// +optional
IngressAnnotations map[string]string `json:"ingressAnnotations,omitempty"`
// IngressTLS is useful if we need to declare SSL certificates for the ingress object.
// +optional
IngressTLS []networkingv1.IngressTLS `json:"ingressTLS,omitempty"`
}
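A sketch of wiring these fields together; the port name, URL format, and annotation value are hypothetical:

```go
package main

import (
	"fmt"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

func main() {
	port := int32(4040)
	portName := "http-driver" // Istio-friendly name; hypothetical
	ingress := v1beta2.DriverIngressConfiguration{
		ServicePort:      &port,
		ServicePortName:  &portName,
		IngressURLFormat: "{{$appName}}.example.com", // hypothetical URL format
		IngressAnnotations: map[string]string{
			"kubernetes.io/ingress.class": "nginx",
		},
	}
	fmt.Printf("%+v\n", ingress)
}
```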
// ApplicationStateType represents the type of the current state of an application.
type ApplicationStateType string
// Different states an application may have.
const (
ApplicationStateNew ApplicationStateType = ""
ApplicationStateSubmitted ApplicationStateType = "SUBMITTED"
ApplicationStateRunning ApplicationStateType = "RUNNING"
ApplicationStateCompleted ApplicationStateType = "COMPLETED"
ApplicationStateFailed ApplicationStateType = "FAILED"
ApplicationStateFailedSubmission ApplicationStateType = "SUBMISSION_FAILED"
ApplicationStatePendingRerun ApplicationStateType = "PENDING_RERUN"
ApplicationStateInvalidating ApplicationStateType = "INVALIDATING"
ApplicationStateSucceeding ApplicationStateType = "SUCCEEDING"
ApplicationStateFailing ApplicationStateType = "FAILING"
ApplicationStateUnknown ApplicationStateType = "UNKNOWN"
)
// ApplicationState tells the current state of the application and an error message in case of failures.
type ApplicationState struct {
State ApplicationStateType `json:"state"`
ErrorMessage string `json:"errorMessage,omitempty"`
}
// DriverState tells the current state of a spark driver.
type DriverState string
// Different states a spark driver may have.
const (
DriverStatePending DriverState = "PENDING"
DriverStateRunning DriverState = "RUNNING"
DriverStateCompleted DriverState = "COMPLETED"
DriverStateFailed DriverState = "FAILED"
DriverStateUnknown DriverState = "UNKNOWN"
)
// ExecutorState tells the current state of an executor.
type ExecutorState string
// Different states an executor may have.
const (
ExecutorStatePending ExecutorState = "PENDING"
ExecutorStateRunning ExecutorState = "RUNNING"
ExecutorStateCompleted ExecutorState = "COMPLETED"
ExecutorStateFailed ExecutorState = "FAILED"
ExecutorStateUnknown ExecutorState = "UNKNOWN"
)
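These constants make state checks straightforward; below is a hypothetical helper (a sketch, not the controller's actual logic) that treats COMPLETED and FAILED as terminal:

```go
package main

import (
	"fmt"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

// isTerminal reports whether an application state needs no further
// reconciliation; a hypothetical helper for illustration only.
func isTerminal(state v1beta2.ApplicationStateType) bool {
	switch state {
	case v1beta2.ApplicationStateCompleted, v1beta2.ApplicationStateFailed:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(isTerminal(v1beta2.ApplicationStateRunning)) // false
}
```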
// Dependencies specifies all possible types of dependencies of a Spark application.
type Dependencies struct {
// Jars is a list of JAR files the Spark application depends on.
@ -477,11 +404,22 @@ type Dependencies struct {
// given with the "packages" option.
// +optional
Repositories []string `json:"repositories,omitempty"`
// Archives is a list of archives to be extracted into the working directory of each executor.
// +optional
Archives []string `json:"archives,omitempty"`
}
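A short sketch of a dependency set; the URLs are placeholders:

```go
package main

import (
	"fmt"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

func main() {
	deps := v1beta2.Dependencies{
		Jars: []string{"local:///opt/spark/examples/jars/spark-examples.jar"},
		// Spark convention: the fragment after '#' names the directory
		// the archive is unpacked into on each executor.
		Archives: []string{"https://example.com/data.tgz#data"},
	}
	fmt.Printf("%+v\n", deps)
}
```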
// SparkPodSpec defines common things that can be customized for a Spark driver or executor pod.
// TODO: investigate if we should use v1.PodSpec and limit what can be set instead.
type SparkPodSpec struct {
// Template is a pod template that can be used to define the driver or executor pod configurations that Spark configurations do not support.
// Spark version >= 3.0.0 is required.
// Ref: https://spark.apache.org/docs/latest/running-on-kubernetes.html#pod-template.
// +optional
// +kubebuilder:validation:Schemaless
// +kubebuilder:validation:Type:=object
// +kubebuilder:pruning:PreserveUnknownFields
Template *corev1.PodTemplateSpec `json:"template,omitempty"`
// Cores maps to `spark.driver.cores` or `spark.executor.cores` for the driver and executors, respectively.
// +optional
// +kubebuilder:validation:Minimum=1
@ -492,6 +430,9 @@ type SparkPodSpec struct {
// Memory is the amount of memory to request for the pod.
// +optional
Memory *string `json:"memory,omitempty"`
// MemoryLimit overrides the memory limit of the pod.
// +optional
MemoryLimit *string `json:"memoryLimit,omitempty"`
// MemoryOverhead is the amount of off-heap memory to allocate in cluster mode, in MiB unless otherwise specified.
// +optional
MemoryOverhead *string `json:"memoryOverhead,omitempty"`
@ -509,14 +450,14 @@ type SparkPodSpec struct {
Secrets []SecretInfo `json:"secrets,omitempty"`
// Env carries the environment variables to add to the pod.
// +optional
Env []corev1.EnvVar `json:"env,omitempty"`
// EnvVars carries the environment variables to add to the pod.
// Deprecated. Consider using `env` instead.
// +optional
EnvVars map[string]string `json:"envVars,omitempty"`
// EnvFrom is a list of sources to populate environment variables in the container.
// +optional
EnvFrom []corev1.EnvFromSource `json:"envFrom,omitempty"`
// EnvSecretKeyRefs holds a mapping from environment variable names to SecretKeyRefs.
// Deprecated. Consider using `env` instead.
// +optional
@ -529,28 +470,28 @@ type SparkPodSpec struct {
Annotations map[string]string `json:"annotations,omitempty"`
// VolumeMounts specifies the volumes listed in ".spec.volumes" to mount into the main container's filesystem.
// +optional
VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty"`
// Affinity specifies the affinity/anti-affinity settings for the pod.
// +optional
Affinity *corev1.Affinity `json:"affinity,omitempty"`
// Tolerations specifies the tolerations listed in ".spec.tolerations" to be applied to the pod.
// +optional
Tolerations []corev1.Toleration `json:"tolerations,omitempty"`
// PodSecurityContext specifies the PodSecurityContext to apply.
// +optional
PodSecurityContext *corev1.PodSecurityContext `json:"podSecurityContext,omitempty"`
// SecurityContext specifies the container's SecurityContext to apply.
// +optional
SecurityContext *corev1.SecurityContext `json:"securityContext,omitempty"`
// SchedulerName specifies the scheduler that will be used for scheduling the pod.
// +optional
SchedulerName *string `json:"schedulerName,omitempty"`
// Sidecars is a list of sidecar containers that run alongside the main Spark container.
// +optional
Sidecars []corev1.Container `json:"sidecars,omitempty"`
// InitContainers is a list of init-containers that run to completion before the main Spark container.
// +optional
InitContainers []corev1.Container `json:"initContainers,omitempty"`
// HostNetwork indicates whether to request host networking for the pod or not.
// +optional
HostNetwork *bool `json:"hostNetwork,omitempty"`
@ -560,7 +501,7 @@ type SparkPodSpec struct {
NodeSelector map[string]string `json:"nodeSelector,omitempty"`
// DNSConfig specifies the DNS settings for the pod, following the Kubernetes specifications.
// +optional
DNSConfig *corev1.PodDNSConfig `json:"dnsConfig,omitempty"`
// TerminationGracePeriodSeconds is the termination grace period, in seconds, for the pod.
// +optional
TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"`
@ -569,7 +510,7 @@ type SparkPodSpec struct {
ServiceAccount *string `json:"serviceAccount,omitempty"`
// HostAliases settings for the pod, following the Kubernetes specifications.
// +optional
HostAliases []corev1.HostAlias `json:"hostAliases,omitempty"`
// ShareProcessNamespace settings for the pod, following the Kubernetes specifications.
// +optional
ShareProcessNamespace *bool `json:"shareProcessNamespace,omitempty"`
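Pulling a few of these fields together, a sketch of a pod spec that combines the dedicated fields with a pod template for settings the API does not expose directly (the label value is hypothetical):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

func main() {
	cores := int32(2)
	memory := "4g"
	pod := v1beta2.SparkPodSpec{
		Cores:  &cores,
		Memory: &memory,
		Env: []corev1.EnvVar{
			{Name: "SPARK_LOCAL_DIRS", Value: "/tmp/spark"},
		},
		// The template covers settings the dedicated fields do not expose.
		Template: &corev1.PodTemplateSpec{
			ObjectMeta: metav1.ObjectMeta{
				Labels: map[string]string{"team": "data-platform"}, // hypothetical
			},
		},
	}
	fmt.Printf("%+v\n", pod)
}
```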
@ -595,7 +536,7 @@ type DriverSpec struct {
JavaOptions *string `json:"javaOptions,omitempty"`
// Lifecycle for running preStop or postStart commands
// +optional
Lifecycle *corev1.Lifecycle `json:"lifecycle,omitempty"`
// KubernetesMaster is the URL of the Kubernetes master used by the driver to manage executor pods and
// other Kubernetes resources. Defaults to https://kubernetes.default.svc.
// +optional
@ -611,6 +552,9 @@ type DriverSpec struct {
// Ports settings for the pods, following the Kubernetes specifications.
// +optional
Ports []Port `json:"ports,omitempty"`
// PriorityClassName is the name of the PriorityClass for the driver pod.
// +optional
PriorityClassName *string `json:"priorityClassName,omitempty"`
}
// ExecutorSpec is specification of the executor.
@ -630,7 +574,7 @@ type ExecutorSpec struct {
JavaOptions *string `json:"javaOptions,omitempty"`
// Lifecycle for running preStop or postStart commands
// +optional
Lifecycle *corev1.Lifecycle `json:"lifecycle,omitempty"`
// DeleteOnTermination specifies whether executor pods should be deleted in case of failure or normal termination.
// Maps to `spark.kubernetes.executor.deleteOnTermination`, which is available since Spark 3.0.
// +optional
@ -638,6 +582,9 @@ type ExecutorSpec struct {
// Ports settings for the pods, following the Kubernetes specifications.
// +optional
Ports []Port `json:"ports,omitempty"`
// PriorityClassName is the name of the PriorityClass for the executor pod.
// +optional
PriorityClassName *string `json:"priorityClassName,omitempty"`
}
// NamePath is a pair of a name and a path to which the named objects should be mounted.
@ -651,22 +598,22 @@ type SecretType string
// An enumeration of secret types supported.
const (
// SecretTypeGCPServiceAccount is for secrets from a GCP service account JSON key file that needs
// the environment variable GOOGLE_APPLICATION_CREDENTIALS.
SecretTypeGCPServiceAccount SecretType = "GCPServiceAccount"
// SecretTypeHadoopDelegationToken is for secrets from a Hadoop delegation token that needs the
// environment variable HADOOP_TOKEN_FILE_LOCATION.
SecretTypeHadoopDelegationToken SecretType = "HadoopDelegationToken"
// SecretTypeGeneric is for secrets that need no special handling.
SecretTypeGeneric SecretType = "Generic"
)
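As a usage sketch, a secret entry in a pod spec might reference these constants as below; the `SecretInfo` field names here are assumptions based on the surrounding API, not confirmed by this diff:

```go
package main

import (
	"fmt"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

func main() {
	// Field names assumed for illustration.
	secret := v1beta2.SecretInfo{
		Name: "gcp-svc-account",
		Path: "/mnt/secrets",
		Type: v1beta2.SecretTypeGCPServiceAccount,
	}
	fmt.Printf("%+v\n", secret)
}
```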
// DriverInfo captures information about the driver.
type DriverInfo struct {
WebUIServiceName string `json:"webUIServiceName,omitempty"`
// UI Details for the UI created via a ClusterIP service accessible from within the cluster.
WebUIAddress string `json:"webUIAddress,omitempty"`
WebUIPort int32 `json:"webUIPort,omitempty"`
// Ingress Details if an ingress for the UI was created.
WebUIIngressName string `json:"webUIIngressName,omitempty"`
WebUIIngressAddress string `json:"webUIIngressAddress,omitempty"`
@ -759,44 +706,14 @@ type DynamicAllocation struct {
// MaxExecutors is the upper bound for the number of executors if dynamic allocation is enabled.
// +optional
MaxExecutors *int32 `json:"maxExecutors,omitempty"`
// ShuffleTrackingEnabled enables shuffle file tracking for executors, which allows dynamic allocation without
// the need for an external shuffle service. This option will try to keep alive executors that are storing
// shuffle data for active jobs. If an external shuffle service is enabled, set ShuffleTrackingEnabled to false.
// ShuffleTrackingEnabled is true by default if dynamicAllocation.enabled is true.
// +optional
ShuffleTrackingEnabled *bool `json:"shuffleTrackingEnabled,omitempty"`
// ShuffleTrackingTimeout controls the timeout in milliseconds for executors that are holding
// shuffle data if shuffle tracking is enabled (true by default if dynamic allocation is enabled).
// +optional
ShuffleTrackingTimeout *int64 `json:"shuffleTrackingTimeout,omitempty"`
}
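A sketch of a dynamic-allocation block that caps executors and keeps shuffle tracking on with an explicit timeout (the values are illustrative):

```go
package main

import (
	"fmt"

	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

func main() {
	maxExecutors := int32(10)
	tracking := true
	timeoutMs := int64(600000) // 10 minutes, in milliseconds
	da := v1beta2.DynamicAllocation{
		MaxExecutors:           &maxExecutors,
		ShuffleTrackingEnabled: &tracking,
		ShuffleTrackingTimeout: &timeoutMs,
	}
	fmt.Printf("%+v\n", da)
}
```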
// PrometheusMonitoringEnabled returns whether Prometheus monitoring is enabled.
func (s *SparkApplication) PrometheusMonitoringEnabled() bool {
return s.Spec.Monitoring != nil && s.Spec.Monitoring.Prometheus != nil
}
// HasPrometheusConfigFile returns whether Prometheus monitoring uses a configuration file in the container.
func (s *SparkApplication) HasPrometheusConfigFile() bool {
return s.PrometheusMonitoringEnabled() &&
s.Spec.Monitoring.Prometheus.ConfigFile != nil &&
*s.Spec.Monitoring.Prometheus.ConfigFile != ""
}
// HasMetricsProperties returns whether the spec defines metricsProperties for Prometheus monitoring.
func (s *SparkApplication) HasMetricsProperties() bool {
return s.PrometheusMonitoringEnabled() &&
s.Spec.Monitoring.MetricsProperties != nil &&
*s.Spec.Monitoring.MetricsProperties != ""
}
// HasMetricsPropertiesFile returns whether the spec defines metricsPropertiesFile for monitoring.
func (s *SparkApplication) HasMetricsPropertiesFile() bool {
return s.PrometheusMonitoringEnabled() &&
s.Spec.Monitoring.MetricsPropertiesFile != nil &&
*s.Spec.Monitoring.MetricsPropertiesFile != ""
}
// ExposeDriverMetrics returns whether driver metrics should be exposed.
func (s *SparkApplication) ExposeDriverMetrics() bool {
return s.Spec.Monitoring != nil && s.Spec.Monitoring.ExposeDriverMetrics
}
// ExposeExecutorMetrics returns whether executor metrics should be exposed.
func (s *SparkApplication) ExposeExecutorMetrics() bool {
return s.Spec.Monitoring != nil && s.Spec.Monitoring.ExposeExecutorMetrics
}
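These helpers compose naturally; for instance, a hypothetical caller (a sketch, not the operator's actual code) could decide whether an inline metrics.properties needs to be generated and mounted:

```go
package main

import (
	v1beta2 "github.com/kubeflow/spark-operator/api/v1beta2" // assumed import path
)

// needsInlineMetricsConfig is a hypothetical helper: generate and mount a
// metrics.properties file only when Prometheus monitoring is enabled and the
// spec carries inline metrics properties rather than a file path.
func needsInlineMetricsConfig(app *v1beta2.SparkApplication) bool {
	return app.HasMetricsProperties() && !app.HasMetricsPropertiesFile()
}

func main() {
	app := &v1beta2.SparkApplication{}
	_ = needsInlineMetricsConfig(app)
}
```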

api/v1beta2/types.go Normal file
View File

@ -0,0 +1,23 @@
/*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
/*
This file is needed for kubernetes/code-generator/kube_codegen.sh script used in hack/update-codegen.sh.
*/
package v1beta2
//+genclient

View File

@ -1,7 +1,7 @@
//go:build !ignore_autogenerated
/*
Copyright 2024 The Kubeflow authors.
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@ -23,7 +23,7 @@ package v1beta2
import (
"k8s.io/api/core/v1"
networkingv1 "k8s.io/api/networking/v1"
"k8s.io/apimachinery/pkg/runtime"
runtime "k8s.io/apimachinery/pkg/runtime"
)
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
@ -106,6 +106,11 @@ func (in *Dependencies) DeepCopyInto(out *Dependencies) {
*out = make([]string, len(*in))
copy(*out, *in)
}
if in.Archives != nil {
in, out := &in.Archives, &out.Archives
*out = make([]string, len(*in))
copy(*out, *in)
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Dependencies.
@ -239,6 +244,11 @@ func (in *DriverSpec) DeepCopyInto(out *DriverSpec) {
*out = make([]Port, len(*in))
copy(*out, *in)
}
if in.PriorityClassName != nil {
in, out := &in.PriorityClassName, &out.PriorityClassName
*out = new(string)
**out = **in
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new DriverSpec.
@ -269,6 +279,11 @@ func (in *DynamicAllocation) DeepCopyInto(out *DynamicAllocation) {
*out = new(int32)
**out = **in
}
if in.ShuffleTrackingEnabled != nil {
in, out := &in.ShuffleTrackingEnabled, &out.ShuffleTrackingEnabled
*out = new(bool)
**out = **in
}
if in.ShuffleTrackingTimeout != nil {
in, out := &in.ShuffleTrackingTimeout, &out.ShuffleTrackingTimeout
*out = new(int64)
@ -320,6 +335,11 @@ func (in *ExecutorSpec) DeepCopyInto(out *ExecutorSpec) {
*out = make([]Port, len(*in))
copy(*out, *in)
}
if in.PriorityClassName != nil {
in, out := &in.PriorityClassName, &out.PriorityClassName
*out = new(string)
**out = **in
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ExecutorSpec.
@ -861,6 +881,11 @@ func (in *SparkApplicationStatus) DeepCopy() *SparkApplicationStatus {
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *SparkPodSpec) DeepCopyInto(out *SparkPodSpec) {
*out = *in
if in.Template != nil {
in, out := &in.Template, &out.Template
*out = new(v1.PodTemplateSpec)
(*in).DeepCopyInto(*out)
}
if in.Cores != nil {
in, out := &in.Cores, &out.Cores
*out = new(int32)
@ -876,6 +901,11 @@ func (in *SparkPodSpec) DeepCopyInto(out *SparkPodSpec) {
*out = new(string)
**out = **in
}
if in.MemoryLimit != nil {
in, out := &in.MemoryLimit, &out.MemoryLimit
*out = new(string)
**out = **in
}
if in.MemoryOverhead != nil {
in, out := &in.MemoryOverhead, &out.MemoryOverhead
*out = new(string)

View File

@ -3,6 +3,7 @@
# negation (prefixed with !). Only one pattern per line.
ci/
.helmignore
# Common VCS dirs
.git/
@ -21,16 +22,16 @@ ci/
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
# MacOS
.DS_Store
# helm-unittest
tests
.debug
__snapshot__

View File

@ -1,11 +1,39 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
apiVersion: v2
name: spark-operator
description: A Helm chart for Spark on Kubernetes operator.
version: 2.2.1
appVersion: 2.2.1
keywords:
- spark
- apache spark
- big data
home: https://github.com/kubeflow/spark-operator
maintainers:
- name: yuchaoran2011
  email: yuchaoran2011@gmail.com
  url: https://github.com/yuchaoran2011
- name: ChenYi015
email: github@chenyicn.net
url: https://github.com/ChenYi015

View File

@ -1,8 +1,8 @@
# spark-operator
![Version: 2.2.1](https://img.shields.io/badge/Version-2.2.1-informational?style=flat-square) ![AppVersion: 2.2.1](https://img.shields.io/badge/AppVersion-2.2.1-informational?style=flat-square)
A Helm chart for Spark on Kubernetes operator.
**Homepage:** <https://github.com/kubeflow/spark-operator>
@ -41,13 +41,7 @@ See [helm repo](https://helm.sh/docs/helm/helm_repo) for command documentation.
helm install [RELEASE_NAME] spark-operator/spark-operator
```
For example, if you want to create a release with name `spark-operator` in the `spark-operator` namespace:
```shell
helm install spark-operator spark-operator/spark-operator \
@ -55,6 +49,8 @@ helm install spark-operator spark-operator/spark-operator \
--create-namespace
```
Note that by passing the `--create-namespace` flag to the `helm install` command, `helm` will create the release namespace if it does not exist.
See [helm install](https://helm.sh/docs/helm/helm_install) for command documentation.
### Upgrade the chart
@ -79,71 +75,118 @@ See [helm uninstall](https://helm.sh/docs/helm/helm_uninstall) for command docum
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| affinity | object | `{}` | Affinity for pod assignment |
| batchScheduler.enable | bool | `false` | Enable batch scheduler for spark jobs scheduling. If enabled, users can specify batch scheduler name in spark application |
| commonLabels | object | `{}` | Common labels to add to the resources |
| controllerThreads | int | `10` | Operator concurrency, higher values might increase memory usage |
| envFrom | list | `[]` | Pod environment variable sources |
| fullnameOverride | string | `""` | String to override release name |
| image.pullPolicy | string | `"IfNotPresent"` | Image pull policy |
| image.repository | string | `"docker.io/kubeflow/spark-operator"` | Image repository |
| image.tag | string | `""` | if set, override the image tag whose default is the chart appVersion. |
| imagePullSecrets | list | `[]` | Image pull secrets |
| ingressUrlFormat | string | `""` | Ingress URL format. Requires the UI service to be enabled by setting `uiService.enable` to true. |
| istio.enabled | bool | `false` | When using `istio`, spark jobs need to run without a sidecar to properly terminate |
| labelSelectorFilter | string | `""` | A comma-separated list of key=value, or key labels to filter resources during watch and list based on the specified labels. |
| leaderElection.lockName | string | `"spark-operator-lock"` | Leader election lock name. Ref: https://github.com/kubeflow/spark-operator/blob/master/docs/user-guide.md#enabling-leader-election-for-high-availability. |
| leaderElection.lockNamespace | string | `""` | Optionally store the lock in another namespace. Defaults to operator's namespace |
| logLevel | int | `2` | Set higher levels for more verbose logging |
| metrics.enable | bool | `true` | Enable prometheus metric scraping |
| metrics.endpoint | string | `"/metrics"` | Metrics serving endpoint |
| metrics.port | int | `10254` | Metrics port |
| metrics.portName | string | `"metrics"` | Metrics port name |
| metrics.prefix | string | `""` | Metric prefix, will be added to all exported metrics |
| nameOverride | string | `""` | String to partially override `spark-operator.fullname` template (will maintain the release name) |
| nodeSelector | object | `{}` | Node labels for pod assignment |
| podAnnotations | object | `{}` | Additional annotations to add to the pod |
| podDisruptionBudget | object | `{"enable":false,"minAvailable":1}` | podDisruptionBudget to avoid service degradation |
| podDisruptionBudget.enable | bool | `false` | Specifies whether to enable pod disruption budget. Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) |
| podDisruptionBudget.minAvailable | int | `1` | The number of pods that must be available. Require `replicaCount` to be greater than 1 |
| podLabels | object | `{}` | Additional labels to add to the pod |
| podMonitor | object | `{"enable":false,"jobLabel":"spark-operator-podmonitor","labels":{},"podMetricsEndpoint":{"interval":"5s","scheme":"http"}}` | Prometheus pod monitor for operator's pod. |
| podMonitor.enable | bool | `false` | If enabled, a pod monitor for operator's pod will be submitted. Note that prometheus metrics should be enabled as well. |
| podMonitor.jobLabel | string | `"spark-operator-podmonitor"` | The label to use to retrieve the job name from |
| podMonitor.labels | object | `{}` | Pod monitor labels |
| podMonitor.podMetricsEndpoint | object | `{"interval":"5s","scheme":"http"}` | Prometheus metrics endpoint properties. `metrics.portName` will be used as a port |
| podSecurityContext | object | `{}` | Pod security context |
| priorityClassName | string | `""` | A priority class to be used for running spark-operator pod. |
| rbac.annotations | object | `{}` | Optional annotations for rbac |
| rbac.create | bool | `false` | **DEPRECATED** use `createRole` and `createClusterRole` |
| rbac.createClusterRole | bool | `true` | Create and use RBAC `ClusterRole` resources |
| rbac.createRole | bool | `true` | Create and use RBAC `Role` resources |
| replicaCount | int | `1` | Desired number of pods, leaderElection will be enabled if this is greater than 1 |
| resourceQuotaEnforcement.enable | bool | `false` | Whether to enable the ResourceQuota enforcement for SparkApplication resources. Requires the webhook to be enabled by setting `webhook.enable` to true. Ref: https://github.com/kubeflow/spark-operator/blob/master/docs/user-guide.md#enabling-resource-quota-enforcement. |
| resources | object | `{}` | Pod resource requests and limits Note, that each job submission will spawn a JVM within the Spark Operator Pod using "/usr/local/openjdk-11/bin/java -Xmx128m". Kubernetes may kill these Java processes at will to enforce resource limits. When that happens, you will see the following error: 'failed to run spark-submit for SparkApplication [...]: signal: killed' - when this happens, you may want to increase memory limits. |
| resyncInterval | int | `30` | Operator resync interval. Note that the operator will respond to events (e.g. create, update) unrelated to this setting |
| securityContext | object | `{}` | Operator container security context |
| serviceAccounts.spark.annotations | object | `{}` | Optional annotations for the spark service account |
| serviceAccounts.spark.create | bool | `true` | Create a service account for spark apps |
| serviceAccounts.spark.name | string | `""` | Optional name for the spark service account |
| serviceAccounts.sparkoperator.annotations | object | `{}` | Optional annotations for the operator service account |
| serviceAccounts.sparkoperator.create | bool | `true` | Create a service account for the operator |
| serviceAccounts.sparkoperator.name | string | `""` | Optional name for the operator service account |
| sidecars | list | `[]` | Sidecar containers |
| sparkJobNamespaces | list | `[""]` | List of namespaces where to run spark jobs |
| tolerations | list | `[]` | List of node taints to tolerate |
| uiService.enable | bool | `true` | Enable UI service creation for Spark application |
| volumeMounts | list | `[]` | |
| volumes | list | `[]` | |
| webhook.enable | bool | `false` | Enable webhook server |
| webhook.namespaceSelector | string | `""` | The webhook server will only operate on namespaces with this label, specified in the form key1=value1,key2=value2. Empty string (default) will operate on all namespaces |
| webhook.objectSelector | string | `""` | The webhook will only operate on resources with this label/s, specified in the form key1=value1,key2=value2, OR key in (value1,value2). Empty string (default) will operate on all objects |
| webhook.port | int | `8080` | Webhook service port |
| webhook.portName | string | `"webhook"` | Webhook container port name and service target port name |
| webhook.timeout | int | `30` | The annotations applied to init job, required to restore certs deleted by the cleanup job during upgrade |
| nameOverride | string | `""` | String to partially override release name. |
| fullnameOverride | string | `""` | String to fully override release name. |
| commonLabels | object | `{}` | Common labels to add to the resources. |
| image.registry | string | `"ghcr.io"` | Image registry. |
| image.repository | string | `"kubeflow/spark-operator/controller"` | Image repository. |
| image.tag | string | If not set, the chart appVersion will be used. | Image tag. |
| image.pullPolicy | string | `"IfNotPresent"` | Image pull policy. |
| image.pullSecrets | list | `[]` | Image pull secrets for private image registry. |
| controller.replicas | int | `1` | Number of replicas of controller. |
| controller.leaderElection.enable | bool | `true` | Specifies whether to enable leader election for controller. |
| controller.workers | int | `10` | Reconcile concurrency, higher values might increase memory usage. |
| controller.logLevel | string | `"info"` | Configure the verbosity of logging, can be one of `debug`, `info`, `error`. |
| controller.logEncoder | string | `"console"` | Configure the encoder of logging, can be one of `console` or `json`. |
| controller.driverPodCreationGracePeriod | string | `"10s"` | Grace period after a successful spark-submit during which driver-pod-not-found errors will be retried. Useful if the driver pod can take some time to be created. |
| controller.maxTrackedExecutorPerApp | int | `1000` | Specifies the maximum number of Executor pods that can be tracked by the controller per SparkApplication. |
| controller.uiService.enable | bool | `true` | Specifies whether to create service for Spark web UI. |
| controller.uiIngress.enable | bool | `false` | Specifies whether to create ingress for Spark web UI. `controller.uiService.enable` must be `true` to enable ingress. |
| controller.uiIngress.urlFormat | string | `""` | Ingress URL format. Required if `controller.uiIngress.enable` is true. |
| controller.uiIngress.ingressClassName | string | `""` | Optionally set the ingressClassName. |
| controller.uiIngress.tls | list | `[]` | Optionally set default TLS configuration for the Spark UI's ingress. `ingressTLS` in the SparkApplication spec overrides this. |
| controller.uiIngress.annotations | object | `{}` | Optionally set default ingress annotations for the Spark UI's ingress. `ingressAnnotations` in the SparkApplication spec overrides this. |
| controller.batchScheduler.enable | bool | `false` | Specifies whether to enable batch scheduler for spark jobs scheduling. If enabled, users can specify batch scheduler name in spark application. |
| controller.batchScheduler.kubeSchedulerNames | list | `[]` | Specifies a list of kube-scheduler names for scheduling Spark pods. |
| controller.batchScheduler.default | string | `""` | Default batch scheduler to be used if not specified by the user. If specified, this value must be either "volcano" or "yunikorn". Specifying any other value will cause the controller to error on startup. |
| controller.serviceAccount.create | bool | `true` | Specifies whether to create a service account for the controller. |
| controller.serviceAccount.name | string | `""` | Optional name for the controller service account. |
| controller.serviceAccount.annotations | object | `{}` | Extra annotations for the controller service account. |
| controller.serviceAccount.automountServiceAccountToken | bool | `true` | Auto-mount service account token to the controller pods. |
| controller.rbac.create | bool | `true` | Specifies whether to create RBAC resources for the controller. |
| controller.rbac.annotations | object | `{}` | Extra annotations for the controller RBAC resources. |
| controller.labels | object | `{}` | Extra labels for controller pods. |
| controller.annotations | object | `{}` | Extra annotations for controller pods. |
| controller.volumes | list | `[{"emptyDir":{"sizeLimit":"1Gi"},"name":"tmp"}]` | Volumes for controller pods. |
| controller.nodeSelector | object | `{}` | Node selector for controller pods. |
| controller.affinity | object | `{}` | Affinity for controller pods. |
| controller.tolerations | list | `[]` | List of node taints to tolerate for controller pods. |
| controller.priorityClassName | string | `""` | Priority class for controller pods. |
| controller.podSecurityContext | object | `{"fsGroup":185}` | Security context for controller pods. |
| controller.topologySpreadConstraints | list | `[]` | Topology spread constraints rely on node labels to identify the topology domain(s) that each Node is in. Ref: [Pod Topology Spread Constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/). The labelSelector field in topology spread constraint will be set to the selector labels for controller pods if not specified. |
| controller.env | list | `[]` | Environment variables for controller containers. |
| controller.envFrom | list | `[]` | Environment variable sources for controller containers. |
| controller.volumeMounts | list | `[{"mountPath":"/tmp","name":"tmp","readOnly":false}]` | Volume mounts for controller containers. |
| controller.resources | object | `{}` | Pod resource requests and limits for controller containers. Note that each job submission will spawn a JVM within the controller pods using "/usr/local/openjdk-11/bin/java -Xmx128m". Kubernetes may kill these Java processes at will to enforce resource limits. When that happens, you will see the following error: 'failed to run spark-submit for SparkApplication [...]: signal: killed' - when this happens, you may want to increase memory limits. |
| controller.securityContext | object | `{"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"privileged":false,"readOnlyRootFilesystem":true,"runAsNonRoot":true,"seccompProfile":{"type":"RuntimeDefault"}}` | Security context for controller containers. |
| controller.sidecars | list | `[]` | Sidecar containers for controller pods. |
| controller.podDisruptionBudget.enable | bool | `false` | Specifies whether to create pod disruption budget for controller. Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) |
| controller.podDisruptionBudget.minAvailable | int | `1` | The number of pods that must be available. Require `controller.replicas` to be greater than 1 |
| controller.pprof.enable | bool | `false` | Specifies whether to enable pprof. |
| controller.pprof.port | int | `6060` | Specifies pprof port. |
| controller.pprof.portName | string | `"pprof"` | Specifies pprof service port name. |
| controller.workqueueRateLimiter.bucketQPS | int | `50` | Specifies the average rate of items processed by the workqueue rate limiter. |
| controller.workqueueRateLimiter.bucketSize | int | `500` | Specifies the maximum number of items that can be in the workqueue at any given time. |
| controller.workqueueRateLimiter.maxDelay.enable | bool | `true` | Specifies whether to enable max delay for the workqueue rate limiter. This is useful to avoid losing events when the workqueue is full. |
| controller.workqueueRateLimiter.maxDelay.duration | string | `"6h"` | Specifies the maximum delay duration for the workqueue rate limiter. |
| webhook.enable | bool | `true` | Specifies whether to enable webhook. |
| webhook.replicas | int | `1` | Number of replicas of webhook server. |
| webhook.leaderElection.enable | bool | `true` | Specifies whether to enable leader election for webhook. |
| webhook.logLevel | string | `"info"` | Configure the verbosity of logging, can be one of `debug`, `info`, `error`. |
| webhook.logEncoder | string | `"console"` | Configure the encoder of logging, can be one of `console` or `json`. |
| webhook.port | int | `9443` | Specifies webhook port. |
| webhook.portName | string | `"webhook"` | Specifies webhook service port name. |
| webhook.failurePolicy | string | `"Fail"` | Specifies how unrecognized errors are handled. Available options are `Ignore` or `Fail`. |
| webhook.timeoutSeconds | int | `10` | Specifies the timeout seconds of the webhook, the value must be between 1 and 30. |
| webhook.resourceQuotaEnforcement.enable | bool | `false` | Specifies whether to enable the ResourceQuota enforcement for SparkApplication resources. |
| webhook.serviceAccount.create | bool | `true` | Specifies whether to create a service account for the webhook. |
| webhook.serviceAccount.name | string | `""` | Optional name for the webhook service account. |
| webhook.serviceAccount.annotations | object | `{}` | Extra annotations for the webhook service account. |
| webhook.serviceAccount.automountServiceAccountToken | bool | `true` | Auto-mount service account token to the webhook pods. |
| webhook.rbac.create | bool | `true` | Specifies whether to create RBAC resources for the webhook. |
| webhook.rbac.annotations | object | `{}` | Extra annotations for the webhook RBAC resources. |
| webhook.labels | object | `{}` | Extra labels for webhook pods. |
| webhook.annotations | object | `{}` | Extra annotations for webhook pods. |
| webhook.sidecars | list | `[]` | Sidecar containers for webhook pods. |
| webhook.volumes | list | `[{"emptyDir":{"sizeLimit":"500Mi"},"name":"serving-certs"}]` | Volumes for webhook pods. |
| webhook.nodeSelector | object | `{}` | Node selector for webhook pods. |
| webhook.affinity | object | `{}` | Affinity for webhook pods. |
| webhook.tolerations | list | `[]` | List of node taints to tolerate for webhook pods. |
| webhook.priorityClassName | string | `""` | Priority class for webhook pods. |
| webhook.podSecurityContext | object | `{"fsGroup":185}` | Security context for webhook pods. |
| webhook.topologySpreadConstraints | list | `[]` | Topology spread constraints rely on node labels to identify the topology domain(s) that each Node is in. Ref: [Pod Topology Spread Constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/). The labelSelector field in topology spread constraint will be set to the selector labels for webhook pods if not specified. |
| webhook.env | list | `[]` | Environment variables for webhook containers. |
| webhook.envFrom | list | `[]` | Environment variable sources for webhook containers. |
| webhook.volumeMounts | list | `[{"mountPath":"/etc/k8s-webhook-server/serving-certs","name":"serving-certs","readOnly":false,"subPath":"serving-certs"}]` | Volume mounts for webhook containers. |
| webhook.resources | object | `{}` | Pod resource requests and limits for webhook pods. |
| webhook.securityContext | object | `{"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"privileged":false,"readOnlyRootFilesystem":true,"runAsNonRoot":true,"seccompProfile":{"type":"RuntimeDefault"}}` | Security context for webhook containers. |
| webhook.podDisruptionBudget.enable | bool | `false` | Specifies whether to create pod disruption budget for webhook. Ref: [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) |
| webhook.podDisruptionBudget.minAvailable | int | `1` | The number of pods that must be available. Require `webhook.replicas` to be greater than 1 |
| spark.jobNamespaces | list | `["default"]` | List of namespaces where to run spark jobs. If an empty string is included, all namespaces will be allowed. Make sure the namespaces already exist. |
| spark.serviceAccount.create | bool | `true` | Specifies whether to create a service account for spark applications. |
| spark.serviceAccount.name | string | `""` | Optional name for the spark service account. |
| spark.serviceAccount.annotations | object | `{}` | Optional annotations for the spark service account. |
| spark.serviceAccount.automountServiceAccountToken | bool | `true` | Auto-mount service account token to the spark applications pods. |
| spark.rbac.create | bool | `true` | Specifies whether to create RBAC resources for spark applications. |
| spark.rbac.annotations | object | `{}` | Optional annotations for the spark application RBAC resources. |
| prometheus.metrics.enable | bool | `true` | Specifies whether to enable prometheus metrics scraping. |
| prometheus.metrics.port | int | `8080` | Metrics port. |
| prometheus.metrics.portName | string | `"metrics"` | Metrics port name. |
| prometheus.metrics.endpoint | string | `"/metrics"` | Metrics serving endpoint. |
| prometheus.metrics.prefix | string | `""` | Metrics prefix, will be added to all exported metrics. |
| prometheus.metrics.jobStartLatencyBuckets | string | `"30,60,90,120,150,180,210,240,270,300"` | Job Start Latency histogram buckets. Specified in seconds. |
| prometheus.podMonitor.create | bool | `false` | Specifies whether to create pod monitor. Note that prometheus metrics should be enabled as well. |
| prometheus.podMonitor.labels | object | `{}` | Pod monitor labels |
| prometheus.podMonitor.jobLabel | string | `"spark-operator-podmonitor"` | The label to use to retrieve the job name from |
| prometheus.podMonitor.podMetricsEndpoint | object | `{"interval":"5s","scheme":"http"}` | Prometheus metrics endpoint properties. `metrics.portName` will be used as a port |
| certManager.enable | bool | `false` | Specifies whether to use [cert-manager](https://cert-manager.io) to generate certificate for webhook. `webhook.enable` must be set to `true` to enable cert-manager. |
| certManager.issuerRef | object | A self-signed issuer will be created and used if not specified. | The reference to the issuer. |
| certManager.duration | string | `2160h` (90 days) will be used if not specified. | The duration of the certificate validity (e.g. `2160h`). See [cert-manager.io/v1.Certificate](https://cert-manager.io/docs/reference/api-docs/#cert-manager.io/v1.Certificate). |
| certManager.renewBefore | string | 1/3 of issued certificates lifetime. | The duration before the certificate expiration to renew the certificate (e.g. `720h`). See [cert-manager.io/v1.Certificate](https://cert-manager.io/docs/reference/api-docs/#cert-manager.io/v1.Certificate). |
## Maintainers
| Name | Email | Url |
| ---- | ------ | --- |
| yuchaoran2011 | <yuchaoran2011@gmail.com> | <https://github.com/yuchaoran2011> |
| ChenYi015 | <github@chenyicn.net> | <https://github.com/ChenYi015> |

View File

@ -43,13 +43,7 @@ See [helm repo](https://helm.sh/docs/helm/helm_repo) for command documentation.
helm install [RELEASE_NAME] spark-operator/spark-operator
```
For example, if you want to create a release with name `spark-operator` in the `spark-operator` namespace:
```shell
helm install spark-operator spark-operator/spark-operator \
@ -57,6 +51,8 @@ helm install spark-operator spark-operator/spark-operator \
--create-namespace
```
Note that by passing the `--create-namespace` flag to the `helm install` command, `helm` will create the release namespace if it does not exist.
See [helm install](https://helm.sh/docs/helm/helm_install) for command documentation.
### Upgrade the chart

View File

@ -1,2 +1,2 @@
image:
tag: "local"
tag: local

View File

@ -0,0 +1,5 @@
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker

View File

@ -0,0 +1,272 @@
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
api-approved.kubernetes.io: https://github.com/kubeflow/spark-operator/pull/1298
controller-gen.kubebuilder.io/version: v0.17.1
name: sparkconnects.sparkoperator.k8s.io
spec:
group: sparkoperator.k8s.io
names:
kind: SparkConnect
listKind: SparkConnectList
plural: sparkconnects
shortNames:
- sparkconn
singular: sparkconnect
scope: Namespaced
versions:
- additionalPrinterColumns:
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
name: v1alpha1
schema:
openAPIV3Schema:
description: SparkConnect is the Schema for the sparkconnects API.
properties:
apiVersion:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
spec:
description: SparkConnectSpec defines the desired state of SparkConnect.
properties:
dynamicAllocation:
description: |-
DynamicAllocation configures dynamic allocation that becomes available for the Kubernetes
scheduler backend since Spark 3.0.
properties:
enabled:
description: Enabled controls whether dynamic allocation is enabled
or not.
type: boolean
initialExecutors:
description: |-
InitialExecutors is the initial number of executors to request. If .spec.executor.instances
is also set, the initial number of executors is set to the bigger of that and this option.
format: int32
type: integer
maxExecutors:
description: MaxExecutors is the upper bound for the number of
executors if dynamic allocation is enabled.
format: int32
type: integer
minExecutors:
description: MinExecutors is the lower bound for the number of
executors if dynamic allocation is enabled.
format: int32
type: integer
shuffleTrackingEnabled:
description: |-
ShuffleTrackingEnabled enables shuffle file tracking for executors, which allows dynamic allocation without
the need for an external shuffle service. This option will try to keep alive executors that are storing
shuffle data for active jobs. If external shuffle service is enabled, set ShuffleTrackingEnabled to false.
ShuffleTrackingEnabled is true by default if dynamicAllocation.enabled is true.
type: boolean
shuffleTrackingTimeout:
description: |-
ShuffleTrackingTimeout controls the timeout in milliseconds for executors that are holding
shuffle data if shuffle tracking is enabled (true by default if dynamic allocation is enabled).
format: int64
type: integer
type: object
executor:
description: Executor is the Spark executor specification.
properties:
cores:
description: Cores maps to `spark.driver.cores` or `spark.executor.cores`
for the driver and executors, respectively.
format: int32
minimum: 1
type: integer
instances:
description: Instances is the number of executor instances.
format: int32
minimum: 0
type: integer
memory:
description: Memory is the amount of memory to request for the
pod.
type: string
template:
description: |-
Template is a pod template that can be used to define the driver or executor pod configurations that Spark configurations do not support.
Spark version >= 3.0.0 is required.
Ref: https://spark.apache.org/docs/latest/running-on-kubernetes.html#pod-template.
type: object
x-kubernetes-preserve-unknown-fields: true
type: object
hadoopConf:
additionalProperties:
type: string
description: |-
HadoopConf carries user-specified Hadoop configuration properties as they would use the "--conf" option
in spark-submit. The SparkApplication controller automatically adds prefix "spark.hadoop." to Hadoop
configuration properties.
type: object
image:
description: |-
Image is the container image for the driver, executor, and init-container. Any custom container images for the
driver, executor, or init-container takes precedence over this.
type: string
server:
description: Server is the Spark connect server specification.
properties:
cores:
description: Cores maps to `spark.driver.cores` or `spark.executor.cores`
for the driver and executors, respectively.
format: int32
minimum: 1
type: integer
memory:
description: Memory is the amount of memory to request for the
pod.
type: string
template:
description: |-
Template is a pod template that can be used to define the driver or executor pod configurations that Spark configurations do not support.
Spark version >= 3.0.0 is required.
Ref: https://spark.apache.org/docs/latest/running-on-kubernetes.html#pod-template.
type: object
x-kubernetes-preserve-unknown-fields: true
type: object
sparkConf:
additionalProperties:
type: string
description: |-
SparkConf carries user-specified Spark configuration properties as they would use the "--conf" option in
spark-submit.
type: object
sparkVersion:
description: SparkVersion is the version of Spark that the Spark Connect server uses.
type: string
required:
- executor
- server
- sparkVersion
type: object
status:
description: SparkConnectStatus defines the observed state of SparkConnect.
properties:
conditions:
description: Represents the latest available observations of a SparkConnect's
current state.
items:
description: Condition contains details for one aspect of the current
state of this API Resource.
properties:
lastTransitionTime:
description: |-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
format: date-time
type: string
message:
description: |-
message is a human readable message indicating details about the transition.
This may be an empty string.
maxLength: 32768
type: string
observedGeneration:
description: |-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
format: int64
minimum: 0
type: integer
reason:
description: |-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
type: string
status:
description: status of the condition, one of True, False, Unknown.
enum:
- "True"
- "False"
- Unknown
type: string
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
required:
- lastTransitionTime
- message
- reason
- status
- type
type: object
type: array
x-kubernetes-list-map-keys:
- type
x-kubernetes-list-type: map
executors:
additionalProperties:
type: integer
description: Executors represents the current state of the SparkConnect
executors.
type: object
lastUpdateTime:
description: LastUpdateTime is the time at which the SparkConnect
controller last updated the SparkConnect.
format: date-time
type: string
server:
description: Server represents the current state of the SparkConnect
server.
properties:
podIp:
description: PodIP is the IP address of the pod that is running
the Spark Connect server.
type: string
podName:
description: PodName is the name of the pod that is running the
Spark Connect server.
type: string
serviceName:
description: ServiceName is the name of the service that is exposing
the Spark Connect server.
type: string
type: object
startTime:
description: StartTime is the time at which the SparkConnect controller
started processing the SparkConnect.
format: date-time
type: string
state:
description: State represents the current state of the SparkConnect.
type: string
type: object
required:
- metadata
- spec
type: object
served: true
storage: true
subresources:
status: {}
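For orientation, here is a minimal SparkConnect manifest exercising the required fields defined above (sparkVersion, server, executor); the metadata, image, and sizing values are illustrative assumptions, not values taken from this diff:

apiVersion: sparkoperator.k8s.io/v1alpha1
kind: SparkConnect
metadata:
  name: spark-connect-example    # hypothetical name
  namespace: default             # hypothetical namespace
spec:
  sparkVersion: "4.0.0"
  image: spark:4.0.0             # assumed; per-component images would take precedence
  sparkConf:
    spark.ui.enabled: "true"     # forwarded as if passed via --conf
  server:
    cores: 1
    memory: 512m
  executor:
    instances: 2
    cores: 1
    memory: 512m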


@ -1,3 +1,19 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/* vim: set filetype=mustache: */}}
{{/*
Expand the name of the chart.
@ -37,13 +53,13 @@ Common labels
{{- define "spark-operator.labels" -}}
helm.sh/chart: {{ include "spark-operator.chart" . }}
{{ include "spark-operator.selectorLabels" . }}
-{{- if .Values.commonLabels }}
-{{ toYaml .Values.commonLabels }}
-{{- end }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
+{{- with .Values.commonLabels }}
+{{ toYaml . }}
+{{- end }}
{{- end }}
{{/*
@ -55,25 +71,8 @@ app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
-Create the name of the service account to be used by the operator
+Spark Operator image
*/}}
-{{- define "spark-operator.serviceAccountName" -}}
-{{- if .Values.serviceAccounts.sparkoperator.create -}}
-{{ default (include "spark-operator.fullname" .) .Values.serviceAccounts.sparkoperator.name }}
-{{- else -}}
-{{ default "default" .Values.serviceAccounts.sparkoperator.name }}
+{{- define "spark-operator.image" -}}
+{{ printf "%s/%s:%s" .Values.image.registry .Values.image.repository (.Values.image.tag | default .Chart.AppVersion | toString) }}
{{- end -}}
-{{- end -}}
-{{/*
-Create the name of the service account to be used by spark apps
-*/}}
-{{- define "spark.serviceAccountName" -}}
-{{- if .Values.serviceAccounts.spark.create -}}
-{{- $sparkServiceaccount := printf "%s-%s" .Release.Name "spark" -}}
-{{ default $sparkServiceaccount .Values.serviceAccounts.spark.name }}
-{{- else -}}
-{{ default "default" .Values.serviceAccounts.spark.name }}
-{{- end -}}
-{{- end -}}
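As a sanity check on the new spark-operator.image helper: with the illustrative values below (all assumed, not chart defaults), it renders a single image reference, falling back to the chart's appVersion when no tag is set.

# values.yaml (illustrative)
image:
  registry: docker.io
  repository: kubeflow/spark-operator
  tag: ""        # empty, so .Chart.AppVersion is used

# assuming appVersion 2.1.0, the helper renders:
#   docker.io/kubeflow/spark-operator:2.1.0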


@ -0,0 +1,29 @@
{{- /*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/ -}}
{{/*
Create the name of the webhook certificate issuer.
*/}}
{{- define "spark-operator.certManager.issuer.name" -}}
{{ include "spark-operator.name" . }}-self-signed-issuer
{{- end -}}
{{/*
Create the name of the certificate to be used by webhook.
*/}}
{{- define "spark-operator.certManager.certificate.name" -}}
{{ include "spark-operator.name" . }}-certificate
{{- end -}}


@ -0,0 +1,56 @@
{{- /*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/ -}}
{{- if .Values.webhook.enable }}
{{- if .Values.certManager.enable }}
{{- if not (.Capabilities.APIVersions.Has "cert-manager.io/v1/Certificate") }}
{{- fail "The cluster does not support the required API version `cert-manager.io/v1` for `Certificate`." }}
{{- end }}
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: {{ include "spark-operator.certManager.certificate.name" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
secretName: {{ include "spark-operator.webhook.secretName" . }}
issuerRef:
{{- if not .Values.certManager.issuerRef }}
group: cert-manager.io
kind: Issuer
name: {{ include "spark-operator.certManager.issuer.name" . }}
{{- else }}
{{- toYaml .Values.certManager.issuerRef | nindent 4 }}
{{- end }}
commonName: {{ include "spark-operator.webhook.serviceName" . }}.{{ .Release.Namespace }}.svc
dnsNames:
- {{ include "spark-operator.webhook.serviceName" . }}.{{ .Release.Namespace }}.svc
- {{ include "spark-operator.webhook.serviceName" . }}.{{ .Release.Namespace }}.svc.cluster.local
subject:
organizationalUnits:
- spark-operator
usages:
- server auth
- client auth
{{- with .Values.certManager.duration }}
duration: {{ . }}
{{- end }}
{{- with .Values.certManager.renewBefore }}
renewBefore: {{ . }}
{{- end }}
{{- end }}
{{- end }}
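The issuerRef branch above lets a pre-existing issuer replace the chart's self-signed one. A hedged values sketch (the ClusterIssuer name is hypothetical):

certManager:
  enable: true
  issuerRef:
    group: cert-manager.io
    kind: ClusterIssuer
    name: company-ca      # hypothetical pre-existing issuer
  duration: 2160h         # optional; 90 days
  renewBefore: 360h       # optional; 15 days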


@ -0,0 +1,34 @@
{{- /*
Copyright 2025 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/ -}}
{{- if .Values.webhook.enable }}
{{- if .Values.certManager.enable }}
{{- if not .Values.certManager.issuerRef }}
{{- if not (.Capabilities.APIVersions.Has "cert-manager.io/v1/Issuer") }}
{{- fail "The cluster does not support the required API version `cert-manager.io/v1` for `Issuer`." }}
{{- end }}
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
name: {{ include "spark-operator.certManager.issuer.name" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
selfSigned: {}
{{- end }}
{{- end }}
{{- end }}


@ -0,0 +1,217 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/*
Create the name of controller component
*/}}
{{- define "spark-operator.controller.name" -}}
{{- include "spark-operator.fullname" . }}-controller
{{- end -}}
{{/*
Common labels for the controller
*/}}
{{- define "spark-operator.controller.labels" -}}
{{ include "spark-operator.labels" . }}
app.kubernetes.io/component: controller
{{- end -}}
{{/*
Selector labels for the controller
*/}}
{{- define "spark-operator.controller.selectorLabels" -}}
{{ include "spark-operator.selectorLabels" . }}
app.kubernetes.io/component: controller
{{- end -}}
{{/*
Create the name of the service account to be used by the controller
*/}}
{{- define "spark-operator.controller.serviceAccountName" -}}
{{- if .Values.controller.serviceAccount.create -}}
{{ .Values.controller.serviceAccount.name | default (include "spark-operator.controller.name" .) }}
{{- else -}}
{{ .Values.controller.serviceAccount.name | default "default" }}
{{- end -}}
{{- end -}}
{{/*
Create the name of the cluster role to be used by the controller
*/}}
{{- define "spark-operator.controller.clusterRoleName" -}}
{{ include "spark-operator.controller.name" . }}
{{- end }}
{{/*
Create the name of the cluster role binding to be used by the controller
*/}}
{{- define "spark-operator.controller.clusterRoleBindingName" -}}
{{ include "spark-operator.controller.clusterRoleName" . }}
{{- end }}
{{/*
Create the name of the role to be used by the controller
*/}}
{{- define "spark-operator.controller.roleName" -}}
{{ include "spark-operator.controller.name" . }}
{{- end }}
{{/*
Create the name of the role binding to be used by the controller
*/}}
{{- define "spark-operator.controller.roleBindingName" -}}
{{ include "spark-operator.controller.roleName" . }}
{{- end }}
{{/*
Create the name of the deployment to be used by controller
*/}}
{{- define "spark-operator.controller.deploymentName" -}}
{{ include "spark-operator.controller.name" . }}
{{- end -}}
{{/*
Create the name of the lease resource to be used by leader election
*/}}
{{- define "spark-operator.controller.leaderElectionName" -}}
{{ include "spark-operator.controller.name" . }}-lock
{{- end -}}
{{/*
Create the name of the pod disruption budget to be used by controller
*/}}
{{- define "spark-operator.controller.podDisruptionBudgetName" -}}
{{ include "spark-operator.controller.name" . }}-pdb
{{- end -}}
{{/*
Create the name of the service used by controller
*/}}
{{- define "spark-operator.controller.serviceName" -}}
{{ include "spark-operator.controller.name" . }}-svc
{{- end -}}
{{/*
Create the role policy rules for the controller in every Spark job namespace
*/}}
{{- define "spark-operator.controller.policyRules" -}}
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- deletecollection
- apiGroups:
- ""
resources:
- configmaps
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- ""
resources:
- persistentvolumeclaims
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- ""
resources:
- services
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- ""
resources:
- events
verbs:
- create
- update
- patch
- apiGroups:
- extensions
- networking.k8s.io
resources:
- ingresses
verbs:
- get
- list
- watch
- create
- update
- delete
- apiGroups:
- sparkoperator.k8s.io
resources:
- sparkapplications
- scheduledsparkapplications
- sparkconnects
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- apiGroups:
- sparkoperator.k8s.io
resources:
- sparkapplications/status
- sparkapplications/finalizers
- scheduledsparkapplications/status
- scheduledsparkapplications/finalizers
- sparkconnects/status
verbs:
- get
- update
- patch
{{- if .Values.controller.batchScheduler.enable }}
{{/* required for the `volcano` batch scheduler */}}
- apiGroups:
- scheduling.incubator.k8s.io
- scheduling.sigs.dev
- scheduling.volcano.sh
resources:
- podgroups
verbs:
- "*"
{{- end }}
{{- end -}}
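A minimal values sketch that activates the podgroups rules above; the scheduler choice is an assumption for illustration:

controller:
  batchScheduler:
    enable: true           # adds the scheduling.* podgroups rules
    kubeSchedulerNames: []
    default: volcano       # assumed default batch scheduler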


@ -0,0 +1,206 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "spark-operator.controller.deploymentName" . }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.controller.replicas }}
selector:
matchLabels:
{{- include "spark-operator.controller.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "spark-operator.controller.selectorLabels" . | nindent 8 }}
{{- with .Values.controller.labels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if or .Values.controller.annotations .Values.prometheus.metrics.enable }}
annotations:
{{- if .Values.prometheus.metrics.enable }}
prometheus.io/scrape: "true"
prometheus.io/port: {{ .Values.prometheus.metrics.port | quote }}
prometheus.io/path: {{ .Values.prometheus.metrics.endpoint }}
{{- end }}
{{- with .Values.controller.annotations }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
spec:
containers:
- name: spark-operator-controller
image: {{ include "spark-operator.image" . }}
{{- with .Values.image.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
args:
- controller
- start
{{- with .Values.controller.logLevel }}
- --zap-log-level={{ . }}
{{- end }}
{{- with .Values.controller.logEncoder }}
- --zap-encoder={{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if has "" . }}
- --namespaces=""
{{- else }}
- --namespaces={{ . | join "," }}
{{- end }}
{{- end }}
- --controller-threads={{ .Values.controller.workers }}
- --enable-ui-service={{ .Values.controller.uiService.enable }}
{{- if .Values.controller.uiIngress.enable }}
{{- with .Values.controller.uiIngress.urlFormat }}
- --ingress-url-format={{ . }}
{{- end }}
{{- with .Values.controller.uiIngress.ingressClassName }}
- --ingress-class-name={{ . }}
{{- end }}
{{- with .Values.controller.uiIngress.tls }}
- --ingress-tls={{ . | toJson }}
{{- end }}
{{- with .Values.controller.uiIngress.annotations }}
- --ingress-annotations={{ . | toJson }}
{{- end }}
{{- end }}
{{- if .Values.controller.batchScheduler.enable }}
- --enable-batch-scheduler=true
{{- with .Values.controller.batchScheduler.kubeSchedulerNames }}
- --kube-scheduler-names={{ . | join "," }}
{{- end }}
{{- with .Values.controller.batchScheduler.default }}
- --default-batch-scheduler={{ . }}
{{- end }}
{{- end }}
{{- if .Values.prometheus.metrics.enable }}
- --enable-metrics=true
- --metrics-bind-address=:{{ .Values.prometheus.metrics.port }}
- --metrics-endpoint={{ .Values.prometheus.metrics.endpoint }}
- --metrics-prefix={{ .Values.prometheus.metrics.prefix }}
- --metrics-labels=app_type
- --metrics-job-start-latency-buckets={{ .Values.prometheus.metrics.jobStartLatencyBuckets }}
{{- end }}
{{ if .Values.controller.leaderElection.enable }}
- --leader-election=true
- --leader-election-lock-name={{ include "spark-operator.controller.leaderElectionName" . }}
- --leader-election-lock-namespace={{ .Release.Namespace }}
{{- else -}}
- --leader-election=false
{{- end }}
{{- if .Values.controller.pprof.enable }}
- --pprof-bind-address=:{{ .Values.controller.pprof.port }}
{{- end }}
- --workqueue-ratelimiter-bucket-qps={{ .Values.controller.workqueueRateLimiter.bucketQPS }}
- --workqueue-ratelimiter-bucket-size={{ .Values.controller.workqueueRateLimiter.bucketSize }}
{{- if .Values.controller.workqueueRateLimiter.maxDelay.enable }}
- --workqueue-ratelimiter-max-delay={{ .Values.controller.workqueueRateLimiter.maxDelay.duration }}
{{- end }}
{{- if .Values.controller.driverPodCreationGracePeriod }}
- --driver-pod-creation-grace-period={{ .Values.controller.driverPodCreationGracePeriod }}
{{- end }}
{{- if .Values.controller.maxTrackedExecutorPerApp }}
- --max-tracked-executor-per-app={{ .Values.controller.maxTrackedExecutorPerApp }}
{{- end }}
{{- if or .Values.prometheus.metrics.enable .Values.controller.pprof.enable }}
ports:
{{- if .Values.controller.pprof.enable }}
- name: {{ .Values.controller.pprof.portName | quote }}
containerPort: {{ .Values.controller.pprof.port }}
{{- end }}
{{- if .Values.prometheus.metrics.enable }}
- name: {{ .Values.prometheus.metrics.portName | quote }}
containerPort: {{ .Values.prometheus.metrics.port }}
{{- end }}
{{- end }}
{{- with .Values.controller.env }}
env:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.envFrom }}
envFrom:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.volumeMounts }}
volumeMounts:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.resources }}
resources:
{{- toYaml . | nindent 10 }}
{{- end }}
livenessProbe:
httpGet:
port: 8081
scheme: HTTP
path: /healthz
readinessProbe:
httpGet:
port: 8081
scheme: HTTP
path: /readyz
{{- with .Values.controller.securityContext }}
securityContext:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.controller.sidecars }}
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.image.pullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.controller.volumes }}
volumes:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.controller.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.controller.tolerations }}
tolerations:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.controller.priorityClassName }}
priorityClassName: {{ . }}
{{- end }}
serviceAccountName: {{ include "spark-operator.controller.serviceAccountName" . }}
automountServiceAccountToken: {{ .Values.controller.serviceAccount.automountServiceAccountToken }}
{{- with .Values.controller.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.controller.topologySpreadConstraints }}
{{- if le (int .Values.controller.replicas) 1 }}
{{- fail "controller.replicas must be greater than 1 to enable topology spread constraints for controller pods"}}
{{- end }}
{{- $selectorLabels := include "spark-operator.controller.selectorLabels" . | fromYaml }}
{{- $labelSelectorDict := dict "labelSelector" ( dict "matchLabels" $selectorLabels ) }}
topologySpreadConstraints:
{{- range .Values.controller.topologySpreadConstraints }}
- {{ mergeOverwrite . $labelSelectorDict | toYaml | nindent 8 | trim }}
{{- end }}
{{- end }}
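A hedged values sketch for the controller deployment above, touching the knobs this template wires into flags; all concrete numbers are illustrative:

controller:
  replicas: 2              # must be > 1 if topologySpreadConstraints are set
  workers: 10              # --controller-threads
  logLevel: info           # --zap-log-level
  logEncoder: json         # --zap-encoder
  leaderElection:
    enable: true
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway

Note that the template injects the controller's selector labels into each constraint via mergeOverwrite, so no labelSelector needs to be specified in values.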


@ -0,0 +1,34 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.controller.podDisruptionBudget.enable }}
{{- if le (int .Values.controller.replicas) 1 }}
{{- fail "controller.replicas must be greater than 1 to enable pod disruption budget for controller" }}
{{- end -}}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.controller.podDisruptionBudgetName" . }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.controller.selectorLabels" . | nindent 6 }}
{{- with .Values.controller.podDisruptionBudget.minAvailable }}
minAvailable: {{ . }}
{{- end }}
{{- end }}


@ -0,0 +1,164 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.controller.rbac.create -}}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "spark-operator.controller.clusterRoleName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- get
{{- if not .Values.spark.jobNamespaces | or (has "" .Values.spark.jobNamespaces) }}
{{ include "spark-operator.controller.policyRules" . }}
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "spark-operator.controller.clusterRoleBindingName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "spark-operator.controller.clusterRoleName" . }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.controller.roleName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
{{- if .Values.controller.leaderElection.enable }}
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- create
- apiGroups:
- coordination.k8s.io
resources:
- leases
resourceNames:
- {{ include "spark-operator.controller.leaderElectionName" . }}
verbs:
- get
- update
{{- end }}
{{- if has .Release.Namespace .Values.spark.jobNamespaces }}
{{ include "spark-operator.controller.policyRules" . }}
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.controller.roleBindingName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.controller.roleName" . }}
{{- if and .Values.spark.jobNamespaces (not (has "" .Values.spark.jobNamespaces)) }}
{{- range $jobNamespace := .Values.spark.jobNamespaces }}
{{- if ne $jobNamespace $.Release.Namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.controller.roleName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.controller.labels" $ | nindent 4 }}
{{- with $.Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
{{ include "spark-operator.controller.policyRules" $ }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.controller.roleBindingName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.controller.labels" $ | nindent 4 }}
{{- with $.Values.controller.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.controller.serviceAccountName" $ }}
namespace: {{ $.Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.controller.roleName" $ }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
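The guards above split RBAC by scope: if spark.jobNamespaces is empty or contains "", the policy rules land in the ClusterRole; otherwise per-namespace Roles and RoleBindings are stamped out. Two illustrative values sketches:

# cluster-wide: watch all namespaces
spark:
  jobNamespaces:
  - ""

# namespaced: one Role/RoleBinding per listed namespace
spark:
  jobNamespaces:
  - default
  - spark-jobs    # hypothetical namespace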


@ -0,0 +1,31 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.controller.pprof.enable }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "spark-operator.controller.serviceName" . }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
spec:
selector:
{{- include "spark-operator.controller.selectorLabels" . | nindent 4 }}
ports:
- port: {{ .Values.controller.pprof.port }}
targetPort: {{ .Values.controller.pprof.portName | quote }}
name: {{ .Values.controller.pprof.portName }}
{{- end }}


@ -0,0 +1,30 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.controller.serviceAccount.create }}
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: {{ .Values.controller.serviceAccount.automountServiceAccountToken }}
metadata:
name: {{ include "spark-operator.controller.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.controller.labels" . | nindent 4 }}
{{- with .Values.controller.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}


@ -1,140 +0,0 @@
# If the admission webhook is enabled, then a post-install step is required
# to generate and install the secret in the operator namespace.
# In the post-install hook, the token corresponding to the operator service account
# is used to authenticate with the Kubernetes API server to install the secret bundle.
{{- $jobNamespaces := .Values.sparkJobNamespaces | default list }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "spark-operator.fullname" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
strategy:
type: Recreate
template:
metadata:
{{- if or .Values.podAnnotations .Values.metrics.enable }}
annotations:
{{- if .Values.metrics.enable }}
prometheus.io/scrape: "true"
prometheus.io/port: "{{ .Values.metrics.port }}"
prometheus.io/path: {{ .Values.metrics.endpoint }}
{{- end }}
{{- if .Values.podAnnotations }}
{{- toYaml .Values.podAnnotations | trim | nindent 8 }}
{{- end }}
{{- end }}
labels:
{{- include "spark-operator.selectorLabels" . | nindent 8 }}
{{- with .Values.podLabels }}
{{- toYaml . | trim | nindent 8 }}
{{- end }}
spec:
serviceAccountName: {{ include "spark-operator.serviceAccountName" . }}
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Chart.Name }}
image: {{ .Values.image.repository }}:{{ default .Chart.AppVersion .Values.image.tag }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
{{- if gt (int .Values.replicaCount) 1 }}
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
{{- end }}
envFrom:
{{- toYaml .Values.envFrom | nindent 10 }}
securityContext:
{{- toYaml .Values.securityContext | nindent 10 }}
{{- if or .Values.metrics.enable .Values.webhook.enable }}
ports:
{{ if .Values.metrics.enable -}}
- name: {{ .Values.metrics.portName | quote }}
containerPort: {{ .Values.metrics.port }}
{{- end }}
{{ if .Values.webhook.enable -}}
- name: {{ .Values.webhook.portName | quote }}
containerPort: {{ .Values.webhook.port }}
{{- end }}
{{ end -}}
args:
- -v={{ .Values.logLevel }}
- -logtostderr
{{- if eq (len $jobNamespaces) 1 }}
- -namespace={{ index $jobNamespaces 0 }}
{{- end }}
- -enable-ui-service={{ .Values.uiService.enable}}
- -ingress-url-format={{ .Values.ingressUrlFormat }}
- -controller-threads={{ .Values.controllerThreads }}
- -resync-interval={{ .Values.resyncInterval }}
- -enable-batch-scheduler={{ .Values.batchScheduler.enable }}
- -label-selector-filter={{ .Values.labelSelectorFilter }}
{{- if .Values.metrics.enable }}
- -enable-metrics=true
- -metrics-labels=app_type
- -metrics-port={{ .Values.metrics.port }}
- -metrics-endpoint={{ .Values.metrics.endpoint }}
- -metrics-prefix={{ .Values.metrics.prefix }}
{{- end }}
{{- if .Values.webhook.enable }}
- -enable-webhook=true
- -webhook-secret-name={{ include "spark-operator.webhookSecretName" . }}
- -webhook-secret-namespace={{ .Release.Namespace }}
- -webhook-svc-name={{ include "spark-operator.webhookServiceName" . }}
- -webhook-svc-namespace={{ .Release.Namespace }}
- -webhook-config-name={{ include "spark-operator.fullname" . }}-webhook-config
- -webhook-port={{ .Values.webhook.port }}
- -webhook-timeout={{ .Values.webhook.timeout }}
- -webhook-namespace-selector={{ .Values.webhook.namespaceSelector }}
- -webhook-object-selector={{ .Values.webhook.objectSelector }}
{{- end }}
- -enable-resource-quota-enforcement={{ .Values.resourceQuotaEnforcement.enable }}
{{- if gt (int .Values.replicaCount) 1 }}
- -leader-election=true
- -leader-election-lock-namespace={{ default .Release.Namespace .Values.leaderElection.lockNamespace }}
- -leader-election-lock-name={{ .Values.leaderElection.lockName }}
{{- end }}
{{- with .Values.resources }}
resources:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.volumeMounts }}
volumeMounts:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.sidecars }}
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.volumes }}
volumes:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.priorityClassName }}
priorityClassName: {{ .Values.priorityClassName }}
{{- end }}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}


@ -1,17 +0,0 @@
{{- if $.Values.podDisruptionBudget.enable }}
{{- if (gt (int $.Values.replicaCount) 1) }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.fullname" . }}-pdb
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
minAvailable: {{ $.Values.podDisruptionBudget.minAvailable }}
{{- else }}
{{- fail "replicaCount must be greater than 1 to enable PodDisruptionBudget" }}
{{- end }}
{{- end }}


@ -1,19 +0,0 @@
{{ if and .Values.metrics.enable .Values.podMonitor.enable }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
name: {{ include "spark-operator.name" . -}}-podmonitor
labels: {{ toYaml .Values.podMonitor.labels | nindent 4 }}
spec:
podMetricsEndpoints:
- interval: {{ .Values.podMonitor.podMetricsEndpoint.interval }}
port: {{ .Values.metrics.portName | quote }}
scheme: {{ .Values.podMonitor.podMetricsEndpoint.scheme }}
jobLabel: {{ .Values.podMonitor.jobLabel }}
namespaceSelector:
matchNames:
- {{ .Release.Namespace }}
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
{{ end }}


@ -1,7 +1,5 @@
-// Code generated by k8s code-generator DO NOT EDIT.
-/*
-Copyright 2018 Google LLC
+{{/*
+Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@ -14,9 +12,11 @@ distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-*/
+*/}}
-// Code generated by client-gen. DO NOT EDIT.
-// This package has the automatically generated clientset.
-package versioned
+{{/*
+Create the name of pod monitor
+*/}}
+{{- define "spark-operator.prometheus.podMonitorName" -}}
+{{- include "spark-operator.fullname" . }}-podmonitor
+{{- end -}}


@ -0,0 +1,44 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.prometheus.podMonitor.create -}}
{{- if not .Values.prometheus.metrics.enable }}
{{- fail "`metrics.enable` must be set to true when `podMonitor.create` is true." }}
{{- end }}
{{- if not (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1/PodMonitor") }}
{{- fail "The cluster does not support the required API version `monitoring.coreos.com/v1` for `PodMonitor`." }}
{{- end }}
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
name: {{ include "spark-operator.prometheus.podMonitorName" . }}
{{- with .Values.prometheus.podMonitor.labels }}
labels:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
podMetricsEndpoints:
- interval: {{ .Values.prometheus.podMonitor.podMetricsEndpoint.interval }}
port: {{ .Values.prometheus.metrics.portName | quote }}
scheme: {{ .Values.prometheus.podMonitor.podMetricsEndpoint.scheme }}
jobLabel: {{ .Values.prometheus.podMonitor.jobLabel }}
namespaceSelector:
matchNames:
- {{ .Release.Namespace }}
selector:
matchLabels:
{{- include "spark-operator.selectorLabels" . | nindent 6 }}
{{- end }}
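Enabling the PodMonitor requires metrics to be on, as the fail guard enforces. A minimal sketch with assumed endpoint settings:

prometheus:
  metrics:
    enable: true
    port: 8080
    portName: metrics
    endpoint: /metrics
  podMonitor:
    create: true
    labels:
      release: prometheus       # hypothetical Prometheus label selector
    jobLabel: spark-operator-podmonitor
    podMetricsEndpoint:
      interval: 5s
      scheme: http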


@ -1,148 +0,0 @@
{{- if or .Values.rbac.create .Values.rbac.createClusterRole -}}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "spark-operator.fullname" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
{{- with .Values.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- pods
- persistentvolumeclaims
verbs:
- "*"
- apiGroups:
- ""
resources:
- services
- configmaps
- secrets
verbs:
- create
- get
- delete
- update
- patch
- apiGroups:
- extensions
- networking.k8s.io
resources:
- ingresses
verbs:
- create
- get
- delete
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- ""
resources:
- events
verbs:
- create
- update
- patch
- apiGroups:
- ""
resources:
- resourcequotas
verbs:
- get
- list
- watch
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- get
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
verbs:
- create
- get
- update
- delete
- apiGroups:
- sparkoperator.k8s.io
resources:
- sparkapplications
- sparkapplications/status
- sparkapplications/finalizers
- scheduledsparkapplications
- scheduledsparkapplications/status
- scheduledsparkapplications/finalizers
verbs:
- "*"
{{- if .Values.batchScheduler.enable }}
# required for the `volcano` batch scheduler
- apiGroups:
- scheduling.incubator.k8s.io
- scheduling.sigs.dev
- scheduling.volcano.sh
resources:
- podgroups
verbs:
- "*"
{{- end }}
{{ if .Values.webhook.enable }}
- apiGroups:
- batch
resources:
- jobs
verbs:
- delete
{{- end }}
{{- if gt (int .Values.replicaCount) 1 }}
- apiGroups:
- coordination.k8s.io
resources:
- leases
resourceNames:
- {{ .Values.leaderElection.lockName }}
verbs:
- get
- update
- patch
- delete
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- create
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "spark-operator.fullname" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
{{- with .Values.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: ClusterRole
name: {{ include "spark-operator.fullname" . }}
apiGroup: rbac.authorization.k8s.io
{{- end }}


@ -1,12 +0,0 @@
{{- if .Values.serviceAccounts.sparkoperator.create }}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "spark-operator.serviceAccountName" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
{{- with .Values.serviceAccounts.sparkoperator.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}


@ -1,39 +0,0 @@
{{- if or .Values.rbac.create .Values.rbac.createRole }}
{{- $jobNamespaces := .Values.sparkJobNamespaces | default list }}
{{- range $jobNamespace := $jobNamespaces }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: spark-role
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.labels" $ | nindent 4 }}
rules:
- apiGroups:
- ""
resources:
- pods
- services
- configmaps
- persistentvolumeclaims
verbs:
- "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: spark
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.labels" $ | nindent 4 }}
subjects:
- kind: ServiceAccount
name: {{ include "spark.serviceAccountName" $ }}
namespace: {{ $jobNamespace }}
roleRef:
kind: Role
name: spark-role
apiGroup: rbac.authorization.k8s.io
{{- end }}
{{- end }}


@ -1,14 +0,0 @@
{{- if .Values.serviceAccounts.spark.create }}
{{- range $sparkJobNamespace := .Values.sparkJobNamespaces | default (list .Release.Namespace) }}
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "spark.serviceAccountName" $ }}
namespace: {{ $sparkJobNamespace }}
{{- with $.Values.serviceAccounts.spark.annotations }}
annotations: {{ toYaml . | nindent 4 }}
{{- end }}
labels: {{ include "spark-operator.labels" $ | nindent 4 }}
{{- end }}
{{- end }}


@ -0,0 +1,47 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/*
Create the name of spark component
*/}}
{{- define "spark-operator.spark.name" -}}
{{- include "spark-operator.fullname" . }}-spark
{{- end -}}
{{/*
Create the name of the service account to be used by spark applications
*/}}
{{- define "spark-operator.spark.serviceAccountName" -}}
{{- if .Values.spark.serviceAccount.create -}}
{{- .Values.spark.serviceAccount.name | default (include "spark-operator.spark.name" .) -}}
{{- else -}}
{{- .Values.spark.serviceAccount.name | default "default" -}}
{{- end -}}
{{- end -}}
{{/*
Create the name of the role to be used by spark service account
*/}}
{{- define "spark-operator.spark.roleName" -}}
{{- include "spark-operator.spark.serviceAccountName" . }}
{{- end -}}
{{/*
Create the name of the role binding to be used by spark service account
*/}}
{{- define "spark-operator.spark.roleBindingName" -}}
{{- include "spark-operator.spark.serviceAccountName" . }}
{{- end -}}


@ -0,0 +1,73 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.spark.rbac.create -}}
{{- range $jobNamespace := .Values.spark.jobNamespaces | default list }}
{{- if ne $jobNamespace "" }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.spark.roleName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.labels" $ | nindent 4 }}
{{- with $.Values.spark.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- pods
- configmaps
- persistentvolumeclaims
- services
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
- deletecollection
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.spark.roleBindingName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.labels" $ | nindent 4 }}
{{- with $.Values.spark.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.spark.serviceAccountName" $ }}
namespace: {{ $jobNamespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.spark.roleName" $ }}
{{- end }}
{{- end }}
{{- end }}


@ -0,0 +1,34 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.spark.serviceAccount.create }}
{{- range $jobNamespace := .Values.spark.jobNamespaces | default list }}
{{- if ne $jobNamespace "" }}
---
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: {{ $.Values.spark.serviceAccount.automountServiceAccountToken }}
metadata:
name: {{ include "spark-operator.spark.serviceAccountName" $ }}
namespace: {{ $jobNamespace }}
labels: {{ include "spark-operator.labels" $ | nindent 4 }}
{{- with $.Values.spark.serviceAccount.annotations }}
annotations: {{ toYaml . | nindent 4 }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
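A values sketch for the Spark application service account and RBAC templates above; the extra namespace is hypothetical, and note the guard skips the "" (all-namespaces) entry:

spark:
  jobNamespaces:
  - default
  - spark-jobs         # hypothetical namespace
  serviceAccount:
    create: true
    name: ""           # defaults to <fullname>-spark
    automountServiceAccountToken: true
  rbac:
    create: true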


@ -1,14 +1,165 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{/*
Create the name of webhook component
*/}}
{{- define "spark-operator.webhook.name" -}}
{{- include "spark-operator.fullname" . }}-webhook
{{- end -}}
{{/*
Common labels for the webhook
*/}}
{{- define "spark-operator.webhook.labels" -}}
{{ include "spark-operator.labels" . }}
app.kubernetes.io/component: webhook
{{- end -}}
{{/*
Selector labels for the webhook
*/}}
{{- define "spark-operator.webhook.selectorLabels" -}}
{{ include "spark-operator.selectorLabels" . }}
app.kubernetes.io/component: webhook
{{- end -}}
{{/*
Create the name of service account to be used by webhook
*/}}
{{- define "spark-operator.webhook.serviceAccountName" -}}
{{- if .Values.webhook.serviceAccount.create -}}
{{ .Values.webhook.serviceAccount.name | default (include "spark-operator.webhook.name" .) }}
{{- else -}}
{{ .Values.webhook.serviceAccount.name | default "default" }}
{{- end -}}
{{- end -}}
{{/*
Create the name of the cluster role to be used by the webhook
*/}}
{{- define "spark-operator.webhook.clusterRoleName" -}}
{{ include "spark-operator.webhook.name" . }}
{{- end }}
{{/*
Create the name of the cluster role binding to be used by the webhook
*/}}
{{- define "spark-operator.webhook.clusterRoleBindingName" -}}
{{ include "spark-operator.webhook.clusterRoleName" . }}
{{- end }}
{{/*
Create the name of the role to be used by the webhook
*/}}
{{- define "spark-operator.webhook.roleName" -}}
{{ include "spark-operator.webhook.name" . }}
{{- end }}
{{/*
Create the name of the role binding to be used by the webhook
*/}}
{{- define "spark-operator.webhook.roleBindingName" -}}
{{ include "spark-operator.webhook.roleName" . }}
{{- end }}
{{/*
Create the name of the secret to be used by webhook
*/}}
{{- define "spark-operator.webhookSecretName" -}}
{{ include "spark-operator.fullname" . }}-webhook-certs
{{- define "spark-operator.webhook.secretName" -}}
{{ include "spark-operator.webhook.name" . }}-certs
{{- end -}}
{{/*
Create the name of the service to be used by webhook
*/}}
{{- define "spark-operator.webhookServiceName" -}}
{{ include "spark-operator.fullname" . }}-webhook-svc
{{- define "spark-operator.webhook.serviceName" -}}
{{ include "spark-operator.webhook.name" . }}-svc
{{- end -}}
{{/*
Create the name of mutating webhook configuration
*/}}
{{- define "spark-operator.mutatingWebhookConfigurationName" -}}
webhook.sparkoperator.k8s.io
{{- end -}}
{{/*
Create the name of validating webhook configuration
*/}}
{{- define "spark-operator.validatingWebhookConfigurationName" -}}
quotaenforcer.sparkoperator.k8s.io
{{- end -}}
{{/*
Create the name of the deployment to be used by webhook
*/}}
{{- define "spark-operator.webhook.deploymentName" -}}
{{ include "spark-operator.webhook.name" . }}
{{- end -}}
{{/*
Create the name of the lease resource to be used by leader election
*/}}
{{- define "spark-operator.webhook.leaderElectionName" -}}
{{ include "spark-operator.webhook.name" . }}-lock
{{- end -}}
{{/*
Create the name of the pod disruption budget to be used by webhook
*/}}
{{- define "spark-operator.webhook.podDisruptionBudgetName" -}}
{{ include "spark-operator.webhook.name" . }}-pdb
{{- end -}}
{{/*
Create the role policy rules for the webhook in every Spark job namespace
*/}}
{{- define "spark-operator.webhook.policyRules" -}}
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- resourcequotas
verbs:
- get
- list
- watch
- apiGroups:
- sparkoperator.k8s.io
resources:
- sparkapplications
- sparkapplications/status
- sparkapplications/finalizers
- scheduledsparkapplications
- scheduledsparkapplications/status
- scheduledsparkapplications/finalizers
verbs:
- get
- list
- watch
- create
- update
- patch
- delete
{{- end -}}


@ -0,0 +1,170 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "spark-operator.webhook.deploymentName" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.webhook.replicas }}
selector:
matchLabels:
{{- include "spark-operator.webhook.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "spark-operator.webhook.selectorLabels" . | nindent 8 }}
{{- with .Values.webhook.labels }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.annotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
spec:
containers:
- name: spark-operator-webhook
image: {{ include "spark-operator.image" . }}
{{- with .Values.image.pullPolicy }}
imagePullPolicy: {{ . }}
{{- end }}
args:
- webhook
- start
{{- with .Values.webhook.logLevel }}
- --zap-log-level={{ . }}
{{- end }}
{{- with .Values.webhook.logEncoder }}
- --zap-encoder={{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if has "" . }}
- --namespaces=""
{{- else }}
- --namespaces={{ . | join "," }}
{{- end }}
{{- end }}
- --webhook-secret-name={{ include "spark-operator.webhook.secretName" . }}
- --webhook-secret-namespace={{ .Release.Namespace }}
- --webhook-svc-name={{ include "spark-operator.webhook.serviceName" . }}
- --webhook-svc-namespace={{ .Release.Namespace }}
- --webhook-port={{ .Values.webhook.port }}
- --mutating-webhook-name={{ include "spark-operator.webhook.name" . }}
- --validating-webhook-name={{ include "spark-operator.webhook.name" . }}
{{- with .Values.webhook.resourceQuotaEnforcement.enable }}
- --enable-resource-quota-enforcement=true
{{- end }}
{{- if .Values.certManager.enable }}
- --enable-cert-manager=true
{{- end }}
{{- if .Values.prometheus.metrics.enable }}
- --enable-metrics=true
- --metrics-bind-address=:{{ .Values.prometheus.metrics.port }}
- --metrics-endpoint={{ .Values.prometheus.metrics.endpoint }}
- --metrics-prefix={{ .Values.prometheus.metrics.prefix }}
- --metrics-labels=app_type
{{- end }}
{{ if .Values.webhook.leaderElection.enable }}
- --leader-election=true
- --leader-election-lock-name={{ include "spark-operator.webhook.leaderElectionName" . }}
- --leader-election-lock-namespace={{ .Release.Namespace }}
{{- else -}}
- --leader-election=false
{{- end }}
ports:
- name: {{ .Values.webhook.portName | quote }}
containerPort: {{ .Values.webhook.port }}
{{- if .Values.prometheus.metrics.enable }}
- name: {{ .Values.prometheus.metrics.portName | quote }}
containerPort: {{ .Values.prometheus.metrics.port }}
{{- end }}
{{- with .Values.webhook.env }}
env:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.envFrom }}
envFrom:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.volumeMounts }}
volumeMounts:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.resources }}
resources:
{{- toYaml . | nindent 10 }}
{{- end }}
livenessProbe:
httpGet:
port: 8081
scheme: HTTP
path: /healthz
readinessProbe:
httpGet:
port: 8081
scheme: HTTP
path: /readyz
{{- with .Values.webhook.securityContext }}
securityContext:
{{- toYaml . | nindent 10 }}
{{- end }}
{{- with .Values.webhook.sidecars }}
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.image.pullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.volumes }}
volumes:
{{- toYaml . | nindent 6 }}
{{- end }}
{{- with .Values.webhook.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.webhook.priorityClassName }}
priorityClassName: {{ . }}
{{- end }}
serviceAccountName: {{ include "spark-operator.webhook.serviceAccountName" . }}
automountServiceAccountToken: {{ .Values.webhook.serviceAccount.automountServiceAccountToken }}
{{- with .Values.webhook.podSecurityContext }}
securityContext:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.webhook.topologySpreadConstraints }}
{{- if le (int .Values.webhook.replicas) 1 }}
{{- fail "webhook.replicas must be greater than 1 to enable topology spread constraints for webhook pods"}}
{{- end }}
{{- $selectorLabels := include "spark-operator.webhook.selectorLabels" . | fromYaml }}
{{- $labelSelectorDict := dict "labelSelector" ( dict "matchLabels" $selectorLabels ) }}
topologySpreadConstraints:
{{- range .Values.webhook.topologySpreadConstraints }}
- {{ mergeOverwrite . $labelSelectorDict | toYaml | nindent 8 | trim }}
{{- end }}
{{- end }}
{{- end }}
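A hedged values sketch for the webhook deployment above; port and policy values are illustrative:

webhook:
  enable: true
  replicas: 1
  port: 9443
  failurePolicy: Fail            # used by the webhook configurations below
  resourceQuotaEnforcement:
    enable: true                 # adds --enable-resource-quota-enforcement=true
  leaderElection:
    enable: true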


@ -0,0 +1,128 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
name: {{ include "spark-operator.webhook.name" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- if .Values.certManager.enable }}
annotations:
cert-manager.io/inject-ca-from: {{ .Release.Namespace }}/{{ include "spark-operator.certManager.certificate.name" . }}
{{- end }}
webhooks:
- name: mutate--v1-pod.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /mutate--v1-pod
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
objectSelector:
matchLabels:
sparkoperator.k8s.io/launched-by-spark-operator: "true"
rules:
- apiGroups: [""]
apiVersions: ["v1"]
resources: ["pods"]
operations: ["CREATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
- name: mutate-sparkoperator-k8s-io-v1beta2-sparkapplication.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /mutate-sparkoperator-k8s-io-v1beta2-sparkapplication
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
rules:
- apiGroups: ["sparkoperator.k8s.io"]
apiVersions: ["v1beta2"]
resources: ["sparkapplications"]
operations: ["CREATE", "UPDATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
- name: mutate-sparkoperator-k8s-io-v1beta2-scheduledsparkapplication.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /mutate-sparkoperator-k8s-io-v1beta2-scheduledsparkapplication
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
rules:
- apiGroups: ["sparkoperator.k8s.io"]
apiVersions: ["v1beta2"]
resources: ["scheduledsparkapplications"]
operations: ["CREATE", "UPDATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
{{- end }}
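
As an illustration, with spark.jobNamespaces set to [default, spark], each webhook above should render a namespaceSelector along these lines (a sketch, not captured from an actual render):

namespaceSelector:
  matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: In
      values:
        - default
        - spark

If spark.jobNamespaces contains an empty string, the selector is omitted and the webhooks match all namespaces.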


@ -0,0 +1,36 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
{{- if .Values.webhook.podDisruptionBudget.enable }}
{{- if le (int .Values.webhook.replicas) 1 }}
{{- fail "webhook.replicas must be greater than 1 to enable pod disruption budget for webhook" }}
{{- end -}}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: {{ include "spark-operator.webhook.podDisruptionBudgetName" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
spec:
selector:
matchLabels:
{{- include "spark-operator.webhook.selectorLabels" . | nindent 6 }}
{{- with .Values.webhook.podDisruptionBudget.minAvailable }}
minAvailable: {{ . }}
{{- end }}
{{- end }}
{{- end }}
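
A values.yaml sketch that enables this budget (minAvailable is optional, and the replicas guard mirrors the topology spread constraint check above):

webhook:
  replicas: 2
  podDisruptionBudget:
    enable: true
    minAvailable: 1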


@ -0,0 +1,195 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
{{- if .Values.webhook.rbac.create }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "spark-operator.webhook.clusterRoleName" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- events
verbs:
- create
- update
- patch
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
verbs:
- list
- watch
- apiGroups:
- admissionregistration.k8s.io
resources:
- mutatingwebhookconfigurations
- validatingwebhookconfigurations
resourceNames:
- {{ include "spark-operator.webhook.name" . }}
verbs:
- get
- update
{{- if or (not .Values.spark.jobNamespaces) (has "" .Values.spark.jobNamespaces) }}
{{ include "spark-operator.webhook.policyRules" . }}
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: {{ include "spark-operator.webhook.clusterRoleBindingName" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.webhook.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: {{ include "spark-operator.webhook.clusterRoleName" . }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.webhook.roleName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
- apiGroups:
- ""
resources:
- secrets
verbs:
- create
- apiGroups:
- ""
resources:
- secrets
resourceNames:
- {{ include "spark-operator.webhook.secretName" . }}
verbs:
- get
- update
{{- if .Values.webhook.leaderElection.enable }}
- apiGroups:
- coordination.k8s.io
resources:
- leases
verbs:
- create
- apiGroups:
- coordination.k8s.io
resources:
- leases
resourceNames:
- {{ include "spark-operator.webhook.leaderElectionName" . }}
verbs:
- get
- update
{{- end }}
{{- if has .Release.Namespace .Values.spark.jobNamespaces }}
{{ include "spark-operator.webhook.policyRules" . }}
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.webhook.roleBindingName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.webhook.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.webhook.roleName" . }}
{{- if and .Values.spark.jobNamespaces (not (has "" .Values.spark.jobNamespaces)) }}
{{- range $jobNamespace := .Values.spark.jobNamespaces }}
{{- if ne $jobNamespace $.Release.Namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: {{ include "spark-operator.webhook.roleName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.webhook.labels" $ | nindent 4 }}
{{- with $.Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
rules:
{{ include "spark-operator.webhook.policyRules" $ }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: {{ include "spark-operator.webhook.roleBindingName" $ }}
namespace: {{ $jobNamespace }}
labels:
{{- include "spark-operator.webhook.labels" $ | nindent 4 }}
{{- with $.Values.webhook.rbac.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
subjects:
- kind: ServiceAccount
name: {{ include "spark-operator.webhook.serviceAccountName" $ }}
namespace: {{ $.Release.Namespace }}
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: {{ include "spark-operator.webhook.roleName" $ }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
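
To inspect the per-namespace Role/RoleBinding fan-out at the end of this file, a quick local render such as the following (release name and chart path are placeholders) should emit an extra Role and RoleBinding for every job namespace other than the release namespace:

helm template spark-operator . \
  --set webhook.enable=true \
  --set 'spark.jobNamespaces={ns1,ns2}'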


@ -1,13 +0,0 @@
{{- if .Values.webhook.enable -}}
apiVersion: v1
kind: Secret
metadata:
name: {{ include "spark-operator.webhookSecretName" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
data:
ca-key.pem: ""
ca-cert.pem: ""
server-key.pem: ""
server-cert.pem: ""
{{- end }}


@ -1,15 +1,31 @@
{{- if .Values.webhook.enable -}}
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "spark-operator.webhookServiceName" . }}
name: {{ include "spark-operator.webhook.serviceName" . }}
labels:
{{- include "spark-operator.labels" . | nindent 4 }}
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
spec:
selector:
{{- include "spark-operator.selectorLabels" . | nindent 4 }}
{{- include "spark-operator.webhook.selectorLabels" . | nindent 4 }}
ports:
- port: 443
- port: {{ .Values.webhook.port }}
targetPort: {{ .Values.webhook.portName | quote }}
name: {{ .Values.webhook.portName }}
{{- end }}
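
The service port is now value-driven instead of hardcoded to 443; an illustrative values.yaml fragment (the port and name shown here are assumptions, not confirmed chart defaults):

webhook:
  enable: true
  port: 9443
  portName: webhook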


@ -0,0 +1,32 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
{{- if .Values.webhook.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: {{ .Values.webhook.serviceAccount.automountServiceAccountToken }}
metadata:
name: {{ include "spark-operator.webhook.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- with .Values.webhook.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}
{{- end }}


@ -0,0 +1,93 @@
{{/*
Copyright 2024 The Kubeflow authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/}}
{{- if .Values.webhook.enable }}
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
name: {{ include "spark-operator.webhook.name" . }}
labels:
{{- include "spark-operator.webhook.labels" . | nindent 4 }}
{{- if .Values.certManager.enable }}
annotations:
cert-manager.io/inject-ca-from: {{ .Release.Namespace }}/{{ include "spark-operator.certManager.certificate.name" . }}
{{- end }}
webhooks:
- name: validate-sparkoperator-k8s-io-v1beta2-sparkapplication.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /validate-sparkoperator-k8s-io-v1beta2-sparkapplication
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
rules:
- apiGroups: ["sparkoperator.k8s.io"]
apiVersions: ["v1beta2"]
resources: ["sparkapplications"]
operations: ["CREATE", "UPDATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
- name: validate-sparkoperator-k8s-io-v1beta2-scheduledsparkapplication.sparkoperator.k8s.io
admissionReviewVersions: ["v1"]
clientConfig:
service:
name: {{ include "spark-operator.webhook.serviceName" . }}
namespace: {{ .Release.Namespace }}
port: {{ .Values.webhook.port }}
path: /validate-sparkoperator-k8s-io-v1beta2-scheduledsparkapplication
sideEffects: NoneOnDryRun
{{- with .Values.webhook.failurePolicy }}
failurePolicy: {{ . }}
{{- end }}
{{- with .Values.spark.jobNamespaces }}
{{- if not (has "" .) }}
namespaceSelector:
matchExpressions:
- key: kubernetes.io/metadata.name
operator: In
values:
{{- range $jobNamespace := . }}
- {{ $jobNamespace }}
{{- end }}
{{- end }}
{{- end }}
rules:
- apiGroups: ["sparkoperator.k8s.io"]
apiVersions: ["v1beta2"]
resources: ["scheduledsparkapplications"]
operations: ["CREATE", "UPDATE"]
{{- with .Values.webhook.timeoutSeconds }}
timeoutSeconds: {{ . }}
{{- end }}
{{- end }}
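
With certManager.enable set, both webhook configurations rely on cert-manager's CA injector via the annotation above. One way to verify the injected bundle after install (the object name is assumed to follow spark-operator.webhook.name):

kubectl get validatingwebhookconfiguration spark-operator-webhook \
  -o jsonpath='{.webhooks[*].clientConfig.caBundle}'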


@ -0,0 +1,134 @@
#
# Copyright 2025 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test CertManager Certificate
templates:
- certmanager/certificate.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create Certificate if `webhook.enable` is `false`
capabilities:
apiVersions:
- cert-manager.io/v1/Certificate
set:
webhook:
enable: false
certManager:
enable: true
asserts:
- hasDocuments:
count: 0
- it: Should not create Certificate if `certManager.enable` is `false`
capabilities:
apiVersions:
- cert-manager.io/v1/Certificate
set:
webhook:
enable: true
certManager:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should create Certificate if `webhook.enable` is `true` and `certManager.enable` is `true`
capabilities:
apiVersions:
- cert-manager.io/v1/Certificate
set:
webhook:
enable: true
certManager:
enable: true
asserts:
- containsDocument:
apiVersion: cert-manager.io/v1
kind: Certificate
name: spark-operator-certificate
namespace: spark-operator
- it: Should fail if the cluster does not support `cert-manager.io/v1/Certificate`
set:
webhook:
enable: true
certManager:
enable: true
asserts:
- failedTemplate:
errorMessage: "The cluster does not support the required API version `cert-manager.io/v1` for `Certificate`."
- it: Should use the self-signed issuer if `certManager.issuerRef` is not set
capabilities:
apiVersions:
- cert-manager.io/v1/Certificate
set:
webhook:
enable: true
certManager:
enable: true
issuerRef: null
asserts:
- equal:
path: spec.issuerRef
value:
group: cert-manager.io
kind: Issuer
name: spark-operator-self-signed-issuer
- it: Should use the specified issuer if `certManager.issuerRef` is set
capabilities:
apiVersions:
- cert-manager.io/v1/Certificate
set:
webhook:
enable: true
certManager:
enable: true
issuerRef:
group: cert-manager.io
kind: Issuer
name: test-issuer
asserts:
- equal:
path: spec.issuerRef
value:
group: cert-manager.io
kind: Issuer
name: test-issuer
- it: Should use the specified duration if `certManager.duration` is set
capabilities:
apiVersions:
- cert-manager.io/v1/Certificate
set:
webhook:
enable: true
certManager:
enable: true
duration: 8760h
asserts:
- equal:
path: spec.duration
value: 8760h
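
These suites target the helm-unittest plugin; a typical invocation from the chart directory looks like this (plugin install shown for completeness):

helm plugin install https://github.com/helm-unittest/helm-unittest
helm unittest .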


@ -0,0 +1,95 @@
#
# Copyright 2025 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test CertManager Issuer
templates:
- certmanager/issuer.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create Issuer if `webhook.enable` is `false`
capabilities:
apiVersions:
- cert-manager.io/v1/Issuer
set:
webhook:
enable: false
certManager:
enable: true
asserts:
- hasDocuments:
count: 0
- it: Should not create Issuer if `certManager.enable` is `false`
capabilities:
apiVersions:
- cert-manager.io/v1/Issuer
set:
webhook:
enable: true
certManager:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should not create Issuer if `certManager.issuerRef` is set
capabilities:
apiVersions:
- cert-manager.io/v1/Issuer
set:
webhook:
enable: true
certManager:
enable: true
issuerRef:
group: cert-manager.io
kind: Issuer
name: test-issuer
asserts:
- hasDocuments:
count: 0
- it: Should fail if the cluster does not support `cert-manager.io/v1/Issuer`
set:
webhook:
enable: true
certManager:
enable: true
asserts:
- failedTemplate:
errorMessage: "The cluster does not support the required API version `cert-manager.io/v1` for `Issuer`."
- it: Should create Issuer if `webhook.enable` is `true` and `certManager.enable` is `true`
capabilities:
apiVersions:
- cert-manager.io/v1/Issuer
set:
webhook:
enable: true
certManager:
enable: true
issuerRef: null
asserts:
- containsDocument:
apiVersion: cert-manager.io/v1
kind: Issuer
name: spark-operator-self-signed-issuer
namespace: spark-operator
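
The capabilities.apiVersions stanza used throughout these suites mirrors what helm template exposes through its --api-versions flag, so the same happy path can be reproduced outside the test harness (the flag value is assumed to match the chart's Capabilities lookup):

helm template spark-operator . \
  --api-versions cert-manager.io/v1/Certificate \
  --set webhook.enable=true \
  --set certManager.enable=true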


@ -0,0 +1,729 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller deployment
templates:
- controller/deployment.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should use the specified image repository if `image.registry`, `image.repository` and `image.tag` are set
set:
image:
registry: test-registry
repository: test-repository
tag: test-tag
asserts:
- equal:
path: spec.template.spec.containers[0].image
value: test-registry/test-repository:test-tag
- it: Should use the specified image pull policy if `image.pullPolicy` is set
set:
image:
pullPolicy: Always
asserts:
- equal:
path: spec.template.spec.containers[*].imagePullPolicy
value: Always
- it: Should set replicas if `controller.replicas` is set
set:
controller:
replicas: 10
asserts:
- equal:
path: spec.replicas
value: 10
- it: Should set replicas to 0 if `controller.replicas` is set to 0
set:
controller:
replicas: 0
asserts:
- equal:
path: spec.replicas
value: 0
- it: Should add pod labels if `controller.labels` is set
set:
controller:
labels:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.metadata.labels.key1
value: value1
- equal:
path: spec.template.metadata.labels.key2
value: value2
- it: Should add prometheus annotations if `metrics.enable` is true
set:
prometheus:
metrics:
enable: true
port: 10254
endpoint: /metrics
asserts:
- equal:
path: spec.template.metadata.annotations["prometheus.io/scrape"]
value: "true"
- equal:
path: spec.template.metadata.annotations["prometheus.io/port"]
value: "10254"
- equal:
path: spec.template.metadata.annotations["prometheus.io/path"]
value: /metrics
- it: Should add pod annotations if `controller.annotations` is set
set:
controller:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.metadata.annotations.key1
value: value1
- equal:
path: spec.template.metadata.annotations.key2
value: value2
- it: Should contain `--zap-log-level` arg if `controller.logLevel` is set
set:
controller:
logLevel: debug
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --zap-log-level=debug
- it: Should contain `--namespaces` arg if `spark.jobNamespaces` is set
set:
spark:
jobNamespaces:
- ns1
- ns2
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --namespaces=ns1,ns2
- it: Should set namespaces to all namespaces (`""`) if `spark.jobNamespaces` contains empty string
set:
spark:
jobNamespaces:
- ""
- default
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --namespaces=""
- it: Should contain `--controller-threads` arg if `controller.workers` is set
set:
controller:
workers: 30
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --controller-threads=30
- it: Should contain `--enable-ui-service` arg if `controller.uiService.enable` is set to `true`
set:
controller:
uiService:
enable: true
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --enable-ui-service=true
- it: Should contain `--ingress-url-format` arg if `controller.uiIngress.enable` is set to `true` and `controller.uiIngress.urlFormat` is set
set:
controller:
uiService:
enable: true
uiIngress:
enable: true
urlFormat: "{{$appName}}.example.com/{{$appNamespace}}/{{$appName}}"
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --ingress-url-format={{$appName}}.example.com/{{$appNamespace}}/{{$appName}}
- it: Should contain `--ingress-class-name` arg if `controller.uiIngress.enable` is set to `true` and `controller.uiIngress.ingressClassName` is set
set:
controller:
uiService:
enable: true
uiIngress:
enable: true
ingressClassName: nginx
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --ingress-class-name=nginx
- it: Should contain `--ingress-tls` arg if `controller.uiIngress.enable` is set to `true` and `controller.uiIngress.tls` is set
set:
controller:
uiService:
enable: true
uiIngress:
enable: true
tls:
- hosts:
- "*.test.com"
secretName: test-secret
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: '--ingress-tls=[{"hosts":["*.test.com"],"secretName":"test-secret"}]'
- it: Should contain `--ingress-annotations` arg if `controller.uiIngress.enable` is set to `true` and `controller.uiIngress.annotations` is set
set:
controller:
uiService:
enable: true
uiIngress:
enable: true
annotations:
cert-manager.io/cluster-issuer: "letsencrypt"
kubernetes.io/ingress.class: nginx
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: '--ingress-annotations={"cert-manager.io/cluster-issuer":"letsencrypt","kubernetes.io/ingress.class":"nginx"}'
- it: Should contain `--enable-batch-scheduler` arg if `controller.batchScheduler.enable` is `true`
set:
controller:
batchScheduler:
enable: true
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --enable-batch-scheduler=true
- it: Should contain `--default-batch-scheduler` arg if `controller.batchScheduler.default` is set
set:
controller:
batchScheduler:
enable: true
default: yunikorn
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --default-batch-scheduler=yunikorn
- it: Should contain `--enable-metrics` arg if `prometheus.metrics.enable` is set to `true`
set:
prometheus:
metrics:
enable: true
port: 12345
portName: test-port
endpoint: /test-endpoint
prefix: test-prefix
jobStartLatencyBuckets: "180,360,420,690"
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --enable-metrics=true
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-bind-address=:12345
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-endpoint=/test-endpoint
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-prefix=test-prefix
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-labels=app_type
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --metrics-job-start-latency-buckets=180,360,420,690
- it: Should enable leader election by default
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --leader-election=true
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --leader-election-lock-name=spark-operator-controller-lock
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --leader-election-lock-namespace=spark-operator
- it: Should disable leader election if `controller.leaderElection.enable` is set to `false`
set:
controller:
leaderElection:
enable: false
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --leader-election=false
- it: Should add metric ports if `prometheus.metrics.enable` is true
set:
prometheus:
metrics:
enable: true
port: 10254
portName: metrics
asserts:
- contains:
path: spec.template.spec.containers[0].ports
content:
name: metrics
containerPort: 10254
count: 1
- it: Should add environment variables if `controller.env` is set
set:
controller:
env:
- name: ENV_NAME_1
value: ENV_VALUE_1
- name: ENV_NAME_2
valueFrom:
configMapKeyRef:
name: test-configmap
key: test-key
optional: false
asserts:
- contains:
path: spec.template.spec.containers[0].env
content:
name: ENV_NAME_1
value: ENV_VALUE_1
- contains:
path: spec.template.spec.containers[0].env
content:
name: ENV_NAME_2
valueFrom:
configMapKeyRef:
name: test-configmap
key: test-key
optional: false
- it: Should add environment variable sources if `controller.envFrom` is set
set:
controller:
envFrom:
- configMapRef:
name: test-configmap
optional: false
- secretRef:
name: test-secret
optional: false
asserts:
- contains:
path: spec.template.spec.containers[0].envFrom
content:
configMapRef:
name: test-configmap
optional: false
- contains:
path: spec.template.spec.containers[0].envFrom
content:
secretRef:
name: test-secret
optional: false
- it: Should add volume mounts if `controller.volumeMounts` is set
set:
controller:
volumeMounts:
- name: volume1
mountPath: /volume1
- name: volume2
mountPath: /volume2
asserts:
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume1
mountPath: /volume1
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume2
mountPath: /volume2
- it: Should add resources if `controller.resources` is set
set:
controller:
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
asserts:
- equal:
path: spec.template.spec.containers[0].resources
value:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
- it: Should add container securityContext if `controller.securityContext` is set
set:
controller:
securityContext:
readOnlyRootFilesystem: true
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
runAsNonRoot: true
privileged: false
asserts:
- equal:
path: spec.template.spec.containers[0].securityContext.readOnlyRootFilesystem
value: true
- equal:
path: spec.template.spec.containers[0].securityContext.runAsUser
value: 1000
- equal:
path: spec.template.spec.containers[0].securityContext.runAsGroup
value: 2000
- equal:
path: spec.template.spec.containers[0].securityContext.fsGroup
value: 3000
- equal:
path: spec.template.spec.containers[0].securityContext.allowPrivilegeEscalation
value: false
- equal:
path: spec.template.spec.containers[0].securityContext.capabilities
value:
drop:
- ALL
- equal:
path: spec.template.spec.containers[0].securityContext.runAsNonRoot
value: true
- equal:
path: spec.template.spec.containers[0].securityContext.privileged
value: false
- it: Should add sidecars if `controller.sidecars` is set
set:
controller:
sidecars:
- name: sidecar1
image: sidecar-image1
- name: sidecar2
image: sidecar-image2
asserts:
- contains:
path: spec.template.spec.containers
content:
name: sidecar1
image: sidecar-image1
- contains:
path: spec.template.spec.containers
content:
name: sidecar2
image: sidecar-image2
- it: Should add secrets if `image.pullSecrets` is set
set:
image:
pullSecrets:
- name: test-secret1
- name: test-secret2
asserts:
- equal:
path: spec.template.spec.imagePullSecrets[0].name
value: test-secret1
- equal:
path: spec.template.spec.imagePullSecrets[1].name
value: test-secret2
- it: Should add volumes if `controller.volumes` is set
set:
controller:
volumes:
- name: volume1
emptyDir: {}
- name: volume2
emptyDir: {}
asserts:
- contains:
path: spec.template.spec.volumes
content:
name: volume1
emptyDir: {}
count: 1
- contains:
path: spec.template.spec.volumes
content:
name: volume2
emptyDir: {}
count: 1
- it: Should add nodeSelector if `controller.nodeSelector` is set
set:
controller:
nodeSelector:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.spec.nodeSelector.key1
value: value1
- equal:
path: spec.template.spec.nodeSelector.key2
value: value2
- it: Should add affinity if `controller.affinity` is set
set:
controller:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
asserts:
- equal:
path: spec.template.spec.affinity
value:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
- it: Should add tolerations if `controller.tolerations` is set
set:
controller:
tolerations:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule
asserts:
- equal:
path: spec.template.spec.tolerations
value:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule
- it: Should add priorityClassName if `controller.priorityClassName` is set
set:
controller:
priorityClassName: test-priority-class
asserts:
- equal:
path: spec.template.spec.priorityClassName
value: test-priority-class
- it: Should add pod securityContext if `controller.podSecurityContext` is set
set:
controller:
podSecurityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.securityContext
value:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
- it: Should not contain topologySpreadConstraints if `controller.topologySpreadConstraints` is empty
set:
controller:
topologySpreadConstraints: []
asserts:
- notExists:
path: spec.template.spec.topologySpreadConstraints
- it: Should add topologySpreadConstraints if `controller.topologySpreadConstraints` is set and `controller.replicas` is greater than 1
set:
controller:
replicas: 2
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
asserts:
- equal:
path: spec.template.spec.topologySpreadConstraints
value:
- labelSelector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: spark-operator
app.kubernetes.io/name: spark-operator
maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- labelSelector:
matchLabels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: spark-operator
app.kubernetes.io/name: spark-operator
maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
- it: Should fail if `controller.topologySpreadConstraints` is set and `controller.replicas` is not greater than 1
set:
controller:
replicas: 1
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
- maxSkew: 1
topologyKey: kubernetes.io/hostname
whenUnsatisfiable: DoNotSchedule
asserts:
- failedTemplate:
errorMessage: "controller.replicas must be greater than 1 to enable topology spread constraints for controller pods"
- it: Should contain `--pprof-bind-address` arg if `controller.pprof.enable` is set to `true`
set:
controller:
pprof:
enable: true
port: 12345
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --pprof-bind-address=:12345
- it: Should add pprof ports if `controller.pprof.enable` is set to `true`
set:
controller:
pprof:
enable: true
port: 12345
portName: pprof-test
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].ports
content:
name: pprof-test
containerPort: 12345
count: 1
- it: Should contain `--workqueue-ratelimiter-max-delay` arg if `controller.workqueueRateLimiter.maxDelay.enable` is set to `true`
set:
controller:
workqueueRateLimiter:
bucketQPS: 1
bucketSize: 2
maxDelay:
enable: true
duration: 3h
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --workqueue-ratelimiter-bucket-qps=1
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --workqueue-ratelimiter-bucket-size=2
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --workqueue-ratelimiter-max-delay=3h
- it: Should not contain `--workqueue-ratelimiter-max-delay` arg if `controller.workqueueRateLimiter.maxDelay.enable` is set to `false`
set:
controller:
workqueueRateLimiter:
maxDelay:
enable: false
duration: 1h
asserts:
- notContains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --workqueue-ratelimiter-max-delay=1h
- it: Should contain `--driver-pod-creation-grace-period` arg if `controller.driverPodCreationGracePeriod` is set
set:
controller:
driverPodCreationGracePeriod: 30s
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --driver-pod-creation-grace-period=30s
- it: Should contain `--max-tracked-executor-per-app` arg if `controller.maxTrackedExecutorPerApp` is set
set:
controller:
maxTrackedExecutorPerApp: 123
asserts:
- contains:
path: spec.template.spec.containers[?(@.name=="spark-operator-controller")].args
content: --max-tracked-executor-per-app=123
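
For the workqueue rate limiter cases above, the values nesting the chart expects is as follows (a sketch assembled from the test inputs):

controller:
  workqueueRateLimiter:
    bucketQPS: 1
    bucketSize: 2
    maxDelay:
      enable: true
      duration: 3h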


@ -0,0 +1,68 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller pod disruption budget
templates:
- controller/poddisruptionbudget.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not render podDisruptionBudget if `controller.podDisruptionBudget.enable` is false
set:
controller:
podDisruptionBudget:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should fail if `controller.replicas` is less than 2 when `controller.podDisruptionBudget.enable` is true
set:
controller:
replicas: 1
podDisruptionBudget:
enable: true
asserts:
- failedTemplate:
errorMessage: "controller.replicas must be greater than 1 to enable pod disruption budget for controller"
- it: Should render spark operator podDisruptionBudget if `controller.podDisruptionBudget.enable` is true
set:
controller:
replicas: 2
podDisruptionBudget:
enable: true
asserts:
- containsDocument:
apiVersion: policy/v1
kind: PodDisruptionBudget
name: spark-operator-controller-pdb
- it: Should set minAvailable if `controller.podDisruptionBudget.minAvailable` is specified
set:
controller:
replicas: 2
podDisruptionBudget:
enable: true
minAvailable: 3
asserts:
- equal:
path: spec.minAvailable
value: 3


@ -0,0 +1,165 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller rbac
templates:
- controller/rbac.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create controller RBAC resources if `controller.rbac.create` is false
set:
controller:
rbac:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should create controller ClusterRole by default
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
name: spark-operator-controller
- it: Should create controller ClusterRoleBinding by default
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
name: spark-operator-controller
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-controller
namespace: spark-operator
count: 1
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: spark-operator-controller
- it: Should add extra annotations to controller ClusterRole if `controller.rbac.annotations` is set
set:
controller:
rbac:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2
- it: Should create Role for controller in release namespace by default
documentIndex: 2
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-controller
namespace: spark-operator
- it: Should create RoleBinding for controller in release namespace by default
documentIndex: 3
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-controller
namespace: spark-operator
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator-controller
namespace: spark-operator
count: 1
- equal:
path: roleRef
value:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: spark-operator-controller
- it: Should create Role for controller in namespace `default` if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 4
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-controller
namespace: default
- it: Should create RoleBinding for controller in namespace `default` if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 5
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-controller
namespace: default
- it: Should create Role for controller in namespace `spark` if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 6
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
name: spark-operator-controller
namespace: spark
- it: Should create RoleBinding for controller in namespace `spark` if `spark.jobNamespaces` is set and does not contain empty string
set:
spark:
jobNamespaces:
- default
- spark
documentIndex: 7
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
name: spark-operator-controller
namespace: spark


@ -0,0 +1,44 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller deployment
templates:
- controller/service.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should create the pprof service correctly
set:
controller:
pprof:
enable: true
port: 12345
portName: pprof-test
asserts:
- containsDocument:
apiVersion: v1
kind: Service
name: spark-operator-controller-svc
- equal:
path: spec.ports[0]
value:
port: 12345
targetPort: pprof-test
name: pprof-test


@ -0,0 +1,67 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test controller service account
templates:
- controller/serviceaccount.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create controller service account if `controller.serviceAccount.create` is false
set:
controller:
serviceAccount:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should create controller service account by default
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark-operator-controller
- it: Should use the specified service account name if `controller.serviceAccount.name` is set
set:
controller:
serviceAccount:
name: custom-service-account
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: custom-service-account
- it: Should add extra annotations if `controller.serviceAccount.annotations` is set
set:
controller:
serviceAccount:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2


@ -1,301 +0,0 @@
suite: Test spark operator deployment
templates:
- deployment.yaml
release:
name: spark-operator
tests:
- it: Should contain namespace arg when sparkJobNamespaces is equal to 1
set:
sparkJobNamespaces:
- ns1
asserts:
- contains:
path: spec.template.spec.containers[0].args
content: -namespace=ns1
- it: Should add pod annotations if podAnnotations is set
set:
podAnnotations:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.metadata.annotations.key1
value: value1
- equal:
path: spec.template.metadata.annotations.key2
value: value2
- it: Should add prometheus annotations if metrics.enable is true
set:
metrics:
enable: true
port: 10254
endpoint: /metrics
asserts:
- equal:
path: spec.template.metadata.annotations["prometheus.io/scrape"]
value: "true"
- equal:
path: spec.template.metadata.annotations["prometheus.io/port"]
value: "10254"
- equal:
path: spec.template.metadata.annotations["prometheus.io/path"]
value: /metrics
- it: Should add secrets if imagePullSecrets is set
set:
imagePullSecrets:
- name: test-secret1
- name: test-secret2
asserts:
- equal:
path: spec.template.spec.imagePullSecrets[0].name
value: test-secret1
- equal:
path: spec.template.spec.imagePullSecrets[1].name
value: test-secret2
- it: Should add pod securityContext if podSecurityContext is set
set:
podSecurityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.securityContext.runAsUser
value: 1000
- equal:
path: spec.template.spec.securityContext.runAsGroup
value: 2000
- equal:
path: spec.template.spec.securityContext.fsGroup
value: 3000
- it: Should use the specified image repository if image.repository and image.tag is set
set:
image:
repository: test-repository
tag: test-tag
asserts:
- equal:
path: spec.template.spec.containers[0].image
value: test-repository:test-tag
- it: Should use the specified image pull policy if image.pullPolicy is set
set:
image:
pullPolicy: Always
asserts:
- equal:
path: spec.template.spec.containers[0].imagePullPolicy
value: Always
- it: Should add container securityContext if securityContext is set
set:
securityContext:
runAsUser: 1000
runAsGroup: 2000
fsGroup: 3000
asserts:
- equal:
path: spec.template.spec.containers[0].securityContext.runAsUser
value: 1000
- equal:
path: spec.template.spec.containers[0].securityContext.runAsGroup
value: 2000
- equal:
path: spec.template.spec.containers[0].securityContext.fsGroup
value: 3000
- it: Should add metric ports if metrics.enable is true
set:
metrics:
enable: true
port: 10254
portName: metrics
asserts:
- contains:
path: spec.template.spec.containers[0].ports
content:
name: metrics
containerPort: 10254
count: 1
- it: Should add webhook ports if webhook.enable is true
set:
webhook:
enable: true
port: 8080
portName: webhook
asserts:
- contains:
path: spec.template.spec.containers[0].ports
content:
name: webhook
containerPort: 8080
count: 1
- it: Should add resources if resources is set
set:
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
asserts:
- equal:
path: spec.template.spec.containers[0].resources
value:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
- it: Should add sidecars if sidecars is set
set:
sidecars:
- name: sidecar1
image: sidecar-image1
- name: sidecar2
image: sidecar-image2
asserts:
- contains:
path: spec.template.spec.containers
content:
name: sidecar1
image: sidecar-image1
count: 1
- contains:
path: spec.template.spec.containers
content:
name: sidecar2
image: sidecar-image2
count: 1
- it: Should add volumes if volumes is set
set:
volumes:
- name: volume1
emptyDir: {}
- name: volume2
emptyDir: {}
asserts:
- contains:
path: spec.template.spec.volumes
content:
name: volume1
emptyDir: {}
count: 1
- contains:
path: spec.template.spec.volumes
content:
name: volume2
emptyDir: {}
count: 1
- it: Should add volume mounts if volumeMounts is set
set:
volumeMounts:
- name: volume1
mountPath: /volume1
- name: volume2
mountPath: /volume2
asserts:
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume1
mountPath: /volume1
count: 1
- contains:
path: spec.template.spec.containers[0].volumeMounts
content:
name: volume2
mountPath: /volume2
count: 1
- it: Should add nodeSelector if nodeSelector is set
set:
nodeSelector:
key1: value1
key2: value2
asserts:
- equal:
path: spec.template.spec.nodeSelector.key1
value: value1
- equal:
path: spec.template.spec.nodeSelector.key2
value: value2
- it: Should add affinity if affinity is set
set:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
asserts:
- equal:
path: spec.template.spec.affinity
value:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- antarctica-east1
- antarctica-west1
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
- it: Should add tolerations if tolerations is set
set:
tolerations:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule
asserts:
- equal:
path: spec.template.spec.tolerations
value:
- key: key1
operator: Equal
value: value1
effect: NoSchedule
- key: key2
operator: Exists
effect: NoSchedule


@ -1,37 +0,0 @@
suite: Test spark operator podDisruptionBudget
templates:
- poddisruptionbudget.yaml
release:
name: spark-operator
tests:
- it: Should not render spark operator podDisruptionBudget if podDisruptionBudget.enable is false
set:
podDisruptionBudget:
enable: false
asserts:
- hasDocuments:
count: 0
- it: Should render spark operator podDisruptionBudget if podDisruptionBudget.enable is true
set:
podDisruptionBudget:
enable: true
documentIndex: 0
asserts:
- containsDocument:
apiVersion: policy/v1
kind: PodDisruptionBudget
name: spark-operator-podDisruptionBudget
- it: Should set minAvailable from values
set:
podDisruptionBudget:
enable: true
minAvailable: 3
asserts:
- equal:
path: spec.template.minAvailable
value: 3


@ -0,0 +1,102 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test prometheus pod monitor
templates:
- prometheus/podmonitor.yaml
release:
name: spark-operator
namespace: spark-operator
tests:
- it: Should not create pod monitor by default
asserts:
- hasDocuments:
count: 0
- it: Should fail if `prometheus.podMonitor.create` is true and `prometheus.metrics.enable` is false
set:
prometheus:
metrics:
enable: false
podMonitor:
create: true
asserts:
- failedTemplate:
errorMessage: "`metrics.enable` must be set to true when `podMonitor.create` is true."
- it: Should fail if the cluster does not support `monitoring.coreos.com/v1/PodMonitor` even if `prometheus.podMonitor.create` and `prometheus.metrics.enable` are both true
set:
prometheus:
metrics:
enable: true
podMonitor:
create: true
asserts:
- failedTemplate:
errorMessage: "The cluster does not support the required API version `monitoring.coreos.com/v1` for `PodMonitor`."
- it: Should create pod monitor if the cluster supports `monitoring.coreos.com/v1/PodMonitor` and both `prometheus.podMonitor.create` and `prometheus.metrics.enable` are true
capabilities:
apiVersions:
- monitoring.coreos.com/v1/PodMonitor
set:
prometheus:
metrics:
enable: true
podMonitor:
create: true
asserts:
- containsDocument:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
name: spark-operator-podmonitor
- it: Should use the specified labels, jobLabel and podMetricsEndpoint
capabilities:
apiVersions:
- monitoring.coreos.com/v1/PodMonitor
set:
prometheus:
metrics:
enable: true
portName: custom-port
podMonitor:
create: true
labels:
key1: value1
key2: value2
jobLabel: custom-job-label
podMetricsEndpoint:
scheme: https
interval: 10s
asserts:
- equal:
path: metadata.labels
value:
key1: value1
key2: value2
- equal:
path: spec.podMetricsEndpoints[0]
value:
port: custom-port
scheme: https
interval: 10s
- equal:
path: spec.jobLabel
value: custom-job-label
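
A values.yaml sketch for the happy path covered above, with metrics and the PodMonitor enabled together:

prometheus:
  metrics:
    enable: true
  podMonitor:
    create: true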


@ -1,90 +0,0 @@
suite: Test spark operator rbac
templates:
- rbac.yaml
release:
name: spark-operator
tests:
- it: Should not render spark operator rbac resources if rbac.create is false and rbac.createClusterRole is false
set:
rbac:
create: false
createClusterRole: false
asserts:
- hasDocuments:
count: 0
- it: Should render spark operator cluster role if rbac.create is true
set:
rbac:
create: true
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
name: spark-operator
- it: Should render spark operator cluster role if rbac.createClusterRole is true
set:
rbac:
createClusterRole: true
documentIndex: 0
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
name: spark-operator
- it: Should render spark operator cluster role binding if rbac.create is true
set:
rbac:
create: true
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
name: spark-operator
- it: Should render spark operator cluster role binding correctly if rbac.createClusterRole is true
set:
rbac:
createClusterRole: true
release:
documentIndex: 1
asserts:
- containsDocument:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
name: spark-operator
- contains:
path: subjects
content:
kind: ServiceAccount
name: spark-operator
namespace: NAMESPACE
count: 1
- equal:
path: roleRef
value:
kind: ClusterRole
name: spark-operator
apiGroup: rbac.authorization.k8s.io
- it: Should add extra annotations to spark operator cluster role if rbac.annotations is set
set:
rbac:
annotations:
key1: value1
key2: value2
documentIndex: 0
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2


@ -1,54 +0,0 @@
suite: Test spark operator service account
templates:
- serviceaccount.yaml
release:
name: spark-operator
tests:
- it: Should not render service account if serviceAccounts.sparkoperator.create is false
set:
serviceAccounts:
sparkoperator:
create: false
asserts:
- hasDocuments:
count: 0
- it: Should render service account if serviceAccounts.sparkoperator.create is true
set:
serviceAccounts:
sparkoperator:
create: true
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: spark-operator
- it: Should use the specified service account name if serviceAccounts.sparkoperator.name is set
set:
serviceAccounts:
sparkoperator:
name: custom-service-account
asserts:
- containsDocument:
apiVersion: v1
kind: ServiceAccount
name: custom-service-account
- it: Should add extra annotations if serviceAccounts.sparkoperator.annotations is set
set:
serviceAccounts:
sparkoperator:
annotations:
key1: value1
key2: value2
asserts:
- equal:
path: metadata.annotations.key1
value: value1
- equal:
path: metadata.annotations.key2
value: value2


@ -1,133 +0,0 @@
suite: Test spark rbac
templates:
  - spark-rbac.yaml
release:
  name: spark-operator
tests:
  - it: Should not render spark rbac resources if rbac.create is false and rbac.createRole is false
    set:
      rbac:
        create: false
        createRole: false
    asserts:
      - hasDocuments:
          count: 0
  - it: Should render spark role if rbac.create is true
    set:
      rbac:
        create: true
    documentIndex: 0
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark-role
  - it: Should render spark role if rbac.createRole is true
    set:
      rbac:
        createRole: true
    documentIndex: 0
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark-role
  - it: Should render spark role binding if rbac.create is true
    set:
      rbac:
        create: true
    documentIndex: 1
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark
  - it: Should render spark role binding if rbac.createRole is true
    set:
      rbac:
        createRole: true
    documentIndex: 1
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark
  - it: Should create a single spark role with namespace "" by default
    documentIndex: 0
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark-role
          namespace: ""
  - it: Should create a single spark role binding with namespace "" by default
    values:
      - ../values.yaml
    documentIndex: 1
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark
          namespace: ""
  - it: Should render multiple spark roles if sparkJobNamespaces is set with multiple values
    set:
      sparkJobNamespaces:
        - ns1
        - ns2
    documentIndex: 0
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark-role
          namespace: ns1
  - it: Should render multiple spark role bindings if sparkJobNamespaces is set with multiple values
    set:
      sparkJobNamespaces:
        - ns1
        - ns2
    documentIndex: 1
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark
          namespace: ns1
  - it: Should render multiple spark roles if sparkJobNamespaces is set with multiple values
    set:
      sparkJobNamespaces:
        - ns1
        - ns2
    documentIndex: 2
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark-role
          namespace: ns2
  - it: Should render multiple spark role bindings if sparkJobNamespaces is set with multiple values
    set:
      sparkJobNamespaces:
        - ns1
        - ns2
    documentIndex: 3
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark
          namespace: ns2


@ -1,112 +0,0 @@
suite: Test spark service account
templates:
  - spark-serviceaccount.yaml
release:
  name: spark-operator
tests:
  - it: Should not render service account if serviceAccounts.spark.create is false
    set:
      serviceAccounts:
        spark:
          create: false
    asserts:
      - hasDocuments:
          count: 0
  - it: Should render service account if serviceAccounts.spark.create is true
    set:
      serviceAccounts:
        spark:
          create: true
    asserts:
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark-operator-spark
  - it: Should use the specified service account name if serviceAccounts.spark.name is set
    set:
      serviceAccounts:
        spark:
          name: spark
    asserts:
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark
  - it: Should add extra annotations if serviceAccounts.spark.annotations is set
    set:
      serviceAccounts:
        spark:
          annotations:
            key1: value1
            key2: value2
    asserts:
      - equal:
          path: metadata.annotations.key1
          value: value1
      - equal:
          path: metadata.annotations.key2
          value: value2
  - it: Should create multiple service accounts if sparkJobNamespaces is set
    set:
      serviceAccounts:
        spark:
          name: spark
      sparkJobNamespaces:
        - ns1
        - ns2
        - ns3
    documentIndex: 0
    asserts:
      - hasDocuments:
          count: 3
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark
          namespace: ns1
  - it: Should create multiple service accounts if sparkJobNamespaces is set
    set:
      serviceAccounts:
        spark:
          name: spark
      sparkJobNamespaces:
        - ns1
        - ns2
        - ns3
    documentIndex: 1
    asserts:
      - hasDocuments:
          count: 3
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark
          namespace: ns2
  - it: Should create multiple service accounts if sparkJobNamespaces is set
    set:
      serviceAccounts:
        spark:
          name: spark
      sparkJobNamespaces:
        - ns1
        - ns2
        - ns3
    documentIndex: 2
    asserts:
      - hasDocuments:
          count: 3
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark
          namespace: ns3


@ -0,0 +1,182 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test Spark RBAC
templates:
  - spark/rbac.yaml
release:
  name: spark-operator
  namespace: spark-operator
tests:
  - it: Should not create RBAC resources for Spark if `spark.rbac.create` is false
    set:
      spark:
        rbac:
          create: false
    asserts:
      - hasDocuments:
          count: 0
  - it: Should create RBAC resources for Spark in namespace `default` by default
    documentIndex: 0
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark-operator-spark
          namespace: default
  - it: Should create RBAC resources for Spark in namespace `default` by default
    documentIndex: 1
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark-operator-spark
          namespace: default
      - contains:
          path: subjects
          content:
            kind: ServiceAccount
            name: spark-operator-spark
            namespace: default
      - equal:
          path: roleRef
          value:
            apiGroup: rbac.authorization.k8s.io
            kind: Role
            name: spark-operator-spark
  - it: Should create RBAC resources for Spark in every Spark job namespace
    set:
      spark:
        jobNamespaces:
          - ns1
          - ns2
    documentIndex: 0
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark-operator-spark
          namespace: ns1
  - it: Should create RBAC resources for Spark in every Spark job namespace
    set:
      spark:
        jobNamespaces:
          - ns1
          - ns2
    documentIndex: 1
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark-operator-spark
          namespace: ns1
      - contains:
          path: subjects
          content:
            kind: ServiceAccount
            name: spark-operator-spark
            namespace: ns1
      - equal:
          path: roleRef
          value:
            apiGroup: rbac.authorization.k8s.io
            kind: Role
            name: spark-operator-spark
  - it: Should create RBAC resources for Spark in every Spark job namespace
    set:
      spark:
        jobNamespaces:
          - ns1
          - ns2
    documentIndex: 2
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark-operator-spark
          namespace: ns2
  - it: Should create RBAC resources for Spark in every Spark job namespace
    set:
      spark:
        jobNamespaces:
          - ns1
          - ns2
    documentIndex: 3
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark-operator-spark
          namespace: ns2
      - contains:
          path: subjects
          content:
            kind: ServiceAccount
            name: spark-operator-spark
            namespace: ns2
      - equal:
          path: roleRef
          value:
            apiGroup: rbac.authorization.k8s.io
            kind: Role
            name: spark-operator-spark
  - it: Should use the specified service account name if `spark.serviceAccount.name` is set
    set:
      spark:
        serviceAccount:
          name: spark
    documentIndex: 0
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: Role
          name: spark
          namespace: default
  - it: Should use the specified service account name if `spark.serviceAccount.name` is set
    set:
      spark:
        serviceAccount:
          name: spark
    documentIndex: 1
    asserts:
      - containsDocument:
          apiVersion: rbac.authorization.k8s.io/v1
          kind: RoleBinding
          name: spark
          namespace: default
      - contains:
          path: subjects
          content:
            kind: ServiceAccount
            name: spark
            namespace: default
      - equal:
          path: roleRef
          value:
            apiGroup: rbac.authorization.k8s.io
            kind: Role
            name: spark


@ -0,0 +1,101 @@
#
# Copyright 2024 The Kubeflow authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
suite: Test spark service account
templates:
  - spark/serviceaccount.yaml
release:
  name: spark-operator
  namespace: spark-operator
tests:
  - it: Should not create service account if `spark.serviceAccount.create` is false
    set:
      spark:
        serviceAccount:
          create: false
    asserts:
      - hasDocuments:
          count: 0
  - it: Should create service account by default
    asserts:
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark-operator-spark
  - it: Should use the specified service account name if `spark.serviceAccount.name` is set
    set:
      spark:
        serviceAccount:
          name: spark
    asserts:
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark
  - it: Should add extra annotations if `spark.serviceAccount.annotations` is set
    set:
      spark:
        serviceAccount:
          annotations:
            key1: value1
            key2: value2
    asserts:
      - equal:
          path: metadata.annotations.key1
          value: value1
      - equal:
          path: metadata.annotations.key2
          value: value2
  - it: Should create service account for every non-empty spark job namespace if `spark.jobNamespaces` is set with multiple values
    set:
      spark:
        jobNamespaces:
          - ""
          - ns1
          - ns2
    documentIndex: 0
    asserts:
      - hasDocuments:
          count: 2
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark-operator-spark
          namespace: ns1
  - it: Should create service account for every non-empty spark job namespace if `spark.jobNamespaces` is set with multiple values
    set:
      spark:
        jobNamespaces:
          - ""
          - ns1
          - ns2
    documentIndex: 1
    asserts:
      - hasDocuments:
          count: 2
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark-operator-spark
          namespace: ns2
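
For orientation, every file in this diff follows the helm-unittest plugin's suite schema. Below is a minimal sketch assembled only from fields that appear in the suites above; the suite name and test case are illustrative, not part of the chart.

# Minimal helm-unittest suite (illustrative sketch)
suite: example suite                # human-readable suite name
templates:
  - spark/serviceaccount.yaml       # chart template rendered by each test
release:
  name: spark-operator              # simulated release name
  namespace: spark-operator         # simulated release namespace
tests:
  - it: Should render a service account
    set:                            # value overrides merged into the chart values
      spark:
        serviceAccount:
          create: true
    documentIndex: 0                # which rendered document the asserts target
    asserts:
      - hasDocuments:
          count: 1
      - containsDocument:
          apiVersion: v1
          kind: ServiceAccount
          name: spark-operator-spark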

Some files were not shown because too many files have changed in this diff.