Compare commits

...

302 Commits

Author SHA1 Message Date
Kubernetes Prow Robot 0ff374134f
Merge pull request #8239 from sbueringer/pr-update-docs
Remove redundant warning in the autoscaler Cluster API documentation
2025-06-12 12:36:56 -07:00
Kubernetes Prow Robot 31eb0a137d
Merge pull request #8181 from MenD32/fix/topology-spread-binpacking
fix: binpacking simulator scale up optimization on pods with topology…
2025-06-12 09:36:57 -07:00
Stefan Bueringer 6c12c60942
Remove redundant warning 2025-06-12 10:08:39 +02:00
Kubernetes Prow Robot 8014ae253d
Merge pull request #8236 from BigDarkClown/update-1.34
Update deps to use k8s 1.34.0-alpha.0
2025-06-11 08:22:56 -07:00
Bartłomiej Wróblewski efdd034d91 Update deps to use k8s 1.34.0-alpha.0 2025-06-11 15:13:28 +00:00
Kubernetes Prow Robot 8e7d62b26e
Merge pull request #8109 from abdelrahman882/dra-node-readiness
Handle node readiness for DRA after a scale-up
2025-06-11 07:40:56 -07:00
Kubernetes Prow Robot 5bc430e9a8
Merge pull request #8235 from mtrqq/dra-delta-snapshot
[DRA] Enable DeltaSnapshotStore to work when DRA is enabled
2025-06-11 05:04:56 -07:00
Omran b07e1e4c70
Handle node readiness for DRA after a scale-up 2025-06-11 08:54:53 +00:00
Maksym Fuhol 61328095ae Enable DeltaSnapshotStore to work when DRA is enabled 2025-06-11 08:29:53 +00:00
MenD32 0002157b3a nit: when scheduling fails on topology constraints, skip the last node that failed scheduling
Signed-off-by: MenD32 <amit.mendelevitch@gmail.com>
2025-06-11 09:38:49 +03:00
Kubernetes Prow Robot 134d636520
Merge pull request #8090 from mtrqq/dra-shapshot-patch
Patches based implementation for DRA snapshot.
2025-06-10 16:52:55 -07:00
Jack Francis a880a2bca3
Merge pull request #8207 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/updater/golang-1.24.4
Bump golang from 1.24.3 to 1.24.4 in /vertical-pod-autoscaler/pkg/updater
2025-06-10 12:09:31 -07:00
Jack Francis 4a989c0268
Merge pull request #8206 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/recommender/golang-1.24.4
Bump golang from 1.24.3 to 1.24.4 in /vertical-pod-autoscaler/pkg/recommender
2025-06-10 12:08:40 -07:00
Kubernetes Prow Robot 5ae532f340
Merge pull request #8148 from mohammaddehnavi/patch-1
bug: Fix the "VolumeAttachment" issue on the AWS providers example in the new version
2025-06-10 11:54:56 -07:00
Maksym Fuhol 98f86a71e6 Refactor PatchSet to be contained in a separate package for reusability. 2025-06-10 18:40:54 +00:00
Kubernetes Prow Robot 189e1397e9
Merge pull request #8204 from voelzmo/enh/improve-admission-controller-logging
include pod namespace when logging updates
2025-06-10 11:06:55 -07:00
Jack Francis 76f2bf8ae3
Merge pull request #8208 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/admission-controller/golang-1.24.4
Bump golang from 1.24.3 to 1.24.4 in /vertical-pod-autoscaler/pkg/admission-controller
2025-06-10 10:57:57 -07:00
Jack Francis 39487100ce
Merge pull request #8217 from elmiko/capi-readme-annotations
update cluster api provider readme
2025-06-10 10:18:30 -07:00
Jack Francis 67da65c813
Merge pull request #8219 from towca/jtuznik/drain-fix
Fix parsing --drain-priority-config
2025-06-10 10:11:26 -07:00
Kubernetes Prow Robot a187bc14d6
Merge pull request #8213 from jackfrancis/azure-agent-pool-ut
azure: fix azure_agent_pool UT
2025-06-10 09:22:25 -07:00
Jack Francis 79114057db azure: fix azure_agent_pool UT
Signed-off-by: Jack Francis <jackfrancis@gmail.com>
2025-06-10 08:55:51 -07:00
Kuba Tużnik 9e38ce69aa Fix parsing --drain-priority-config
The --drain-priority-config flag was only parsed if isFlagPassed()
returned true for it. However, isFlagPassed() silently never worked:
the implementation relied on walking the flags parsed by the standard
Go "flag" pkg. This seems like it would work, since CA defines its
flags using the standard "flag" pkg. However, the flags are then
actually parsed and processed by the "github.com/spf13/pflag" pkg,
so isFlagPassed() could never see them.

This commit removes isFlagPassed() and replaces the calls with the
pkg-provided pflag.CommandLine.Changed(). Unit tests are added
to verify that the flag is correctly parsed after this change.
2025-06-10 16:54:43 +02:00
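For readers unfamiliar with this failure mode, here is a minimal, self-contained sketch (not the CA code itself) of why an isFlagPassed() built on flag.Visit can never see flags that pflag parses, and how pflag.CommandLine.Changed() answers the same question correctly:

```go
package main

import (
	"flag"
	"fmt"

	"github.com/spf13/pflag"
)

func main() {
	// Defined via the standard library, the way CA defines its flags.
	drain := flag.String("drain-priority-config", "", "demo flag")

	// CA-style setup: hand the stdlib flags to pflag, then parse with pflag.
	pflag.CommandLine.AddGoFlagSet(flag.CommandLine)
	_ = pflag.CommandLine.Parse([]string{"--drain-priority-config=demo-value"})

	// Broken check: stdlib flag.Parse() never ran, so flag.Visit sees nothing.
	seen := false
	flag.Visit(func(f *flag.Flag) { seen = seen || f.Name == "drain-priority-config" })
	fmt.Println(seen) // false: an isFlagPassed() built this way is always false

	// Working check, as used by the fix: ask pflag, which did the parsing.
	fmt.Println(pflag.CommandLine.Changed("drain-priority-config")) // true
	fmt.Println(*drain)                                             // "demo-value"
}
```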
elmiko 9987e76e91 update cluster api provider readme
this change adds some notes about how the scale from zero annotations
will interact with information already present on cluster api resources.
2025-06-10 09:56:54 -04:00
Maksym Fuhol f03a67ed81 Migrate DraSnapshot off PodOwnsClaim/ClaimOwningPod to DRA API IsForPod 2025-06-09 15:45:13 +00:00
Maksym Fuhol 0912f9b4fb Add tests for Patch and PatchSet objects 2025-06-09 15:35:20 +00:00
Maksym Fuhol eb48666180 Patches based implementation for DRA snapshot.
Instead of exposing a DeepCloning API, dynamicresources.Snapshot now exposes Fork/Commit/Revert methods mimicking the operations in ClusterSnapshot. Instead of storing full-scale copies of the DRA snapshot, we now store a single object with the list of patches inside, which allows very cheap Fork/Commit/Revert operations while sacrificing some performance and memory allocation on fetch requests and in-place object modifications (ResourceClaims).
2025-06-09 15:35:19 +00:00
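As a rough illustration of the patch-list idea (a toy sketch only; the real dynamicresources.Snapshot API differs and handles deletions, ResourceClaims, etc.), Fork pushes an overlay, Revert pops it, Commit folds it into the layer below, and reads walk the layers newest-first:

```go
package snapshot

// patchSnapshot layers overlay maps over a base, so Fork is O(1)
// instead of a full deep copy of the snapshot state.
type patchSnapshot struct {
	layers []map[string]string // layers[0] is the base; later layers are patches
}

func newPatchSnapshot(base map[string]string) *patchSnapshot {
	return &patchSnapshot{layers: []map[string]string{base}}
}

func (s *patchSnapshot) Fork()   { s.layers = append(s.layers, map[string]string{}) }
func (s *patchSnapshot) Revert() { s.layers = s.layers[:len(s.layers)-1] }

// Commit folds the newest patch into the layer below it.
func (s *patchSnapshot) Commit() {
	top := s.layers[len(s.layers)-1]
	s.layers = s.layers[:len(s.layers)-1]
	for k, v := range top {
		s.layers[len(s.layers)-1][k] = v
	}
}

func (s *patchSnapshot) Set(k, v string) { s.layers[len(s.layers)-1][k] = v }

// Get walks patches newest-first: the read-side cost traded for cheap
// Fork/Commit/Revert, exactly the trade-off the commit message describes.
func (s *patchSnapshot) Get(k string) (string, bool) {
	for i := len(s.layers) - 1; i >= 0; i-- {
		if v, ok := s.layers[i][k]; ok {
			return v, true
		}
	}
	return "", false
}
```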
Maksym Fuhol 35029fed7a Add PredicateSnapshot benchmark for scheduling with DRA objects. 2025-06-09 15:32:37 +00:00
dependabot[bot] 8a975ee130
Bump golang in /vertical-pod-autoscaler/pkg/admission-controller
Bumps golang from 1.24.3 to 1.24.4.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-06 18:54:44 +00:00
dependabot[bot] 836f0a078c
Bump golang in /vertical-pod-autoscaler/pkg/updater
Bumps golang from 1.24.3 to 1.24.4.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-06 18:39:05 +00:00
dependabot[bot] 919665f93d
Bump golang in /vertical-pod-autoscaler/pkg/recommender
Bumps golang from 1.24.3 to 1.24.4.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-06 18:26:17 +00:00
Marco Voelz 18b96f2bc5 include pod namespace when logging updates 2025-06-06 14:44:08 +02:00
Kubernetes Prow Robot 3fdf196403
Merge pull request #8171 from maximrub/fix/bug-8170-alibaba-rrsa-old-envvar-support
fix bug 8170 to support old env var
2025-06-06 03:58:39 -07:00
Kubernetes Prow Robot 9220470d9f
Merge pull request #8188 from towca/jtuznik/tni-fix
CA: stop mutating the result of .TemplateNodeInfo() in SanitizedTemplateNodeInfoFromNodeGroup
2025-06-04 09:20:38 -07:00
Kubernetes Prow Robot 606aef22cb
Merge pull request #7852 from omerap12/migrate-ClaimOwningPod
migrate ClaimOwningPod to use upstream IsForPod
2025-06-04 06:24:38 -07:00
Kuba Tużnik 2f6c19a171 CA: stop mutating the result of .TemplateNodeInfo() in SanitizedTemplateNodeInfoFromNodeGroup
SanitizedTemplateNodeInfoFromNodeGroup() calls .TemplateNodeInfo(),
adds deprecated labels to the result, then sanitizes it - which includes
deep-copying.

This commit moves adding the deprecated labels after the sanitization
process, directly to the deep-copied result.

If .TemplateNodeInfo() caches its result internally, and is called
from a CloudProvider-specific goroutine at the same time as
SanitizedTemplateNodeInfoFromNodeGroup(), CA panics because of a
concurrent map read/write. This change removes the race condition.
2025-06-04 14:46:16 +02:00
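The fix reduces to a general rule: never mutate an object that may be cached and read concurrently; deep-copy first, then mutate the private copy. A minimal sketch with hypothetical names (not the actual CA code):

```go
package sketch

// Before: mutates the (possibly cached, concurrently read) template in place
// before copying - the concurrent map read/write panic described above.
func sanitizedBefore(template map[string]string) map[string]string {
	template["deprecated-label"] = "value" // racy write to a shared map
	return deepCopy(template)
}

// After: deep-copy first, then add the deprecated labels to the copy only.
func sanitizedAfter(template map[string]string) map[string]string {
	out := deepCopy(template)
	out["deprecated-label"] = "value" // safe: no other goroutine sees this map
	return out
}

func deepCopy(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}
```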
Kubernetes Prow Robot 46bf4e4ace
Merge pull request #8202 from adrianmoisey/revert-8191-enable-flakey-tests
Revert "Enable TestUnchangedCAReloader tests"
2025-06-04 03:42:38 -07:00
Adrian Moisey 782af09511
Revert "Enable TestUnchangedCAReloader tests" 2025-06-04 12:36:29 +02:00
Kubernetes Prow Robot 8b9624d7f1
Merge pull request #8185 from kubernetes/x13n-patch-11
Remove stale TODO
2025-06-04 01:30:38 -07:00
Thalia Wang 70f79316f6
Add VMs implementation to master (#8078) 2025-06-03 14:40:39 -07:00
Omer Aplatony bcf3866fb8 migrate ClaimOwningPod to use upstream IsForPod
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-06-03 15:45:20 +00:00
Kubernetes Prow Robot daa15703a6
Merge pull request #8178 from elmiko/cleanup-capi-readme
cleanup capi readme note sections
2025-06-03 08:36:41 -07:00
Kubernetes Prow Robot 110cb78422
Merge pull request #8197 from hetznercloud/deps-update-hcloud-go
deps(hetzner): update vendored hcloud-go to v2.21.1
2025-06-03 03:14:43 -07:00
lukasmetzner 2a9103ea4c refactor(hetzner): fix deprecated methods 2025-06-03 11:54:01 +02:00
lukasmetzner 53f7463ee6 deps(hetzner): update hcloud-go to v2.21.1 2025-06-03 11:52:48 +02:00
Kubernetes Prow Robot 347db2d102
Merge pull request #8194 from laoj2/master
Update VPA default version to 1.4.1 (main branch)
2025-06-02 13:18:41 -07:00
Luiz Oliveira fc8599b820 Update VPA default version to 1.4.1 (main branch)
Signed-off-by: Luiz Oliveira <luizaoj@google.com>
2025-06-02 14:41:04 -04:00
Kubernetes Prow Robot 8ae1ad7343
Merge pull request #8192 from adrianmoisey/fix-github-action-caching
Fix the setup-go Github action caching
2025-06-02 08:14:38 -07:00
Adrian Moisey 4c154a7da8
Bump to Go 1.24 2025-06-02 16:25:44 +02:00
Adrian Moisey e1cb498992
Fix the setup-go Github action caching
I noticed that the Github Action caching wasn't working, so I added this
setting to fix it.

https://github.com/actions/setup-go#caching-dependency-files-and-build-outputs

If the cache is populated, the tests now take 5 minutes to run (down from
18m).

The one downside is that caching will also cache the test results (along
with the dependencies), so this change adds `--count=1` to the `go test`
commands to ensure that go test doesn't use cached results.
2025-06-02 09:59:01 +02:00
Mohammad Torkamandehnavi abe3e86e90
chore: 🔧 Update image version to v1.32.1 2025-06-01 14:47:57 +01:00
Mohammad Torkamandehnavi e79b5b8635
fix: 🐛 Fix failed to watch *v1.VolumeAttachment in other example files 2025-06-01 14:46:48 +01:00
Kubernetes Prow Robot c0443a7e7c
Merge pull request #8191 from adrianmoisey/enable-flakey-tests
Enable TestUnchangedCAReloader tests
2025-05-31 22:02:16 -07:00
Adrian Moisey 8d90da9ac5
Enable TestUnchangedCAReloader tests 2025-05-31 14:39:55 +02:00
MenD32 8fd9e1f04d tests: reverted to old test behavior
Signed-off-by: MenD32 <amit.mendelevitch@gmail.com>
2025-05-30 21:43:01 +03:00
Kubernetes Prow Robot 2511e4485c
Merge pull request #8164 from MenD32/fix/hard-topology-spread-constraints-stop-scaledown
fix: Cluster Autoscaler not scaling down nodes where Pods with hard topology spread constraints are scheduled
2025-05-30 06:12:19 -07:00
MenD32 f038712a13 fix: added check for hostname to minimize bin-packing compute
Signed-off-by: MenD32 <amit.mendelevitch@gmail.com>
2025-05-30 15:13:53 +03:00
Kubernetes Prow Robot e69b1a90d0
Merge pull request #8182 from vitanovs/feat/update-quickstart-guide-with-in-place-update-mode-details
docs(vpa): Add `InPlaceOrRecreate` to `quickstart` guide `updateMode`(s) list
2025-05-29 12:32:17 -07:00
MenD32 886516c2cf Merge branch 'master' into fix/topology-spread-binpacking 2025-05-29 22:25:43 +03:00
Stoyan Vitanov d49fb8672d docs(vpa): Label `InPlaceOrRecreate` mode as an `alpha feature` in `quick start` guide
This commit adds a dedicated `alpha feature` suffix to `InPlaceOrRecreate`
updateMode to indicate that the option is not available by default and
requires additional setup to enable it.

Also:
- Include reference to the `features.md` documentation to bring
  additional context to the reader and point them to the feature
  details.
2025-05-29 17:41:02 +03:00
Daniel Kłobuszewski a361d25ce6
Remove stale TODO
We had async node drain for years now.
2025-05-29 16:29:58 +02:00
Stoyan Vitanov d69fa69882 docs(vpa): Update the number of supported updateModes in quick start guide 2025-05-29 15:44:00 +03:00
Stoyan Vitanov c54764998b docs(vpa): Remove redundant mention of `in-place` updates in `"Recreate"` updateMode
This commit removes the redundant mention of `in-place` updates being
available with `"Auto"` update mode as that's not the case with the
current `v1.4.0` version of `VPA`.
2025-05-29 15:02:29 +03:00
Stoyan Vitanov 3a1842b83b docs(vpa): Remove redundant `in-place` reference from `"Auto"` updateMode
This commit removes the now redundant mention of `in-place`
capability from `"Auto"` as the currently released version `v1.4.0` does
not instrument `in-place` updates as an `"Auto"` default strategy.

As of now, `"Auto"` preserves its behavior of performing `"Recreate"` when
specified.

There is a separate `updateMode` (the `"InPlaceOrRecreate"`) specified
within the `quickstart.md` guide that handles `in-place` updates.
2025-05-29 12:23:43 +03:00
Daniel Kłobuszewski 9c501ed6b9
Update Cluster Autoscaler release schedule (#8174)
* Update Cluster Autoscaler release schedule

* Swap jackfrancis & gjtempleton shifts
2025-05-29 02:18:18 -07:00
Stoyan Vitanov 6250a5169c docs(vpa): Add `InPlaceOrRecreate` updateMode details to `quickstart` guide
This commit adds an entry for the `InPlaceOrRecreate` _updateMode_ that got
introduced as part of the `v1.4.0` release.
2025-05-29 11:45:24 +03:00
Kubernetes Prow Robot aed9602edf
Merge pull request #8179 from laoj2/fix-kustomization
Add admission controller service to kustomization
2025-05-28 13:48:17 -07:00
MenD32 81a348d0e3 fix: binpacking simulator scale up optimization on pods with topology spread constraint
The binpacking simulator will now consider old nodes when trying to pack pods with topology spread constraints, in order to avoid unnecessary scale-ups. The previous behavior did not consider that nodes that were once unschedulable within the pod equivalence group can become schedulable for a pod. This can happen with topology spread constraints, since node scale-ups can increase the global minimum, thus allowing existing nodes to schedule pods due to the increase in global_minimum+max_skew.

Signed-off-by: MenD32 <amit.mendelevitch@gmail.com>
2025-05-28 23:43:43 +03:00
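The mechanism behind this: a topology spread constraint admits a pod into a domain only if the domain's pod count after placement stays within maxSkew of the global minimum across domains, so adding capacity elsewhere can raise that minimum and make a previously rejected node admissible. A toy version of the check (illustrative; the real logic lives in the scheduler's PodTopologySpread plugin):

```go
package sketch

import "math"

// fitsSkew reports whether placing one more pod into `domain` keeps
// count(domain)+1 - globalMin <= maxSkew.
func fitsSkew(counts map[string]int, domain string, maxSkew int) bool {
	globalMin := math.MaxInt
	for _, c := range counts {
		if c < globalMin {
			globalMin = c
		}
	}
	return counts[domain]+1-globalMin <= maxSkew
}
```

For example, with maxSkew=1 and counts {zoneA: 2, zoneB: 1}, a pod is rejected from zoneA (the skew would be 2); once a scale-up lets a pod land in zoneB, the global minimum rises to 2 and zoneA becomes schedulable again, which is exactly the case the previous pod-equivalence-group caching missed.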
Luiz Antonio 69f24464fa Add admission controller service to kustomization 2025-05-28 16:27:43 -04:00
elmiko 6063b5fa17 cleanup capi readme note sections
make all those section look the same.
2025-05-28 15:58:10 -04:00
Kubernetes Prow Robot 05008b2488
Merge pull request #8050 from HadrienPatte/instance-types
CA - AWS - May25 Instance Update
2025-05-28 08:46:18 -07:00
Kubernetes Prow Robot 8341862f85
Merge pull request #8177 from elmiko/update-capi-readme
update wording in clusterapi readme on machinepools
2025-05-28 08:06:18 -07:00
elmiko c579b37f9e update wording in clusterapi readme on machinepools
removing the notes about it being an experimental feature.
2025-05-28 10:44:50 -04:00
MenD32 3a2933a24c docs: added helpful comments
Signed-off-by: MenD32 <amit.mendelevitch@gmail.com>
2025-05-27 21:31:47 +03:00
Kubernetes Prow Robot 9cf529cbc4
Merge pull request #8172 from ialidzhikov/fix/client-go-version
e2e: Use correct version of k8s.io/client-go
2025-05-27 08:16:17 -07:00
Ismail Alidzhikov db2f4684c5 e2e: Use correct version of k8s.io/client-go 2025-05-27 17:55:25 +03:00
Hadrien Patte 432cb11830
CA - AWS - May25 Instance Update 2025-05-27 16:43:12 +02:00
Maxim Rubchinsky 4ec085aec3
fix bug 8170 to support old env var
Signed-off-by: Maxim Rubchinsky <maxim@rubchinsky.com>
2025-05-26 23:30:07 +03:00
Maxim Rubchinsky e4bac533d5
fix bug 8170 to support old env var
Signed-off-by: Maxim Rubchinsky <maxim@rubchinsky.com>
2025-05-26 22:57:30 +03:00
Kubernetes Prow Robot bb8fe52ddc
Merge pull request #8165 from nickytd/set-client-go-version
Set correct k8s/client-go version in go.mod
2025-05-26 06:26:16 -07:00
Niki Dokovski fd76cb274b
Fix k8s/client-go version in go.mod 2025-05-26 09:25:53 +02:00
MenD32 4e8bd0ada5 tests: added test to check that scaledowns work with topology spread constraints
Signed-off-by: MenD32 <amit.mendelevitch@gmail.com>
2025-05-24 15:34:19 +03:00
MenD32 ea1c308130 fix: hard topology spread constraints stop scaledown
Signed-off-by: MenD32 <amit.mendelevitch@gmail.com>
2025-05-24 15:09:38 +03:00
Kubernetes Prow Robot 3b22e163c2
Merge pull request #8163 from raywainman/vpa-release-tag
Add tag script for VPA to make release easier and less error-prone
2025-05-24 00:46:41 -07:00
Ray Wainman cec8d3ead8 Add tag script for VPA to make release easier and less error-prone 2025-05-23 20:46:16 +00:00
Kubernetes Prow Robot fabcbe5b38
Merge pull request #8159 from bedsteye/migrate-klog-flush
migrate os.Exit(255) to klog.FlushAndExit
2025-05-23 07:26:37 -07:00
Kubernetes Prow Robot adf59d447b
Merge pull request #8156 from BigDarkClown/master
Rewrite TestCloudProvider to use builder pattern
2025-05-23 06:28:36 -07:00
Guy B.B 1c7958698e trigger cla
Signed-off-by: Guy B.B <guy212122@gmail.com>
2025-05-23 15:48:14 +03:00
Bartłomiej Wróblewski 2c7d8dc378 Rewrite TestCloudProvider to use builder pattern 2025-05-23 12:42:15 +00:00
Guy B.B 168a26ef42 migrate os.Exit(255) to klog.FlushAndExit
Signed-off-by: Guy B.B <guy212122@gmail.com>
2025-05-23 15:18:59 +03:00
Kubernetes Prow Robot 0d8f587118
Merge pull request #8153 from raywainman/vpa-release-140
VPA 1.4.0 release updates in main branch
2025-05-22 07:30:35 -07:00
Kubernetes Prow Robot ca0628ac82
Merge pull request #8155 from raywainman/vpa-release-instructions
Update VPA release instructions with correct sequencing for tags
2025-05-21 23:10:36 -07:00
Ray Wainman e3ca0a4a98 Update release instructions with correct sequencing for tags 2025-05-22 00:30:51 +00:00
Ray Wainman adc7e12e1e Update VPA default version to 1.4.0 2025-05-22 00:17:47 +00:00
Kubernetes Prow Robot d1d65772e3
Merge pull request #8119 from kubernetes/raywainman-patch-3
Update VPA version.go to 1.5.0 for release
2025-05-21 09:10:36 -07:00
Kubernetes Prow Robot 086fd4426f
Merge pull request #8144 from aman4433/feature-gate-fix
updating the script to handle multi-line flag descriptions
2025-05-21 04:32:36 -07:00
Aman Shrivastava 83b370ec5a updating the script to handle multi-line flag descriptions 2025-05-21 16:41:32 +05:30
Kubernetes Prow Robot d0a297372a
Merge pull request #7992 from voelzmo/enh/concurrent-recommender
Make VPA and Checkpoint updates concurrent
2025-05-21 01:18:34 -07:00
Mohammad Dehnavi 442ad2d5b2
fix: 🐛 Fix failed to watch *v1.VolumeAttachment issue 2025-05-20 10:39:58 +01:00
Kubernetes Prow Robot 708f44213b
Merge pull request #8137 from DigitalVeer/master
pricing changes: updated z3 pricing information
2025-05-20 01:29:15 -07:00
Marco Voelz 0ee2d8a2a7 Run `hack/generate-flags.sh` 2025-05-19 16:41:54 +02:00
Marco Voelz b2a081db94 Deprecate min-checkpoints flag and make it a no-op 2025-05-19 16:41:54 +02:00
Mohammad Dehnavi 7da6fb2961
chore: 🔧 Update image version 2025-05-19 07:32:10 +01:00
Kubernetes Prow Robot bf74f7afd3
Merge pull request #8136 from omerap12/add-metric-for-failed-in-place
Add metric for failed in-place update attempt
2025-05-18 08:39:14 -07:00
Omer Aplatony b1ed5ce4bc Add metric to docs
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-05-18 06:51:23 +00:00
Omer Aplatony 6a2b950b7b Add metric for failed in-place update attempt
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-05-17 13:54:59 +00:00
Veer Singh a9c37f71dc pricing changes: updated z3 pricing information 2025-05-17 09:05:05 +00:00
Kubernetes Prow Robot f3023467d2
Merge pull request #8126 from elmiko/update-capi-readme
update clusterapi readme with node group limit info
2025-05-16 12:41:13 -07:00
Kubernetes Prow Robot 26f59b5f21
Merge pull request #8131 from kubernetes/x13n-patch-9
Use go1.24 for Cluster Autoscaler builds
2025-05-16 02:35:14 -07:00
Kubernetes Prow Robot 3374be70d0
Merge pull request #8124 from jbtk/estimator_fix
Prevent the binpacking estimator from retrying to add additional nodes
2025-05-15 08:13:14 -07:00
Daniel Kłobuszewski 92f087b386
Use go1.24 for Cluster Autoscaler builds 2025-05-15 15:53:12 +02:00
Justyna Betkier c7f7cb5de8 Prevent the binpacking estimator from retrying to add additional nodes
when reaching the limits.
2025-05-15 14:21:57 +02:00
Kubernetes Prow Robot 2abc138872
Merge pull request #8121 from atwamahmoud/master
Cluster Autoscaler: Disables `printf` check for go vet in CA testing, Updates go version to go1.24.0
2025-05-14 02:25:19 -07:00
Kubernetes Prow Robot c85f22f7dd
Merge pull request #7798 from omerap12/migrate-claimReservedForPod
migrate claimReservedForPod to use upstream IsReservedForPod
2025-05-13 09:57:16 -07:00
Mahmoud Atwa 24ef1f4319 Cluster Autoscaler: Disable printf analyzer & update go version 2025-05-13 16:27:08 +00:00
elmiko e03f7068d5 update clusterapi readme with node group limit info
this change is adding an extra note to the readme to help users
understand how the autoscaler works when it is outside the minimum and
mamximum limits. it is being added to help inform users and also because
the readme is embedded in clusterapi documentation.
2025-05-13 10:45:35 -04:00
Kubernetes Prow Robot 6b55dc9009
Merge pull request #8120 from kubernetes/raywainman-patch-4
Update VPA release instructions with correct tag command
2025-05-12 22:35:16 -07:00
Kubernetes Prow Robot 8d45fcd183
Merge pull request #8122 from toVersus/chore/vpa-inplace-metrics-name-typo
Fix typos in the metric name for VPA In-Place update
2025-05-12 21:41:15 -07:00
Tsubasa Nagasawa 8e8979c91a Fix typos in the metric name for VPA In-Place update
Signed-off-by: Tsubasa Nagasawa <toversus2357@gmail.com>
2025-05-13 08:53:04 +09:00
Daniel Kłobuszewski baa6c52c47 Bump go version in builder 2025-05-12 22:27:03 +00:00
Daniel Kłobuszewski e42d50d191 Update Kubernetes deps to 1.33.0-beta.0 2025-05-12 22:27:03 +00:00
Ray Wainman 6076fb36f1
Update VPA release instructions with correct tag command 2025-05-12 17:49:44 -04:00
Ray Wainman a1ab8bc55f
Update VPA version.go to 1.5.0 for release 2025-05-12 17:25:21 -04:00
Kubernetes Prow Robot 2b33c4c790
Merge pull request #8115 from maxcao13/kubernetes-in-place-updates
VPA: Implement in-place updates support
2025-05-12 11:27:16 -07:00
Kubernetes Prow Robot 2ca75135fb
Merge pull request #7694 from mtrqq/bug/clean-instance-templates
Fix memory leak while fetching GCE instance templates.
2025-05-12 04:13:16 -07:00
Kubernetes Prow Robot 2937e9c3da
Merge pull request #8117 from jlamillan/jlamillan/oci_sdk_65.90
Upgrade OCI provider SDK to v65.90.0. Required for Go 1.24.
2025-05-12 03:45:15 -07:00
jesse.millan 3fd510bb5a
Upgrade OCI provider SDK to v65.90.0. Required for Go 1.24. 2025-05-10 22:57:16 -07:00
Max Cao 3039f3cc92
VPA: upgrade InPlacePodVerticalScaling internal logic to k8s 1.33
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-09 13:38:51 -07:00
Max Cao 66b4c962d6
VPA: fix InPlaceOrRecreate feature gate version
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-09 13:38:51 -07:00
Max Cao 2a3764d007
VPA: use sha256 digest for local kind image
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-09 13:38:47 -07:00
Kubernetes Prow Robot ff5595519e
Merge pull request #8114 from d-honeybadger/priority-expander-docs-corrections
correct documentation on cluster-autoscaler priority expander configmap
2025-05-09 08:29:33 -07:00
Kubernetes Prow Robot a65fdb4031
Merge pull request #8081 from raywainman/vpa-release-instructions
VPA - Update release instructions for automatically built images
2025-05-09 07:21:16 -07:00
Ray Wainman 7bbb443a0f
Update vertical-pod-autoscaler/RELEASE.md
Co-authored-by: Adrian Moisey <adrian@changeover.za.net>
2025-05-09 10:02:53 -04:00
dkomsa b7fa3cd01d correct and clarify documentation on cluster-autoscaler priority expander configmap
Signed-off-by: dkomsa <dkomsa@digitalocean.com>
2025-05-09 09:27:30 -04:00
Max Cao 4f18830d51
VPA: refactor e2e test ginkgo wrapper functions
This commit refactors the VPA e2e test ginkgo wrappers so that we can easily supply ginkgo decorators.
This allows us to add ginkgo v2 labels to suites so that later we can run only the feature-gate (FG) tests.
For now, this would only be useful for FG:InPlaceOrRecreate

Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:40:37 -07:00
Max Cao 087e946e1a
VPA: add InPlaceOrRecreate e2e tests
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:40:37 -07:00
Max Cao 8a9a4b8c96
VPA: bump up overall e2e test timeout
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:39:38 -07:00
Omer Aplatony 8806d180c2
Update features.md
Co-authored-by: Max Cao <macao@redhat.com>
2025-05-08 14:39:38 -07:00
Omer Aplatony 036a482bc9
Adjust comments
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-05-08 14:39:38 -07:00
Omer Aplatony 94d55a5f7b
Update vertical-pod-autoscaler/docs/features.md
Co-authored-by: Adrian Moisey <adrian@changeover.za.net>
2025-05-08 14:39:38 -07:00
Omer Aplatony 11e7560180
Add docs for in-place updates
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-05-08 14:39:37 -07:00
Max Cao c5eecc6c4c
address raywainman and omerap12 comments
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:39:37 -07:00
Max Cao 9eac8fc5c5
VPA: refactor in-place and eviction logic
This commit refactors inplace logic outside of the pods eviction restriction and separates them into their own files.
Also this commit adds PatchResourceTarget to calculators to allow them to explicitly specify to the caller which
resource/subresource they should be patched to. This commit also creates a utils subpackage in order to prevent
dependency cycles in the unit tests, and adds various unit tests. Lastly, this commit adds a rateLimiter specifically
for limiting inPlaceResize API calls.

Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:39:37 -07:00
Max Cao 15883dce79
VPA: Update vpa-rbac.yaml for allowing in place resize requests
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:39:37 -07:00
Max Cao d6376c48f6
VPA: fixup vpa-process-yaml.sh script
The script needs to also check if the yaml input is a Deployment, and no longer needs to check for vpa-component names.

Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:39:37 -07:00
Max Cao 7df0c2fcbc
VPA: Updater in-place updates unit tests
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:39:37 -07:00
Max Cao 6ebeb83f1d
VPA: Allow updater to actuate InPlaceOrRecreate updates
Introduces large changes in the updater component to allow InPlaceOrRecreate mode.
If the feature gate is enabled and the VPA update mode is InPlaceOrRecreate, the updater will attempt an in-place update by first
checking a number of preconditions before actuation (e.g., whether the pod's qosClass would be changed, whether we are already in-place resizing,
whether an in-place update may potentially violate disruption (previously eviction) tolerance, etc.).
After the preconditions are validated, we send an update signal to the InPlacePodVerticalScaling API with the recommendation, which may or may not fail.
Failures are handled in subsequent updater loops.

As for implementation details, patchCalculators have been re-used from the admission-controller code in order to calculate recommendations for the updater to actuate.
InPlace logic has been mostly stuffed in the eviction package for now because of similarities and ease (user-initiated API calls: eviction vs. in-place; both cause disruption).
It may or may not be useful to refactor this later.

Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:38:54 -07:00
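In outline, the updater flow described above looks something like this sketch; every helper name here is a hypothetical stand-in, not the real updater API:

```go
package updater

type Pod struct{ Name string }

// Hypothetical stand-ins for the preconditions named in the commit message.
func wouldChangeQOSClass(Pod) bool         { return false }
func alreadyResizing(Pod) bool             { return false }
func violatesDisruptionTolerance(Pod) bool { return false }
func evict(Pod) error                      { return nil }
func requestInPlaceResize(Pod) error       { return nil }

// attemptInPlaceOrFallBack validates preconditions, then signals the
// InPlacePodVerticalScaling API; a failed resize surfaces in later loops.
func attemptInPlaceOrFallBack(p Pod) error {
	if wouldChangeQOSClass(p) || alreadyResizing(p) || violatesDisruptionTolerance(p) {
		return evict(p) // fall back to the Recreate path
	}
	return requestInPlaceResize(p)
}
```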
John Kyros 2af23c885b
VPA: Add metrics gauges for in-place updates
We might want to add a few more that are combined disruption counters,
e.g. in-place + eviction totals, but for now just add some separate
counters to keep track of what in-place updates are doing.
2025-05-08 14:38:36 -07:00
Max Cao b37a3eb264
VPA: Allow admission-controller to validate in-place spec
Only allow VPA objects with the InPlaceOrRecreate update mode to be created if the InPlaceOrRecreate feature gate is enabled. If a VPA object already exists with this mode on, and the feature gate is disabled, this prevents further objects from being created with InPlaceOrRecreate, but does not prevent the existing InPlaceOrRecreate VPA objects from being modified.

Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:38:36 -07:00
Max Cao eb153611ff
VPA: Allow deploying InPlaceOrRecreate in local e2e and ci
Allows you to specify an env var FEATURE_GATES which adds feature gates to all vpa components during vpa-up and e2e tests.
Also allows local e2e tests to run kind with a new kind-config file which enables KEP-1287 InPlacePodVerticalScaling feature gate.
Separates the admission-controller service into a separate deploy manifest.

Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:38:35 -07:00
Max Cao 6f86a9852f
VPA: Introduce VPA feature gates; add InPlaceOrRecreate feature gate
Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:37:29 -07:00
John Kyros 770b76797f
VPA: Add UpdateModeInPlaceOrRecreate to types
This adds the UpdateModeInPlaceOrRecreate mode to the types so we
can use it.

Signed-off-by: Max Cao <macao@redhat.com>
2025-05-08 14:37:29 -07:00
Kubernetes Prow Robot b98a5ffc16
Merge pull request #8061 from Uladzislau97/dra-provider
Move DRA provider to autoscaling context.
2025-05-08 09:27:14 -07:00
Vlad Vasilyeu f32d6cd542 Move DRA provider to autoscaling context. 2025-05-08 09:30:55 +00:00
Kubernetes Prow Robot 34115f8aa0
Merge pull request #8108 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/admission-controller/golang-1.24.3
Bump golang from 1.24.2 to 1.24.3 in /vertical-pod-autoscaler/pkg/admission-controller
2025-05-07 19:35:16 -07:00
Kubernetes Prow Robot d8050d79bc
Merge pull request #8106 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/recommender/golang-1.24.3
Bump golang from 1.24.2 to 1.24.3 in /vertical-pod-autoscaler/pkg/recommender
2025-05-07 19:33:15 -07:00
Kubernetes Prow Robot 53b0f037c7
Merge pull request #8107 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/updater/golang-1.24.3
Bump golang from 1.24.2 to 1.24.3 in /vertical-pod-autoscaler/pkg/updater
2025-05-07 19:31:17 -07:00
dependabot[bot] eea2dcd400
Bump golang in /vertical-pod-autoscaler/pkg/admission-controller
Bumps golang from 1.24.2 to 1.24.3.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-07 19:02:02 +00:00
Ray Wainman d110d05b4b
Update vertical-pod-autoscaler/RELEASE.md
Co-authored-by: Adrian Moisey <adrian@changeover.za.net>
2025-05-07 14:56:46 -04:00
dependabot[bot] 28581fc17e
Bump golang in /vertical-pod-autoscaler/pkg/updater
Bumps golang from 1.24.2 to 1.24.3.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-07 18:47:55 +00:00
dependabot[bot] 5d8f61a690
Bump golang in /vertical-pod-autoscaler/pkg/recommender
Bumps golang from 1.24.2 to 1.24.3.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-07 18:13:46 +00:00
Kubernetes Prow Robot 79a1375afe
Merge pull request #8101 from x13n/estimate-metric
Use exponential rather than arbitrary bucketing
2025-05-07 06:23:16 -07:00
Daniel Kłobuszewski deec4b7fbd Use exponential rather than arbitrary bucketing
Also adds an additional bucket to be able to capture more data points at
the high end.
2025-05-07 14:44:59 +02:00
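For context, client_golang generates such bounds with prometheus.ExponentialBuckets(start, factor, count); a hedged sketch (the metric name and bounds here are illustrative, not necessarily the commit's):

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// Buckets 1, 2, 4, ..., 2048: each bucket doubles, and the extra top
// bucket captures more data points at the high end.
var binpackingHistogram = prometheus.NewHistogram(prometheus.HistogramOpts{
	Namespace: "cluster_autoscaler",
	Name:      "example_binpacking_histogram", // illustrative name
	Help:      "Example of exponential rather than arbitrary bucketing.",
	Buckets:   prometheus.ExponentialBuckets(1, 2, 12),
})

func init() {
	prometheus.MustRegister(binpackingHistogram)
}
```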
Kubernetes Prow Robot 6ad982c932
Merge pull request #8099 from x13n/estimate-metric
Introduce a metric measuring heterogeneity of binpacking
2025-05-07 04:53:17 -07:00
Kubernetes Prow Robot 454e70ce15
Merge pull request #8097 from norbertcyran/fix-plugin-runner-panic
Prevent nil dereference of preFilterStatus
2025-05-07 03:17:16 -07:00
Norbert Cyran 6ab7e2eb78 Prevent nil dereference of preFilterStatus 2025-05-07 10:38:20 +02:00
Daniel Kłobuszewski 99121ebf42 Introduce a metric measuring heterogeneity of binpacking
Binpacking estimator has optimizations to allow homogeneous workloads to
be efficiently processed together. Measuring heterogeneity is required
to gain insight into cases when these optimizations matter and when they
don't.
2025-05-06 21:49:10 +02:00
Kubernetes Prow Robot 8bc75e5b58
Merge pull request #8091 from ialidzhikov/fix/vpa-updater-events-rbac
vpa-updater: Allow patch and update for events
2025-05-06 05:11:13 -07:00
Ismail Alidzhikov 9f0ec719a4 vpa-updater: Allow patch and update for events 2025-05-05 14:41:30 +03:00
Kubernetes Prow Robot bf23616cf0
Merge pull request #8059 from vitanovs/feat/vpa-recommender-prometheus-client-round-tripper
metrics(recommender): Add `RoundTripper` metrics to Prometheus API client
2025-05-05 01:29:58 -07:00
Kubernetes Prow Robot 0a9e1630a3
Merge pull request #8089 from adrianmoisey/bump-for-k8s1.33
Bump VPA Go dependencies prior to next release
2025-05-05 00:13:56 -07:00
Kubernetes Prow Robot 9cdcc284ea
Merge pull request #8047 from raykrueger/aws-eks-hybrid-nodes-fix
fix: AWSCloudProvider should ignore unrecognized provider IDs
2025-05-04 15:25:56 -07:00
Adrian Moisey 906a28f6a9
Bump to Kubernetes 1.33
Bump hack scripts to use Kubernetes 1.33
2025-05-04 16:05:01 +02:00
Adrian Moisey 88d4d5b836
Bump VPA Go dependencies prior to next release 2025-05-04 16:04:33 +02:00
Kubernetes Prow Robot e57ff80356
Merge pull request #8088 from LucaLanziani/chore/issue-7884
chore: improve logging for missing limit assignments
2025-05-04 01:21:56 -07:00
Luca Lanziani 13c1dd8730 chore: improve logging for missing limit assignments 2025-05-04 10:04:02 +02:00
Kubernetes Prow Robot 46cb6059af
Merge pull request #8077 from raywainman/master
Update default version of VPA to 1.3.1 in main branch
2025-05-02 12:00:01 -07:00
Ray Wainman c6ce144445 Update release instructions for automatically built images 2025-05-02 13:21:22 +00:00
Kubernetes Prow Robot 41630404f3
Merge pull request #7817 from karsten42/feature/hetzner-config-from-configmap
added possibility to retrieve hcloud cluster config from file
2025-05-02 03:55:54 -07:00
Karsten van Baal ea764b4ef7 chore: refactored config parsing 2025-05-02 12:25:25 +02:00
Kubernetes Prow Robot 24494f3c06
Merge pull request #7804 from ttsuuubasa/capi-scale-from-0-nodes
cluster-api: node template in scale-from-0-nodes scenario with DRA
2025-05-01 16:17:54 -07:00
Ray Wainman 397324d76e
Merge branch 'kubernetes:master' into master 2025-05-01 09:06:36 -04:00
Tsubasa Watanabe 2291b74a2d Make InstanceResourceSlices func more efficient and make comments about DRA annotation in capi more recognizable
Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
2025-05-01 12:09:48 +09:00
Kubernetes Prow Robot 6a6a912b41
Merge pull request #8079 from kubernetes/raywainman-patch-2
VPA - Bump build time and remove parallel builds (they don't work)
2025-04-30 13:13:56 -07:00
Ray Wainman 51a38bdc1a
Bump build time and remove parallel builds (they don't work) 2025-04-30 15:55:53 -04:00
Ray Wainman 1c79d5db69 Update version documentation 2025-04-30 19:27:03 +00:00
Ray Wainman 8c720f0c0d Update flags documentation 2025-04-30 19:10:39 +00:00
Ray Wainman 9211cd4d41 Update default version of VPA to 1.3.1 in main branch 2025-04-30 17:56:35 +00:00
Kubernetes Prow Robot f8ac5394ce
Merge pull request #8071 from laoj2/add-fallback-metric
Add metric to track calls to get container/pod resources
2025-04-29 23:53:55 -07:00
Luiz Antonio 27e1d0f17a Fix linter errors 2025-04-29 15:46:01 -04:00
Luiz Antonio cd8b7dc4e2 Add metric to track calls to get container/pod resources grouped by data source 2025-04-29 14:49:10 -04:00
Kubernetes Prow Robot 5d95cfde3b
Merge pull request #8062 from laoj2/fix-spec-client
Read container resources from containerStatus in the spec client
2025-04-29 02:19:55 -07:00
Kubernetes Prow Robot 130af548b5
Merge pull request #8066 from raywainman/cluster-autoscaler-build
cloudbuild.yaml file for cluster-autoscaler for automated builds
2025-04-28 14:45:54 -07:00
Ray Wainman c897c97623 small tweaks to makefile to ensure it works in Cloud Build 2025-04-28 21:28:13 +00:00
Kubernetes Prow Robot 420df58223
Merge pull request #8068 from kubernetes/raywainman-patch-1
Build VPA components in parallel
2025-04-28 11:43:54 -07:00
Luiz Antonio 3eba409c62 Address comments 2025-04-28 14:24:28 -04:00
Ray Wainman 20ebe030c3
Build VPA components in parallel 2025-04-28 14:24:28 -04:00
Kubernetes Prow Robot d4f4169873
Merge pull request #8065 from raywainman/vpa-build-fix
Fix default registry in VPA makefiles
2025-04-28 08:47:55 -07:00
Ray Wainman a9921418f6 cloudbuild.yaml file for cluster-autoscaler 2025-04-28 15:36:03 +00:00
Marco Voelz 700f1fbfca Re-phrase error log for checkpoints-timeout 2025-04-28 17:18:32 +02:00
Ray Wainman 50763a264d Fix default registry in VPA makefiles 2025-04-28 15:15:14 +00:00
Kubernetes Prow Robot 27a9a16b80
Merge pull request #8052 from plkokanov/fix/nil-pointer
VPA: Fix nil ptr when loading metrics and `memory-saver=true`
2025-04-28 07:47:55 -07:00
Stoyan Vitanov bc5657f108 metrics(recommender): Add counter and duration RoundTrippers to Prometheus API client
This commit instruments the `recommender`'s Prometheus API client with
two additional RoundTrippers for outbound request `counts` and
`duration`.
2025-04-28 10:57:20 +03:00
Stoyan Vitanov e64e081bc5 metrics(recommender): Add counter and duration metrics for `promhttp` RoundTripper
This commit introduces two additional metrics that will be used to
gather `count` and `duration` requests telemetry from the `recommender`
client to a Prometheus instance.

In addition to the metrics, two new utility functions for creating
`promhttp.RoundTripperFunc`s are introduced in order to abstract the
`http.RoundTripper` instrumentation within the utility package.
2025-04-28 10:57:20 +03:00
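The usual way to instrument an outbound client with client_golang is to chain promhttp round-tripper middlewares around the transport; a sketch under that assumption (metric names are illustrative):

```go
package metrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	requestCount = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "prometheus_client_requests_total", // illustrative name
		Help: "Outbound requests from the recommender's Prometheus client.",
	}, []string{"code", "method"})

	requestDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Name:    "prometheus_client_request_duration_seconds", // illustrative name
		Help:    "Duration of outbound requests to the Prometheus API.",
		Buckets: prometheus.DefBuckets,
	}, []string{"method"})
)

// instrumentedTransport wraps a base RoundTripper with the count and
// duration middlewares, abstracting the instrumentation as described.
func instrumentedTransport(base http.RoundTripper) http.RoundTripper {
	return promhttp.InstrumentRoundTripperCounter(requestCount,
		promhttp.InstrumentRoundTripperDuration(requestDuration, base))
}

func init() {
	prometheus.MustRegister(requestCount, requestDuration)
}
```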
Plamen Kokanov 260c306ab4 Address review comments 2025-04-28 10:12:52 +03:00
Kubernetes Prow Robot 46fb73a1dc
Merge pull request #8041 from laoj2/fix-capping
Read container resources from containerStatus in the capping processor
2025-04-26 09:13:24 -07:00
Luiz Antonio 986d7c875a Read container resources from containerStatus in the spec client 2025-04-25 15:13:01 -04:00
Luiz Antonio 69a9fe0b9b Address comments 2025-04-25 13:41:37 -04:00
Kubernetes Prow Robot 6771ca4848
Merge pull request #8042 from jackfrancis/cas-release-automation
cluster autoscaler: enable automated builds for release
2025-04-24 07:40:31 -07:00
Jack Francis 46acf7e536 cluster autoscaler: enable automated builds for release
Signed-off-by: Jack Francis <jackfrancis@gmail.com>
2025-04-23 09:05:39 -07:00
Plamen Kokanov 71ddfb7be0 Fix nil ptr when loading metrics and `memory-saver=true` 2025-04-23 14:08:51 +03:00
Kubernetes Prow Robot 55ce673590
Merge pull request #8049 from pbetkier/use-pod-requests-helper
Use k8s.io/component-helpers/resource for pod request calculations
2025-04-22 08:57:41 -07:00
Piotr Betkier ac1c7b5463 use k8s.io/component-helpers/resource for pod request calculations 2025-04-22 17:36:17 +02:00
Ray Krueger 3a1973872f fix: AWSCloudProvider ignores unrecognized provider IDs
The AWSCloudProvider only supports aws://zone/name ProviderIDs. It
should ignore ProviderIDs it does not recognize. Prior to this fix, an
unrecognized ProviderID, such as eks-hybrid://zone/cluster/my-node which
is used by EKS Hybrid Nodes, would break the Autoscaler loop.

This fix logs a warning and returns nil, nil instead of
returning the error.
2025-04-17 17:27:46 +00:00
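The shape of the fix as described, sketched with a hypothetical helper (the real AWS provider's node-group lookup is more involved):

```go
package aws

import (
	"strings"

	"k8s.io/klog/v2"
)

// nodeGroupForProviderID sketches the guard: ProviderIDs that are not of the
// aws://zone/name form (e.g. eks-hybrid://...) are skipped with a warning,
// and nil, nil means "not managed by this provider" rather than an error
// that would break the autoscaler loop. interface{} stands in for
// cloudprovider.NodeGroup here.
func nodeGroupForProviderID(providerID string) (interface{}, error) {
	if !strings.HasPrefix(providerID, "aws://") {
		klog.Warningf("ignoring unrecognized provider ID: %s", providerID)
		return nil, nil
	}
	// ...normal aws://zone/name lookup continues here...
	return nil, nil
}
```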
Luiz Antonio 5819634304 Read container resources from containerStatus in the capping processor 2025-04-15 15:05:37 -04:00
Kubernetes Prow Robot 66feee1483
Merge pull request #8036 from raywainman/vpa-build
Create cloudbuild.yaml for VPA binaries to automate builds
2025-04-15 10:35:06 -07:00
Ray Wainman 8ad920634c create cloudbuild.yaml for VPA binaries to automate builds 2025-04-15 15:48:26 +00:00
Kubernetes Prow Robot 1b92813df4
Merge pull request #8029 from raywainman/vpa-deps
[VPA] update golang.org/x/net to address vulnerability
2025-04-15 08:33:07 -07:00
Maksym Fuhol 99584890b4 Clean instance templates for untracked migs. 2025-04-15 12:29:19 +00:00
Kubernetes Prow Robot 01cd259f54
Merge pull request #8034 from mtrqq/patch-scaledown-in-cooldown-signature
Patch TestCleaningSoftTaintsInScaleDown to be compatible with new isScaleDownInCooldown signature
2025-04-15 04:55:08 -07:00
Maksym Fuhol 6cbf801235 Patch TestCleaningSoftTaintsInScaleDown to be compatible with new isScaleDownInCooldown signature. 2025-04-15 10:02:44 +00:00
Kubernetes Prow Robot 18f10c1e00
Merge pull request #7997 from damikag/scale-down-slo-update-metric
Emit scale down metric even when there are no scale down candidates.
2025-04-14 13:13:05 -07:00
Ray Wainman da002f58d8 update golang.org/x/net to address vulnerability 2025-04-14 16:12:40 +00:00
Kubernetes Prow Robot 75f90f2dc7
Merge pull request #8028 from vitanovs/chore/e2e-make-vpa-observer-reuse-test-framework-namespace
e2e(recommender): Restrict the internal `VPA` controller scope to test namespace
2025-04-14 07:23:07 -07:00
Stoyan Vitanov c450973f2b chore(e2e/recommender): Reuse test namespace in `VPA` observer controller
This commit instruments the `observer`, used within the `recommender`
e2e tests, to reuse the test framework namespace. The main reason behind
this change is to allow running the test suite against an existing
cluster that already has `VPA` resources in other namespaces.

Currently, the test considers all `namespaces` when it builds its
`observer`'s controller, which makes the test fail when trying to validate
the recommendations for components outside of the test scope.

This behavior was discovered while running the e2e tests against a
cluster with VPA setup that uses Prometheus as history storage
provider.
2025-04-14 15:07:30 +03:00
Kubernetes Prow Robot 25ad4c2c26
Merge pull request #8011 from jinglinliang/allow-third-party-sts-to-drain
Allow draining when StatefulSet kind has custom API Group
2025-04-11 13:04:41 -07:00
karsten 22dc4e06f6 chore: added paragraph to readme for new HCLOUD_CLUSTER_CONFIG_FILE 2025-04-11 07:29:01 +02:00
Kubernetes Prow Robot 43d6fbd747
Merge pull request #8004 from laoj2/fix-container-status
Also consider containerStatus.resources when handling container resources
2025-04-10 11:12:47 -07:00
jinglinliang 25af21c515 Add unit test to allow draining when StatefulSet kind has custom API Group 2025-04-09 14:03:00 -07:00
jinglinliang cc3a9f5d10 Allow draining when StatefulSet kind has custom API Group 2025-04-09 14:03:00 -07:00
Kubernetes Prow Robot 87a67e3aa0
Merge pull request #7995 from abdelrahman882/cleaningSoftTaintsTesting
Add unit test for cleaning deletion soft taint in scale down cool down
2025-04-09 10:48:39 -07:00
Omran dd125d4ef1
Add unit test for cleaning deletion soft taints in scale down cool down 2025-04-09 08:21:49 +00:00
Luiz Antonio b766a060cf Address comments 2025-04-08 16:17:50 -04:00
Kubernetes Prow Robot afc3eafae5
Merge pull request #8016 from kubernetes/x13n-patch-8
Remove outdated GCE cloudprovider owners
2025-04-08 10:10:50 -07:00
Marco Voelz f5df60f4c2 fix worker context initialization 2025-04-08 14:06:01 +02:00
Daniel Kłobuszewski f1a44d89cf
Remove outdated GCE cloudprovider owners 2025-04-08 13:24:20 +02:00
Kubernetes Prow Robot 4bc861d097
Merge pull request #7923 from Uladzislau97/nap-resilience
Improve resilience of diskTypes requests.
2025-04-08 04:22:40 -07:00
Kubernetes Prow Robot 7c28f52f93
Merge pull request #7854 from AppliedIntuition/master
Fix 2 bugs in the OCI integration
2025-04-07 09:14:42 -07:00
Marco Voelz 46e19bfe4e format imports 2025-04-07 17:31:04 +02:00
Marco Voelz cf68c8f9c7 make golint happy 2025-04-07 17:23:46 +02:00
Marco Voelz 6a99f2e925 adapt test for concurrent method 2025-04-07 17:17:03 +02:00
Marco Voelz 1ead15ee80 run generate-flags 2025-04-07 17:17:03 +02:00
Marco Voelz 15cb8d163d Mention client-side rate limits in concurrent-worker-count flag 2025-04-07 17:17:03 +02:00
Marco Voelz 3713acbb33 Increase client-side rate limits by 10x 2025-04-07 17:17:03 +02:00
Marco Voelz 256bb7c142 use cli parameter for concurrency 2025-04-07 17:17:03 +02:00
Marco Voelz bf1d3832aa make update checkpoints concurrent 2025-04-07 17:17:03 +02:00
Marco Voelz c604166e31 make now a local variable 2025-04-07 17:17:00 +02:00
Marco Voelz 9691d2b264 Extract Checkpoint update method 2025-04-07 17:15:43 +02:00
Marco Voelz 981ec32278 Backfill unit test for minCheckpoints 2025-04-07 17:15:43 +02:00
Marco Voelz ef9d3ac0be Add flag for concurrent worker count 2025-04-07 17:15:43 +02:00
Marco Voelz 6651816743 Make VPA updates concurrent 2025-04-07 17:15:43 +02:00
Marco Voelz 3ac3fcdb4e extract VPA update method 2025-04-07 17:15:43 +02:00
Vlad Vasilyeu 93e21d05e2 Replace diskTypes.aggregatedList request with diskTypes.list in FetchAvailableDiskTypes. 2025-04-07 07:50:29 +00:00
Kubernetes Prow Robot 3e92831089
Merge pull request #8007 from omerap12/fixed-lint-context
fix lint: update tests with context
2025-04-03 07:56:47 -07:00
Kubernetes Prow Robot 1de2160986
Merge pull request #7908 from Preisschild/fix/capi-patch-instead-update
CA: Use Patch to Scale clusterapi nodepools
2025-04-03 07:16:48 -07:00
Omer Aplatony 358e3157ce fix lint: update tests with context
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-04-03 14:13:33 +00:00
Kubernetes Prow Robot 05b97efa12
Merge pull request #7900 from omerap12/ctx-propagation-recommender
Refactor recommender to use context
2025-04-02 11:52:37 -07:00
Luiz Antonio 4578a1a211 Fix linter errors 2025-04-02 10:20:47 -04:00
Luiz Antonio 29888c3ce3 Also read resources from containerStatus in pod eviction admission 2025-04-01 16:15:36 -04:00
Kubernetes Prow Robot 8657345226
Merge pull request #8001 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/updater/golang-1.24.2
Bump golang from 1.24.1 to 1.24.2 in /vertical-pod-autoscaler/pkg/updater
2025-04-01 12:40:37 -07:00
Kubernetes Prow Robot 71ef17fd6c
Merge pull request #8002 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/admission-controller/golang-1.24.2
Bump golang from 1.24.1 to 1.24.2 in /vertical-pod-autoscaler/pkg/admission-controller
2025-04-01 12:38:37 -07:00
Kubernetes Prow Robot 6c6db404af
Merge pull request #8003 from kubernetes/dependabot/docker/vertical-pod-autoscaler/pkg/recommender/golang-1.24.2
Bump golang from 1.24.1 to 1.24.2 in /vertical-pod-autoscaler/pkg/recommender
2025-04-01 12:30:38 -07:00
Luiz Antonio 00f627fbb9 Also read resources from containerStatus in priority processor 2025-04-01 15:25:10 -04:00
dependabot[bot] 3e43170446
Bump golang in /vertical-pod-autoscaler/pkg/recommender
Bumps golang from 1.24.1 to 1.24.2.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-01 18:35:52 +00:00
dependabot[bot] 4a1b362ca5
Bump golang in /vertical-pod-autoscaler/pkg/admission-controller
Bumps golang from 1.24.1 to 1.24.2.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-01 18:34:33 +00:00
dependabot[bot] 9148a69e87
Bump golang in /vertical-pod-autoscaler/pkg/updater
Bumps golang from 1.24.1 to 1.24.2.

---
updated-dependencies:
- dependency-name: golang
  dependency-version: 1.24.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-04-01 18:33:34 +00:00
Luiz Antonio 204ac56883 Also read resources from containerStatus in oom observer 2025-04-01 13:48:03 -04:00
Luiz Antonio a8c5030035 Also read resources from containerStatus in recommendation_provider 2025-04-01 12:21:38 -04:00
Luiz Antonio ea396b5f2b Also read resources from containerStatus in resource_updates 2025-04-01 12:00:04 -04:00
Kubernetes Prow Robot dc91330f6a
Merge pull request #7989 from loick111/feature/clusterapi-instances-status
ClusterAPI: Report machine phases to improve cluster-autoscaler decisions
2025-04-01 07:44:38 -07:00
Kubernetes Prow Robot c107f2bba5
Merge pull request #7998 from jackfrancis/add-owners-aliases
update OWNERS, add OWNERS_ALIASES
2025-04-01 01:56:51 -07:00
Florian Ströger ecb572a945 Use Patch to Scale clusterapi nodepools to avoid modification conflicts
Issue: https://github.com/kubernetes/autoscaler/issues/7872
Signed-off-by: Florian Ströger <stroeger@youniqx.com>
2025-04-01 08:26:45 +02:00
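A merge patch touches only the fields it carries, so it cannot hit the optimistic-concurrency (resourceVersion) conflicts that a full read-modify-Update can. A sketch with the dynamic client, assuming a cluster-api MachineDeployment (names and GVR illustrative):

```go
package clusterapi

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/dynamic"
)

// scaleMachineDeployment patches .spec.replicas instead of updating the
// whole object, avoiding modification conflicts with concurrent writers.
func scaleMachineDeployment(ctx context.Context, c dynamic.Interface, ns, name string, replicas int32) error {
	gvr := schema.GroupVersionResource{
		Group: "cluster.x-k8s.io", Version: "v1beta1", Resource: "machinedeployments",
	}
	patch := []byte(fmt.Sprintf(`{"spec":{"replicas":%d}}`, replicas))
	_, err := c.Resource(gvr).Namespace(ns).Patch(ctx, name, types.MergePatchType, patch, metav1.PatchOptions{})
	return err
}
```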
Kubernetes Prow Robot 6c7c0c1ffd
Merge pull request #7984 from toredash/fix/vpa-autoscaler-flags-link
chore(release-process): update flags and flags link
2025-03-31 18:58:36 -07:00
Luiz Antonio 3d138309b9 Add helpers to get containers requests/limits 2025-03-31 17:22:45 -04:00
Kubernetes Prow Robot 13dba1751c
Merge pull request #7936 from fcrespofastly/allow_tpl_on_various_parts_of_the_chart
allow `tpl` on common metadata to DRY
2025-03-31 10:08:41 -07:00
Jack Francis d97c4b22df update OWNERS, add OWNERS_ALIASES
Signed-off-by: Jack Francis <jackfrancis@gmail.com>
2025-03-31 09:50:51 -07:00
Fernando Crespo Gravalos 460797ba4a remove extra new line
Signed-off-by: Fernando Crespo Gravalos <fcrespo@fastly.com>
2025-03-31 18:49:37 +02:00
Fernando Crespo Gravalos a6a54e8806 use backticks to escape braces
Signed-off-by: Fernando Crespo Gravalos <fcrespo@fastly.com>
2025-03-31 18:46:34 +02:00
Fernando Crespo Gravalos 365c3d1d0c update README.md
Signed-off-by: Fernando Crespo Gravalos <fcrespo@fastly.com>
2025-03-31 18:46:34 +02:00
Fernando Crespo Gravalos 63c7d13622 remove testing label leftover
Signed-off-by: Fernando Crespo Gravalos <fcrespo@fastly.com>
2025-03-31 18:46:34 +02:00
Fernando Crespo Gravalos fd0f93a94d this allows DRY common metadata
Signed-off-by: Fernando Crespo Gravalos <fcrespo@fastly.com>
2025-03-31 18:46:32 +02:00
Damika Gamlath 49b271f75a Emit scale down metric even when there is no scale down candidates.
Update scale scaleDownInCooldown definition to skip considering zero candidates as a reason to be in scaleDownInCooldown state
2025-03-31 14:46:23 +00:00
Kubernetes Prow Robot 19cb11766d
Merge pull request #7994 from jackfrancis/update-helm-docs-braces
helm: backtick to escape braces
2025-03-31 03:26:46 -07:00
Jack Francis 8b31ea0140 helm: backtick to escape braces
Signed-off-by: Jack Francis <jackfrancis@gmail.com>
2025-03-28 15:33:18 -07:00
Kubernetes Prow Robot 5d4f6f1b80
Merge pull request #7979 from laoj2/fix-timeouts
Parametrize pod resize timeouts in AEP-4016 beta
2025-03-28 12:50:40 -07:00
Kubernetes Prow Robot 7d475d181c
Merge pull request #7990 from omerap12/add-omer-as-approver
Add omerap12 to VPA approvers
2025-03-28 12:48:45 -07:00
Loick MAHIEUX 005a42b9af feat(cluster-autoscaler): improve nodes listing in ClusterAPI provider
Add improved error handling for machine phases in the ClusterAPI node group
implementation. When a machine is in the Deleting/Failed/Pending phase, mark the cloudprovider.Instance
with a status for cluster-autoscaler recovery actions.

The changes:
- Enhance Nodes listing to allow reporting the machine phase in Instance status
- Add error status reporting for failed machines

This change helps identify and manage failed machines more effectively,
allowing the autoscaler to make better scaling decisions.
2025-03-28 15:07:34 +01:00
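Conceptually the reporting looks something like the sketch below, using the cluster-autoscaler cloudprovider types; the phase strings and error code are assumptions, and the provider's actual mapping may differ:

```go
package clusterapi

import "k8s.io/autoscaler/cluster-autoscaler/cloudprovider"

// statusForPhase maps a machine phase onto a cloudprovider.InstanceStatus so
// the core autoscaler can trigger recovery actions for bad machines.
func statusForPhase(phase string) *cloudprovider.InstanceStatus {
	switch phase {
	case "Deleting":
		return &cloudprovider.InstanceStatus{State: cloudprovider.InstanceDeleting}
	case "Pending":
		return &cloudprovider.InstanceStatus{State: cloudprovider.InstanceCreating}
	case "Failed":
		return &cloudprovider.InstanceStatus{
			State: cloudprovider.InstanceCreating,
			ErrorInfo: &cloudprovider.InstanceErrorInfo{
				ErrorClass:   cloudprovider.OtherErrorClass,
				ErrorCode:    "MachineFailed", // illustrative code
				ErrorMessage: "machine is in Failed phase",
			},
		}
	default:
		return &cloudprovider.InstanceStatus{State: cloudprovider.InstanceRunning}
	}
}
```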
Tore Stendal Lønøy 9bcecb96c8 chore(docs): update link to use full URI
the previous link was a failed attempt at using a relative URL to avoid having to replace URLs in the future if the repository is moved to a new location
2025-03-28 10:16:55 +01:00
Omer Aplatony 52334dec72 Add omerap12 to VPA approvers
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-03-28 09:00:42 +00:00
Tore Stendal Lønøy 21422e5419 chore(docs): update sed command to work correctly on mac 2025-03-28 09:42:52 +01:00
Luiz Antonio 81d42aacef Parametrize pod resize timeouts in AEP-4016 2025-03-27 14:39:15 -04:00
Tore Stendal Lønøy c9ee74c39b chore(release-process): update flags and flags link
as part of the release process, it is required to update the flags documentation and also to ensure that the components.md file points to the tagged version of the flags.md file.
2025-03-27 09:24:16 +01:00
eric-higgins-ai 8da9a7b4af add log messages 2025-03-21 14:02:10 -07:00
eric-higgins-ai 370c8eb78e Revert "Address comment"
This reverts commit 233d5c6e4d.
2025-03-21 13:58:56 -07:00
Omer Aplatony 10c2a3514e Refactor recommender to use context
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-03-08 14:05:19 +00:00
eric-higgins-ai 233d5c6e4d Address comment 2025-03-05 20:34:24 -08:00
eric-higgins-ai 91d20d533e unit test coverage 2025-02-24 11:17:00 -08:00
eric-higgins-ai cc430980d2 fixes 2025-02-21 18:39:48 -08:00
eric-higgins-ai 5735b8ae19 get all node shapes 2025-02-21 14:02:31 -08:00
eric-higgins-ai 9c0357a6f2 fix scale up bug 2025-02-21 13:57:03 -08:00
Tsubasa Watanabe 3fbacf0d0f cluster-api: node template in scale-from-0-nodes scenario with DRA
Modify TemplateNodeInfo() to return the template of ResourceSlice.
This is to address the DRA expansion of Cluster Autoscaler, allowing users to set the number of GPUs and the DRA driver name by specifying
an annotation on the NodeGroup provided by cluster-api.

Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
2025-02-12 11:56:04 +09:00
Karsten van Baal 65c14d5526 added possibility to retrieve hcloud cluster config from file 2025-02-10 16:41:39 +01:00
Omer Aplatony e02272a4f1 migrate claimReservedForPod to use upstream IsReservedForPod
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-02-02 21:09:08 +02:00
1661 changed files with 33922 additions and 14413 deletions

@@ -15,15 +15,19 @@ jobs:
   test-and-verify:
     runs-on: ubuntu-latest
     steps:
-      - name: Set up Go
-        uses: actions/setup-go@v5.1.0
-        with:
-          go-version: '1.22.2'
       - uses: actions/checkout@v4.2.2
         with:
           path: ${{ env.GOPATH }}/src/k8s.io/autoscaler
+      - name: Set up Go
+        uses: actions/setup-go@v5.5.0
+        with:
+          go-version: '1.24.0'
+          cache-dependency-path: |
+            ${{ env.GOPATH}}/src/k8s.io/autoscaler/cluster-autoscaler/go.sum
+            ${{ env.GOPATH}}/src/k8s.io/autoscaler/vertical-pod-autoscaler/go.sum
+            ${{ env.GOPATH}}/src/k8s.io/autoscaler/vertical-pod-autoscaler/e2e/go.sum
       - name: Apt-get
         run: sudo apt-get install libseccomp-dev -qq

OWNERS

@@ -1,10 +1,8 @@
 approvers:
-- mwielgus
-- maciekpytel
-- gjtempleton
+- sig-autoscaling-leads
 reviewers:
-- mwielgus
-- maciekpytel
-- gjtempleton
+- sig-autoscaling-leads
 emeritus_approvers:
 - bskiba # 2022-09-30
+- mwielgus
+- maciekpytel

OWNERS_ALIASES (new file)

@@ -0,0 +1,6 @@
+aliases:
+  sig-autoscaling-leads:
+  - gjtempleton
+  - jackfrancis
+  - raywainman
+  - towca

@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
-FROM golang:1.23.2
+FROM golang:1.24.0
 ENV GOPATH /gopath/
 ENV PATH $GOPATH/bin:$PATH

@@ -11,4 +11,4 @@ name: cluster-autoscaler
 sources:
 - https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
 type: application
-version: 9.46.4
+version: 9.46.6

View File

@ -79,7 +79,7 @@ To create a valid configuration, follow instructions for your cloud provider:
### Templating the autoDiscovery.clusterName
The cluster name can be templated in the `autoDiscovery.clusterName` variable. This is useful when the cluster name is dynamically generated based on other values coming from external systems like Argo CD or Flux. This also allows you to use global Helm values to set the cluster name, e.g., `autoDiscovery.clusterName=\{\{ .Values.global.clusterName }}`, so that you don't need to set it in more than 1 location in the values file.
The cluster name can be templated in the `autoDiscovery.clusterName` variable. This is useful when the cluster name is dynamically generated based on other values coming from external systems like Argo CD or Flux. This also allows you to use global Helm values to set the cluster name, e.g., `autoDiscovery.clusterName={{ .Values.global.clusterName }}`, so that you don't need to set it in more than 1 location in the values file.
### AWS - Using auto-discovery of tagged instance groups
@ -183,6 +183,8 @@ $ helm install my-release autoscaler/cluster-autoscaler \
Note that `your-ig-prefix` should be a _prefix_ matching one or more MIGs, and _not_ the full name of the MIG. For example, to match multiple instance groups - `k8s-node-group-a-standard`, `k8s-node-group-b-gpu`, you would use a prefix of `k8s-node-group-`.
Prefixes will be rendered using `tpl` function so you can use any value of your choice if that's a valid prefix. For instance (ignore escaping characters): `gke-{{ .Values.autoDiscovery.clusterName }}`
In the event you want to explicitly specify MIGs instead of using auto-discovery, set members of the `autoscalingGroups` array directly - e.g.
```
@ -326,7 +328,14 @@ For Kubernetes clusters that use Amazon EKS, the service account can be configur
In order to accomplish this, you will first need to create a new IAM role with the above-mentioned policies. Take care in [configuring the trust relationship](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html#iam-role-configuration) to restrict access just to the service account used by cluster autoscaler.
Once you have the IAM role configured, you would then need to `--set rbac.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/MyRoleName` when installing.
Once you have the IAM role configured, you would then need to `--set rbac.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/MyRoleName` when installing. Alternatively, you can embed templates in values (ignore escaping characters):
```yaml
rbac:
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: "{{ .Values.aws.myroleARN }}"
```
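This works because the chart renders each service-account annotation key and value through Helm's `tpl` function (see the `serviceaccount.yaml` template change later in this diff), so plain, non-templated annotations continue to work unchanged.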
### Azure - Using azure workload identity

View File

@ -79,7 +79,7 @@ To create a valid configuration, follow instructions for your cloud provider:
### Templating the autoDiscovery.clusterName
The cluster name can be templated in the `autoDiscovery.clusterName` variable. This is useful when the cluster name is dynamically generated based on other values coming from external systems like Argo CD or Flux. This also allows you to use global Helm values to set the cluster name, e.g., `autoDiscovery.clusterName=\{\{ .Values.global.clusterName }}`, so that you don't need to set it in more than 1 location in the values file.
The cluster name can be templated in the `autoDiscovery.clusterName` variable. This is useful when the cluster name is dynamically generated based on other values coming from external systems like Argo CD or Flux. This also allows you to use global Helm values to set the cluster name, e.g., `autoDiscovery.clusterName={{`{{ .Values.global.clusterName }}`}}`, so that you don't need to set it in more than 1 location in the values file.
### AWS - Using auto-discovery of tagged instance groups
@ -183,6 +183,8 @@ $ helm install my-release autoscaler/cluster-autoscaler \
Note that `your-ig-prefix` should be a _prefix_ matching one or more MIGs, and _not_ the full name of the MIG. For example, to match multiple instance groups - `k8s-node-group-a-standard`, `k8s-node-group-b-gpu`, you would use a prefix of `k8s-node-group-`.
Prefixes will be rendered using `tpl` function so you can use any value of your choice if that's a valid prefix. For instance (ignore escaping characters): `gke-{{`{{ .Values.autoDiscovery.clusterName }}`}}`
In the event you want to explicitly specify MIGs instead of using auto-discovery, set members of the `autoscalingGroups` array directly - e.g.
```
@ -326,7 +328,14 @@ For Kubernetes clusters that use Amazon EKS, the service account can be configur
In order to accomplish this, you will first need to create a new IAM role with the above-mentioned policies. Take care in [configuring the trust relationship](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html#iam-role-configuration) to restrict access just to the service account used by cluster autoscaler.
Once you have the IAM role configured, you would then need to `--set rbac.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/MyRoleName` when installing.
Once you have the IAM role configured, you would then need to `--set rbac.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/MyRoleName` when installing. Alternatively, you can embed templates in values (ignore escaping characters):
```yaml
rbac:
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: "{{`{{ .Values.aws.myroleARN `}}}}"
```
### Azure - Using azure workload identity

View File

@ -86,7 +86,7 @@ spec:
{{- else if eq .Values.cloudProvider "gce" }}
{{- if .Values.autoscalingGroupsnamePrefix }}
{{- range .Values.autoscalingGroupsnamePrefix }}
- --node-group-auto-discovery=mig:namePrefix={{ .name }},min={{ .minSize }},max={{ .maxSize }}
- --node-group-auto-discovery=mig:namePrefix={{ tpl .name $ }},min={{ .minSize }},max={{ .maxSize }}
{{- end }}
{{- end }}
{{- if eq .Values.cloudProvider "oci" }}
@ -144,9 +144,9 @@ spec:
valueFrom:
fieldRef:
fieldPath: spec.serviceAccountName
{{- if and (eq .Values.cloudProvider "aws") (ne .Values.awsRegion "") }}
{{- if and (eq .Values.cloudProvider "aws") (ne (tpl .Values.awsRegion $) "") }}
- name: AWS_REGION
value: "{{ .Values.awsRegion }}"
value: "{{ tpl .Values.awsRegion $ }}"
{{- if .Values.awsAccessKeyID }}
- name: AWS_ACCESS_KEY_ID
valueFrom:

View File

@ -6,8 +6,12 @@ metadata:
{{ include "cluster-autoscaler.labels" . | indent 4 }}
name: {{ template "cluster-autoscaler.serviceAccountName" . }}
namespace: {{ .Release.Namespace }}
{{- if .Values.rbac.serviceAccount.annotations }}
annotations: {{ toYaml .Values.rbac.serviceAccount.annotations | nindent 4 }}
{{- with .Values.rbac.serviceAccount.annotations }}
annotations:
{{- range $k, $v := . }}
{{- printf "%s: %s" (tpl $k $) (tpl $v $) | nindent 4 }}
{{- end }}
{{- end }}
automountServiceAccountToken: {{ .Values.rbac.serviceAccount.automountServiceAccountToken }}
{{- end }}

View File

@ -11,9 +11,20 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASEIMAGE=gcr.io/distroless/static:nonroot-amd64
FROM $BASEIMAGE
FROM --platform=$BUILDPLATFORM golang:1.24 as builder
WORKDIR /workspace
COPY . .
ARG GOARCH
ARG LDFLAGS_FLAG
ARG TAGS_FLAG
RUN CGO_ENABLED=0 GOOS=linux go build -o cluster-autoscaler-$GOARCH $LDFLAGS_FLAG $TAGS_FLAG
FROM gcr.io/distroless/static:nonroot
ARG GOARCH
COPY --from=builder /workspace/cluster-autoscaler-$GOARCH /cluster-autoscaler
COPY cluster-autoscaler-amd64 /cluster-autoscaler
WORKDIR /
CMD ["/cluster-autoscaler"]

View File

@ -1,19 +0,0 @@
# Copyright 2016 The Kubernetes Authors. All rights reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASEIMAGE=gcr.io/distroless/static:nonroot-arm64
FROM $BASEIMAGE
COPY cluster-autoscaler-arm64 /cluster-autoscaler
WORKDIR /
CMD ["/cluster-autoscaler"]

View File

@ -1,19 +0,0 @@
# Copyright 2016 The Kubernetes Authors. All rights reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
ARG BASEIMAGE=gcr.io/distroless/static:nonroot-s390x
FROM $BASEIMAGE
COPY cluster-autoscaler-s390x /cluster-autoscaler
WORKDIR /
CMD ["/cluster-autoscaler"]

View File

@ -7,7 +7,7 @@ LDFLAGS?=-s
ENVVAR=CGO_ENABLED=0
GOOS?=linux
GOARCH?=$(shell go env GOARCH)
REGISTRY?=staging-k8s.gcr.io
REGISTRY?=gcr.io/k8s-staging-autoscaling
DOCKER_NETWORK?=default
SUPPORTED_BUILD_TAGS=$(shell ls cloudprovider/builder/ | grep -e '^builder_.*\.go' | sed 's/builder_\(.*\)\.go/\1/')
ifdef BUILD_TAGS
@ -20,7 +20,7 @@ else
FOR_PROVIDER=
endif
ifdef LDFLAGS
LDFLAGS_FLAG=--ldflags "${LDFLAGS}"
LDFLAGS_FLAG=--ldflags="${LDFLAGS}"
else
LDFLAGS_FLAG=
endif
@ -64,15 +64,10 @@ dev-release-arch-%: build-arch-% make-image-arch-% push-image-arch-%
make-image: make-image-arch-$(GOARCH)
make-image-arch-%:
ifdef BASEIMAGE
docker build --pull --build-arg BASEIMAGE=${BASEIMAGE} \
GOOS=$(GOOS) docker buildx build --pull --platform linux/$* \
--build-arg "GOARCH=$*" \
-t ${IMAGE}-$*:${TAG} \
-f Dockerfile.$* .
else
docker build --pull \
-t ${IMAGE}-$*:${TAG} \
-f Dockerfile.$* .
endif
-f Dockerfile .
@echo "Image ${TAG}${FOR_PROVIDER}-$* completed"
push-image: push-image-arch-$(GOARCH)
@ -80,12 +75,15 @@ push-image: push-image-arch-$(GOARCH)
push-image-arch-%:
./push_image.sh ${IMAGE}-$*:${TAG}
push-release-image-arch-%:
docker push ${IMAGE}-$*:${TAG}
push-manifest:
docker manifest create ${IMAGE}:${TAG} \
$(addprefix $(REGISTRY)/cluster-autoscaler$(PROVIDER)-, $(addsuffix :$(TAG), $(ALL_ARCH)))
docker manifest push --purge ${IMAGE}:${TAG}
execute-release: $(addprefix make-image-arch-,$(ALL_ARCH)) $(addprefix push-image-arch-,$(ALL_ARCH)) push-manifest
execute-release: $(addprefix make-image-arch-,$(ALL_ARCH)) $(addprefix push-release-image-arch-,$(ALL_ARCH)) push-manifest
@echo "Release ${TAG}${FOR_PROVIDER} completed"
clean: clean-arch-$(GOARCH)

View File

@ -92,12 +92,12 @@ target ETA and the actual releases.
| Date | Maintainer Preparing Release | Backup Maintainer | Type |
|------------|------------------------------|-------------------|-------|
| 2024-07-18 | x13n | MaciekPytel | patch |
| 2024-08-21 | MaciekPytel | gjtempleton | 1.31 |
| 2024-09-18 | gjtempleton | towca | patch |
| 2024-11-20 | towca | BigDarkClown | patch |
| 2024-12-18 | BigDarkClown | x13n | 1.32 |
| 2025-01-22 | x13n | MaciekPytel | patch |
| 2025-06-11 | jackfrancis | gjtempleton | 1.33 |
| 2025-07-16 | gjtempleton | towca | patch |
| 2025-08-20 | towca | BigDarkClown | patch |
| 2025-09-17 | BigDarkClown | x13n | 1.34 |
| 2025-10-22 | x13n | jackfrancis | patch |
| 2025-11-19 | jackfrancis | gjtempleton | patch |
Additional patch releases may happen outside of the schedule in case of critical
bugs or vulnerabilities.

View File

@ -1,16 +1,14 @@
module k8s.io/autoscaler/cluster-autoscaler/apis
go 1.23.0
toolchain go1.23.2
go 1.24.0
require (
github.com/onsi/ginkgo/v2 v2.21.0
github.com/onsi/gomega v1.35.1
k8s.io/apimachinery v0.33.0-alpha.0
k8s.io/client-go v0.33.0-alpha.0
k8s.io/code-generator v0.33.0-alpha.0
sigs.k8s.io/structured-merge-diff/v4 v4.4.2
k8s.io/apimachinery v0.34.0-alpha.0
k8s.io/client-go v0.34.0-alpha.0
k8s.io/code-generator v0.34.0-alpha.0
sigs.k8s.io/structured-merge-diff/v4 v4.6.0
)
require (
@ -23,10 +21,8 @@ require (
github.com/go-openapi/swag v0.23.0 // indirect
github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/google/gnostic-models v0.6.8 // indirect
github.com/google/go-cmp v0.6.0 // indirect
github.com/google/gofuzz v1.2.0 // indirect
github.com/google/gnostic-models v0.6.9 // indirect
github.com/google/go-cmp v0.7.0 // indirect
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/josharian/intern v1.0.0 // indirect
@ -39,23 +35,24 @@ require (
github.com/spf13/pflag v1.0.5 // indirect
github.com/x448/float16 v0.8.4 // indirect
golang.org/x/mod v0.21.0 // indirect
golang.org/x/net v0.30.0 // indirect
golang.org/x/net v0.38.0 // indirect
golang.org/x/oauth2 v0.27.0 // indirect
golang.org/x/sync v0.8.0 // indirect
golang.org/x/sys v0.26.0 // indirect
golang.org/x/term v0.25.0 // indirect
golang.org/x/text v0.19.0 // indirect
golang.org/x/time v0.7.0 // indirect
golang.org/x/sync v0.12.0 // indirect
golang.org/x/sys v0.31.0 // indirect
golang.org/x/term v0.30.0 // indirect
golang.org/x/text v0.23.0 // indirect
golang.org/x/time v0.9.0 // indirect
golang.org/x/tools v0.26.0 // indirect
google.golang.org/protobuf v1.35.1 // indirect
google.golang.org/protobuf v1.36.5 // indirect
gopkg.in/evanphx/json-patch.v4 v4.12.0 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
k8s.io/api v0.33.0-alpha.0 // indirect
k8s.io/gengo/v2 v2.0.0-20240911193312-2b36238f13e9 // indirect
k8s.io/api v0.34.0-alpha.0 // indirect
k8s.io/gengo/v2 v2.0.0-20250207200755-1244d31929d7 // indirect
k8s.io/klog/v2 v2.130.1 // indirect
k8s.io/kube-openapi v0.0.0-20241105132330-32ad38e42d3f // indirect
k8s.io/kube-openapi v0.0.0-20250318190949-c8a335a9a2ff // indirect
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738 // indirect
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 // indirect
sigs.k8s.io/randfill v1.0.0 // indirect
sigs.k8s.io/yaml v1.4.0 // indirect
)

View File

@ -21,16 +21,12 @@ github.com/go-task/slim-sprig/v3 v3.0.0 h1:sUs3vkvUymDpBKi3qH1YSqBQk9+9D/8M2mN1v
github.com/go-task/slim-sprig/v3 v3.0.0/go.mod h1:W848ghGpv3Qj3dhTPRyJypKRiqCdHZiAzKg9hl15HA8=
github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q=
github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q=
github.com/golang/protobuf v1.5.4 h1:i7eJL8qZTpSEXOPTxNKhASYpMn+8e5Q6AdndVa1dWek=
github.com/golang/protobuf v1.5.4/go.mod h1:lnTiLA8Wa4RWRcIUkrtSVa5nRhsEGBg48fD6rSs7xps=
github.com/google/gnostic-models v0.6.8 h1:yo/ABAfM5IMRsS1VnXjTBvUb61tFIHozhlYvRgGre9I=
github.com/google/gnostic-models v0.6.8/go.mod h1:5n7qKqH0f5wFt+aWF8CW6pZLLNOfYuF5OpfBSENuI8U=
github.com/google/gnostic-models v0.6.9 h1:MU/8wDLif2qCXZmzncUQ/BOfxWfthHi63KqpoNbWqVw=
github.com/google/gnostic-models v0.6.9/go.mod h1:CiWsm0s6BSQd1hRn8/QmxqB6BesYcbSZxsz9b0KuDBw=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/gofuzz v1.2.0 h1:xRy4A+RhZaiKjJ1bPfwQ8sedCA+YS2YcCHW6ec7JMi0=
github.com/google/gofuzz v1.2.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db h1:097atOisP2aRj7vFgYQBbFN4U4JNXUNYpxael3UzMyo=
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db/go.mod h1:vavhavw2zAxS5dIdcRluK6cSGGPlZynqzFM8NdvU144=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
@ -63,26 +59,29 @@ github.com/onsi/gomega v1.35.1 h1:Cwbd75ZBPxFSuZ6T+rN/WCb/gOc6YgFBXLlZLhC7Ds4=
github.com/onsi/gomega v1.35.1/go.mod h1:PvZbdDc8J6XJEpDK4HCuRBm8a6Fzp9/DmhC9C7yFlog=
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 h1:Jamvg5psRIccs7FGNTlIRMkT8wgtp5eCXdBlqhYGL6U=
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
github.com/rogpeppe/go-internal v1.13.1 h1:KvO1DLK/DRN07sQ1LQKScxyZJuNnedQ5/wKSR38lUII=
github.com/rogpeppe/go-internal v1.13.1/go.mod h1:uMEvuHeurkdAXX61udpOXGD/AzZDWNMNyH2VO9fmH0o=
github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
github.com/stretchr/objx v0.5.2 h1:xuMeJ0Sdp5ZMRXx/aWO6RZxdr3beISkG5/G/aIRr3pY=
github.com/stretchr/objx v0.5.2/go.mod h1:FRsXN1f5AsAjCGJKqEizvkpNtU+EGNCLh3NxZ/8L+MA=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg=
github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/x448/float16 v0.8.4 h1:qLwI1I70+NjRFUR3zs1JPUCgaCXSh3SW62uAKT1mSBM=
github.com/x448/float16 v0.8.4/go.mod h1:14CWIYCyZA/cWjXOioeEpHeN/83MdbZDRQHoFcYsOfg=
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
@ -94,28 +93,28 @@ golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.30.0 h1:AcW1SDZMkb8IpzCdQUaIq2sP4sZ4zw+55h6ynffypl4=
golang.org/x/net v0.30.0/go.mod h1:2wGyMJ5iFasEhkwi13ChkO/t1ECNC4X4eBKkVFyYFlU=
golang.org/x/net v0.38.0 h1:vRMAPTMaeGqVhG5QyLJHqNDwecKTomGeqbnfZyKlBI8=
golang.org/x/net v0.38.0/go.mod h1:ivrbrMbzFq5J41QOQh0siUuly180yBYtLp+CKbEaFx8=
golang.org/x/oauth2 v0.27.0 h1:da9Vo7/tDv5RH/7nZDz1eMGS/q1Vv1N/7FCrBhI9I3M=
golang.org/x/oauth2 v0.27.0/go.mod h1:onh5ek6nERTohokkhCD/y2cV4Do3fxFHFuAejCkRWT8=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.8.0 h1:3NFvSEYkUoMifnESzZl15y791HH1qU2xm6eCJU5ZPXQ=
golang.org/x/sync v0.8.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.12.0 h1:MHc5BpPuC30uJk597Ri8TV3CNZcTLu6B6z4lJy+g6Jw=
golang.org/x/sync v0.12.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.26.0 h1:KHjCJyddX0LoSTb3J+vWpupP9p0oznkqVk/IfjymZbo=
golang.org/x/sys v0.26.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/term v0.25.0 h1:WtHI/ltw4NvSUig5KARz9h521QvRC8RmF/cuYqifU24=
golang.org/x/term v0.25.0/go.mod h1:RPyXicDX+6vLxogjjRxjgD2TKtmAO6NZBsBRfrOLu7M=
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/term v0.30.0 h1:PQ39fJZ+mfadBm0y5WlL4vlM7Sx1Hgf13sMIY2+QS9Y=
golang.org/x/term v0.30.0/go.mod h1:NYYFdzHoI5wRh/h5tDMdMqCqPJZEuNqVR5xJLd/n67g=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.19.0 h1:kTxAhCbGbxhK0IwgSKiMO5awPoDQ0RpfiVYBfK860YM=
golang.org/x/text v0.19.0/go.mod h1:BuEKDfySbSR4drPmRPG/7iBdf8hvFMuRexcpahXilzY=
golang.org/x/time v0.7.0 h1:ntUhktv3OPE6TgYxXWv9vKvUSJyIFJlyohwbkEwPrKQ=
golang.org/x/time v0.7.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM=
golang.org/x/text v0.23.0 h1:D71I7dUrlY+VX0gQShAThNGHFxZ13dGLBHQLVl1mJlY=
golang.org/x/text v0.23.0/go.mod h1:/BLNzu4aZCJ1+kcD0DNRotWKage4q2rGVAg4o22unh4=
golang.org/x/time v0.9.0 h1:EsRrnYcQiGH+5FfbgvV4AP7qEZstoyrHB0DzarOQ4ZY=
golang.org/x/time v0.9.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
@ -126,8 +125,8 @@ golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8T
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
google.golang.org/protobuf v1.35.1 h1:m3LfL6/Ca+fqnjnlqQXNpFPABW1UD7mjh8KO2mKFytA=
google.golang.org/protobuf v1.35.1/go.mod h1:9fA7Ob0pmnwhb644+1+CVWFRbNajQ6iRojtC/QF5bRE=
google.golang.org/protobuf v1.36.5 h1:tPhr+woSbjfYvY6/GPufUoYizxw1cF/yFoxJ2fmpwlM=
google.golang.org/protobuf v1.36.5/go.mod h1:9fA7Ob0pmnwhb644+1+CVWFRbNajQ6iRojtC/QF5bRE=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
@ -138,25 +137,28 @@ gopkg.in/inf.v0 v0.9.1/go.mod h1:cWUDdTG/fYaXco+Dcufb5Vnc6Gp2YChqWtbxRZE0mXw=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
k8s.io/api v0.33.0-alpha.0 h1:bZn/3zFtD8eIj2kuvTnI9NOHVH0FlEMvqqUoTAqBPl0=
k8s.io/api v0.33.0-alpha.0/go.mod h1:hk95yeuwwXA2VCRMnCPNh/5vRMMxjSINs3nQPhxrp3Y=
k8s.io/apimachinery v0.33.0-alpha.0 h1:UEr11OY9sG+9Zizy6qPpyhLwOMhhs4c6+RLcUOjn5G4=
k8s.io/apimachinery v0.33.0-alpha.0/go.mod h1:HqhdaJUgQqky29T1V0o2yFkt/pZqLFIDyn9Zi/8rxoY=
k8s.io/client-go v0.33.0-alpha.0 h1:j/1m4ocOzykgF7Mx/xSX5rk5EiOghaCMtbIfVnIl2Gw=
k8s.io/client-go v0.33.0-alpha.0/go.mod h1:tKjHOpArmmeuq+J+ahsZ1LbZi4YFK5uwqn9HNq2++G4=
k8s.io/code-generator v0.33.0-alpha.0 h1:lvV/XBpfQFCXzzhY4M/YFIPo76+wkGCC449RMRcx1nY=
k8s.io/code-generator v0.33.0-alpha.0/go.mod h1:E6buYsOCImG+b6OcYyJMOjmkO8dbB3iY+JqmNdUdycE=
k8s.io/gengo/v2 v2.0.0-20240911193312-2b36238f13e9 h1:si3PfKm8dDYxgfbeA6orqrtLkvvIeH8UqffFJDl0bz4=
k8s.io/gengo/v2 v2.0.0-20240911193312-2b36238f13e9/go.mod h1:EJykeLsmFC60UQbYJezXkEsG2FLrt0GPNkU5iK5GWxU=
k8s.io/api v0.34.0-alpha.0 h1:plVaaO0yCTOGvWjEiEvvecQOPpf/IYdLnVMsfGfGMQo=
k8s.io/api v0.34.0-alpha.0/go.mod h1:brriDRpq4yMP4PN4P48NfXVLwWSwaIFSe0+pOajiwjQ=
k8s.io/apimachinery v0.34.0-alpha.0 h1:arymqm+uCpPEAVWBCvNF+yq01AJzsoUeUd2DYpoHuzc=
k8s.io/apimachinery v0.34.0-alpha.0/go.mod h1:BHW0YOu7n22fFv/JkYOEfkUYNRN0fj0BlvMFWA7b+SM=
k8s.io/client-go v0.34.0-alpha.0 h1:+hfihZ7vffuzoS4BoYg2nWs+9Bc1hXpZ7+iev2ISCo0=
k8s.io/client-go v0.34.0-alpha.0/go.mod h1:0sClwbFRpXuYhqaJEqLiy+e9dlC7FOhFHc9ZdvLDAbU=
k8s.io/code-generator v0.34.0-alpha.0 h1:aM4APBz/eAR8Qw4RWiCpfocZ2O2UUTi0UqTfvalouHc=
k8s.io/code-generator v0.34.0-alpha.0/go.mod h1:lwzb0eIHnmHnkhcHbxXf87XR512Xm7mF2RHtDKEW71c=
k8s.io/gengo/v2 v2.0.0-20250207200755-1244d31929d7 h1:2OX19X59HxDprNCVrWi6jb7LW1PoqTlYqEq5H2oetog=
k8s.io/gengo/v2 v2.0.0-20250207200755-1244d31929d7/go.mod h1:EJykeLsmFC60UQbYJezXkEsG2FLrt0GPNkU5iK5GWxU=
k8s.io/klog/v2 v2.130.1 h1:n9Xl7H1Xvksem4KFG4PYbdQCQxqc/tTUyrgXaOhHSzk=
k8s.io/klog/v2 v2.130.1/go.mod h1:3Jpz1GvMt720eyJH1ckRHK1EDfpxISzJ7I9OYgaDtPE=
k8s.io/kube-openapi v0.0.0-20241105132330-32ad38e42d3f h1:GA7//TjRY9yWGy1poLzYYJJ4JRdzg3+O6e8I+e+8T5Y=
k8s.io/kube-openapi v0.0.0-20241105132330-32ad38e42d3f/go.mod h1:R/HEjbvWI0qdfb8viZUeVZm0X6IZnxAydC7YU42CMw4=
k8s.io/kube-openapi v0.0.0-20250318190949-c8a335a9a2ff h1:/usPimJzUKKu+m+TE36gUyGcf03XZEP0ZIKgKj35LS4=
k8s.io/kube-openapi v0.0.0-20250318190949-c8a335a9a2ff/go.mod h1:5jIi+8yX4RIb8wk3XwBo5Pq2ccx4FP10ohkbSKCZoK8=
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738 h1:M3sRQVHv7vB20Xc2ybTt7ODCeFj6JSWYFzOFnYeS6Ro=
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 h1:/Rv+M11QRah1itp8VhT6HoVx1Ray9eB4DBr+K+/sCJ8=
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3/go.mod h1:18nIHnGi6636UCz6m8i4DhaJ65T6EruyzmoQqI2BVDo=
sigs.k8s.io/structured-merge-diff/v4 v4.4.2 h1:MdmvkGuXi/8io6ixD5wud3vOLwc1rj0aNqRlpuvjmwA=
sigs.k8s.io/structured-merge-diff/v4 v4.4.2/go.mod h1:N8f93tFZh9U6vpxwRArLiikrE5/2tiu1w1AGfACIGE4=
sigs.k8s.io/randfill v0.0.0-20250304075658-069ef1bbf016/go.mod h1:XeLlZ/jmk4i1HRopwe7/aU3H5n1zNUcX6TM94b3QxOY=
sigs.k8s.io/randfill v1.0.0 h1:JfjMILfT8A6RbawdsK2JXGBR5AQVfd+9TbzrlneTyrU=
sigs.k8s.io/randfill v1.0.0/go.mod h1:XeLlZ/jmk4i1HRopwe7/aU3H5n1zNUcX6TM94b3QxOY=
sigs.k8s.io/structured-merge-diff/v4 v4.6.0 h1:IUA9nvMmnKWcj5jl84xn+T5MnlZKThmUW1TdblaLVAc=
sigs.k8s.io/structured-merge-diff/v4 v4.6.0/go.mod h1:dDy58f92j70zLsuZVuUX5Wp9vtxXpaZnkPGWeqDfCps=
sigs.k8s.io/yaml v1.4.0 h1:Mk1wCc2gy/F0THH0TAp1QYyJNzRm2KCLy3o5ASXVI5E=
sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY=

View File

@ -1,60 +1,15 @@
# See https://cloud.google.com/cloud-build/docs/build-config
timeout: 3600s
options:
machineType: E2_HIGHCPU_32
timeout: 10800s
substitutions:
{ "_TAG": "dev" }
# this prevents errors if you don't use both _GIT_TAG and _PULL_BASE_REF,
# or any new substitutions added in the future.
substitution_option: ALLOW_LOOSE
steps:
- name: gcr.io/cloud-builders/git
id: git-clone
entrypoint: bash
args:
- "-c"
- |
set -ex
mkdir -p /workspace/src/k8s.io
cd /workspace/src/k8s.io
git clone https://github.com/kubernetes/autoscaler.git
- name: gcr.io/cloud-builders/docker
id: build-build-container
entrypoint: bash
dir: "/workspace/src/k8s.io/autoscaler/cluster-autoscaler"
args:
- "-c"
- |
set -e
docker build -t autoscaling-builder ../builder
- name: autoscaling-builder
id: run-tests
entrypoint: godep
dir: "/workspace/src/k8s.io/autoscaler/cluster-autoscaler"
env:
- "GOPATH=/workspace/"
args: ["go", "test", "./..."]
- name: autoscaling-builder
id: run-build
entrypoint: godep
dir: "/workspace/src/k8s.io/autoscaler/cluster-autoscaler"
env:
- "GOPATH=/workspace/"
- "GOOS=linux"
args: ["go", "build", "-o", "cluster-autoscaler"]
waitFor: build-build-container
- name: gcr.io/cloud-builders/docker
id: build-container
entrypoint: bash
dir: "/workspace/src/k8s.io/autoscaler/cluster-autoscaler"
args:
- "-c"
- |
set -e
docker build -t gcr.io/k8s-image-staging/cluster-autoscaler:${_TAG} .
waitFor: ["run-tests", "run-build"]
images:
- "gcr.io/k8s-image-staging/cluster-autoscaler:${_TAG}"
- name: "gcr.io/k8s-staging-test-infra/gcb-docker-gcloud:latest"
entrypoint: make
env:
- TAG=$_GIT_TAG
args:
- execute-release
substitutions:
_GIT_TAG: "0.0.0" # default value, this is substituted at build time

View File

@ -19,6 +19,13 @@ package signers
import (
"encoding/json"
"fmt"
"net/http"
"os"
"runtime"
"strconv"
"strings"
"time"
"github.com/jmespath/go-jmespath"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/alicloud/alibaba-cloud-sdk-go/sdk/auth/credentials"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/alicloud/alibaba-cloud-sdk-go/sdk/errors"
@ -26,16 +33,12 @@ import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/alicloud/alibaba-cloud-sdk-go/sdk/responses"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/alicloud/alibaba-cloud-sdk-go/sdk/utils"
"k8s.io/klog/v2"
"net/http"
"os"
"runtime"
"strconv"
"strings"
"time"
)
const (
defaultOIDCDurationSeconds = 3600
oidcTokenFilePath = "ALIBABA_CLOUD_OIDC_TOKEN_FILE"
oldOidcTokenFilePath = "ALICLOUD_OIDC_TOKEN_FILE_PATH"
)
// OIDCSigner is a kind of signer
@ -149,7 +152,7 @@ func (signer *OIDCSigner) getOIDCToken(OIDCTokenFilePath string) string {
tokenPath := OIDCTokenFilePath
_, err := os.Stat(tokenPath)
if os.IsNotExist(err) {
tokenPath = os.Getenv("ALIBABA_CLOUD_OIDC_TOKEN_FILE")
tokenPath = utils.FirstNotEmpty(os.Getenv(oidcTokenFilePath), os.Getenv(oldOidcTokenFilePath))
if tokenPath == "" {
klog.Error("oidc token file path is missing")
return ""

View File

@ -22,11 +22,12 @@ import (
"encoding/hex"
"encoding/json"
"fmt"
"github.com/google/uuid"
"net/url"
"reflect"
"strconv"
"time"
"github.com/google/uuid"
)
/* if you use go 1.10 or higher, you can adjust this util as follows to avoid "TimeZone.zip not found" on Windows */
@ -127,3 +128,15 @@ func InitStructWithDefaultTag(bean interface{}) {
}
}
}
// FirstNotEmpty returns the first non-empty string from the input list.
// If all strings are empty or no arguments are provided, it returns an empty string.
func FirstNotEmpty(strs ...string) string {
for _, str := range strs {
if str != "" {
return str
}
}
return ""
}
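Callers use this helper for environment-variable fallbacks between new and legacy names, e.g. `utils.FirstNotEmpty(os.Getenv(oidcTokenFilePath), os.Getenv(oldOidcTokenFilePath))` in the signer change above.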

View File

@ -0,0 +1,45 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import (
"testing"
"github.com/stretchr/testify/assert"
)
func TestFirstNotEmpty(t *testing.T) {
// Test case where the first non-empty string is at the beginning
result := FirstNotEmpty("hello", "world", "test")
assert.Equal(t, "hello", result)
// Test case where the first non-empty string is in the middle
result = FirstNotEmpty("", "foo", "bar")
assert.Equal(t, "foo", result)
// Test case where the first non-empty string is at the end
result = FirstNotEmpty("", "", "baz")
assert.Equal(t, "baz", result)
// Test case where all strings are empty
result = FirstNotEmpty("", "", "")
assert.Equal(t, "", result)
// Test case with no arguments
result = FirstNotEmpty()
assert.Equal(t, "", result)
}

View File

@ -19,6 +19,7 @@ package alicloud
import (
"os"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/alicloud/alibaba-cloud-sdk-go/sdk/utils"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/alicloud/metadata"
"k8s.io/klog/v2"
)
@ -63,19 +64,19 @@ func (cc *cloudConfig) isValid() bool {
}
if cc.OIDCProviderARN == "" {
cc.OIDCProviderARN = firstNotEmpty(os.Getenv(oidcProviderARN), os.Getenv(oldOidcProviderARN))
cc.OIDCProviderARN = utils.FirstNotEmpty(os.Getenv(oidcProviderARN), os.Getenv(oldOidcProviderARN))
}
if cc.OIDCTokenFilePath == "" {
cc.OIDCTokenFilePath = firstNotEmpty(os.Getenv(oidcTokenFilePath), os.Getenv(oldOidcTokenFilePath))
cc.OIDCTokenFilePath = utils.FirstNotEmpty(os.Getenv(oidcTokenFilePath), os.Getenv(oldOidcTokenFilePath))
}
if cc.RoleARN == "" {
cc.RoleARN = firstNotEmpty(os.Getenv(roleARN), os.Getenv(oldRoleARN))
cc.RoleARN = utils.FirstNotEmpty(os.Getenv(roleARN), os.Getenv(oldRoleARN))
}
if cc.RoleSessionName == "" {
cc.RoleSessionName = firstNotEmpty(os.Getenv(roleSessionName), os.Getenv(oldRoleSessionName))
cc.RoleSessionName = utils.FirstNotEmpty(os.Getenv(roleSessionName), os.Getenv(oldRoleSessionName))
}
if cc.RegionId != "" && cc.AccessKeyID != "" && cc.AccessKeySecret != "" {
@ -133,15 +134,3 @@ func (cc *cloudConfig) getRegion() string {
}
return r
}
// firstNotEmpty returns the first non-empty string from the input list.
// If all strings are empty or no arguments are provided, it returns an empty string.
func firstNotEmpty(strs ...string) string {
for _, str := range strs {
if str != "" {
return str
}
}
return ""
}

View File

@ -55,25 +55,3 @@ func TestOldRRSACloudConfigIsValid(t *testing.T) {
assert.True(t, cfg.isValid())
assert.True(t, cfg.RRSAEnabled)
}
func TestFirstNotEmpty(t *testing.T) {
// Test case where the first non-empty string is at the beginning
result := firstNotEmpty("hello", "world", "test")
assert.Equal(t, "hello", result)
// Test case where the first non-empty string is in the middle
result = firstNotEmpty("", "foo", "bar")
assert.Equal(t, "foo", result)
// Test case where the first non-empty string is at the end
result = firstNotEmpty("", "", "baz")
assert.Equal(t, "baz", result)
// Test case where all strings are empty
result = firstNotEmpty("", "", "")
assert.Equal(t, "", result)
// Test case with no arguments
result = firstNotEmpty()
assert.Equal(t, "", result)
}

View File

@ -421,8 +421,7 @@ specify the command-line flag `--aws-use-static-instance-list=true` to switch
the CA back to its original use of a statically defined set.
To refresh static list, please run `go run ec2_instance_types/gen.go` under
`cluster-autoscaler/cloudprovider/aws/` and update `staticListLastUpdateTime` in
`aws_util.go`
`cluster-autoscaler/cloudprovider/aws/`.
## Using the AWS SDK vendored in the AWS cloudprovider

View File

@ -117,7 +117,9 @@ func (aws *awsCloudProvider) NodeGroupForNode(node *apiv1.Node) (cloudprovider.N
}
ref, err := AwsRefFromProviderId(node.Spec.ProviderID)
if err != nil {
return nil, err
// Log at verbosity 6, as this will be noisy with many Hybrid Nodes
klog.V(6).Infof("Node %v has unrecognized providerId: %v", node.Name, node.Spec.ProviderID)
return nil, nil
}
asg := aws.awsManager.GetAsgForInstance(*ref)

View File

@ -18,6 +18,8 @@ package aws
import (
"fmt"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/mock"
apiv1 "k8s.io/api/core/v1"
@ -26,7 +28,6 @@ import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/aws/aws-sdk-go/aws"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/aws/aws-sdk-go/service/autoscaling"
"k8s.io/autoscaler/cluster-autoscaler/config"
"testing"
)
var testAwsManager = &AwsManager{
@ -251,6 +252,20 @@ func TestNodeGroupForNodeWithNoProviderId(t *testing.T) {
assert.Equal(t, group, nil)
}
func TestNodeGroupForNodeWithHybridNode(t *testing.T) {
hybridNode := &apiv1.Node{
Spec: apiv1.NodeSpec{
ProviderID: "eks-hybrid:///us-west-2/my-cluster/my-node-1",
},
}
a := &autoScalingMock{}
provider := testProvider(t, newTestAwsManagerWithAsgs(t, a, nil, []string{"1:5:test-asg"}))
group, err := provider.NodeGroupForNode(hybridNode)
assert.NoError(t, err)
assert.Nil(t, group)
}
func TestAwsRefFromProviderId(t *testing.T) {
tests := []struct {
provID string

View File

@ -28,7 +28,7 @@ type InstanceType struct {
}
// StaticListLastUpdateTime is a string declaring the last time the static list was updated.
var StaticListLastUpdateTime = "2024-10-02"
var StaticListLastUpdateTime = "2025-05-27"
// InstanceTypes is a map of ec2 resources
var InstanceTypes = map[string]*InstanceType{
@ -1187,6 +1187,20 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "arm64",
},
"c7i-flex.12xlarge": {
InstanceType: "c7i-flex.12xlarge",
VCPU: 48,
MemoryMb: 98304,
GPU: 0,
Architecture: "amd64",
},
"c7i-flex.16xlarge": {
InstanceType: "c7i-flex.16xlarge",
VCPU: 64,
MemoryMb: 131072,
GPU: 0,
Architecture: "amd64",
},
"c7i-flex.2xlarge": {
InstanceType: "c7i-flex.2xlarge",
VCPU: 8,
@ -1383,6 +1397,90 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "arm64",
},
"c8gd.12xlarge": {
InstanceType: "c8gd.12xlarge",
VCPU: 48,
MemoryMb: 98304,
GPU: 0,
Architecture: "arm64",
},
"c8gd.16xlarge": {
InstanceType: "c8gd.16xlarge",
VCPU: 64,
MemoryMb: 131072,
GPU: 0,
Architecture: "arm64",
},
"c8gd.24xlarge": {
InstanceType: "c8gd.24xlarge",
VCPU: 96,
MemoryMb: 196608,
GPU: 0,
Architecture: "arm64",
},
"c8gd.2xlarge": {
InstanceType: "c8gd.2xlarge",
VCPU: 8,
MemoryMb: 16384,
GPU: 0,
Architecture: "arm64",
},
"c8gd.48xlarge": {
InstanceType: "c8gd.48xlarge",
VCPU: 192,
MemoryMb: 393216,
GPU: 0,
Architecture: "arm64",
},
"c8gd.4xlarge": {
InstanceType: "c8gd.4xlarge",
VCPU: 16,
MemoryMb: 32768,
GPU: 0,
Architecture: "arm64",
},
"c8gd.8xlarge": {
InstanceType: "c8gd.8xlarge",
VCPU: 32,
MemoryMb: 65536,
GPU: 0,
Architecture: "arm64",
},
"c8gd.large": {
InstanceType: "c8gd.large",
VCPU: 2,
MemoryMb: 4096,
GPU: 0,
Architecture: "arm64",
},
"c8gd.medium": {
InstanceType: "c8gd.medium",
VCPU: 1,
MemoryMb: 2048,
GPU: 0,
Architecture: "arm64",
},
"c8gd.metal-24xl": {
InstanceType: "c8gd.metal-24xl",
VCPU: 96,
MemoryMb: 196608,
GPU: 0,
Architecture: "arm64",
},
"c8gd.metal-48xl": {
InstanceType: "c8gd.metal-48xl",
VCPU: 192,
MemoryMb: 393216,
GPU: 0,
Architecture: "arm64",
},
"c8gd.xlarge": {
InstanceType: "c8gd.xlarge",
VCPU: 4,
MemoryMb: 8192,
GPU: 0,
Architecture: "arm64",
},
"d2.2xlarge": {
InstanceType: "d2.2xlarge",
VCPU: 8,
@ -1509,32 +1607,25 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "amd64",
},
"g3.16xlarge": {
InstanceType: "g3.16xlarge",
VCPU: 64,
MemoryMb: 499712,
GPU: 4,
"f2.12xlarge": {
InstanceType: "f2.12xlarge",
VCPU: 48,
MemoryMb: 524288,
GPU: 0,
Architecture: "amd64",
},
"g3.4xlarge": {
InstanceType: "g3.4xlarge",
VCPU: 16,
MemoryMb: 124928,
GPU: 1,
"f2.48xlarge": {
InstanceType: "f2.48xlarge",
VCPU: 192,
MemoryMb: 2097152,
GPU: 0,
Architecture: "amd64",
},
"g3.8xlarge": {
InstanceType: "g3.8xlarge",
VCPU: 32,
MemoryMb: 249856,
GPU: 2,
Architecture: "amd64",
},
"g3s.xlarge": {
InstanceType: "g3s.xlarge",
VCPU: 4,
MemoryMb: 31232,
GPU: 1,
"f2.6xlarge": {
InstanceType: "f2.6xlarge",
VCPU: 24,
MemoryMb: 262144,
GPU: 0,
Architecture: "amd64",
},
"g4ad.16xlarge": {
@ -2139,6 +2230,230 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "amd64",
},
"i7i.12xlarge": {
InstanceType: "i7i.12xlarge",
VCPU: 48,
MemoryMb: 393216,
GPU: 0,
Architecture: "amd64",
},
"i7i.16xlarge": {
InstanceType: "i7i.16xlarge",
VCPU: 64,
MemoryMb: 524288,
GPU: 0,
Architecture: "amd64",
},
"i7i.24xlarge": {
InstanceType: "i7i.24xlarge",
VCPU: 96,
MemoryMb: 786432,
GPU: 0,
Architecture: "amd64",
},
"i7i.2xlarge": {
InstanceType: "i7i.2xlarge",
VCPU: 8,
MemoryMb: 65536,
GPU: 0,
Architecture: "amd64",
},
"i7i.48xlarge": {
InstanceType: "i7i.48xlarge",
VCPU: 192,
MemoryMb: 1572864,
GPU: 0,
Architecture: "amd64",
},
"i7i.4xlarge": {
InstanceType: "i7i.4xlarge",
VCPU: 16,
MemoryMb: 131072,
GPU: 0,
Architecture: "amd64",
},
"i7i.8xlarge": {
InstanceType: "i7i.8xlarge",
VCPU: 32,
MemoryMb: 262144,
GPU: 0,
Architecture: "amd64",
},
"i7i.large": {
InstanceType: "i7i.large",
VCPU: 2,
MemoryMb: 16384,
GPU: 0,
Architecture: "amd64",
},
"i7i.metal-24xl": {
InstanceType: "i7i.metal-24xl",
VCPU: 96,
MemoryMb: 786432,
GPU: 0,
Architecture: "amd64",
},
"i7i.metal-48xl": {
InstanceType: "i7i.metal-48xl",
VCPU: 192,
MemoryMb: 1572864,
GPU: 0,
Architecture: "amd64",
},
"i7i.xlarge": {
InstanceType: "i7i.xlarge",
VCPU: 4,
MemoryMb: 32768,
GPU: 0,
Architecture: "amd64",
},
"i7ie.12xlarge": {
InstanceType: "i7ie.12xlarge",
VCPU: 48,
MemoryMb: 393216,
GPU: 0,
Architecture: "amd64",
},
"i7ie.18xlarge": {
InstanceType: "i7ie.18xlarge",
VCPU: 72,
MemoryMb: 589824,
GPU: 0,
Architecture: "amd64",
},
"i7ie.24xlarge": {
InstanceType: "i7ie.24xlarge",
VCPU: 96,
MemoryMb: 786432,
GPU: 0,
Architecture: "amd64",
},
"i7ie.2xlarge": {
InstanceType: "i7ie.2xlarge",
VCPU: 8,
MemoryMb: 65536,
GPU: 0,
Architecture: "amd64",
},
"i7ie.3xlarge": {
InstanceType: "i7ie.3xlarge",
VCPU: 12,
MemoryMb: 98304,
GPU: 0,
Architecture: "amd64",
},
"i7ie.48xlarge": {
InstanceType: "i7ie.48xlarge",
VCPU: 192,
MemoryMb: 1572864,
GPU: 0,
Architecture: "amd64",
},
"i7ie.6xlarge": {
InstanceType: "i7ie.6xlarge",
VCPU: 24,
MemoryMb: 196608,
GPU: 0,
Architecture: "amd64",
},
"i7ie.large": {
InstanceType: "i7ie.large",
VCPU: 2,
MemoryMb: 16384,
GPU: 0,
Architecture: "amd64",
},
"i7ie.metal-24xl": {
InstanceType: "i7ie.metal-24xl",
VCPU: 96,
MemoryMb: 786432,
GPU: 0,
Architecture: "amd64",
},
"i7ie.metal-48xl": {
InstanceType: "i7ie.metal-48xl",
VCPU: 192,
MemoryMb: 1572864,
GPU: 0,
Architecture: "amd64",
},
"i7ie.xlarge": {
InstanceType: "i7ie.xlarge",
VCPU: 4,
MemoryMb: 32768,
GPU: 0,
Architecture: "amd64",
},
"i8g.12xlarge": {
InstanceType: "i8g.12xlarge",
VCPU: 48,
MemoryMb: 393216,
GPU: 0,
Architecture: "arm64",
},
"i8g.16xlarge": {
InstanceType: "i8g.16xlarge",
VCPU: 64,
MemoryMb: 524288,
GPU: 0,
Architecture: "arm64",
},
"i8g.24xlarge": {
InstanceType: "i8g.24xlarge",
VCPU: 96,
MemoryMb: 786432,
GPU: 0,
Architecture: "arm64",
},
"i8g.2xlarge": {
InstanceType: "i8g.2xlarge",
VCPU: 8,
MemoryMb: 65536,
GPU: 0,
Architecture: "arm64",
},
"i8g.48xlarge": {
InstanceType: "i8g.48xlarge",
VCPU: 192,
MemoryMb: 1572864,
GPU: 0,
Architecture: "arm64",
},
"i8g.4xlarge": {
InstanceType: "i8g.4xlarge",
VCPU: 16,
MemoryMb: 131072,
GPU: 0,
Architecture: "arm64",
},
"i8g.8xlarge": {
InstanceType: "i8g.8xlarge",
VCPU: 32,
MemoryMb: 262144,
GPU: 0,
Architecture: "arm64",
},
"i8g.large": {
InstanceType: "i8g.large",
VCPU: 2,
MemoryMb: 16384,
GPU: 0,
Architecture: "arm64",
},
"i8g.metal-24xl": {
InstanceType: "i8g.metal-24xl",
VCPU: 96,
MemoryMb: 786432,
GPU: 0,
Architecture: "arm64",
},
"i8g.xlarge": {
InstanceType: "i8g.xlarge",
VCPU: 4,
MemoryMb: 32768,
GPU: 0,
Architecture: "arm64",
},
"im4gn.16xlarge": {
InstanceType: "im4gn.16xlarge",
VCPU: 64,
@ -3504,6 +3819,20 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "arm64",
},
"m7i-flex.12xlarge": {
InstanceType: "m7i-flex.12xlarge",
VCPU: 48,
MemoryMb: 196608,
GPU: 0,
Architecture: "amd64",
},
"m7i-flex.16xlarge": {
InstanceType: "m7i-flex.16xlarge",
VCPU: 64,
MemoryMb: 262144,
GPU: 0,
Architecture: "amd64",
},
"m7i-flex.2xlarge": {
InstanceType: "m7i-flex.2xlarge",
VCPU: 8,
@ -3700,6 +4029,90 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "arm64",
},
"m8gd.12xlarge": {
InstanceType: "m8gd.12xlarge",
VCPU: 48,
MemoryMb: 196608,
GPU: 0,
Architecture: "arm64",
},
"m8gd.16xlarge": {
InstanceType: "m8gd.16xlarge",
VCPU: 64,
MemoryMb: 262144,
GPU: 0,
Architecture: "arm64",
},
"m8gd.24xlarge": {
InstanceType: "m8gd.24xlarge",
VCPU: 96,
MemoryMb: 393216,
GPU: 0,
Architecture: "arm64",
},
"m8gd.2xlarge": {
InstanceType: "m8gd.2xlarge",
VCPU: 8,
MemoryMb: 32768,
GPU: 0,
Architecture: "arm64",
},
"m8gd.48xlarge": {
InstanceType: "m8gd.48xlarge",
VCPU: 192,
MemoryMb: 786432,
GPU: 0,
Architecture: "arm64",
},
"m8gd.4xlarge": {
InstanceType: "m8gd.4xlarge",
VCPU: 16,
MemoryMb: 65536,
GPU: 0,
Architecture: "arm64",
},
"m8gd.8xlarge": {
InstanceType: "m8gd.8xlarge",
VCPU: 32,
MemoryMb: 131072,
GPU: 0,
Architecture: "arm64",
},
"m8gd.large": {
InstanceType: "m8gd.large",
VCPU: 2,
MemoryMb: 8192,
GPU: 0,
Architecture: "arm64",
},
"m8gd.medium": {
InstanceType: "m8gd.medium",
VCPU: 1,
MemoryMb: 4096,
GPU: 0,
Architecture: "arm64",
},
"m8gd.metal-24xl": {
InstanceType: "m8gd.metal-24xl",
VCPU: 96,
MemoryMb: 393216,
GPU: 0,
Architecture: "arm64",
},
"m8gd.metal-48xl": {
InstanceType: "m8gd.metal-48xl",
VCPU: 192,
MemoryMb: 786432,
GPU: 0,
Architecture: "arm64",
},
"m8gd.xlarge": {
InstanceType: "m8gd.xlarge",
VCPU: 4,
MemoryMb: 16384,
GPU: 0,
Architecture: "arm64",
},
"mac1.metal": {
InstanceType: "mac1.metal",
VCPU: 12,
@ -3735,27 +4148,6 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "amd64",
},
"p2.16xlarge": {
InstanceType: "p2.16xlarge",
VCPU: 64,
MemoryMb: 749568,
GPU: 16,
Architecture: "amd64",
},
"p2.8xlarge": {
InstanceType: "p2.8xlarge",
VCPU: 32,
MemoryMb: 499712,
GPU: 8,
Architecture: "amd64",
},
"p2.xlarge": {
InstanceType: "p2.xlarge",
VCPU: 4,
MemoryMb: 62464,
GPU: 1,
Architecture: "amd64",
},
"p3.16xlarge": {
InstanceType: "p3.16xlarge",
VCPU: 64,
@ -3805,6 +4197,13 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 8,
Architecture: "amd64",
},
"p5en.48xlarge": {
InstanceType: "p5en.48xlarge",
VCPU: 192,
MemoryMb: 2097152,
GPU: 8,
Architecture: "amd64",
},
"r3.2xlarge": {
InstanceType: "r3.2xlarge",
VCPU: 8,
@ -5233,6 +5632,90 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "arm64",
},
"r8gd.12xlarge": {
InstanceType: "r8gd.12xlarge",
VCPU: 48,
MemoryMb: 393216,
GPU: 0,
Architecture: "arm64",
},
"r8gd.16xlarge": {
InstanceType: "r8gd.16xlarge",
VCPU: 64,
MemoryMb: 524288,
GPU: 0,
Architecture: "arm64",
},
"r8gd.24xlarge": {
InstanceType: "r8gd.24xlarge",
VCPU: 96,
MemoryMb: 786432,
GPU: 0,
Architecture: "arm64",
},
"r8gd.2xlarge": {
InstanceType: "r8gd.2xlarge",
VCPU: 8,
MemoryMb: 65536,
GPU: 0,
Architecture: "arm64",
},
"r8gd.48xlarge": {
InstanceType: "r8gd.48xlarge",
VCPU: 192,
MemoryMb: 1572864,
GPU: 0,
Architecture: "arm64",
},
"r8gd.4xlarge": {
InstanceType: "r8gd.4xlarge",
VCPU: 16,
MemoryMb: 131072,
GPU: 0,
Architecture: "arm64",
},
"r8gd.8xlarge": {
InstanceType: "r8gd.8xlarge",
VCPU: 32,
MemoryMb: 262144,
GPU: 0,
Architecture: "arm64",
},
"r8gd.large": {
InstanceType: "r8gd.large",
VCPU: 2,
MemoryMb: 16384,
GPU: 0,
Architecture: "arm64",
},
"r8gd.medium": {
InstanceType: "r8gd.medium",
VCPU: 1,
MemoryMb: 8192,
GPU: 0,
Architecture: "arm64",
},
"r8gd.metal-24xl": {
InstanceType: "r8gd.metal-24xl",
VCPU: 96,
MemoryMb: 786432,
GPU: 0,
Architecture: "arm64",
},
"r8gd.metal-48xl": {
InstanceType: "r8gd.metal-48xl",
VCPU: 192,
MemoryMb: 1572864,
GPU: 0,
Architecture: "arm64",
},
"r8gd.xlarge": {
InstanceType: "r8gd.xlarge",
VCPU: 4,
MemoryMb: 32768,
GPU: 0,
Architecture: "arm64",
},
"t1.micro": {
InstanceType: "t1.micro",
VCPU: 1,
@ -5513,6 +5996,20 @@ var InstanceTypes = map[string]*InstanceType{
GPU: 0,
Architecture: "amd64",
},
"u7i-6tb.112xlarge": {
InstanceType: "u7i-6tb.112xlarge",
VCPU: 448,
MemoryMb: 6291456,
GPU: 0,
Architecture: "amd64",
},
"u7i-8tb.112xlarge": {
InstanceType: "u7i-8tb.112xlarge",
VCPU: 448,
MemoryMb: 8388608,
GPU: 0,
Architecture: "amd64",
},
"u7in-16tb.224xlarge": {
InstanceType: "u7in-16tb.224xlarge",
VCPU: 896,

View File

@ -51,7 +51,7 @@ rules:
resources: ["statefulsets", "replicasets", "daemonsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities", "volumeattachments"]
verbs: ["watch", "list", "get"]
- apiGroups: ["batch", "extensions"]
resources: ["jobs"]
@ -146,7 +146,7 @@ spec:
type: RuntimeDefault
serviceAccountName: cluster-autoscaler
containers:
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.32.1
name: cluster-autoscaler
resources:
limits:

View File

@ -51,7 +51,7 @@ rules:
resources: ["statefulsets", "replicasets", "daemonsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities", "volumeattachments"]
verbs: ["watch", "list", "get"]
- apiGroups: ["batch", "extensions"]
resources: ["jobs"]
@ -146,7 +146,7 @@ spec:
type: RuntimeDefault
serviceAccountName: cluster-autoscaler
containers:
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.32.1
name: cluster-autoscaler
resources:
limits:

View File

@ -51,7 +51,7 @@ rules:
resources: ["statefulsets", "replicasets", "daemonsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities", "volumeattachments"]
verbs: ["watch", "list", "get"]
- apiGroups: ["batch", "extensions"]
resources: ["jobs"]
@ -146,7 +146,7 @@ spec:
type: RuntimeDefault
serviceAccountName: cluster-autoscaler
containers:
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.32.1
name: cluster-autoscaler
resources:
limits:

View File

@ -51,7 +51,7 @@ rules:
resources: ["statefulsets", "replicasets", "daemonsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities", "volumeattachments"]
verbs: ["watch", "list", "get"]
- apiGroups: ["batch", "extensions"]
resources: ["jobs"]
@ -153,7 +153,7 @@ spec:
nodeSelector:
kubernetes.io/role: control-plane
containers:
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.32.1
name: cluster-autoscaler
resources:
limits:

View File

@ -20,7 +20,6 @@ import (
"context"
"fmt"
"net/http"
"strings"
"testing"
"time"
@ -422,8 +421,7 @@ func TestDeleteInstances(t *testing.T) {
},
}, nil)
err = as.DeleteInstances(instances)
expectedErrStr := "The specified account is disabled."
assert.True(t, strings.Contains(err.Error(), expectedErrStr))
assert.Error(t, err)
}
func TestAgentPoolDeleteNodes(t *testing.T) {
@ -478,8 +476,7 @@ func TestAgentPoolDeleteNodes(t *testing.T) {
ObjectMeta: v1.ObjectMeta{Name: "node"},
},
})
expectedErrStr := "The specified account is disabled."
assert.True(t, strings.Contains(err.Error(), expectedErrStr))
assert.Error(t, err)
as.minSize = 3
err = as.DeleteNodes([]*apiv1.Node{})

View File

@ -25,6 +25,7 @@ import (
"sync"
"time"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2022-08-01/compute"
"github.com/Azure/go-autorest/autorest/to"
"github.com/Azure/skewer"
@ -67,13 +68,18 @@ type azureCache struct {
// Cache content.
// resourceGroup specifies the name of the resource group that this cache tracks
resourceGroup string
// resourceGroup specifies the name of the node resource group that this cache tracks
resourceGroup string
clusterResourceGroup string
clusterName string
// enableVMsAgentPool specifies whether VMs agent pool type is supported.
enableVMsAgentPool bool
// vmType can be one of vmTypeVMSS (default), vmTypeStandard
vmType string
vmsPoolSet map[string]struct{} // track the nodepools that're vms pool
vmsPoolMap map[string]armcontainerservice.AgentPool // tracks the node pools that are VMs pools
// scaleSets keeps the set of all known scalesets in the resource group, populated/refreshed via VMSS.List() call.
// It is only used/populated if vmType is vmTypeVMSS (default).
@ -106,8 +112,11 @@ func newAzureCache(client *azClient, cacheTTL time.Duration, config Config) (*az
azClient: client,
refreshInterval: cacheTTL,
resourceGroup: config.ResourceGroup,
clusterResourceGroup: config.ClusterResourceGroup,
clusterName: config.ClusterName,
enableVMsAgentPool: config.EnableVMsAgentPool,
vmType: config.VMType,
vmsPoolSet: make(map[string]struct{}),
vmsPoolMap: make(map[string]armcontainerservice.AgentPool),
scaleSets: make(map[string]compute.VirtualMachineScaleSet),
virtualMachines: make(map[string][]compute.VirtualMachine),
registeredNodeGroups: make([]cloudprovider.NodeGroup, 0),
@ -130,11 +139,11 @@ func newAzureCache(client *azClient, cacheTTL time.Duration, config Config) (*az
return cache, nil
}
func (m *azureCache) getVMsPoolSet() map[string]struct{} {
func (m *azureCache) getVMsPoolMap() map[string]armcontainerservice.AgentPool {
m.mutex.Lock()
defer m.mutex.Unlock()
return m.vmsPoolSet
return m.vmsPoolMap
}
func (m *azureCache) getVirtualMachines() map[string][]compute.VirtualMachine {
@ -232,13 +241,20 @@ func (m *azureCache) fetchAzureResources() error {
return err
}
m.scaleSets = vmssResult
vmResult, vmsPoolSet, err := m.fetchVirtualMachines()
vmResult, err := m.fetchVirtualMachines()
if err != nil {
return err
}
// we fetch both sets of resources since CAS may operate on mixed nodepools
m.virtualMachines = vmResult
m.vmsPoolSet = vmsPoolSet
// fetch VMs pools if enabled
if m.enableVMsAgentPool {
vmsPoolMap, err := m.fetchVMsPools()
if err != nil {
return err
}
m.vmsPoolMap = vmsPoolMap
}
return nil
}
@ -251,19 +267,17 @@ const (
)
// fetchVirtualMachines returns the updated list of virtual machines in the config resource group using the Azure API.
func (m *azureCache) fetchVirtualMachines() (map[string][]compute.VirtualMachine, map[string]struct{}, error) {
func (m *azureCache) fetchVirtualMachines() (map[string][]compute.VirtualMachine, error) {
ctx, cancel := getContextWithCancel()
defer cancel()
result, err := m.azClient.virtualMachinesClient.List(ctx, m.resourceGroup)
if err != nil {
klog.Errorf("VirtualMachinesClient.List in resource group %q failed: %v", m.resourceGroup, err)
return nil, nil, err.Error()
return nil, err.Error()
}
instances := make(map[string][]compute.VirtualMachine)
// track the nodepools that're vms pools
vmsPoolSet := make(map[string]struct{})
for _, instance := range result {
if instance.Tags == nil {
continue
@ -280,20 +294,43 @@ func (m *azureCache) fetchVirtualMachines() (map[string][]compute.VirtualMachine
}
instances[to.String(vmPoolName)] = append(instances[to.String(vmPoolName)], instance)
}
return instances, nil
}
// if the nodepool is already in the map, skip it
if _, ok := vmsPoolSet[to.String(vmPoolName)]; ok {
continue
// fetchVMsPools returns a map from pool name to agent pool for all the VMs pools in the cluster
func (m *azureCache) fetchVMsPools() (map[string]armcontainerservice.AgentPool, error) {
ctx, cancel := getContextWithTimeout(vmsContextTimeout)
defer cancel()
// defensive check, should never happen when enableVMsAgentPool toggle is on
if m.azClient.agentPoolClient == nil {
return nil, errors.New("agentPoolClient is nil")
}
vmsPoolMap := make(map[string]armcontainerservice.AgentPool)
pager := m.azClient.agentPoolClient.NewListPager(m.clusterResourceGroup, m.clusterName, nil)
var aps []*armcontainerservice.AgentPool
for pager.More() {
resp, err := pager.NextPage(ctx)
if err != nil {
klog.Errorf("agentPoolClient.pager.NextPage in cluster %s resource group %s failed: %v",
m.clusterName, m.clusterResourceGroup, err)
return nil, err
}
aps = append(aps, resp.Value...)
}
// nodes from vms pool will have tag "aks-managed-agentpool-type" set to "VirtualMachines"
if agentpoolType := tags[agentpoolTypeTag]; agentpoolType != nil {
if strings.EqualFold(to.String(agentpoolType), vmsPoolType) {
vmsPoolSet[to.String(vmPoolName)] = struct{}{}
}
for _, ap := range aps {
if ap != nil && ap.Name != nil && ap.Properties != nil && ap.Properties.Type != nil &&
*ap.Properties.Type == armcontainerservice.AgentPoolTypeVirtualMachines {
// we only care about VMs pools, skip other types
klog.V(6).Infof("Found VMs pool %q", *ap.Name)
vmsPoolMap[*ap.Name] = *ap
}
}
return instances, vmsPoolSet, nil
return vmsPoolMap, nil
}
// fetchScaleSets returns the updated list of scale sets in the config resource group using the Azure API.
@ -422,7 +459,7 @@ func (m *azureCache) HasInstance(providerID string) (bool, error) {
// FindForInstance returns node group of the given Instance
func (m *azureCache) FindForInstance(instance *azureRef, vmType string) (cloudprovider.NodeGroup, error) {
vmsPoolSet := m.getVMsPoolSet()
vmsPoolMap := m.getVMsPoolMap()
m.mutex.Lock()
defer m.mutex.Unlock()
@ -441,7 +478,7 @@ func (m *azureCache) FindForInstance(instance *azureRef, vmType string) (cloudpr
}
// cluster with vmss pool only
if vmType == providerazureconsts.VMTypeVMSS && len(vmsPoolSet) == 0 {
if vmType == providerazureconsts.VMTypeVMSS && len(vmsPoolMap) == 0 {
if m.areAllScaleSetsUniform() {
// Omit virtual machines not managed by vmss only in case of uniform scale set.
if ok := virtualMachineRE.Match([]byte(inst.Name)); ok {

View File

@ -22,9 +22,42 @@ import (
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
providerazureconsts "sigs.k8s.io/cloud-provider-azure/pkg/consts"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
"github.com/Azure/go-autorest/autorest/to"
"github.com/stretchr/testify/assert"
"go.uber.org/mock/gomock"
)
func TestFetchVMsPools(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
provider := newTestProvider(t)
ac := provider.azureManager.azureCache
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
ac.azClient.agentPoolClient = mockAgentpoolclient
vmsPool := getTestVMsAgentPool(false)
vmssPoolType := armcontainerservice.AgentPoolTypeVirtualMachineScaleSets
vmssPool := armcontainerservice.AgentPool{
Name: to.StringPtr("vmsspool1"),
Properties: &armcontainerservice.ManagedClusterAgentPoolProfileProperties{
Type: &vmssPoolType,
},
}
invalidPool := armcontainerservice.AgentPool{}
fakeAPListPager := getFakeAgentpoolListPager(&vmsPool, &vmssPool, &invalidPool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).
Return(fakeAPListPager)
vmsPoolMap, err := ac.fetchVMsPools()
assert.NoError(t, err)
assert.Equal(t, 1, len(vmsPoolMap))
_, ok := vmsPoolMap[to.String(vmsPool.Name)]
assert.True(t, ok)
}
func TestRegister(t *testing.T) {
provider := newTestProvider(t)
ss := newTestScaleSet(provider.azureManager, "ss")


@ -19,6 +19,8 @@ package azure
import (
"context"
"fmt"
"os"
"time"
_ "go.uber.org/mock/mockgen/model" // for go:generate
@ -29,7 +31,7 @@ import (
azurecore_policy "github.com/Azure/azure-sdk-for-go/sdk/azcore/policy"
"github.com/Azure/azure-sdk-for-go/sdk/azcore/runtime"
"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v4"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2022-08-01/compute"
"github.com/Azure/go-autorest/autorest"
"github.com/Azure/go-autorest/autorest/azure"
@ -47,7 +49,12 @@ import (
providerazureconfig "sigs.k8s.io/cloud-provider-azure/pkg/provider/config"
)
//go:generate sh -c "mockgen k8s.io/autoscaler/cluster-autoscaler/cloudprovider/azure AgentPoolsClient >./agentpool_client.go"
//go:generate sh -c "mockgen -source=azure_client.go -destination azure_mock_agentpool_client.go -package azure -exclude_interfaces DeploymentsClient"
const (
vmsContextTimeout = 5 * time.Minute
vmsAsyncContextTimeout = 30 * time.Minute
)
// AgentPoolsClient interface defines the methods needed for scaling vms pool.
// it is implemented by track2 sdk armcontainerservice.AgentPoolsClient
@ -68,52 +75,89 @@ type AgentPoolsClient interface {
machines armcontainerservice.AgentPoolDeleteMachinesParameter,
options *armcontainerservice.AgentPoolsClientBeginDeleteMachinesOptions) (
*runtime.Poller[armcontainerservice.AgentPoolsClientDeleteMachinesResponse], error)
NewListPager(
resourceGroupName, resourceName string,
options *armcontainerservice.AgentPoolsClientListOptions,
) *runtime.Pager[armcontainerservice.AgentPoolsClientListResponse]
}
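// Illustrative sketch, not part of this diff: driving one of the Begin*
// long-running operations declared above. PollUntilDone blocks until the
// operation completes; all argument values are placeholders.
func createOrUpdatePoolExample(ctx context.Context, client AgentPoolsClient, rg, cluster, pool string, ap armcontainerservice.AgentPool) error {
poller, err := client.BeginCreateOrUpdate(ctx, rg, cluster, pool, ap, nil)
if err != nil {
return err
}
_, err = poller.PollUntilDone(ctx, nil)
return err
}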
func getAgentpoolClientCredentials(cfg *Config) (azcore.TokenCredential, error) {
if cfg.AuthMethod == "" || cfg.AuthMethod == authMethodPrincipal {
// Use MSI
if cfg.UseManagedIdentityExtension {
// Use System Assigned MSI
if cfg.UserAssignedIdentityID == "" {
klog.V(4).Info("Agentpool client: using System Assigned MSI to retrieve access token")
return azidentity.NewManagedIdentityCredential(nil)
}
// Use User Assigned MSI
klog.V(4).Info("Agentpool client: using User Assigned MSI to retrieve access token")
return azidentity.NewManagedIdentityCredential(&azidentity.ManagedIdentityCredentialOptions{
ID: azidentity.ClientID(cfg.UserAssignedIdentityID),
})
}
// Use Service Principal with ClientID and ClientSecret
if cfg.AADClientID != "" && cfg.AADClientSecret != "" {
klog.V(2).Infoln("Agentpool client: using client_id+client_secret to retrieve access token")
return azidentity.NewClientSecretCredential(cfg.TenantID, cfg.AADClientID, cfg.AADClientSecret, nil)
}
// Use Service Principal with ClientCert and AADClientCertPassword
if cfg.AADClientID != "" && cfg.AADClientCertPath != "" {
klog.V(2).Infoln("Agentpool client: using client_cert+client_private_key to retrieve access token")
certData, err := os.ReadFile(cfg.AADClientCertPath)
if err != nil {
return nil, fmt.Errorf("reading the client certificate from file %s failed with error: %w", cfg.AADClientCertPath, err)
}
certs, privateKey, err := azidentity.ParseCertificates(certData, []byte(cfg.AADClientCertPassword))
if err != nil {
return nil, fmt.Errorf("parsing service principal certificate data failed with error: %w", err)
}
return azidentity.NewClientCertificateCredential(cfg.TenantID, cfg.AADClientID, certs, privateKey, &azidentity.ClientCertificateCredentialOptions{
SendCertificateChain: true,
})
}
}
if cfg.UseFederatedWorkloadIdentityExtension {
klog.V(4).Info("Agentpool client: using workload identity for access token")
return azidentity.NewWorkloadIdentityCredential(&azidentity.WorkloadIdentityCredentialOptions{
TokenFilePath: cfg.AADFederatedTokenFile,
})
}
return nil, fmt.Errorf("unsupported authorization method: %s", cfg.AuthMethod)
}
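// Illustrative sketch, not part of this diff: the selection order implemented
// above for the default/principal auth method. Field values are invented.
func credentialSelectionExample() (azcore.TokenCredential, error) {
cfg := &Config{}
cfg.UseManagedIdentityExtension = true // 1) MSI (system- or user-assigned) wins first
// 2) otherwise AADClientID+AADClientSecret -> client secret credential
// 3) otherwise AADClientID+AADClientCertPath -> client certificate credential
// 4) otherwise UseFederatedWorkloadIdentityExtension -> workload identity
return getAgentpoolClientCredentials(cfg)
}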
func newAgentpoolClient(cfg *Config) (AgentPoolsClient, error) {
retryOptions := azextensions.DefaultRetryOpts()
cred, err := getAgentpoolClientCredentials(cfg)
if err != nil {
klog.Errorf("failed to get agent pool client credentials: %v", err)
return nil, err
}
env := azure.PublicCloud // default to public cloud
if cfg.Cloud != "" {
var err error
env, err = azure.EnvironmentFromName(cfg.Cloud)
if err != nil {
klog.Errorf("failed to get environment from name %s: with error: %v", cfg.Cloud, err)
return nil, err
}
}
if cfg.ARMBaseURLForAPClient != "" {
klog.V(10).Infof("Using ARMBaseURLForAPClient to create agent pool client")
return newAgentpoolClientWithConfig(cfg.SubscriptionID, cred, cfg.ARMBaseURLForAPClient, env.TokenAudience, retryOptions, true /*insecureAllowCredentialWithHTTP*/)
}
return newAgentpoolClientWithConfig(cfg.SubscriptionID, cred, env.ResourceManagerEndpoint, env.TokenAudience, retryOptions, false /*insecureAllowCredentialWithHTTP*/)
}
func newAgentpoolClientWithConfig(subscriptionID string, cred azcore.TokenCredential,
cloudCfgEndpoint, cloudCfgAudience string, retryOptions azurecore_policy.RetryOptions, insecureAllowCredentialWithHTTP bool) (AgentPoolsClient, error) {
agentPoolsClient, err := armcontainerservice.NewAgentPoolsClient(subscriptionID, cred,
&policy.ClientOptions{
ClientOptions: azurecore_policy.ClientOptions{
@ -125,9 +169,10 @@ func newAgentpoolClientWithConfig(subscriptionID string, cred azcore.TokenCreden
},
},
},
InsecureAllowCredentialWithHTTP: insecureAllowCredentialWithHTTP,
Telemetry: azextensions.DefaultTelemetryOpts(getUserAgentExtension()),
Transport: azextensions.DefaultHTTPClient(),
Retry: retryOptions,
},
})
@ -139,26 +184,6 @@ func newAgentpoolClientWithConfig(subscriptionID string, cred azcore.TokenCreden
return agentPoolsClient, nil
}
type azClient struct {
virtualMachineScaleSetsClient vmssclient.Interface
virtualMachineScaleSetVMsClient vmssvmclient.Interface
@ -232,9 +257,11 @@ func newAzClient(cfg *Config, env *azure.Environment) (*azClient, error) {
agentPoolClient, err := newAgentpoolClient(cfg)
if err != nil {
klog.Errorf("newAgentpoolClient failed with error: %s", err)
if cfg.EnableVMsAgentPool {
// only return error if VMs agent pool is supported which is controlled by toggle
return nil, err
}
}
return &azClient{


@ -20,6 +20,7 @@ import (
"fmt"
"testing"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2022-08-01/compute"
"github.com/Azure/azure-sdk-for-go/services/resources/mgmt/2017-05-10/resources"
"github.com/Azure/go-autorest/autorest/to"
@ -132,7 +133,7 @@ func TestNodeGroups(t *testing.T) {
)
assert.True(t, registered)
registered = provider.azureManager.RegisterNodeGroup(
newTestVMsPool(provider.azureManager, "test-vms-pool"),
newTestVMsPool(provider.azureManager),
)
assert.True(t, registered)
assert.Equal(t, len(provider.NodeGroups()), 2)
@ -146,9 +147,14 @@ func TestHasInstance(t *testing.T) {
mockVMSSClient := mockvmssclient.NewMockInterface(ctrl)
mockVMClient := mockvmclient.NewMockInterface(ctrl)
mockVMSSVMClient := mockvmssvmclient.NewMockInterface(ctrl)
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
provider.azureManager.azClient.virtualMachinesClient = mockVMClient
provider.azureManager.azClient.virtualMachineScaleSetsClient = mockVMSSClient
provider.azureManager.azClient.virtualMachineScaleSetVMsClient = mockVMSSVMClient
provider.azureManager.azClient.agentPoolClient = mockAgentpoolclient
provider.azureManager.azureCache.clusterName = "test-cluster"
provider.azureManager.azureCache.clusterResourceGroup = "test-rg"
provider.azureManager.azureCache.enableVMsAgentPool = true // enable VMs agent pool to support mixed node group types
// Simulate node groups and instances
expectedScaleSets := newTestVMSSList(3, "test-asg", "eastus", compute.Uniform)
@ -158,6 +164,20 @@ func TestHasInstance(t *testing.T) {
mockVMSSClient.EXPECT().List(gomock.Any(), provider.azureManager.config.ResourceGroup).Return(expectedScaleSets, nil).AnyTimes()
mockVMClient.EXPECT().List(gomock.Any(), provider.azureManager.config.ResourceGroup).Return(expectedVMsPoolVMs, nil).AnyTimes()
mockVMSSVMClient.EXPECT().List(gomock.Any(), provider.azureManager.config.ResourceGroup, "test-asg", gomock.Any()).Return(expectedVMSSVMs, nil).AnyTimes()
vmssType := armcontainerservice.AgentPoolTypeVirtualMachineScaleSets
vmssPool := armcontainerservice.AgentPool{
Name: to.StringPtr("test-asg"),
Properties: &armcontainerservice.ManagedClusterAgentPoolProfileProperties{
Type: &vmssType,
},
}
vmsPool := getTestVMsAgentPool(false)
fakeAPListPager := getFakeAgentpoolListPager(&vmssPool, &vmsPool)
mockAgentpoolclient.EXPECT().NewListPager(
provider.azureManager.azureCache.clusterResourceGroup,
provider.azureManager.azureCache.clusterName, nil).
Return(fakeAPListPager).AnyTimes()
// Register node groups
assert.Equal(t, len(provider.NodeGroups()), 0)
@ -168,9 +188,9 @@ func TestHasInstance(t *testing.T) {
assert.True(t, registered)
registered = provider.azureManager.RegisterNodeGroup(
newTestVMsPool(provider.azureManager, "test-vms-pool"),
newTestVMsPool(provider.azureManager),
)
provider.azureManager.explicitlyConfigured["test-vms-pool"] = true
provider.azureManager.explicitlyConfigured[vmsNodeGroupName] = true
assert.True(t, registered)
assert.Equal(t, len(provider.NodeGroups()), 2)
@ -264,9 +284,14 @@ func TestMixedNodeGroups(t *testing.T) {
mockVMSSClient := mockvmssclient.NewMockInterface(ctrl)
mockVMClient := mockvmclient.NewMockInterface(ctrl)
mockVMSSVMClient := mockvmssvmclient.NewMockInterface(ctrl)
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
provider.azureManager.azClient.virtualMachinesClient = mockVMClient
provider.azureManager.azClient.virtualMachineScaleSetsClient = mockVMSSClient
provider.azureManager.azClient.virtualMachineScaleSetVMsClient = mockVMSSVMClient
provider.azureManager.azureCache.clusterName = "test-cluster"
provider.azureManager.azureCache.clusterResourceGroup = "test-rg"
provider.azureManager.azureCache.enableVMsAgentPool = true // enable VMs agent pool to support mixed node group types
provider.azureManager.azClient.agentPoolClient = mockAgentpoolclient
expectedScaleSets := newTestVMSSList(3, "test-asg", "eastus", compute.Uniform)
expectedVMsPoolVMs := newTestVMsPoolVMList(3)
@ -276,6 +301,19 @@ func TestMixedNodeGroups(t *testing.T) {
mockVMClient.EXPECT().List(gomock.Any(), provider.azureManager.config.ResourceGroup).Return(expectedVMsPoolVMs, nil).AnyTimes()
mockVMSSVMClient.EXPECT().List(gomock.Any(), provider.azureManager.config.ResourceGroup, "test-asg", gomock.Any()).Return(expectedVMSSVMs, nil).AnyTimes()
vmssType := armcontainerservice.AgentPoolTypeVirtualMachineScaleSets
vmssPool := armcontainerservice.AgentPool{
Name: to.StringPtr("test-asg"),
Properties: &armcontainerservice.ManagedClusterAgentPoolProfileProperties{
Type: &vmssType,
},
}
vmsPool := getTestVMsAgentPool(false)
fakeAPListPager := getFakeAgentpoolListPager(&vmssPool, &vmsPool)
mockAgentpoolclient.EXPECT().NewListPager(provider.azureManager.azureCache.clusterResourceGroup, provider.azureManager.azureCache.clusterName, nil).
Return(fakeAPListPager).AnyTimes()
assert.Equal(t, len(provider.NodeGroups()), 0)
registered := provider.azureManager.RegisterNodeGroup(
newTestScaleSet(provider.azureManager, "test-asg"),
@ -284,9 +322,9 @@ func TestMixedNodeGroups(t *testing.T) {
assert.True(t, registered)
registered = provider.azureManager.RegisterNodeGroup(
newTestVMsPool(provider.azureManager, "test-vms-pool"),
newTestVMsPool(provider.azureManager),
)
provider.azureManager.explicitlyConfigured["test-vms-pool"] = true
provider.azureManager.explicitlyConfigured[vmsNodeGroupName] = true
assert.True(t, registered)
assert.Equal(t, len(provider.NodeGroups()), 2)
@ -307,7 +345,7 @@ func TestMixedNodeGroups(t *testing.T) {
group, err = provider.NodeGroupForNode(vmsPoolNode)
assert.NoError(t, err)
assert.NotNil(t, group, "Group should not be nil")
assert.Equal(t, group.Id(), "test-vms-pool")
assert.Equal(t, group.Id(), vmsNodeGroupName)
assert.Equal(t, group.MinSize(), 3)
assert.Equal(t, group.MaxSize(), 10)
}


@ -86,6 +86,9 @@ type Config struct {
// EnableForceDelete defines whether to enable force deletion on the APIs
EnableForceDelete bool `json:"enableForceDelete,omitempty" yaml:"enableForceDelete,omitempty"`
// EnableVMsAgentPool defines whether to support VMs agentpool type in addition to VMSS type
EnableVMsAgentPool bool `json:"enableVMsAgentPool,omitempty" yaml:"enableVMsAgentPool,omitempty"`
// (DEPRECATED, DO NOT USE) EnableDynamicInstanceList defines whether to enable dynamic instance workflow for instance information check
EnableDynamicInstanceList bool `json:"enableDynamicInstanceList,omitempty" yaml:"enableDynamicInstanceList,omitempty"`
@ -122,6 +125,7 @@ func BuildAzureConfig(configReader io.Reader) (*Config, error) {
// Static defaults
cfg.EnableDynamicInstanceList = false
cfg.EnableVmssFlexNodes = false
cfg.EnableVMsAgentPool = false
cfg.CloudProviderBackoffRetries = providerazureconsts.BackoffRetriesDefault
cfg.CloudProviderBackoffExponent = providerazureconsts.BackoffExponentDefault
cfg.CloudProviderBackoffDuration = providerazureconsts.BackoffDurationDefault
@ -257,6 +261,9 @@ func BuildAzureConfig(configReader io.Reader) (*Config, error) {
if _, err = assignBoolFromEnvIfExists(&cfg.StrictCacheUpdates, "AZURE_STRICT_CACHE_UPDATES"); err != nil {
return nil, err
}
if _, err = assignBoolFromEnvIfExists(&cfg.EnableVMsAgentPool, "AZURE_ENABLE_VMS_AGENT_POOLS"); err != nil {
return nil, err
}
if _, err = assignBoolFromEnvIfExists(&cfg.EnableDynamicInstanceList, "AZURE_ENABLE_DYNAMIC_INSTANCE_LIST"); err != nil {
return nil, err
}
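// Illustrative sketch, not part of this diff: the toggle can come from the
// JSON config or the environment, and the env assignment above runs after the
// file is parsed, so the environment wins. Values are invented.
func enableVMsAgentPoolExample() (*Config, error) {
os.Setenv("AZURE_ENABLE_VMS_AGENT_POOLS", "true") // overrides the file value below
return BuildAzureConfig(strings.NewReader(`{"enableVMsAgentPool": false}`))
}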


@ -22,80 +22,79 @@ import (
"regexp"
"strings"
"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2022-08-01/compute"
"k8s.io/klog/v2"
)
// GetInstanceTypeStatically uses static list of vmss generated at azure_instance_types.go to fetch vmss instance information.
// It is declared as a variable for testing purpose.
var GetInstanceTypeStatically = func(template NodeTemplate) (*InstanceType, error) {
var instanceType *InstanceType
for k := range InstanceTypes {
if strings.EqualFold(k, template.SkuName) {
instanceType = InstanceTypes[k]
break
}
}
promoRe := regexp.MustCompile(`(?i)_promo`)
if promoRe.MatchString(template.SkuName) {
if instanceType == nil {
// We didn't find an exact match but this is a promo type, check for matching standard
klog.V(4).Infof("No exact match found for %s, checking standard types", template.SkuName)
skuName := promoRe.ReplaceAllString(template.SkuName, "")
for k := range InstanceTypes {
if strings.EqualFold(k, skuName) {
instanceType = InstanceTypes[k]
break
}
}
}
}
if instanceType == nil {
return instanceType, fmt.Errorf("instance type %q not supported", template.SkuName)
}
return instanceType, nil
}
// GetInstanceTypeDynamically fetches vmss instance information using sku api calls.
// It is declared as a variable for testing purpose.
var GetInstanceTypeDynamically = func(template NodeTemplate, azCache *azureCache) (InstanceType, error) {
ctx := context.Background()
var instanceType InstanceType
sku, err := azCache.GetSKU(ctx, template.SkuName, template.Location)
if err != nil {
// We didn't find an exact match but this is a promo type, check for matching standard
promoRe := regexp.MustCompile(`(?i)_promo`)
skuName := promoRe.ReplaceAllString(template.SkuName, "")
if skuName != template.SkuName {
klog.V(1).Infof("No exact match found for %q, checking standard type %q. Error %v", template.SkuName, skuName, err)
sku, err = azCache.GetSKU(ctx, skuName, template.Location)
}
if err != nil {
return instanceType, fmt.Errorf("instance type %q not supported. Error %v", template.SkuName, err)
}
}
instanceType.VCPU, err = sku.VCPU()
if err != nil {
klog.V(1).Infof("Failed to parse vcpu from sku %q %v", template.SkuName, err)
return instanceType, err
}
gpu, err := getGpuFromSku(sku)
if err != nil {
klog.V(1).Infof("Failed to parse gpu from sku %q %v", template.SkuName, err)
return instanceType, err
}
instanceType.GPU = gpu
memoryGb, err := sku.Memory()
if err != nil {
klog.V(1).Infof("Failed to parse memoryMb from sku %q %v", template.SkuName, err)
return instanceType, err
}
instanceType.MemoryMb = int64(memoryGb) * 1024
return instanceType, nil
}
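// Illustrative sketch, not part of this diff: both lookups now take the
// provider-agnostic NodeTemplate introduced in this change, so VMSS and VMs
// pool callers can share them. SKU and location values are invented.
func instanceTypeLookupExample(azCache *azureCache) (*InstanceType, error) {
template := NodeTemplate{SkuName: "Standard_D2s_v3", Location: "eastus"}
if it, err := GetInstanceTypeDynamically(template, azCache); err == nil {
return &it, nil
}
// mirror the fallback order used by buildNodeFromTemplate later in this change
return GetInstanceTypeStatically(template)
}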


@ -168,6 +168,23 @@ func (m *AzureManager) fetchExplicitNodeGroups(specs []string) error {
return nil
}
// parseSKUAndVMsAgentpoolNameFromSpecName parses the spec name for a mixed-SKU VMs pool.
// The spec name should be in the format <agentpoolname>/<sku>, e.g., "mypool1/Standard_D2s_v3", if the agent pool is a VMs pool.
// This method returns a boolean indicating if the agent pool is a VMs pool, along with the agent pool name and SKU.
func (m *AzureManager) parseSKUAndVMsAgentpoolNameFromSpecName(name string) (bool, string, string) {
parts := strings.Split(name, "/")
if len(parts) == 2 {
agentPoolName := parts[0]
sku := parts[1]
vmsPoolMap := m.azureCache.getVMsPoolMap()
if _, ok := vmsPoolMap[agentPoolName]; ok {
return true, agentPoolName, sku
}
}
return false, "", ""
}
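// Illustrative sketch, not part of this diff: given a cache that knows about a
// VMs pool named "mypool1", a spec name "mypool1/Standard_D2s_v3" parses into
// (true, "mypool1", "Standard_D2s_v3"); any other shape yields (false, "", "").
func specNameParseExample(m *AzureManager) (bool, string, string) {
return m.parseSKUAndVMsAgentpoolNameFromSpecName("mypool1/Standard_D2s_v3")
}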
func (m *AzureManager) buildNodeGroupFromSpec(spec string) (cloudprovider.NodeGroup, error) {
scaleToZeroSupported := scaleToZeroSupportedStandard
if strings.EqualFold(m.config.VMType, providerazureconsts.VMTypeVMSS) {
@ -177,9 +194,13 @@ func (m *AzureManager) buildNodeGroupFromSpec(spec string) (cloudprovider.NodeGr
if err != nil {
return nil, fmt.Errorf("failed to parse node group spec: %v", err)
}
// Starting from release 1.30, a cluster may have both VMSS and VMs pools.
// Therefore, we cannot solely rely on the VMType to determine the node group type.
// Instead, we need to check the cache to determine if the agent pool is a VMs pool.
isVMsPool, agentPoolName, sku := m.parseSKUAndVMsAgentpoolNameFromSpecName(s.Name)
if isVMsPool {
return NewVMPool(s, m, agentPoolName, sku)
}
switch m.config.VMType {


@ -297,6 +297,7 @@ func TestCreateAzureManagerValidConfig(t *testing.T) {
VmssVmsCacheJitter: 120,
MaxDeploymentsCount: 8,
EnableFastDeleteOnFailedProvisioning: true,
EnableVMsAgentPool: false,
}
assert.NoError(t, err)
@ -618,9 +619,14 @@ func TestCreateAzureManagerWithNilConfig(t *testing.T) {
mockVMSSClient := mockvmssclient.NewMockInterface(ctrl)
mockVMSSClient.EXPECT().List(gomock.Any(), "resourceGroup").Return([]compute.VirtualMachineScaleSet{}, nil).AnyTimes()
mockVMClient.EXPECT().List(gomock.Any(), "resourceGroup").Return([]compute.VirtualMachine{}, nil).AnyTimes()
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
vmspool := getTestVMsAgentPool(false)
fakeAPListPager := getFakeAgentpoolListPager(&vmspool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).Return(fakeAPListPager).AnyTimes()
mockAzClient := &azClient{
virtualMachinesClient: mockVMClient,
virtualMachineScaleSetsClient: mockVMSSClient,
agentPoolClient: mockAgentpoolclient,
}
expectedConfig := &Config{
@ -702,6 +708,7 @@ func TestCreateAzureManagerWithNilConfig(t *testing.T) {
VmssVmsCacheJitter: 90,
MaxDeploymentsCount: 8,
EnableFastDeleteOnFailedProvisioning: true,
EnableVMsAgentPool: true,
}
t.Setenv("ARM_CLOUD", "AzurePublicCloud")
@ -735,6 +742,7 @@ func TestCreateAzureManagerWithNilConfig(t *testing.T) {
t.Setenv("ARM_CLUSTER_RESOURCE_GROUP", "myrg")
t.Setenv("ARM_BASE_URL_FOR_AP_CLIENT", "nodeprovisioner-svc.nodeprovisioner.svc.cluster.local")
t.Setenv("AZURE_ENABLE_FAST_DELETE_ON_FAILED_PROVISIONING", "true")
t.Setenv("AZURE_ENABLE_VMS_AGENT_POOLS", "true")
t.Run("environment variables correctly set", func(t *testing.T) {
manager, err := createAzureManagerInternal(nil, cloudprovider.NodeGroupDiscoveryOptions{}, mockAzClient)


@ -21,7 +21,7 @@ import (
reflect "reflect"
runtime "github.com/Azure/azure-sdk-for-go/sdk/azcore/runtime"
armcontainerservice "github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v4"
armcontainerservice "github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
gomock "go.uber.org/mock/gomock"
)
@ -49,46 +49,60 @@ func (m *MockAgentPoolsClient) EXPECT() *MockAgentPoolsClientMockRecorder {
}
// BeginCreateOrUpdate mocks base method.
func (m *MockAgentPoolsClient) BeginCreateOrUpdate(ctx context.Context, resourceGroupName, resourceName, agentPoolName string, parameters armcontainerservice.AgentPool, options *armcontainerservice.AgentPoolsClientBeginCreateOrUpdateOptions) (*runtime.Poller[armcontainerservice.AgentPoolsClientCreateOrUpdateResponse], error) {
m.ctrl.T.Helper()
ret := m.ctrl.Call(m, "BeginCreateOrUpdate", ctx, resourceGroupName, resourceName, agentPoolName, parameters, options)
ret0, _ := ret[0].(*runtime.Poller[armcontainerservice.AgentPoolsClientCreateOrUpdateResponse])
ret1, _ := ret[1].(error)
return ret0, ret1
}
// BeginCreateOrUpdate indicates an expected call of BeginCreateOrUpdate.
func (mr *MockAgentPoolsClientMockRecorder) BeginCreateOrUpdate(ctx, resourceGroupName, resourceName, agentPoolName, parameters, options any) *gomock.Call {
mr.mock.ctrl.T.Helper()
return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "BeginCreateOrUpdate", reflect.TypeOf((*MockAgentPoolsClient)(nil).BeginCreateOrUpdate), ctx, resourceGroupName, resourceName, agentPoolName, parameters, options)
}
// BeginDeleteMachines mocks base method.
func (m *MockAgentPoolsClient) BeginDeleteMachines(ctx context.Context, resourceGroupName, resourceName, agentPoolName string, machines armcontainerservice.AgentPoolDeleteMachinesParameter, options *armcontainerservice.AgentPoolsClientBeginDeleteMachinesOptions) (*runtime.Poller[armcontainerservice.AgentPoolsClientDeleteMachinesResponse], error) {
m.ctrl.T.Helper()
ret := m.ctrl.Call(m, "BeginDeleteMachines", ctx, resourceGroupName, resourceName, agentPoolName, machines, options)
ret0, _ := ret[0].(*runtime.Poller[armcontainerservice.AgentPoolsClientDeleteMachinesResponse])
ret1, _ := ret[1].(error)
return ret0, ret1
}
// BeginDeleteMachines indicates an expected call of BeginDeleteMachines.
func (mr *MockAgentPoolsClientMockRecorder) BeginDeleteMachines(ctx, resourceGroupName, resourceName, agentPoolName, machines, options any) *gomock.Call {
mr.mock.ctrl.T.Helper()
return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "BeginDeleteMachines", reflect.TypeOf((*MockAgentPoolsClient)(nil).BeginDeleteMachines), ctx, resourceGroupName, resourceName, agentPoolName, machines, options)
}
// Get mocks base method.
func (m *MockAgentPoolsClient) Get(ctx context.Context, resourceGroupName, resourceName, agentPoolName string, options *armcontainerservice.AgentPoolsClientGetOptions) (armcontainerservice.AgentPoolsClientGetResponse, error) {
m.ctrl.T.Helper()
ret := m.ctrl.Call(m, "Get", ctx, resourceGroupName, resourceName, agentPoolName, options)
ret0, _ := ret[0].(armcontainerservice.AgentPoolsClientGetResponse)
ret1, _ := ret[1].(error)
return ret0, ret1
}
// Get indicates an expected call of Get.
func (mr *MockAgentPoolsClientMockRecorder) Get(ctx, resourceGroupName, resourceName, agentPoolName, options any) *gomock.Call {
mr.mock.ctrl.T.Helper()
return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "Get", reflect.TypeOf((*MockAgentPoolsClient)(nil).Get), ctx, resourceGroupName, resourceName, agentPoolName, options)
}
// NewListPager mocks base method.
func (m *MockAgentPoolsClient) NewListPager(resourceGroupName, resourceName string, options *armcontainerservice.AgentPoolsClientListOptions) *runtime.Pager[armcontainerservice.AgentPoolsClientListResponse] {
m.ctrl.T.Helper()
ret := m.ctrl.Call(m, "NewListPager", resourceGroupName, resourceName, options)
ret0, _ := ret[0].(*runtime.Pager[armcontainerservice.AgentPoolsClientListResponse])
return ret0
}
// NewListPager indicates an expected call of NewListPager.
func (mr *MockAgentPoolsClientMockRecorder) NewListPager(resourceGroupName, resourceName, options any) *gomock.Call {
mr.mock.ctrl.T.Helper()
return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "NewListPager", reflect.TypeOf((*MockAgentPoolsClient)(nil).NewListPager), resourceGroupName, resourceName, options)
}


@ -651,15 +651,18 @@ func (scaleSet *ScaleSet) Debug() string {
// TemplateNodeInfo returns a node template for this scale set.
func (scaleSet *ScaleSet) TemplateNodeInfo() (*framework.NodeInfo, error) {
vmss, err := scaleSet.getVMSSFromCache()
if err != nil {
return nil, err
}
inputLabels := map[string]string{}
inputTaints := ""
template, err := buildNodeTemplateFromVMSS(vmss, inputLabels, inputTaints)
if err != nil {
return nil, err
}
node, err := buildNodeFromTemplate(scaleSet.Name, template, scaleSet.manager, scaleSet.enableDynamicInstanceList)
if err != nil {
return nil, err
}


@ -1232,12 +1232,12 @@ func TestScaleSetTemplateNodeInfo(t *testing.T) {
// Properly testing dynamic SKU list through skewer is not possible,
// because there are no Resource API mocks included yet.
// Instead, the rest of the (consumer side) tests here
// override GetInstanceTypeDynamically and GetInstanceTypeStatically functions.
t.Run("Checking dynamic workflow", func(t *testing.T) {
asg.enableDynamicInstanceList = true
GetInstanceTypeDynamically = func(template NodeTemplate, azCache *azureCache) (InstanceType, error) {
vmssType := InstanceType{}
vmssType.VCPU = 1
vmssType.GPU = 2
@ -1255,10 +1255,10 @@ func TestScaleSetTemplateNodeInfo(t *testing.T) {
t.Run("Checking static workflow if dynamic fails", func(t *testing.T) {
asg.enableDynamicInstanceList = true
GetInstanceTypeDynamically = func(template NodeTemplate, azCache *azureCache) (InstanceType, error) {
return InstanceType{}, fmt.Errorf("dynamic error exists")
}
GetInstanceTypeStatically = func(template NodeTemplate) (*InstanceType, error) {
vmssType := InstanceType{}
vmssType.VCPU = 1
vmssType.GPU = 2
@ -1276,10 +1276,10 @@ func TestScaleSetTemplateNodeInfo(t *testing.T) {
t.Run("Fails to find vmss instance information using static and dynamic workflow, instance not supported", func(t *testing.T) {
asg.enableDynamicInstanceList = true
GetInstanceTypeDynamically = func(template NodeTemplate, azCache *azureCache) (InstanceType, error) {
return InstanceType{}, fmt.Errorf("dynamic error exists")
}
GetInstanceTypeStatically = func(template NodeTemplate) (*InstanceType, error) {
return &InstanceType{}, fmt.Errorf("static error exists")
}
nodeInfo, err := asg.TemplateNodeInfo()
@ -1292,7 +1292,7 @@ func TestScaleSetTemplateNodeInfo(t *testing.T) {
t.Run("Checking static-only workflow", func(t *testing.T) {
asg.enableDynamicInstanceList = false
GetInstanceTypeStatically = func(template NodeTemplate) (*InstanceType, error) {
vmssType := InstanceType{}
vmssType.VCPU = 1
vmssType.GPU = 2


@ -24,7 +24,9 @@ import (
"strings"
"time"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2022-08-01/compute"
"github.com/Azure/go-autorest/autorest/to"
apiv1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@ -84,8 +86,132 @@ const (
clusterLabelKey = AKSLabelKeyPrefixValue + "cluster"
)
// VMPoolNodeTemplate holds properties for node from VMPool
type VMPoolNodeTemplate struct {
AgentPoolName string
Taints []apiv1.Taint
Labels map[string]*string
OSDiskType *armcontainerservice.OSDiskType
}
// VMSSNodeTemplate holds properties for node from VMSS
type VMSSNodeTemplate struct {
InputLabels map[string]string
InputTaints string
Tags map[string]*string
OSDisk *compute.VirtualMachineScaleSetOSDisk
}
// NodeTemplate represents a template for an Azure node
type NodeTemplate struct {
SkuName string
InstanceOS string
Location string
Zones []string
VMPoolNodeTemplate *VMPoolNodeTemplate
VMSSNodeTemplate *VMSSNodeTemplate
}
func buildNodeTemplateFromVMSS(vmss compute.VirtualMachineScaleSet, inputLabels map[string]string, inputTaints string) (NodeTemplate, error) {
instanceOS := cloudprovider.DefaultOS
if vmss.VirtualMachineProfile != nil &&
vmss.VirtualMachineProfile.OsProfile != nil &&
vmss.VirtualMachineProfile.OsProfile.WindowsConfiguration != nil {
instanceOS = "windows"
}
var osDisk *compute.VirtualMachineScaleSetOSDisk
if vmss.VirtualMachineProfile != nil &&
vmss.VirtualMachineProfile.StorageProfile != nil &&
vmss.VirtualMachineProfile.StorageProfile.OsDisk != nil {
osDisk = vmss.VirtualMachineProfile.StorageProfile.OsDisk
}
if vmss.Sku == nil || vmss.Sku.Name == nil {
return NodeTemplate{}, fmt.Errorf("VMSS %s has no SKU", to.String(vmss.Name))
}
if vmss.Location == nil {
return NodeTemplate{}, fmt.Errorf("VMSS %s has no location", to.String(vmss.Name))
}
zones := []string{}
if vmss.Zones != nil {
zones = *vmss.Zones
}
return NodeTemplate{
SkuName: *vmss.Sku.Name,
Location: *vmss.Location,
Zones: zones,
InstanceOS: instanceOS,
VMSSNodeTemplate: &VMSSNodeTemplate{
InputLabels: inputLabels,
InputTaints: inputTaints,
OSDisk: osDisk,
Tags: vmss.Tags,
},
}, nil
}
func buildNodeTemplateFromVMPool(vmsPool armcontainerservice.AgentPool, location string, skuName string, labelsFromSpec map[string]string, taintsFromSpec string) (NodeTemplate, error) {
if vmsPool.Properties == nil {
return NodeTemplate{}, fmt.Errorf("vmsPool %s has nil properties", to.String(vmsPool.Name))
}
// labels from the agentpool
labels := vmsPool.Properties.NodeLabels
// labels from spec
for k, v := range labelsFromSpec {
if labels == nil {
labels = make(map[string]*string)
}
labels[k] = to.StringPtr(v)
}
// taints from the agentpool
taintsList := []string{}
for _, taint := range vmsPool.Properties.NodeTaints {
if to.String(taint) != "" {
taintsList = append(taintsList, to.String(taint))
}
}
// taints from spec
if taintsFromSpec != "" {
taintsList = append(taintsList, taintsFromSpec)
}
taintsStr := strings.Join(taintsList, ",")
taints := extractTaintsFromSpecString(taintsStr)
var zones []string
if vmsPool.Properties.AvailabilityZones != nil {
for _, zone := range vmsPool.Properties.AvailabilityZones {
if zone != nil {
zones = append(zones, *zone)
}
}
}
var instanceOS string
if vmsPool.Properties.OSType != nil {
instanceOS = strings.ToLower(string(*vmsPool.Properties.OSType))
}
return NodeTemplate{
SkuName: skuName,
Zones: zones,
InstanceOS: instanceOS,
Location: location,
VMPoolNodeTemplate: &VMPoolNodeTemplate{
AgentPoolName: to.String(vmsPool.Name),
OSDiskType: vmsPool.Properties.OSDiskType,
Taints: taints,
Labels: labels,
},
}, nil
}
func buildNodeFromTemplate(nodeGroupName string, template NodeTemplate, manager *AzureManager, enableDynamicInstanceList bool) (*apiv1.Node, error) {
node := apiv1.Node{}
nodeName := fmt.Sprintf("%s-asg-%d", nodeGroupName, rand.Int63())
@ -104,28 +230,28 @@ func buildNodeFromTemplate(nodeGroupName string, inputLabels map[string]string,
// Fetching SKU information from SKU API if enableDynamicInstanceList is true.
var dynamicErr error
if enableDynamicInstanceList {
var instanceTypeDynamic InstanceType
klog.V(1).Infof("Fetching instance information for SKU: %s from SKU API", template.SkuName)
instanceTypeDynamic, dynamicErr = GetInstanceTypeDynamically(template, manager.azureCache)
if dynamicErr == nil {
vcpu = instanceTypeDynamic.VCPU
gpuCount = instanceTypeDynamic.GPU
memoryMb = instanceTypeDynamic.MemoryMb
} else {
klog.Errorf("Dynamically fetching of instance information from SKU api failed with error: %v", dynamicErr)
}
}
if !enableDynamicInstanceList || dynamicErr != nil {
klog.V(1).Infof("Falling back to static SKU list for SKU: %s", template.SkuName)
// fall-back on static list of vmss if dynamic workflow fails.
instanceTypeStatic, staticErr := GetInstanceTypeStatically(template)
if staticErr == nil {
vcpu = instanceTypeStatic.VCPU
gpuCount = instanceTypeStatic.GPU
memoryMb = instanceTypeStatic.MemoryMb
} else {
// return error if neither of the workflows results with vmss data.
klog.V(1).Infof("Instance type %q not supported, err: %v", template.SkuName, staticErr)
return nil, staticErr
}
}
@ -134,7 +260,7 @@ func buildNodeFromTemplate(nodeGroupName string, inputLabels map[string]string,
node.Status.Capacity[apiv1.ResourceCPU] = *resource.NewQuantity(vcpu, resource.DecimalSI)
// isNPSeries returns if a SKU is an NP-series SKU
// SKU API reports GPUs for NP-series but it's actually FPGAs
if isNPSeries(template.SkuName) {
node.Status.Capacity[xilinxFpgaResourceName] = *resource.NewQuantity(gpuCount, resource.DecimalSI)
} else {
node.Status.Capacity[gpu.ResourceNvidiaGPU] = *resource.NewQuantity(gpuCount, resource.DecimalSI)
@ -145,9 +271,37 @@ func buildNodeFromTemplate(nodeGroupName string, inputLabels map[string]string,
// TODO: set real allocatable.
node.Status.Allocatable = node.Status.Capacity
if template.VMSSNodeTemplate != nil {
node = processVMSSTemplate(template, nodeName, node)
} else if template.VMPoolNodeTemplate != nil {
node = processVMPoolTemplate(template, nodeName, node)
} else {
return nil, fmt.Errorf("invalid node template: missing both VMSS and VMPool templates")
}
klog.V(4).Infof("Setting node %s labels to: %s", nodeName, node.Labels)
klog.V(4).Infof("Setting node %s taints to: %s", nodeName, node.Spec.Taints)
node.Status.Conditions = cloudprovider.BuildReadyConditions()
return &node, nil
}
func processVMPoolTemplate(template NodeTemplate, nodeName string, node apiv1.Node) apiv1.Node {
labels := buildGenericLabels(template, nodeName)
labels[agentPoolNodeLabelKey] = template.VMPoolNodeTemplate.AgentPoolName
if template.VMPoolNodeTemplate.Labels != nil {
for k, v := range template.VMPoolNodeTemplate.Labels {
labels[k] = to.String(v)
}
}
node.Labels = cloudprovider.JoinStringMaps(node.Labels, labels)
node.Spec.Taints = template.VMPoolNodeTemplate.Taints
return node
}
func processVMSSTemplate(template NodeTemplate, nodeName string, node apiv1.Node) apiv1.Node {
// NodeLabels
if template.VMSSNodeTemplate.Tags != nil {
for k, v := range template.VMSSNodeTemplate.Tags {
if v != nil {
node.Labels[k] = *v
} else {
@ -164,10 +318,10 @@ func buildNodeFromTemplate(nodeGroupName string, inputLabels map[string]string,
labels := make(map[string]string)
// Prefer the explicit labels in spec coming from RP over the VMSS template
if len(template.VMSSNodeTemplate.InputLabels) > 0 {
labels = template.VMSSNodeTemplate.InputLabels
} else {
labels = extractLabelsFromTags(template.VMSSNodeTemplate.Tags)
}
// Add the agentpool label, its value should come from the VMSS poolName tag
@ -182,87 +336,74 @@ func buildNodeFromTemplate(nodeGroupName string, inputLabels map[string]string,
labels[agentPoolNodeLabelKey] = node.Labels[poolNameTag]
}
// Add the storage profile and storage tier labels for vmss node
if template.VMSSNodeTemplate.OSDisk != nil {
// ephemeral
if template.VMSSNodeTemplate.OSDisk.DiffDiskSettings != nil && template.VMSSNodeTemplate.OSDisk.DiffDiskSettings.Option == compute.Local {
labels[legacyStorageProfileNodeLabelKey] = "ephemeral"
labels[storageProfileNodeLabelKey] = "ephemeral"
} else {
labels[legacyStorageProfileNodeLabelKey] = "managed"
labels[storageProfileNodeLabelKey] = "managed"
}
if template.VMSSNodeTemplate.OSDisk.ManagedDisk != nil {
labels[legacyStorageTierNodeLabelKey] = string(template.VMSSNodeTemplate.OSDisk.ManagedDisk.StorageAccountType)
labels[storageTierNodeLabelKey] = string(template.VMSSNodeTemplate.OSDisk.ManagedDisk.StorageAccountType)
}
// Add ephemeral-storage value
if template.VMSSNodeTemplate.OSDisk.DiskSizeGB != nil {
node.Status.Capacity[apiv1.ResourceEphemeralStorage] = *resource.NewQuantity(int64(int(*template.VMSSNodeTemplate.OSDisk.DiskSizeGB)*1024*1024*1024), resource.DecimalSI)
klog.V(4).Infof("OS Disk Size from template is: %d", *template.VMSSNodeTemplate.OSDisk.DiskSizeGB)
klog.V(4).Infof("Setting ephemeral storage to: %v", node.Status.Capacity[apiv1.ResourceEphemeralStorage])
}
}
// If we are on GPU-enabled SKUs, append the accelerator
// label so that CA makes better decision when scaling from zero for GPU pools
if isNvidiaEnabledSKU(template.SkuName) {
labels[GPULabel] = "nvidia"
labels[legacyGPULabel] = "nvidia"
}
// Extract allocatables from tags
resourcesFromTags := extractAllocatableResourcesFromScaleSet(template.VMSSNodeTemplate.Tags)
for resourceName, val := range resourcesFromTags {
node.Status.Capacity[apiv1.ResourceName(resourceName)] = *val
}
node.Labels = cloudprovider.JoinStringMaps(node.Labels, labels)
klog.V(4).Infof("Setting node %s labels to: %s", nodeName, node.Labels)
var taints []apiv1.Taint
// Prefer the explicit taints in spec over the VMSS template
if inputTaints != "" {
taints = extractTaintsFromSpecString(inputTaints)
// Prefer the explicit taints in spec over the tags from vmss or vm
if template.VMSSNodeTemplate.InputTaints != "" {
taints = extractTaintsFromSpecString(template.VMSSNodeTemplate.InputTaints)
} else {
taints = extractTaintsFromScaleSet(template.Tags)
taints = extractTaintsFromTags(template.VMSSNodeTemplate.Tags)
}
// Taints from the Scale Set's Tags
node.Spec.Taints = taints
klog.V(4).Infof("Setting node %s taints to: %s", nodeName, node.Spec.Taints)
node.Status.Conditions = cloudprovider.BuildReadyConditions()
return &node, nil
return node
}
func buildGenericLabels(template NodeTemplate, nodeName string) map[string]string {
result := make(map[string]string)
result[kubeletapis.LabelArch] = cloudprovider.DefaultArch
result[apiv1.LabelArchStable] = cloudprovider.DefaultArch
result[kubeletapis.LabelOS] = template.InstanceOS
result[apiv1.LabelOSStable] = template.InstanceOS
result[apiv1.LabelInstanceType] = template.SkuName
result[apiv1.LabelInstanceTypeStable] = template.SkuName
result[apiv1.LabelZoneRegion] = strings.ToLower(template.Location)
result[apiv1.LabelTopologyRegion] = strings.ToLower(template.Location)
if len(template.Zones) > 0 {
failureDomains := make([]string, len(template.Zones))
for k, v := range template.Zones {
failureDomains[k] = strings.ToLower(template.Location) + "-" + v
}
//Picks random zones for Multi-zone nodepool when scaling from zero.
//This random zone will not be the same as the zone of the VMSS that is being created, the purpose of creating
@ -283,7 +424,7 @@ func buildGenericLabels(template compute.VirtualMachineScaleSet, nodeName string
return result
}
func extractLabelsFromTags(tags map[string]*string) map[string]string {
result := make(map[string]string)
for tagName, tagValue := range tags {
@ -300,7 +441,7 @@ func extractLabelsFromScaleSet(tags map[string]*string) map[string]string {
return result
}
func extractTaintsFromTags(tags map[string]*string) []apiv1.Taint {
taints := make([]apiv1.Taint, 0)
for tagName, tagValue := range tags {
@ -327,35 +468,61 @@ func extractTaintsFromScaleSet(tags map[string]*string) []apiv1.Taint {
return taints
}
// extractTaintsFromSpecString is for nodepool taints
// Example of a valid taints string, is the same argument to kubelet's `--register-with-taints`
// "dedicated=foo:NoSchedule,group=bar:NoExecute,app=fizz:PreferNoSchedule"
func extractTaintsFromSpecString(taintsString string) []apiv1.Taint {
taints := make([]apiv1.Taint, 0)
dedupMap := make(map[string]interface{})
// First split the taints at the separator
splits := strings.Split(taintsString, ",")
for _, split := range splits {
if dedupMap[split] != nil {
continue
}
dedupMap[split] = struct{}{}
valid, taint := constructTaintFromString(split)
if valid {
taints = append(taints, taint)
}
}
return taints
}
// buildNodeTaintsForVMPool is for VMPool taints, it looks for the taints in the format
// []string{zone=dmz:NoSchedule, usage=monitoring:NoSchedule}
func buildNodeTaintsForVMPool(taintStrs []string) []apiv1.Taint {
taints := make([]apiv1.Taint, 0)
for _, taintStr := range taintStrs {
valid, taint := constructTaintFromString(taintStr)
if valid {
taints = append(taints, taint)
}
}
return taints
}
// constructTaintFromString constructs a taint from a string in the format <key>=<value>:<effect>
// if the input string is not in the correct format, it returns false and an empty taint
func constructTaintFromString(taintString string) (bool, apiv1.Taint) {
taintSplit := strings.Split(taintString, "=")
if len(taintSplit) != 2 {
return false, apiv1.Taint{}
}
taintKey := taintSplit[0]
taintValue := taintSplit[1]
r, _ := regexp.Compile("(.*):(?:NoSchedule|NoExecute|PreferNoSchedule)")
if !r.MatchString(taintValue) {
return false, apiv1.Taint{}
}
values := strings.SplitN(taintValue, ":", 2)
return true, apiv1.Taint{
Key: taintKey,
Value: values[0],
Effect: apiv1.TaintEffect(values[1]),
}
}
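// Illustrative sketch, not part of this diff: how the helpers above handle a
// kubelet-style taint list with a duplicate and a malformed entry.
func taintParseExample() []apiv1.Taint {
spec := "dedicated=foo:NoSchedule,dedicated=foo:NoSchedule,not-a-taint"
// Yields one taint {Key: "dedicated", Value: "foo", Effect: NoSchedule}:
// the repeat is skipped via dedupMap and "not-a-taint" fails constructTaintFromString.
return extractTaintsFromSpecString(spec)
}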
func extractAutoscalingOptionsFromScaleSetTags(tags map[string]*string) map[string]string {


@ -21,6 +21,7 @@ import (
"strings"
"testing"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2022-08-01/compute"
"github.com/Azure/go-autorest/autorest"
"github.com/Azure/go-autorest/autorest/to"
@ -30,7 +31,7 @@ import (
"k8s.io/apimachinery/pkg/api/resource"
)
func TestExtractLabelsFromTags(t *testing.T) {
expectedNodeLabelKey := "zip"
expectedNodeLabelValue := "zap"
extraNodeLabelValue := "buzz"
@ -52,14 +53,14 @@ func TestExtractLabelsFromScaleSet(t *testing.T) {
fmt.Sprintf("%s%s", nodeLabelTagName, escapedUnderscoreNodeLabelKey): &escapedUnderscoreNodeLabelValue,
}
labels := extractLabelsFromTags(tags)
assert.Len(t, labels, 3)
assert.Equal(t, expectedNodeLabelValue, labels[expectedNodeLabelKey])
assert.Equal(t, escapedSlashNodeLabelValue, labels[expectedSlashEscapedNodeLabelKey])
assert.Equal(t, escapedUnderscoreNodeLabelValue, labels[expectedUnderscoreEscapedNodeLabelKey])
}
func TestExtractTaintsFromTags(t *testing.T) {
noScheduleTaintValue := "foo:NoSchedule"
noExecuteTaintValue := "bar:NoExecute"
preferNoScheduleTaintValue := "fizz:PreferNoSchedule"
@ -100,7 +101,7 @@ func TestExtractTaintsFromScaleSet(t *testing.T) {
},
}
taints := extractTaintsFromTags(tags)
assert.Len(t, taints, 4)
assert.Equal(t, makeTaintSet(expectedTaints), makeTaintSet(taints))
}
@ -137,6 +138,11 @@ func TestExtractTaintsFromSpecString(t *testing.T) {
Value: "fizz",
Effect: apiv1.TaintEffectPreferNoSchedule,
},
{
Key: "dedicated", // duplicate key, should be ignored
Value: "foo",
Effect: apiv1.TaintEffectNoSchedule,
},
}
taints := extractTaintsFromSpecString(strings.Join(taintsString, ","))
@ -176,8 +182,9 @@ func TestTopologyFromScaleSet(t *testing.T) {
Location: to.StringPtr("westus"),
}
expectedZoneValues := []string{"westus-1", "westus-2", "westus-3"}
template, err := buildNodeTemplateFromVMSS(testVmss, map[string]string{}, "")
assert.NoError(t, err)
labels := buildGenericLabels(template, testNodeName)
failureDomain, ok := labels[apiv1.LabelZoneFailureDomain]
assert.True(t, ok)
topologyZone, ok := labels[apiv1.LabelTopologyZone]
@ -205,7 +212,9 @@ func TestEmptyTopologyFromScaleSet(t *testing.T) {
expectedFailureDomain := "0"
expectedTopologyZone := "0"
expectedAzureDiskTopology := ""
template, err := buildNodeTemplateFromVMSS(testVmss, map[string]string{}, "")
assert.NoError(t, err)
labels := buildGenericLabels(template, testNodeName)
failureDomain, ok := labels[apiv1.LabelZoneFailureDomain]
assert.True(t, ok)
@ -219,6 +228,61 @@ func TestEmptyTopologyFromScaleSet(t *testing.T) {
assert.True(t, ok)
assert.Equal(t, expectedAzureDiskTopology, azureDiskTopology)
}
func TestBuildNodeTemplateFromVMPool(t *testing.T) {
agentPoolName := "testpool"
location := "eastus"
skuName := "Standard_DS2_v2"
labelKey := "foo"
labelVal := "bar"
taintStr := "dedicated=foo:NoSchedule,boo=fizz:PreferNoSchedule,group=bar:NoExecute"
osType := armcontainerservice.OSTypeLinux
osDiskType := armcontainerservice.OSDiskTypeEphemeral
zone1 := "1"
zone2 := "2"
vmpool := armcontainerservice.AgentPool{
Name: to.StringPtr(agentPoolName),
Properties: &armcontainerservice.ManagedClusterAgentPoolProfileProperties{
NodeLabels: map[string]*string{
"existing": to.StringPtr("label"),
"department": to.StringPtr("engineering"),
},
NodeTaints: []*string{to.StringPtr("group=bar:NoExecute")},
OSType: &osType,
OSDiskType: &osDiskType,
AvailabilityZones: []*string{&zone1, &zone2},
},
}
labelsFromSpec := map[string]string{labelKey: labelVal}
taintsFromSpec := taintStr
template, err := buildNodeTemplateFromVMPool(vmpool, location, skuName, labelsFromSpec, taintsFromSpec)
assert.NoError(t, err)
assert.Equal(t, skuName, template.SkuName)
assert.Equal(t, location, template.Location)
assert.ElementsMatch(t, []string{zone1, zone2}, template.Zones)
assert.Equal(t, "linux", template.InstanceOS)
assert.NotNil(t, template.VMPoolNodeTemplate)
assert.Equal(t, agentPoolName, template.VMPoolNodeTemplate.AgentPoolName)
assert.Equal(t, &osDiskType, template.VMPoolNodeTemplate.OSDiskType)
// Labels: should include both from NodeLabels and labelsFromSpec
assert.Contains(t, template.VMPoolNodeTemplate.Labels, "existing")
assert.Equal(t, "label", *template.VMPoolNodeTemplate.Labels["existing"])
assert.Contains(t, template.VMPoolNodeTemplate.Labels, "department")
assert.Equal(t, "engineering", *template.VMPoolNodeTemplate.Labels["department"])
assert.Contains(t, template.VMPoolNodeTemplate.Labels, labelKey)
assert.Equal(t, labelVal, *template.VMPoolNodeTemplate.Labels[labelKey])
// Taints: should include both from NodeTaints and taintsFromSpec
taintSet := makeTaintSet(template.VMPoolNodeTemplate.Taints)
expectedTaints := []apiv1.Taint{
{Key: "group", Value: "bar", Effect: apiv1.TaintEffectNoExecute},
{Key: "dedicated", Value: "foo", Effect: apiv1.TaintEffectNoSchedule},
{Key: "boo", Value: "fizz", Effect: apiv1.TaintEffectPreferNoSchedule},
}
assert.Equal(t, makeTaintSet(expectedTaints), taintSet)
}
func makeTaintSet(taints []apiv1.Taint) map[apiv1.Taint]bool {
set := make(map[apiv1.Taint]bool)
for _, taint := range taints {
set[taint] = true
}
return set
}

@ -18,142 +18,426 @@ package azure
import (
"fmt"
"net/http"
"strings"
"github.com/Azure/azure-sdk-for-go/sdk/azcore/policy"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2022-08-01/compute"
"github.com/Azure/go-autorest/autorest/to"
apiv1 "k8s.io/api/core/v1"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/config"
"k8s.io/autoscaler/cluster-autoscaler/config/dynamic"
"k8s.io/autoscaler/cluster-autoscaler/simulator/framework"
klog "k8s.io/klog/v2"
)
// VMPool represents a group of standalone virtual machines (VMs) with a single SKU.
// It is part of a mixed-SKU agent pool (an agent pool with type `VirtualMachines`).
// Terminology:
// - Agent pool: A node pool in an AKS cluster.
// - VMs pool: An agent pool of type `VirtualMachines`, which can contain mixed SKUs.
// - VMPool: A subset of VMs within a VMs pool that share the same SKU.
type VMPool struct {
azureRef
manager *AzureManager
agentPoolName string // the virtual machines agentpool that this VMPool belongs to
sku string // sku of the VM in the pool
minSize int
maxSize int
}
// NewVMPool creates a new VMPool - a pool of standalone VMs of a single size.
func NewVMPool(spec *dynamic.NodeGroupSpec, am *AzureManager, agentPoolName string, sku string) (*VMPool, error) {
if am.azClient.agentPoolClient == nil {
return nil, fmt.Errorf("agentPoolClient is nil")
}
nodepool := &VMPool{
azureRef: azureRef{
Name: spec.Name, // in format "<agentPoolName>/<sku>"
},
manager: am,
sku: sku,
agentPoolName: agentPoolName,
minSize: spec.MinSize,
maxSize: spec.MaxSize,
}
return nodepool, nil
}
// MinSize returns the minimum size the vmPool is allowed to scale down
// to as provided by the node spec in --node parameter.
func (vmPool *VMPool) MinSize() int {
return vmPool.minSize
}
// Exist is always true since we are initialized with an existing vmPool
func (vmPool *VMPool) Exist() bool {
return true
}
// Create creates the node group on the cloud provider side.
func (vmPool *VMPool) Create() (cloudprovider.NodeGroup, error) {
return nil, cloudprovider.ErrAlreadyExist
}
// Delete deletes the node group on the cloud provider side.
func (vmPool *VMPool) Delete() error {
return cloudprovider.ErrNotImplemented
}
// ForceDeleteNodes deletes nodes from the group regardless of constraints.
func (vmPool *VMPool) ForceDeleteNodes(nodes []*apiv1.Node) error {
return cloudprovider.ErrNotImplemented
}
// Autoprovisioned is always false since we are initialized with an existing agentpool
func (vmPool *VMPool) Autoprovisioned() bool {
return false
}
// GetOptions returns NodeGroupAutoscalingOptions that should be used for this particular
// NodeGroup. Returning a nil will result in using default options.
func (vmPool *VMPool) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
// TODO(wenxuan): implement this method when vmPool can fully support GPU nodepool
return nil, nil
}
// MaxSize returns the maximum size scale limit provided by --node
// parameter to the autoscaler main
func (vmPool *VMPool) MaxSize() int {
return vmPool.maxSize
}
// TargetSize returns the current target size of the node group. This value represents
// the desired number of nodes in the VMPool, which may differ from the actual number
// of nodes currently present.
func (vmPool *VMPool) TargetSize() (int, error) {
// VMs in the "Deleting" state are not counted towards the target size.
size, err := vmPool.getCurSize(skipOption{skipDeleting: true, skipFailed: false})
return int(size), err
}
// IncreaseSize increases the size of the VMPool by sending a PUT request to update the agent pool.
// This method waits until the asynchronous PUT operation completes or the client-side timeout is reached.
func (vmPool *VMPool) IncreaseSize(delta int) error {
if delta <= 0 {
return fmt.Errorf("size increase must be positive, current delta: %d", delta)
}
// Skip VMs in the failed state so that a PUT AP will be triggered to fix the failed VMs.
currentSize, err := vmPool.getCurSize(skipOption{skipDeleting: true, skipFailed: true})
if err != nil {
return err
}
if int(currentSize)+delta > vmPool.MaxSize() {
return fmt.Errorf("size-increasing request of %d is bigger than max size %d", int(currentSize)+delta, vmPool.MaxSize())
}
updateCtx, cancel := getContextWithTimeout(vmsAsyncContextTimeout)
defer cancel()
versionedAP, err := vmPool.getAgentpoolFromCache()
if err != nil {
klog.Errorf("Failed to get vmPool %s, error: %s", vmPool.agentPoolName, err)
return err
}
count := currentSize + int32(delta)
requestBody := armcontainerservice.AgentPool{}
// self-hosted CAS will be using Manual scale profile
if len(versionedAP.Properties.VirtualMachinesProfile.Scale.Manual) > 0 {
requestBody = buildRequestBodyForScaleUp(versionedAP, count, vmPool.sku)
} else { // AKS-managed CAS will use custom header for setting the target count
header := make(http.Header)
header.Set("Target-Count", fmt.Sprintf("%d", count))
updateCtx = policy.WithHTTPHeader(updateCtx, header)
}
defer vmPool.manager.invalidateCache()
poller, err := vmPool.manager.azClient.agentPoolClient.BeginCreateOrUpdate(
updateCtx,
vmPool.manager.config.ClusterResourceGroup,
vmPool.manager.config.ClusterName,
vmPool.agentPoolName,
requestBody, nil)
if err != nil {
klog.Errorf("Failed to scale up agentpool %s in cluster %s for vmPool %s with error: %v",
vmPool.agentPoolName, vmPool.manager.config.ClusterName, vmPool.Name, err)
return err
}
if _, err := poller.PollUntilDone(updateCtx, nil /*default polling interval is 30s*/); err != nil {
klog.Errorf("agentPoolClient.BeginCreateOrUpdate for aks cluster %s agentpool %s for scaling up vmPool %s failed with error %s",
vmPool.manager.config.ClusterName, vmPool.agentPoolName, vmPool.Name, err)
return err
}
klog.Infof("Successfully scaled up agentpool %s in cluster %s for vmPool %s to size %d",
vmPool.agentPoolName, vmPool.manager.config.ClusterName, vmPool.Name, count)
return nil
}
// buildRequestBodyForScaleUp builds the request body for scale up for self-hosted CAS
func buildRequestBodyForScaleUp(agentpool armcontainerservice.AgentPool, count int32, vmSku string) armcontainerservice.AgentPool {
requestBody := armcontainerservice.AgentPool{
Properties: &armcontainerservice.ManagedClusterAgentPoolProfileProperties{
Type: agentpool.Properties.Type,
},
}
// the request body must have the same mode as the original agentpool
// otherwise the PUT request will fail
if agentpool.Properties.Mode != nil &&
*agentpool.Properties.Mode == armcontainerservice.AgentPoolModeSystem {
systemMode := armcontainerservice.AgentPoolModeSystem
requestBody.Properties.Mode = &systemMode
}
// set the count of the matching manual scale profile to the new target value
for _, manualProfile := range agentpool.Properties.VirtualMachinesProfile.Scale.Manual {
if manualProfile != nil && len(manualProfile.Sizes) == 1 &&
strings.EqualFold(to.String(manualProfile.Sizes[0]), vmSku) {
klog.V(5).Infof("Found matching manual profile for VM SKU: %s, updating count to: %d", vmSku, count)
manualProfile.Count = to.Int32Ptr(count)
requestBody.Properties.VirtualMachinesProfile = agentpool.Properties.VirtualMachinesProfile
break
}
}
return requestBody
}
// DeleteNodes removes the specified nodes from the VMPool by extracting their providerIDs
// and performing the appropriate delete or deallocate operation based on the agent pool's
// scale-down policy. This method waits for the asynchronous delete operation to complete,
// with a client-side timeout.
func (vmPool *VMPool) DeleteNodes(nodes []*apiv1.Node) error {
// Ensure we don't scale below the minimum size by excluding VMs in the "Deleting" state.
currentSize, err := vmPool.getCurSize(skipOption{skipDeleting: true, skipFailed: false})
if err != nil {
return fmt.Errorf("unable to retrieve current size: %w", err)
}
if int(currentSize) <= vmPool.MinSize() {
return fmt.Errorf("cannot delete nodes as minimum size of %d has been reached", vmPool.MinSize())
}
providerIDs, err := vmPool.getProviderIDsForNodes(nodes)
if err != nil {
return fmt.Errorf("failed to retrieve provider IDs for nodes: %w", err)
}
if len(providerIDs) == 0 {
return nil
}
klog.V(3).Infof("Deleting nodes from vmPool %s: %v", vmPool.Name, providerIDs)
machineNames := make([]*string, len(providerIDs))
for i, providerID := range providerIDs {
// extract the machine name from the providerID by splitting it on '/' and taking the last element.
// The providerID looks like this:
// "azure:///subscriptions/0000000-0000-0000-0000-00000000000/resourceGroups/mc_myrg_mycluster_eastus/providers/Microsoft.Compute/virtualMachines/aks-mypool-12345678-vms0"
machineName, err := resourceName(providerID)
if err != nil {
return err
}
machineNames[i] = &machineName
}
requestBody := armcontainerservice.AgentPoolDeleteMachinesParameter{
MachineNames: machineNames,
}
deleteCtx, cancel := getContextWithTimeout(vmsAsyncContextTimeout)
defer cancel()
defer vmPool.manager.invalidateCache()
poller, err := vmPool.manager.azClient.agentPoolClient.BeginDeleteMachines(
deleteCtx,
vmPool.manager.config.ClusterResourceGroup,
vmPool.manager.config.ClusterName,
vmPool.agentPoolName,
requestBody, nil)
if err != nil {
klog.Errorf("Failed to delete nodes from agentpool %s in cluster %s with error: %v",
vmPool.agentPoolName, vmPool.manager.config.ClusterName, err)
return err
}
if _, err := poller.PollUntilDone(deleteCtx, nil); err != nil {
klog.Errorf("agentPoolClient.BeginDeleteMachines for aks cluster %s for scaling down vmPool %s failed with error %s",
vmPool.manager.config.ClusterName, vmPool.agentPoolName, err)
return err
}
klog.Infof("Successfully deleted %d nodes from vmPool %s", len(providerIDs), vmPool.Name)
return nil
}
func (vmPool *VMPool) getProviderIDsForNodes(nodes []*apiv1.Node) ([]string, error) {
var providerIDs []string
for _, node := range nodes {
belongs, err := vmPool.Belongs(node)
if err != nil {
return nil, fmt.Errorf("failed to check if node %s belongs to vmPool %s: %w", node.Name, vmPool.Name, err)
}
if !belongs {
return nil, fmt.Errorf("node %s does not belong to vmPool %s", node.Name, vmPool.Name)
}
providerIDs = append(providerIDs, node.Spec.ProviderID)
}
return providerIDs, nil
}
// Belongs returns true if the given k8s node belongs to this vms nodepool.
func (vmPool *VMPool) Belongs(node *apiv1.Node) (bool, error) {
klog.V(6).Infof("Check if node belongs to this vmPool:%s, node:%v\n", vmPool, node)
ref := &azureRef{
Name: node.Spec.ProviderID,
}
nodeGroup, err := vmPool.manager.GetNodeGroupForInstance(ref)
if err != nil {
return false, err
}
if nodeGroup == nil {
return false, fmt.Errorf("%s doesn't belong to a known node group", node.Name)
}
if !strings.EqualFold(nodeGroup.Id(), vmPool.Id()) {
return false, nil
}
return true, nil
}
// DecreaseTargetSize decreases the target size of the node group.
func (vmPool *VMPool) DecreaseTargetSize(delta int) error {
// The TargetSize of a VMPool is automatically adjusted after node deletions.
// This method is invoked in scenarios such as (see details in clusterstate.go):
// - len(readiness.Registered) > acceptableRange.CurrentTarget
// - len(readiness.Registered) < acceptableRange.CurrentTarget - unregisteredNodes
// For VMPool, this method should not be called because:
// CurrentTarget = len(readiness.Registered) + unregisteredNodes - len(nodesInDeletingState)
// Here, nodesInDeletingState is a subset of unregisteredNodes,
// ensuring len(readiness.Registered) is always within the acceptable range.
// here we just invalidate the cache to avoid any potential bugs
vmPool.manager.invalidateCache()
klog.Warningf("DecreaseTargetSize called for VMPool %s, but it should not be used, invalidating cache", vmPool.Name)
return nil
}
// Id returns the name of the agentPool, it is in the format of <agentpoolname>/<sku>
// e.g. mypool1/Standard_D2s_v3
func (vmPool *VMPool) Id() string {
return vmPool.azureRef.Name
}
// Debug returns a string with basic details of the agentPool
func (vmPool *VMPool) Debug() string {
return fmt.Sprintf("%s (%d:%d)", vmPool.Id(), vmPool.MinSize(), vmPool.MaxSize())
}
func isSpotAgentPool(ap armcontainerservice.AgentPool) bool {
if ap.Properties != nil && ap.Properties.ScaleSetPriority != nil {
return strings.EqualFold(string(*ap.Properties.ScaleSetPriority), "Spot")
}
return false
}
// skipOption is used to determine whether to skip VMs in certain states when calculating the current size of the vmPool.
type skipOption struct {
// skipDeleting indicates whether to skip VMs in the "Deleting" state.
skipDeleting bool
// skipFailed indicates whether to skip VMs in the "Failed" state.
skipFailed bool
}
// getCurSize determines the current count of VMs in the vmPool, including unregistered ones.
// The source of truth depends on the pool type (spot or non-spot).
func (vmPool *VMPool) getCurSize(op skipOption) (int32, error) {
agentPool, err := vmPool.getAgentpoolFromCache()
if err != nil {
klog.Errorf("Failed to retrieve agent pool %s from cache: %v", vmPool.agentPoolName, err)
return -1, err
}
// spot pool size is retrieved directly from Azure instead of the cache
if isSpotAgentPool(agentPool) {
return vmPool.getSpotPoolSize()
}
// non-spot pool size is retrieved from the cache
vms, err := vmPool.getVMsFromCache(op)
if err != nil {
klog.Errorf("Failed to get VMs from cache for agentpool %s with error: %v", vmPool.agentPoolName, err)
return -1, err
}
return int32(len(vms)), nil
}
// getSpotPoolSize retrieves the current size of a spot agent pool directly from Azure.
func (vmPool *VMPool) getSpotPoolSize() (int32, error) {
ap, err := vmPool.getAgentpoolFromAzure()
if err != nil {
klog.Errorf("Failed to get agentpool %s from Azure with error: %v", vmPool.agentPoolName, err)
return -1, err
}
if ap.Properties != nil {
// the VirtualMachineNodesStatus returned by AKS-RP is constructed from the vm list returned from CRP.
// it only contains VMs in the running state.
for _, status := range ap.Properties.VirtualMachineNodesStatus {
if status != nil {
if strings.EqualFold(to.String(status.Size), vmPool.sku) {
return to.Int32(status.Count), nil
}
}
}
}
return -1, fmt.Errorf("failed to get the size of spot agentpool %s", vmPool.agentPoolName)
}
// getVMsFromCache retrieves the list of virtual machines in this VMPool.
// Depending on the provided skipOption, it skips VMs in the "Deleting" and/or "Failed" states.
// https://learn.microsoft.com/en-us/azure/virtual-machines/states-billing#provisioning-states
func (vmPool *VMPool) getVMsFromCache(op skipOption) ([]compute.VirtualMachine, error) {
vmsMap := vmPool.manager.azureCache.getVirtualMachines()
var filteredVMs []compute.VirtualMachine
for _, vm := range vmsMap[vmPool.agentPoolName] {
if vm.VirtualMachineProperties == nil ||
vm.VirtualMachineProperties.HardwareProfile == nil ||
!strings.EqualFold(string(vm.HardwareProfile.VMSize), vmPool.sku) {
continue
}
if op.skipDeleting && strings.Contains(to.String(vm.VirtualMachineProperties.ProvisioningState), "Deleting") {
klog.V(4).Infof("Skipping VM %s in deleting state", to.String(vm.ID))
continue
}
if op.skipFailed && strings.Contains(to.String(vm.VirtualMachineProperties.ProvisioningState), "Failed") {
klog.V(4).Infof("Skipping VM %s in failed state", to.String(vm.ID))
continue
}
filteredVMs = append(filteredVMs, vm)
}
return filteredVMs, nil
}
// Nodes returns the list of nodes in the vms agentPool.
func (vmPool *VMPool) Nodes() ([]cloudprovider.Instance, error) {
vms, err := vmPool.getVMsFromCache(skipOption{}) // no skip option, get all VMs
if err != nil {
return nil, err
}
@ -163,7 +447,7 @@ func (agentPool *VMsPool) Nodes() ([]cloudprovider.Instance, error) {
if vm.ID == nil || len(*vm.ID) == 0 {
continue
}
resourceID, err := convertResourceGroupNameToLower("azure://" + *vm.ID)
resourceID, err := convertResourceGroupNameToLower("azure://" + to.String(vm.ID))
if err != nil {
return nil, err
}
@ -173,12 +457,53 @@ func (agentPool *VMsPool) Nodes() ([]cloudprovider.Instance, error) {
return nodes, nil
}
// TemplateNodeInfo returns a NodeInfo object that can be used to create a new node in the vmPool.
func (vmPool *VMPool) TemplateNodeInfo() (*framework.NodeInfo, error) {
ap, err := vmPool.getAgentpoolFromCache()
if err != nil {
return nil, err
}
inputLabels := map[string]string{}
inputTaints := ""
template, err := buildNodeTemplateFromVMPool(ap, vmPool.manager.config.Location, vmPool.sku, inputLabels, inputTaints)
if err != nil {
return nil, err
}
node, err := buildNodeFromTemplate(vmPool.agentPoolName, template, vmPool.manager, vmPool.manager.config.EnableDynamicInstanceList)
if err != nil {
return nil, err
}
nodeInfo := framework.NewNodeInfo(node, nil, &framework.PodInfo{Pod: cloudprovider.BuildKubeProxy(vmPool.agentPoolName)})
return nodeInfo, nil
}
func (vmPool *VMPool) getAgentpoolFromCache() (armcontainerservice.AgentPool, error) {
vmsPoolMap := vmPool.manager.azureCache.getVMsPoolMap()
if _, exists := vmsPoolMap[vmPool.agentPoolName]; !exists {
return armcontainerservice.AgentPool{}, fmt.Errorf("VMs agent pool %s not found in cache", vmPool.agentPoolName)
}
return vmsPoolMap[vmPool.agentPoolName], nil
}
// getAgentpoolFromAzure returns the AKS agentpool from Azure
func (vmPool *VMPool) getAgentpoolFromAzure() (armcontainerservice.AgentPool, error) {
ctx, cancel := getContextWithTimeout(vmsContextTimeout)
defer cancel()
resp, err := vmPool.manager.azClient.agentPoolClient.Get(
ctx,
vmPool.manager.config.ClusterResourceGroup,
vmPool.manager.config.ClusterName,
vmPool.agentPoolName, nil)
if err != nil {
return resp.AgentPool, fmt.Errorf("failed to get agentpool %s in cluster %s with error: %v",
vmPool.agentPoolName, vmPool.manager.config.ClusterName, err)
}
return resp.AgentPool, nil
}
// AtomicIncreaseSize is not implemented.
func (vmPool *VMPool) AtomicIncreaseSize(delta int) error {
return cloudprovider.ErrNotImplemented
}
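
Taken together, the two skip configurations above encode the pool's size accounting. The following is a minimal illustrative sketch, not part of the change itself; it compiles within this package and assumes an initialized `*VMPool`:

```go
// exampleSizeAccounting shows how TargetSize and IncreaseSize choose
// their skipOption values (illustrative helper only).
func exampleSizeAccounting(pool *VMPool) (target, scaleUpBase int32, err error) {
	// TargetSize excludes VMs already in the "Deleting" state: they are on
	// their way out and no longer count toward the desired size.
	if target, err = pool.getCurSize(skipOption{skipDeleting: true}); err != nil {
		return 0, 0, err
	}
	// IncreaseSize additionally excludes "Failed" VMs, so the computed base is
	// lower and the subsequent PUT both adds capacity and repairs the failures.
	scaleUpBase, err = pool.getCurSize(skipOption{skipDeleting: true, skipFailed: true})
	return target, scaleUpBase, err
}
```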


@ -17,45 +17,64 @@ limitations under the License.
package azure
import (
"context"
"fmt"
"net/http"
"testing"
"github.com/Azure/azure-sdk-for-go/sdk/azcore/runtime"
"github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v5"
"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2022-08-01/compute"
"github.com/Azure/go-autorest/autorest/to"
"go.uber.org/mock/gomock"
"github.com/stretchr/testify/assert"
apiv1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
"k8s.io/autoscaler/cluster-autoscaler/config"
"k8s.io/autoscaler/cluster-autoscaler/config/dynamic"
providerazure "sigs.k8s.io/cloud-provider-azure/pkg/provider"
"sigs.k8s.io/cloud-provider-azure/pkg/azureclients/vmclient/mockvmclient"
)
const (
vmSku = "Standard_D2_v2"
vmsAgentPoolName = "test-vms-pool"
vmsNodeGroupName = vmsAgentPoolName + "/" + vmSku
fakeVMsNodeName = "aks-" + vmsAgentPoolName + "-13222729-vms%d"
fakeVMsPoolVMID = "/subscriptions/test-subscription-id/resourceGroups/test-rg/providers/Microsoft.Compute/virtualMachines/" + fakeVMsNodeName
)
func newTestVMsPool(manager *AzureManager) *VMPool {
return &VMPool{
azureRef: azureRef{
Name: vmsNodeGroupName,
},
manager: manager,
minSize: 3,
maxSize: 10,
agentPoolName: vmsAgentPoolName,
sku: vmSku,
}
}
func newTestVMsPoolVMList(count int) []compute.VirtualMachine {
var vmList []compute.VirtualMachine
for i := 0; i < count; i++ {
vm := compute.VirtualMachine{
ID: to.StringPtr(fmt.Sprintf(fakeVMsPoolVMID, i)),
VirtualMachineProperties: &compute.VirtualMachineProperties{
VMID: to.StringPtr(fmt.Sprintf("123E4567-E89B-12D3-A456-426655440000-%d", i)),
HardwareProfile: &compute.HardwareProfile{
VMSize: compute.VirtualMachineSizeTypes(vmSku),
},
ProvisioningState: to.StringPtr("Succeeded"),
},
Tags: map[string]*string{
agentpoolTypeTag: to.StringPtr("VirtualMachines"),
agentpoolNameTag: to.StringPtr("test-vms-pool"),
agentpoolNameTag: to.StringPtr(vmsAgentPoolName),
},
}
vmList = append(vmList, vm)
@ -63,41 +82,73 @@ func newTestVMsPoolVMList(count int) []compute.VirtualMachine {
return vmList
}
func newVMsNode(vmIdx int64) *apiv1.Node {
return &apiv1.Node{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf(fakeVMsNodeName, vmIdx),
},
Spec: apiv1.NodeSpec{
ProviderID: "azure://" + fmt.Sprintf(fakeVMsPoolVMID, vmID),
ProviderID: "azure://" + fmt.Sprintf(fakeVMsPoolVMID, vmIdx),
},
}
}
func getTestVMsAgentPool(isSystemPool bool) armcontainerservice.AgentPool {
mode := armcontainerservice.AgentPoolModeUser
if isSystemPool {
mode = armcontainerservice.AgentPoolModeSystem
}
vmsPoolType := armcontainerservice.AgentPoolTypeVirtualMachines
return armcontainerservice.AgentPool{
Name: to.StringPtr(vmsAgentPoolName),
Properties: &armcontainerservice.ManagedClusterAgentPoolProfileProperties{
Type: &vmsPoolType,
Mode: &mode,
VirtualMachinesProfile: &armcontainerservice.VirtualMachinesProfile{
Scale: &armcontainerservice.ScaleProfile{
Manual: []*armcontainerservice.ManualScaleProfile{
{
Count: to.Int32Ptr(3),
Sizes: []*string{to.StringPtr(vmSku)},
},
},
},
},
VirtualMachineNodesStatus: []*armcontainerservice.VirtualMachineNodes{
{
Count: to.Int32Ptr(3),
Size: to.StringPtr(vmSku),
},
},
},
}
}
func TestNewVMsPool(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
manager := newTestAzureManager(t)
manager.azClient.agentPoolClient = mockAgentpoolclient
manager.config.ResourceGroup = "MC_rg"
manager.config.ClusterResourceGroup = "rg"
manager.config.ClusterName = "mycluster"
assert.Equal(t, "test-nodepool", nodepool.azureRef.Name)
assert.Equal(t, "test-resource-group", nodepool.resourceGroup)
assert.Equal(t, int64(-1), nodepool.curSize)
assert.Equal(t, 1, nodepool.minSize)
assert.Equal(t, 5, nodepool.maxSize)
assert.Equal(t, am, nodepool.manager)
spec := &dynamic.NodeGroupSpec{
Name: vmsAgentPoolName,
MinSize: 1,
MaxSize: 10,
}
ap, err := NewVMPool(spec, manager, vmsAgentPoolName, vmSku)
assert.NoError(t, err)
assert.Equal(t, vmsAgentPoolName, ap.azureRef.Name)
assert.Equal(t, 1, ap.minSize)
assert.Equal(t, 10, ap.maxSize)
}
func TestMinSize(t *testing.T) {
agentPool := &VMPool{
minSize: 1,
}
@ -105,12 +156,12 @@ func TestMinSize(t *testing.T) {
}
func TestExist(t *testing.T) {
agentPool := &VMPool{}
assert.True(t, agentPool.Exist())
}
func TestCreate(t *testing.T) {
agentPool := &VMPool{}
nodeGroup, err := agentPool.Create()
assert.Nil(t, nodeGroup)
@ -118,65 +169,43 @@ func TestCreate(t *testing.T) {
}
func TestDelete(t *testing.T) {
agentPool := &VMPool{}
err := agentPool.Delete()
assert.Equal(t, cloudprovider.ErrNotImplemented, err)
}
func TestAutoprovisioned(t *testing.T) {
agentPool := &VMPool{}
assert.False(t, agentPool.Autoprovisioned())
}
func TestGetOptions(t *testing.T) {
agentPool := &VMPool{}
defaults := config.NodeGroupAutoscalingOptions{}
options, err := agentPool.GetOptions(defaults)
assert.Nil(t, options)
assert.Nil(t, err)
}
func TestMaxSize(t *testing.T) {
agentPool := &VMPool{
maxSize: 10,
}
assert.Equal(t, 10, agentPool.MaxSize())
}
func TestDecreaseTargetSize(t *testing.T) {
agentPool := newTestVMsPool(newTestAzureManager(t))
err := agentPool.DecreaseTargetSize(1)
assert.Nil(t, err)
}
func TestId(t *testing.T) {
agentPool := &VMPool{
azureRef: azureRef{
Name: "test-id",
},
@ -186,7 +215,7 @@ func TestId(t *testing.T) {
}
func TestDebug(t *testing.T) {
agentPool := &VMPool{
azureRef: azureRef{
Name: "test-debug",
},
@ -198,115 +227,341 @@ func TestDebug(t *testing.T) {
assert.Equal(t, expectedDebugString, agentPool.Debug())
}
func TestTemplateNodeInfo(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
ap := newTestVMsPool(newTestAzureManager(t))
ap.manager.config.EnableVMsAgentPool = true
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
ap.manager.azClient.agentPoolClient = mockAgentpoolclient
agentpool := getTestVMsAgentPool(false)
fakeAPListPager := getFakeAgentpoolListPager(&agentpool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).
Return(fakeAPListPager)
ac, err := newAzureCache(ap.manager.azClient, refreshInterval, *ap.manager.config)
assert.NoError(t, err)
ap.manager.azureCache = ac
nodeInfo, err := ap.TemplateNodeInfo()
assert.NotNil(t, nodeInfo)
assert.Nil(t, err)
}
func TestAtomicIncreaseSize(t *testing.T) {
agentPool := &VMPool{}
err := agentPool.AtomicIncreaseSize(1)
assert.Equal(t, cloudprovider.ErrNotImplemented, err)
}
func TestGetVMsFromCache(t *testing.T) {
manager := &AzureManager{
azureCache: &azureCache{
virtualMachines: make(map[string][]compute.VirtualMachine),
vmsPoolMap: make(map[string]armcontainerservice.AgentPool),
},
}
agentPool := &VMPool{
manager: manager,
agentPoolName: vmsAgentPoolName,
sku: vmSku,
}
// Test case 1 - when the vms pool is not found in the cache
vms, err := agentPool.getVMsFromCache(skipOption{})
assert.Nil(t, err)
assert.Len(t, vms, 0)
// Test case 2 - when the vms pool is found in the cache but has no VMs
manager.azureCache.virtualMachines[vmsAgentPoolName] = []compute.VirtualMachine{}
vms, err = agentPool.getVMsFromCache(skipOption{})
assert.NoError(t, err)
assert.Len(t, vms, 0)
// Test case 3 - when the vms pool is found in the cache and has VMs
manager.azureCache.virtualMachines[vmsAgentPoolName] = newTestVMsPoolVMList(3)
vms, err = agentPool.getVMsFromCache(skipOption{})
assert.NoError(t, err)
assert.Len(t, vms, 3)
// Test case 4 - should skip failed VMs
vmList := newTestVMsPoolVMList(3)
vmList[0].VirtualMachineProperties.ProvisioningState = to.StringPtr("Failed")
manager.azureCache.virtualMachines[vmsAgentPoolName] = vmList
vms, err = agentPool.getVMsFromCache(skipOption{skipFailed: true})
assert.NoError(t, err)
assert.Len(t, vms, 2)
// Test case 5 - should skip deleting VMs
vmList = newTestVMsPoolVMList(3)
vmList[0].VirtualMachineProperties.ProvisioningState = to.StringPtr("Deleting")
manager.azureCache.virtualMachines[vmsAgentPoolName] = vmList
vms, err = agentPool.getVMsFromCache(skipOption{skipDeleting: true})
assert.NoError(t, err)
assert.Len(t, vms, 2)
// Test case 6 - should not skip deleting VMs
vmList = newTestVMsPoolVMList(3)
vmList[0].VirtualMachineProperties.ProvisioningState = to.StringPtr("Deleting")
manager.azureCache.virtualMachines[vmsAgentPoolName] = vmList
vms, err = agentPool.getVMsFromCache(skipOption{skipFailed: true})
assert.NoError(t, err)
assert.Len(t, vms, 3)
// Test case 7 - when the vms pool is found in the cache and has VMs with no name
manager.azureCache.virtualMachines[vmsAgentPoolName] = newTestVMsPoolVMList(3)
agentPool.agentPoolName = ""
vms, err = agentPool.getVMsFromCache(skipOption{})
assert.NoError(t, err)
assert.Len(t, vms, 0)
}
func TestGetVMsFromCacheForVMsPool(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
ap := newTestVMsPool(newTestAzureManager(t))
expectedVMs := newTestVMsPoolVMList(2)
mockVMClient := mockvmclient.NewMockInterface(ctrl)
ap.manager.azClient.virtualMachinesClient = mockVMClient
ap.manager.config.EnableVMsAgentPool = true
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
ap.manager.azClient.agentPoolClient = mockAgentpoolclient
mockVMClient.EXPECT().List(gomock.Any(), ap.manager.config.ResourceGroup).Return(expectedVMs, nil)
agentpool := getTestVMsAgentPool(false)
fakeAPListPager := getFakeAgentpoolListPager(&agentpool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).
Return(fakeAPListPager)
ac, err := newAzureCache(ap.manager.azClient, refreshInterval, *ap.manager.config)
assert.NoError(t, err)
ac.enableVMsAgentPool = true
ap.manager.azureCache = ac
vms, err := ap.getVMsFromCache(skipOption{})
assert.Equal(t, 2, len(vms))
assert.NoError(t, err)
}
func TestNodes(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
ap := newTestVMsPool(newTestAzureManager(t))
expectedVMs := newTestVMsPoolVMList(2)
mockVMClient := mockvmclient.NewMockInterface(ctrl)
ap.manager.azClient.virtualMachinesClient = mockVMClient
mockVMClient.EXPECT().List(gomock.Any(), ap.manager.config.ResourceGroup).Return(expectedVMs, nil)
ap.manager.config.EnableVMsAgentPool = true
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
ap.manager.azClient.agentPoolClient = mockAgentpoolclient
agentpool := getTestVMsAgentPool(false)
fakeAPListPager := getFakeAgentpoolListPager(&agentpool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).
Return(fakeAPListPager)
ac, err := newAzureCache(ap.manager.azClient, refreshInterval, *ap.manager.config)
assert.NoError(t, err)
ap.manager.azureCache = ac
vms, err := ap.Nodes()
assert.Equal(t, 2, len(vms))
assert.NoError(t, err)
}
func TestGetCurSizeForVMsPool(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
ap := newTestVMsPool(newTestAzureManager(t))
expectedVMs := newTestVMsPoolVMList(3)
mockVMClient := mockvmclient.NewMockInterface(ctrl)
ap.manager.azClient.virtualMachinesClient = mockVMClient
mockVMClient.EXPECT().List(gomock.Any(), ap.manager.config.ResourceGroup).Return(expectedVMs, nil)
ap.manager.config.EnableVMsAgentPool = true
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
ap.manager.azClient.agentPoolClient = mockAgentpoolclient
agentpool := getTestVMsAgentPool(false)
fakeAPListPager := getFakeAgentpoolListPager(&agentpool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).
Return(fakeAPListPager)
ac, err := newAzureCache(ap.manager.azClient, refreshInterval, *ap.manager.config)
assert.NoError(t, err)
ap.manager.azureCache = ac
curSize, err := ap.getCurSize(skipOption{})
assert.NoError(t, err)
assert.Equal(t, int32(3), curSize)
}
func TestVMsPoolIncreaseSize(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
manager := newTestAzureManager(t)
ap := newTestVMsPool(manager)
expectedVMs := newTestVMsPoolVMList(3)
mockVMClient := mockvmclient.NewMockInterface(ctrl)
ap.manager.azClient.virtualMachinesClient = mockVMClient
mockVMClient.EXPECT().List(gomock.Any(), ap.manager.config.ResourceGroup).Return(expectedVMs, nil)
ap.manager.config.EnableVMsAgentPool = true
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
ap.manager.azClient.agentPoolClient = mockAgentpoolclient
agentpool := getTestVMsAgentPool(false)
fakeAPListPager := getFakeAgentpoolListPager(&agentpool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).
Return(fakeAPListPager)
ac, err := newAzureCache(ap.manager.azClient, refreshInterval, *ap.manager.config)
assert.NoError(t, err)
ap.manager.azureCache = ac
// failure case 1
err1 := ap.IncreaseSize(-1)
expectedErr := fmt.Errorf("size increase must be positive, current delta: -1")
assert.Equal(t, expectedErr, err1)
// failure case 2
err2 := ap.IncreaseSize(8)
expectedErr = fmt.Errorf("size-increasing request of 11 is bigger than max size 10")
assert.Equal(t, expectedErr, err2)
// success case 3
resp := &http.Response{
Header: map[string][]string{
"Fake-Poller-Status": {"Done"},
},
}
fakePoller, pollerErr := runtime.NewPoller(resp, runtime.Pipeline{},
&runtime.NewPollerOptions[armcontainerservice.AgentPoolsClientCreateOrUpdateResponse]{
Handler: &fakehandler[armcontainerservice.AgentPoolsClientCreateOrUpdateResponse]{},
})
assert.NoError(t, pollerErr)
mockAgentpoolclient.EXPECT().BeginCreateOrUpdate(
gomock.Any(), manager.config.ClusterResourceGroup,
manager.config.ClusterName,
vmsAgentPoolName,
gomock.Any(), gomock.Any()).Return(fakePoller, nil)
err3 := ap.IncreaseSize(1)
assert.NoError(t, err3)
}
func TestDeleteVMsPoolNodes_Failed(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
ap := newTestVMsPool(newTestAzureManager(t))
node := newVMsNode(0)
expectedVMs := newTestVMsPoolVMList(3)
mockVMClient := mockvmclient.NewMockInterface(ctrl)
ap.manager.azClient.virtualMachinesClient = mockVMClient
ap.manager.config.EnableVMsAgentPool = true
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
agentpool := getTestVMsAgentPool(false)
ap.manager.azClient.agentPoolClient = mockAgentpoolclient
fakeAPListPager := getFakeAgentpoolListPager(&agentpool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).Return(fakeAPListPager)
mockVMClient.EXPECT().List(gomock.Any(), ap.manager.config.ResourceGroup).Return(expectedVMs, nil)
ap.manager.azureCache.enableVMsAgentPool = true
registered := ap.manager.RegisterNodeGroup(ap)
assert.True(t, registered)
ap.manager.explicitlyConfigured[vmsNodeGroupName] = true
ap.manager.forceRefresh()
// failure case
deleteErr := ap.DeleteNodes([]*apiv1.Node{node})
assert.Error(t, deleteErr)
assert.Contains(t, deleteErr.Error(), "cannot delete nodes as minimum size of 3 has been reached")
}
func TestDeleteVMsPoolNodes_Success(t *testing.T) {
ctrl := gomock.NewController(t)
defer ctrl.Finish()
ap := newTestVMsPool(newTestAzureManager(t))
expectedVMs := newTestVMsPoolVMList(5)
mockVMClient := mockvmclient.NewMockInterface(ctrl)
ap.manager.azClient.virtualMachinesClient = mockVMClient
ap.manager.config.EnableVMsAgentPool = true
mockAgentpoolclient := NewMockAgentPoolsClient(ctrl)
agentpool := getTestVMsAgentPool(false)
ap.manager.azClient.agentPoolClient = mockAgentpoolclient
fakeAPListPager := getFakeAgentpoolListPager(&agentpool)
mockAgentpoolclient.EXPECT().NewListPager(gomock.Any(), gomock.Any(), nil).Return(fakeAPListPager)
mockVMClient.EXPECT().List(gomock.Any(), ap.manager.config.ResourceGroup).Return(expectedVMs, nil)
ap.manager.azureCache.enableVMsAgentPool = true
registered := ap.manager.RegisterNodeGroup(ap)
assert.True(t, registered)
ap.manager.explicitlyConfigured[vmsNodeGroupName] = true
ap.manager.forceRefresh()
// success case
resp := &http.Response{
Header: map[string][]string{
"Fake-Poller-Status": {"Done"},
},
}
fakePoller, err := runtime.NewPoller(resp, runtime.Pipeline{},
&runtime.NewPollerOptions[armcontainerservice.AgentPoolsClientDeleteMachinesResponse]{
Handler: &fakehandler[armcontainerservice.AgentPoolsClientDeleteMachinesResponse]{},
})
assert.NoError(t, err)
mockAgentpoolclient.EXPECT().BeginDeleteMachines(
gomock.Any(), ap.manager.config.ClusterResourceGroup,
ap.manager.config.ClusterName,
vmsAgentPoolName,
gomock.Any(), gomock.Any()).Return(fakePoller, nil)
node := newVMsNode(0)
derr := ap.DeleteNodes([]*apiv1.Node{node})
assert.NoError(t, derr)
}
type fakehandler[T any] struct{}
func (f *fakehandler[T]) Done() bool {
return true
}
func (f *fakehandler[T]) Poll(ctx context.Context) (*http.Response, error) {
return nil, nil
}
func (f *fakehandler[T]) Result(ctx context.Context, out *T) error {
return nil
}
func getFakeAgentpoolListPager(agentpool ...*armcontainerservice.AgentPool) *runtime.Pager[armcontainerservice.AgentPoolsClientListResponse] {
fakeFetcher := func(ctx context.Context, response *armcontainerservice.AgentPoolsClientListResponse) (armcontainerservice.AgentPoolsClientListResponse, error) {
return armcontainerservice.AgentPoolsClientListResponse{
AgentPoolListResult: armcontainerservice.AgentPoolListResult{
Value: agentpool,
},
}, nil
}
return runtime.NewPager(runtime.PagingHandler[armcontainerservice.AgentPoolsClientListResponse]{
More: func(response armcontainerservice.AgentPoolsClientListResponse) bool {
return false
},
Fetcher: fakeFetcher,
})
}


@ -186,9 +186,15 @@ There are two annotations that control how a cluster resource should be scaled:
The autoscaler will monitor any `MachineSet`, `MachineDeployment`, or `MachinePool` containing
both of these annotations.
> Note: The cluster autoscaler does not enforce the node group sizes. If a node group is
> below the minimum number of nodes, or above the maximum number of nodes, the cluster
> autoscaler will not scale that node group up or down. The cluster autoscaler can be configured
> to enforce the minimum node group size by enabling the `--enforce-node-group-min-size` flag.
> Please see [this entry in the Cluster Autoscaler FAQ](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#my-cluster-is-below-minimum--above-maximum-number-of-nodes-but-ca-did-not-fix-that-why)
> for more information.
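
For example, enabling that behavior is a single flag on the autoscaler command line (a sketch showing only the flag named above; all other flags are elided):

```
cluster-autoscaler --enforce-node-group-min-size=true
```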
> Note: `MachinePool` support in cluster-autoscaler requires a provider implementation
> that supports the "MachinePool Machines" feature.
### Scale from zero support
@ -208,6 +214,11 @@ autoscaler about the sizing of the nodes in the node group. At the minimum,
you must specify the CPU and memory annotations; these annotations should
match the expected capacity of the nodes created from the infrastructure.
> Note: The scale from zero annotations will override any capacity information
> supplied by the Cluster API provider in the infrastructure machine templates.
> If both the annotations and the provider supplied capacity information are
> present, the annotations will take precedence.
For example, if my MachineDeployment will create nodes that have "16000m" CPU,
"128G" memory, "100Gi" ephemeral disk storage, 2 NVidia GPUs, and can support
200 max pods, the following annotations will instruct the autoscaler how to
@ -223,14 +234,23 @@ metadata:
capacity.cluster-autoscaler.kubernetes.io/memory: "128G"
capacity.cluster-autoscaler.kubernetes.io/cpu: "16"
capacity.cluster-autoscaler.kubernetes.io/ephemeral-disk: "100Gi"
capacity.cluster-autoscaler.kubernetes.io/maxPods: "200"
# Device Plugin
# Comment out the below annotation if DRA is enabled on your cluster running k8s v1.32.0 or greater
capacity.cluster-autoscaler.kubernetes.io/gpu-type: "nvidia.com/gpu"
# Dynamic Resource Allocation (DRA)
# Uncomment the below annotation if DRA is enabled on your cluster running k8s v1.32.0 or greater
# capacity.cluster-autoscaler.kubernetes.io/dra-driver: "gpu.nvidia.com"
# Common to Device Plugin and DRA
capacity.cluster-autoscaler.kubernetes.io/gpu-count: "2"
```
> Note: the `maxPods` annotation will default to `110` if it is not supplied.
> This value is inspired by the Kubernetes best practices
> [Considerations for large clusters](https://kubernetes.io/docs/setup/best-practices/cluster-large/).
> Note: Users should set either the `gpu-type` annotation (Device Plugin) or the `dra-driver`
> annotation (Dynamic Resource Allocation (DRA)), depending on which GPU mechanism the cluster uses. `gpu-count` is a common parameter for both.
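
For instance, a DRA-based node group on Kubernetes v1.32.0 or later keeps `gpu-count` but replaces `gpu-type` with `dra-driver`, as in this sketch (reusing the example driver name from the block above):

```yaml
metadata:
  annotations:
    capacity.cluster-autoscaler.kubernetes.io/dra-driver: "gpu.nvidia.com"
    capacity.cluster-autoscaler.kubernetes.io/gpu-count: "2"
```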
#### RBAC changes for scaling from zero
@ -275,6 +295,12 @@ metadata:
capacity.cluster-autoscaler.kubernetes.io/taints: "key1=value1:NoSchedule,key2=value2:NoExecute"
```
> Note: The labels supplied through the capacity annotation will be combined
> with the labels to be propagated from the scalable Cluster API resource.
> The annotation does not override the labels in the scalable resource.
> Please see the [Cluster API Book chapter on Metadata propagation](https://cluster-api.sigs.k8s.io/reference/api/metadata-propagation)
> for more information.
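
As an illustration (a sketch, assuming the `capacity.cluster-autoscaler.kubernetes.io/labels` capacity annotation covered elsewhere in this document): if the annotation supplies `cpu-arch=arm64` while the MachineDeployment propagates `environment=production` through its node metadata, a node simulated for scale from zero is expected to carry both labels.

```yaml
metadata:
  annotations:
    capacity.cluster-autoscaler.kubernetes.io/labels: "cpu-arch=arm64"
```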
#### Per-NodeGroup autoscaling options
Custom autoscaling options per node group (MachineDeployment/MachinePool/MachineSet) can be specified as annotations with a common prefix:
@ -391,8 +417,6 @@ spec:
## replicas: 1
```
## Special note on GPU instances


@ -18,7 +18,9 @@ package clusterapi
import (
"context"
"encoding/json"
"fmt"
"k8s.io/apimachinery/pkg/types"
"math/rand"
"path"
"reflect"
@ -251,6 +253,39 @@ func mustCreateTestController(t testing.TB, testConfigs ...*testConfig) (*machin
}
return true, s, nil
case "patch":
action, ok := action.(clientgotesting.PatchAction)
if !ok {
return true, nil, fmt.Errorf("failed to convert Action to PatchAction: %T", action)
}
pt := action.GetPatchType()
if pt != types.MergePatchType {
return true, nil, fmt.Errorf("unexpected patch type: expected = %s, got = %s", types.MergePatchType, pt)
}
var scale autoscalingv1.Scale
err := json.Unmarshal(action.GetPatch(), &scale)
if err != nil {
return true, nil, fmt.Errorf("couldn't unmarshal patch: %w", err)
}
_, err = dynamicClientset.Resource(gvr).Namespace(action.GetNamespace()).Patch(context.TODO(), action.GetName(), pt, action.GetPatch(), metav1.PatchOptions{})
if err != nil {
return true, nil, err
}
newReplicas := scale.Spec.Replicas
return true, &autoscalingv1.Scale{
ObjectMeta: metav1.ObjectMeta{
Name: action.GetName(),
Namespace: action.GetNamespace(),
},
Spec: autoscalingv1.ScaleSpec{
Replicas: newReplicas,
},
}, nil
default:
return true, nil, fmt.Errorf("unknown verb: %v", action.GetVerb())
}


@ -253,9 +253,82 @@ func (ng *nodegroup) Nodes() ([]cloudprovider.Instance, error) {
// must match the ID on the Node object itself.
// https://github.com/kubernetes/autoscaler/blob/a973259f1852303ba38a3a61eeee8489cf4e1b13/cluster-autoscaler/clusterstate/clusterstate.go#L967-L985
instances := make([]cloudprovider.Instance, len(providerIDs))
for i, providerID := range providerIDs {
providerIDNormalized := normalizedProviderID(providerID)
// Add instance Status to report instance state to cluster-autoscaler.
// This helps cluster-autoscaler make better scaling decisions.
//
// A Machine can be Failed for a variety of reasons; here we look for two specific errors:
// - failed to provision
// - failed to delete
// Machines that are Failed for other reasons do not forward a Status to the autoscaler.
var status *cloudprovider.InstanceStatus
switch {
case isFailedMachineProviderID(providerIDNormalized):
klog.V(4).Infof("Machine failed in node group %s (%s)", ng.Id(), providerID)
machine, err := ng.machineController.findMachineByProviderID(providerIDNormalized)
if err != nil {
return nil, err
}
if machine != nil {
if !machine.GetDeletionTimestamp().IsZero() {
klog.V(4).Infof("Machine failed in node group %s (%s) is being deleted", ng.Id(), providerID)
status = &cloudprovider.InstanceStatus{
State: cloudprovider.InstanceDeleting,
ErrorInfo: &cloudprovider.InstanceErrorInfo{
ErrorClass: cloudprovider.OtherErrorClass,
ErrorCode: "DeletingFailed",
ErrorMessage: "Machine deletion failed",
},
}
} else {
_, nodeFound, err := unstructured.NestedFieldCopy(machine.UnstructuredContent(), "status", "nodeRef")
if err != nil {
return nil, err
}
// Machine failed without a nodeRef, this indicates that the machine failed to provision.
// This is a special case where the machine is in a Failed state and the node is not created.
// Machine controller will not reconcile this machine, so we need to report this to the autoscaler.
if !nodeFound {
klog.V(4).Infof("Machine failed in node group %s (%s) was being created", ng.Id(), providerID)
status = &cloudprovider.InstanceStatus{
State: cloudprovider.InstanceCreating,
ErrorInfo: &cloudprovider.InstanceErrorInfo{
ErrorClass: cloudprovider.OtherErrorClass,
ErrorCode: "ProvisioningFailed",
ErrorMessage: "Machine provisioning failed",
},
}
}
}
}
case isPendingMachineProviderID(providerIDNormalized):
klog.V(4).Infof("Machine pending in node group %s (%s)", ng.Id(), providerID)
status = &cloudprovider.InstanceStatus{
State: cloudprovider.InstanceCreating,
}
case isDeletingMachineProviderID(providerIDNormalized):
klog.V(4).Infof("Machine deleting in node group %s (%s)", ng.Id(), providerID)
status = &cloudprovider.InstanceStatus{
State: cloudprovider.InstanceDeleting,
}
default:
klog.V(4).Infof("Machine running in node group %s (%s)", ng.Id(), providerID)
status = &cloudprovider.InstanceStatus{
State: cloudprovider.InstanceRunning,
}
}
instances[i] = cloudprovider.Instance{
Id: providerID,
Status: status,
}
}
@ -298,7 +371,12 @@ func (ng *nodegroup) TemplateNodeInfo() (*framework.NodeInfo, error) {
return nil, err
}
nodeInfo := framework.NewNodeInfo(&node, nil, &framework.PodInfo{Pod: cloudprovider.BuildKubeProxy(ng.scalableResource.Name())})
resourceSlices, err := ng.scalableResource.InstanceResourceSlices(nodeName)
if err != nil {
return nil, err
}
nodeInfo := framework.NewNodeInfo(&node, resourceSlices, &framework.PodInfo{Pod: cloudprovider.BuildKubeProxy(ng.scalableResource.Name())})
return nodeInfo, nil
}


@ -1446,12 +1446,19 @@ func TestNodeGroupTemplateNodeInfo(t *testing.T) {
nodeGroupMaxSizeAnnotationKey: "10",
}
type testResourceSlice struct {
driverName string
gpuCount int
deviceType string
}
type testCaseConfig struct {
nodeLabels map[string]string
includeNodes bool
expectedErr error
expectedCapacity map[corev1.ResourceName]int64
expectedNodeLabels map[string]string
expectedResourceSlice testResourceSlice
}
testCases := []struct {
@ -1544,6 +1551,33 @@ func TestNodeGroupTemplateNodeInfo(t *testing.T) {
},
},
},
{
name: "When the NodeGroup can scale from zero and DRA is enabled, it creates ResourceSlice derived from the annotation of DRA driver name and GPU count",
nodeGroupAnnotations: map[string]string{
memoryKey: "2048Mi",
cpuKey: "2",
draDriverKey: "gpu.nvidia.com",
gpuCountKey: "2",
},
config: testCaseConfig{
expectedErr: nil,
expectedCapacity: map[corev1.ResourceName]int64{
corev1.ResourceCPU: 2,
corev1.ResourceMemory: 2048 * 1024 * 1024,
corev1.ResourcePods: 110,
},
expectedResourceSlice: testResourceSlice{
driverName: "gpu.nvidia.com",
gpuCount: 2,
deviceType: GpuDeviceType,
},
expectedNodeLabels: map[string]string{
"kubernetes.io/os": "linux",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "random value",
},
},
},
}
test := func(t *testing.T, testConfig *testConfig, config testCaseConfig) {
@ -1607,6 +1641,18 @@ func TestNodeGroupTemplateNodeInfo(t *testing.T) {
}
}
}
for _, resourceslice := range nodeInfo.LocalResourceSlices {
if resourceslice.Spec.Driver != config.expectedResourceSlice.driverName {
t.Errorf("Expected DRA driver in ResourceSlice to have: %s, but got: %s", config.expectedResourceSlice.driverName, resourceslice.Spec.Driver)
} else if len(resourceslice.Spec.Devices) != config.expectedResourceSlice.gpuCount {
t.Errorf("Expected the number of DRA devices in ResourceSlice to have: %d, but got: %d", config.expectedResourceSlice.gpuCount, len(resourceslice.Spec.Devices))
}
for _, device := range resourceslice.Spec.Devices {
if *device.Basic.Attributes["type"].StringValue != config.expectedResourceSlice.deviceType {
t.Errorf("Expected device type to have: %s, but got: %s", config.expectedResourceSlice.deviceType, *device.Basic.Attributes["type"].StringValue)
}
}
}
}
for _, tc := range testCases {
@ -1768,3 +1814,223 @@ func TestNodeGroupGetOptions(t *testing.T) {
})
}
}
func TestNodeGroupNodesInstancesStatus(t *testing.T) {
type testCase struct {
description string
nodeCount int
includePendingMachine bool
includeDeletingMachine bool
includeFailedMachineWithNodeRef bool
includeFailedMachineWithoutNodeRef bool
includeFailedMachineDeleting bool
}
testCases := []testCase{
{
description: "standard number of nodes",
nodeCount: 5,
},
{
description: "includes a machine in pending state",
nodeCount: 5,
includePendingMachine: true,
},
{
description: "includes a machine in deleting state",
nodeCount: 5,
includeDeletingMachine: true,
},
{
description: "includes a machine in failed state with nodeRef",
nodeCount: 5,
includeFailedMachineWithNodeRef: true,
},
{
description: "includes a machine in failed state without nodeRef",
nodeCount: 5,
includeFailedMachineWithoutNodeRef: true,
},
}
test := func(t *testing.T, tc *testCase, testConfig *testConfig) {
controller, stop := mustCreateTestController(t, testConfig)
defer stop()
if tc.includePendingMachine {
if tc.nodeCount < 1 {
t.Fatal("test cannot pass, deleted machine requires at least 1 machine in machineset")
}
machine := testConfig.machines[0].DeepCopy()
unstructured.RemoveNestedField(machine.Object, "spec", "providerID")
unstructured.RemoveNestedField(machine.Object, "status", "nodeRef")
if err := updateResource(controller.managementClient, controller.machineInformer, controller.machineResource, machine); err != nil {
t.Fatalf("unexpected error updating machine, got %v", err)
}
}
if tc.includeDeletingMachine {
if tc.nodeCount < 2 {
t.Fatal("test cannot pass, deleted machine requires at least 2 machine in machineset")
}
machine := testConfig.machines[1].DeepCopy()
timestamp := metav1.Now()
machine.SetDeletionTimestamp(&timestamp)
if err := updateResource(controller.managementClient, controller.machineInformer, controller.machineResource, machine); err != nil {
t.Fatalf("unexpected error updating machine, got %v", err)
}
}
if tc.includeFailedMachineWithNodeRef {
if tc.nodeCount < 3 {
t.Fatal("test cannot pass, deleted machine requires at least 3 machine in machineset")
}
machine := testConfig.machines[2].DeepCopy()
unstructured.SetNestedField(machine.Object, "node-1", "status", "nodeRef", "name")
unstructured.SetNestedField(machine.Object, "ErrorMessage", "status", "errorMessage")
if err := updateResource(controller.managementClient, controller.machineInformer, controller.machineResource, machine); err != nil {
t.Fatalf("unexpected error updating machine, got %v", err)
}
}
if tc.includeFailedMachineWithoutNodeRef {
if tc.nodeCount < 4 {
t.Fatal("test cannot pass, deleted machine requires at least 4 machine in machineset")
}
machine := testConfig.machines[3].DeepCopy()
unstructured.RemoveNestedField(machine.Object, "status", "nodeRef")
unstructured.SetNestedField(machine.Object, "ErrorMessage", "status", "errorMessage")
if err := updateResource(controller.managementClient, controller.machineInformer, controller.machineResource, machine); err != nil {
t.Fatalf("unexpected error updating machine, got %v", err)
}
}
if tc.includeFailedMachineDeleting {
if tc.nodeCount < 5 {
t.Fatal("test cannot pass, deleted machine requires at least 5 machine in machineset")
}
machine := testConfig.machines[4].DeepCopy()
timestamp := metav1.Now()
machine.SetDeletionTimestamp(&timestamp)
unstructured.SetNestedField(machine.Object, "ErrorMessage", "status", "errorMessage")
if err := updateResource(controller.managementClient, controller.machineInformer, controller.machineResource, machine); err != nil {
t.Fatalf("unexpected error updating machine, got %v", err)
}
}
nodegroups, err := controller.nodeGroups()
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if l := len(nodegroups); l != 1 {
t.Fatalf("expected 1 nodegroup, got %d", l)
}
ng := nodegroups[0]
instances, err := ng.Nodes()
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
expectedCount := tc.nodeCount
if len(instances) != expectedCount {
t.Errorf("expected %d nodes, got %d", expectedCount, len(instances))
}
// Sort instances by Id for stable comparison
sort.Slice(instances, func(i, j int) bool {
return instances[i].Id < instances[j].Id
})
for _, instance := range instances {
t.Logf("instance: %v", instance)
if tc.includePendingMachine && strings.HasPrefix(instance.Id, pendingMachinePrefix) {
if instance.Status == nil || instance.Status.State != cloudprovider.InstanceCreating {
t.Errorf("expected pending machine to have status %v, got %v", cloudprovider.InstanceCreating, instance.Status)
}
} else if tc.includeDeletingMachine && strings.HasPrefix(instance.Id, deletingMachinePrefix) {
if instance.Status == nil || instance.Status.State != cloudprovider.InstanceDeleting {
t.Errorf("expected deleting machine to have status %v, got %v", cloudprovider.InstanceDeleting, instance.Status)
}
} else if tc.includeFailedMachineWithNodeRef && strings.HasPrefix(instance.Id, failedMachinePrefix) {
if instance.Status != nil {
t.Errorf("expected failed machine with nodeRef to not have status, got %v", instance.Status)
}
} else if tc.includeFailedMachineWithoutNodeRef && strings.HasPrefix(instance.Id, failedMachinePrefix) {
if instance.Status == nil || instance.Status.State != cloudprovider.InstanceCreating {
t.Errorf("expected failed machine without nodeRef to have status %v, got %v", cloudprovider.InstanceCreating, instance.Status)
}
if instance.Status == nil || instance.Status.ErrorInfo.ErrorClass != cloudprovider.OtherErrorClass {
t.Errorf("expected failed machine without nodeRef to have error class %v, got %v", cloudprovider.OtherErrorClass, instance.Status.ErrorInfo.ErrorClass)
}
if instance.Status == nil || instance.Status.ErrorInfo.ErrorCode != "ProvisioningFailed" {
t.Errorf("expected failed machine without nodeRef to have error code %v, got %v", "ProvisioningFailed", instance.Status.ErrorInfo.ErrorCode)
}
} else if tc.includeFailedMachineDeleting && strings.HasPrefix(instance.Id, failedMachinePrefix) {
if instance.Status == nil || instance.Status.State != cloudprovider.InstanceDeleting {
t.Errorf("expected failed machine deleting to have status %v, got %v", cloudprovider.InstanceDeleting, instance.Status)
}
if instance.Status == nil || instance.Status.ErrorInfo.ErrorClass != cloudprovider.OtherErrorClass {
t.Errorf("expected failed machine deleting to have error class %v, got %v", cloudprovider.OtherErrorClass, instance.Status.ErrorInfo.ErrorClass)
}
if instance.Status == nil || instance.Status.ErrorInfo.ErrorCode != "DeletingFailed" {
t.Errorf("expected failed machine deleting to have error code %v, got %v", "DeletingFailed", instance.Status.ErrorInfo.ErrorCode)
}
}
}
}
annotations := map[string]string{
nodeGroupMinSizeAnnotationKey: "1",
nodeGroupMaxSizeAnnotationKey: "10",
}
t.Run("MachineSet", func(t *testing.T) {
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
test(
t,
&tc,
createMachineSetTestConfig(
RandomString(6),
RandomString(6),
RandomString(6),
tc.nodeCount,
annotations,
nil,
),
)
})
}
})
t.Run("MachineDeployment", func(t *testing.T) {
for _, tc := range testCases {
t.Run(tc.description, func(t *testing.T) {
test(
t,
&tc,
createMachineDeploymentTestConfig(
RandomString(6),
RandomString(6),
RandomString(6),
tc.nodeCount,
annotations,
nil,
),
)
})
}
})
}


@ -18,20 +18,26 @@ package clusterapi
import (
"context"
"encoding/json"
"fmt"
"path"
"strconv"
"strings"
"time"
"github.com/pkg/errors"
autoscalingv1 "k8s.io/api/autoscaling/v1"
apiv1 "k8s.io/api/core/v1"
corev1 "k8s.io/api/core/v1"
resourceapi "k8s.io/api/resource/v1beta1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/validation"
klog "k8s.io/klog/v2"
"k8s.io/utils/ptr"
)
type unstructuredScalableResource struct {
@ -118,17 +124,18 @@ func (r unstructuredScalableResource) SetSize(nreplicas int) error {
return err
}
s, err := r.controller.managementScaleClient.Scales(r.Namespace()).Get(context.TODO(), gvr.GroupResource(), r.Name(), metav1.GetOptions{})
spec := autoscalingv1.Scale{
Spec: autoscalingv1.ScaleSpec{
Replicas: int32(nreplicas),
},
}
patch, err := json.Marshal(spec)
if err != nil {
return err
return fmt.Errorf("could not marshal json patch for scaling: %w", err)
}
if s == nil {
return fmt.Errorf("unknown %s %s/%s", r.Kind(), r.Namespace(), r.Name())
}
s.Spec.Replicas = int32(nreplicas)
_, updateErr := r.controller.managementScaleClient.Scales(r.Namespace()).Update(context.TODO(), gvr.GroupResource(), s, metav1.UpdateOptions{})
_, updateErr := r.controller.managementScaleClient.Scales(r.Namespace()).Patch(context.TODO(), gvr, r.Name(), types.MergePatchType, patch, metav1.PatchOptions{})
if updateErr == nil {
updateErr = unstructured.SetNestedField(r.unstructured.UnstructuredContent(), int64(nreplicas), "spec", "replicas")
@ -297,6 +304,48 @@ func (r unstructuredScalableResource) InstanceCapacity() (map[corev1.ResourceNam
return capacity, nil
}
func (r unstructuredScalableResource) InstanceResourceSlices(nodeName string) ([]*resourceapi.ResourceSlice, error) {
var result []*resourceapi.ResourceSlice
driver := r.InstanceDRADriver()
if driver == "" {
return nil, nil
}
gpuCount, err := r.InstanceGPUCapacityAnnotation()
if err != nil {
return nil, err
}
if !gpuCount.IsZero() {
resourceslice := &resourceapi.ResourceSlice{
ObjectMeta: metav1.ObjectMeta{
Name: nodeName + "-" + driver,
},
Spec: resourceapi.ResourceSliceSpec{
Driver: driver,
NodeName: nodeName,
Pool: resourceapi.ResourcePool{
Name: nodeName,
},
},
}
for i := 0; i < int(gpuCount.Value()); i++ {
device := resourceapi.Device{
Name: "gpu-" + strconv.Itoa(i),
Basic: &resourceapi.BasicDevice{
Attributes: map[resourceapi.QualifiedName]resourceapi.DeviceAttribute{
"type": {
StringValue: ptr.To(GpuDeviceType),
},
},
},
}
resourceslice.Spec.Devices = append(resourceslice.Spec.Devices, device)
}
result = append(result, resourceslice)
return result, nil
}
return nil, nil
}
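
For orientation, a hedged usage fragment for the new method, assuming an `unstructuredScalableResource` value `sr` as in the tests below; the node name is illustrative:

```go
// Sketch: with the annotations exercised by this PR's tests
// (dra-driver "gpu.nvidia.com", GPU count "2"), this returns one
// ResourceSlice carrying devices "gpu-0" and "gpu-1".
slices, err := sr.InstanceResourceSlices("template-node-0")
if err != nil {
	return err
}
for _, s := range slices {
	fmt.Printf("driver=%s devices=%d\n", s.Spec.Driver, len(s.Spec.Devices))
}
```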
func (r unstructuredScalableResource) InstanceEphemeralDiskCapacityAnnotation() (resource.Quantity, error) {
return parseEphemeralDiskCapacity(r.unstructured.GetAnnotations())
}
@ -321,6 +370,10 @@ func (r unstructuredScalableResource) InstanceMaxPodsCapacityAnnotation() (resou
return parseMaxPodsCapacity(r.unstructured.GetAnnotations())
}
func (r unstructuredScalableResource) InstanceDRADriver() string {
return parseDRADriver(r.unstructured.GetAnnotations())
}
func (r unstructuredScalableResource) readInfrastructureReferenceResource() (*unstructured.Unstructured, error) {
infraref, found, err := unstructured.NestedStringMap(r.unstructured.Object, "spec", "template", "spec", "infrastructureRef")
if !found || err != nil {


@ -24,10 +24,12 @@ import (
"github.com/stretchr/testify/assert"
v1 "k8s.io/api/core/v1"
resourceapi "k8s.io/api/resource/v1beta1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/client-go/tools/cache"
"k8s.io/utils/ptr"
)
const (
@ -297,6 +299,32 @@ func TestAnnotations(t *testing.T) {
gpuQuantity := resource.MustParse("1")
maxPodsQuantity := resource.MustParse("42")
expectedTaints := []v1.Taint{{Key: "key1", Effect: v1.TaintEffectNoSchedule, Value: "value1"}, {Key: "key2", Effect: v1.TaintEffectNoExecute, Value: "value2"}}
testNodeName := "test-node"
draDriver := "test-driver"
expectedResourceSlice := &resourceapi.ResourceSlice{
ObjectMeta: metav1.ObjectMeta{
Name: testNodeName + "-" + draDriver,
},
Spec: resourceapi.ResourceSliceSpec{
Driver: draDriver,
NodeName: testNodeName,
Pool: resourceapi.ResourcePool{
Name: testNodeName,
},
Devices: []resourceapi.Device{
{
Name: "gpu-0",
Basic: &resourceapi.BasicDevice{
Attributes: map[resourceapi.QualifiedName]resourceapi.DeviceAttribute{
"type": {
StringValue: ptr.To(GpuDeviceType),
},
},
},
},
},
},
}
annotations := map[string]string{
cpuKey: cpuQuantity.String(),
memoryKey: memQuantity.String(),
@ -305,6 +333,7 @@ func TestAnnotations(t *testing.T) {
maxPodsKey: maxPodsQuantity.String(),
taintsKey: "key1=value1:NoSchedule,key2=value2:NoExecute",
labelsKey: "key3=value3,key4=value4,key5=value5",
draDriverKey: draDriver,
}
test := func(t *testing.T, testConfig *testConfig, testResource *unstructured.Unstructured) {
@ -346,6 +375,14 @@ func TestAnnotations(t *testing.T) {
t.Errorf("expected %v, got %v", maxPodsQuantity, maxPods)
}
if resourceSlices, err := sr.InstanceResourceSlices(testNodeName); err != nil {
t.Fatal(err)
} else {
for _, resourceslice := range resourceSlices {
assert.Equal(t, expectedResourceSlice, resourceslice)
}
}
taints := sr.Taints()
assert.Equal(t, expectedTaints, taints)


@ -40,6 +40,7 @@ const (
maxPodsKey = "capacity.cluster-autoscaler.kubernetes.io/maxPods"
taintsKey = "capacity.cluster-autoscaler.kubernetes.io/taints"
labelsKey = "capacity.cluster-autoscaler.kubernetes.io/labels"
draDriverKey = "capacity.cluster-autoscaler.kubernetes.io/dra-driver"
// UnknownArch is used if the Architecture is Unknown
UnknownArch SystemArchitecture = ""
// Amd64 is used if the Architecture is x86_64
@ -54,6 +55,8 @@ const (
DefaultArch = Amd64
// scaleUpFromZeroDefaultEnvVar is the name of the env var for the default architecture
scaleUpFromZeroDefaultArchEnvVar = "CAPI_SCALE_ZERO_DEFAULT_ARCH"
// GpuDeviceType is used if DRA device is GPU
GpuDeviceType = "gpu"
)
var (
@ -282,6 +285,13 @@ func parseMaxPodsCapacity(annotations map[string]string) (resource.Quantity, err
return parseIntKey(annotations, maxPodsKey)
}
func parseDRADriver(annotations map[string]string) string {
if val, found := annotations[draDriverKey]; found {
return val
}
return ""
}
func clusterNameFromResource(r *unstructured.Unstructured) string {
// Use Spec.ClusterName if defined (only available on v1alpha3+ types)
clusterName, found, err := unstructured.NestedString(r.Object, "spec", "clusterName")


@ -21,6 +21,7 @@ import (
"time"
apiv1 "k8s.io/api/core/v1"
podutils "k8s.io/autoscaler/cluster-autoscaler/utils/pod"
"k8s.io/autoscaler/cluster-autoscaler/utils/units"
)
@ -73,11 +74,8 @@ func getHours(startTime time.Time, endTime time.Time) float64 {
// PodPrice returns a theoretical minimum price of running a pod for a given
// period of time on a perfectly matching machine.
func (model *Price) PodPrice(pod *apiv1.Pod, startTime time.Time, endTime time.Time) (float64, error) {
price := 0.0
for _, container := range pod.Spec.Containers {
price += getBasePrice(container.Resources.Requests, startTime, endTime)
}
return price, nil
podRequests := podutils.PodRequests(pod)
return getBasePrice(podRequests, startTime, endTime), nil
}
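
The per-container loop is replaced with podutils.PodRequests. Assuming that helper applies the standard Kubernetes effective-request rule (the larger of the summed container requests and any single init container request), prices now also account for pods whose init containers dominate. A minimal, self-contained sketch of that rule, not the vendored helper itself:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// effectiveRequest sketches the usual effective-request rule for one
// resource: max(sum over regular containers, max over init containers).
func effectiveRequest(pod *corev1.Pod, name corev1.ResourceName) resource.Quantity {
	total := resource.Quantity{}
	for _, c := range pod.Spec.Containers {
		if q, ok := c.Resources.Requests[name]; ok {
			total.Add(q)
		}
	}
	for _, ic := range pod.Spec.InitContainers {
		if q, ok := ic.Resources.Requests[name]; ok && q.Cmp(total) > 0 {
			total = q
		}
	}
	return total
}

func main() {
	pod := &corev1.Pod{Spec: corev1.PodSpec{
		Containers: []corev1.Container{{Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("500m")},
		}}},
		InitContainers: []corev1.Container{{Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("2")},
		}}},
	}}
	// A plain per-container sum would report 500m; the effective request is 2.
	fmt.Println(effectiveRequest(pod, corev1.ResourceCPU).String())
}
```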
func getBasePrice(resources apiv1.ResourceList, startTime time.Time, endTime time.Time) float64 {


@ -1,13 +1,9 @@
approvers:
- jayantjain93
- maciekpytel
- towca
- x13n
- yaroslava-serdiuk
- BigDarkClown
reviewers:
- jayantjain93
- maciekpytel
- towca
- x13n
- yaroslava-serdiuk


@ -130,7 +130,7 @@ type AutoscalingGceClient interface {
FetchMigsWithName(zone string, filter *regexp.Regexp) ([]string, error)
FetchZones(region string) ([]string, error)
FetchAvailableCpuPlatforms() (map[string][]string, error)
FetchAvailableDiskTypes() (map[string][]string, error)
FetchAvailableDiskTypes(zone string) ([]string, error)
FetchReservations() ([]*gce.Reservation, error)
FetchReservationsInProject(projectId string) ([]*gce.Reservation, error)
FetchListManagedInstancesResults(migRef GceRef) (string, error)
@ -753,21 +753,13 @@ func (client *autoscalingGceClientV1) FetchAvailableCpuPlatforms() (map[string][
return availableCpuPlatforms, nil
}
func (client *autoscalingGceClientV1) FetchAvailableDiskTypes() (map[string][]string, error) {
availableDiskTypes := make(map[string][]string)
func (client *autoscalingGceClientV1) FetchAvailableDiskTypes(zone string) ([]string, error) {
availableDiskTypes := []string{}
req := client.gceService.DiskTypes.AggregatedList(client.projectId)
if err := req.Pages(context.TODO(), func(page *gce.DiskTypeAggregatedList) error {
for _, diskTypesScopedList := range page.Items {
for _, diskType := range diskTypesScopedList.DiskTypes {
// skip data for regions
if diskType.Zone == "" {
continue
}
// convert URL of the zone, into the short name, e.g. us-central1-a
zone := path.Base(diskType.Zone)
availableDiskTypes[zone] = append(availableDiskTypes[zone], diskType.Name)
}
req := client.gceService.DiskTypes.List(client.projectId, zone)
if err := req.Pages(context.TODO(), func(page *gce.DiskTypeList) error {
for _, diskType := range page.Items {
availableDiskTypes = append(availableDiskTypes, diskType.Name)
}
return nil
}); err != nil {
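
Callers now pass a zone and get a flat list of names, as the updated test below shows. A hedged call sketch, assuming it lives in the gce package next to the interface:

```go
// Sketch: fetch and print the disk types offered in a single zone.
func printZoneDiskTypes(client AutoscalingGceClient, zone string) error {
	diskTypes, err := client.FetchAvailableDiskTypes(zone)
	if err != nil {
		return err
	}
	for _, diskType := range diskTypes {
		fmt.Println(diskType) // e.g. "pd-balanced"
	}
	return nil
}
```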


@ -653,18 +653,13 @@ func TestFetchAvailableDiskTypes(t *testing.T) {
defer server.Close()
g := newTestAutoscalingGceClient(t, "project-id", server.URL, "")
// ref: https://cloud.google.com/compute/docs/reference/rest/v1/diskTypes/aggregatedList
getDiskTypesAggregatedListOKResponse, _ := os.ReadFile("fixtures/diskTypes_aggregatedList.json")
server.On("handle", "/projects/project-id/aggregated/diskTypes").Return(string(getDiskTypesAggregatedListOKResponse)).Times(1)
// ref: https://cloud.google.com/compute/docs/reference/rest/v1/diskTypes/list
getDiskTypesListOKResponse, _ := os.ReadFile("fixtures/diskTypes_list.json")
server.On("handle", "/projects/project-id/zones/us-central1-b/diskTypes").Return(string(getDiskTypesListOKResponse)).Times(1)
t.Run("correctly parse a response", func(t *testing.T) {
want := map[string][]string{
// "us-central1" region should be skipped
"us-central1-a": {"local-ssd", "pd-balanced", "pd-ssd", "pd-standard"},
"us-central1-b": {"hyperdisk-balanced", "hyperdisk-extreme", "hyperdisk-throughput", "local-ssd", "pd-balanced", "pd-extreme", "pd-ssd", "pd-standard"},
}
got, err := g.FetchAvailableDiskTypes()
want := []string{"hyperdisk-balanced", "hyperdisk-extreme", "hyperdisk-throughput", "local-ssd", "pd-balanced", "pd-extreme", "pd-ssd", "pd-standard"}
got, err := g.FetchAvailableDiskTypes("us-central1-b")
assert.NoError(t, err)
if diff := cmp.Diff(want, got, cmpopts.EquateErrors()); diff != "" {
@ -860,7 +855,7 @@ func TestAutoscalingClientTimeouts(t *testing.T) {
},
"FetchAvailableDiskTypes_HttpClientTimeout": {
clientFunc: func(client *autoscalingGceClientV1) error {
_, err := client.FetchAvailableDiskTypes()
_, err := client.FetchAvailableDiskTypes("")
return err
},
httpTimeout: instantTimeout,


@ -432,6 +432,25 @@ func (gc *GceCache) InvalidateAllMigInstanceTemplates() {
gc.instanceTemplatesCache = map[GceRef]*gce.InstanceTemplate{}
}
// DropInstanceTemplatesForMissingMigs clears the instance template
// cache entries for MIGs which are no longer present in the cluster
func (gc *GceCache) DropInstanceTemplatesForMissingMigs(currentMigs []Mig) {
gc.cacheMutex.Lock()
defer gc.cacheMutex.Unlock()
requiredKeys := make(map[GceRef]struct{}, len(currentMigs))
for _, mig := range currentMigs {
requiredKeys[mig.GceRef()] = struct{}{}
}
klog.V(5).Infof("Instance template cache partially invalidated")
for key := range gc.instanceTemplatesCache {
if _, exists := requiredKeys[key]; !exists {
delete(gc.instanceTemplatesCache, key)
}
}
}
// GetMigKubeEnv returns the cached KubeEnv for a mig GceRef
func (gc *GceCache) GetMigKubeEnv(ref GceRef) (KubeEnv, bool) {
gc.cacheMutex.Lock()


@ -1,162 +0,0 @@
{
"kind": "compute#diskTypeAggregatedList",
"id": "projects/project-id/aggregated/diskTypes",
"items": {
"regions/us-central1": {
"diskTypes": [
{
"kind": "compute#diskType",
"id": "30007",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-balanced",
"description": "Balanced Persistent Disk",
"validDiskSize": "10GB-65536GB",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-central1/diskTypes/pd-balanced",
"defaultDiskSizeGb": "100",
"region": "https://www.googleapis.com/compute/v1/projects/project-id/regions/us-central1"
}
]
},
"zones/us-central1-a": {
"diskTypes": [
{
"kind": "compute#diskType",
"id": "30003",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "local-ssd",
"description": "Local SSD",
"validDiskSize": "375GB-375GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-a",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-a/diskTypes/local-ssd",
"defaultDiskSizeGb": "375"
},
{
"kind": "compute#diskType",
"id": "30007",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-balanced",
"description": "Balanced Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-a",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-a/diskTypes/pd-balanced",
"defaultDiskSizeGb": "100"
},
{
"kind": "compute#diskType",
"id": "30002",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-ssd",
"description": "SSD Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-a",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-a/diskTypes/pd-ssd",
"defaultDiskSizeGb": "100"
},
{
"kind": "compute#diskType",
"id": "30001",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-standard",
"description": "Standard Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-a",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-a/diskTypes/pd-standard",
"defaultDiskSizeGb": "500"
}
]
},
"zones/us-central1-b": {
"diskTypes": [
{
"kind": "compute#diskType",
"id": "30014",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "hyperdisk-balanced",
"description": "Hyperdisk Balanced Persistent Disk",
"validDiskSize": "4GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/hyperdisk-balanced",
"defaultDiskSizeGb": "100"
},
{
"kind": "compute#diskType",
"id": "30012",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "hyperdisk-extreme",
"description": "Hyperdisk Extreme Persistent Disk",
"validDiskSize": "64GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/hyperdisk-extreme",
"defaultDiskSizeGb": "1000"
},
{
"kind": "compute#diskType",
"id": "30013",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "hyperdisk-throughput",
"description": "Hyperdisk Throughput Persistent Disk",
"validDiskSize": "2048GB-32768GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/hyperdisk-throughput",
"defaultDiskSizeGb": "2048"
},
{
"kind": "compute#diskType",
"id": "30003",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "local-ssd",
"description": "Local SSD",
"validDiskSize": "375GB-375GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/local-ssd",
"defaultDiskSizeGb": "375"
},
{
"kind": "compute#diskType",
"id": "30007",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-balanced",
"description": "Balanced Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/pd-balanced",
"defaultDiskSizeGb": "100"
},
{
"kind": "compute#diskType",
"id": "30008",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-extreme",
"description": "Extreme Persistent Disk",
"validDiskSize": "500GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/pd-extreme",
"defaultDiskSizeGb": "1000"
},
{
"kind": "compute#diskType",
"id": "30002",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-ssd",
"description": "SSD Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/pd-ssd",
"defaultDiskSizeGb": "100"
},
{
"kind": "compute#diskType",
"id": "30001",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-standard",
"description": "Standard Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/pd-standard",
"defaultDiskSizeGb": "500"
}
]
}
},
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/aggregated/diskTypes"
}


@ -0,0 +1,95 @@
{
"kind": "compute#diskTypeList",
"id": "projects/project-id/aggregated/diskTypes",
"items": [
{
"kind": "compute#diskType",
"id": "30014",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "hyperdisk-balanced",
"description": "Hyperdisk Balanced Persistent Disk",
"validDiskSize": "4GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/hyperdisk-balanced",
"defaultDiskSizeGb": "100"
},
{
"kind": "compute#diskType",
"id": "30012",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "hyperdisk-extreme",
"description": "Hyperdisk Extreme Persistent Disk",
"validDiskSize": "64GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/hyperdisk-extreme",
"defaultDiskSizeGb": "1000"
},
{
"kind": "compute#diskType",
"id": "30013",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "hyperdisk-throughput",
"description": "Hyperdisk Throughput Persistent Disk",
"validDiskSize": "2048GB-32768GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/hyperdisk-throughput",
"defaultDiskSizeGb": "2048"
},
{
"kind": "compute#diskType",
"id": "30003",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "local-ssd",
"description": "Local SSD",
"validDiskSize": "375GB-375GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/local-ssd",
"defaultDiskSizeGb": "375"
},
{
"kind": "compute#diskType",
"id": "30007",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-balanced",
"description": "Balanced Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/pd-balanced",
"defaultDiskSizeGb": "100"
},
{
"kind": "compute#diskType",
"id": "30008",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-extreme",
"description": "Extreme Persistent Disk",
"validDiskSize": "500GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/pd-extreme",
"defaultDiskSizeGb": "1000"
},
{
"kind": "compute#diskType",
"id": "30002",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-ssd",
"description": "SSD Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/pd-ssd",
"defaultDiskSizeGb": "100"
},
{
"kind": "compute#diskType",
"id": "30001",
"creationTimestamp": "1969-12-31T16:00:00.000-08:00",
"name": "pd-standard",
"description": "Standard Persistent Disk",
"validDiskSize": "10GB-65536GB",
"zone": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b",
"selfLink": "https://www.googleapis.com/compute/v1/projects/project-id/zones/us-central1-b/diskTypes/pd-standard",
"defaultDiskSizeGb": "500"
}
],
"selfLink": "https://www.googleapis.com/compute/v1/projects/project/zones/us-central1-b/diskTypes"
}


@ -306,7 +306,14 @@ func (m *gceManagerImpl) Refresh() error {
if m.lastRefresh.Add(refreshInterval).After(time.Now()) {
return nil
}
return m.forceRefresh()
if err := m.forceRefresh(); err != nil {
return err
}
migs := m.migLister.GetMigs()
m.cache.DropInstanceTemplatesForMissingMigs(migs)
return nil
}
func (m *gceManagerImpl) CreateInstances(mig Mig, delta int64) error {


@ -342,13 +342,14 @@ var (
"t2d-standard-32": 1.3519,
"t2d-standard-48": 2.0278,
"t2d-standard-60": 2.5348,
"z3-highmem-176": 21.980576,
"z3-highmem-8": 1.145745,
"z3-highmem-14": 2.085698,
"z3-highmem-22": 3.231443,
"z3-highmem-30": 4.377188,
"z3-highmem-36": 5.317141,
"z3-highmem-44": 6.462886,
"z3-highmem-88": 12.925772,
"z3-highmem-176": 21.980576,
"z3-highmem-8-highlssd": 1.145745,
"z3-highmem-16-highlssd": 2.291489,
"z3-highmem-22-highlssd": 3.231443,
@ -537,24 +538,25 @@ var (
"t2d-standard-32": 0.3271,
"t2d-standard-48": 0.4907,
"t2d-standard-60": 0.6134,
"z3-highmem-176": 7.568291,
"z3-highmem-8": 0.402664,
"z3-highmem-14": 0.736921,
"z3-highmem-22": 1.139585,
"z3-highmem-30": 1.542249,
"z3-highmem-36": 1.876505,
"z3-highmem-44": 2.279170,
"z3-highmem-8-highlssd": 0.402664,
"z3-highmem-16-highlssd": 0.805329,
"z3-highmem-22-highlssd": 1.139585,
"z3-highmem-32-highlssd": 1.610657,
"z3-highmem-44-highlssd": 2.279170,
"z3-highmem-88-highlssd": 4.558339,
"z3-highmem-14-standardlssd": 0.607888,
"z3-highmem-22-standardlssd": 1.010553,
"z3-highmem-44-standardlssd": 1.892073,
"z3-highmem-88-standardlssd": 3.784146,
"z3-highmem-176-standardlssd": 7.568291,
"z3-highmem-8": 0.290841,
"z3-highmem-14": 0.532258,
"z3-highmem-22": 0.823099,
"z3-highmem-30": 1.113941,
"z3-highmem-36": 1.355358,
"z3-highmem-44": 1.646199,
"z3-highmem-88": 3.292398,
"z3-highmem-176": 5.467054,
"z3-highmem-8-highlssd": 0.290841,
"z3-highmem-16-highlssd": 0.581682,
"z3-highmem-22-highlssd": 0.823099,
"z3-highmem-32-highlssd": 1.163365,
"z3-highmem-44-highlssd": 1.646199,
"z3-highmem-88-highlssd": 3.292398,
"z3-highmem-14-standardlssd": 0.439113,
"z3-highmem-22-standardlssd": 0.729954,
"z3-highmem-44-standardlssd": 1.366763,
"z3-highmem-88-standardlssd": 2.733527,
"z3-highmem-176-standardlssd": 5.467054,
}
gpuPrices = map[string]float64{
"nvidia-tesla-t4": 0.35,


@ -24,6 +24,7 @@ import (
apiv1 "k8s.io/api/core/v1"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/gce/localssdsize"
"k8s.io/autoscaler/cluster-autoscaler/utils/gpu"
podutils "k8s.io/autoscaler/cluster-autoscaler/utils/pod"
"k8s.io/autoscaler/cluster-autoscaler/utils/units"
klog "k8s.io/klog/v2"
@ -155,11 +156,8 @@ func (model *GcePriceModel) getPreemptibleDiscount(node *apiv1.Node) float64 {
// PodPrice returns a theoretical minimum price of running a pod for a given
// period of time on a perfectly matching machine.
func (model *GcePriceModel) PodPrice(pod *apiv1.Pod, startTime time.Time, endTime time.Time) (float64, error) {
price := 0.0
for _, container := range pod.Spec.Containers {
price += model.getBasePrice(container.Resources.Requests, "", startTime, endTime)
price += model.getAdditionalPrice(container.Resources.Requests, startTime, endTime)
}
podRequests := podutils.PodRequests(pod)
price := model.getBasePrice(podRequests, "", startTime, endTime) + model.getAdditionalPrice(podRequests, startTime, endTime)
return price, nil
}


@ -172,7 +172,7 @@ func (client *mockAutoscalingGceClient) FetchAvailableCpuPlatforms() (map[string
return nil, nil
}
func (client *mockAutoscalingGceClient) FetchAvailableDiskTypes() (map[string][]string, error) {
func (client *mockAutoscalingGceClient) FetchAvailableDiskTypes(_ string) ([]string, error) {
return nil, nil
}


@ -41,6 +41,12 @@ The cluster autoscaler for Hetzner Cloud scales worker nodes.
}
```
`HCLOUD_CLUSTER_CONFIG_FILE` Can be used as an alternative to `HCLOUD_CLUSTER_CONFIG`. This is the path to a file
containing the JSON structure described above. The file will be read and its contents will be used as the configuration.
Can be useful when you have many different node pools and run into issues with the env var becoming too long.
**NOTE**: In contrast to `HCLOUD_CLUSTER_CONFIG`, this file is not base64 encoded.
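
For illustration, a hypothetical Go helper (not the provider's actual code) showing how the two variables relate:

```go
// Sketch: prefer the plain-JSON file when set; otherwise fall back to the
// base64-encoded HCLOUD_CLUSTER_CONFIG env var.
func loadClusterConfig() ([]byte, error) {
	if path := os.Getenv("HCLOUD_CLUSTER_CONFIG_FILE"); path != "" {
		return os.ReadFile(path) // file contents are not base64 encoded
	}
	return base64.StdEncoding.DecodeString(os.Getenv("HCLOUD_CLUSTER_CONFIG"))
}
```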
`HCLOUD_NETWORK` Default empty. The ID or name of the network that is used in the cluster, see https://docs.hetzner.cloud/#networks


@ -6,6 +6,7 @@ import (
"net/url"
"time"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -54,9 +55,21 @@ const (
type ActionError struct {
Code string
Message string
action *Action
}
// Action returns the [Action] that triggered the error if available.
func (e ActionError) Action() *Action {
return e.action
}
func (e ActionError) Error() string {
action := e.Action()
if action != nil {
// For easier debugging, the error string contains the Action ID.
return fmt.Sprintf("%s (%s, %d)", e.Message, e.Code, action.ID)
}
return fmt.Sprintf("%s (%s)", e.Message, e.Code)
}
@ -65,6 +78,7 @@ func (a *Action) Error() error {
return ActionError{
Code: a.ErrorCode,
Message: a.ErrorMessage,
action: a,
}
}
return nil
@ -111,11 +125,15 @@ func (c *ActionClient) List(ctx context.Context, opts ActionListOpts) ([]*Action
}
// All returns all actions.
//
// Deprecated: It is required to pass in a list of IDs since 30 January 2025. Please use [ActionClient.AllWithOpts] instead.
func (c *ActionClient) All(ctx context.Context) ([]*Action, error) {
return c.action.All(ctx, ActionListOpts{ListOpts: ListOpts{PerPage: 50}})
}
// AllWithOpts returns all actions for the given options.
//
// It is required to set [ActionListOpts.ID]. Any other fields set in the opts are ignored.
func (c *ActionClient) AllWithOpts(ctx context.Context, opts ActionListOpts) ([]*Action, error) {
return c.action.All(ctx, opts)
}
@ -136,20 +154,19 @@ func (c *ResourceActionClient) getBaseURL() string {
// GetByID retrieves an action by its ID. If the action does not exist, nil is returned.
func (c *ResourceActionClient) GetByID(ctx context.Context, id int64) (*Action, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("%s/actions/%d", c.getBaseURL(), id), nil)
if err != nil {
return nil, nil, err
}
opPath := c.getBaseURL() + "/actions/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.ActionGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.ActionGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, nil, err
return nil, resp, err
}
return ActionFromSchema(body.Action), resp, nil
return ActionFromSchema(respBody.Action), resp, nil
}
// List returns a list of actions for a specific page.
@ -157,44 +174,23 @@ func (c *ResourceActionClient) GetByID(ctx context.Context, id int64) (*Action,
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *ResourceActionClient) List(ctx context.Context, opts ActionListOpts) ([]*Action, *Response, error) {
req, err := c.client.NewRequest(
ctx,
"GET",
fmt.Sprintf("%s/actions?%s", c.getBaseURL(), opts.values().Encode()),
nil,
)
opPath := c.getBaseURL() + "/actions?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.ActionListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.ActionListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
actions := make([]*Action, 0, len(body.Actions))
for _, i := range body.Actions {
actions = append(actions, ActionFromSchema(i))
}
return actions, resp, nil
return allFromSchemaFunc(respBody.Actions, ActionFromSchema), resp, nil
}
// All returns all actions for the given options.
func (c *ResourceActionClient) All(ctx context.Context, opts ActionListOpts) ([]*Action, error) {
allActions := []*Action{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*Action, *Response, error) {
opts.Page = page
actions, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allActions = append(allActions, actions...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allActions, nil
}


@ -16,11 +16,14 @@ type ActionWaiter interface {
var _ ActionWaiter = (*ActionClient)(nil)
// WaitForFunc waits until all actions are completed by polling the API at the interval
// defined by [WithPollBackoffFunc]. An action is considered as complete when its status is
// defined by [WithPollOpts]. An action is considered as complete when its status is
// either [ActionStatusSuccess] or [ActionStatusError].
//
// The handleUpdate callback is called every time an action is updated.
func (c *ActionClient) WaitForFunc(ctx context.Context, handleUpdate func(update *Action) error, actions ...*Action) error {
// Filter out nil actions
actions = slices.DeleteFunc(actions, func(a *Action) bool { return a == nil })
running := make(map[int64]struct{}, len(actions))
for _, action := range actions {
if action.Status == ActionStatusRunning {
@ -48,18 +51,19 @@ func (c *ActionClient) WaitForFunc(ctx context.Context, handleUpdate func(update
retries++
}
opts := ActionListOpts{
Sort: []string{"status", "id"},
ID: make([]int64, 0, len(running)),
}
for actionID := range running {
opts.ID = append(opts.ID, actionID)
}
slices.Sort(opts.ID)
updates := make([]*Action, 0, len(running))
for runningIDsChunk := range slices.Chunk(slices.Sorted(maps.Keys(running)), 25) {
opts := ActionListOpts{
Sort: []string{"status", "id"},
ID: runningIDsChunk,
}
updates, err := c.AllWithOpts(ctx, opts)
if err != nil {
return err
updatesChunk, err := c.AllWithOpts(ctx, opts)
if err != nil {
return err
}
updates = append(updates, updatesChunk...)
}
if len(updates) != len(running) {
@ -95,7 +99,7 @@ func (c *ActionClient) WaitForFunc(ctx context.Context, handleUpdate func(update
}
// WaitFor waits until all actions succeed by polling the API at the interval defined by
// [WithPollBackoffFunc]. An action is considered as succeeded when its status is either
// [WithPollOpts]. An action is considered as succeeded when its status is
// [ActionStatusSuccess].
//
// If a single action fails, the function will stop waiting and the error set in the
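
A hedged usage fragment for the waiter API touched here, assuming an initialized client, a context ctx, and a slice of actions; the callback body is illustrative:

```go
err := client.Action.WaitForFunc(ctx, func(update *hcloud.Action) error {
	fmt.Printf("action %d is %s\n", update.ID, update.Status)
	return nil
}, actions...)
if err != nil {
	// Thanks to the ActionError change above, the failed action can be
	// recovered from the error for debugging.
	var actionErr hcloud.ActionError
	if errors.As(err, &actionErr) && actionErr.Action() != nil {
		fmt.Println("failed action:", actionErr.Action().ID)
	}
}
```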


@ -21,7 +21,7 @@ import (
// timeout, use the [context.Context]. Once the method has stopped watching,
// both returned channels are closed.
//
// WatchOverallProgress uses the [WithPollBackoffFunc] of the [Client] to wait
// WatchOverallProgress uses the [WithPollOpts] of the [Client] to wait
// until sending the next request.
//
// Deprecated: WatchOverallProgress is deprecated, use [WaitForFunc] instead.
@ -86,7 +86,7 @@ func (c *ActionClient) WatchOverallProgress(ctx context.Context, actions []*Acti
// timeout, use the [context.Context]. Once the method has stopped watching,
// both returned channels are closed.
//
// WatchProgress uses the [WithPollBackoffFunc] of the [Client] to wait until
// WatchProgress uses the [WithPollOpts] of the [Client] to wait until
// sending the next request.
//
// Deprecated: WatchProgress is deprecated, use [WaitForFunc] instead.


@ -1,15 +1,12 @@
package hcloud
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"net/url"
"strconv"
"time"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -98,41 +95,32 @@ type CertificateClient struct {
// GetByID retrieves a Certificate by its ID. If the Certificate does not exist, nil is returned.
func (c *CertificateClient) GetByID(ctx context.Context, id int64) (*Certificate, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/certificates/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/certificates/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.CertificateGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.CertificateGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, nil, err
return nil, resp, err
}
return CertificateFromSchema(body.Certificate), resp, nil
return CertificateFromSchema(respBody.Certificate), resp, nil
}
// GetByName retrieves a Certificate by its name. If the Certificate does not exist, nil is returned.
func (c *CertificateClient) GetByName(ctx context.Context, name string) (*Certificate, *Response, error) {
if name == "" {
return nil, nil, nil
}
Certificate, response, err := c.List(ctx, CertificateListOpts{Name: name})
if len(Certificate) == 0 {
return nil, response, err
}
return Certificate[0], response, err
return firstByName(name, func() ([]*Certificate, *Response, error) {
return c.List(ctx, CertificateListOpts{Name: name})
})
}
// Get retrieves a Certificate by its ID if the input can be parsed as an integer, otherwise it
// retrieves a Certificate by its name. If the Certificate does not exist, nil is returned.
func (c *CertificateClient) Get(ctx context.Context, idOrName string) (*Certificate, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
return c.GetByID(ctx, id)
}
return c.GetByName(ctx, idOrName)
return getByIDOrName(ctx, c.GetByID, c.GetByName, idOrName)
}
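
The repetitive request plumbing above is replaced by small generic helpers (getRequest, postRequest, putRequest, deleteRequestNoResult, firstByName, getByIDOrName, iterPages, allFromSchemaFunc) whose definitions are outside this diff. The call sites imply roughly the following shape for the GET case; a sketch inferred from usage, not the actual implementation:

```go
// Sketch: issue a GET and decode the JSON response into T.
func getRequest[T any](ctx context.Context, c *Client, path string) (T, *Response, error) {
	var body T
	req, err := c.NewRequest(ctx, "GET", path, nil)
	if err != nil {
		return body, nil, err
	}
	resp, err := c.Do(req, &body)
	return body, resp, err
}
```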
// CertificateListOpts specifies options for listing Certificates.
@ -158,22 +146,17 @@ func (l CertificateListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *CertificateClient) List(ctx context.Context, opts CertificateListOpts) ([]*Certificate, *Response, error) {
path := "/certificates?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/certificates?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.CertificateListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.CertificateListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
Certificates := make([]*Certificate, 0, len(body.Certificates))
for _, s := range body.Certificates {
Certificates = append(Certificates, CertificateFromSchema(s))
}
return Certificates, resp, nil
return allFromSchemaFunc(respBody.Certificates, CertificateFromSchema), resp, nil
}
// All returns all Certificates.
@ -183,22 +166,10 @@ func (c *CertificateClient) All(ctx context.Context) ([]*Certificate, error) {
// AllWithOpts returns all Certificates for the given options.
func (c *CertificateClient) AllWithOpts(ctx context.Context, opts CertificateListOpts) ([]*Certificate, error) {
allCertificates := []*Certificate{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*Certificate, *Response, error) {
opts.Page = page
Certificates, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allCertificates = append(allCertificates, Certificates...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allCertificates, nil
}
// CertificateCreateOpts specifies options for creating a new Certificate.
@ -214,7 +185,7 @@ type CertificateCreateOpts struct {
// Validate checks if options are valid.
func (o CertificateCreateOpts) Validate() error {
if o.Name == "" {
return errors.New("missing name")
return missingField(o, "Name")
}
switch o.Type {
case "", CertificateTypeUploaded:
@ -222,23 +193,23 @@ func (o CertificateCreateOpts) Validate() error {
case CertificateTypeManaged:
return o.validateManaged()
default:
return fmt.Errorf("invalid type: %s", o.Type)
return invalidFieldValue(o, "Type", o.Type)
}
}
func (o CertificateCreateOpts) validateManaged() error {
if len(o.DomainNames) == 0 {
return errors.New("no domain names")
return missingField(o, "DomainNames")
}
return nil
}
func (o CertificateCreateOpts) validateUploaded() error {
if o.Certificate == "" {
return errors.New("missing certificate")
return missingField(o, "Certificate")
}
if o.PrivateKey == "" {
return errors.New("missing private key")
return missingField(o, "PrivateKey")
}
return nil
}
@ -249,7 +220,7 @@ func (o CertificateCreateOpts) validateUploaded() error {
// CreateCertificate to create such certificates.
func (c *CertificateClient) Create(ctx context.Context, opts CertificateCreateOpts) (*Certificate, *Response, error) {
if !(opts.Type == "" || opts.Type == CertificateTypeUploaded) {
return nil, nil, fmt.Errorf("invalid certificate type: %s", opts.Type)
return nil, nil, invalidFieldValue(opts, "Type", opts.Type)
}
result, resp, err := c.CreateCertificate(ctx, opts)
if err != nil {
@ -262,16 +233,20 @@ func (c *CertificateClient) Create(ctx context.Context, opts CertificateCreateOp
func (c *CertificateClient) CreateCertificate(
ctx context.Context, opts CertificateCreateOpts,
) (CertificateCreateResult, *Response, error) {
var (
action *Action
reqBody schema.CertificateCreateRequest
)
const opPath = "/certificates"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := opPath
result := CertificateCreateResult{}
if err := opts.Validate(); err != nil {
return CertificateCreateResult{}, nil, err
return result, nil, err
}
reqBody.Name = opts.Name
reqBody := schema.CertificateCreateRequest{
Name: opts.Name,
}
switch opts.Type {
case "", CertificateTypeUploaded:
@ -282,32 +257,24 @@ func (c *CertificateClient) CreateCertificate(
reqBody.Type = string(CertificateTypeManaged)
reqBody.DomainNames = opts.DomainNames
default:
return CertificateCreateResult{}, nil, fmt.Errorf("invalid certificate type: %v", opts.Type)
return result, nil, invalidFieldValue(opts, "Type", opts.Type)
}
if opts.Labels != nil {
reqBody.Labels = &opts.Labels
}
reqBodyData, err := json.Marshal(reqBody)
respBody, resp, err := postRequest[schema.CertificateCreateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return CertificateCreateResult{}, nil, err
}
req, err := c.client.NewRequest(ctx, "POST", "/certificates", bytes.NewReader(reqBodyData))
if err != nil {
return CertificateCreateResult{}, nil, err
return result, resp, err
}
respBody := schema.CertificateCreateResponse{}
resp, err := c.client.Do(req, &respBody)
if err != nil {
return CertificateCreateResult{}, resp, err
}
cert := CertificateFromSchema(respBody.Certificate)
result.Certificate = CertificateFromSchema(respBody.Certificate)
if respBody.Action != nil {
action = ActionFromSchema(*respBody.Action)
result.Action = ActionFromSchema(*respBody.Action)
}
return CertificateCreateResult{Certificate: cert, Action: action}, resp, nil
return result, resp, nil
}
// CertificateUpdateOpts specifies options for updating a Certificate.
@ -318,6 +285,11 @@ type CertificateUpdateOpts struct {
// Update updates a Certificate.
func (c *CertificateClient) Update(ctx context.Context, certificate *Certificate, opts CertificateUpdateOpts) (*Certificate, *Response, error) {
const opPath = "/certificates/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, certificate.ID)
reqBody := schema.CertificateUpdateRequest{}
if opts.Name != "" {
reqBody.Name = &opts.Name
@ -325,46 +297,36 @@ func (c *CertificateClient) Update(ctx context.Context, certificate *Certificate
if opts.Labels != nil {
reqBody.Labels = &opts.Labels
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/certificates/%d", certificate.ID)
req, err := c.client.NewRequest(ctx, "PUT", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.CertificateUpdateResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := putRequest[schema.CertificateUpdateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return CertificateFromSchema(respBody.Certificate), resp, nil
}
// Delete deletes a certificate.
func (c *CertificateClient) Delete(ctx context.Context, certificate *Certificate) (*Response, error) {
req, err := c.client.NewRequest(ctx, "DELETE", fmt.Sprintf("/certificates/%d", certificate.ID), nil)
if err != nil {
return nil, err
}
return c.client.Do(req, nil)
const opPath = "/certificates/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, certificate.ID)
return deleteRequestNoResult(ctx, c.client, reqPath)
}
// RetryIssuance retries the issuance of a failed managed certificate.
func (c *CertificateClient) RetryIssuance(ctx context.Context, certificate *Certificate) (*Action, *Response, error) {
var respBody schema.CertificateIssuanceRetryResponse
const opPath = "/certificates/%d/actions/retry"
ctx = ctxutil.SetOpPath(ctx, opPath)
req, err := c.client.NewRequest(ctx, "POST", fmt.Sprintf("/certificates/%d/actions/retry", certificate.ID), nil)
reqPath := fmt.Sprintf(opPath, certificate.ID)
respBody, resp, err := postRequest[schema.CertificateIssuanceRetryResponse](ctx, c.client, reqPath, nil)
if err != nil {
return nil, nil, err
return nil, resp, err
}
resp, err := c.client.Do(req, &respBody)
if err != nil {
return nil, nil, err
}
action := ActionFromSchema(respBody.Action)
return action, resp, nil
return ActionFromSchema(respBody.Action), resp, nil
}


@ -3,13 +3,12 @@ package hcloud
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"io"
"math"
"math/rand"
"net/http"
"net/http/httputil"
"net/url"
"strconv"
"strings"
@ -19,7 +18,6 @@ import (
"golang.org/x/net/http/httpguts"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/internal/instrumentation"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
// Endpoint is the base URL of the API.
@ -43,13 +41,43 @@ func ConstantBackoff(d time.Duration) BackoffFunc {
}
// ExponentialBackoff returns a BackoffFunc which implements an exponential
// backoff.
// It uses the formula:
// backoff, truncated to 60 seconds.
// See [ExponentialBackoffWithOpts] for more details.
func ExponentialBackoff(multiplier float64, base time.Duration) BackoffFunc {
return ExponentialBackoffWithOpts(ExponentialBackoffOpts{
Base: base,
Multiplier: multiplier,
Cap: time.Minute,
})
}
// ExponentialBackoffOpts defines the options used by [ExponentialBackoffWithOpts].
type ExponentialBackoffOpts struct {
Base time.Duration
Multiplier float64
Cap time.Duration
Jitter bool
}
// ExponentialBackoffWithOpts returns a BackoffFunc which implements an exponential
// backoff, truncated to a maximum, and an optional full jitter.
//
// b^retries * d
func ExponentialBackoff(b float64, d time.Duration) BackoffFunc {
// See https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
func ExponentialBackoffWithOpts(opts ExponentialBackoffOpts) BackoffFunc {
baseSeconds := opts.Base.Seconds()
capSeconds := opts.Cap.Seconds()
return func(retries int) time.Duration {
return time.Duration(math.Pow(b, float64(retries))) * d
// Exponential backoff
backoff := baseSeconds * math.Pow(opts.Multiplier, float64(retries))
// Cap backoff
backoff = math.Min(capSeconds, backoff)
// Add jitter
if opts.Jitter {
backoff = ((backoff - baseSeconds) * rand.Float64()) + baseSeconds // #nosec G404
}
return time.Duration(backoff * float64(time.Second))
}
}
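
Worked through with the defaults set below in NewClient (Base 1s, Multiplier 2, Cap 1m), retries 0 through 6 yield 1s, 2s, 4s, 8s, 16s, 32s, and then 60s once the cap kicks in; with Jitter enabled, each value is instead drawn uniformly from [Base, backoff]. A small sketch:

```go
backoff := ExponentialBackoffWithOpts(ExponentialBackoffOpts{
	Base:       time.Second,
	Multiplier: 2,
	Cap:        time.Minute,
	// Jitter left disabled so the progression is deterministic.
})
for retry := 0; retry <= 6; retry++ {
	fmt.Println(retry, backoff(retry)) // 1s, 2s, 4s, 8s, 16s, 32s, 1m0s
}
```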
@ -58,7 +86,8 @@ type Client struct {
endpoint string
token string
tokenValid bool
backoffFunc BackoffFunc
retryBackoffFunc BackoffFunc
retryMaxRetries int
pollBackoffFunc BackoffFunc
httpClient *http.Client
applicationName string
@ -66,6 +95,7 @@ type Client struct {
userAgent string
debugWriter io.Writer
instrumentationRegistry prometheus.Registerer
handler handler
Action ActionClient
Certificate CertificateClient
@ -110,30 +140,73 @@ func WithToken(token string) ClientOption {
// polling from the API.
//
// Deprecated: Setting the poll interval is deprecated, you can now configure
// [WithPollBackoffFunc] with a [ConstantBackoff] to get the same results. To
// [WithPollOpts] with a [ConstantBackoff] to get the same results. To
// migrate your code, replace your usage like this:
//
// // before
// hcloud.WithPollInterval(2 * time.Second)
// // now
// hcloud.WithPollBackoffFunc(hcloud.ConstantBackoff(2 * time.Second))
// hcloud.WithPollOpts(hcloud.PollOpts{
// BackoffFunc: hcloud.ConstantBackoff(2 * time.Second),
// })
func WithPollInterval(pollInterval time.Duration) ClientOption {
return WithPollBackoffFunc(ConstantBackoff(pollInterval))
return WithPollOpts(PollOpts{
BackoffFunc: ConstantBackoff(pollInterval),
})
}
// WithPollBackoffFunc configures a Client to use the specified backoff
// function when polling from the API.
//
// Deprecated: WithPollBackoffFunc is deprecated, use [WithPollOpts] instead.
func WithPollBackoffFunc(f BackoffFunc) ClientOption {
return WithPollOpts(PollOpts{
BackoffFunc: f,
})
}
// PollOpts defines the options used by [WithPollOpts].
type PollOpts struct {
BackoffFunc BackoffFunc
}
// WithPollOpts configures a Client to use the specified options when polling from the API.
//
// If [PollOpts.BackoffFunc] is nil, the existing backoff function will be preserved.
func WithPollOpts(opts PollOpts) ClientOption {
return func(client *Client) {
client.pollBackoffFunc = f
if opts.BackoffFunc != nil {
client.pollBackoffFunc = opts.BackoffFunc
}
}
}
// WithBackoffFunc configures a Client to use the specified backoff function.
// The backoff function is used for retrying HTTP requests.
//
// Deprecated: WithBackoffFunc is deprecated, use [WithRetryOpts] instead.
func WithBackoffFunc(f BackoffFunc) ClientOption {
return func(client *Client) {
client.backoffFunc = f
client.retryBackoffFunc = f
}
}
// RetryOpts defines the options used by [WithRetryOpts].
type RetryOpts struct {
BackoffFunc BackoffFunc
MaxRetries int
}
// WithRetryOpts configures a Client to use the specified options when retrying API
// requests.
//
// If [RetryOpts.BackoffFunc] is nil, the existing backoff function will be preserved.
func WithRetryOpts(opts RetryOpts) ClientOption {
return func(client *Client) {
if opts.BackoffFunc != nil {
client.retryBackoffFunc = opts.BackoffFunc
}
client.retryMaxRetries = opts.MaxRetries
}
}
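
Putting the new options together, a hedged construction sketch (the token is a placeholder):

```go
client := NewClient(
	WithToken("<api-token>"),
	WithRetryOpts(RetryOpts{
		BackoffFunc: ConstantBackoff(2 * time.Second),
		MaxRetries:  3,
	}),
	WithPollOpts(PollOpts{
		BackoffFunc: ConstantBackoff(500 * time.Millisecond),
	}),
)
_ = client
```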
@ -172,10 +245,18 @@ func WithInstrumentation(registry prometheus.Registerer) ClientOption {
// NewClient creates a new client.
func NewClient(options ...ClientOption) *Client {
client := &Client{
endpoint: Endpoint,
tokenValid: true,
httpClient: &http.Client{},
backoffFunc: ExponentialBackoff(2, 500*time.Millisecond),
endpoint: Endpoint,
tokenValid: true,
httpClient: &http.Client{},
retryBackoffFunc: ExponentialBackoffWithOpts(ExponentialBackoffOpts{
Base: time.Second,
Multiplier: 2,
Cap: time.Minute,
Jitter: true,
}),
retryMaxRetries: 5,
pollBackoffFunc: ConstantBackoff(500 * time.Millisecond),
}
@ -186,9 +267,11 @@ func NewClient(options ...ClientOption) *Client {
client.buildUserAgent()
if client.instrumentationRegistry != nil {
i := instrumentation.New("api", client.instrumentationRegistry)
client.httpClient.Transport = i.InstrumentedRoundTripper()
client.httpClient.Transport = i.InstrumentedRoundTripper(client.httpClient.Transport)
}
client.handler = assembleHandlerChain(client)
client.Action = ActionClient{action: &ResourceActionClient{client: client}}
client.Datacenter = DatacenterClient{client: client}
client.FloatingIP = FloatingIPClient{client: client, Action: &ResourceActionClient{client: client, resource: "floating_ips"}}
@ -238,97 +321,8 @@ func (c *Client) NewRequest(ctx context.Context, method, path string, body io.Re
// Do performs an HTTP request against the API.
// v can be nil, an io.Writer to write the response body to or a pointer to
// a struct to json.Unmarshal the response to.
func (c *Client) Do(r *http.Request, v interface{}) (*Response, error) {
var retries int
var body []byte
var err error
if r.ContentLength > 0 {
body, err = io.ReadAll(r.Body)
if err != nil {
r.Body.Close()
return nil, err
}
r.Body.Close()
}
for {
if r.ContentLength > 0 {
r.Body = io.NopCloser(bytes.NewReader(body))
}
if c.debugWriter != nil {
dumpReq, err := dumpRequest(r)
if err != nil {
return nil, err
}
fmt.Fprintf(c.debugWriter, "--- Request:\n%s\n\n", dumpReq)
}
resp, err := c.httpClient.Do(r)
if err != nil {
return nil, err
}
response := &Response{Response: resp}
body, err := io.ReadAll(resp.Body)
if err != nil {
resp.Body.Close()
return response, err
}
resp.Body.Close()
resp.Body = io.NopCloser(bytes.NewReader(body))
if c.debugWriter != nil {
dumpResp, err := httputil.DumpResponse(resp, true)
if err != nil {
return nil, err
}
fmt.Fprintf(c.debugWriter, "--- Response:\n%s\n\n", dumpResp)
}
if err = response.readMeta(body); err != nil {
return response, fmt.Errorf("hcloud: error reading response meta data: %s", err)
}
if response.StatusCode >= 400 && response.StatusCode <= 599 {
err = errorFromResponse(response, body)
if err == nil {
err = fmt.Errorf("hcloud: server responded with status code %d", resp.StatusCode)
} else if IsError(err, ErrorCodeConflict) {
c.backoff(retries)
retries++
continue
}
return response, err
}
if v != nil {
if w, ok := v.(io.Writer); ok {
_, err = io.Copy(w, bytes.NewReader(body))
} else {
err = json.Unmarshal(body, v)
}
}
return response, err
}
}
func (c *Client) backoff(retries int) {
time.Sleep(c.backoffFunc(retries))
}
func (c *Client) all(f func(int) (*Response, error)) error {
var (
page = 1
)
for {
resp, err := f(page)
if err != nil {
return err
}
if resp.Meta.Pagination == nil || resp.Meta.Pagination.NextPage == 0 {
return nil
}
page = resp.Meta.Pagination.NextPage
}
}
func (c *Client) Do(req *http.Request, v any) (*Response, error) {
return c.handler.Do(req, v)
}
func (c *Client) buildUserAgent() {
@ -342,43 +336,6 @@ func (c *Client) buildUserAgent() {
}
}
func dumpRequest(r *http.Request) ([]byte, error) {
// Duplicate the request, so we can redact the auth header
rDuplicate := r.Clone(context.Background())
rDuplicate.Header.Set("Authorization", "REDACTED")
// To get the request body we need to read it before the request was actually sent.
// See https://github.com/golang/go/issues/29792
dumpReq, err := httputil.DumpRequestOut(rDuplicate, true)
if err != nil {
return nil, err
}
// Set original request body to the duplicate created by DumpRequestOut. The request body is not duplicated
// by .Clone() and instead just referenced, so it would be completely read otherwise.
r.Body = rDuplicate.Body
return dumpReq, nil
}
func errorFromResponse(resp *Response, body []byte) error {
if !strings.HasPrefix(resp.Header.Get("Content-Type"), "application/json") {
return nil
}
var respBody schema.ErrorResponse
if err := json.Unmarshal(body, &respBody); err != nil {
return nil
}
if respBody.Error.Code == "" && respBody.Error.Message == "" {
return nil
}
hcErr := ErrorFromSchema(respBody.Error)
hcErr.response = resp
return hcErr
}
const (
headerCorrelationID = "X-Correlation-Id"
)
@ -387,35 +344,34 @@ const (
type Response struct {
*http.Response
Meta Meta
// body holds a copy of the http.Response body that must be used within the handler
// chain. The http.Response.Body is reserved for external users.
body []byte
}
func (r *Response) readMeta(body []byte) error {
if h := r.Header.Get("RateLimit-Limit"); h != "" {
r.Meta.Ratelimit.Limit, _ = strconv.Atoi(h)
}
if h := r.Header.Get("RateLimit-Remaining"); h != "" {
r.Meta.Ratelimit.Remaining, _ = strconv.Atoi(h)
}
if h := r.Header.Get("RateLimit-Reset"); h != "" {
if ts, err := strconv.ParseInt(h, 10, 64); err == nil {
r.Meta.Ratelimit.Reset = time.Unix(ts, 0)
}
// populateBody copies the original [http.Response] body into the internal [Response] body
// property, and restores the original [http.Response] body as if it were untouched.
func (r *Response) populateBody() error {
// Read full response body and save it for later use
body, err := io.ReadAll(r.Body)
r.Body.Close()
if err != nil {
return err
}
r.body = body
if strings.HasPrefix(r.Header.Get("Content-Type"), "application/json") {
var s schema.MetaResponse
if err := json.Unmarshal(body, &s); err != nil {
return err
}
if s.Meta.Pagination != nil {
p := PaginationFromSchema(*s.Meta.Pagination)
r.Meta.Pagination = &p
}
}
// Restore the body as if it was untouched, as it might be read by external users
r.Body = io.NopCloser(bytes.NewReader(body))
return nil
}
// hasJSONBody returns whether the response has a JSON body.
func (r *Response) hasJSONBody() bool {
return len(r.body) > 0 && strings.HasPrefix(r.Header.Get("Content-Type"), "application/json")
}
// internalCorrelationID returns the unique ID of the request as set by the API. This ID can help with support requests,
// as it allows the people working on it to identify this request in particular.
func (r *Response) internalCorrelationID() string {

View File

@ -0,0 +1,101 @@
package hcloud
import (
"bytes"
"context"
"encoding/json"
"io"
)
func getRequest[Schema any](ctx context.Context, client *Client, url string) (Schema, *Response, error) {
var respBody Schema
req, err := client.NewRequest(ctx, "GET", url, nil)
if err != nil {
return respBody, nil, err
}
resp, err := client.Do(req, &respBody)
if err != nil {
return respBody, resp, err
}
return respBody, resp, nil
}
func postRequest[Schema any](ctx context.Context, client *Client, url string, reqBody any) (Schema, *Response, error) {
var respBody Schema
var reqBodyReader io.Reader
if reqBody != nil {
reqBodyBytes, err := json.Marshal(reqBody)
if err != nil {
return respBody, nil, err
}
reqBodyReader = bytes.NewReader(reqBodyBytes)
}
req, err := client.NewRequest(ctx, "POST", url, reqBodyReader)
if err != nil {
return respBody, nil, err
}
resp, err := client.Do(req, &respBody)
if err != nil {
return respBody, resp, err
}
return respBody, resp, nil
}
func putRequest[Schema any](ctx context.Context, client *Client, url string, reqBody any) (Schema, *Response, error) {
var respBody Schema
var reqBodyReader io.Reader
if reqBody != nil {
reqBodyBytes, err := json.Marshal(reqBody)
if err != nil {
return respBody, nil, err
}
reqBodyReader = bytes.NewReader(reqBodyBytes)
}
req, err := client.NewRequest(ctx, "PUT", url, reqBodyReader)
if err != nil {
return respBody, nil, err
}
resp, err := client.Do(req, &respBody)
if err != nil {
return respBody, resp, err
}
return respBody, resp, nil
}
func deleteRequest[Schema any](ctx context.Context, client *Client, url string) (Schema, *Response, error) {
var respBody Schema
req, err := client.NewRequest(ctx, "DELETE", url, nil)
if err != nil {
return respBody, nil, err
}
resp, err := client.Do(req, &respBody)
if err != nil {
return respBody, resp, err
}
return respBody, resp, nil
}
func deleteRequestNoResult(ctx context.Context, client *Client, url string) (*Response, error) {
req, err := client.NewRequest(ctx, "DELETE", url, nil)
if err != nil {
return nil, err
}
return client.Do(req, nil)
}
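// Usage sketch (not part of the upstream diff): a typed client method built on
// these generic helpers. schema.ServerGetResponse and ServerFromSchema exist in
// this package; the path is illustrative.
//
//	respBody, resp, err := getRequest[schema.ServerGetResponse](ctx, c.client, "/servers/42")
//	if err != nil {
//		return nil, resp, err
//	}
//	return ServerFromSchema(respBody.Server), resp, nil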

View File

@ -0,0 +1,56 @@
package hcloud
import (
"context"
"net/http"
)
// handler is an interface representing a client request transaction. The handlers are
// meant to be chained, similarly to the [http.RoundTripper] interface.
//
// The handler chain is placed between the [Client] API operations and the
// [http.Client].
type handler interface {
Do(req *http.Request, v any) (resp *Response, err error)
}
// assembleHandlerChain assembles the chain of handlers used to make API requests.
//
// The order of the handlers is important.
func assembleHandlerChain(client *Client) handler {
// Start down the chain: sending the http request
h := newHTTPHandler(client.httpClient)
// Insert debug writer if enabled
if client.debugWriter != nil {
h = wrapDebugHandler(h, client.debugWriter)
}
// Read rate limit headers
h = wrapRateLimitHandler(h)
// Build error from response
h = wrapErrorHandler(h)
// Retry the request if the retry conditions are met
h = wrapRetryHandler(h, client.retryBackoffFunc, client.retryMaxRetries)
// Finally parse the response body into the provided schema
h = wrapParseHandler(h)
return h
}
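// Sketch of a custom handler (not part of the upstream diff), showing how the
// decorator pattern above composes; the log output is illustrative only.
//
//	type loggingHandler struct {
//		next handler
//	}
//
//	func (h *loggingHandler) Do(req *http.Request, v any) (*Response, error) {
//		fmt.Printf("%s %s\n", req.Method, req.URL.Path) // before delegating down the chain
//		return h.next.Do(req, v)
//	}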
// cloneRequest clones both the request and the request body.
func cloneRequest(req *http.Request, ctx context.Context) (cloned *http.Request, err error) { //revive:disable:context-as-argument
cloned = req.Clone(ctx)
if req.ContentLength > 0 {
cloned.Body, err = req.GetBody()
if err != nil {
return nil, err
}
}
return cloned, nil
}

View File

@ -0,0 +1,50 @@
package hcloud
import (
"context"
"fmt"
"io"
"net/http"
"net/http/httputil"
)
func wrapDebugHandler(wrapped handler, output io.Writer) handler {
return &debugHandler{wrapped, output}
}
type debugHandler struct {
handler handler
output io.Writer
}
func (h *debugHandler) Do(req *http.Request, v any) (resp *Response, err error) {
// Clone the request, so we can redact the auth header, read the body
// and use a new context.
cloned, err := cloneRequest(req, context.Background())
if err != nil {
return nil, err
}
cloned.Header.Set("Authorization", "REDACTED")
dumpReq, err := httputil.DumpRequestOut(cloned, true)
if err != nil {
return nil, err
}
fmt.Fprintf(h.output, "--- Request:\n%s\n\n", dumpReq)
resp, err = h.handler.Do(req, v)
if err != nil {
return resp, err
}
dumpResp, err := httputil.DumpResponse(resp.Response, true)
if err != nil {
return nil, err
}
fmt.Fprintf(h.output, "--- Response:\n%s\n\n", dumpResp)
return resp, err
}

View File

@ -0,0 +1,53 @@
package hcloud
import (
"encoding/json"
"errors"
"fmt"
"net/http"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
var ErrStatusCode = errors.New("server responded with status code")
func wrapErrorHandler(wrapped handler) handler {
return &errorHandler{wrapped}
}
type errorHandler struct {
handler handler
}
func (h *errorHandler) Do(req *http.Request, v any) (resp *Response, err error) {
resp, err = h.handler.Do(req, v)
if err != nil {
return resp, err
}
if resp.StatusCode >= 400 && resp.StatusCode <= 599 {
err = errorFromBody(resp)
if err == nil {
err = fmt.Errorf("hcloud: %w %d", ErrStatusCode, resp.StatusCode)
}
}
return resp, err
}
func errorFromBody(resp *Response) error {
if !resp.hasJSONBody() {
return nil
}
var s schema.ErrorResponse
if err := json.Unmarshal(resp.body, &s); err != nil {
return nil // nolint: nilerr
}
if s.Error.Code == "" && s.Error.Message == "" {
return nil
}
hcErr := ErrorFromSchema(s.Error)
hcErr.response = resp
return hcErr
}

View File

@ -0,0 +1,28 @@
package hcloud
import (
"net/http"
)
func newHTTPHandler(httpClient *http.Client) handler {
return &httpHandler{httpClient}
}
type httpHandler struct {
httpClient *http.Client
}
func (h *httpHandler) Do(req *http.Request, _ interface{}) (*Response, error) {
httpResponse, err := h.httpClient.Do(req) //nolint: bodyclose
resp := &Response{Response: httpResponse}
if err != nil {
return resp, err
}
err = resp.populateBody()
if err != nil {
return resp, err
}
return resp, err
}

View File

@ -0,0 +1,50 @@
package hcloud
import (
"bytes"
"encoding/json"
"fmt"
"io"
"net/http"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
func wrapParseHandler(wrapped handler) handler {
return &parseHandler{wrapped}
}
type parseHandler struct {
handler handler
}
func (h *parseHandler) Do(req *http.Request, v any) (resp *Response, err error) {
// respBody is not needed down the handler chain
resp, err = h.handler.Do(req, nil)
if err != nil {
return resp, err
}
if resp.hasJSONBody() {
// Parse the response meta
var s schema.MetaResponse
if err := json.Unmarshal(resp.body, &s); err != nil {
return resp, fmt.Errorf("hcloud: error reading response meta data: %w", err)
}
if s.Meta.Pagination != nil {
p := PaginationFromSchema(*s.Meta.Pagination)
resp.Meta.Pagination = &p
}
}
// Parse the response schema
if v != nil {
if w, ok := v.(io.Writer); ok {
_, err = io.Copy(w, bytes.NewReader(resp.body))
} else {
err = json.Unmarshal(resp.body, v)
}
}
return resp, err
}

View File

@ -0,0 +1,36 @@
package hcloud
import (
"net/http"
"strconv"
"time"
)
func wrapRateLimitHandler(wrapped handler) handler {
return &rateLimitHandler{wrapped}
}
type rateLimitHandler struct {
handler handler
}
func (h *rateLimitHandler) Do(req *http.Request, v any) (resp *Response, err error) {
resp, err = h.handler.Do(req, v)
// Ensure the embedded [*http.Response] is not nil, e.g. on canceled context
if resp != nil && resp.Response != nil && resp.Response.Header != nil {
if h := resp.Header.Get("RateLimit-Limit"); h != "" {
resp.Meta.Ratelimit.Limit, _ = strconv.Atoi(h)
}
if h := resp.Header.Get("RateLimit-Remaining"); h != "" {
resp.Meta.Ratelimit.Remaining, _ = strconv.Atoi(h)
}
if h := resp.Header.Get("RateLimit-Reset"); h != "" {
if ts, err := strconv.ParseInt(h, 10, 64); err == nil {
resp.Meta.Ratelimit.Reset = time.Unix(ts, 0)
}
}
}
return resp, err
}
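// After this handler runs, callers can read the parsed values from the response
// meta, e.g. (sketch, not part of the upstream diff):
//
//	resp, err := client.Do(req, nil)
//	if err == nil {
//		fmt.Printf("rate limit remaining: %d\n", resp.Meta.Ratelimit.Remaining)
//	}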

View File

@ -0,0 +1,84 @@
package hcloud
import (
"errors"
"net"
"net/http"
"time"
)
func wrapRetryHandler(wrapped handler, backoffFunc BackoffFunc, maxRetries int) handler {
return &retryHandler{wrapped, backoffFunc, maxRetries}
}
type retryHandler struct {
handler handler
backoffFunc BackoffFunc
maxRetries int
}
func (h *retryHandler) Do(req *http.Request, v any) (resp *Response, err error) {
retries := 0
ctx := req.Context()
for {
// Clone the request using the original context
cloned, err := cloneRequest(req, ctx)
if err != nil {
return nil, err
}
resp, err = h.handler.Do(cloned, v)
if err != nil {
// Beware the diversity of the errors:
// - request preparation
// - network connectivity
// - http status code (see [errorHandler])
if ctx.Err() != nil {
// early return if the context was canceled or timed out
return resp, err
}
if retries < h.maxRetries && retryPolicy(resp, err) {
select {
case <-ctx.Done():
return resp, err
case <-time.After(h.backoffFunc(retries)):
retries++
continue
}
}
}
return resp, err
}
}
func retryPolicy(resp *Response, err error) bool {
if err != nil {
var apiErr Error
var netErr net.Error
switch {
case errors.As(err, &apiErr):
switch apiErr.Code { //nolint:exhaustive
case ErrorCodeConflict:
return true
case ErrorCodeRateLimitExceeded:
return true
}
case errors.Is(err, ErrStatusCode):
switch resp.Response.StatusCode {
// 5xx errors
case http.StatusBadGateway, http.StatusGatewayTimeout:
return true
}
case errors.As(err, &netErr):
if netErr.Timeout() {
return true
}
}
}
return false
}
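// Sketch (not part of the upstream diff): a BackoffFunc with full jitter, in the
// shape expected by wrapRetryHandler. Assumes "math/rand" and "time" are
// imported; this is an illustration, not the package's ExponentialBackoffWithOpts.
func fullJitterBackoff(base, maxDelay time.Duration) BackoffFunc {
	return func(retries int) time.Duration {
		d := base << retries // exponential growth: base * 2^retries
		if d <= 0 || d > maxDelay {
			d = maxDelay // guard against the cap and shift overflow
		}
		return time.Duration(rand.Int63n(int64(d) + 1)) // full jitter in [0, d]
	}
}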

View File

@ -0,0 +1,85 @@
package hcloud
import (
"context"
"strconv"
)
// allFromSchemaFunc transforms each item in the list using the FromSchema function, and
// returns the result.
func allFromSchemaFunc[T, V any](all []T, fn func(T) V) []V {
result := make([]V, len(all))
for i, t := range all {
result[i] = fn(t)
}
return result
}
// iterPages fetches each page using the list function, and returns the combined result.
func iterPages[T any](listFn func(int) ([]*T, *Response, error)) ([]*T, error) {
page := 1
result := []*T{}
for {
pageResult, resp, err := listFn(page)
if err != nil {
return nil, err
}
result = append(result, pageResult...)
if resp.Meta.Pagination == nil || resp.Meta.Pagination.NextPage == 0 {
return result, nil
}
page = resp.Meta.Pagination.NextPage
}
}
// firstBy fetches a list of items using the list function, and returns the first item
// of the list if present, otherwise nil.
func firstBy[T any](listFn func() ([]*T, *Response, error)) (*T, *Response, error) {
items, resp, err := listFn()
if len(items) == 0 {
return nil, resp, err
}
return items[0], resp, err
}
// firstByName is a wrapper around [firstBy], that checks if the provided name is not
// empty.
func firstByName[T any](name string, listFn func() ([]*T, *Response, error)) (*T, *Response, error) {
if name == "" {
return nil, nil, nil
}
return firstBy(listFn)
}
// getByIDOrName fetches the resource by ID when the identifier is an integer, otherwise
// by Name. To support resources that have an integer as Name, an additional attempt is
// made to fetch the resource by Name using the ID.
//
// Since API managed resources (locations, server types, ...) do not have integers as
// names, this function is only meaningful for user managed resources (ssh keys,
// servers).
func getByIDOrName[T any](
ctx context.Context,
getByIDFn func(ctx context.Context, id int64) (*T, *Response, error),
getByNameFn func(ctx context.Context, name string) (*T, *Response, error),
idOrName string,
) (*T, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
result, resp, err := getByIDFn(ctx, id)
if err != nil {
return result, resp, err
}
if result != nil {
return result, resp, err
}
// Fallback to get by Name if the resource was not found
}
return getByNameFn(ctx, idOrName)
}

View File

@ -6,6 +6,7 @@ import (
"net/url"
"strconv"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -32,32 +33,27 @@ type DatacenterClient struct {
// GetByID retrieves a datacenter by its ID. If the datacenter does not exist, nil is returned.
func (c *DatacenterClient) GetByID(ctx context.Context, id int64) (*Datacenter, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/datacenters/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/datacenters/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.DatacenterGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.DatacenterGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, resp, err
}
return DatacenterFromSchema(body.Datacenter), resp, nil
return DatacenterFromSchema(respBody.Datacenter), resp, nil
}
// GetByName retrieves a datacenter by its name. If the datacenter does not exist, nil is returned.
func (c *DatacenterClient) GetByName(ctx context.Context, name string) (*Datacenter, *Response, error) {
if name == "" {
return nil, nil, nil
}
datacenters, response, err := c.List(ctx, DatacenterListOpts{Name: name})
if len(datacenters) == 0 {
return nil, response, err
}
return datacenters[0], response, err
return firstByName(name, func() ([]*Datacenter, *Response, error) {
return c.List(ctx, DatacenterListOpts{Name: name})
})
}
// Get retrieves a datacenter by its ID if the input can be parsed as an integer, otherwise it
@ -92,22 +88,17 @@ func (l DatacenterListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *DatacenterClient) List(ctx context.Context, opts DatacenterListOpts) ([]*Datacenter, *Response, error) {
path := "/datacenters?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/datacenters?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.DatacenterListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.DatacenterListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
datacenters := make([]*Datacenter, 0, len(body.Datacenters))
for _, i := range body.Datacenters {
datacenters = append(datacenters, DatacenterFromSchema(i))
}
return datacenters, resp, nil
return allFromSchemaFunc(respBody.Datacenters, DatacenterFromSchema), resp, nil
}
// All returns all datacenters.
@ -117,20 +108,8 @@ func (c *DatacenterClient) All(ctx context.Context) ([]*Datacenter, error) {
// AllWithOpts returns all datacenters for the given options.
func (c *DatacenterClient) AllWithOpts(ctx context.Context, opts DatacenterListOpts) ([]*Datacenter, error) {
allDatacenters := []*Datacenter{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*Datacenter, *Response, error) {
opts.Page = page
datacenters, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allDatacenters = append(allDatacenters, datacenters...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allDatacenters, nil
}

View File

@ -4,6 +4,8 @@ import (
"errors"
"fmt"
"net"
"slices"
"strings"
)
// ErrorCode represents an error code returned from the API.
@ -29,6 +31,7 @@ const (
ErrorCodeRobotUnavailable ErrorCode = "robot_unavailable" // Robot was not available. The caller may retry the operation after a short delay
ErrorCodeResourceLocked ErrorCode = "resource_locked" // The resource is locked. The caller should contact support
ErrorUnsupportedError ErrorCode = "unsupported_error" // The given resource does not support this
ErrorDeprecatedAPIEndpoint ErrorCode = "deprecated_api_endpoint" // The request can not be answered because the API functionality was removed
// Server related error codes.
@ -126,11 +129,16 @@ type ErrorDetailsInvalidInputField struct {
Messages []string
}
// IsError returns whether err is an API error with the given error code.
func IsError(err error, code ErrorCode) bool {
// ErrorDetailsDeprecatedAPIEndpoint contains the details of a 'deprecated_api_endpoint' error.
type ErrorDetailsDeprecatedAPIEndpoint struct {
Announcement string
}
// IsError returns whether err is an API error with one of the given error codes.
func IsError(err error, code ...ErrorCode) bool {
var apiErr Error
ok := errors.As(err, &apiErr)
return ok && apiErr.Code == code
return ok && slices.Contains(code, apiErr.Code)
}
type InvalidIPError struct {
@ -148,3 +156,40 @@ type DNSNotFoundError struct {
func (e DNSNotFoundError) Error() string {
return fmt.Sprintf("dns for ip %s not found", e.IP.String())
}
// ArgumentError is a type of error returned when validating arguments.
type ArgumentError string
func (e ArgumentError) Error() string { return string(e) }
func newArgumentErrorf(format string, args ...any) ArgumentError {
return ArgumentError(fmt.Sprintf(format, args...))
}
func missingArgument(name string, obj any) error {
return newArgumentErrorf("missing argument '%s' [%T]", name, obj)
}
func invalidArgument(name string, obj any) error {
return newArgumentErrorf("invalid value '%v' for argument '%s' [%T]", obj, name, obj)
}
func missingField(obj any, field string) error {
return newArgumentErrorf("missing field [%s] in [%T]", field, obj)
}
func invalidFieldValue(obj any, field string, value any) error {
return newArgumentErrorf("invalid value '%v' for field [%s] in [%T]", value, field, obj)
}
func missingOneOfFields(obj any, fields ...string) error {
return newArgumentErrorf("missing one of fields [%s] in [%T]", strings.Join(fields, ", "), obj)
}
func mutuallyExclusiveFields(obj any, fields ...string) error {
return newArgumentErrorf("found mutually exclusive fields [%s] in [%T]", strings.Join(fields, ", "), obj)
}
func missingRequiredTogetherFields(obj any, fields ...string) error {
return newArgumentErrorf("missing required together fields [%s] in [%T]", strings.Join(fields, ", "), obj)
}

View File

@ -0,0 +1,11 @@
package actionutil
import "k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud"
// AppendNext returns the action and the next actions in a new slice.
func AppendNext(action *hcloud.Action, nextActions []*hcloud.Action) []*hcloud.Action {
all := make([]*hcloud.Action, 0, 1+len(nextActions))
all = append(all, action)
all = append(all, nextActions...)
return all
}

View File

@ -0,0 +1,30 @@
package ctxutil
import (
"context"
"strings"
)
// key is an unexported type to prevent collisions with keys defined in other packages.
type key struct{}
// opPathKey is the key for operation path in Contexts.
var opPathKey = key{}
// SetOpPath processes the operation path and saves it in the context before returning it.
func SetOpPath(ctx context.Context, path string) context.Context {
path, _, _ = strings.Cut(path, "?")
path = strings.ReplaceAll(path, "%d", "-")
path = strings.ReplaceAll(path, "%s", "-")
return context.WithValue(ctx, opPathKey, path)
}
// OpPath returns the operation path from the context.
func OpPath(ctx context.Context) string {
result, ok := ctx.Value(opPathKey).(string)
if !ok {
return ""
}
return result
}
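// In-package example (not part of the upstream diff; assumes "context" and "fmt"
// are imported) demonstrating the normalization performed by SetOpPath:
func ExampleOpPath() {
	ctx := SetOpPath(context.Background(), "/floating_ips/%d/actions/assign?%s")
	fmt.Println(OpPath(ctx))
	// Output: /floating_ips/-/actions/assign
}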

View File

@ -0,0 +1,4 @@
// Package exp is a namespace that holds experimental features for the `hcloud-go` library.
//
// Breaking changes may occur without notice. Do not use in production!
package exp

View File

@ -0,0 +1,40 @@
package envutil
import (
"fmt"
"os"
"strings"
)
// LookupEnvWithFile retrieves the value of the environment variable named by the key (e.g.
// HCLOUD_TOKEN). If that environment variable is not set, it retrieves the
// content of the file located by a second environment variable named by the key +
// '_FILE' (e.g. HCLOUD_TOKEN_FILE).
//
// For both cases, the returned value may be empty.
//
// The value from the environment takes precedence over the value from the file.
func LookupEnvWithFile(key string) (string, error) {
// Check if the value is set in the environment (e.g. HCLOUD_TOKEN)
value, ok := os.LookupEnv(key)
if ok {
return value, nil
}
key += "_FILE"
// Check if the value is set via a file (e.g. HCLOUD_TOKEN_FILE)
valueFile, ok := os.LookupEnv(key)
if !ok {
// Validation of the value happens outside of this function
return "", nil
}
// Read the content of the file
valueBytes, err := os.ReadFile(valueFile)
if err != nil {
return "", fmt.Errorf("failed to read %s: %w", key, err)
}
return strings.TrimSpace(string(valueBytes)), nil
}
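// Usage sketch (not part of the upstream diff); HCLOUD_TOKEN is the variable
// this helper is typically used with:
//
//	token, err := envutil.LookupEnvWithFile("HCLOUD_TOKEN")
//	if err != nil {
//		return err
//	}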

View File

@ -0,0 +1,19 @@
package randutil
import (
"crypto/rand"
"encoding/hex"
"fmt"
)
// GenerateID returns a hex-encoded random string with a length of 8 chars, similar to
// "2873fce7".
func GenerateID() string {
b := make([]byte, 4)
_, err := rand.Read(b)
if err != nil {
// Should never happen as of go1.24: https://github.com/golang/go/issues/66821
panic(fmt.Errorf("failed to generate random string: %w", err))
}
return hex.EncodeToString(b)
}

View File

@ -0,0 +1,86 @@
package sshutil
import (
"crypto"
"crypto/ed25519"
"encoding/pem"
"fmt"
"golang.org/x/crypto/ssh"
)
// GenerateKeyPair generates a new ed25519 ssh key pair, and returns the private key and
// the public key respectively.
func GenerateKeyPair() ([]byte, []byte, error) {
pub, priv, err := ed25519.GenerateKey(nil)
if err != nil {
return nil, nil, fmt.Errorf("could not generate key pair: %w", err)
}
privBytes, err := encodePrivateKey(priv)
if err != nil {
return nil, nil, fmt.Errorf("could not encode private key: %w", err)
}
pubBytes, err := encodePublicKey(pub)
if err != nil {
return nil, nil, fmt.Errorf("could not encode public key: %w", err)
}
return privBytes, pubBytes, nil
}
func encodePrivateKey(priv crypto.PrivateKey) ([]byte, error) {
privPem, err := ssh.MarshalPrivateKey(priv, "")
if err != nil {
return nil, err
}
return pem.EncodeToMemory(privPem), nil
}
func encodePublicKey(pub crypto.PublicKey) ([]byte, error) {
sshPub, err := ssh.NewPublicKey(pub)
if err != nil {
return nil, err
}
return ssh.MarshalAuthorizedKey(sshPub), nil
}
type privateKeyWithPublicKey interface {
crypto.PrivateKey
Public() crypto.PublicKey
}
// GeneratePublicKey generates a public key from the provided private key.
func GeneratePublicKey(privBytes []byte) ([]byte, error) {
priv, err := ssh.ParseRawPrivateKey(privBytes)
if err != nil {
return nil, fmt.Errorf("could not decode private key: %w", err)
}
key, ok := priv.(privateKeyWithPublicKey)
if !ok {
return nil, fmt.Errorf("private key doesn't export Public() crypto.PublicKey")
}
pubBytes, err := encodePublicKey(key.Public())
if err != nil {
return nil, fmt.Errorf("could not encode public key: %w", err)
}
return pubBytes, nil
}
// GetPublicKeyFingerprint generates the fingerprint for the provided public key.
func GetPublicKeyFingerprint(pubBytes []byte) (string, error) {
pub, _, _, _, err := ssh.ParseAuthorizedKey(pubBytes)
if err != nil {
return "", fmt.Errorf("could not decode public key: %w", err)
}
fingerprint := ssh.FingerprintLegacyMD5(pub)
return fingerprint, nil
}
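// In-package example (not part of the upstream diff; assumes "bytes" and "fmt"
// are imported): GeneratePublicKey recovers the same public key that
// GenerateKeyPair produced.
func ExampleGenerateKeyPair() {
	priv, pub, err := GenerateKeyPair()
	if err != nil {
		panic(err)
	}
	pub2, err := GeneratePublicKey(priv)
	if err != nil {
		panic(err)
	}
	fmt.Println(bytes.Equal(pub, pub2))
	// Output: true
}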

View File

@ -0,0 +1,24 @@
package labelutil
import (
"fmt"
"sort"
"strings"
)
// Selector combines the label set into a [label selector](https://docs.hetzner.cloud/#label-selector) that only selects
// resources that have all specified labels set.
//
// The selector string can be used to filter resources when listing, for example with [hcloud.ServerClient.AllWithOpts()].
func Selector(labels map[string]string) string {
selectors := make([]string, 0, len(labels))
for k, v := range labels {
selectors = append(selectors, fmt.Sprintf("%s=%s", k, v))
}
// Reproducible result for tests
sort.Strings(selectors)
return strings.Join(selectors, ",")
}
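// Example (not part of the upstream diff); the result is deterministic because
// the selectors are sorted:
//
//	labelutil.Selector(map[string]string{"team": "infra", "env": "prod"})
//	// => "env=prod,team=infra"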

View File

@ -0,0 +1,123 @@
package mockutil
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// Request describes a http request that a [httptest.Server] should receive, and the
// corresponding response to return.
//
// Additional checks on the request (e.g. request body) may be added using the
// [Request.Want] function.
//
// The response body is populated from either a JSON struct, or a JSON string.
type Request struct {
Method string
Path string
Want func(t *testing.T, r *http.Request)
Status int
JSON any
JSONRaw string
}
// Handler uses a [Server] to mock the HTTP requests provided by the user.
func Handler(t *testing.T, requests []Request) http.HandlerFunc {
t.Helper()
server := NewServer(t, requests)
t.Cleanup(server.close)
return server.handler
}
// NewServer returns a new mock server that closes itself at the end of the test.
func NewServer(t *testing.T, requests []Request) *Server {
t.Helper()
o := &Server{t: t}
o.Server = httptest.NewServer(http.HandlerFunc(o.handler))
t.Cleanup(o.close)
o.Expect(requests)
return o
}
// Server embeds a [httptest.Server] that answers HTTP calls with a list of expected [Request].
//
// Request matching is based on the request count, and the user-provided requests
// are iterated over in order.
//
// A Server must be created using the [NewServer] function.
type Server struct {
*httptest.Server
t *testing.T
requests []Request
index int
}
// Expect adds requests to the list of requests expected by the [Server].
func (m *Server) Expect(requests []Request) {
m.requests = append(m.requests, requests...)
}
func (m *Server) close() {
m.t.Helper()
m.Server.Close()
assert.EqualValues(m.t, len(m.requests), m.index, "expected more calls")
}
func (m *Server) handler(w http.ResponseWriter, r *http.Request) {
if testing.Verbose() {
m.t.Logf("call %d: %s %s\n", m.index, r.Method, r.RequestURI)
}
if m.index >= len(m.requests) {
m.t.Fatalf("received unknown call %d", m.index)
}
expected := m.requests[m.index]
expectedCall := expected.Method
foundCall := r.Method
if expected.Path != "" {
expectedCall += " " + expected.Path
foundCall += " " + r.RequestURI
}
require.Equal(m.t, expectedCall, foundCall) // nolint: testifylint
if expected.Want != nil {
expected.Want(m.t, r)
}
switch {
case expected.JSON != nil:
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(expected.Status)
if err := json.NewEncoder(w).Encode(expected.JSON); err != nil {
m.t.Fatal(err)
}
case expected.JSONRaw != "":
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(expected.Status)
_, err := w.Write([]byte(expected.JSONRaw))
if err != nil {
m.t.Fatal(err)
}
default:
w.WriteHeader(expected.Status)
}
m.index++
}
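// Usage sketch (not part of the upstream diff; the endpoint path and body are
// illustrative). WithEndpoint is an existing client option:
//
//	func TestListServers(t *testing.T) {
//		server := mockutil.NewServer(t, []mockutil.Request{
//			{Method: "GET", Path: "/servers?page=1", Status: 200, JSONRaw: `{"servers": []}`},
//		})
//		client := hcloud.NewClient(hcloud.WithEndpoint(server.URL))
//		_, err := client.Server.All(context.Background())
//		require.NoError(t, err)
//	}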

View File

@ -1,16 +1,13 @@
package hcloud
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"net"
"net/url"
"strconv"
"time"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -96,41 +93,33 @@ type FirewallClient struct {
// GetByID retrieves a Firewall by its ID. If the Firewall does not exist, nil is returned.
func (c *FirewallClient) GetByID(ctx context.Context, id int64) (*Firewall, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/firewalls/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/firewalls/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.FirewallGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.FirewallGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, nil, err
return nil, resp, err
}
return FirewallFromSchema(body.Firewall), resp, nil
return FirewallFromSchema(respBody.Firewall), resp, nil
}
// GetByName retrieves a Firewall by its name. If the Firewall does not exist, nil is returned.
func (c *FirewallClient) GetByName(ctx context.Context, name string) (*Firewall, *Response, error) {
if name == "" {
return nil, nil, nil
}
firewalls, response, err := c.List(ctx, FirewallListOpts{Name: name})
if len(firewalls) == 0 {
return nil, response, err
}
return firewalls[0], response, err
return firstByName(name, func() ([]*Firewall, *Response, error) {
return c.List(ctx, FirewallListOpts{Name: name})
})
}
// Get retrieves a Firewall by its ID if the input can be parsed as an integer, otherwise it
// retrieves a Firewall by its name. If the Firewall does not exist, nil is returned.
func (c *FirewallClient) Get(ctx context.Context, idOrName string) (*Firewall, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
return c.GetByID(ctx, id)
}
return c.GetByName(ctx, idOrName)
return getByIDOrName(ctx, c.GetByID, c.GetByName, idOrName)
}
// FirewallListOpts specifies options for listing Firewalls.
@ -156,22 +145,17 @@ func (l FirewallListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *FirewallClient) List(ctx context.Context, opts FirewallListOpts) ([]*Firewall, *Response, error) {
path := "/firewalls?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/firewalls?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.FirewallListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.FirewallListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
firewalls := make([]*Firewall, 0, len(body.Firewalls))
for _, s := range body.Firewalls {
firewalls = append(firewalls, FirewallFromSchema(s))
}
return firewalls, resp, nil
return allFromSchemaFunc(respBody.Firewalls, FirewallFromSchema), resp, nil
}
// All returns all Firewalls.
@ -181,22 +165,10 @@ func (c *FirewallClient) All(ctx context.Context) ([]*Firewall, error) {
// AllWithOpts returns all Firewalls for the given options.
func (c *FirewallClient) AllWithOpts(ctx context.Context, opts FirewallListOpts) ([]*Firewall, error) {
allFirewalls := []*Firewall{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*Firewall, *Response, error) {
opts.Page = page
firewalls, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allFirewalls = append(allFirewalls, firewalls...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allFirewalls, nil
}
// FirewallCreateOpts specifies options for creating a new Firewall.
@ -210,7 +182,7 @@ type FirewallCreateOpts struct {
// Validate checks if options are valid.
func (o FirewallCreateOpts) Validate() error {
if o.Name == "" {
return errors.New("missing name")
return missingField(o, "Name")
}
return nil
}
@ -223,28 +195,27 @@ type FirewallCreateResult struct {
// Create creates a new Firewall.
func (c *FirewallClient) Create(ctx context.Context, opts FirewallCreateOpts) (FirewallCreateResult, *Response, error) {
const opPath = "/firewalls"
ctx = ctxutil.SetOpPath(ctx, opPath)
result := FirewallCreateResult{}
reqPath := opPath
if err := opts.Validate(); err != nil {
return FirewallCreateResult{}, nil, err
}
reqBody := firewallCreateOptsToSchema(opts)
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return FirewallCreateResult{}, nil, err
}
req, err := c.client.NewRequest(ctx, "POST", "/firewalls", bytes.NewReader(reqBodyData))
if err != nil {
return FirewallCreateResult{}, nil, err
return result, nil, err
}
respBody := schema.FirewallCreateResponse{}
resp, err := c.client.Do(req, &respBody)
reqBody := firewallCreateOptsToSchema(opts)
respBody, resp, err := postRequest[schema.FirewallCreateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return FirewallCreateResult{}, resp, err
}
result := FirewallCreateResult{
Firewall: FirewallFromSchema(respBody.Firewall),
Actions: ActionsFromSchema(respBody.Actions),
return result, resp, err
}
result.Firewall = FirewallFromSchema(respBody.Firewall)
result.Actions = ActionsFromSchema(respBody.Actions)
return result, resp, nil
}
@ -256,6 +227,11 @@ type FirewallUpdateOpts struct {
// Update updates a Firewall.
func (c *FirewallClient) Update(ctx context.Context, firewall *Firewall, opts FirewallUpdateOpts) (*Firewall, *Response, error) {
const opPath = "/firewalls/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, firewall.ID)
reqBody := schema.FirewallUpdateRequest{}
if opts.Name != "" {
reqBody.Name = &opts.Name
@ -263,32 +239,23 @@ func (c *FirewallClient) Update(ctx context.Context, firewall *Firewall, opts Fi
if opts.Labels != nil {
reqBody.Labels = &opts.Labels
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/firewalls/%d", firewall.ID)
req, err := c.client.NewRequest(ctx, "PUT", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.FirewallUpdateResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := putRequest[schema.FirewallUpdateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return FirewallFromSchema(respBody.Firewall), resp, nil
}
// Delete deletes a Firewall.
func (c *FirewallClient) Delete(ctx context.Context, firewall *Firewall) (*Response, error) {
req, err := c.client.NewRequest(ctx, "DELETE", fmt.Sprintf("/firewalls/%d", firewall.ID), nil)
if err != nil {
return nil, err
}
return c.client.Do(req, nil)
const opPath = "/firewalls/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, firewall.ID)
return deleteRequestNoResult(ctx, c.client, reqPath)
}
// FirewallSetRulesOpts specifies options for setting rules of a Firewall.
@ -298,75 +265,59 @@ type FirewallSetRulesOpts struct {
// SetRules sets the rules of a Firewall.
func (c *FirewallClient) SetRules(ctx context.Context, firewall *Firewall, opts FirewallSetRulesOpts) ([]*Action, *Response, error) {
const opPath = "/firewalls/%d/actions/set_rules"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, firewall.ID)
reqBody := firewallSetRulesOptsToSchema(opts)
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/firewalls/%d/actions/set_rules", firewall.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
var respBody schema.FirewallActionSetRulesResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.FirewallActionSetRulesResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionsFromSchema(respBody.Actions), resp, nil
}
func (c *FirewallClient) ApplyResources(ctx context.Context, firewall *Firewall, resources []FirewallResource) ([]*Action, *Response, error) {
const opPath = "/firewalls/%d/actions/apply_to_resources"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, firewall.ID)
applyTo := make([]schema.FirewallResource, len(resources))
for i, r := range resources {
applyTo[i] = firewallResourceToSchema(r)
}
reqBody := schema.FirewallActionApplyToResourcesRequest{ApplyTo: applyTo}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/firewalls/%d/actions/apply_to_resources", firewall.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
var respBody schema.FirewallActionApplyToResourcesResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.FirewallActionApplyToResourcesResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionsFromSchema(respBody.Actions), resp, nil
}
func (c *FirewallClient) RemoveResources(ctx context.Context, firewall *Firewall, resources []FirewallResource) ([]*Action, *Response, error) {
const opPath = "/firewalls/%d/actions/remove_from_resources"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, firewall.ID)
removeFrom := make([]schema.FirewallResource, len(resources))
for i, r := range resources {
removeFrom[i] = firewallResourceToSchema(r)
}
reqBody := schema.FirewallActionRemoveFromResourcesRequest{RemoveFrom: removeFrom}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/firewalls/%d/actions/remove_from_resources", firewall.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
var respBody schema.FirewallActionRemoveFromResourcesResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.FirewallActionRemoveFromResourcesResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionsFromSchema(respBody.Actions), resp, nil
}

View File

@ -1,16 +1,13 @@
package hcloud
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"net"
"net/url"
"strconv"
"time"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -54,26 +51,21 @@ const (
// changeDNSPtr changes or resets the reverse DNS pointer for an IP address.
// Pass a nil ptr to reset the reverse DNS pointer to its default value.
func (f *FloatingIP) changeDNSPtr(ctx context.Context, client *Client, ip net.IP, ptr *string) (*Action, *Response, error) {
const opPath = "/floating_ips/%d/actions/change_dns_ptr"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, f.ID)
reqBody := schema.FloatingIPActionChangeDNSPtrRequest{
IP: ip.String(),
DNSPtr: ptr,
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/floating_ips/%d/actions/change_dns_ptr", f.ID)
req, err := client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.FloatingIPActionChangeDNSPtrResponse{}
resp, err := client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.FloatingIPActionChangeDNSPtrResponse](ctx, client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
@ -97,41 +89,33 @@ type FloatingIPClient struct {
// GetByID retrieves a Floating IP by its ID. If the Floating IP does not exist,
// nil is returned.
func (c *FloatingIPClient) GetByID(ctx context.Context, id int64) (*FloatingIP, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/floating_ips/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/floating_ips/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.FloatingIPGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.FloatingIPGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, resp, err
}
return FloatingIPFromSchema(body.FloatingIP), resp, nil
return FloatingIPFromSchema(respBody.FloatingIP), resp, nil
}
// GetByName retrieves a Floating IP by its name. If the Floating IP does not exist, nil is returned.
func (c *FloatingIPClient) GetByName(ctx context.Context, name string) (*FloatingIP, *Response, error) {
if name == "" {
return nil, nil, nil
}
floatingIPs, response, err := c.List(ctx, FloatingIPListOpts{Name: name})
if len(floatingIPs) == 0 {
return nil, response, err
}
return floatingIPs[0], response, err
return firstByName(name, func() ([]*FloatingIP, *Response, error) {
return c.List(ctx, FloatingIPListOpts{Name: name})
})
}
// Get retrieves a Floating IP by its ID if the input can be parsed as an integer, otherwise it
// retrieves a Floating IP by its name. If the Floating IP does not exist, nil is returned.
func (c *FloatingIPClient) Get(ctx context.Context, idOrName string) (*FloatingIP, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
return c.GetByID(ctx, id)
}
return c.GetByName(ctx, idOrName)
return getByIDOrName(ctx, c.GetByID, c.GetByName, idOrName)
}
// FloatingIPListOpts specifies options for listing Floating IPs.
@ -157,22 +141,17 @@ func (l FloatingIPListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *FloatingIPClient) List(ctx context.Context, opts FloatingIPListOpts) ([]*FloatingIP, *Response, error) {
path := "/floating_ips?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/floating_ips?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.FloatingIPListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.FloatingIPListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
floatingIPs := make([]*FloatingIP, 0, len(body.FloatingIPs))
for _, s := range body.FloatingIPs {
floatingIPs = append(floatingIPs, FloatingIPFromSchema(s))
}
return floatingIPs, resp, nil
return allFromSchemaFunc(respBody.FloatingIPs, FloatingIPFromSchema), resp, nil
}
// All returns all Floating IPs.
@ -182,22 +161,10 @@ func (c *FloatingIPClient) All(ctx context.Context) ([]*FloatingIP, error) {
// AllWithOpts returns all Floating IPs for the given options.
func (c *FloatingIPClient) AllWithOpts(ctx context.Context, opts FloatingIPListOpts) ([]*FloatingIP, error) {
allFloatingIPs := []*FloatingIP{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*FloatingIP, *Response, error) {
opts.Page = page
floatingIPs, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allFloatingIPs = append(allFloatingIPs, floatingIPs...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allFloatingIPs, nil
}
// FloatingIPCreateOpts specifies options for creating a Floating IP.
@ -216,10 +183,10 @@ func (o FloatingIPCreateOpts) Validate() error {
case FloatingIPTypeIPv4, FloatingIPTypeIPv6:
break
default:
return errors.New("missing or invalid type")
return invalidFieldValue(o, "Type", o.Type)
}
if o.HomeLocation == nil && o.Server == nil {
return errors.New("one of home location or server is required")
return missingOneOfFields(o, "HomeLocation", "Server")
}
return nil
}
@ -232,8 +199,15 @@ type FloatingIPCreateResult struct {
// Create creates a Floating IP.
func (c *FloatingIPClient) Create(ctx context.Context, opts FloatingIPCreateOpts) (FloatingIPCreateResult, *Response, error) {
result := FloatingIPCreateResult{}
const opPath = "/floating_ips"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := opPath
if err := opts.Validate(); err != nil {
return FloatingIPCreateResult{}, nil, err
return result, nil, err
}
reqBody := schema.FloatingIPCreateRequest{
@ -250,38 +224,28 @@ func (c *FloatingIPClient) Create(ctx context.Context, opts FloatingIPCreateOpts
if opts.Labels != nil {
reqBody.Labels = &opts.Labels
}
reqBodyData, err := json.Marshal(reqBody)
respBody, resp, err := postRequest[schema.FloatingIPCreateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return FloatingIPCreateResult{}, nil, err
return result, resp, err
}
req, err := c.client.NewRequest(ctx, "POST", "/floating_ips", bytes.NewReader(reqBodyData))
if err != nil {
return FloatingIPCreateResult{}, nil, err
}
var respBody schema.FloatingIPCreateResponse
resp, err := c.client.Do(req, &respBody)
if err != nil {
return FloatingIPCreateResult{}, resp, err
}
var action *Action
result.FloatingIP = FloatingIPFromSchema(respBody.FloatingIP)
if respBody.Action != nil {
action = ActionFromSchema(*respBody.Action)
result.Action = ActionFromSchema(*respBody.Action)
}
return FloatingIPCreateResult{
FloatingIP: FloatingIPFromSchema(respBody.FloatingIP),
Action: action,
}, resp, nil
return result, resp, nil
}
// Delete deletes a Floating IP.
func (c *FloatingIPClient) Delete(ctx context.Context, floatingIP *FloatingIP) (*Response, error) {
req, err := c.client.NewRequest(ctx, "DELETE", fmt.Sprintf("/floating_ips/%d", floatingIP.ID), nil)
if err != nil {
return nil, err
}
return c.client.Do(req, nil)
const opPath = "/floating_ips/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, floatingIP.ID)
return deleteRequestNoResult(ctx, c.client, reqPath)
}
// FloatingIPUpdateOpts specifies options for updating a Floating IP.
@ -293,6 +257,11 @@ type FloatingIPUpdateOpts struct {
// Update updates a Floating IP.
func (c *FloatingIPClient) Update(ctx context.Context, floatingIP *FloatingIP, opts FloatingIPUpdateOpts) (*FloatingIP, *Response, error) {
const opPath = "/floating_ips/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, floatingIP.ID)
reqBody := schema.FloatingIPUpdateRequest{
Description: opts.Description,
Name: opts.Name,
@ -300,68 +269,48 @@ func (c *FloatingIPClient) Update(ctx context.Context, floatingIP *FloatingIP, o
if opts.Labels != nil {
reqBody.Labels = &opts.Labels
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/floating_ips/%d", floatingIP.ID)
req, err := c.client.NewRequest(ctx, "PUT", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.FloatingIPUpdateResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := putRequest[schema.FloatingIPUpdateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return FloatingIPFromSchema(respBody.FloatingIP), resp, nil
}
// Assign assigns a Floating IP to a server.
func (c *FloatingIPClient) Assign(ctx context.Context, floatingIP *FloatingIP, server *Server) (*Action, *Response, error) {
const opPath = "/floating_ips/%d/actions/assign"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, floatingIP.ID)
reqBody := schema.FloatingIPActionAssignRequest{
Server: server.ID,
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/floating_ips/%d/actions/assign", floatingIP.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
var respBody schema.FloatingIPActionAssignResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.FloatingIPActionAssignResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
// Unassign unassigns a Floating IP from the currently assigned server.
func (c *FloatingIPClient) Unassign(ctx context.Context, floatingIP *FloatingIP) (*Action, *Response, error) {
var reqBody schema.FloatingIPActionUnassignRequest
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
const opPath = "/floating_ips/%d/actions/unassign"
ctx = ctxutil.SetOpPath(ctx, opPath)
path := fmt.Sprintf("/floating_ips/%d/actions/unassign", floatingIP.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
reqPath := fmt.Sprintf(opPath, floatingIP.ID)
var respBody schema.FloatingIPActionUnassignResponse
resp, err := c.client.Do(req, &respBody)
reqBody := schema.FloatingIPActionUnassignRequest{}
respBody, resp, err := postRequest[schema.FloatingIPActionUnassignResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
@ -382,24 +331,19 @@ type FloatingIPChangeProtectionOpts struct {
// ChangeProtection changes the resource protection level of a Floating IP.
func (c *FloatingIPClient) ChangeProtection(ctx context.Context, floatingIP *FloatingIP, opts FloatingIPChangeProtectionOpts) (*Action, *Response, error) {
const opPath = "/floating_ips/%d/actions/change_protection"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, floatingIP.ID)
reqBody := schema.FloatingIPActionChangeProtectionRequest{
Delete: opts.Delete,
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/floating_ips/%d/actions/change_protection", floatingIP.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.FloatingIPActionChangeProtectionResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.FloatingIPActionChangeProtectionResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, err
return ActionFromSchema(respBody.Action), resp, nil
}

View File

@ -1,5 +1,34 @@
// Package hcloud is a library for the Hetzner Cloud API.
/*
Package hcloud is a library for the Hetzner Cloud API.
The Hetzner Cloud API reference is available at https://docs.hetzner.cloud.
Make sure to follow our API changelog available at https://docs.hetzner.cloud/changelog
(or the RSS feed available at https://docs.hetzner.cloud/changelog/feed.rss) to be
notified about additions, deprecations and removals.
# Retry mechanism
The [Client.Do] method will retry failed requests that match certain criteria. The
default retry interval is defined by an exponential backoff algorithm truncated to 60s
with jitter. The default maximum number of retries is 5.
The following rules define when a request can be retried:
When the [http.Client] returned a network timeout error.
When the API returned an HTTP error, with the status code:
- [http.StatusBadGateway]
- [http.StatusGatewayTimeout]
When the API returned an application error, with the code:
- [ErrorCodeConflict]
- [ErrorCodeRateLimitExceeded]
Changes to the retry policy might occur between releases, and will not be considered
breaking changes.
*/
package hcloud
// Version is the library's version following Semantic Versioning.
const Version = "2.8.0" // x-release-please-version
const Version = "2.21.1" // x-releaser-pleaser-version

View File

@ -1,14 +1,13 @@
package hcloud
import (
"bytes"
"context"
"encoding/json"
"fmt"
"net/url"
"strconv"
"time"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -83,34 +82,29 @@ type ImageClient struct {
// GetByID retrieves an image by its ID. If the image does not exist, nil is returned.
func (c *ImageClient) GetByID(ctx context.Context, id int64) (*Image, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/images/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/images/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.ImageGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.ImageGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, nil, err
return nil, resp, err
}
return ImageFromSchema(body.Image), resp, nil
return ImageFromSchema(respBody.Image), resp, nil
}
// GetByName retrieves an image by its name. If the image does not exist, nil is returned.
//
// Deprecated: Use [ImageClient.GetByNameAndArchitecture] instead.
func (c *ImageClient) GetByName(ctx context.Context, name string) (*Image, *Response, error) {
if name == "" {
return nil, nil, nil
}
images, response, err := c.List(ctx, ImageListOpts{Name: name})
if len(images) == 0 {
return nil, response, err
}
return images[0], response, err
return firstByName(name, func() ([]*Image, *Response, error) {
return c.List(ctx, ImageListOpts{Name: name})
})
}
// GetByNameAndArchitecture retrieves an image by its name and architecture. If the image does not exist,
@ -118,14 +112,9 @@ func (c *ImageClient) GetByName(ctx context.Context, name string) (*Image, *Resp
// In contrast to [ImageClient.Get], this method also returns deprecated images. Depending on your needs you should
// check for this in your calling method.
func (c *ImageClient) GetByNameAndArchitecture(ctx context.Context, name string, architecture Architecture) (*Image, *Response, error) {
if name == "" {
return nil, nil, nil
}
images, response, err := c.List(ctx, ImageListOpts{Name: name, Architecture: []Architecture{architecture}, IncludeDeprecated: true})
if len(images) == 0 {
return nil, response, err
}
return images[0], response, err
return firstByName(name, func() ([]*Image, *Response, error) {
return c.List(ctx, ImageListOpts{Name: name, Architecture: []Architecture{architecture}, IncludeDeprecated: true})
})
}
// Get retrieves an image by its ID if the input can be parsed as an integer, otherwise it
@ -133,10 +122,7 @@ func (c *ImageClient) GetByNameAndArchitecture(ctx context.Context, name string,
//
// Deprecated: Use [ImageClient.GetForArchitecture] instead.
func (c *ImageClient) Get(ctx context.Context, idOrName string) (*Image, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
return c.GetByID(ctx, id)
}
return c.GetByName(ctx, idOrName)
return getByIDOrName(ctx, c.GetByID, c.GetByName, idOrName)
}
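getByIDOrName centralizes the parse-as-ID-else-lookup-by-name dispatch removed above. A sketch consistent with the call sites (assuming a strconv import):

// getByIDOrName parses the input as a numeric ID if possible and otherwise
// falls back to a lookup by name.
func getByIDOrName[T any](
	ctx context.Context,
	getByID func(context.Context, int64) (*T, *Response, error),
	getByName func(context.Context, string) (*T, *Response, error),
	idOrName string,
) (*T, *Response, error) {
	if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
		return getByID(ctx, id)
	}
	return getByName(ctx, idOrName)
}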
// GetForArchitecture retrieves an image by its ID if the input can be parsed as an integer, otherwise it
@ -145,10 +131,13 @@ func (c *ImageClient) Get(ctx context.Context, idOrName string) (*Image, *Respon
// In contrast to [ImageClient.Get], this method also returns deprecated images. Depending on your needs you should
// check for this in your calling method.
func (c *ImageClient) GetForArchitecture(ctx context.Context, idOrName string, architecture Architecture) (*Image, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
return c.GetByID(ctx, id)
}
return c.GetByNameAndArchitecture(ctx, idOrName, architecture)
return getByIDOrName(ctx,
c.GetByID,
func(ctx context.Context, name string) (*Image, *Response, error) {
return c.GetByNameAndArchitecture(ctx, name, architecture)
},
idOrName,
)
}
// ImageListOpts specifies options for listing images.
@ -194,22 +183,17 @@ func (l ImageListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *ImageClient) List(ctx context.Context, opts ImageListOpts) ([]*Image, *Response, error) {
path := "/images?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/images?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.ImageListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.ImageListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
images := make([]*Image, 0, len(body.Images))
for _, i := range body.Images {
images = append(images, ImageFromSchema(i))
}
return images, resp, nil
return allFromSchemaFunc(respBody.Images, ImageFromSchema), resp, nil
}
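allFromSchemaFunc replaces the hand-written make/append loop with a generic map over the schema slice; a sketch:

// allFromSchemaFunc converts each schema object via fn and returns the
// resulting slice.
func allFromSchemaFunc[S, T any](all []S, fn func(S) T) []T {
	result := make([]T, len(all))
	for i, s := range all {
		result[i] = fn(s)
	}
	return result
}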
// All returns all images.
@ -219,31 +203,20 @@ func (c *ImageClient) All(ctx context.Context) ([]*Image, error) {
// AllWithOpts returns all images for the given options.
func (c *ImageClient) AllWithOpts(ctx context.Context, opts ImageListOpts) ([]*Image, error) {
allImages := []*Image{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*Image, *Response, error) {
opts.Page = page
images, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allImages = append(allImages, images...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allImages, nil
}
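iterPages takes over the page loop that c.client.all used to drive. A sketch, assuming the Response's pagination metadata reports the next page number and zero once the last page has been fetched:

// iterPages calls list for page 1, 2, ... until pagination reports no next
// page, collecting all results.
func iterPages[T any](list func(page int) ([]*T, *Response, error)) ([]*T, error) {
	all := []*T{}
	for page := 1; page > 0; {
		items, resp, err := list(page)
		if err != nil {
			return nil, err
		}
		all = append(all, items...)
		page = resp.Meta.Pagination.NextPage // assumption: 0 on the last page
	}
	return all, nil
}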
// Delete deletes an image.
func (c *ImageClient) Delete(ctx context.Context, image *Image) (*Response, error) {
req, err := c.client.NewRequest(ctx, "DELETE", fmt.Sprintf("/images/%d", image.ID), nil)
if err != nil {
return nil, err
}
return c.client.Do(req, nil)
const opPath = "/images/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, image.ID)
return deleteRequestNoResult(ctx, c.client, reqPath)
}
// ImageUpdateOpts specifies options for updating an image.
@ -255,6 +228,11 @@ type ImageUpdateOpts struct {
// Update updates an image.
func (c *ImageClient) Update(ctx context.Context, image *Image, opts ImageUpdateOpts) (*Image, *Response, error) {
const opPath = "/images/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, image.ID)
reqBody := schema.ImageUpdateRequest{
Description: opts.Description,
}
@ -264,22 +242,12 @@ func (c *ImageClient) Update(ctx context.Context, image *Image, opts ImageUpdate
if opts.Labels != nil {
reqBody.Labels = &opts.Labels
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/images/%d", image.ID)
req, err := c.client.NewRequest(ctx, "PUT", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.ImageUpdateResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := putRequest[schema.ImageUpdateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ImageFromSchema(respBody.Image), resp, nil
}
@ -290,24 +258,19 @@ type ImageChangeProtectionOpts struct {
// ChangeProtection changes the resource protection level of an image.
func (c *ImageClient) ChangeProtection(ctx context.Context, image *Image, opts ImageChangeProtectionOpts) (*Action, *Response, error) {
const opPath = "/images/%d/actions/change_protection"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, image.ID)
reqBody := schema.ImageActionChangeProtectionRequest{
Delete: opts.Delete,
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/images/%d/actions/change_protection", image.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.ImageActionChangeProtectionResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.ImageActionChangeProtectionResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, err
return ActionFromSchema(respBody.Action), resp, nil
}

View File

@ -1,6 +1,7 @@
package instrumentation
import (
"errors"
"fmt"
"net/http"
"regexp"
@ -9,6 +10,8 @@ import (
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
)
type Instrumenter struct {
@ -22,7 +25,12 @@ func New(subsystemIdentifier string, instrumentationRegistry prometheus.Register
}
// InstrumentedRoundTripper returns an instrumented round tripper.
func (i *Instrumenter) InstrumentedRoundTripper() http.RoundTripper {
func (i *Instrumenter) InstrumentedRoundTripper(transport http.RoundTripper) http.RoundTripper {
// By default, an http.Client falls back to http.DefaultTransport when its Transport is nil, but internally we rely on it being set explicitly
if transport == nil {
transport = http.DefaultTransport
}
inFlightRequestsGauge := registerOrReuse(
i.instrumentationRegistry,
prometheus.NewGauge(prometheus.GaugeOpts{
@ -57,7 +65,7 @@ func (i *Instrumenter) InstrumentedRoundTripper() http.RoundTripper {
return promhttp.InstrumentRoundTripperInFlight(inFlightRequestsGauge,
promhttp.InstrumentRoundTripperDuration(requestLatencyHistogram,
i.instrumentRoundTripperEndpoint(requestsPerEndpointCounter,
http.DefaultTransport,
transport,
),
),
)
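With the new transport parameter, callers can keep an already configured transport instead of losing it to http.DefaultTransport. A hypothetical wiring (the instrumentation package lives under internal/, so this is illustrative; assumes net/http and prometheus imports):

registry := prometheus.NewRegistry()
i := instrumentation.New("hcloud", registry)

client := &http.Client{
	Transport: &http.Transport{MaxIdleConnsPerHost: 10},
}
// The instrumented round tripper now wraps the configured transport.
client.Transport = i.InstrumentedRoundTripper(client.Transport)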
@ -73,8 +81,17 @@ func (i *Instrumenter) instrumentRoundTripperEndpoint(counter *prometheus.Counte
return func(r *http.Request) (*http.Response, error) {
resp, err := next.RoundTrip(r)
if err == nil {
statusCode := strconv.Itoa(resp.StatusCode)
counter.WithLabelValues(statusCode, strings.ToLower(resp.Request.Method), preparePathForLabel(resp.Request.URL.Path)).Inc()
apiEndpoint := ctxutil.OpPath(r.Context())
// If the request does not set the operation path, we must construct it.
// This happens e.g. for user-crafted requests.
if apiEndpoint == "" {
apiEndpoint = preparePathForLabel(resp.Request.URL.Path)
}
counter.WithLabelValues(
strconv.Itoa(resp.StatusCode),
strings.ToLower(resp.Request.Method),
apiEndpoint,
).Inc()
}
return resp, err
@ -87,9 +104,10 @@ func (i *Instrumenter) instrumentRoundTripperEndpoint(counter *prometheus.Counte
func registerOrReuse[C prometheus.Collector](registry prometheus.Registerer, collector C) C {
err := registry.Register(collector)
if err != nil {
var arErr prometheus.AlreadyRegisteredError
// If we get an AlreadyRegisteredError we can return the existing collector
if are, ok := err.(prometheus.AlreadyRegisteredError); ok {
if existingCollector, ok := are.ExistingCollector.(C); ok {
if errors.As(err, &arErr) {
if existingCollector, ok := arErr.ExistingCollector.(C); ok {
collector = existingCollector
} else {
panic("received incompatible existing collector")
@ -102,16 +120,16 @@ func registerOrReuse[C prometheus.Collector](registry prometheus.Registerer, col
return collector
}
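registerOrReuse makes metric construction idempotent when several instrumenters share one registry, which is what lets InstrumentedRoundTripper be called repeatedly without panicking. For example (metric name hypothetical):

inFlight := registerOrReuse(registry, prometheus.NewGauge(prometheus.GaugeOpts{
	Name: "hcloud_api_in_flight_requests",
	Help: "In-flight requests to the hcloud API.",
}))
inFlight.Inc()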
var pathLabelRegexp = regexp.MustCompile("[^a-z/_]+")
func preparePathForLabel(path string) string {
path = strings.ToLower(path)
// strip the /v1 prefix that indicates the API version
path, _ = strings.CutPrefix(path, "/v1")
// replace all numbers and chars that are not a-z, / or _
reg := regexp.MustCompile("[^a-z/_]+")
path = reg.ReplaceAllString(path, "")
path = pathLabelRegexp.ReplaceAllString(path, "-")
// replace all artifacts of number replacement (//)
path = strings.ReplaceAll(path, "//", "/")
// replace the /v/ that indicated the API version
return strings.Replace(path, "/v/", "/", 1)
return path
}
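The practical effect of substituting "-" rather than deleting the match is that numeric IDs stay visible as a placeholder in the metric label. For a call within the package:

fmt.Println(preparePathForLabel("/v1/load_balancers/123/actions/add_target"))
// Output: /load_balancers/-/actions/add_target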

View File

@ -4,9 +4,9 @@ import (
"context"
"fmt"
"net/url"
"strconv"
"time"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -40,40 +40,32 @@ type ISOClient struct {
// GetByID retrieves an ISO by its ID.
func (c *ISOClient) GetByID(ctx context.Context, id int64) (*ISO, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/isos/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/isos/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.ISOGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.ISOGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, resp, err
}
return ISOFromSchema(body.ISO), resp, nil
return ISOFromSchema(respBody.ISO), resp, nil
}
// GetByName retrieves an ISO by its name.
func (c *ISOClient) GetByName(ctx context.Context, name string) (*ISO, *Response, error) {
if name == "" {
return nil, nil, nil
}
isos, response, err := c.List(ctx, ISOListOpts{Name: name})
if len(isos) == 0 {
return nil, response, err
}
return isos[0], response, err
return firstByName(name, func() ([]*ISO, *Response, error) {
return c.List(ctx, ISOListOpts{Name: name})
})
}
// Get retrieves an ISO by its ID if the input can be parsed as an integer, otherwise it retrieves an ISO by its name.
func (c *ISOClient) Get(ctx context.Context, idOrName string) (*ISO, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
return c.GetByID(ctx, id)
}
return c.GetByName(ctx, idOrName)
return getByIDOrName(ctx, c.GetByID, c.GetByName, idOrName)
}
// ISOListOpts specifies options for listing ISOs.
@ -115,22 +107,17 @@ func (l ISOListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *ISOClient) List(ctx context.Context, opts ISOListOpts) ([]*ISO, *Response, error) {
path := "/isos?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/isos?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.ISOListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.ISOListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
isos := make([]*ISO, 0, len(body.ISOs))
for _, i := range body.ISOs {
isos = append(isos, ISOFromSchema(i))
}
return isos, resp, nil
return allFromSchemaFunc(respBody.ISOs, ISOFromSchema), resp, nil
}
// All returns all ISOs.
@ -140,20 +127,8 @@ func (c *ISOClient) All(ctx context.Context) ([]*ISO, error) {
// AllWithOpts returns all ISOs for the given options.
func (c *ISOClient) AllWithOpts(ctx context.Context, opts ISOListOpts) ([]*ISO, error) {
allISOs := []*ISO{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*ISO, *Response, error) {
opts.Page = page
isos, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allISOs = append(allISOs, isos...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allISOs, nil
}

View File

@ -1,16 +1,14 @@
package hcloud
import (
"bytes"
"context"
"encoding/json"
"fmt"
"net"
"net/http"
"net/url"
"strconv"
"time"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -200,26 +198,21 @@ type LoadBalancerProtection struct {
// changeDNSPtr changes or resets the reverse DNS pointer for an IP address.
// Pass a nil ptr to reset the reverse DNS pointer to its default value.
func (lb *LoadBalancer) changeDNSPtr(ctx context.Context, client *Client, ip net.IP, ptr *string) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/change_dns_ptr"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, lb.ID)
reqBody := schema.LoadBalancerActionChangeDNSPtrRequest{
IP: ip.String(),
DNSPtr: ptr,
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d/actions/change_dns_ptr", lb.ID)
req, err := client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerActionChangeDNSPtrResponse{}
resp, err := client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionChangeDNSPtrResponse](ctx, client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
@ -243,41 +236,33 @@ type LoadBalancerClient struct {
// GetByID retrieves a Load Balancer by its ID. If the Load Balancer does not exist, nil is returned.
func (c *LoadBalancerClient) GetByID(ctx context.Context, id int64) (*LoadBalancer, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/load_balancers/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/load_balancers/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.LoadBalancerGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.LoadBalancerGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, nil, err
return nil, resp, err
}
return LoadBalancerFromSchema(body.LoadBalancer), resp, nil
return LoadBalancerFromSchema(respBody.LoadBalancer), resp, nil
}
// GetByName retrieves a Load Balancer by its name. If the Load Balancer does not exist, nil is returned.
func (c *LoadBalancerClient) GetByName(ctx context.Context, name string) (*LoadBalancer, *Response, error) {
if name == "" {
return nil, nil, nil
}
LoadBalancer, response, err := c.List(ctx, LoadBalancerListOpts{Name: name})
if len(LoadBalancer) == 0 {
return nil, response, err
}
return LoadBalancer[0], response, err
return firstByName(name, func() ([]*LoadBalancer, *Response, error) {
return c.List(ctx, LoadBalancerListOpts{Name: name})
})
}
// Get retrieves a Load Balancer by its ID if the input can be parsed as an integer, otherwise it
// retrieves a Load Balancer by its name. If the Load Balancer does not exist, nil is returned.
func (c *LoadBalancerClient) Get(ctx context.Context, idOrName string) (*LoadBalancer, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
return c.GetByID(ctx, id)
}
return c.GetByName(ctx, idOrName)
return getByIDOrName(ctx, c.GetByID, c.GetByName, idOrName)
}
// LoadBalancerListOpts specifies options for listing Load Balancers.
@ -303,22 +288,17 @@ func (l LoadBalancerListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *LoadBalancerClient) List(ctx context.Context, opts LoadBalancerListOpts) ([]*LoadBalancer, *Response, error) {
path := "/load_balancers?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/load_balancers?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.LoadBalancerListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.LoadBalancerListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
LoadBalancers := make([]*LoadBalancer, 0, len(body.LoadBalancers))
for _, s := range body.LoadBalancers {
LoadBalancers = append(LoadBalancers, LoadBalancerFromSchema(s))
}
return LoadBalancers, resp, nil
return allFromSchemaFunc(respBody.LoadBalancers, LoadBalancerFromSchema), resp, nil
}
// All returns all Load Balancers.
@ -328,22 +308,10 @@ func (c *LoadBalancerClient) All(ctx context.Context) ([]*LoadBalancer, error) {
// AllWithOpts returns all Load Balancers for the given options.
func (c *LoadBalancerClient) AllWithOpts(ctx context.Context, opts LoadBalancerListOpts) ([]*LoadBalancer, error) {
allLoadBalancers := []*LoadBalancer{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*LoadBalancer, *Response, error) {
opts.Page = page
LoadBalancers, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allLoadBalancers = append(allLoadBalancers, LoadBalancers...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allLoadBalancers, nil
}
// LoadBalancerUpdateOpts specifies options for updating a Load Balancer.
@ -354,6 +322,11 @@ type LoadBalancerUpdateOpts struct {
// Update updates a Load Balancer.
func (c *LoadBalancerClient) Update(ctx context.Context, loadBalancer *LoadBalancer, opts LoadBalancerUpdateOpts) (*LoadBalancer, *Response, error) {
const opPath = "/load_balancers/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := schema.LoadBalancerUpdateRequest{}
if opts.Name != "" {
reqBody.Name = &opts.Name
@ -361,22 +334,12 @@ func (c *LoadBalancerClient) Update(ctx context.Context, loadBalancer *LoadBalan
if opts.Labels != nil {
reqBody.Labels = &opts.Labels
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "PUT", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerUpdateResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := putRequest[schema.LoadBalancerUpdateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return LoadBalancerFromSchema(respBody.LoadBalancer), resp, nil
}
@ -472,73 +435,61 @@ type LoadBalancerCreateResult struct {
// Create creates a new Load Balancer.
func (c *LoadBalancerClient) Create(ctx context.Context, opts LoadBalancerCreateOpts) (LoadBalancerCreateResult, *Response, error) {
const opPath = "/load_balancers"
ctx = ctxutil.SetOpPath(ctx, opPath)
result := LoadBalancerCreateResult{}
reqPath := opPath
reqBody := loadBalancerCreateOptsToSchema(opts)
reqBodyData, err := json.Marshal(reqBody)
respBody, resp, err := postRequest[schema.LoadBalancerCreateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return LoadBalancerCreateResult{}, nil, err
}
req, err := c.client.NewRequest(ctx, "POST", "/load_balancers", bytes.NewReader(reqBodyData))
if err != nil {
return LoadBalancerCreateResult{}, nil, err
return result, resp, err
}
respBody := schema.LoadBalancerCreateResponse{}
resp, err := c.client.Do(req, &respBody)
if err != nil {
return LoadBalancerCreateResult{}, resp, err
}
return LoadBalancerCreateResult{
LoadBalancer: LoadBalancerFromSchema(respBody.LoadBalancer),
Action: ActionFromSchema(respBody.Action),
}, resp, nil
result.LoadBalancer = LoadBalancerFromSchema(respBody.LoadBalancer)
result.Action = ActionFromSchema(respBody.Action)
return result, resp, nil
}
// Delete deletes a Load Balancer.
func (c *LoadBalancerClient) Delete(ctx context.Context, loadBalancer *LoadBalancer) (*Response, error) {
req, err := c.client.NewRequest(ctx, "DELETE", fmt.Sprintf("/load_balancers/%d", loadBalancer.ID), nil)
if err != nil {
return nil, err
}
return c.client.Do(req, nil)
const opPath = "/load_balancers/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
return deleteRequestNoResult(ctx, c.client, reqPath)
}
func (c *LoadBalancerClient) addTarget(ctx context.Context, loadBalancer *LoadBalancer, reqBody schema.LoadBalancerActionAddTargetRequest) (*Action, *Response, error) {
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
const opPath = "/load_balancers/%d/actions/add_target"
ctx = ctxutil.SetOpPath(ctx, opPath)
path := fmt.Sprintf("/load_balancers/%d/actions/add_target", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
var respBody schema.LoadBalancerActionAddTargetResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionAddTargetResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
func (c *LoadBalancerClient) removeTarget(ctx context.Context, loadBalancer *LoadBalancer, reqBody schema.LoadBalancerActionRemoveTargetRequest) (*Action, *Response, error) {
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
const opPath = "/load_balancers/%d/actions/remove_target"
ctx = ctxutil.SetOpPath(ctx, opPath)
path := fmt.Sprintf("/load_balancers/%d/actions/remove_target", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
var respBody schema.LoadBalancerActionRemoveTargetResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionRemoveTargetResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
@ -671,23 +622,18 @@ type LoadBalancerAddServiceOptsHealthCheckHTTP struct {
// AddService adds a service to a Load Balancer.
func (c *LoadBalancerClient) AddService(ctx context.Context, loadBalancer *LoadBalancer, opts LoadBalancerAddServiceOpts) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/add_service"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := loadBalancerAddServiceOptsToSchema(opts)
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d/actions/add_service", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
var respBody schema.LoadBalancerActionAddServiceResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionAddServiceResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
@ -732,48 +678,38 @@ type LoadBalancerUpdateServiceOptsHealthCheckHTTP struct {
// UpdateService updates a Load Balancer service.
func (c *LoadBalancerClient) UpdateService(ctx context.Context, loadBalancer *LoadBalancer, listenPort int, opts LoadBalancerUpdateServiceOpts) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/update_service"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := loadBalancerUpdateServiceOptsToSchema(opts)
reqBody.ListenPort = listenPort
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d/actions/update_service", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
var respBody schema.LoadBalancerActionUpdateServiceResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionUpdateServiceResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
// DeleteService deletes a Load Balancer service.
func (c *LoadBalancerClient) DeleteService(ctx context.Context, loadBalancer *LoadBalancer, listenPort int) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/delete_service"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := schema.LoadBalancerDeleteServiceRequest{
ListenPort: listenPort,
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d/actions/delete_service", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
var respBody schema.LoadBalancerDeleteServiceResponse
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerDeleteServiceResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
@ -784,26 +720,21 @@ type LoadBalancerChangeProtectionOpts struct {
// ChangeProtection changes the resource protection level of a Load Balancer.
func (c *LoadBalancerClient) ChangeProtection(ctx context.Context, loadBalancer *LoadBalancer, opts LoadBalancerChangeProtectionOpts) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/change_protection"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := schema.LoadBalancerActionChangeProtectionRequest{
Delete: opts.Delete,
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d/actions/change_protection", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerActionChangeProtectionResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionChangeProtectionResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, err
return ActionFromSchema(respBody.Action), resp, nil
}
// LoadBalancerChangeAlgorithmOpts specifies options for changing the algorithm of a Load Balancer.
@ -813,26 +744,21 @@ type LoadBalancerChangeAlgorithmOpts struct {
// ChangeAlgorithm changes the algorithm of a Load Balancer.
func (c *LoadBalancerClient) ChangeAlgorithm(ctx context.Context, loadBalancer *LoadBalancer, opts LoadBalancerChangeAlgorithmOpts) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/change_algorithm"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := schema.LoadBalancerActionChangeAlgorithmRequest{
Type: string(opts.Type),
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d/actions/change_algorithm", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerActionChangeAlgorithmResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionChangeAlgorithmResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, err
return ActionFromSchema(respBody.Action), resp, nil
}
// LoadBalancerAttachToNetworkOpts specifies options for attaching a Load Balancer to a network.
@ -843,29 +769,24 @@ type LoadBalancerAttachToNetworkOpts struct {
// AttachToNetwork attaches a Load Balancer to a network.
func (c *LoadBalancerClient) AttachToNetwork(ctx context.Context, loadBalancer *LoadBalancer, opts LoadBalancerAttachToNetworkOpts) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/attach_to_network"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := schema.LoadBalancerActionAttachToNetworkRequest{
Network: opts.Network.ID,
}
if opts.IP != nil {
reqBody.IP = Ptr(opts.IP.String())
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d/actions/attach_to_network", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerActionAttachToNetworkResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionAttachToNetworkResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, err
return ActionFromSchema(respBody.Action), resp, nil
}
// LoadBalancerDetachFromNetworkOpts specifies options for detaching a Load Balancer from a network.
@ -875,56 +796,51 @@ type LoadBalancerDetachFromNetworkOpts struct {
// DetachFromNetwork detaches a Load Balancer from a network.
func (c *LoadBalancerClient) DetachFromNetwork(ctx context.Context, loadBalancer *LoadBalancer, opts LoadBalancerDetachFromNetworkOpts) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/detach_from_network"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := schema.LoadBalancerActionDetachFromNetworkRequest{
Network: opts.Network.ID,
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/load_balancers/%d/actions/detach_from_network", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerActionDetachFromNetworkResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionDetachFromNetworkResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, err
return ActionFromSchema(respBody.Action), resp, nil
}
// EnablePublicInterface enables the Load Balancer's public network interface.
func (c *LoadBalancerClient) EnablePublicInterface(ctx context.Context, loadBalancer *LoadBalancer) (*Action, *Response, error) {
path := fmt.Sprintf("/load_balancers/%d/actions/enable_public_interface", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, nil)
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerActionEnablePublicInterfaceResponse{}
resp, err := c.client.Do(req, &respBody)
const opPath = "/load_balancers/%d/actions/enable_public_interface"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
respBody, resp, err := postRequest[schema.LoadBalancerActionEnablePublicInterfaceResponse](ctx, c.client, reqPath, nil)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, err
return ActionFromSchema(respBody.Action), resp, nil
}
// DisablePublicInterface disables the Load Balancer's public network interface.
func (c *LoadBalancerClient) DisablePublicInterface(ctx context.Context, loadBalancer *LoadBalancer) (*Action, *Response, error) {
path := fmt.Sprintf("/load_balancers/%d/actions/disable_public_interface", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, nil)
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerActionDisablePublicInterfaceResponse{}
resp, err := c.client.Do(req, &respBody)
const opPath = "/load_balancers/%d/actions/disable_public_interface"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
respBody, resp, err := postRequest[schema.LoadBalancerActionDisablePublicInterfaceResponse](ctx, c.client, reqPath, nil)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, err
return ActionFromSchema(respBody.Action), resp, nil
}
// LoadBalancerChangeTypeOpts specifies options for changing a Load Balancer's type.
@ -934,28 +850,21 @@ type LoadBalancerChangeTypeOpts struct {
// ChangeType changes a Load Balancer's type.
func (c *LoadBalancerClient) ChangeType(ctx context.Context, loadBalancer *LoadBalancer, opts LoadBalancerChangeTypeOpts) (*Action, *Response, error) {
const opPath = "/load_balancers/%d/actions/change_type"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, loadBalancer.ID)
reqBody := schema.LoadBalancerActionChangeTypeRequest{}
if opts.LoadBalancerType.ID != 0 {
reqBody.LoadBalancerType = opts.LoadBalancerType.ID
} else {
reqBody.LoadBalancerType = opts.LoadBalancerType.Name
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
if opts.LoadBalancerType.ID != 0 || opts.LoadBalancerType.Name != "" {
reqBody.LoadBalancerType = schema.IDOrName{ID: opts.LoadBalancerType.ID, Name: opts.LoadBalancerType.Name}
}
path := fmt.Sprintf("/load_balancers/%d/actions/change_type", loadBalancer.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.LoadBalancerActionChangeTypeResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.LoadBalancerActionChangeTypeResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
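ChangeType now sends a schema.IDOrName value instead of assigning either an int or a string to an untyped field. The type's definition is outside this diff; a plausible sketch is a struct whose JSON encoding is the ID when set and the name otherwise:

type IDOrName struct {
	ID   int64
	Name string
}

// MarshalJSON emits the numeric ID when present, otherwise the name.
// Sketch; the real encoding rules are an assumption here.
func (o IDOrName) MarshalJSON() ([]byte, error) {
	if o.ID != 0 {
		return json.Marshal(o.ID)
	}
	return json.Marshal(o.Name)
}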
@ -980,32 +889,34 @@ type LoadBalancerGetMetricsOpts struct {
Step int
}
func (o *LoadBalancerGetMetricsOpts) addQueryParams(req *http.Request) error {
query := req.URL.Query()
func (o LoadBalancerGetMetricsOpts) Validate() error {
if len(o.Types) == 0 {
return fmt.Errorf("no metric types specified")
return missingField(o, "Types")
}
if o.Start.IsZero() {
return missingField(o, "Start")
}
if o.End.IsZero() {
return missingField(o, "End")
}
return nil
}
func (o LoadBalancerGetMetricsOpts) values() url.Values {
query := url.Values{}
for _, typ := range o.Types {
query.Add("type", string(typ))
}
if o.Start.IsZero() {
return fmt.Errorf("no start time specified")
}
query.Add("start", o.Start.Format(time.RFC3339))
if o.End.IsZero() {
return fmt.Errorf("no end time specified")
}
query.Add("end", o.End.Format(time.RFC3339))
if o.Step > 0 {
query.Add("step", strconv.Itoa(o.Step))
}
req.URL.RawQuery = query.Encode()
return nil
return query
}
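A hypothetical call site for the reworked options, with Validate now enforced inside GetMetrics and values() building the query string (field and constant names as in hcloud-go; treat the specifics as illustrative):

opts := hcloud.LoadBalancerGetMetricsOpts{
	Types: []hcloud.LoadBalancerMetricType{hcloud.LoadBalancerMetricOpenConnections},
	Start: time.Now().Add(-time.Hour),
	End:   time.Now(),
	Step:  60,
}
// An empty Types or a zero Start/End now surfaces as a missingField error.
metrics, _, err := client.LoadBalancer.GetMetrics(ctx, loadBalancer, opts)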
// LoadBalancerMetrics contains the metrics requested for a Load Balancer.
@ -1024,31 +935,32 @@ type LoadBalancerMetricsValue struct {
// GetMetrics obtains metrics for a Load Balancer.
func (c *LoadBalancerClient) GetMetrics(
ctx context.Context, lb *LoadBalancer, opts LoadBalancerGetMetricsOpts,
ctx context.Context, loadBalancer *LoadBalancer, opts LoadBalancerGetMetricsOpts,
) (*LoadBalancerMetrics, *Response, error) {
var respBody schema.LoadBalancerGetMetricsResponse
const opPath = "/load_balancers/%d/metrics?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
if lb == nil {
return nil, nil, fmt.Errorf("illegal argument: load balancer is nil")
if loadBalancer == nil {
return nil, nil, missingArgument("loadBalancer", loadBalancer)
}
path := fmt.Sprintf("/load_balancers/%d/metrics", lb.ID)
req, err := c.client.NewRequest(ctx, "GET", path, nil)
if err := opts.Validate(); err != nil {
return nil, nil, err
}
reqPath := fmt.Sprintf(opPath, loadBalancer.ID, opts.values().Encode())
respBody, resp, err := getRequest[schema.LoadBalancerGetMetricsResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, fmt.Errorf("new request: %v", err)
return nil, resp, err
}
if err := opts.addQueryParams(req); err != nil {
return nil, nil, fmt.Errorf("add query params: %v", err)
}
resp, err := c.client.Do(req, &respBody)
metrics, err := loadBalancerMetricsFromSchema(&respBody)
if err != nil {
return nil, nil, fmt.Errorf("get metrics: %v", err)
return nil, nil, fmt.Errorf("convert response body: %w", err)
}
ms, err := loadBalancerMetricsFromSchema(&respBody)
if err != nil {
return nil, nil, fmt.Errorf("convert response body: %v", err)
}
return ms, resp, nil
return metrics, resp, nil
}
// ChangeDNSPtr changes or resets the reverse DNS pointer for a Load Balancer.

View File

@ -6,6 +6,7 @@ import (
"net/url"
"strconv"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -29,32 +30,27 @@ type LoadBalancerTypeClient struct {
// GetByID retrieves a Load Balancer type by its ID. If the Load Balancer type does not exist, nil is returned.
func (c *LoadBalancerTypeClient) GetByID(ctx context.Context, id int64) (*LoadBalancerType, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/load_balancer_types/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/load_balancer_types/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.LoadBalancerTypeGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.LoadBalancerTypeGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, nil, err
return nil, resp, err
}
return LoadBalancerTypeFromSchema(body.LoadBalancerType), resp, nil
return LoadBalancerTypeFromSchema(respBody.LoadBalancerType), resp, nil
}
// GetByName retrieves a Load Balancer type by its name. If the Load Balancer type does not exist, nil is returned.
func (c *LoadBalancerTypeClient) GetByName(ctx context.Context, name string) (*LoadBalancerType, *Response, error) {
if name == "" {
return nil, nil, nil
}
LoadBalancerTypes, response, err := c.List(ctx, LoadBalancerTypeListOpts{Name: name})
if len(LoadBalancerTypes) == 0 {
return nil, response, err
}
return LoadBalancerTypes[0], response, err
return firstByName(name, func() ([]*LoadBalancerType, *Response, error) {
return c.List(ctx, LoadBalancerTypeListOpts{Name: name})
})
}
// Get retrieves a Load Balancer type by its ID if the input can be parsed as an integer, otherwise it
@ -89,22 +85,17 @@ func (l LoadBalancerTypeListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *LoadBalancerTypeClient) List(ctx context.Context, opts LoadBalancerTypeListOpts) ([]*LoadBalancerType, *Response, error) {
path := "/load_balancer_types?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/load_balancer_types?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.LoadBalancerTypeListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.LoadBalancerTypeListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
LoadBalancerTypes := make([]*LoadBalancerType, 0, len(body.LoadBalancerTypes))
for _, s := range body.LoadBalancerTypes {
LoadBalancerTypes = append(LoadBalancerTypes, LoadBalancerTypeFromSchema(s))
}
return LoadBalancerTypes, resp, nil
return allFromSchemaFunc(respBody.LoadBalancerTypes, LoadBalancerTypeFromSchema), resp, nil
}
// All returns all Load Balancer types.
@ -114,20 +105,8 @@ func (c *LoadBalancerTypeClient) All(ctx context.Context) ([]*LoadBalancerType,
// AllWithOpts returns all Load Balancer types for the given options.
func (c *LoadBalancerTypeClient) AllWithOpts(ctx context.Context, opts LoadBalancerTypeListOpts) ([]*LoadBalancerType, error) {
allLoadBalancerTypes := []*LoadBalancerType{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*LoadBalancerType, *Response, error) {
opts.Page = page
LoadBalancerTypes, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allLoadBalancerTypes = append(allLoadBalancerTypes, LoadBalancerTypes...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allLoadBalancerTypes, nil
}

View File

@ -6,6 +6,7 @@ import (
"net/url"
"strconv"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -28,32 +29,27 @@ type LocationClient struct {
// GetByID retrieves a location by its ID. If the location does not exist, nil is returned.
func (c *LocationClient) GetByID(ctx context.Context, id int64) (*Location, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/locations/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/locations/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.LocationGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.LocationGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, resp, err
}
return LocationFromSchema(body.Location), resp, nil
return LocationFromSchema(respBody.Location), resp, nil
}
// GetByName retrieves a location by its name. If the location does not exist, nil is returned.
func (c *LocationClient) GetByName(ctx context.Context, name string) (*Location, *Response, error) {
if name == "" {
return nil, nil, nil
}
locations, response, err := c.List(ctx, LocationListOpts{Name: name})
if len(locations) == 0 {
return nil, response, err
}
return locations[0], response, err
return firstByName(name, func() ([]*Location, *Response, error) {
return c.List(ctx, LocationListOpts{Name: name})
})
}
// Get retrieves a location by its ID if the input can be parsed as an integer, otherwise it
@ -88,22 +84,17 @@ func (l LocationListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *LocationClient) List(ctx context.Context, opts LocationListOpts) ([]*Location, *Response, error) {
path := "/locations?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/locations?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.LocationListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.LocationListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
locations := make([]*Location, 0, len(body.Locations))
for _, i := range body.Locations {
locations = append(locations, LocationFromSchema(i))
}
return locations, resp, nil
return allFromSchemaFunc(respBody.Locations, LocationFromSchema), resp, nil
}
// All returns all locations.
@ -113,20 +104,8 @@ func (c *LocationClient) All(ctx context.Context) ([]*Location, error) {
// AllWithOpts returns all locations for the given options.
func (c *LocationClient) AllWithOpts(ctx context.Context, opts LocationListOpts) ([]*Location, error) {
allLocations := []*Location{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*Location, *Response, error) {
opts.Page = page
locations, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allLocations = append(allLocations, locations...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allLocations, nil
}

View File

@ -1,6 +1,8 @@
package metadata
import (
"bytes"
"context"
"fmt"
"io"
"net"
@ -11,6 +13,7 @@ import (
"github.com/prometheus/client_golang/prometheus"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/internal/instrumentation"
)
@ -72,24 +75,33 @@ func NewClient(options ...ClientOption) *Client {
if client.instrumentationRegistry != nil {
i := instrumentation.New("metadata", client.instrumentationRegistry)
client.httpClient.Transport = i.InstrumentedRoundTripper()
client.httpClient.Transport = i.InstrumentedRoundTripper(client.httpClient.Transport)
}
return client
}
// get executes an HTTP request against the API.
func (c *Client) get(path string) (string, error) {
url := c.endpoint + path
resp, err := c.httpClient.Get(url)
ctx := ctxutil.SetOpPath(context.Background(), path)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.endpoint+path, http.NoBody)
if err != nil {
return "", err
}
resp, err := c.httpClient.Do(req)
if err != nil {
return "", err
}
defer resp.Body.Close()
bodyBytes, err := io.ReadAll(resp.Body)
if err != nil {
return "", err
}
body := string(bodyBytes)
body := string(bytes.TrimSpace(bodyBytes))
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return body, fmt.Errorf("response status was %d", resp.StatusCode)
}

View File

@ -1,16 +1,13 @@
package hcloud
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"net"
"net/url"
"strconv"
"time"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/exp/ctxutil"
"k8s.io/autoscaler/cluster-autoscaler/cloudprovider/hetzner/hcloud-go/hcloud/schema"
)
@ -19,9 +16,10 @@ type NetworkZone string
// List of available Network Zones.
const (
NetworkZoneEUCentral NetworkZone = "eu-central"
NetworkZoneUSEast NetworkZone = "us-east"
NetworkZoneUSWest NetworkZone = "us-west"
NetworkZoneEUCentral NetworkZone = "eu-central"
NetworkZoneUSEast NetworkZone = "us-east"
NetworkZoneUSWest NetworkZone = "us-west"
NetworkZoneAPSouthEast NetworkZone = "ap-southeast"
)
// NetworkSubnetType specifies a type of a subnet.
@ -29,22 +27,30 @@ type NetworkSubnetType string
// List of available network subnet types.
const (
NetworkSubnetTypeCloud NetworkSubnetType = "cloud"
NetworkSubnetTypeServer NetworkSubnetType = "server"
// Used to connect cloud servers and load balancers.
NetworkSubnetTypeCloud NetworkSubnetType = "cloud"
// Used to connect cloud servers and load balancers.
//
// Deprecated: Use [NetworkSubnetTypeCloud] instead.
NetworkSubnetTypeServer NetworkSubnetType = "server"
// Used to connect cloud servers and load balancers with dedicated servers.
//
// See https://docs.hetzner.com/cloud/networks/connect-dedi-vswitch/
NetworkSubnetTypeVSwitch NetworkSubnetType = "vswitch"
)
// Network represents a network in the Hetzner Cloud.
type Network struct {
ID int64
Name string
Created time.Time
IPRange *net.IPNet
Subnets []NetworkSubnet
Routes []NetworkRoute
Servers []*Server
Protection NetworkProtection
Labels map[string]string
ID int64
Name string
Created time.Time
IPRange *net.IPNet
Subnets []NetworkSubnet
Routes []NetworkRoute
Servers []*Server
LoadBalancers []*LoadBalancer
Protection NetworkProtection
Labels map[string]string
// ExposeRoutesToVSwitch indicates if the routes from this network should be exposed to the vSwitch connection.
ExposeRoutesToVSwitch bool
@ -78,41 +84,33 @@ type NetworkClient struct {
// GetByID retrieves a network by its ID. If the network does not exist, nil is returned.
func (c *NetworkClient) GetByID(ctx context.Context, id int64) (*Network, *Response, error) {
req, err := c.client.NewRequest(ctx, "GET", fmt.Sprintf("/networks/%d", id), nil)
if err != nil {
return nil, nil, err
}
const opPath = "/networks/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
var body schema.NetworkGetResponse
resp, err := c.client.Do(req, &body)
reqPath := fmt.Sprintf(opPath, id)
respBody, resp, err := getRequest[schema.NetworkGetResponse](ctx, c.client, reqPath)
if err != nil {
if IsError(err, ErrorCodeNotFound) {
return nil, resp, nil
}
return nil, nil, err
return nil, resp, err
}
return NetworkFromSchema(body.Network), resp, nil
return NetworkFromSchema(respBody.Network), resp, nil
}
// GetByName retrieves a network by its name. If the network does not exist, nil is returned.
func (c *NetworkClient) GetByName(ctx context.Context, name string) (*Network, *Response, error) {
if name == "" {
return nil, nil, nil
}
Networks, response, err := c.List(ctx, NetworkListOpts{Name: name})
if len(Networks) == 0 {
return nil, response, err
}
return Networks[0], response, err
return firstByName(name, func() ([]*Network, *Response, error) {
return c.List(ctx, NetworkListOpts{Name: name})
})
}
// Get retrieves a network by its ID if the input can be parsed as an integer, otherwise it
// retrieves a network by its name. If the network does not exist, nil is returned.
func (c *NetworkClient) Get(ctx context.Context, idOrName string) (*Network, *Response, error) {
if id, err := strconv.ParseInt(idOrName, 10, 64); err == nil {
return c.GetByID(ctx, id)
}
return c.GetByName(ctx, idOrName)
return getByIDOrName(ctx, c.GetByID, c.GetByName, idOrName)
}
// NetworkListOpts specifies options for listing networks.
@ -138,22 +136,17 @@ func (l NetworkListOpts) values() url.Values {
// Please note that filters specified in opts are not taken into account
// when their value corresponds to their zero value or when they are empty.
func (c *NetworkClient) List(ctx context.Context, opts NetworkListOpts) ([]*Network, *Response, error) {
path := "/networks?" + opts.values().Encode()
req, err := c.client.NewRequest(ctx, "GET", path, nil)
const opPath = "/networks?%s"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, opts.values().Encode())
respBody, resp, err := getRequest[schema.NetworkListResponse](ctx, c.client, reqPath)
if err != nil {
return nil, nil, err
return nil, resp, err
}
var body schema.NetworkListResponse
resp, err := c.client.Do(req, &body)
if err != nil {
return nil, nil, err
}
Networks := make([]*Network, 0, len(body.Networks))
for _, s := range body.Networks {
Networks = append(Networks, NetworkFromSchema(s))
}
return Networks, resp, nil
return allFromSchemaFunc(respBody.Networks, NetworkFromSchema), resp, nil
}
// All returns all networks.
@ -163,31 +156,20 @@ func (c *NetworkClient) All(ctx context.Context) ([]*Network, error) {
// AllWithOpts returns all networks for the given options.
func (c *NetworkClient) AllWithOpts(ctx context.Context, opts NetworkListOpts) ([]*Network, error) {
allNetworks := []*Network{}
err := c.client.all(func(page int) (*Response, error) {
return iterPages(func(page int) ([]*Network, *Response, error) {
opts.Page = page
Networks, resp, err := c.List(ctx, opts)
if err != nil {
return resp, err
}
allNetworks = append(allNetworks, Networks...)
return resp, nil
return c.List(ctx, opts)
})
if err != nil {
return nil, err
}
return allNetworks, nil
}
// Delete deletes a network.
func (c *NetworkClient) Delete(ctx context.Context, network *Network) (*Response, error) {
req, err := c.client.NewRequest(ctx, "DELETE", fmt.Sprintf("/networks/%d", network.ID), nil)
if err != nil {
return nil, err
}
return c.client.Do(req, nil)
const opPath = "/networks/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, network.ID)
return deleteRequestNoResult(ctx, c.client, reqPath)
}
// NetworkUpdateOpts specifies options for updating a network.
@ -201,6 +183,11 @@ type NetworkUpdateOpts struct {
// Update updates a network.
func (c *NetworkClient) Update(ctx context.Context, network *Network, opts NetworkUpdateOpts) (*Network, *Response, error) {
const opPath = "/networks/%d"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, network.ID)
reqBody := schema.NetworkUpdateRequest{
Name: opts.Name,
}
@ -211,22 +198,11 @@ func (c *NetworkClient) Update(ctx context.Context, network *Network, opts Netwo
reqBody.ExposeRoutesToVSwitch = opts.ExposeRoutesToVSwitch
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/networks/%d", network.ID)
req, err := c.client.NewRequest(ctx, "PUT", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.NetworkUpdateResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := putRequest[schema.NetworkUpdateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return NetworkFromSchema(respBody.Network), resp, nil
}
@ -245,16 +221,21 @@ type NetworkCreateOpts struct {
// Validate checks if options are valid.
func (o NetworkCreateOpts) Validate() error {
if o.Name == "" {
return errors.New("missing name")
return missingField(o, "Name")
}
if o.IPRange == nil || o.IPRange.String() == "" {
return errors.New("missing IP range")
return missingField(o, "IPRange")
}
return nil
}
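missingField replaces the ad-hoc errors.New messages with a uniform validation error. Its definition is not part of this diff; a sketch consistent with the call sites (the exact message format is an assumption):

// missingField builds a validation error naming the field and the options
// type, e.g. "missing field [Name] in [hcloud.NetworkCreateOpts]".
func missingField(opts any, field string) error {
	return fmt.Errorf("missing field [%s] in [%T]", field, opts)
}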
// Create creates a new network.
func (c *NetworkClient) Create(ctx context.Context, opts NetworkCreateOpts) (*Network, *Response, error) {
const opPath = "/networks"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := opPath
if err := opts.Validate(); err != nil {
return nil, nil, err
}
@ -283,20 +264,12 @@ func (c *NetworkClient) Create(ctx context.Context, opts NetworkCreateOpts) (*Ne
if opts.Labels != nil {
reqBody.Labels = &opts.Labels
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
req, err := c.client.NewRequest(ctx, "POST", "/networks", bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.NetworkCreateResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.NetworkCreateResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return NetworkFromSchema(respBody.Network), resp, nil
}
@ -307,25 +280,20 @@ type NetworkChangeIPRangeOpts struct {
// ChangeIPRange changes the IP range of a network.
func (c *NetworkClient) ChangeIPRange(ctx context.Context, network *Network, opts NetworkChangeIPRangeOpts) (*Action, *Response, error) {
const opPath = "/networks/%d/actions/change_ip_range"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, network.ID)
reqBody := schema.NetworkActionChangeIPRangeRequest{
IPRange: opts.IPRange.String(),
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/networks/%d/actions/change_ip_range", network.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.NetworkActionChangeIPRangeResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.NetworkActionChangeIPRangeResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
@ -336,6 +304,11 @@ type NetworkAddSubnetOpts struct {
// AddSubnet adds a subnet to a network.
func (c *NetworkClient) AddSubnet(ctx context.Context, network *Network, opts NetworkAddSubnetOpts) (*Action, *Response, error) {
const opPath = "/networks/%d/actions/add_subnet"
ctx = ctxutil.SetOpPath(ctx, opPath)
reqPath := fmt.Sprintf(opPath, network.ID)
reqBody := schema.NetworkActionAddSubnetRequest{
Type: string(opts.Subnet.Type),
NetworkZone: string(opts.Subnet.NetworkZone),
@ -346,22 +319,12 @@ func (c *NetworkClient) AddSubnet(ctx context.Context, network *Network, opts Ne
if opts.Subnet.VSwitchID != 0 {
reqBody.VSwitchID = opts.Subnet.VSwitchID
}
reqBodyData, err := json.Marshal(reqBody)
if err != nil {
return nil, nil, err
}
path := fmt.Sprintf("/networks/%d/actions/add_subnet", network.ID)
req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
if err != nil {
return nil, nil, err
}
respBody := schema.NetworkActionAddSubnetResponse{}
resp, err := c.client.Do(req, &respBody)
respBody, resp, err := postRequest[schema.NetworkActionAddSubnetResponse](ctx, c.client, reqPath, reqBody)
if err != nil {
return nil, resp, err
}
return ActionFromSchema(respBody.Action), resp, nil
}
@@ -372,25 +335,20 @@ type NetworkDeleteSubnetOpts struct {
 // DeleteSubnet deletes a subnet from a network.
 func (c *NetworkClient) DeleteSubnet(ctx context.Context, network *Network, opts NetworkDeleteSubnetOpts) (*Action, *Response, error) {
+	const opPath = "/networks/%d/actions/delete_subnet"
+	ctx = ctxutil.SetOpPath(ctx, opPath)
+
+	reqPath := fmt.Sprintf(opPath, network.ID)
+
 	reqBody := schema.NetworkActionDeleteSubnetRequest{
 		IPRange: opts.Subnet.IPRange.String(),
 	}
-	reqBodyData, err := json.Marshal(reqBody)
-	if err != nil {
-		return nil, nil, err
-	}
-	path := fmt.Sprintf("/networks/%d/actions/delete_subnet", network.ID)
-	req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
-	if err != nil {
-		return nil, nil, err
-	}
-	respBody := schema.NetworkActionDeleteSubnetResponse{}
-	resp, err := c.client.Do(req, &respBody)
+	respBody, resp, err := postRequest[schema.NetworkActionDeleteSubnetResponse](ctx, c.client, reqPath, reqBody)
 	if err != nil {
 		return nil, resp, err
 	}
 	return ActionFromSchema(respBody.Action), resp, nil
 }
@@ -401,26 +359,21 @@ type NetworkAddRouteOpts struct {
 // AddRoute adds a route to a network.
 func (c *NetworkClient) AddRoute(ctx context.Context, network *Network, opts NetworkAddRouteOpts) (*Action, *Response, error) {
+	const opPath = "/networks/%d/actions/add_route"
+	ctx = ctxutil.SetOpPath(ctx, opPath)
+
+	reqPath := fmt.Sprintf(opPath, network.ID)
+
 	reqBody := schema.NetworkActionAddRouteRequest{
 		Destination: opts.Route.Destination.String(),
 		Gateway:     opts.Route.Gateway.String(),
 	}
-	reqBodyData, err := json.Marshal(reqBody)
-	if err != nil {
-		return nil, nil, err
-	}
-	path := fmt.Sprintf("/networks/%d/actions/add_route", network.ID)
-	req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
-	if err != nil {
-		return nil, nil, err
-	}
-	respBody := schema.NetworkActionAddSubnetResponse{}
-	resp, err := c.client.Do(req, &respBody)
+	respBody, resp, err := postRequest[schema.NetworkActionAddRouteResponse](ctx, c.client, reqPath, reqBody)
 	if err != nil {
 		return nil, resp, err
 	}
 	return ActionFromSchema(respBody.Action), resp, nil
 }
@@ -431,26 +384,21 @@ type NetworkDeleteRouteOpts struct {
 // DeleteRoute deletes a route from a network.
 func (c *NetworkClient) DeleteRoute(ctx context.Context, network *Network, opts NetworkDeleteRouteOpts) (*Action, *Response, error) {
+	const opPath = "/networks/%d/actions/delete_route"
+	ctx = ctxutil.SetOpPath(ctx, opPath)
+
+	reqPath := fmt.Sprintf(opPath, network.ID)
+
 	reqBody := schema.NetworkActionDeleteRouteRequest{
 		Destination: opts.Route.Destination.String(),
 		Gateway:     opts.Route.Gateway.String(),
 	}
-	reqBodyData, err := json.Marshal(reqBody)
-	if err != nil {
-		return nil, nil, err
-	}
-	path := fmt.Sprintf("/networks/%d/actions/delete_route", network.ID)
-	req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
-	if err != nil {
-		return nil, nil, err
-	}
-	respBody := schema.NetworkActionDeleteSubnetResponse{}
-	resp, err := c.client.Do(req, &respBody)
+	respBody, resp, err := postRequest[schema.NetworkActionDeleteRouteResponse](ctx, c.client, reqPath, reqBody)
 	if err != nil {
 		return nil, resp, err
 	}
 	return ActionFromSchema(respBody.Action), resp, nil
 }
@@ -461,24 +409,19 @@ type NetworkChangeProtectionOpts struct {
 // ChangeProtection changes the resource protection level of a network.
 func (c *NetworkClient) ChangeProtection(ctx context.Context, network *Network, opts NetworkChangeProtectionOpts) (*Action, *Response, error) {
+	const opPath = "/networks/%d/actions/change_protection"
+	ctx = ctxutil.SetOpPath(ctx, opPath)
+
+	reqPath := fmt.Sprintf(opPath, network.ID)
+
 	reqBody := schema.NetworkActionChangeProtectionRequest{
 		Delete: opts.Delete,
 	}
-	reqBodyData, err := json.Marshal(reqBody)
-	if err != nil {
-		return nil, nil, err
-	}
-	path := fmt.Sprintf("/networks/%d/actions/change_protection", network.ID)
-	req, err := c.client.NewRequest(ctx, "POST", path, bytes.NewReader(reqBodyData))
-	if err != nil {
-		return nil, nil, err
-	}
-	respBody := schema.NetworkActionChangeProtectionResponse{}
-	resp, err := c.client.Do(req, &respBody)
+	respBody, resp, err := postRequest[schema.NetworkActionChangeProtectionResponse](ctx, c.client, reqPath, reqBody)
 	if err != nil {
 		return nil, resp, err
 	}
-	return ActionFromSchema(respBody.Action), resp, err
+	return ActionFromSchema(respBody.Action), resp, nil
 }
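
Each refactored method now also records its unformatted operation path (e.g. "/networks/%d/actions/change_protection") in the request context via ctxutil.SetOpPath before interpolating the resource ID. The ctxutil package is not shown in this diff; the following is a plausible sketch of such a helper, assuming a private context key (an assumption, not the actual implementation):

```go
package ctxutil

import "context"

type opPathKey struct{}

// SetOpPath stores the unformatted operation path in the context. Keeping the
// %d placeholder rather than the formatted path gives metrics and debug output
// a low-cardinality label: one per API operation instead of one per resource ID.
func SetOpPath(ctx context.Context, path string) context.Context {
	return context.WithValue(ctx, opPathKey{}, path)
}

// OpPath returns the path stored by SetOpPath, or "" if none was set.
func OpPath(ctx context.Context) string {
	path, _ := ctx.Value(opPathKey{}).(string)
	return path
}
```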

Some files were not shown because too many files have changed in this diff.