Maksym Fuhol
6cbf801235
Patch TestCleaningSoftTaintsInScaleDown to be compatible with new isScaleDownInCooldown signature.
2025-04-15 10:02:44 +00:00
Kubernetes Prow Robot
18f10c1e00
Merge pull request #7997 from damikag/scale-down-slo-update-metric
...
Emit scale down metric even when there are no scale down candidates.
2025-04-14 13:13:05 -07:00
Kubernetes Prow Robot
25ad4c2c26
Merge pull request #8011 from jinglinliang/allow-third-party-sts-to-drain
...
Allow draining when StatefulSet kind has custom API Group
2025-04-11 13:04:41 -07:00
jinglinliang
25af21c515
Add unit test to allow draining when StatefulSet kind has custom API Group
2025-04-09 14:03:00 -07:00
jinglinliang
cc3a9f5d10
Allow draining when StatefulSet kind has custom API Group
2025-04-09 14:03:00 -07:00
Kubernetes Prow Robot
87a67e3aa0
Merge pull request #7995 from abdelrahman882/cleaningSoftTaintsTesting
...
Add unit test for cleaning deletion soft taint in scale down cool down
2025-04-09 10:48:39 -07:00
Omran
dd125d4ef1
Add unit test for cleaning deletion soft taints in scale down cool down
2025-04-09 08:21:49 +00:00
Daniel Kłobuszewski
f1a44d89cf
Remove outdated GCE cloudprovider owners
2025-04-08 13:24:20 +02:00
Kubernetes Prow Robot
4bc861d097
Merge pull request #7923 from Uladzislau97/nap-resilience
...
Improve resilience of diskTypes requests.
2025-04-08 04:22:40 -07:00
Kubernetes Prow Robot
7c28f52f93
Merge pull request #7854 from AppliedIntuition/master
...
Fix 2 bugs in the OCI integration
2025-04-07 09:14:42 -07:00
Vlad Vasilyeu
93e21d05e2
Replace diskTypes.aggregatedList request with diskTypes.list in FetchAvailableDiskTypes.
2025-04-07 07:50:29 +00:00
Kubernetes Prow Robot
1de2160986
Merge pull request #7908 from Preisschild/fix/capi-patch-instead-update
...
CA: Use Patch to Scale clusterapi nodepools
2025-04-03 07:16:48 -07:00
Kubernetes Prow Robot
dc91330f6a
Merge pull request #7989 from loick111/feature/clusterapi-instances-status
...
ClusterAPI: Report machine phases to improve cluster-autoscaler decisions
2025-04-01 07:44:38 -07:00
Florian Ströger
ecb572a945
Use Patch to Scale clusterapi nodepools to avoid modification conflicts
...
Issue: https://github.com/kubernetes/autoscaler/issues/7872
Signed-off-by: Florian Ströger <stroeger@youniqx.com>
2025-04-01 08:26:45 +02:00
Damika Gamlath
49b271f75a
Emit scale down metric even when there are no scale down candidates.
...
Update the scaleDownInCooldown definition so that having zero candidates is no longer treated as a reason to be in the scaleDownInCooldown state
2025-03-31 14:46:23 +00:00
Loick MAHIEUX
005a42b9af
feat(cluster-autoscaler): improve nodes listing in ClusterAPI provider
...
Add improved error handling for machine phases in the ClusterAPI node group
implementation. When a machine is in the Deleting, Failed, or Pending phase, mark the
cloudprovider.Instance with a status for cluster-autoscaler recovery actions.
The changes:
- Enhance Nodes listing to allow reporting the machine phase in Instance status
- Add error status reporting for failed machines
This change helps identify and manage failed machines more effectively,
allowing the autoscaler to make better scaling decisions.
2025-03-28 15:07:34 +01:00
Kubernetes Prow Robot
db597b1acd
Merge pull request #7966 from pmendelski/htnap-events-for-tpu
...
Emit event on successful async scale-up
2025-03-27 02:32:34 -07:00
Kubernetes Prow Robot
7b6996469b
Merge pull request #7973 from jincong8973/master
...
feat: add ignoreDaemonSetsUtilization and zeroOrMaxNodeScaling to NodeGroupAutoscalingOptions
2025-03-27 00:00:35 -07:00
KrJin
e713b51bd6
feat: add missing fields zeroOrMaxNodeScaling and ignoreDaemonSetsUtilization to NodeGroupAutoscalingOptions
...
[squashed] Add fields IgnoreDaemonSetsUtilization and zeroOrMaxNodeScaling that were missing from the externalgrpc proto
2025-03-27 11:28:12 +08:00
Kubernetes Prow Robot
2ca5b44652
Merge pull request #7977 from elmiko/refactor-findscalableproviderids
...
refactor findScalableResourceProviderIDs in clusterapi
2025-03-26 10:22:43 -07:00
elmiko
5e1fc195a3
refactor findScalableResourceProviderIDs in clusterapi
...
this change refactors the function so that each distinct machine
state can be filtered more easily. the unit tests have been
supplemented, but not changed, to ensure that the functionality continues
to work as expected. these changes are to help better detect edge cases
where machines can be transitioning through the pending phase and might be
removed by the autoscaler.
2025-03-26 12:41:09 -04:00
mendelski
0c522556c5
Emit event on successful async scale-up
2025-03-26 13:11:03 +00:00
Kubernetes Prow Robot
63309979ba
Merge pull request #7826 from Azure/rakechill/update-skewer-version-master
...
Update skewer version to v0.0.19 (master)
2025-03-26 01:30:34 -07:00
Kubernetes Prow Robot
e95e35c94e
Merge pull request #7965 from DigitalVeer/master
...
pricing changes: updated z3 pricing information
2025-03-25 10:48:33 -07:00
Kubernetes Prow Robot
52cd68a498
Merge pull request #7954 from abdelrahman882/FixScaledownCoolDown
...
Fix cool down status condition to trigger scale down
2025-03-24 07:38:33 -07:00
Omran
696af986ed
Add time based drainability rule for non-pdb-assigned system pods
2025-03-24 12:47:16 +00:00
Veer Singh
a226478f53
pricing changes: updated z3 pricing information
2025-03-24 04:06:26 +00:00
eric-higgins-ai
8da9a7b4af
add log messages
2025-03-21 14:02:10 -07:00
eric-higgins-ai
370c8eb78e
Revert "Address comment"
...
This reverts commit 233d5c6e4d.
2025-03-21 13:58:56 -07:00
Omran
2bbe859154
Fix cool down status condition to trigger scale down
2025-03-21 10:21:00 +00:00
Kubernetes Prow Robot
990ab04d85
Merge pull request #7949 from ystryuchkov/master
...
Fix log for node filtering in static autoscaler
2025-03-21 01:58:31 -07:00
Kubernetes Prow Robot
10bb546f9e
Merge pull request #7944 from norbertcyran/proactive-scale-up-sample-scheduled
...
Allow using scheduled pods as samples in proactive scale up
2025-03-20 07:06:33 -07:00
Yahia Badr
5268053d1e
Update default value for scaleDownDelayAfterDelete (#7957)
...
* Update default value for scaleDownDelayAfterDelete
Setting the default value for scaleDownDelayAfterDelete to be scanInterval
instead of 0.
* Revert the change and fix the flag description
2025-03-20 07:04:32 -07:00
Jack Francis
7b5e10156e
s/nodeHasValidProviderID/isProviderIDNormalized
...
Signed-off-by: Jack Francis <jackfrancis@gmail.com>
2025-03-19 12:30:33 -07:00
Jack Francis
4aa465764c
capi: node and provider ID accounting funcs
...
Signed-off-by: Jack Francis <jackfrancis@gmail.com>
2025-03-19 11:40:19 -07:00
elmiko
71d3595cb7
improve failed machine detection in clusterapi
...
This change makes it so that when a failed machine is found during
`findScalableResourceProviderIDs`, it will always gain a normalized
provider ID with a failure guard prepended. This is to ensure that
machines which have gained a provider ID from the infrastructure and
then later go into a failed state can be properly removed by the
autoscaler when it wants to correct the size of a node group.
2025-03-19 12:34:29 -04:00
Yuriy Stryuchkov
105429c31e
Fix log for node filtering in static autoscaler
...
Add missing tests
2025-03-19 15:49:34 +01:00
Norbert Cyran
9a5e3d9f3d
Allow using scheduled pods as samples in proactive scale up
2025-03-19 12:33:39 +01:00
elmiko
003e6cd67c
make DecreaseTargetSize more accurate for clusterapi
...
this change ensures that when DecreaseTargetSize counts the nodes,
it does not include any instances which are considered to be
pending (i.e. not having a node ref), deleting, or failed. this change will
allow the core autoscaler to then decrease the size of the node group
accordingly, instead of raising an error.
This change also adds some code to the unit tests to make detection of
this condition easier.
2025-03-17 19:34:07 -04:00
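The counting rule described in this commit can be sketched as follows. This is a simplified model and not the clusterapi provider's implementation: the `machine` type, its fields, and both functions are hypothetical, and the only behavior taken from the commit message is that pending (no node ref), deleting, and failed instances are left out of the node count.

```go
package main

import "fmt"

// machine is a hypothetical, reduced view of a machine in a node group.
type machine struct {
	name       string
	hasNodeRef bool // false means the machine is still pending
	deleting   bool
	failed     bool
}

// countRegisteredNodes counts only machines that back a usable node.
func countRegisteredNodes(machines []machine) int {
	n := 0
	for _, m := range machines {
		if m.hasNodeRef && !m.deleting && !m.failed {
			n++
		}
	}
	return n
}

// decreaseTargetSize lowers the target size by a negative delta, refusing
// only when that would require removing nodes that are actually registered.
func decreaseTargetSize(targetSize, delta int, machines []machine) (int, error) {
	if delta >= 0 {
		return targetSize, fmt.Errorf("size decrease must be negative, got %d", delta)
	}
	registered := countRegisteredNodes(machines)
	newSize := targetSize + delta
	if newSize < registered {
		return targetSize, fmt.Errorf("new size %d is below %d registered nodes", newSize, registered)
	}
	return newSize, nil
}

func main() {
	machines := []machine{
		{name: "m1", hasNodeRef: true},
		{name: "m2", hasNodeRef: true},
		{name: "m3", hasNodeRef: true},
		{name: "m4"}, // pending: no node ref yet
		{name: "m5", hasNodeRef: true, failed: true},
	}
	size, err := decreaseTargetSize(5, -2, machines)
	fmt.Println(size, err)
}
```

If the pending and failed machines were counted as nodes, the same call would be rejected, because the new size of 3 would fall below a count of 5.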
Kubernetes Prow Robot
214215f320
Merge pull request #7918 from x13n/master
...
Fix incorrect usage of klog Warningf function
2025-03-13 06:11:48 -07:00
Daniel Kłobuszewski
bac35046fb
Fix incorrect usage of klog Warningf function
...
The .*f variants should only ever be called with arguments to format.
This should've really been a part of
https://github.com/kubernetes/autoscaler/pull/7917
2025-03-13 13:50:39 +01:00
Kubernetes Prow Robot
bcbc466e4d
Merge pull request #7917 from x13n/master
...
Fix incorrect usage of klog .*f functions
2025-03-13 05:45:47 -07:00
Daniel Kłobuszewski
780e68f6d2
Fix incorrect usage of klog .*f functions
...
The .*f variants should only ever be called with arguments to format.
2025-03-13 13:24:52 +01:00
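Why the `.*f` variants should only be called with arguments to format can be shown with the standard library alone. This is a minimal sketch: `render` is a hypothetical helper that uses `fmt.Sprintf` as a stand-in for a printf-style logging call, and it takes its arguments as a slice only to keep the example compact.

```go
package main

import "fmt"

// render mimics how a printf-style function treats its first argument:
// it is always interpreted as a format string.
func render(format string, args []any) string {
	return fmt.Sprintf(format, args...)
}

func main() {
	// A message that happens to contain a percent sign.
	msg := "node pool at 100% capacity"

	// Misuse: the message itself is used as the format string, so the
	// text after '%' is parsed as a formatting verb with no argument.
	mangled := render(msg, nil)

	// Correct: a constant format string, with the message as an argument.
	verbatim := render("%s", []any{msg})

	fmt.Println(mangled == msg)  // false: the output was altered
	fmt.Println(verbatim == msg) // true: the message is printed as-is
}
```

When there is nothing to format, the non-`f` variant of the logging call is the simpler choice.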
Joel Smith
bef1f89a76
Update to golang.org/x/oauth2@v0.27 to fix CVE-2025-22868
...
Signed-off-by: Joel Smith <joelsmith@redhat.com>
2025-03-11 16:56:12 -06:00
Yahia Naguib
241ad7af1e
update address description
2025-03-10 14:25:44 +00:00
Yahia Naguib
738d7dd16d
Migrating flags off main.go to a separate package
2025-03-07 21:11:30 +00:00
Yahia Naguib
57519980c4
Migrating flags off main.go to a separate package
2025-03-07 21:11:29 +00:00
Yahia Naguib
3e9d11b732
Migrating flags off main.go to a separate package
2025-03-07 21:11:27 +00:00
Kubernetes Prow Robot
173a4bde19
Merge pull request #7897 from mtrqq/bug/block-until-resource-caches-are-synced
...
Block cluster autoscaler until API resource caches are synced.
2025-03-07 06:19:45 -08:00
Maksym Fuhol
24f68f98e2
Block cluster autoscaler until API resource caches are synced.
2025-03-07 13:48:26 +00:00