188 Commits

Author SHA1 Message Date
Bartłomiej Wróblewski d4b812e936 Add filtering out DS pods from scale-up, refactor default pod list processor 2023-01-23 17:14:46 +00:00
Kubernetes Prow Robot f507519916
Merge pull request #5423 from yaroslava-serdiuk/sd-sorting
Add scale down candidates observer
2023-01-19 10:14:16 -08:00
Yaroslava Serdiuk 541ce04e4b Add previous scale down candidate sorting 2023-01-19 16:04:50 +00:00
michael mccune 955396e857 remove clusterapi nodegroupset processor
as discussed with the cluster api community[0], the nodegroupset
processor is being removed from the clusterapi provider implementation
in favor of instructing our community on the use of the
--balancing-ignore-label flag. due to the wide variety of provider
infrastructures that clusterapi can be deployed on, we would prefer to
not encode all of these labels in the autoscaler itself. see the linked
recording for more information.

[0] https://www.youtube.com/watch?v=jbhca_9oPuQ
2023-01-12 15:05:37 -05:00
Kubernetes Prow Robot b94f340af5
Merge pull request #5402 from Bryce-Soghigian/bsoghigian/adding-configurable-difference-ratios
adding configurable difference ratios
2023-01-10 04:03:25 -08:00
bsoghigian 0f8ed0b81f Configurable difference ratios 2023-01-09 22:40:16 -08:00
Kubernetes Prow Robot 3785a2f82a
Merge pull request #5223 from grosser/grosser/burst
cluster-autoscaler: allow setting kubernetes client burst and qps to avoid rate limiting
2022-12-30 06:21:30 -08:00
Michael Grosser cd26bcfe60
allow setting kubernetes client burst and qps to avoid rate limiting 2022-12-29 13:54:04 -08:00
Bartłomiej Wróblewski 62c68e1280 Move PredicateChecker initialization before processors initialization 2022-12-27 15:21:41 +00:00
Kubernetes Prow Robot a46a095fe2
Merge pull request #5362 from yasinlachiny/maxnodetotal
set cluster_autoscaler_max_nodes_count dynamically
2022-12-19 00:33:44 -08:00
yasin.lachiny 6d9fed5211 set cluster_autoscaler_max_nodes_count dynamically
Signed-off-by: yasin.lachiny <yasin.lachiny@gmail.com>
2022-12-11 00:18:03 +01:00
Bartłomiej Wróblewski 2e1b04ff69 Add default PodListProcessor wrapper 2022-12-09 16:26:56 +00:00
Yaroslava Serdiuk ae45571af9 Create a Planner object if --parallelDrain=true 2022-12-07 11:36:05 +00:00
Aleksandra Gacek bae587d20c Break node categorization in scale down planner on timeout. 2022-12-05 11:34:53 +01:00
Bartłomiej Wróblewski 10d3f25996 Use scheduling package in filterOutSchedulable processor 2022-11-23 12:32:59 +00:00
Xintong Liu 524886fca5 Support scaling up node groups to the configured min size if needed 2022-11-02 21:47:00 -07:00
Daniel Kłobuszewski 18f2e67c4f Split out code from simulator package 2022-10-18 11:51:44 +02:00
Kubernetes Prow Robot dc73ea9076
Merge pull request #5235 from UiPath/fix_node_delete
Add option to wait for a period of time after node tainting/cordoning
2022-10-17 04:29:07 -07:00
Kubernetes Prow Robot d022e260a1
Merge pull request #4956 from damirda/feature/scale-up-delay-annotations
Add podScaleUpDelay annotation support
2022-10-13 09:29:02 -07:00
Alexandru Matei 0ee2a359e7 Add option to wait for a period of time after node tainting/cordoning
Node state is refreshed and checked again before deleting the node.
This gives kube-scheduler time to acknowledge that the nodes' state has
changed and to stop scheduling pods on them.
2022-10-13 10:37:56 +03:00
Kubernetes Prow Robot b3c6b60e1c
Merge pull request #5060 from yaroslava-serdiuk/deleting-in-batch
Introduce NodeDeleterBatcher to ScaleDown actuator
2022-09-22 10:11:06 -07:00
Yaroslava Serdiuk 65b0d78e6e Introduce NodeDeleterBatcher to ScaleDown actuator 2022-09-22 16:19:45 +00:00
Damir Markovic 11d150e920 Add podScaleUpDelay annotation support 2022-09-05 20:24:19 +02:00
James Ravn 1b98b3823a
Allow balancing by labels exclusively
Adds a new flag `--balance-label` which allows users to balance between
node groups exclusively via labels.

This gives users the flexibility to specify the similarity logic
themselves when --balance-similar-node-groups is in use.
2022-07-06 10:34:18 +01:00
Maciek Pytel ab891418f6 Limit binpacking based on #new_nodes or time
The binpacking algorithm is O(#pending_pods * #new_nodes) and
calculating a very large scale-up can get stuck for minutes or even
hours, leading to CA failing its health check and going down.
The new limiting prevents this scenario by stopping binpacking after
reaching a specified threshold. Any pods that remain pending as a result
of shorter binpacking will be processed in the next autoscaler loop.

The thresholds used can be controlled with newly introduced flags:
--max-nodes-per-scaleup and --max-nodegroup-binpacking-duration. The
limiting can be disabled by setting both flags to 0 (not recommended,
especially for --max-nodegroup-binpacking-duration).
2022-06-20 17:02:51 +02:00
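
The limiting described in the commit above can be illustrated with a minimal sketch. This is not the autoscaler's actual code: it assumes a single CPU dimension and plain first-fit packing, and the names `binpack`, `max_nodes`, and `max_duration_s` are hypothetical stand-ins for the behaviour of `--max-nodes-per-scaleup` and `--max-nodegroup-binpacking-duration`. As in the commit message, a value of 0 disables a limit.

```python
import time


def binpack(pod_cpu_requests, node_cpu, max_nodes=0, max_duration_s=0.0,
            clock=time.monotonic):
    """First-fit binpacking with two optional limits (0 disables a limit).

    Returns (number_of_new_nodes, pods_left_pending). Hypothetical sketch,
    not the cluster-autoscaler estimator.
    """
    start = clock()
    free_cpu = []   # remaining CPU on each simulated new node
    pending = []    # pods deferred to the next autoscaler loop

    for i, request in enumerate(pod_cpu_requests):
        # Time limit: stop packing once the budget is used up.
        if max_duration_s and clock() - start >= max_duration_s:
            pending.extend(pod_cpu_requests[i:])
            break

        # Try to place the pod on an already simulated node (first fit).
        for j, free in enumerate(free_cpu):
            if request <= free:
                free_cpu[j] = free - request
                break
        else:
            if request > node_cpu:
                # The pod can never fit on this node shape.
                pending.append(request)
                continue
            # Node-count limit: stop instead of simulating another node.
            if max_nodes and len(free_cpu) >= max_nodes:
                pending.extend(pod_cpu_requests[i:])
                break
            free_cpu.append(node_cpu - request)

    return len(free_cpu), pending
```

With five pods of 2 CPU each and 4-CPU nodes, `binpack([2, 2, 2, 2, 2], 4)` simulates three nodes, while `binpack([2, 2, 2, 2, 2], 4, max_nodes=2)` stops at two nodes and leaves one pod pending for the next loop.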
Michael McCune 8c27f76933 add a flag to allow event duplication
this change brings in a new command line flag,
`--record-duplicated-events`, which allows a user to enable the
duplication of events bypassing the 5 minute de-duplication window.
2022-06-03 14:26:38 -04:00
Yaroslava Serdiuk d919ce3fbf Define AnnotationNodeInfoProvider processor 2022-06-03 16:12:16 +00:00
Yaroslava Serdiuk 7fe27ddf99 GCE: Add --gce-expander-ephemeral-storage-support flag 2022-06-03 16:12:09 +00:00
Kuba Tużnik 7dc0d4f57c CA: implement Actuator boilerplate + cropping nodes to parallelism budgets 2022-05-27 14:24:10 +02:00
weidongcai 03a0475502 Expose backoff time parameters 2022-05-12 15:34:28 +08:00
Grigoris Thanasoulas 719a53e8d7 cluster-autoscaler: Add --max-pod-eviction-time flag
Add a flag to allow the user to configure MaxPodEvictionTime to values
other than the default 2m. This is needed in cases where a pod takes more
than 2 minutes to be evicted.

Signed-off-by: Grigoris Thanasoulas <gregth@arrikto.com>
2022-04-30 08:52:41 +03:00
Daniel Kłobuszewski e07fd1e130 Move filter_out_schedulable to a separate package 2022-04-26 08:48:45 +02:00
Kubernetes Prow Robot 0123869b7a
Merge pull request #4452 from airbnb/es--grpc-expander-plugin
Add gRPC expander plugin
2022-02-21 06:54:14 -08:00
Evan Sheng 4504f55485 Add grpc expander and tests 2022-02-16 12:34:06 -08:00
Yaroslava Serdiuk a9a7d98f2c Add expire time for nodeInfo cache items 2022-02-09 09:38:32 +00:00
Jayant Jain 729038ff2d Adding support for Debugging Snapshot 2021-12-30 09:08:05 +00:00
ialidzhikov 986d62fb96 Add `--feature-gates` flag to support scale up on volume limits (CSI migration enabled)
Signed-off-by: ialidzhikov <i.alidjikov@gmail.com>
2021-12-19 15:38:17 +02:00
Diego Bonfigli 1b4fcf6bf7 Re-add default expander 2021-12-09 18:27:46 +01:00
Michael McCune 99a242a9e6 add ClusterAPI nodegroupset processor
This allows the ClusterAPI provider to ignore the
`topology.ebs.csi.aws.com/zone` label by adding a custom nodegroupset
processor. It also adds unit tests to exercise the new processor.
2021-11-10 17:01:27 -05:00
Ryan McNamara 068af5bf7e Allow specification of multiple expanders
Multiple expanders can now be specified. Expanders now "filter to the
tied for best" instead of "selecting the best", so the output of one
expander is fed to the input of the next. Each expander may only
be used once, to disallow bad configuration. This should not be a change
in functionality, as in the event of a tie the random expander is still
used.
2021-09-23 14:31:39 -06:00
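
The "filter to the tied for best" chaining described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the autoscaler's implementation: the option dictionaries and the filter functions `least_nodes` and `lowest_price` are hypothetical, and the final random pick stands in for the random expander used on ties.

```python
import random


def least_nodes(options):
    """Hypothetical filter: keep the options tied for the fewest new nodes."""
    best = min(o["nodes"] for o in options)
    return [o for o in options if o["nodes"] == best]


def lowest_price(options):
    """Hypothetical filter: keep the options tied for the lowest price."""
    best = min(o["price"] for o in options)
    return [o for o in options if o["price"] == best]


def choose(options, filters, rng=random):
    """Feed the output of each filter into the next, then break ties randomly."""
    names = [f.__name__ for f in filters]
    if len(set(names)) != len(names):
        # Mirrors the rule that each expander may only be used once.
        raise ValueError("each expander may only be used once")

    for f in filters:
        if len(options) <= 1:
            break  # nothing left to narrow down
        options = f(options)

    # Any remaining tie is resolved by a random pick.
    return rng.choice(options) if options else None
```

For example, with options `a` (2 nodes, price 3), `b` (2 nodes, price 1), and `c` (3 nodes, price 1), chaining `least_nodes` then `lowest_price` narrows the set to `a` and `b`, then to `b` alone.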
Kubernetes Prow Robot 9f84d391f6
Merge pull request #4022 from amrmahdi/amrh/nodegroupminmaxmetrics
[cluster-autoscaler] Publish node group min/max metrics
2021-07-05 07:38:54 -07:00
Daniel Kłobuszewski 081c4664d3 Add a flag to control DaemonSet eviction on non-empty nodes 2021-06-25 11:06:10 +02:00
Amr Hanafi (MAHDI)) f5c2ab7328 Emit the node group metrics behind a flag 2021-05-20 16:49:39 -07:00
Kubernetes Prow Robot 2beea02a29
Merge pull request #3983 from elmiko/cluster-resource-consumption-metrics
Cluster resource consumption metrics
2021-05-13 15:32:04 -07:00
Kubernetes Prow Robot 200415e990
Merge pull request #3940 from mcristina422/patch-1
Release leader election lock on shutdown
2021-05-04 07:21:11 -07:00
Brett Elliott 3b48a3193f Set cluster autoscaler-specific user agent.
Refactored mocks to remove redundancy.
2021-04-06 17:49:35 +02:00
Michael McCune a24ea6c66b add cluster cores and memory bytes count metrics
This change adds 4 metrics that can be used to monitor the minimum and
maximum limits for CPU and memory, as well as the current counts in
cores and bytes, respectively.

The four metrics added are:
* `cluster_autoscaler_cpu_limits_cores`
* `cluster_autoscaler_cluster_cpu_current_cores`
* `cluster_autoscaler_memory_limits_bytes`
* `cluster_autoscaler_cluster_memory_current_bytes`

This change also adds the `max_cores_total` metric to the metrics
proposal doc, as it was previously not recorded there.

User story: As a cluster autoscaler user, I would like to monitor my
cluster through metrics to determine when the cluster is nearing its
limits for cores and memory usage.
2021-04-06 10:35:21 -04:00
Michael Cristina 4cf9a98679
Release leader election lock on shutdown 2021-03-12 12:51:03 -06:00
Eric Mrak and Brett Kochendorfer 43dd34074e Allow name of cluster-autoscaler status ConfigMap to be specified
This allows us to run two instances of cluster-autoscaler in our
cluster, targeting two different types of autoscaling groups that
require different command-line settings to be passed.
2021-02-17 21:52:54 +00:00
Kubernetes Prow Robot b470c62bfa
Merge pull request #3630 from marc-sensenich/configurable-leader-election-resource-lock-name
Allow for the leader election resourcelock to have a configurable name
2021-01-27 04:59:40 -08:00