Commit Graph

33 Commits

Author SHA1 Message Date
Bartłomiej Wróblewski 14655d219f Remove the MaxNodeProvisioningTimeProvider interface 2023-08-05 11:26:40 +00:00
vadasambar eff7888f10 refactor: use `actuatorNodeGroupConfigGetter` param in `NewActuator`
- instead of passing all the processors (we only need `NodeGroupConfigProcessor`)
Signed-off-by: vadasambar <surajrbanakar@gmail.com>
2023-07-06 10:48:58 +05:30
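The refactor in eff7888f10 narrows what `NewActuator` receives: instead of the full processor set, it takes only a getter for node-group configuration. A minimal sketch of that shape, with illustrative names rather than the actual cluster-autoscaler signatures:

```go
package main

import "fmt"

// actuatorNodeGroupConfigGetter is the narrow view the actuator needs:
// per-node-group configuration lookups only, not the whole processor set.
// (Illustrative interface; the real one lives in the cluster-autoscaler tree.)
type actuatorNodeGroupConfigGetter interface {
	GetIgnoreDaemonSetsUtilization(nodeGroup string) (bool, error)
}

// Actuator depends on the small interface rather than on all processors.
type Actuator struct {
	configGetter actuatorNodeGroupConfigGetter
}

// NewActuator now accepts just the config getter.
func NewActuator(configGetter actuatorNodeGroupConfigGetter) *Actuator {
	return &Actuator{configGetter: configGetter}
}

// staticGetter is a stand-in implementation for this sketch.
type staticGetter struct{ ignore bool }

func (s staticGetter) GetIgnoreDaemonSetsUtilization(string) (bool, error) {
	return s.ignore, nil
}

func main() {
	a := NewActuator(staticGetter{ignore: true})
	v, _ := a.configGetter.GetIgnoreDaemonSetsUtilization("my-node-group")
	fmt.Println("ignore DaemonSet utilization:", v)
}
```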
vadasambar 7941bab214 feat: set `IgnoreDaemonSetsUtilization` per nodegroup
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: test cases failing for actuator and scaledown/eligibility
- abstract default values into `config`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: rename global `IgnoreDaemonSetsUtilization` -> `GlobalIgnoreDaemonSetsUtilization` in code
- there is no change in the flag name
- rename `thresholdGetter` -> `configGetter` and tweak it to accommodate `GetIgnoreDaemonSetsUtilization`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: reset help text for `ignore-daemonsets-utilization` flag
- because per nodegroup override is supported only for AWS ASG tags as of now
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: add info about overriding `--ignore-daemonsets-utilization` per ASG
- in AWS cloud provider README
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: use a limiting interface in actuator in place of `NodeGroupConfigProcessor` interface
- to limit the functions that can be used
- since we need it only for `GetIgnoreDaemonSetsUtilization`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: tests failing for actuator
- rename `staticNodeGroupConfigProcessor` -> `MockNodeGroupConfigGetter`
- move `MockNodeGroupConfigGetter` to test/common so that it can be used in different tests
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: go lint errors for `MockNodeGroupConfigGetter`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: add tests for `IgnoreDaemonSetsUtilization` in cloud provider dir
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: update node group config processor tests for `IgnoreDaemonSetsUtilization`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: update eligibility test cases for `IgnoreDaemonSetsUtilization`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: run actuation tests for 2 NGS
- one with `IgnoreDaemonSetsUtilization`: `false`
- one with `IgnoreDaemonSetsUtilization`: `true`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: add tests for `IgnoreDaemonSetsUtilization` in actuator
- add helper to generate multiple ds pods dynamically
- get rid of mock config processor because it is not required
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test: fix failing tests for actuator
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: remove `GlobalIgnoreDaemonSetsUtilization` autoscaling option
- not required
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: warn message `DefaultScaleDownUnreadyTimeKey` -> `DefaultIgnoreDaemonSetsUtilizationKey`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: use `generateDsPods` instead of `generateDsPod`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: `globaIgnoreDaemonSetsUtilization` -> `ignoreDaemonSetsUtilization`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>
2023-07-06 10:31:45 +05:30
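Commit 7941bab214 makes `IgnoreDaemonSetsUtilization` resolvable per node group, falling back to the global `--ignore-daemonsets-utilization` flag when no override is set. A rough sketch of that fallback pattern; the types and field names here are hypothetical stand-ins, not the real `NodeGroupConfigProcessor` code:

```go
package main

import "fmt"

// NodeGroupAutoscalingOptions holds per-node-group overrides; a nil field
// means "not set, fall back to the global default". (Hypothetical shape.)
type NodeGroupAutoscalingOptions struct {
	IgnoreDaemonSetsUtilization *bool
}

// configGetter resolves the option per node group, falling back to the value
// of the global --ignore-daemonsets-utilization flag.
type configGetter struct {
	globalIgnoreDaemonSetsUtilization bool
	perNodeGroup                      map[string]NodeGroupAutoscalingOptions
}

func (c configGetter) GetIgnoreDaemonSetsUtilization(nodeGroup string) bool {
	if opts, ok := c.perNodeGroup[nodeGroup]; ok && opts.IgnoreDaemonSetsUtilization != nil {
		return *opts.IgnoreDaemonSetsUtilization
	}
	return c.globalIgnoreDaemonSetsUtilization
}

func main() {
	override := true
	g := configGetter{
		globalIgnoreDaemonSetsUtilization: false,
		perNodeGroup: map[string]NodeGroupAutoscalingOptions{
			"asg-with-override": {IgnoreDaemonSetsUtilization: &override},
		},
	}
	fmt.Println(g.GetIgnoreDaemonSetsUtilization("asg-with-override")) // true (override)
	fmt.Println(g.GetIgnoreDaemonSetsUtilization("other-asg"))         // false (global default)
}
```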
Daniel Gutowski 5fed449792 Add ClusterStateRegistry to the AutoscalingContext.
Due to the dependency of the MaxNodeProvisionTimeProvider on the context,
the provider was extracted to a dedicated package and injected into the
ClusterStateRegistry after context creation.
2023-07-04 05:00:09 -07:00
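The commit above breaks a dependency cycle by creating the provider separately and wiring it into the `ClusterStateRegistry` only once the context exists. A simplified sketch of that injection order, using stand-in types and a `RegisterProviders` method named for illustration:

```go
package main

import (
	"fmt"
	"time"
)

// Simplified stand-ins for the real cluster-autoscaler types.
type AutoscalingContext struct {
	ClusterStateRegistry *ClusterStateRegistry
}

type MaxNodeProvisionTimeProvider interface {
	GetMaxNodeProvisionTime(nodeGroup string) (time.Duration, error)
}

type ClusterStateRegistry struct {
	provider MaxNodeProvisionTimeProvider
}

// RegisterProviders injects the provider after the context (and registry)
// already exist, breaking the provider -> context -> registry cycle.
func (r *ClusterStateRegistry) RegisterProviders(p MaxNodeProvisionTimeProvider) {
	r.provider = p
}

type staticProvider struct{ d time.Duration }

func (s staticProvider) GetMaxNodeProvisionTime(string) (time.Duration, error) {
	return s.d, nil
}

func main() {
	registry := &ClusterStateRegistry{}
	ctx := &AutoscalingContext{ClusterStateRegistry: registry}

	// The provider is created once the context exists, then injected.
	ctx.ClusterStateRegistry.RegisterProviders(staticProvider{d: 15 * time.Minute})

	d, _ := registry.provider.GetMaxNodeProvisionTime("ng-1")
	fmt.Println("max node provision time:", d)
}
```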
Maria Oparka ca088d26c2 Move MaxNodeProvisionTime to NodeGroupAutoscalingOptions 2023-04-19 11:08:20 +02:00
Bartłomiej Wróblewski d5d0a3c7b7 Fix drain logic when skipNodesWithCustomControllerPods=false, set NodeDeleteOptions correctly 2023-04-04 09:50:26 +00:00
Kubernetes Prow Robot 205293a7ca
Merge pull request #5537 from arrikto/feature-disable-unready-scaledown
cluster-autoscaler: Add option to disable scale down of unready nodes
2023-03-08 07:55:11 -08:00
Grigoris Thanasoulas 6cf8c329da cluster-autoscaler: Add option to disable scale down of unready nodes
Add flag '--scale-down-unready-enabled' to enable or disable scale-down
of unready nodes. Default value set to true for backwards compatibility
(i.e., allow scale-down of unready nodes).

Signed-off-by: Grigoris Thanasoulas <gregth@arrikto.com>
2023-03-06 15:51:10 +02:00
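A minimal illustration of the new flag using the standard library `flag` package rather than the autoscaler's actual flag wiring; only the flag name and its `true` default come from the commit message:

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// Defaults to true so unready nodes keep being scaled down unless the
	// operator explicitly opts out.
	scaleDownUnreadyEnabled := flag.Bool("scale-down-unready-enabled", true,
		"Should CA scale down unready nodes of the cluster")
	flag.Parse()

	if *scaleDownUnreadyEnabled {
		fmt.Println("unready nodes are eligible for scale-down")
	} else {
		fmt.Println("scale-down of unready nodes is disabled")
	}
}
```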
Kubernetes Prow Robot edf8779bda
Merge pull request #5472 from DataDog/scaledown-nodedeletion-metric-fix
Fix scaledown:nodedeletion metric calculation
2023-02-28 07:25:17 -08:00
Bartłomiej Wróblewski 43b459bf84 Track PDBRemainingDisruptions in AutoscalingContext 2023-02-24 12:43:29 +00:00
Bartłomiej Wróblewski b5ead036a8 Merge taint utils into one package, make taint modifying methods public 2023-02-13 11:29:45 +00:00
dom.bozzuto 1150fcd27a Fix scaledown:nodedeletion metric calculation
The scaledown:nodedeletion metric duration was incorrectly being computed relative to the start of the RunOnce routine, instead of from the actual start of the deletion. Work at the start of the routine (like a long cloud provider refresh) would incorrectly skew the nodedeletion duration.

Signed-off-by: Domenic Bozzuto <dom.bozzuto@datadoghq.com>
2023-02-02 12:03:38 -05:00
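The fix amounts to taking the timestamp when the deletion itself begins instead of at the start of `RunOnce`. A small sketch of that timing change; `observeNodeDeletionDuration` is a stand-in for the real metric update:

```go
package main

import (
	"fmt"
	"time"
)

// observeNodeDeletionDuration stands in for the real metric update
// (e.g. a histogram observation in the metrics package).
func observeNodeDeletionDuration(d time.Duration) {
	fmt.Println("scaledown:nodedeletion took", d)
}

func deleteNode(name string) {
	// Measure from the moment the deletion actually starts...
	start := time.Now()

	// ...so earlier work in the loop iteration (such as a slow cloud provider
	// refresh) cannot inflate the reported duration.
	fmt.Println("deleting node", name)
	time.Sleep(50 * time.Millisecond) // placeholder for the real deletion work

	observeNodeDeletionDuration(time.Since(start))
}

func main() {
	// Before the fix, the timestamp was taken at the start of RunOnce, so
	// everything between that and deleteNode() was wrongly counted.
	deleteNode("node-1")
}
```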
Bartłomiej Wróblewski 10d3f25996 Use scheduling package in filterOutSchedulable processor 2022-11-23 12:32:59 +00:00
Bartłomiej Wróblewski 4373c467fe Add ScaleDown.Actuator to AutoscalingContext 2022-11-02 13:12:25 +00:00
Daniel Kłobuszewski 92f5b8673e Extract scheduling hints to a dedicated object
This removes the need for passing maps back and forth when doing
scheduling simulations.
2022-10-20 11:44:15 +02:00
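Commit 92f5b8673e replaces the maps passed back and forth through scheduling simulations with a dedicated hints object. A hedged sketch of such a keyed store, with invented names:

```go
package main

import "fmt"

// HintKey identifies a pod in the simulation; a plain string is enough here.
type HintKey string

// Hints remembers which node a pod fit on during a previous simulation, so a
// later pass can try that node first instead of rescanning every node.
type Hints struct {
	hints map[HintKey]string
}

func NewHints() *Hints { return &Hints{hints: map[HintKey]string{}} }

func (h *Hints) Set(pod HintKey, node string) { h.hints[pod] = node }

func (h *Hints) Get(pod HintKey) (string, bool) {
	node, ok := h.hints[pod]
	return node, ok
}

func main() {
	h := NewHints()
	h.Set("default/web-1", "node-a")

	// A later simulation pass consults the shared object instead of receiving
	// a raw map as a parameter and returning an updated copy.
	if node, ok := h.Get("default/web-1"); ok {
		fmt.Println("try scheduling on", node, "first")
	}
}
```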
Daniel Kłobuszewski 18f2e67c4f Split out code from simulator package 2022-10-18 11:51:44 +02:00
Daniel Kłobuszewski 95fd1ed645 Remove ScaleDown dependency on clusterStateRegistry 2022-10-17 21:11:44 +02:00
Kubernetes Prow Robot f445a6a887
Merge pull request #5147 from x13n/scaledown4
Extract criteria for removing unneeded nodes to a separate package
2022-10-17 11:51:20 -07:00
Alexandru Matei 0ee2a359e7 Add option to wait for a period of time after node tainting/cordoning
Node state is refreshed and checked again before deleting the node.
This gives kube-scheduler time to acknowledge that the nodes' state has
changed and to stop scheduling pods on them.
2022-10-13 10:37:56 +03:00
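The sequence the commit describes is: taint/cordon, wait for the configured period, refresh and re-check the node, and only then delete. A schematic sketch with placeholder types; the actual option name and API calls are not shown here:

```go
package main

import (
	"fmt"
	"time"
)

type node struct {
	name    string
	tainted bool
}

// refreshNode stands in for re-reading the node from the API server.
func refreshNode(n *node) *node { return n }

// deleteAfterTaintDelay taints the node, waits so kube-scheduler can observe
// the taint and stop placing pods, then re-checks state before deleting.
func deleteAfterTaintDelay(n *node, delay time.Duration) error {
	n.tainted = true // cordon/taint first

	time.Sleep(delay) // configurable wait introduced by this change

	fresh := refreshNode(n)
	if !fresh.tainted {
		return fmt.Errorf("node %s lost its taint, aborting deletion", fresh.name)
	}

	fmt.Println("deleting node", fresh.name)
	return nil
}

func main() {
	_ = deleteAfterTaintDelay(&node{name: "node-1"}, 100*time.Millisecond)
}
```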
Daniel Kłobuszewski 3a3ec38a52 Extract criteria for removing unneeded nodes to a separate package 2022-09-26 16:49:04 +02:00
Kubernetes Prow Robot 70efe28f8a
Merge pull request #5133 from x13n/scaledown3
Stop treating masters differently in scale down
2022-09-23 11:48:05 -07:00
Kubernetes Prow Robot b3c6b60e1c
Merge pull request #5060 from yaroslava-serdiuk/deleting-in-batch
Introduce NodeDeleterBatcher to ScaleDown actuator
2022-09-22 10:11:06 -07:00
Yaroslava Serdiuk 65b0d78e6e Introduce NodeDeleterBatcher to ScaleDown actuator 2022-09-22 16:19:45 +00:00
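A rough sketch of the batching idea behind `NodeDeleterBatcher`: accumulate nodes per node group and delete each group's batch together. This illustrates the concept only and is not the actual implementation:

```go
package main

import "fmt"

// deleteNodes stands in for the cloud provider call that removes a whole
// batch of nodes from one node group.
func deleteNodes(nodeGroup string, nodes []string) {
	fmt.Printf("deleting %v from %s in one call\n", nodes, nodeGroup)
}

// NodeDeleterBatcher accumulates nodes per node group and flushes them together.
type NodeDeleterBatcher struct {
	pending map[string][]string
}

func NewNodeDeleterBatcher() *NodeDeleterBatcher {
	return &NodeDeleterBatcher{pending: map[string][]string{}}
}

func (b *NodeDeleterBatcher) AddNode(nodeGroup, node string) {
	b.pending[nodeGroup] = append(b.pending[nodeGroup], node)
}

func (b *NodeDeleterBatcher) Flush() {
	for group, nodes := range b.pending {
		deleteNodes(group, nodes)
	}
	b.pending = map[string][]string{}
}

func main() {
	b := NewNodeDeleterBatcher()
	b.AddNode("ng-1", "node-a")
	b.AddNode("ng-1", "node-b")
	b.AddNode("ng-2", "node-c")
	b.Flush()
}
```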
Daniel Kłobuszewski 540ff4ee05 Stop treating masters differently in scale down
This filtering was used for two purposes:
- Excluding masters from destination candidates
- Excluding masters from calculating cluster resources

Excluding from destination candidates isn't useful: if pods can schedule
there, they will, so removing them from CA simulation doesn't change
anything.
Excluding from calculating cluster resources actually matches scale up
behavior, where master nodes are treated the same way as regular nodes.
2022-09-16 12:54:33 +02:00
Daniel Kłobuszewski 6419abf155 Move resource limits checking to a separate package 2022-09-15 16:18:57 +02:00
Daniel Kłobuszewski 1284ecd718 Extract checks for scale down eligibility 2022-09-01 15:16:56 +02:00
Kuba Tużnik 6bd2432894 CA: switch legacy ScaleDown to use the new Actuator
NodeDeletionTracker is now incremented asynchronously
for drained nodes, instead of synchronously. This shouldn't
change anything in actual behavior, but some tests
depended on that, so they had to be adapted.

The switch aims to mostly be a semantic no-op, with
the following exceptions:
* Nodes that fail to be tainted won't be included in
  NodeDeleteResults, since they are now tainted
  synchronously.
2022-05-27 15:13:44 +02:00
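The asynchronous tracking mentioned above can be pictured as the drain path running in its own goroutine and updating the tracker from there. A simplified sketch; only the `NodeDeletionTracker` name comes from the commit, the method and fields are illustrative:

```go
package main

import (
	"fmt"
	"sync"
)

// NodeDeletionTracker counts in-flight deletions; this sketch models only the
// counter, not the full ActuationStatus interface.
type NodeDeletionTracker struct {
	mu      sync.Mutex
	drained int
}

func (t *NodeDeletionTracker) StartDeletionWithDrain() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.drained++
}

func main() {
	tracker := &NodeDeletionTracker{}
	var wg sync.WaitGroup

	// With the new Actuator, draining happens in a goroutine per node, so the
	// tracker is updated asynchronously rather than inline in the caller.
	for _, node := range []string{"node-a", "node-b"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			tracker.StartDeletionWithDrain()
			fmt.Println("draining and deleting", name)
		}(node)
	}
	wg.Wait()
	fmt.Println("drained deletions tracked:", tracker.drained)
}
```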
Kuba Tużnik cda459b19c CA: Extract delay logic out of legacy scale-down 2022-05-26 16:55:59 +02:00
Kuba Tużnik 6a1ab52de7 CA: Extract drain logic out of legacy scale-down
Function signatures are simplified to take the whole
*AutoscalingContext object instead of its various
individual fields.
2022-05-26 16:55:59 +02:00
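A before/after sketch of the signature simplification described above, using trimmed-down stand-ins for the context fields:

```go
package main

import "fmt"

// Trimmed-down stand-ins for the real context fields.
type ListerRegistry struct{}
type AutoscalingOptions struct{ MaxGracefulTerminationSec int }

type AutoscalingContext struct {
	Listers ListerRegistry
	Options AutoscalingOptions
}

// Before: every field the drain code needed was threaded through explicitly.
func drainNodeOld(listers ListerRegistry, maxGracefulTerminationSec int, node string) {
	fmt.Println("draining", node, "with grace period", maxGracefulTerminationSec)
}

// After: the whole context is passed, so adding a dependency later does not
// ripple through every call site's signature.
func drainNode(ctx *AutoscalingContext, node string) {
	fmt.Println("draining", node, "with grace period", ctx.Options.MaxGracefulTerminationSec)
}

func main() {
	ctx := &AutoscalingContext{Options: AutoscalingOptions{MaxGracefulTerminationSec: 600}}
	drainNodeOld(ctx.Listers, ctx.Options.MaxGracefulTerminationSec, "node-1")
	drainNode(ctx, "node-1")
}
```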
Daniel Kłobuszewski b0cd570b04 Move handling of unremovable nodes to dedicated object 2022-05-24 16:24:10 +02:00
Daniel Kłobuszewski c550b77020 Make NodeDeletionTracker implement ActuationStatus interface 2022-04-28 17:08:10 +02:00
Daniel Kłobuszewski 5a78f49bc2 Move soft tainting logic to a separate package 2022-04-26 08:48:45 +02:00
Daniel Kłobuszewski 7686a1f326 Move existing ScaleDown code to a separate package 2022-04-26 08:48:45 +02:00