Commit Graph

300 Commits

Author SHA1 Message Date
Bartłomiej Wróblewski b5ead036a8 Merge taint utils into one package, make taint modifying methods public 2023-02-13 11:29:45 +00:00
Kuba Tużnik 7e6762535b CA: stop passing registered upcoming nodes as scale-down candidates
Without this, with aggressive settings, scale-down could be removing
registered upcoming nodes before they have a chance to become ready
(the duration of which should be unrelated to the scale-down settings).
2023-02-10 14:46:19 +01:00
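A minimal Go sketch of the idea behind this change, with made-up types and names (the real candidate selection in cluster-autoscaler is more involved): nodes that are registered but still upcoming are skipped when building the scale-down candidate list.

```go
package main

import "fmt"

// nodeState is a simplified stand-in for what cluster-autoscaler tracks
// per node; only the "upcoming" flag matters for this illustration.
type nodeState struct {
	name     string
	upcoming bool // registered but not Ready yet
}

// filterScaleDownCandidates drops registered-but-upcoming nodes from the
// candidate list, so aggressive scale-down settings can't remove them
// before they ever become ready.
func filterScaleDownCandidates(nodes []nodeState) []string {
	var candidates []string
	for _, n := range nodes {
		if n.upcoming {
			continue
		}
		candidates = append(candidates, n.name)
	}
	return candidates
}

func main() {
	nodes := []nodeState{
		{name: "node-ready", upcoming: false},
		{name: "node-booting", upcoming: true},
	}
	fmt.Println(filterScaleDownCandidates(nodes)) // [node-ready]
}
```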
dom.bozzuto 1150fcd27a Fix scaledown:nodedeletion metric calculation
The scaledown:nodedeletion metric duration was incorrectly computed relative to the start of the RunOnce routine instead of the actual start of the deletion. Work done early in the routine (like a long cloud provider refresh) would incorrectly skew the node-deletion duration.

Signed-off-by: Domenic Bozzuto <dom.bozzuto@datadoghq.com>
2023-02-02 12:03:38 -05:00
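A hedged Go sketch of the timing fix described above, using hypothetical helper names: the metric observation is taken from the moment deletion starts, not from the start of the whole iteration.

```go
package main

import (
	"fmt"
	"time"
)

// recordNodeDeletionDuration illustrates the fix: the duration observed
// for the node-deletion metric is measured from the moment deletion
// actually starts, not from the start of the autoscaler iteration (which
// may include a slow cloud provider refresh before any deletion happens).
func recordNodeDeletionDuration(observe func(seconds float64), deleteNode func() error) error {
	start := time.Now() // start the clock only when deletion begins
	err := deleteNode()
	observe(time.Since(start).Seconds())
	return err
}

func main() {
	_ = recordNodeDeletionDuration(
		func(s float64) { fmt.Printf("nodedeletion duration: %.3fs\n", s) },
		func() error { time.Sleep(10 * time.Millisecond); return nil },
	)
}
```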
Yaroslava Serdiuk 97159df69b Add scale down candidates observer 2023-01-19 16:04:42 +00:00
yasin.lachiny 7a1668ef12 update prometheus metric min maxNodesCount and a.MaxNodesTotal
Signed-off-by: yasin.lachiny <yasin.lachiny@gmail.com>
2022-12-14 20:51:26 +01:00
yasin.lachiny 6d9fed5211 set cluster_autoscaler_max_nodes_count dynamically
Signed-off-by: yasin.lachiny <yasin.lachiny@gmail.com>
2022-12-11 00:18:03 +01:00
Yaroslava Serdiuk ae45571af9 Create a Planner object if --parallelDrain=true 2022-12-07 11:36:05 +00:00
Xintong Liu 524886fca5 Support scaling up node groups to the configured min size if needed 2022-11-02 21:47:00 -07:00
Bartłomiej Wróblewski 4373c467fe Add ScaleDown.Actuator to AutoscalingContext 2022-11-02 13:12:25 +00:00
Daniel Kłobuszewski 18f2e67c4f Split out code from simulator package 2022-10-18 11:51:44 +02:00
Daniel Kłobuszewski 95fd1ed645 Remove ScaleDown dependency on clusterStateRegistry 2022-10-17 21:11:44 +02:00
Kubernetes Prow Robot dc73ea9076
Merge pull request #5235 from UiPath/fix_node_delete
Add option to wait for a period of time after node tainting/cordoning
2022-10-17 04:29:07 -07:00
Kubernetes Prow Robot d022e260a1
Merge pull request #4956 from damirda/feature/scale-up-delay-annotations
Add podScaleUpDelay annotation support
2022-10-13 09:29:02 -07:00
Alexandru Matei 0ee2a359e7 Add option to wait for a period of time after node tainting/cordoning
Node state is refreshed and checked again before deleting the node.
It gives kube-scheduler time to acknowledge that the nodes' state has
changed and to stop scheduling pods on them.
2022-10-13 10:37:56 +03:00
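A rough Go sketch of the flow this option introduces, with illustrative names and callbacks standing in for the real taint/refresh/delete plumbing: taint the node, wait a configurable period, re-read its state, and delete only if it still looks as expected.

```go
package main

import (
	"fmt"
	"time"
)

// node is a simplified stand-in; in cluster-autoscaler this would be a
// Node object re-read from the lister.
type node struct {
	name    string
	tainted bool
}

// deleteAfterTaintDelay taints the node, waits so kube-scheduler can
// observe the change, then refreshes the node state and only proceeds
// with deletion if the node is still tainted.
func deleteAfterTaintDelay(taint func() error, refresh func() (node, error), del func() error, delay time.Duration) error {
	if err := taint(); err != nil {
		return err
	}
	time.Sleep(delay) // give the scheduler time to stop placing pods here

	n, err := refresh()
	if err != nil {
		return err
	}
	if !n.tainted {
		return fmt.Errorf("node %s is no longer tainted, aborting deletion", n.name)
	}
	return del()
}

func main() {
	err := deleteAfterTaintDelay(
		func() error { fmt.Println("tainting node"); return nil },
		func() (node, error) { return node{name: "node-a", tainted: true}, nil },
		func() error { fmt.Println("deleting node"); return nil },
		50*time.Millisecond,
	)
	fmt.Println("result:", err)
}
```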
Kubernetes Prow Robot b3c6b60e1c
Merge pull request #5060 from yaroslava-serdiuk/deleting-in-batch
Introduce NodeDeleterBatcher to ScaleDown actuator
2022-09-22 10:11:06 -07:00
Yaroslava Serdiuk 65b0d78e6e Introduce NodeDeleterBatcher to ScaleDown actuator 2022-09-22 16:19:45 +00:00
Clint Fooken 6edb3f26b8 Modifying taint removal logic on startup to consider all nodes instead of ready nodes. 2022-09-19 11:37:38 -07:00
Damir Markovic 11d150e920 Add podScaleUpDelay annotation support 2022-09-05 20:24:19 +02:00
Daniel Kłobuszewski 66bfe55077
Revert "Adding support for identifying nodes that have been deleted from cloud provider that are still registered within Kubernetes" 2022-07-13 10:08:03 +02:00
Kubernetes Prow Robot af5fb0722b
Merge pull request #4896 from fookenc/master
Adding support for identifying nodes that have been deleted from cloud provider that are still registered within Kubernetes
2022-07-04 05:13:24 -07:00
Benjamin Pineau a726944273 Don't deref nil nodegroup in deleteCreatedNodesWithErrors
Various cloud providers' `NodeGroupForNode()` implementations (including
aws, azure, and gce) can return a `nil` error _and_ a `nil` nodegroup.
E.g. we're seeing AWS return that on failed upscales on live clusters.
Checking that `deleteCreatedNodesWithErrors` doesn't return an error is
not enough to safely dereference the nodegroup (as returned by
`NodeGroupForNode()`) by calling nodegroup.Id().

In that situation, logging and returning early seems the safest option,
to give various caches (eg. clusterstateregistry's and cloud provider's)
the opportunity to eventually converge.
2022-05-30 18:47:14 +02:00
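A minimal Go sketch of the guard described above, using a stand-in interface rather than the real cloudprovider API: both the error and the node group must be checked before calling Id().

```go
package main

import (
	"fmt"
	"log"
)

// NodeGroup mirrors the minimal part of the cloud provider interface
// relevant here; names are illustrative, not the real API.
type NodeGroup interface {
	Id() string
}

// nodeGroupForNode stands in for a cloud provider's NodeGroupForNode():
// per the commit message, implementations may return a nil error AND a
// nil node group (e.g. AWS on failed upscales).
func nodeGroupForNode(name string) (NodeGroup, error) {
	return nil, nil // simulate the problematic case
}

func handleCreatedNodeWithErrors(nodeName string) {
	ng, err := nodeGroupForNode(nodeName)
	if err != nil {
		log.Printf("failed to look up node group for %s: %v", nodeName, err)
		return
	}
	// Checking the error alone is not enough: guard against a nil node
	// group before dereferencing it, and return early so the various
	// caches get a chance to converge on a later loop.
	if ng == nil {
		log.Printf("no node group found for %s, skipping", nodeName)
		return
	}
	fmt.Println("deleting failed node from group", ng.Id())
}

func main() {
	handleCreatedNodeWithErrors("ip-10-0-0-1")
}
```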
Kuba Tużnik 6bd2432894 CA: switch legacy ScaleDown to use the new Actuator
NodeDeletionTracker is now incremented asynchronously
for drained nodes, instead of synchronously. This shouldn't
change anything in actual behavior, but some tests
depended on that, so they had to be adapted.

The switch aims to mostly be a semantic no-op, with
the following exception:
* Nodes that fail to be tainted won't be included in
  NodeDeleteResults, since they are now tainted
  synchronously.
2022-05-27 15:13:44 +02:00
Kuba Tużnik bf89c74572 CA: Extract deletion utils out of legacy scale-down
Function signatures are simplified to take the whole
*AutoscalingContext object instead of its individual
fields.
2022-05-26 16:55:59 +02:00
Clint Fooken a278255519 Adding support for identifying nodes that have been deleted from cloud provider that are still registered within Kubernetes. Including code changes first introduced in PR#4211, which will remove taints from all nodes on restarts. 2022-05-17 12:37:42 -07:00
Daniel Kłobuszewski d0f8cc7806 Move the condition for ScaleDownInProgress to legacy scaledown code 2022-05-04 09:24:10 +02:00
Daniel Kłobuszewski c550b77020 Make NodeDeletionTracker implement ActuationStatus interface 2022-04-28 17:08:10 +02:00
Daniel Kłobuszewski 7f8b2da9e3 Separate ScaleDown logic with a new interface 2022-04-26 08:48:45 +02:00
Daniel Kłobuszewski 5a78f49bc2 Move soft tainting logic to a separate package 2022-04-26 08:48:45 +02:00
Daniel Kłobuszewski 7686a1f326 Move existing ScaleDown code to a separate package 2022-04-26 08:48:45 +02:00
Daniel Kłobuszewski a55135fb47 Stop referencing unneededNodes in static_autoscaler 2022-04-26 08:48:45 +02:00
Daniel Kłobuszewski 627284bdae Remove direct access to ScaleDown fields 2022-04-26 08:48:45 +02:00
Yaroslava Serdiuk 8a7b99c7eb Continue CA loop when unregistered nodes were removed 2022-04-12 07:49:42 +00:00
Kubernetes Prow Robot b64d2949a5
Merge pull request #4633 from jayantjain93/debugging-snapshot-1
CA: Debugging snapshot adding a new field for TemplateNode.
2022-01-27 03:02:25 -08:00
Daniel Kłobuszewski 9944137fae Don't cache NodeInfo for recently Ready nodes
There's a race condition between DaemonSet pods getting scheduled to a
new node and Cluster Autoscaler caching that node for the sake of
predicting future nodes in a given node group. We can reduce the risk of
missing some DaemonSet pods by providing a grace period before accepting nodes
into the cache. 1 minute should be more than enough, except for some pathological
edge cases.
2022-01-26 20:18:53 +01:00
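A small Go sketch of the grace-period check, assuming a hypothetical readySince timestamp taken from the node's Ready condition; the real caching logic lives elsewhere in cluster-autoscaler.

```go
package main

import (
	"fmt"
	"time"
)

// node is an illustrative structure; readySince would come from the
// node's Ready condition transition time.
type node struct {
	name       string
	readySince time.Time
}

// nodeInfoCacheGracePeriod mirrors the idea from the commit: don't cache
// a recently-Ready node, so DaemonSet pods have time to get scheduled on
// it before it is used as a template for future nodes.
const nodeInfoCacheGracePeriod = 1 * time.Minute

func shouldCacheNodeInfo(n node, now time.Time) bool {
	return now.Sub(n.readySince) >= nodeInfoCacheGracePeriod
}

func main() {
	now := time.Now()
	fresh := node{name: "node-a", readySince: now.Add(-20 * time.Second)}
	settled := node{name: "node-b", readySince: now.Add(-5 * time.Minute)}
	fmt.Println(fresh.name, shouldCacheNodeInfo(fresh, now))     // false: too recent
	fmt.Println(settled.name, shouldCacheNodeInfo(settled, now)) // true
}
```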
Jayant Jain 537e07fdb1 CA: Debugging snapshot adding a new field for TemplateNode. This captures all the templates for nodegroups present 2022-01-24 17:12:57 +00:00
Jayant Jain 729038ff2d Adding support for Debugging Snapshot 2021-12-30 09:08:05 +00:00
Jayant Jain da5ff3d971 Introduce Empty Cluster Processor
This refactors CA's handling of cases when the cluster is empty/not ready into a processor in empty_cluster_processor.go
2021-10-13 13:30:30 +00:00
Maciek Pytel a0109324a2 Change parameter order of TemplateNodeInfoProvider
Every other processor (and, I think, every function in CA?) that takes
AutoscalingContext has it as the first parameter. Changing the new processor
for consistency.
2021-09-13 15:08:14 +02:00
Benjamin Pineau 8485cf2052 Move GetNodeInfosForGroups to its own processor
Supports providing different NodeInfo sources (either upstream or in
local forks, e.g. to properly implement variants like in #4000).

This also moves a large and specialized code chunk out of core, and removes
the need to maintain and pass the GetNodeInfosForGroups() cache from the side,
as processors can hold their states themselves.

No functional changes to GetNodeInfosForGroups(), outside mechanical changes
due to the move: calling a few util functions that now live in the core/utils
package, picking context attributes (the processor takes the context as an arg
rather than ListerRegistry + PredicateChecker + CloudProvider), and using the
builtin cache rather than receiving it from arguments.
2021-08-16 19:43:10 +02:00
Kubernetes Prow Robot 9f84d391f6
Merge pull request #4022 from amrmahdi/amrh/nodegroupminmaxmetrics
[cluster-autoscaler] Publish node group min/max metrics
2021-07-05 07:38:54 -07:00
Bartłomiej Wróblewski 5076047bf8 Skip iteration loop if node creation failed 2021-06-16 14:40:15 +00:00
Benjamin Pineau 986fe3ae20 Metric for CloudProvider.Refresh() duration
This function can take a variable amount of time due to various
conditions (e.g. many nodegroup changes causing forced refreshes,
cache time-to-live expiries, ...).

Monitoring that duration is useful to diagnose those variations,
and to uncover external issues (e.g. throttling from the cloud provider)
affecting cluster-autoscaler.
2021-05-31 15:55:28 +02:00
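A hedged sketch of how such a duration metric could be recorded with the Prometheus Go client; the metric name, buckets, and helper names here are illustrative, not necessarily what cluster-autoscaler registers.

```go
package main

import (
	"fmt"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// refreshDuration is an illustrative histogram; the real metric name and
// buckets used by cluster-autoscaler may differ.
var refreshDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Namespace: "cluster_autoscaler",
	Name:      "cloudprovider_refresh_duration_seconds",
	Help:      "Time spent in CloudProvider.Refresh().",
	Buckets:   prometheus.DefBuckets,
})

func init() {
	prometheus.MustRegister(refreshDuration)
}

// refresh stands in for CloudProvider.Refresh(); timing it lets slow
// refreshes (e.g. cloud API throttling) show up in monitoring.
func refresh() error {
	time.Sleep(25 * time.Millisecond)
	return nil
}

func main() {
	start := time.Now()
	err := refresh()
	refreshDuration.Observe(time.Since(start).Seconds())
	fmt.Println("refresh done, err:", err)
}
```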
Kubernetes Prow Robot 02985973c6
Merge pull request #4104 from brett-elliott/stopcooldown
Don't start CA in cooldown mode.
2021-05-27 09:12:23 -07:00
Brett Elliott 1880fe6937 Don't start CA in cooldown mode. 2021-05-27 17:53:52 +02:00
Amr Hanafi (MAHDI)) 3ac32b817c Update node group min/max on cloud provider refresh 2021-05-20 17:36:51 -07:00
Benjamin Pineau 030a2152b0 Fix templated nodeinfo name collisions in BinpackingNodeEstimator
Both upscale's `getUpcomingNodeInfos` and the binpacking estimator now use
the same shared DeepCopyTemplateNode function and inherit its naming
pattern, which is great as that fixes a long-standing bug.

Due to that, `getUpcomingNodeInfos` will enrich the cluster snapshots with
generated nodeinfos and nodes having predictable names (using template name
+ an incremental ordinal starting at 0) for upcoming nodes.

Later, when it looks for nodes fitting unschedulable pods (when upcoming
nodes don't satisfy those, e.g. FitsAnyNodeMatching failing due to node
capacity or pod anti-affinity), the binpacking estimator will also build
virtual nodes and place them in a snapshot fork to evaluate scheduler predicates.

Those temporary virtual nodes are built using the same pattern (template name
and an index ordinal also starting at 0) as the one previously used by
`getUpcomingNodeInfos`, which means it will generate the same nodeinfos/nodes
names for nodegroups having upcoming nodes.

But adding nodes by the same name in an existing cluster snapshot isn't
allowed, and the evaluation attempt will fail.

Practically this blocks re-upscales for nodegroups having upcoming nodes,
which can cause a significant delay.
2021-05-19 12:05:40 +02:00
Kubernetes Prow Robot 2beea02a29
Merge pull request #3983 from elmiko/cluster-resource-consumption-metrics
Cluster resource consumption metrics
2021-05-13 15:32:04 -07:00
Bartłomiej Wróblewski 1698e0e583 Separate and refactor custom resources logic 2021-04-07 10:31:11 +00:00
Michael McCune a24ea6c66b add cluster cores and memory bytes count metrics
This change adds 4 metrics that can be used to monitor the minimum and
maximum limits for CPU and memory, as well as the current counts in
cores and bytes, respectively.

The four metrics added are:
* `cluster_autoscaler_cpu_limits_cores`
* `cluster_autoscaler_cluster_cpu_current_cores`
* `cluster_autoscaler_memory_limits_bytes`
* `cluster_autoscaler_cluster_memory_current_bytes`

This change also adds the `max_cores_total` metric to the metrics
proposal doc, as it was previously not recorded there.

User story: As a cluster autoscaler user, I would like to monitor my
cluster through metrics to determine when the cluster is nearing its
limits for cores and memory usage.
2021-04-06 10:35:21 -04:00
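An illustrative Go sketch of how the four listed metrics could be declared with the Prometheus client; the label layout (a "direction" label distinguishing minimum/maximum), the help strings, and the example values are assumptions, not the real definitions.

```go
package main

import (
	"github.com/prometheus/client_golang/prometheus"
)

// Gauge definitions matching the four metric names from the commit
// message; the structure here is illustrative only.
var (
	cpuLimitsCores = prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "cluster_autoscaler_cpu_limits_cores",
		Help: "Minimum and maximum number of cores in the cluster.",
	}, []string{"direction"})

	cpuCurrentCores = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "cluster_autoscaler_cluster_cpu_current_cores",
		Help: "Current number of cores in the cluster.",
	})

	memoryLimitsBytes = prometheus.NewGaugeVec(prometheus.GaugeOpts{
		Name: "cluster_autoscaler_memory_limits_bytes",
		Help: "Minimum and maximum bytes of memory in the cluster.",
	}, []string{"direction"})

	memoryCurrentBytes = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "cluster_autoscaler_cluster_memory_current_bytes",
		Help: "Current bytes of memory in the cluster.",
	})
)

func main() {
	prometheus.MustRegister(cpuLimitsCores, cpuCurrentCores, memoryLimitsBytes, memoryCurrentBytes)

	// Example observations: a cluster currently at 8 cores / 32 GiB,
	// with limits of 4..64 cores and 16..256 GiB.
	cpuLimitsCores.WithLabelValues("minimum").Set(4)
	cpuLimitsCores.WithLabelValues("maximum").Set(64)
	cpuCurrentCores.Set(8)
	memoryLimitsBytes.WithLabelValues("minimum").Set(16 * 1024 * 1024 * 1024)
	memoryLimitsBytes.WithLabelValues("maximum").Set(256 * 1024 * 1024 * 1024)
	memoryCurrentBytes.Set(32 * 1024 * 1024 * 1024)
}
```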
Kubernetes Prow Robot 43ab030969
Merge pull request #3888 from mrak/master
Allow name of cluster-autoscaler status ConfigMap to be specified
2021-03-11 03:22:24 -08:00
Michael McCune 7ecf933e7b add a metric for unregistered nodes removed by cluster autoscaler
This change adds a new metric which counts the number of nodes removed
by the cluster autoscaler due to being unregistered with kubernetes.

User Story

As a cluster-autoscaler user, I would like to know when the autoscaler
is cleaning up nodes that have failed to register with kubernetes. I
would like to monitor the rate at which failed nodes are being removed
so that I can better alert on infrastructure issues which may go
unnoticed elsewhere.
2021-03-04 19:23:03 -05:00
Eric Mrak and Brett Kochendorfer 43dd34074e Allow name of cluster-autoscaler status ConfigMap to be specified
This allows us to run two instances of cluster-autoscaler in our
cluster, targeting two different types of autoscaling groups that
require different command-line settings to be passed.
2021-02-17 21:52:54 +00:00
Kubernetes Prow Robot 1fc6705724
Merge pull request #3690 from evgenii-petrov-arrival/master
Add unremovable_nodes_count metric
2021-02-17 04:13:06 -08:00
Maciek Pytel 9831623810 Set different hostname label for upcoming nodes
Function copying template node to use for upcoming nodes was
not changing the hostname label, meaning that features relying on
this label (e.g. pod anti-affinity on hostname topology) would
treat all upcoming nodes as a single node.
This resulted in triggering too many scale-ups for pods
using such features. The analogous function in binpacking didn't
have the same bug (but it didn't set unique UIDs or pod names).
I extracted the functionality to a util function used in both
places to avoid the two functions getting out of sync again.
2021-02-12 19:41:04 +01:00
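A simplified Go sketch of the fix, using a made-up templateNode type instead of the real NodeInfo/Node structures: each copy gets a unique name and a matching kubernetes.io/hostname label, so hostname-based scheduling features see distinct nodes.

```go
package main

import "fmt"

// templateNode is a simplified stand-in for a node template; only the
// labels matter for this illustration.
type templateNode struct {
	name   string
	labels map[string]string
}

// deepCopyTemplateNode sketches the shared util function described above:
// every copy made for an upcoming node gets a unique name AND a matching,
// unique kubernetes.io/hostname label.
func deepCopyTemplateNode(tmpl templateNode, index int) templateNode {
	name := fmt.Sprintf("%s-upcoming-%d", tmpl.name, index)
	labels := make(map[string]string, len(tmpl.labels))
	for k, v := range tmpl.labels {
		labels[k] = v
	}
	labels["kubernetes.io/hostname"] = name // unique per copy
	return templateNode{name: name, labels: labels}
}

func main() {
	tmpl := templateNode{
		name:   "ng-1-template",
		labels: map[string]string{"kubernetes.io/hostname": "ng-1-template"},
	}
	for i := 0; i < 3; i++ {
		n := deepCopyTemplateNode(tmpl, i)
		fmt.Println(n.name, n.labels["kubernetes.io/hostname"])
	}
}
```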
Evgenii Petrov b6f5d5567d Add unremovable_nodes_count metric 2021-02-12 15:47:34 +00:00
Maciek Pytel 3e42b26a22 Per NodeGroup config for scale-down options
This is the implementation of
https://github.com/kubernetes/autoscaler/issues/3583#issuecomment-743215343.
2021-01-25 11:00:17 +01:00
Kubernetes Prow Robot 58be2b7505
Merge pull request #3649 from ClearTax/cordon-node-issue-3648
Adding functionality to cordon the node before destroying it.
2021-01-14 04:19:04 -08:00
atul 7670d7b6af Adding functionality to cordon the node before destroying it. This helps the load balancer remove the node from its healthy hosts (ALB does have this support).
This won't fix the 502 issue completely, as the node still has to live for some time after cordoning to serve in-flight requests, but the load balancer can be configured to remove cordoned nodes from the healthy host list.
This feature is enabled by the cordon-node-before-terminating flag, with a default value of false to retain existing behavior.
2021-01-14 17:21:37 +05:30
Bartłomiej Wróblewski 0fb897b839 Update imports after scheduler scheduler/framework/v1alpha1 removal 2020-11-30 10:48:52 +00:00
Jakub Tużnik bf18d57871 Remove ScaleDownNodeDeleted status since we no longer delete nodes synchronously 2020-10-01 11:12:45 +02:00
Jakub Tużnik 3958c6645d Add an annotation identifying upcoming nodes 2020-07-24 15:20:34 +02:00
Maciek Pytel 655b4081f4 Migrate to klog v2 2020-06-05 17:22:26 +02:00
Jakub Tużnik 73a5cdf928 Address recent breaking changes in scheduler
The following things changed in scheduler and needed to be fixed:
* NodeInfo was moved to schedulerframework
* Some fields on NodeInfo are now exposed directly instead of via getters
* NodeInfo.Pods is now a list of *schedulerframework.PodInfo, not *apiv1.Pod
* SharedLister and NodeInfoLister were moved to schedulerframework
* PodLister was removed
2020-04-24 17:54:47 +02:00
Jakub Tużnik 8f1efc9866 Add NodeInfoProcessor for processing nodeInfosForNodeGroups 2020-03-20 15:19:18 +01:00
Łukasz Osipiuk a6023265e7 Add clarifying comment regarding podDestination and scaleDownCandidates variables 2020-03-10 15:18:52 +01:00
Aleksandra Malinowska ce18f7119c change order of arguments for TryToScaleDown 2020-03-10 11:36:57 +01:00
Aleksandra Malinowska 0b7c45e88a stop passing scheduled pods around 2020-03-03 16:23:49 +01:00
Aleksandra Malinowska 572bad61ce use nodes from snapshot in scale down 2020-03-03 16:23:49 +01:00
Aleksandra Malinowska 9c6a0f9aab Filter out expendable pods before initializing snapshot 2020-03-03 12:05:58 +01:00
Kubernetes Prow Robot dbbd4572af
Merge pull request #2861 from aleksandra-malinowska/delta-snapshot-15
Cleanup todo
2020-03-02 05:52:44 -08:00
Aleksandra Malinowska 0c13ce7248 add pods from upcoming nodes to snapshot 2020-02-27 14:12:31 +01:00
Aleksandra Malinowska 7ac3d27cf7 cleanup todo - no op 2020-02-27 11:13:37 +01:00
Julien Balestra 628128f65e cluster-autoscaler/taints: refactor current taint logic into the same package
Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>
2020-02-25 13:57:23 +01:00
Julien Balestra af270b05f6 cluster-autoscaler/taints: ignore taints on existing nodes
Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>
2020-02-25 13:55:17 +01:00
Kubernetes Prow Robot bbeead26ac
Merge pull request #2853 from aleksandra-malinowska/fix-ifs
Cleanup ifs in static autoscaler
2020-02-21 06:23:34 -08:00
Aleksandra Malinowska c4d376b9c2 Cleanup ifs in static autoscaler 2020-02-21 15:03:01 +01:00
Aleksandra Malinowska 468061dcfc move initializing snapshot after empty cluster check and API calls 2020-02-21 14:50:27 +01:00
Kubernetes Prow Robot af1dd84305
Merge pull request #2799 from aleksandra-malinowska/delta-snapshot-4
Add delta snapshot implementation
2020-02-14 09:20:17 -08:00
Jakub Tużnik 7a188ab50d Provide ScaleDownStatusProcessor with info about unremovable nodes 2020-02-11 15:27:33 +01:00
Aleksandra Malinowska 9c018ddb7a Cleanup cluster snapshot interface 2020-02-05 13:33:03 +01:00
Łukasz Osipiuk 4b30a6f499 Rename propagateClusterSnapshot to initializeClusterSnapshot 2020-02-04 20:52:08 +01:00
Łukasz Osipiuk 6ed2636f10 Drop PredicateChecker.SnapshotClusterState 2020-02-04 20:51:52 +01:00
Łukasz Osipiuk 98efd05b4b Do not add Pods pointing to nonexistent nodes to snapshot 2020-02-04 20:51:49 +01:00
Łukasz Osipiuk d7770e3044 Use ClusterSnapshot in ScaleDown 2020-02-04 20:51:48 +01:00
Łukasz Osipiuk 9bb2fd15d7 Add TODO 2020-02-04 20:51:42 +01:00
Łukasz Osipiuk 69800ab176 Simulate scheduling of pods waiting for preemption in ClusterSnapshot 2020-02-04 20:51:37 +01:00
Łukasz Osipiuk d9891ae3ad Simplify PodListProcessor interface 2020-02-04 20:51:35 +01:00
Łukasz Osipiuk 7e62105cb9 Add upcoming nodes to ClusterSnapshot 2020-02-04 20:51:31 +01:00
Łukasz Osipiuk 83d1c4ff8a Add GetAllPods and GetAllNodes to ClusterSnapshot 2020-02-04 20:51:30 +01:00
Łukasz Osipiuk fa2c6e4d9e Propagate cluster state to ClusterSnapshot 2020-02-04 20:51:27 +01:00
Łukasz Osipiuk 036103c553 Add ClusterSnapshot to AutoscalingContext 2020-02-04 20:51:26 +01:00
Łukasz Osipiuk 373c558303 Extract PredicateChecker interface 2020-02-04 20:51:18 +01:00
Łukasz Osipiuk b01f2fca8f Drop ConfigurePredicateCheckerForLoop 2020-02-04 20:51:14 +01:00
dasydong 68433abb7c Remove duplicate comments 2019-12-28 01:06:22 +08:00
Kubernetes Prow Robot f6ed9c114a
Merge pull request #2588 from losipiuk/lo/snapshot
Snapshot cluster state for scheduler every loop
2019-11-28 05:25:03 -08:00
Łukasz Osipiuk b67854e800 Snapshot cluster state for scheduler every loop
Change-Id: If9d162b83ccc914fe1b02e4689bfe1f4b264407f
2019-11-28 14:02:08 +01:00
Łukasz Osipiuk 17a7bc5164 Ignore NominatedNodeName on Pod if node is gone
Change-Id: I4a119f46e55ca2223f9f0fdd3e75ce3f279e293b
2019-11-27 20:26:00 +01:00
Vivek Bagade 910e75365c remove temporary nodes logic 2019-11-12 11:58:29 +01:00
Jarvis-Zhou 7c9d6e3518 Do not assign return values to variables when not needed 2019-10-25 19:28:00 +08:00
Łukasz Osipiuk 7f083d2393 Move core/utils.go to separate package and split into multiple files 2019-10-22 14:23:40 +02:00