Commit Graph

37 Commits

Author SHA1 Message Date
michael mccune 955396e857 remove clusterapi nodegroupset processor
as discussed with the cluster api community[0], the nodegroupset
processor is being removed from the clusterapi provider implementation
in favor of instructing our community on the use of the
--balancing-ignore-label flag. due to the wide variety of provider
infrastructures that clusterapi can be deployed on, we would prefer to
not encode all of these labels in the autoscaler itself. see the linked
recording for more information.

[0] https://www.youtube.com/watch?v=jbhca_9oPuQ
2023-01-12 15:05:37 -05:00
bsoghigian 0f8ed0b81f Configurable difference ratios 2023-01-09 22:40:16 -08:00
Michael Grosser 62f29d23af cluster-autoscaler: refactor BalanceScaleUpBetweenGroups 2022-11-15 13:21:29 -08:00
Michael McCune ba9c164463 update clusterapi nodegroups processor
this change adds labels that are used on Alibaba Cloud and IBM Cloud for
CSI and CCM.
2022-08-18 15:55:35 -04:00
James Ravn 1b98b3823a Allow balancing by labels exclusively
Adds a new flag `--balance-label` which allows users to balance between
node groups exclusively via labels.

This gives users the flexibility to specify the similarity logic
themselves when --balance-similar-node-groups is in use.
2022-07-06 10:34:18 +01:00
Marwan Ahmed 26569925db ignore azure csi topology label for similarity checks and populate it for scale from zero 2021-12-21 20:44:49 +02:00
Michael McCune 99a242a9e6 add ClusterAPI nodegroupset processor
This allows the ClusterAPI provider to ignore the
`topology.ebs.csi.aws.com/zone` label by adding a custom nodegroupset
processor. It also adds unit tests to exercise the new processor.
2021-11-10 17:01:27 -05:00
Michael McCune 828663e97a add topology.ebs.csi.aws.com/zone label to aws nodegroupset processor
This change adds the aforementioned label to the list of ignored labels
in the AWS nodegroupset processor. This change is being made in response
to the addition of this label by the aws-ebs-csi-driver. This label will
eventually be deprecated by the driver, but its use will prevent AWS
users from properly balancing similar nodes. Also adds unit test for the
AWS processor.

ref: https://github.com/kubernetes/autoscaler/issues/3230
ref: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/729
2021-11-10 17:01:08 -05:00
Marwan Ahmed f318400c9e add recent AKS agentpool label to ignore for similarity checks 2021-10-25 14:18:06 -07:00
Brett Elliott 5cf64a2b3c Update vendor to v1.22.0-alpha.1 2021-05-20 22:02:41 +02:00
Bartłomiej Wróblewski 0fb897b839 Update imports after scheduler scheduler/framework/v1alpha1 removal 2020-11-30 10:48:52 +00:00
Benjamin Pineau bfd6fe7fed Ignore topology.gke.io/zone when comparing groups
Commit bb2eed1cff introduced a new `topology.gke.io/zone` label to
GCE nodes templates, for CSI needs.

That label holds the zone name, making nodeInfo templates dissimilar
for groups belonging to different zones. The CA otherwise tries to
ignore those zonal labels (i.e. it ignores the standard LabelZoneRegion
and LabelZoneFailureDomain) when it looks for nodegroup similarities.
2020-10-12 15:14:21 +02:00
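The label-ignoring behaviour described in the commit above can be illustrated with a minimal Go sketch. It is not the autoscaler's actual code: the function name `labelsSimilar` and the shape of the ignore set are assumptions made for illustration, and only the label keys come from the commit messages in this log.

```go
package main

import "fmt"

// labelsSimilar is a hypothetical helper (not the autoscaler's real API).
// It reports whether two label sets are equal once every key listed in
// ignored has been disregarded on both sides.
func labelsSimilar(a, b map[string]string, ignored map[string]bool) bool {
	for k, v := range a {
		if ignored[k] {
			continue
		}
		if bv, ok := b[k]; !ok || bv != v {
			return false
		}
	}
	for k := range b {
		if ignored[k] {
			continue
		}
		if _, ok := a[k]; !ok {
			return false
		}
	}
	return true
}

func main() {
	// Zonal labels differ between the groups; everything else matches.
	groupA := map[string]string{"pool": "workers", "topology.gke.io/zone": "zone-a"}
	groupB := map[string]string{"pool": "workers", "topology.gke.io/zone": "zone-b"}
	ignored := map[string]bool{"topology.gke.io/zone": true}

	fmt.Println(labelsSimilar(groupA, groupB, ignored)) // zonal label ignored
	fmt.Println(labelsSimilar(groupA, groupB, nil))     // zonal label compared
}
```

With the zonal label in the ignore set the two groups compare as similar; without it they do not, which is the situation the commit set out to fix.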
Kubernetes Prow Robot 67dce2e824 Merge pull request #3124 from JoelSpeed/memory-tolerance-quantity
Allow small tolerance on memory capacity when comparing nodegroups
2020-06-24 04:25:17 -07:00
Joel Speed be1d9cb8d6 Allow 1.5% tolerance in memory capacity when comparing nodegroups
In testing, AWS M5 instances can on occasion display approximately a 1% difference
in memory capacity between availability zones, deployed with the same launch
configuration and same AMI.
Allow a 1.5% tolerance to give some buffer on the actual amount of memory discrepancy
since in testing, some examples were just over 1% (eg 1.05%, 1.1%).
Tests are included with capacity values taken from real instances to prevent future
regression.
2020-06-10 12:00:39 +01:00
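As a rough illustration of the 1.5% tolerance described in the commit above, the check could look like the following Go sketch. The names `maxMemoryDifferenceRatio` and `memoryWithinTolerance` are hypothetical, and measuring the difference against the larger of the two capacities is an assumption, not something the commit message states.

```go
package main

import (
	"fmt"
	"math"
)

// maxMemoryDifferenceRatio mirrors the 1.5% figure from the commit message.
// The constant name is hypothetical.
const maxMemoryDifferenceRatio = 0.015

// memoryWithinTolerance reports whether two memory capacities differ by no
// more than ratio, measured relative to the larger value (an assumption).
func memoryWithinTolerance(a, b int64, ratio float64) bool {
	larger := math.Max(float64(a), float64(b))
	smaller := math.Min(float64(a), float64(b))
	return larger-smaller <= larger*ratio
}

func main() {
	// Illustrative capacities only, not values taken from real instances.
	fmt.Println(memoryWithinTolerance(1000, 989, maxMemoryDifferenceRatio)) // 1.1% apart
	fmt.Println(memoryWithinTolerance(1000, 980, maxMemoryDifferenceRatio)) // 2% apart
}
```

A relative threshold scales with instance size, unlike the earlier fixed limits mentioned elsewhere in this log (128Ki, later raised).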
Maciek Pytel 655b4081f4 Migrate to klog v2 2020-06-05 17:22:26 +02:00
Jakub Tużnik 73a5cdf928 Address recent breaking changes in scheduler
The following things changed in scheduler and needed to be fixed:
* NodeInfo was moved to schedulerframework
* Some fields on NodeInfo are now exposed directly instead of via getters
* NodeInfo.Pods is now a list of *schedulerframework.PodInfo, not *apiv1.Pod
* SharedLister and NodeInfoLister were moved to schedulerframework
* PodLister was removed
2020-04-24 17:54:47 +02:00
Adam Malcontenti-Wilson 8313e969c7 Add support for passing in custom ignore labels 2020-03-17 14:30:03 +11:00
Adam Malcontenti-Wilson 5476125063 Use builder methods to create NodeInfoComparator functions 2020-03-17 13:51:15 +11:00
Maxime Renou a7f3e54770 Add eks.amazonaws.com/nodegroup label to awsIgnoredLabels 2020-02-20 11:36:14 +01:00
Enxebre d422aaaca6 UPSTREAM: <carry>: openshift: Add topology.kubernetes.io labels to be ignored when comparing similar node groups.
Without this, the autoscaler was using the labels in compareLabels and failing to match similar groups in different zones. Starting in kube 1.17, failure-domain.beta.kubernetes.io/* labels are deprecated in favour of topology.kubernetes.io/* https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesiozone
2020-02-19 18:15:11 +01:00
Colin Murphy dde3341133 Raise maximum memory capacity difference.
AWS M5 instance types may differ in memory capacity by more than 128MB.
2019-10-25 17:18:08 -04:00
Colin Murphy 7f0a42b023 Add additional AWS labels.
Whitelist additional node labels for AWS CNI custom networking and
EC2 lifecycle.

Move AWS ignored node labels to AWS specific file.
2019-10-25 17:17:02 -04:00
Jarvis-Zhou 7c9d6e3518 Do not assign return values to variables when not needed 2019-10-25 19:28:00 +08:00
Kubernetes Prow Robot dc1f19fc47 Merge pull request #2207 from viafoura/kops-node-similarity-fix-master
add kops instance group label to ignore list for similar node group identification.
2019-09-27 07:27:37 -07:00
Andrew McDermott e8b3c2a111 compare_nodegroups: Tolerate small differences in memory capacity
The current comparator expects memory capacity values to be identical.
However across AWS, Azure and GCP I quite often see very small
differences in capacity, typically 8-16Ki. When this occurs the
nodegroups are considered not equal when balancing is in effect which
is unfortunate because, in reality, they are identical.

This change will now tolerate a 128Ki difference before memory
capacity values are considered unequal.
2019-09-06 15:55:51 +01:00
Krzysztof Jastrzebski 75030ee2ec Fix bug in balancing processor. Cluster Autoscaler was stopping scaling
up when there was a multizonal pool with the number of nodes exceeding the limit for one zone.
2019-07-29 09:28:20 +02:00
Joe Hohertz 754412d7ea also add similar label for eksctl to ignore list
Signed-off-by: Joe Hohertz <joe@viafoura.com>
2019-07-26 10:07:54 -04:00
Joe Hohertz 1999d3b432 add kops instance group label to ignore list for similar node group identification.
Signed-off-by: Joe Hohertz <joe@viafoura.com>
2019-07-23 09:08:00 -04:00
t-qini 89a09ccf00 Refactor the corresponding code. 2019-07-22 08:58:51 +08:00
t-qini f7c563ab06 Modify the code as the simple solution proposed by MaciekPytel. 2019-07-18 23:58:05 +08:00
t-qini 622a838c2c Modify nodal similarity rules. 2019-07-09 16:04:40 +08:00
Łukasz Osipiuk 34a4262ad8 Remove GKE specific node group comparator
Change-Id: I33131fec9b7972780cffde605a087cd2ad002752
2019-03-11 17:49:59 +01:00
Pengfei Ni 2546d0d97c Move leaderelection options to new packages 2019-02-21 13:45:46 +08:00
Pengfei Ni 128729bae9 Move schedulercache to package nodeinfo 2019-02-21 12:41:08 +08:00
Łukasz Osipiuk 016bf7fc2c Use k8s.io/klog instead of github.com/golang/glog 2018-11-26 17:30:31 +01:00
Maciej Pytel 01a56a8d73 Add GKE-specific NodeGroupSet processor
Also refactor Balancing processor a bit to make it easily extensible.
2018-10-25 18:50:17 +02:00
Maciej Pytel 6f5e6aab6f Move node group balancing to processor
The goal is to allow customization of this logic
for different use cases and cloud providers.
2018-10-25 14:04:05 +02:00