As discussed with the Cluster API community[0], the nodegroupset
processor is being removed from the clusterapi provider implementation
in favor of instructing our community on the use of the
--balancing-ignore-label flag. Due to the wide variety of provider
infrastructures that Cluster API can be deployed on, we would prefer
not to encode all of these labels in the autoscaler itself. See the
linked recording for more information.
[0] https://www.youtube.com/watch?v=jbhca_9oPuQ
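To illustrate what ignoring a label means when node groups are compared, here is a minimal Go sketch. It is not the autoscaler's comparator: `similarIgnoring` is a hypothetical helper, and the label values are made up for the example. Two label sets count as similar when they agree on every key that is not in the ignore set.

```go
package main

import "fmt"

// similarIgnoring reports whether two label sets are identical once the
// keys in ignored are left out. Hypothetical helper for illustration only.
func similarIgnoring(a, b map[string]string, ignored map[string]bool) bool {
	for k, va := range a {
		if ignored[k] {
			continue
		}
		if vb, ok := b[k]; !ok || vb != va {
			return false
		}
	}
	// Catch keys that exist only in b.
	for k := range b {
		if ignored[k] {
			continue
		}
		if _, ok := a[k]; !ok {
			return false
		}
	}
	return true
}

func main() {
	groupA := map[string]string{"pool": "general", "topology.ebs.csi.aws.com/zone": "zone-a"}
	groupB := map[string]string{"pool": "general", "topology.ebs.csi.aws.com/zone": "zone-b"}
	ignored := map[string]bool{"topology.ebs.csi.aws.com/zone": true}

	fmt.Println(similarIgnoring(groupA, groupB, nil))     // zone label is compared
	fmt.Println(similarIgnoring(groupA, groupB, ignored)) // zone label is skipped
}
```

With the flag, the same effect would be requested by passing the provider-specific label to --balancing-ignore-label rather than hard-coding it in the provider.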
Node state is refreshed and checked again before deleting the node.
This gives kube-scheduler time to acknowledge that the nodes' state has
changed and to stop scheduling pods on them.
Adds a new flag `--balance-label` which allows users to balance between
node groups exclusively via labels.
This gives users the flexibility to specify the similarity logic
themselves when --balance-similar-node-groups is in use.
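A rough Go sketch of label-only similarity follows. It is an illustration under assumptions, not the autoscaler's code: `similarByLabels` is a hypothetical helper, and treating a label that is missing from either group as a mismatch is an assumption of this sketch.

```go
package main

import "fmt"

// similarByLabels reports whether two label sets agree on every key listed
// in balanceLabels. All other labels are disregarded. A key missing from
// either set counts as a mismatch (an assumption of this sketch).
func similarByLabels(a, b map[string]string, balanceLabels []string) bool {
	for _, key := range balanceLabels {
		va, okA := a[key]
		vb, okB := b[key]
		if !okA || !okB || va != vb {
			return false
		}
	}
	return true
}

func main() {
	groupA := map[string]string{"pool": "general", "zone": "a"}
	groupB := map[string]string{"pool": "general", "zone": "b"}

	// Balancing on "pool" alone treats the groups as similar.
	fmt.Println(similarByLabels(groupA, groupB, []string{"pool"}))
	// Adding "zone" to the balance labels makes them differ.
	fmt.Println(similarByLabels(groupA, groupB, []string{"pool", "zone"}))
}
```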
The binpacking algorithm is O(#pending_pods * #new_nodes), and
calculating a very large scale-up can get stuck for minutes or even
hours, leading to CA failing its health check and going down.
The new limiting prevents this scenario by stopping binpacking after
reaching a specified threshold. Any pods that remain pending as a result
of shorter binpacking will be processed in the next autoscaler loop.
The thresholds used can be controlled with newly introduced flags:
--max-nodes-per-scaleup and --max-nodegroup-binpacking-duration. The
limiting can be disabled by setting both flags to 0 (not recommended,
especially for --max-nodegroup-binpacking-duration).
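The idea of bounded binpacking can be sketched in Go as below. This is a simplified first-fit loop under stated assumptions, not the autoscaler's estimator: `binpackLimited`, its parameters, and the single CPU dimension are all hypothetical, and every pod is assumed to fit on an empty node.

```go
package main

import (
	"fmt"
	"time"
)

// binpackLimited places pods first-fit onto new nodes of size nodeCPU.
// It stops once maxNodes new nodes are in use or maxDuration has elapsed;
// a limit of 0 disables that check. It returns the number of new nodes and
// the pods left over for the next loop.
// Assumes every pod fits on an empty node (cpu <= nodeCPU).
func binpackLimited(podCPU []int, nodeCPU, maxNodes int, maxDuration time.Duration) (int, []int) {
	start := time.Now()
	var free []int // remaining CPU on each new node
	for i, cpu := range podCPU {
		if maxDuration > 0 && time.Since(start) > maxDuration {
			return len(free), podCPU[i:] // time budget exhausted
		}
		placed := false
		for j := range free {
			if free[j] >= cpu {
				free[j] -= cpu
				placed = true
				break
			}
		}
		if placed {
			continue
		}
		if maxNodes > 0 && len(free) >= maxNodes {
			return len(free), podCPU[i:] // node budget exhausted
		}
		free = append(free, nodeCPU-cpu)
	}
	return len(free), nil
}

func main() {
	pods := []int{2, 2, 2, 2, 2}
	nodes, pending := binpackLimited(pods, 4, 2, 10*time.Second)
	fmt.Printf("new nodes: %d, pods left for next loop: %d\n", nodes, len(pending))
}
```

Stopping early trades a complete answer for a bounded loop time: the leftover pods are still pending, so a later loop picks them up.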
This change brings in a new command line flag,
`--record-duplicated-events`, which allows a user to enable the
duplication of events, bypassing the 5-minute de-duplication window.
Add a flag to allow the user to configure MaxPodEvictionTime to values
other than the default 2m. This is needed in cases where a pod takes
more than 2 minutes to be evicted.
Signed-off-by: Grigoris Thanasoulas <gregth@arrikto.com>
This allows the ClusterAPI provider to ignore the
`topology.ebs.csi.aws.com/zone` label by adding a custom nodegroupset
processor. It also adds unit tests to exercise the new processor.
Multiple expanders can now be specified. Expanders now "filter to the
tied for best" instead of "selecting the best", so the output of one
expander is fed to the input of the next. Each expander may only
be used once, to disallow bad configuration. This should not be a change
in functionality, as in the event of a tie the random expander is still
used.
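The "filter to the tied for best" chaining can be sketched in Go as follows. The `option` type, `bestBy`, and `chain` are hypothetical names for illustration and do not mirror the autoscaler's expander interfaces. The final random pick among remaining ties is left out to keep the example deterministic.

```go
package main

import "fmt"

// option is a hypothetical scale-up candidate.
type option struct {
	name     string
	price    int
	priority int
}

// filter narrows a set of options to those tied for best on one criterion.
type filter func([]option) []option

// bestBy returns a filter that keeps every option with the lowest score.
func bestBy(score func(option) int) filter {
	return func(opts []option) []option {
		if len(opts) == 0 {
			return nil
		}
		best := score(opts[0])
		for _, o := range opts[1:] {
			if s := score(o); s < best {
				best = s
			}
		}
		var out []option
		for _, o := range opts {
			if score(o) == best {
				out = append(out, o)
			}
		}
		return out
	}
}

// chain feeds the output of each filter into the next and stops early
// once at most one option is left.
func chain(opts []option, filters ...filter) []option {
	for _, f := range filters {
		if len(opts) <= 1 {
			break
		}
		opts = f(opts)
	}
	return opts
}

func main() {
	opts := []option{{"a", 10, 1}, {"b", 10, 2}, {"c", 20, 2}}
	cheapest := bestBy(func(o option) int { return o.price })
	highestPriority := bestBy(func(o option) int { return -o.priority })

	// "a" and "b" tie on price; the second filter breaks the tie.
	fmt.Println(chain(opts, cheapest, highestPriority))
}
```

Because each stage returns every tied candidate rather than a single winner, later stages only ever act as tie-breakers for earlier ones.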
This change adds 4 metrics that can be used to monitor the minimum and
maximum limits for CPU and memory, as well as the current counts in
cores and bytes, respectively.
The four metrics added are:
* `cluster_autoscaler_cpu_limits_cores`
* `cluster_autoscaler_cluster_cpu_current_cores`
* `cluster_autoscaler_memory_limits_bytes`
* `cluster_autoscaler_cluster_memory_current_bytes`
This change also adds the `max_cores_total` metric to the metrics
proposal doc, as it was previously not recorded there.
User story: As a cluster autoscaler user, I would like to monitor my
cluster through metrics to determine when the cluster is nearing its
limits for cores and memory usage.
This allows us to run two instances of cluster-autoscaler in our
cluster, targeting two different types of autoscaling groups that
require different command-line settings to be passed.