autoscaler/cluster-autoscaler
Kubernetes Prow Robot f44fb9feaf
Merge pull request #5512 from Shubham82/add_RBAC_permissions_cherryservers
Added RBAC Permission to cherryservers.
2023-02-28 05:47:17 -08:00

README.md

Cluster Autoscaler

Introduction

Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:

  • there are pods that failed to run in the cluster due to insufficient resources.
  • there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
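The two conditions above can be sketched as a small decision function. This is only an illustration, not the actual Cluster Autoscaler logic: the `nodeState` type, the `decide` function, the threshold values, and the choice to check for scale-up before scale-down are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values only; they are assumptions for this sketch and are
// not taken from this README.
const (
	utilizationThreshold = 0.5
	unneededTime         = 10 * time.Minute
)

// nodeState is a hypothetical, simplified view of a node.
type nodeState struct {
	utilization      float64       // fraction of node resources requested, 0..1
	underutilizedFor time.Duration // how long the node has been below the threshold
	podsFitElsewhere bool          // whether its pods can be placed on other existing nodes
}

type action string

const (
	scaleUp   action = "scale-up"
	scaleDown action = "scale-down"
	noChange  action = "no-change"
)

// decide mirrors the two conditions: pods that cannot run for lack of
// resources call for a scale-up; a node that has been underutilized for an
// extended period and whose pods fit elsewhere calls for a scale-down.
func decide(unschedulablePods int, nodes []nodeState) action {
	if unschedulablePods > 0 {
		return scaleUp
	}
	for _, n := range nodes {
		if n.utilization < utilizationThreshold &&
			n.underutilizedFor >= unneededTime &&
			n.podsFitElsewhere {
			return scaleDown
		}
	}
	return noChange
}

func main() {
	idle := nodeState{utilization: 0.2, underutilizedFor: 15 * time.Minute, podsFitElsewhere: true}
	fmt.Println(decide(3, []nodeState{idle})) // pending pods are checked first in this sketch
	fmt.Println(decide(0, []nodeState{idle}))
	idle.podsFitElsewhere = false
	fmt.Println(decide(0, []nodeState{idle}))
}
```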

FAQ/Documentation

An FAQ is available in FAQ.md.

You should also take a look at the notes and "gotchas" for your specific cloud provider:

Releases

We recommend using Cluster Autoscaler with the Kubernetes control plane (previously referred to as master) version for which it was meant. The combinations below have been tested on GCP. We don't do cross-version testing or compatibility testing in other environments. Some user reports indicate successful use of a newer version of Cluster Autoscaler with older clusters; however, there is always a chance that it won't work as expected.

Starting from Kubernetes 1.12, the versioning scheme was changed to match Kubernetes minor releases exactly.

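| Kubernetes Version | CA Version     |
|--------------------|----------------|
| 1.26.X             | 1.26.X         |
| 1.25.X             | 1.25.X         |
| 1.24.X             | 1.24.X         |
| 1.23.X             | 1.23.X         |
| 1.22.X             | 1.22.X         |
| 1.21.X             | 1.21.X         |
| 1.20.X             | 1.20.X         |
| 1.19.X             | 1.19.X         |
| 1.18.X             | 1.18.X         |
| 1.17.X             | 1.17.X         |
| 1.16.X             | 1.16.X         |
| 1.15.X             | 1.15.X         |
| 1.14.X             | 1.14.X         |
| 1.13.X             | 1.13.X         |
| 1.12.X             | 1.12.X         |
| 1.11.X             | 1.3.X          |
| 1.10.X             | 1.2.X          |
| 1.9.X              | 1.1.X          |
| 1.8.X              | 1.0.X          |
| 1.7.X              | 0.6.X          |
| 1.6.X              | 0.5.X, 0.6.X*  |
| 1.5.X              | 0.4.X          |
| 1.4.X              | 0.3.X          |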

*Cluster Autoscaler 0.5.X is the official version shipped with k8s 1.6. We've done some basic tests using k8s 1.6 / CA 0.6 and we're not aware of any problems with this setup. However, Cluster Autoscaler internally simulates the Kubernetes scheduler, and using different versions of the scheduler code can lead to subtle issues.
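The pairing rule in the table above can be expressed as a small lookup. The following is a minimal sketch, not part of the Cluster Autoscaler code base: `recommendedCAVersion` and `legacyCAVersions` are hypothetical names, and applying the "same minor version" rule to Kubernetes releases newer than those listed is an assumption.

```go
package main

import "fmt"

// legacyCAVersions maps Kubernetes minor versions before 1.12 to the
// Cluster Autoscaler release line listed in the table above.
var legacyCAVersions = map[int]string{
	11: "1.3.X",
	10: "1.2.X",
	9:  "1.1.X",
	8:  "1.0.X",
	7:  "0.6.X",
	6:  "0.5.X", // 0.6.X only received basic testing, see the note above
	5:  "0.4.X",
	4:  "0.3.X",
}

// recommendedCAVersion is a hypothetical helper. It returns the CA release
// line paired with a Kubernetes 1.<minor> control plane, and false when the
// table has no entry for that version.
func recommendedCAVersion(minor int) (string, bool) {
	if minor >= 12 {
		// Since 1.12 the CA version matches the Kubernetes minor release.
		// The table stops at 1.26; newer minors are assumed to follow suit.
		return fmt.Sprintf("1.%d.X", minor), true
	}
	v, ok := legacyCAVersions[minor]
	return v, ok
}

func main() {
	for _, minor := range []int{26, 12, 11, 6, 3} {
		if v, ok := recommendedCAVersion(minor); ok {
			fmt.Printf("Kubernetes 1.%d.X -> CA %s\n", minor, v)
		} else {
			fmt.Printf("Kubernetes 1.%d.X -> no tested combination listed\n", minor)
		}
	}
}
```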

Notable changes

For CA 1.1.2 and later, please check the release notes.

CA version 1.1.1:

  • Fixes around metrics in the multiple kube apiserver configuration.
  • Fixes for unready nodes issues when quota is overrun.

CA version 1.1.0:

CA version 1.0.3:

  • Adds support for the safe-to-evict annotation on pods. Pods with this annotation can be evicted even if they don't meet the other requirements for eviction.
  • Fixes an issue when too many nodes with GPUs could be added during scale-up (https://github.com/kubernetes/kubernetes/issues/54959).

CA Version 1.0.2:

CA Version 1.0.1:

CA Version 1.0:

With this release we graduated Cluster Autoscaler to GA.

  • Support for 1000 nodes running 30 pods each. See: Scalability testing report
  • Support for 10 min graceful termination.
  • Improved eventing and monitoring.
  • Node allocatable support.
  • Removed Azure support. See: PR removing support with reasoning behind this decision
  • cluster-autoscaler.kubernetes.io/scale-down-disabled annotation for marking nodes that should not be scaled down.
  • The scale-down-delay-after-delete and scale-down-delay-after-failure flags replaced the scale-down-trial-interval flag.

CA Version 0.6:

CA Version 0.5.4:

  • Fixes problems with node drain when pods are ignoring SIGTERM.

CA Version 0.5.3:

CA Version 0.5.2:

CA Version 0.5.1:

CA Version 0.5:

  • CA continues to operate even if some nodes are unready and is able to scale them down.
  • CA exports its status to kube-system/cluster-autoscaler-status config map.
  • CA respects PodDisruptionBudgets.
  • Azure support.
  • Alpha support for dynamic config changes.
  • Multiple expanders to decide which node group to scale up.

CA Version 0.4:

  • Bulk empty node deletions.
  • Better scale-up estimator based on binpacking.
  • Improved logging.

CA Version 0.3:

  • AWS support.
  • Performance improvements around scale down.

Deployment

Cluster Autoscaler is designed to run on a Kubernetes control plane (previously referred to as master) node. This is the default deployment strategy on GCP. It is possible to run a customized deployment of Cluster Autoscaler on worker nodes, but extra care needs to be taken to ensure that Cluster Autoscaler remains up and running. To do so, put it into the kube-system namespace (Cluster Autoscaler doesn't scale down nodes with non-mirrored kube-system pods running on them) and set the priorityClassName: system-cluster-critical property on its pod spec (to prevent the pod from being evicted).
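As an illustration of the two recommendations for worker-node deployments, the sketch below checks a simplified description of a deployment against them. The `deployment` type and `placementWarnings` function are hypothetical and exist only for this example; they are not Kubernetes API types, and a real check would inspect the actual pod spec.

```go
package main

import "fmt"

// deployment is a hypothetical, minimal description of how Cluster
// Autoscaler is deployed; it is not a Kubernetes API type.
type deployment struct {
	namespace         string
	priorityClassName string
}

// placementWarnings returns one message for each recommendation from the
// paragraph above that the given deployment does not follow.
func placementWarnings(d deployment) []string {
	var warnings []string
	if d.namespace != "kube-system" {
		warnings = append(warnings,
			"not in kube-system: the node hosting Cluster Autoscaler may be scaled down")
	}
	if d.priorityClassName != "system-cluster-critical" {
		warnings = append(warnings,
			"priorityClassName is not system-cluster-critical: the pod may be evicted")
	}
	return warnings
}

func main() {
	d := deployment{namespace: "default", priorityClassName: ""}
	for _, w := range placementWarnings(d) {
		fmt.Println("warning:", w)
	}
}
```

A deployment that sets both fields as recommended yields no warnings.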

Supported cloud providers: