Cluster Autoscaler

Introduction

Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:

  • there are pods that failed to run in the cluster due to insufficient resources,
  • there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.

FAQ/Documentation

The FAQ and documentation are available in FAQ.md.

Releases

We recommend using Cluster Autoscaler with the Kubernetes master version for which it was meant. The combinations below have been tested on GCP. We don't do cross-version testing or compatibility testing in other environments. Some user reports indicate successful use of a newer version of Cluster Autoscaler with older clusters; however, there is always a chance that it won't work as expected.

Starting from Kubernetes 1.12, the versioning scheme was changed to match Kubernetes minor releases exactly.

Kubernetes Version    CA Version
1.12.X                1.12.X
1.11.X                1.3.X
1.10.X                1.2.X
1.9.X                 1.1.X
1.8.X                 1.0.X
1.7.X                 0.6.X
1.6.X                 0.5.X, 0.6.X*
1.5.X                 0.4.X
1.4.X                 0.3.X

*Cluster Autoscaler 0.5.X is the official version shipped with k8s 1.6. We've done some basic tests using k8s 1.6 / CA 0.6 and we're not aware of any problems with this setup. However, Cluster Autoscaler internally simulates the Kubernetes scheduler, and using different versions of scheduler code can lead to subtle issues.
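As a convenience, the version table above can be encoded as a small lookup. The following Go sketch is illustrative only: the names `legacyCAVersions` and `recommendedCAVersion` are hypothetical, and the rule for minors above 1.12 is an extrapolation from the statement that the versioning scheme matches Kubernetes minor releases starting from 1.12.

```go
package main

import "fmt"

// legacyCAVersions maps a Kubernetes 1.<minor> release to the Cluster
// Autoscaler release line listed in the table (pre-1.12 scheme). For
// Kubernetes 1.6 only the officially shipped line (0.5.X) is recorded.
var legacyCAVersions = map[int]string{
	11: "1.3.X",
	10: "1.2.X",
	9:  "1.1.X",
	8:  "1.0.X",
	7:  "0.6.X",
	6:  "0.5.X",
	5:  "0.4.X",
	4:  "0.3.X",
}

// recommendedCAVersion returns the CA release line for Kubernetes 1.<minor>.
// The boolean is false when the table has no entry for that minor version.
func recommendedCAVersion(minor int) (string, bool) {
	if minor >= 12 {
		// Assumption: from 1.12 on, CA minor versions mirror Kubernetes.
		return fmt.Sprintf("1.%d.X", minor), true
	}
	v, ok := legacyCAVersions[minor]
	return v, ok
}

func main() {
	for _, minor := range []int{12, 11, 6, 3} {
		if v, ok := recommendedCAVersion(minor); ok {
			fmt.Printf("Kubernetes 1.%d.X -> Cluster Autoscaler %s\n", minor, v)
		} else {
			fmt.Printf("Kubernetes 1.%d.X -> no tested combination listed\n", minor)
		}
	}
}
```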

Notable changes

For CA 1.1.2 and later, please check the release notes.

CA Version 1.1.1:

  • Fixes around metrics in the multi-master configuration.
  • Fixes for unready-node issues when quota is overrun.

CA Version 1.1.0:

CA Version 1.0.3:

  • Adds support for the safe-to-evict annotation on pods. Pods with this annotation can be evicted even if they don't meet the other requirements for eviction.
  • Fixes an issue where too many nodes with GPUs could be added during scale-up (https://github.com/kubernetes/kubernetes/issues/54959).

CA Version 1.0.2:

CA Version 1.0.1:

CA Version 1.0:

With this release we graduated Cluster Autoscaler to GA.

  • Support for 1000 nodes running 30 pods each. See: Scalability testing report
  • Support for 10 min graceful termination.
  • Improved eventing and monitoring.
  • Node allocatable support.
  • Removed Azure support. See: PR removing support with reasoning behind this decision
  • cluster-autoscaler.kubernetes.io/scale-down-disabled annotation for marking nodes that should not be scaled down.
  • scale-down-delay-after-delete and scale-down-delay-after-failure flags replaced scale-down-trial-interval.
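To illustrate how the scale-down-disabled annotation from the list above might be honored, here is a minimal Go sketch. It works on a plain map of annotations rather than real Kubernetes API types, the helper name `scaleDownDisabled` is hypothetical, and it assumes that the value "true" marks a node as protected; it is not Cluster Autoscaler's own implementation.

```go
package main

import "fmt"

// scaleDownDisabledKey is the node annotation named in the release notes.
const scaleDownDisabledKey = "cluster-autoscaler.kubernetes.io/scale-down-disabled"

// scaleDownDisabled reports whether a node's annotations mark it as
// excluded from scale-down. Assumption: only the value "true" counts.
func scaleDownDisabled(annotations map[string]string) bool {
	return annotations[scaleDownDisabledKey] == "true"
}

func main() {
	protected := map[string]string{scaleDownDisabledKey: "true"}
	plain := map[string]string{}

	fmt.Println("protected node skipped:", scaleDownDisabled(protected))
	fmt.Println("plain node skipped:", scaleDownDisabled(plain))
}
```

Reading a missing key from a Go map yields the empty string, so nodes without the annotation (or with a nil annotation map) are treated as eligible for scale-down.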

CA Version 0.6:

CA Version 0.5.4:

  • Fixes problems with node drain when pods are ignoring SIGTERM.

CA Version 0.5.3:

CA Version 0.5.2:

CA Version 0.5.1:

CA Version 0.5:

  • CA continues to operate even if some nodes are unready and is able to scale them down.
  • CA exports its status to kube-system/cluster-autoscaler-status config map.
  • CA respects PodDisruptionBudgets.
  • Azure support.
  • Alpha support for dynamic config changes.
  • Multiple expanders to decide which node group to scale up.

CA Version 0.4:

  • Bulk empty node deletions.
  • Better scale-up estimator based on binpacking.
  • Improved logging.

CA Version 0.3:

  • AWS support.
  • Performance improvements around scale down.

Deployment

Cluster Autoscaler is designed to run on the Kubernetes master node. This is the default deployment strategy on GCP. It is possible to run a customized deployment of Cluster Autoscaler on worker nodes, but extra care needs to be taken to ensure that Cluster Autoscaler remains up and running. Users can put it into the kube-system namespace (Cluster Autoscaler doesn't scale down nodes with non-mirrored kube-system pods running on them) and add the scheduler.alpha.kubernetes.io/critical-pod annotation (so that the rescheduler, if enabled, will kill other pods to make space for it to run).

Supported cloud providers: