# cluster-autoscaler
Scales Kubernetes worker nodes within autoscaling groups.
## TL;DR

```console
$ helm repo add autoscaler https://kubernetes.github.io/autoscaler

# Method 1 - Using autodiscovery
$ helm install my-release autoscaler/cluster-autoscaler \
    --set 'autoDiscovery.clusterName'=<CLUSTER NAME>

# Method 2 - Specifying groups manually
$ helm install my-release autoscaler/cluster-autoscaler \
    --set "autoscalingGroups[0].name=your-asg-name" \
    --set "autoscalingGroups[0].maxSize=10" \
    --set "autoscalingGroups[0].minSize=1"
```
## Introduction
This chart bootstraps a cluster-autoscaler deployment on a Kubernetes cluster using the Helm package manager.
## Prerequisites

- Helm 3+
- Kubernetes 1.8+
  - Older versions may work by overriding the `image`. Cluster autoscaler internally simulates the scheduler, and bugs between mismatched versions may be subtle.
- Azure AKS specific Prerequisites:
  - Kubernetes 1.10+ with RBAC enabled.
## Previous Helm Chart

The previous cluster-autoscaler Helm chart hosted at helm/charts has been moved to this repository in accordance with the Deprecation timeline. Note that a few things have changed between this version and the old version:

- This repository only supports Helm chart installations using Helm 3+ since the `apiVersion` on the charts has been marked as `v2`.
- Previous versions of the Helm chart have not been migrated.
## Migration from 1.X to 9.X+ versions of this Chart

**TL;DR:** You should choose to use versions >=9.0.0 of the cluster-autoscaler chart published from this repository; previous versions, and the cluster-autoscaler-chart with versioning 1.X.X published from this repository, are deprecated.
### Previous versions of this chart - further details

On initial migration of this chart from the `helm/charts` repository, this chart was renamed from `cluster-autoscaler` to `cluster-autoscaler-chart` due to technical limitations. This affected all `1.X` releases of the chart; version 2.0.0 of this chart exists only to mark the [`cluster-autoscaler-chart` chart](https://artifacthub.io/packages/helm/cluster-autoscaler/cluster-autoscaler-chart) as deprecated.

Releases of the chart from 9.0.0 onwards return the naming of the chart to cluster-autoscaler and return to following the versioning established by the chart's previous location at `helm/charts`.
To migrate from a 1.X release of the chart to a 9.0.0 or later release, you should first uninstall your 1.X install of the `cluster-autoscaler-chart` chart before performing the installation of the new `cluster-autoscaler` chart.
## Migration from 9.0 to 9.1

Starting from 9.1.0, the `envFromConfigMap` value is expected to contain the name of a ConfigMap that is used as a ref for `envFrom`, similar to `envFromSecret`. If you want to keep the previous behaviour of `envFromConfigMap`, you must rename it to `extraEnvConfigMaps`.
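For example, a values file relying on the old behaviour would move its entries under `extraEnvConfigMaps` unchanged, while the new `envFromConfigMap` takes a plain ConfigMap name. A minimal sketch, where `my-env-configmap` is a hypothetical ConfigMap:

```yaml
# values.yaml for chart >= 9.1.0

# Keeps the pre-9.1 behaviour: entries formerly under envFromConfigMap
# move here unchanged.
extraEnvConfigMaps: {}

# New meaning: name of a ConfigMap whose entries are injected via envFrom,
# mirroring envFromSecret. "my-env-configmap" is a placeholder.
envFromConfigMap: my-env-configmap
```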
## Installing the Chart

By default, no deployment is created and nothing will autoscale.

You must provide some minimal configuration, either to specify instance groups or enable auto-discovery. It is not recommended to do both.

Either:

- Set `autoDiscovery.clusterName` and provide additional autodiscovery options if necessary (see the values sketch below), or
- Set static node group configurations for one or more node groups (using `autoscalingGroups` or `autoscalingGroupsnamePrefix`).
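For instance, the auto-discovery route can be expressed as a values file instead of `--set` flags. A minimal sketch for AWS, where the cluster name is a placeholder that must match your ASG tagging:

```yaml
# myvalues.yaml - auto-discovery variant
cloudProvider: aws
awsRegion: us-east-1        # your AWS region
autoDiscovery:
  clusterName: my-cluster   # placeholder; must match your ASG tags
```

Install with `helm install my-release autoscaler/cluster-autoscaler -f myvalues.yaml`.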
To create a valid configuration, follow instructions for your cloud provider:
### AWS - Using auto-discovery of tagged instance groups

Auto-discovery finds ASGs with tags as below and automatically manages them based on the min and max size specified in the ASG. `cloudProvider=aws` only.

- Tag the ASGs with keys to match `.Values.autoDiscovery.tags`, by default: `k8s.io/cluster-autoscaler/enabled` and `k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>`
- Verify the IAM Permissions
- Set `autoDiscovery.clusterName=<YOUR CLUSTER NAME>`
- Set `awsRegion=<YOUR AWS REGION>`
- Optionally set `awsAccessKeyID=<YOUR AWS KEY ID>` and `awsSecretAccessKey=<YOUR AWS SECRET KEY>` if you want to use AWS credentials directly instead of an instance role
```console
$ helm install my-release autoscaler/cluster-autoscaler \
    --set autoDiscovery.clusterName=<CLUSTER NAME> \
    --set awsRegion=<YOUR AWS REGION>
```

Alternatively, with your own AWS credentials:

```console
$ helm install my-release autoscaler/cluster-autoscaler \
    --set autoDiscovery.clusterName=<CLUSTER NAME> \
    --set awsRegion=<YOUR AWS REGION> \
    --set awsAccessKeyID=<YOUR AWS KEY ID> \
    --set awsSecretAccessKey=<YOUR AWS SECRET KEY>
```
#### Specifying groups manually

Without autodiscovery, specify an array of elements each containing an ASG name, min size, and max size. The sizes specified here will be applied to the ASG, assuming IAM permissions are correctly configured.

- Verify the IAM Permissions
- Either provide a YAML file setting `autoscalingGroups` (see values.yaml) or use `--set`, e.g.:
```console
$ helm install my-release autoscaler/cluster-autoscaler \
    --set "autoscalingGroups[0].name=your-asg-name" \
    --set "autoscalingGroups[0].maxSize=10" \
    --set "autoscalingGroups[0].minSize=1"
```
#### Auto-discovery

For auto-discovery of instances to work, they must be tagged with the keys in `.Values.autoDiscovery.tags`, which by default are `k8s.io/cluster-autoscaler/enabled` and `k8s.io/cluster-autoscaler/<ClusterName>`.

The value of the tag does not matter, only the key.
An example kops spec excerpt:
```yaml
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: my.cluster.internal
spec:
  additionalPolicies:
    node: |
      [
        {"Effect":"Allow","Action":["autoscaling:DescribeAutoScalingGroups","autoscaling:DescribeAutoScalingInstances","autoscaling:DescribeLaunchConfigurations","autoscaling:DescribeTags","autoscaling:SetDesiredCapacity","autoscaling:TerminateInstanceInAutoScalingGroup"],"Resource":"*"}
      ]
  ...
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: my.cluster.internal
  name: my-instances
spec:
  cloudLabels:
    k8s.io/cluster-autoscaler/enabled: ""
    k8s.io/cluster-autoscaler/my.cluster.internal: ""
  image: kops.io/k8s-1.8-debian-jessie-amd64-hvm-ebs-2018-01-14
  machineType: r4.large
  maxSize: 4
  minSize: 0
```
In this example you would need to `--set autoDiscovery.clusterName=my.cluster.internal` when installing.

It is not recommended to try to mix this with setting `autoscalingGroups`.

See the autoscaler AWS documentation for further discussion of the setup.
### GCE

The following parameters are required:

- `autoDiscovery.clusterName=any-name`
- `cloud-provider=gce`
- `autoscalingGroupsnamePrefix[0].name=your-ig-prefix`, `autoscalingGroupsnamePrefix[0].maxSize=10`, `autoscalingGroupsnamePrefix[0].minSize=1`
To use Managed Instance Group (MIG) auto-discovery, provide a YAML file setting `autoscalingGroupsnamePrefix` (see values.yaml) or use `--set` when installing the Chart, e.g.:
```console
$ helm install my-release autoscaler/cluster-autoscaler \
    --set "autoscalingGroupsnamePrefix[0].name=your-ig-prefix,autoscalingGroupsnamePrefix[0].maxSize=10,autoscalingGroupsnamePrefix[0].minSize=1" \
    --set autoDiscovery.clusterName=<CLUSTER NAME> \
    --set cloudProvider=gce
```
Note that `your-ig-prefix` should be a prefix matching one or more MIGs, and not the full name of the MIG. For example, to match multiple instance groups - `k8s-node-group-a-standard`, `k8s-node-group-b-gpu` - you would use a prefix of `k8s-node-group-`.
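The same prefix-based configuration can be written as a values file. A sketch using the `k8s-node-group-` prefix from the example above:

```yaml
# myvalues.yaml - GCE MIG auto-discovery by name prefix
cloudProvider: gce
autoDiscovery:
  clusterName: any-name          # any value is accepted for GCE
autoscalingGroupsnamePrefix:
  - name: k8s-node-group-        # prefix matching one or more MIGs
    maxSize: 10
    minSize: 1
```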
In the event you want to explicitly specify MIGs instead of using auto-discovery, set members of the `autoscalingGroups` array directly, e.g.:

```console
# where 'n' is the index, starting at 0
--set autoscalingGroups[n].name=https://content.googleapis.com/compute/v1/projects/$PROJECTID/zones/$ZONENAME/instanceGroupManagers/$FULL-MIG-NAME,autoscalingGroups[n].maxSize=$MAXSIZE,autoscalingGroups[n].minSize=$MINSIZE
```
### Azure AKS

The following parameters are required (collected into a values-file sketch below):

- `cloudProvider=azure`
- `autoscalingGroups[0].name=your-agent-pool`, `autoscalingGroups[0].maxSize=10`, `autoscalingGroups[0].minSize=1`
- `azureClientID: "your-service-principal-app-id"`
- `azureClientSecret: "your-service-principal-client-secret"`
- `azureSubscriptionID: "your-azure-subscription-id"`
- `azureTenantID: "your-azure-tenant-id"`
- `azureClusterName: "your-aks-cluster-name"`
- `azureResourceGroup: "your-aks-cluster-resource-group-name"`
- `azureVMType: "AKS"`
- `azureNodeResourceGroup: "your-aks-cluster-node-resource-group"`
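Collected into a values file, a minimal AKS configuration might look like the following sketch; every value is a placeholder for your own cluster's details:

```yaml
# myvalues.yaml - Azure AKS
cloudProvider: azure
azureClientID: "your-service-principal-app-id"
azureClientSecret: "your-service-principal-client-secret"
azureSubscriptionID: "your-azure-subscription-id"
azureTenantID: "your-azure-tenant-id"
azureClusterName: "your-aks-cluster-name"
azureResourceGroup: "your-aks-cluster-resource-group-name"
azureVMType: "AKS"
azureNodeResourceGroup: "your-aks-cluster-node-resource-group"
autoscalingGroups:
  - name: your-agent-pool   # AKS agent pool to scale
    maxSize: 10
    minSize: 1
```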
### OpenStack Magnum

`cloudProvider: magnum` must be set, and then one of:

- `magnumClusterName=<cluster name or ID>` and `autoscalingGroups` with the names of node groups and min/max node counts
- or `autoDiscovery.clusterName=<cluster name or ID>` with one or more `autoDiscovery.roles`.

Additionally, `cloudConfigPath: "/etc/kubernetes/cloud-config"` must be set, as this should be the location of the cloud-config file on the host.
Example values files can be found here.
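As a minimal sketch, using the `magnumClusterName` plus `autoscalingGroups` variant (the cluster identifier, node group name, and sizes are placeholders):

```yaml
# myvalues.yaml - OpenStack Magnum
cloudProvider: magnum
magnumClusterName: your-cluster-name-or-id   # placeholder
cloudConfigPath: /etc/kubernetes/cloud-config
autoscalingGroups:
  - name: default-worker   # placeholder node group name
    maxSize: 5
    minSize: 1
```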
Install the chart with:

```console
$ helm install my-release autoscaler/cluster-autoscaler -f myvalues.yaml
```
### Cluster-API

`cloudProvider: clusterapi` must be set, and then one or more of:

- `autoDiscovery.clusterName`
- or `autoDiscovery.labels`

See [here](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/README.md#configuring-node-group-auto-discovery) for more details.

Additional config parameters are available; see the values.yaml for more details:

- `clusterAPIMode`
- `clusterAPIKubeconfigSecret`
- `clusterAPIWorkloadKubeconfigPath`
- `clusterAPICloudConfigPath`
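For illustration, a values sketch for the default `incluster-incluster` mode, discovering node groups by label; the label key and value are hypothetical, and the exact shape of `autoDiscovery.labels` should be checked against values.yaml:

```yaml
# myvalues.yaml - Cluster API
cloudProvider: clusterapi
clusterAPIMode: incluster-incluster   # chart default; management and workload access in-cluster
autoDiscovery:
  labels:
    - autoscaling: enabled            # hypothetical node-group label to match
```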
## Uninstalling the Chart

To uninstall `my-release`:

```console
$ helm uninstall my-release
```

The command removes all the Kubernetes components associated with the chart and deletes the release.
Tip: List all releases using `helm list` or start clean with `helm uninstall my-release`.
## Additional Configuration

### AWS - IAM
The worker running the cluster autoscaler will need access to certain resources and actions depending on the version you run and your configuration of it.
For the up-to-date IAM permissions required, please see the cluster autoscaler's AWS Cloudprovider Readme and switch to the tag of the cluster autoscaler image you are using.
### AWS - IAM Roles for Service Accounts (IRSA)
For Kubernetes clusters that use Amazon EKS, the service account can be configured with an IAM role using IAM Roles for Service Accounts to avoid needing to grant access to the worker nodes for AWS resources.
In order to accomplish this, you will first need to create a new IAM role with the above-mentioned policies. Take care in configuring the trust relationship to restrict access to just the service account used by cluster autoscaler.
Once you have the IAM role configured, you would then need to `--set rbac.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/MyRoleName` when installing.
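The same annotation expressed in a values file, as a sketch (the account ID and role name are placeholders):

```yaml
# myvalues.yaml - IRSA-annotated service account
rbac:
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/MyRoleName
```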
## Troubleshooting

The chart will succeed even if the container arguments are incorrect. A few minutes after starting, `kubectl logs -l "app=aws-cluster-autoscaler" --tail=50` should loop through something like:
```
polling_autoscaler.go:111] Poll finished
static_autoscaler.go:97] Starting main loop
utils.go:435] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
static_autoscaler.go:230] Filtering out schedulables
```
If not, find a pod that the deployment created and describe it, paying close attention to the arguments under `Command`, e.g.:
```
Containers:
  cluster-autoscaler:
    Command:
      ./cluster-autoscaler
      --cloud-provider=aws
      # if specifying ASGs manually
      --nodes=1:10:your-scaling-group-name
      # if using autodiscovery
      --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<ClusterName>
      --v=4
```
### PodSecurityPolicy

Though sufficient for the majority of installations, the default PodSecurityPolicy could be too restrictive depending on the specifics of your release. Please make sure to check that the template fits with any customizations made, or disable it by setting `rbac.pspEnabled` to `false`.
## Values
| Key | Type | Default | Description |
|---|---|---|---|
| additionalLabels | object | `{}` | Labels to add to each object of the chart. |
| affinity | object | `{}` | Affinity for pod assignment |
| autoDiscovery.clusterName | string | `nil` | Enable autodiscovery for cloudProvider=aws, for groups matching autoDiscovery.tags. Enable autodiscovery for cloudProvider=clusterapi, for groups matching autoDiscovery.labels. Enable autodiscovery for cloudProvider=gce, but no MIG tagging required. Enable autodiscovery for cloudProvider=magnum, for groups matching autoDiscovery.roles. |
| autoDiscovery.labels | list | `[]` | Cluster-API labels to match. See https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/README.md#configuring-node-group-auto-discovery |
| autoDiscovery.roles | list | `["worker"]` | Magnum node group roles to match. |
| autoDiscovery.tags | list | `["k8s.io/cluster-autoscaler/enabled","k8s.io/cluster-autoscaler/{{ .Values.autoDiscovery.clusterName }}"]` | ASG tags to match, run through tpl. |
| autoscalingGroups | list | `[]` | For AWS, Azure AKS or Magnum. At least one element is required if not using autoDiscovery. For example: - name: asg1 |
| autoscalingGroupsnamePrefix | list | `[]` | For GCE. At least one element is required if not using autoDiscovery. For example: - name: ig01 |
| awsAccessKeyID | string | `""` | AWS access key ID (if AWS user keys used) |
| awsRegion | string | `"us-east-1"` | AWS region (required if cloudProvider=aws) |
| awsSecretAccessKey | string | `""` | AWS secret access key (if AWS user keys used) |
| azureClientID | string | `""` | Service Principal ClientID with contributor permission to Cluster and Node ResourceGroup. Required if cloudProvider=azure |
| azureClientSecret | string | `""` | Service Principal ClientSecret with contributor permission to Cluster and Node ResourceGroup. Required if cloudProvider=azure |
| azureClusterName | string | `""` | Azure AKS cluster name. Required if cloudProvider=azure |
| azureNodeResourceGroup | string | `""` | Azure resource group where the cluster's nodes are located, typically set as MC_<cluster-resource-group-name>_<cluster-name>_<location>. Required if cloudProvider=azure |
| azureResourceGroup | string | `""` | Azure resource group where the cluster is located. Required if cloudProvider=azure |
| azureSubscriptionID | string | `""` | Azure subscription where the resources are located. Required if cloudProvider=azure |
| azureTenantID | string | `""` | Azure tenant where the resources are located. Required if cloudProvider=azure |
| azureUseManagedIdentityExtension | bool | `false` | Whether to use Azure's managed identity extension for credentials. If using MSI, ensure subscription ID, resource group, and Azure AKS cluster name are set. |
| azureVMType | string | `"AKS"` | Azure VM type. |
| cloudConfigPath | string | `"/etc/gce.conf"` | Configuration file for cloud provider. |
| cloudProvider | string | `"aws"` | The cloud provider where the autoscaler runs. Currently only gce, aws, azure, magnum and clusterapi are supported: aws for AWS, gce for GCE, azure for Azure AKS, magnum for OpenStack Magnum, clusterapi for Cluster API. |
| clusterAPICloudConfigPath | string | `"/etc/kubernetes/mgmt-kubeconfig"` | Path to kubeconfig for connecting to the Cluster API Management Cluster, only used if clusterAPIMode=kubeconfig-kubeconfig or incluster-kubeconfig |
| clusterAPIConfigMapsNamespace | string | `""` | Namespace on the workload cluster to store Leader election and status configmaps |
| clusterAPIKubeconfigSecret | string | `""` | Secret containing kubeconfig for connecting to the Cluster API managed workload cluster. Required if cloudProvider=clusterapi and clusterAPIMode=kubeconfig-kubeconfig, kubeconfig-incluster or incluster-kubeconfig |
| clusterAPIMode | string | `"incluster-incluster"` | Cluster API mode, see https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/README.md#connecting-cluster-autoscaler-to-cluster-api-management-and-workload-clusters. Syntax: workloadClusterMode-ManagementClusterMode. For kubeconfig-kubeconfig, incluster-kubeconfig and single-kubeconfig you must always mount the external kubeconfig using either extraVolumeSecrets or extraMounts and extraVolumes. If you don't set clusterAPIKubeconfigSecret, and thus use an in-cluster config or want to use a non-CAPI-generated kubeconfig, you must do so for the workload kubeconfig as well. |
| clusterAPIWorkloadKubeconfigPath | string | `"/etc/kubernetes/value"` | Path to kubeconfig for connecting to the Cluster API managed workload cluster, only used if clusterAPIMode=kubeconfig-kubeconfig or kubeconfig-incluster |
| containerSecurityContext | object | `{}` | Security context for container |
| deployment.annotations | object | `{}` | Annotations to add to the Deployment object. |
| dnsPolicy | string | `"ClusterFirst"` | Defaults to ClusterFirst. Valid values are: ClusterFirstWithHostNet, ClusterFirst, Default or None. If autoscaler does not depend on cluster DNS, it is recommended to set this to Default. |
| envFromConfigMap | string | `""` | ConfigMap name to use as envFrom. |
| envFromSecret | string | `""` | Secret name to use as envFrom. |
| expanderPriorities | object | `{}` | If extraArgs.expander contains priority, expanderPriorities is used to define cluster-autoscaler-priority-expander priorities. See: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/expander/priority/readme.md |
| extraArgs | object | `{"logtostderr":true,"stderrthreshold":"info","v":4}` | Additional container arguments. Refer to https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca for the full list of cluster autoscaler parameters and their default values. Everything after the first _ will be ignored, allowing the use of multi-string arguments. |
| extraEnv | object | `{}` | Additional container environment variables. |
| extraEnvConfigMaps | object | `{}` | Additional container environment variables from ConfigMaps. |
| extraEnvSecrets | object | `{}` | Additional container environment variables from Secrets. |
| extraVolumeMounts | list | `[]` | Additional volumes to mount. |
| extraVolumeSecrets | object | `{}` | Additional volumes to mount from Secrets. |
| extraVolumes | list | `[]` | Additional volumes. |
| fullnameOverride | string | `""` | String to fully override cluster-autoscaler.fullname template. |
| image.pullPolicy | string | `"IfNotPresent"` | Image pull policy |
| image.pullSecrets | list | `[]` | Image pull secrets |
| image.repository | string | `"k8s.gcr.io/autoscaling/cluster-autoscaler"` | Image repository |
| image.tag | string | `"v1.23.0"` | Image tag |
| kubeTargetVersionOverride | string | `""` | Allow overriding the .Capabilities.KubeVersion.GitVersion check. Useful for helm template commands. |
| magnumCABundlePath | string | `"/etc/kubernetes/ca-bundle.crt"` | Path to the host's CA bundle, from ca-file in the cloud-config file. |
| magnumClusterName | string | `""` | Cluster name or ID in Magnum. Required if cloudProvider=magnum and not setting autoDiscovery.clusterName. |
| nameOverride | string | `""` | String to partially override cluster-autoscaler.fullname template (will maintain the release name) |
| nodeSelector | object | `{}` | Node labels for pod assignment. Ref: https://kubernetes.io/docs/user-guide/node-selection/. |
| podAnnotations | object | `{}` | Annotations to add to each pod. |
| podDisruptionBudget | object | `{"maxUnavailable":1}` | Pod disruption budget. |
| podLabels | object | `{}` | Labels to add to each pod. |
| priorityClassName | string | `"system-cluster-critical"` | priorityClassName |
| priorityConfigMapAnnotations | object | `{}` | Annotations to add to the cluster-autoscaler-priority-expander ConfigMap. |
| prometheusRule.additionalLabels | object | `{}` | Additional labels to be set in metadata. |
| prometheusRule.enabled | bool | `false` | If true, creates a Prometheus Operator PrometheusRule. |
| prometheusRule.interval | string | `nil` | How often rules in the group are evaluated (falls back to global.evaluation_interval if not set). |
| prometheusRule.namespace | string | `"monitoring"` | Namespace which Prometheus is running in. |
| prometheusRule.rules | list | `[]` | Rules spec template (see https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#rule). |
| rbac.clusterScoped | bool | `true` | If set to false, will only provision RBAC to alter resources in the current namespace. Most useful for Cluster-API. |
| rbac.create | bool | `true` | If true, create and use RBAC resources. |
| rbac.pspEnabled | bool | `false` | If true, creates and uses RBAC resources required in the cluster with Pod Security Policies enabled. Must be used with rbac.create set to true. |
| rbac.serviceAccount.annotations | object | `{}` | Additional Service Account annotations. |
| rbac.serviceAccount.automountServiceAccountToken | bool | `true` | Automount API credentials for a Service Account. |
| rbac.serviceAccount.create | bool | `true` | If true and rbac.create is also true, a Service Account will be created. |
| rbac.serviceAccount.name | string | `""` | The name of the ServiceAccount to use. If not set and create is true, a name is generated using the fullname template. |
| replicaCount | int | `1` | Desired number of pods |
| resources | object | `{}` | Pod resource requests and limits. |
| securityContext | object | `{}` | Security context for pod |
| service.annotations | object | `{}` | Annotations to add to service |
| service.create | bool | `true` | If true, a Service will be created. |
| service.externalIPs | list | `[]` | List of IP addresses at which the service is available. Ref: https://kubernetes.io/docs/user-guide/services/#external-ips. |
| service.labels | object | `{}` | Labels to add to service |
| service.loadBalancerIP | string | `""` | IP address to assign to load balancer (if supported). |
| service.loadBalancerSourceRanges | list | `[]` | List of IP CIDRs allowed access to load balancer (if supported). |
| service.portName | string | `"http"` | Name for service port. |
| service.servicePort | int | `8085` | Service port to expose. |
| service.type | string | `"ClusterIP"` | Type of service to create. |
| serviceMonitor.enabled | bool | `false` | If true, creates a Prometheus Operator ServiceMonitor. |
| serviceMonitor.interval | string | `"10s"` | Interval at which Prometheus scrapes Cluster Autoscaler metrics. |
| serviceMonitor.namespace | string | `"monitoring"` | Namespace which Prometheus is running in. |
| serviceMonitor.path | string | `"/metrics"` | The path to scrape for metrics; autoscaler exposes /metrics (this is standard) |
| serviceMonitor.selector | object | `{"release":"prometheus-operator"}` | Defaults to kube-prometheus install (CoreOS recommended), but should be set according to Prometheus install. |
| tolerations | list | `[]` | List of node taints to tolerate (requires Kubernetes >= 1.6). |
| topologySpreadConstraints | list | `[]` | You can use topology spread constraints to control how Pods are spread across your cluster among failure-domains such as regions, zones, nodes, and other user-defined topology domains (requires Kubernetes >= 1.19). |
| updateStrategy | object | `{}` | Deployment update strategy |