Cluster Autoscaler on AWS

The cluster autoscaler on AWS scales worker nodes within any specified autoscaling group. It runs as a Deployment in your cluster. This README covers the steps required to get the cluster autoscaler up and running.

Kubernetes Version

Cluster autoscaler must run on Kubernetes v1.3.0 or greater.

Permissions

The worker node running the cluster autoscaler will need access to certain AWS resources and actions.

A minimum IAM policy would look like:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*"
        }
    ]
}
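
One way to attach this policy, assuming it is saved as ca-policy.json and your workers use an instance role named k8s-worker-role (both placeholder names), is as an inline role policy via the AWS CLI:

# Attach the minimal policy above as an inline policy on the worker instance role.
# Role and policy names here are placeholders; substitute your own.
aws iam put-role-policy \
    --role-name k8s-worker-role \
    --policy-name cluster-autoscaler \
    --policy-document file://ca-policy.json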

If you'd like to auto-discover node groups by specifying the --node-group-auto-discovery flag, a DescribeTags permission is also required:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*"
        }
    ]
}

Unfortunately, AWS does not support ARNs for autoscaling groups yet, so you must use "*" as the resource.

Deployment Specification

1 ASG Setup (min: 1, max: 10, ASG Name: k8s-worker-asg-1)

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
        - image: gcr.io/google_containers/cluster-autoscaler:v0.6.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --nodes=1:10:k8s-worker-asg-1
          env:
            - name: AWS_REGION
              value: us-east-1
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-certificates.crt"

Multiple ASG Setup

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
        - image: gcr.io/google_containers/cluster-autoscaler:v0.6.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --nodes=1:10:k8s-worker-asg-1
            - --nodes=1:3:k8s-worker-asg-2
          env:
            - name: AWS_REGION
              value: us-east-1
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-certificates.crt"

Master Node Setup

To run the CA pod on a master node, the CA deployment should tolerate the master taint, and nodeSelector should be used to schedule the pod on a master node.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      nodeSelector:
        kubernetes.io/role: master
      containers:
        - image: gcr.io/google_containers/cluster-autoscaler:{{ ca_version }}
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --nodes={{ node_asg_min }}:{{ node_asg_max }}:{{ name }}
          env:
            - name: AWS_REGION
              value: {{ region }}
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-certificates.crt"

Auto-Discovery Setup

As of version v0.5.1, Docker images including support for --node-group-auto-discovery have not yet been published to the official repository. Please check out the latest source of this project locally and run REGISTRY=<your docker repo> make release to build and push an image yourself. Then, a manifest like the one below will run a cluster-autoscaler that auto-discovers ASGs tagged with k8s.io/cluster-autoscaler/enabled and kubernetes.io/cluster/<YOUR CLUSTER NAME> as node groups. Note that:

  • kubernetes.io/cluster/<YOUR CLUSTER NAME> is required when k8s.io/cluster-autoscaler/enabled is used across many clusters, to prevent ASGs from different clusters from being recognized as node groups
  • There are no --nodes flags passed to cluster-autoscaler, because the node groups are discovered automatically from the tags
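
For example, one way to add both tags to an ASG with the AWS CLI (k8s-worker-asg-1 and mycluster are placeholders for your ASG and cluster names; the autoscaler matches on the tag keys, so the values below are arbitrary):

# Tag the ASG so auto-discovery picks it up as a node group.
aws autoscaling create-or-update-tags --tags \
    ResourceId=k8s-worker-asg-1,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true \
    ResourceId=k8s-worker-asg-1,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/mycluster,Value=owned,PropagateAtLaunch=true

With the tags in place, a manifest like the following runs the autoscaler in auto-discovery mode: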
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
        - image: <your docker repo>/cluster-autoscaler:dev
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,kubernetes.io/cluster/<YOUR CLUSTER NAME>
          env:
            - name: AWS_REGION
              value: us-east-1
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-certificates.crt"

Scaling a node group to 0

From CA 0.6.1 it is possible to scale a node group to 0 (and, obviously, from 0), assuming that all scale-down conditions are met.

If you are using nodeSelector, you need to tag the ASG with a node-template key "k8s.io/cluster-autoscaler/node-template/label/"; if you are using taints, use "k8s.io/cluster-autoscaler/node-template/taint/".

For example, for a node label of foo=bar you would tag the ASG with:

{
    "ResourceType": "auto-scaling-group",
    "ResourceId": "foo.example.com",
    "PropagateAtLaunch": true,
    "Value": "bar",
    "Key": "k8s.io/cluster-autoscaler/node-template/label/foo"
}

And for a taint of "dedicated": "foo:NoSchedule" you would tag the ASG with:

{
    "ResourceType": "auto-scaling-group",
    "ResourceId": "foo.example.com",
    "PropagateAtLaunch": true,
    "Value": "foo:NoSchedule",
    "Key": "k8s.io/cluster-autoscaler/node-template/taint/dedicated"
}
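
The two JSON objects above are in the shape the Auto Scaling API expects, so one sketch of applying and then verifying them (tags.json is a placeholder filename wrapping both objects in a "Tags" array):

# tags.json contains: { "Tags": [ <label tag object>, <taint tag object> ] }
aws autoscaling create-or-update-tags --cli-input-json file://tags.json
# Confirm the tags are present on the ASG.
aws autoscaling describe-tags --filters Name=auto-scaling-group,Values=foo.example.com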

If you'd like to scale node groups from 0, a DescribeLaunchConfigurations permission is also required:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeTags",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*"
        }
    ]
}

Common Notes and Gotchas:

  • The file /etc/ssl/certs/ca-certificates.crt should exist by default on your EC2 instance.
  • Cluster autoscaler is not zone aware (for now), so if you wish to span multiple availability zones in your autoscaling groups, beware that cluster autoscaler will not evenly distribute nodes across them. For more information, see https://github.com/kubernetes/contrib/pull/1552#r75532949.
  • By default, cluster autoscaler will not terminate nodes running pods in the kube-system namespace. You can override this default behaviour by passing in the --skip-nodes-with-system-pods=false flag.
  • By default, cluster autoscaler waits 10 minutes between scale-down operations; you can adjust this using the --scale-down-delay flag, e.g. --scale-down-delay=5m to decrease the scale-down delay to 5 minutes.
  • If you're running multiple ASGs, the --expander flag supports three options: random, most-pods and least-waste. random expands a random ASG on scale-up. most-pods expands the ASG that can schedule the most pods. least-waste expands the ASG that wastes the least amount of CPU/memory resources. In the event of a tie, cluster autoscaler falls back to random. A combined example of these flags follows this list.
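
For reference, a sketch of how several of the flags above combine on a single command line (the ASG name is a placeholder):

# Example invocation combining the flags discussed above.
./cluster-autoscaler \
    --v=4 \
    --stderrthreshold=info \
    --cloud-provider=aws \
    --skip-nodes-with-system-pods=false \
    --scale-down-delay=5m \
    --expander=least-waste \
    --nodes=1:10:k8s-worker-asg-1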