Node group discovery is now handled by cloudprovider.Refresh() in all cases.
Additionally, explicit node groups can now be used alongside autodiscovery.
This commit adds a new usage of the --node-group-auto-discovery flag intended
for use with the GCE cloud provider. GCE instance groups can be automatically
discovered based on a prefix of their group name. Example usage:
--node-group-auto-discovery=mig:prefix=k8s-mig,minNodes=0,maxNodes=10
Note that unlike the existing AWS ASG autodetection functionality we must
specify the min and max nodes in the flag. This is because MIGs store only
a target size in the GCE API - they do not have a min and max size we can
infer via the API.
In order to alleviate this limitation a little we allow multiple uses of the
autodiscovery flag. For example to discover two classes (big and small) of
instance groups with different size limits:
./cluster-autoscaler \
--node-group-auto-discovery=mig:prefix=k8s-a-small,minNodes=1,maxNodes=10 \
--node-group-auto-discovery=mig:prefix=k8s-a-big,minNodes=1,maxNodes=100
Zonal clusters (i.e. multizone = false in the cloud config) will detect all
managed instance groups within the cluster's zone. Regional clusters will
detect all matching (zonal) managed instance groups within any of that region's
zones.
This flag was true in default setups for every platform,
we haven't heard about any user changing it to false and
after removing check on PodScheduled condition setting it
to false would basically break CA.
This is an alternative implementation of https://github.com/kubernetes/contrib/pull/1982
Notable differences from the original PR are:
* A new flag named `--node-group-auto-discovery` is introduced for opting in to enable the auto-discovery feature.
* For example, specifying `--cloud-provider aws --node-group-auto-discovery asg:tag=k8s.io/cluster-autoscaler/enabled` instructs CA to auto-discover ASGs tagged with `k8s.io/cluster-autoscaler/enabled` to be used as target node groups
* The new code path introduced by this PR is executed only when `node-group-auto-discovery` is specified. There is relatively less chance to break existing features by introducing this change
Resolves https://github.com/kubernetes/contrib/issues/1956
---
Other notes:
* We rely mainly on the `DescribeTags` API rather than `DescribeAutoScalingGroups` so that AWS can filter out unnecessary ASGs which doesn't belong to the k8s cluster, for us.
* If we relied on `DescribeAutoScalingGroups` here, as it doesn't support `Filter`ing, we'd need to iterate over ALL the ASGs available in an AWS account, which isn't desirable due to unnecessary excessive API calls and network usages
* Update cloudprovider/aws/README for the new configuration
* Warn abount invalid combination of flags
according to the review comment https://github.com/kubernetes/autoscaler/pull/11#discussion_r113713138
* Emit a validation error when both --nodes and --node-group-auto-discovery are specified
according to the review comment https://github.com/kubernetes/autoscaler/pull/11#discussion_r113958080
TODO/Possible future improvements before recommending this to everyone:
* Cache the result of an auto-discovery for a configurable period, so that we won't invoke DescribeTags and DescribeAutoScalingGroup APIs too many times
Adds a new optional flag named `configmap` to specify the name of a configmap containing node group specs.
The configmap is polled every `scan-interval` seconds to reconfigure cluster-autoscaler dynamically at runtime.
Example usage:
```
./cluster-autoscaler --v=4 --cloud-provider=aws --skip-nodes-with-local-storage=false --logtostderr --leader-elect=false --configmap=cluster-autoscaler --logtostderr
```
The configmap would look like:
```yaml
kind: ConfigMap
apiVersion: v1
metadata:
name: cluster-autoscaler
namespace: kube-system
data:
settings: |-
{
"nodeGroups": [
{
"minSize": 1,
"maxSize": 2,
"name": "kubeawstest-nodepool1-AutoScaleWorker-1VWD4GAVG35L5"
}
]
}
```
Other notes:
* Make namespace defaults to "kube-system"
according to https://github.com/kubernetes/contrib/pull/2226#discussion_r94144267
* Trigger a full-recreate on a configuration change
according to https://github.com/kubernetes/contrib/pull/2226#issuecomment-269617410
* Introduced `autoscaler/` and moved all the dynamic/recreatable-at-runtime parts of autoscaler into there (Update: the package is now named `core` according to https://github.com/kubernetes/contrib/pull/2226#issuecomment-273071663)
* Extracted the core of CA(=`func Run()` in `main.go`) into `Autoscaler`
* `DynamicAutoscaler` is a wrapper around `Autoscaler` which achieves reconfiguration of CA by recreating an `Autoscaler` instance on a configmap change.
* Moved `scale_down*.go`, `scale_up*.go` and `utils*.go` into the `autoscaler` package accordingly because they seemed to be meant to be collocated in the same package as the core of CA (which is now implemented as `Autoscaler`)
* Moved the `createEventRecorder` func from the `main` package to the `utils/kubernetes` package to make it importable from both `main` and `autoscaler`