This allows a Machine{Set,Deployment} to scale up from and down to 0,
provided the following annotations are set:
```yaml
apiVersion: v1
items:
- apiVersion: machine.openshift.io/v1beta1
  kind: MachineSet
  metadata:
    annotations:
      machine.openshift.io/cluster-api-autoscaler-node-group-min-size: "0"
      machine.openshift.io/cluster-api-autoscaler-node-group-max-size: "6"
      machine.openshift.io/vCPU: "2"
      machine.openshift.io/memoryMb: 8G
      machine.openshift.io/GPU: "1"
      machine.openshift.io/maxPods: "100"
```
Note that `machine.openshift.io/GPU` and `machine.openshift.io/maxPods`
are optional.
For autoscaling from zero, the autoscaler should convert the memory value
received in the annotation to bytes using powers of two, consistent with
other providers, and fail if the received format is not as expected. This
gives robust behaviour consistent with cloud provider APIs and provider
implementations.
https://cloud.google.com/compute/all-pricing
https://www.iec.ch/si/binary.htm
https://github.com/openshift/kubernetes-autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/aws_manager.go#L366
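As a rough illustration, the conversion could look like the Go sketch below.
The helper name `memoryToBytes` and the set of accepted suffixes are
assumptions made for this example, not the provider's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// memoryToBytes interprets annotation values such as "8192" (MiB), "8192M",
// or "8G" using binary multipliers (1 M = 1024*1024 bytes, 1 G = 1024^3
// bytes) and fails on anything it does not recognise.
func memoryToBytes(value string) (int64, error) {
	multiplier := int64(1024 * 1024) // bare numbers are treated as MiB
	switch {
	case strings.HasSuffix(value, "G"):
		multiplier = 1024 * 1024 * 1024
		value = strings.TrimSuffix(value, "G")
	case strings.HasSuffix(value, "M"):
		value = strings.TrimSuffix(value, "M")
	}
	n, err := strconv.ParseInt(strings.TrimSpace(value), 10, 64)
	if err != nil || n < 0 {
		return 0, fmt.Errorf("unexpected memory annotation %q", value)
	}
	return n * multiplier, nil
}

func main() {
	b, err := memoryToBytes("8G")
	if err != nil {
		panic(err)
	}
	fmt.Println(b) // 8589934592
}
```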
Co-authored-by: Enxebre <alberto.garcial@hotmail.com>
Co-authored-by: Joel Speed <joel.speed@hotmail.co.uk>
Co-authored-by: Michael McCune <elmiko@redhat.com>
Because the autoscaler assumes it can delete nodes in parallel, it fetches
the node group for each node in a separate goroutine and then instructs each
node group to delete a single node.
Because the node group is not shared across goroutines, the cached replica
count in the ScalableResource can become stale; as a result, when the
autoscaler attempts to scale down multiple nodes at a time, the Cluster API
provider actually removes only a single node.
To prevent this, we must ensure we have a fresh replica count for every
scale-down attempt.
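A minimal sketch of that idea follows. The `nodeGroup` type and the
`machineSetClient` interface are hypothetical stand-ins for the provider's
real types, and how the fresh count is actually obtained is deliberately left
behind the interface (see the note on API throttling below).

```go
package clusterapi

import "fmt"

// machineSetClient abstracts however the current replica count is read and
// updated; the interface and method names are illustrative only.
type machineSetClient interface {
	GetReplicas(name string) (int32, error)
	SetReplicas(name string, replicas int32) error
}

type nodeGroup struct {
	name   string
	client machineSetClient
}

// DeleteNode decrements the replica count by one. Reading a fresh count on
// every call avoids decrementing from a replica value cached when the node
// group was constructed, which is what caused only a single node to be
// removed during a multi-node scale down.
func (ng *nodeGroup) DeleteNode() error {
	replicas, err := ng.client.GetReplicas(ng.name) // fresh, never cached at construction
	if err != nil {
		return err
	}
	if replicas == 0 {
		return fmt.Errorf("node group %s is already at zero replicas", ng.name)
	}
	return ng.client.SetReplicas(ng.name, replicas-1)
}
```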
Retrieving it from the Kubernetes API on every attempt is not desirable,
though: that easily causes client-side throttling, which in turn makes each
autoscaler run take multiple seconds even when only a small number of
NodeGroups is involved and there is nothing to do.
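One way to get a current value without a per-attempt API GET is to serve
reads from an informer-backed lister that a watch keeps up to date. The
sketch below assumes a dynamic informer for MachineSets; the wiring, GVR,
and function names are illustrative assumptions, not necessarily how the
provider implements it.

```go
package clusterapi

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
)

var machineSetGVR = schema.GroupVersionResource{
	Group:    "machine.openshift.io",
	Version:  "v1beta1",
	Resource: "machinesets",
}

// newMachineSetLister starts a dynamic shared informer for MachineSets and
// returns a lister whose reads are served from the local watch-backed cache.
func newMachineSetLister(client dynamic.Interface, stop <-chan struct{}) cache.GenericLister {
	factory := dynamicinformer.NewDynamicSharedInformerFactory(client, 10*time.Minute)
	informer := factory.ForResource(machineSetGVR)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	return informer.Lister()
}

// replicasFromCache reads spec.replicas for a MachineSet from the lister,
// avoiding a round trip to the API server on every autoscaler iteration.
func replicasFromCache(lister cache.GenericLister, namespace, name string) (int64, error) {
	obj, err := lister.ByNamespace(namespace).Get(name)
	if err != nil {
		return 0, err
	}
	u, ok := obj.(*unstructured.Unstructured)
	if !ok {
		return 0, fmt.Errorf("unexpected object type %T", obj)
	}
	replicas, found, err := unstructured.NestedInt64(u.Object, "spec", "replicas")
	if err != nil || !found {
		return 0, fmt.Errorf("spec.replicas not found on %s/%s", namespace, name)
	}
	return replicas, nil
}
```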