Recent changes configured providers to set stable node label names
exclusively (i.e. LabelTopologyZone and not LabelZoneFailureDomain, etc.),
with older label names backfilled at nodeInfo template generation time
(from GetNodeInfoFromTemplate), which isn't invoked from most test cases.
GCE NodePrice() might have been dereferencing potentially missing labels.
Also run hack/update-gofmt.sh where hack/verify-all.sh fails, to pass CI.
And at the same time only set stable labels in all buildGenericLabels
implementations.
This fixes issues when a node group has no nodes yet and node labels are
built using buildGenericLabels and the node-template labels.
Issues include (anti-)affinity rules and nodeSelectors on the given
labels yielding false negatives for candidate nodes, which leads to ASGs
never scaling up.
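The label handling described above (providers set only stable names, and deprecated names are backfilled when templates are generated) can be sketched as follows. This is a minimal illustration, not the autoscaler's code: `backfillDeprecatedLabels` and `lookupZone` are hypothetical helpers, and the mapping table is an assumed subset. Only the label keys themselves are the well-known Kubernetes ones.

```go
package main

import "fmt"

// Well-known Kubernetes label keys. The pairing below is an illustrative
// subset, not the full table used by any provider.
const (
	labelTopologyZone      = "topology.kubernetes.io/zone"
	labelZoneFailureDomain = "failure-domain.beta.kubernetes.io/zone"
	labelTopologyRegion    = "topology.kubernetes.io/region"
	labelZoneRegion        = "failure-domain.beta.kubernetes.io/region"
)

// stableToDeprecated maps each stable key to its deprecated equivalent.
var stableToDeprecated = map[string]string{
	labelTopologyZone:   labelZoneFailureDomain,
	labelTopologyRegion: labelZoneRegion,
}

// backfillDeprecatedLabels (hypothetical) returns a copy of labels in which
// every deprecated key that is absent gets the value of its stable twin.
func backfillDeprecatedLabels(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		out[k] = v
	}
	for stable, deprecated := range stableToDeprecated {
		v, ok := labels[stable]
		if !ok {
			continue // nothing to copy from
		}
		if _, exists := out[deprecated]; !exists {
			out[deprecated] = v
		}
	}
	return out
}

// lookupZone (hypothetical) reads the zone without assuming either key is
// present: it prefers the stable key and falls back to the deprecated one.
func lookupZone(labels map[string]string) (string, bool) {
	if z, ok := labels[labelTopologyZone]; ok {
		return z, true
	}
	z, ok := labels[labelZoneFailureDomain]
	return z, ok
}

func main() {
	stableOnly := map[string]string{labelTopologyZone: "europe-west3-b"}
	fmt.Println(backfillDeprecatedLabels(stableOnly)[labelZoneFailureDomain])
	zone, ok := lookupZone(stableOnly)
	fmt.Println(zone, ok)
}
```

Code paths that run without the backfill step (such as most tests) only see the stable keys, which is why a lookup that checks for presence is safer than indexing the deprecated key directly.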
The new `CreateInstances()` upscale method, which replaces `Resize()` API
calls, generates new instance names based on the MIG's name (from
`mig.GceRef()`).
Before that change, `Resize()`-initiated upscales prompted MIGs to
spawn instances named after the MIG's `BaseInstanceName` attribute.
Accordingly, `GetMigForInstance()` (still) uses the MIG's
`BaseInstanceName` to map instances to their parent MIG and discover
which MIGs need an immediate refresh.
Down the line, the periodic `clusterstate.updateReadinessStats()`
goroutines won't be able to map new ready nodes to their parent MIGs
(until the cache is backfilled from the k8s nodes' providerID, i.e.
by an hourly goroutine), and those MIGs will be considered non-ready
(because the MIG's size is > 0 while it has no known ready instances).
So after a first upscale, MIGs whose BaseInstanceName differs from the
MIG's Name can't be scaled up again for a while. Example symptoms:
```
cluster-autoscaler W0719 12:35:43.166563 6 clusterstate.go:447] Failed to find readiness information for https://www.googleapis.com/compute/v1/projects/REDACTED-PROJECT/zones/europe-west3-b/instanceGroups/REDACTED-MIGNAME
cluster-autoscaler W0719 12:35:43.193469 6 clusterstate.go:626] Readiness for node group https://www.googleapis.com/compute/v1/projects/REDACTED-PROJECT/zones/europe-west3-b/instanceGroups/REDACTED-MIGNAME not found
```
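The mapping failure behind these warnings can be reproduced with a simplified prefix lookup. This is only a sketch: the `mig` struct and `migForInstance` are hypothetical stand-ins for the real cache and `GetMigForInstance()`, and the instance names are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// mig is a simplified, hypothetical stand-in for a managed instance group.
type mig struct {
	Name             string
	BaseInstanceName string
}

// migForInstance (hypothetical) maps an instance to its parent MIG by
// checking whether the instance name starts with "<BaseInstanceName>-".
func migForInstance(migs []mig, instanceName string) (mig, bool) {
	for _, m := range migs {
		if strings.HasPrefix(instanceName, m.BaseInstanceName+"-") {
			return m, true
		}
	}
	return mig{}, false
}

func main() {
	migs := []mig{{Name: "workers-mig", BaseInstanceName: "k8s-workers"}}

	// Instance named after BaseInstanceName: the lookup finds its MIG.
	_, found := migForInstance(migs, "k8s-workers-x7k2")
	fmt.Println("named after BaseInstanceName:", found)

	// Instance named after the MIG's own name: the lookup misses, so the
	// node cannot be counted as a ready member of its group.
	_, found = migForInstance(migs, "workers-mig-x7k2")
	fmt.Println("named after MIG name:", found)
}
```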
Besides the mapping cache issue, this changed the instance name prefixes
for some users. It might make sense to keep using base names when
explicitly provided (they might be used for e.g. identification, or to
respect name length limits) and to avoid a breaking change before
`CreateInstances` hits a release.
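One way to express that preference is sketched below. It assumes that an empty `BaseInstanceName` means none was provided; `migInfo`, `instanceNamePrefix` and `newInstanceNames` are hypothetical names, and the suffixes are passed in by the caller rather than generated as the real code would.

```go
package main

import "fmt"

// migInfo is a simplified, hypothetical view of a managed instance group.
type migInfo struct {
	Name             string
	BaseInstanceName string
}

// instanceNamePrefix (hypothetical) prefers the explicitly provided base
// name and falls back to the MIG's own name when none is set.
func instanceNamePrefix(m migInfo) string {
	if m.BaseInstanceName != "" {
		return m.BaseInstanceName
	}
	return m.Name
}

// newInstanceNames (hypothetical) builds one "<prefix>-<suffix>" name per
// suffix supplied by the caller.
func newInstanceNames(m migInfo, suffixes []string) []string {
	names := make([]string, 0, len(suffixes))
	for _, s := range suffixes {
		names = append(names, instanceNamePrefix(m)+"-"+s)
	}
	return names
}

func main() {
	withBase := migInfo{Name: "workers-mig", BaseInstanceName: "k8s-workers"}
	withoutBase := migInfo{Name: "workers-mig"}
	fmt.Println(newInstanceNames(withBase, []string{"ab12", "cd34"}))
	fmt.Println(newInstanceNames(withoutBase, []string{"ab12"}))
}
```

Names built this way keep the prefix that a `BaseInstanceName`-based lookup expects, so existing naming conventions and the instance-to-MIG mapping stay consistent.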
This change updates the AWS examples to use a 600Mi memory limit, because CAS downloads a pricing file containing EC2 instance info at startup, and that file grows each time new EC2 instance information becomes available. Currently the largest region hits the 300Mi limit when downloading that file, so we are increasing the memory limit in our examples for customers.
This change adds GitHub users arunmk, mrajashree, jackfrancis,
shysank, and randomvariable to the reviewers for the cluster-api
provider. It also removes frobware and ncdc from the approvers and
reviewers.
Each test works in isolation, but they cause a panic when the entire
suite is run (e.g. make test-in-docker), because the underlying
metrics library panics when the same metric is registered twice.
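The failure mode can be reproduced with a toy registry, shown below together with one common guard. This is a sketch under assumptions: `registry`, `mustRegister` and `registerMetrics` are hypothetical stand-ins, not the real metrics library, and the `sync.Once` guard is one possible remedy, not necessarily the fix applied here.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a toy stand-in for a metrics registry that refuses duplicates.
type registry struct {
	mu    sync.Mutex
	names map[string]struct{}
}

func newRegistry() *registry {
	return &registry{names: map[string]struct{}{}}
}

// mustRegister panics when the same metric name is registered twice.
func (r *registry) mustRegister(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.names[name]; dup {
		panic("duplicate metric registration: " + name)
	}
	r.names[name] = struct{}{}
}

var (
	defaultRegistry = newRegistry()
	registerOnce    sync.Once
)

// registerMetrics (hypothetical) may be called by every test: the
// sync.Once guard ensures the shared registry is only touched once.
func registerMetrics() {
	registerOnce.Do(func() {
		defaultRegistry.mustRegister("scale_up_total")
	})
}

// panics reports whether calling f results in a panic.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	r := newRegistry()
	r.mustRegister("scale_up_total")
	// A second registration of the same name panics, as when two tests
	// in one process each register the same metric.
	fmt.Println("unguarded double registration panics:",
		panics(func() { r.mustRegister("scale_up_total") }))

	// The guarded variant can be called repeatedly without panicking.
	fmt.Println("guarded double registration panics:",
		panics(func() { registerMetrics(); registerMetrics() }))
}
```

An alternative with the same effect is to give each test its own registry instance, so no state is shared across the suite.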