Currently the label used to identify controller/master nodes is
hard-coded to `node-role.kubernetes.io/master`.
There have been some conversations about replacing the label
with `node-role.kubernetes.io/control-plane`.
In [Lokomotive](https://github.com/kinvolk/lokomotive), the label that
identifies the controller/master node is `node.kubernetes.io/master`; the
reasons for this are described in this [issue](https://github.com/kinvolk/lokomotive/issues/227).
This commit makes the label configurable through the env variable
`CONTROLLER_NODE_IDENTIFIER_LABEL` in the deployment. If the variable is
set, its value is used to identify controller/master nodes; if it is not
set, the existing behaviour is kept and the existing label is used.
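
The fallback described above can be sketched roughly as follows. This is a
minimal illustration, not the code from this commit: the helper names
`resolveControllerNodeLabel` and `isControllerNode` are hypothetical, and
treating an empty value as "not set" is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"os"
)

const (
	// Env variable named in the commit message.
	controllerNodeLabelEnv = "CONTROLLER_NODE_IDENTIFIER_LABEL"
	// The previously hard-coded label, kept as the fallback.
	defaultControllerNodeLabel = "node-role.kubernetes.io/master"
)

// resolveControllerNodeLabel returns the override when one is given and the
// existing label otherwise. Treating "" as unset is an assumption.
func resolveControllerNodeLabel(envValue string) string {
	if envValue != "" {
		return envValue
	}
	return defaultControllerNodeLabel
}

// isControllerNode (hypothetical helper) reports whether the node's labels
// contain the given label key, regardless of the label's value.
func isControllerNode(nodeLabels map[string]string, labelKey string) bool {
	_, ok := nodeLabels[labelKey]
	return ok
}

func main() {
	labelKey := resolveControllerNodeLabel(os.Getenv(controllerNodeLabelEnv))
	nodeLabels := map[string]string{"node.kubernetes.io/master": ""}
	fmt.Println(labelKey, isControllerNode(nodeLabels, labelKey))
}
```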
Signed-off-by: Imran Pochi <imran@kinvolk.io>
This commit adds `equinixmetal://` as another string prefix to consider,
along with the existing prefix `packet://`.
When the K8s API is queried for the providerID in the Node Spec, some
machines return `packet://<uuid>`, whereas others use the
`equinixmetal://` prefix. That prefix was not trimmed, so the untrimmed
string was sent to the Equinix Metal API when querying device
information, which resulted in a 404.
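
A minimal sketch of the trimming, assuming the device ID is whatever follows
the prefix. The function name `deviceIDFromProviderID` and the example IDs
are hypothetical and not taken from the provider code.

```go
package main

import (
	"fmt"
	"strings"
)

// Both prefixes mentioned in the commit message.
var providerIDPrefixes = []string{"equinixmetal://", "packet://"}

// deviceIDFromProviderID strips the first matching prefix and returns the
// remainder. A providerID without a known prefix is returned unchanged.
func deviceIDFromProviderID(providerID string) string {
	for _, prefix := range providerIDPrefixes {
		if strings.HasPrefix(providerID, prefix) {
			return strings.TrimPrefix(providerID, prefix)
		}
	}
	return providerID
}

func main() {
	// Placeholder IDs, for illustration only.
	fmt.Println(deviceIDFromProviderID("packet://example-device-id"))
	fmt.Println(deviceIDFromProviderID("equinixmetal://example-device-id"))
}
```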
Signed-off-by: Imran Pochi <imran@kinvolk.io>
In the latest version of cluster-autoscaler (cloudprovider: packet), the
code panics and the pods go into CrashLoopBackOff due to an entry
assignment on a nil map.
This commit fixes that by initializing the ConfigFile instance.
I believe this situation arises when the config file doesn't contain
any information about the nodepool and the `default` config is also not
present, but the code does not take care of the use case where the
`Global` section is defined in the config file.
Below is the panic reproduced when `[Global]` is used in the config
file.
```
panic: assignment to entry in nil map
goroutine 131 [running]:
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet.createPacketManagerRest(0x44cf260, 0xc00085e448, 0xc000456670, 0x1, 0x1, 0x0, 0x0, 0x0, 0x3fe0000000000000, 0x3fe0000000000000, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet/packet_manager_rest.go:307 +0xaca
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet.createPacketManager(0x44cf260, 0xc00085e448, 0xc000456670, 0x1, 0x1, 0x0, 0x0, 0x0, 0x3fe0000000000000, 0x3fe0000000000000, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet/packet_manager.go:64 +0x179
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet.BuildPacket(0x3fe0000000000000, 0x3fe0000000000000, 0x1bf08eb000, 0x1176592e000, 0xa, 0x0, 0x4e200, 0x0, 0x186a0000000000, 0x0, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet/packet_cloud_provider.go:164 +0xe5
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder.buildCloudProvider(0x3fe0000000000000, 0x3fe0000000000000, 0x1bf08eb000, 0x1176592e000, 0xa, 0x0, 0x4e200, 0x0, 0x186a0000000000, 0x0, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder/builder_all.go:91 +0x31f
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder.NewCloudProvider(0x3fe0000000000000, 0x3fe0000000000000, 0x1bf08eb000, 0x1176592e000, 0xa, 0x0, 0x4e200, 0x0, 0x186a0000000000, 0x0, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder/cloud_provider_builder.go:45 +0x1e6
k8s.io/autoscaler/cluster-autoscaler/core.initializeDefaultOptions(0xc0013876e0, 0x452ef01, 0xc000d80e20)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/core/autoscaler.go:101 +0x2fd
k8s.io/autoscaler/cluster-autoscaler/core.NewAutoscaler(0x3fe0000000000000, 0x3fe0000000000000, 0x1bf08eb000, 0x1176592e000, 0xa, 0x0, 0x4e200, 0x0, 0x186a0000000000, 0x0, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/core/autoscaler.go:65 +0x43
main.buildAutoscaler(0xc000313600, 0xc000d00000, 0x4496df, 0x7f9c7b60b4f0)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/main.go:337 +0x368
main.run(0xc00063e230)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/main.go:343 +0x39
main.main.func2(0x453b440, 0xc00029d380)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/main.go:447 +0x2a
created by k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:207 +0x113
```
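
The failure mode and the kind of fix can be illustrated in isolation. The
types and field names below (`configFile`, `nodePoolConfig`, `NodeGroups`)
are simplified, hypothetical stand-ins and do not mirror the real structs in
`packet_manager_rest.go`; the point is only that a map field of a
zero-valued struct is nil and must be initialized before assignment.

```go
package main

import "fmt"

// nodePoolConfig and configFile are hypothetical stand-ins for the
// provider's config types.
type nodePoolConfig struct {
	ClusterName string
}

type configFile struct {
	Global     nodePoolConfig
	NodeGroups map[string]*nodePoolConfig
}

// setDefaultNodeGroup registers the Global section as the "default" node
// group. Without the nil check, the assignment panics with
// "assignment to entry in nil map" when NodeGroups was never populated.
func setDefaultNodeGroup(cfg *configFile) {
	if cfg.NodeGroups == nil {
		cfg.NodeGroups = make(map[string]*nodePoolConfig)
	}
	if _, ok := cfg.NodeGroups["default"]; !ok {
		cfg.NodeGroups["default"] = &cfg.Global
	}
}

func main() {
	// Only a Global section was parsed, so NodeGroups is still nil here.
	cfg := configFile{Global: nodePoolConfig{ClusterName: "example"}}
	setDefaultNodeGroup(&cfg)
	fmt.Println(cfg.NodeGroups["default"].ClusterName)
}
```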
Signed-off-by: Imran Pochi <imran@kinvolk.io>
Supports providing different NodeInfos sources (either upstream or in
local forks, e.g. to properly implement variants like in #4000).
This also moves a large and specialized chunk of code out of core, and
removes the need to maintain the GetNodeInfosForGroups() cache and pass
it in from the side, as processors can hold their own state.
No functional changes to GetNodeInfosForGroups() apart from the
mechanical changes due to the move: it now calls a few util functions
from the core/utils package, picks attributes from the context (the
processor takes the context as an argument rather than ListerRegistry +
PredicateChecker + CloudProvider), and uses the built-in cache rather
than receiving it as an argument.
The pricing JSON for us-east-1 is currently 129MB. Fetching it into
memory in full and then parsing it results in a large memory footprint
on startup, and can lead to the autoscaler being OOMKilled.
Change the ReadAll/Unmarshal logic to a stream decoder to significantly
reduce memory use.
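
As a rough illustration of the stream-decoder approach, the sketch below
walks a document with `encoding/json`'s `Decoder` and decodes one entry at a
time instead of buffering the whole body. The document shape
(`{"products": {"<key>": {...}}}`), the `product` type and the function
names are assumptions made for the example, not the actual pricing code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// product is a hypothetical, trimmed-down entry.
type product struct {
	SKU string `json:"sku"`
}

// expectDelim consumes one token and checks that it is the given delimiter.
func expectDelim(dec *json.Decoder, want json.Delim) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != want {
		return fmt.Errorf("expected %q, got %v", want, tok)
	}
	return nil
}

// streamProducts calls visit for each entry under "products", holding only
// one entry in memory at a time.
func streamProducts(r io.Reader, visit func(key string, p product)) error {
	dec := json.NewDecoder(r)
	if err := expectDelim(dec, '{'); err != nil {
		return err
	}
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return err
		}
		if key, _ := tok.(string); key != "products" {
			// Skip other top-level values; each one is buffered only briefly.
			var skipped json.RawMessage
			if err := dec.Decode(&skipped); err != nil {
				return err
			}
			continue
		}
		if err := expectDelim(dec, '{'); err != nil {
			return err
		}
		for dec.More() {
			tok, err := dec.Token()
			if err != nil {
				return err
			}
			name, _ := tok.(string)
			var p product
			if err := dec.Decode(&p); err != nil {
				return err
			}
			visit(name, p)
		}
		if err := expectDelim(dec, '}'); err != nil {
			return err
		}
	}
	return expectDelim(dec, '}')
}

// countProducts is a small convenience wrapper used for the demo below.
func countProducts(doc string) (int, error) {
	n := 0
	err := streamProducts(strings.NewReader(doc), func(string, product) { n++ })
	return n, err
}

func main() {
	doc := `{"formatVersion":"v1.0","products":{"A":{"sku":"A"},"B":{"sku":"B"}}}`
	n, err := countProducts(doc)
	fmt.Println(n, err)
}
```

In real use the reader would be the HTTP response body, so the full payload
never has to be held in memory as a single byte slice.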
Magnum allows using the microversion string "latest",
and it will replace it internally with the highest
microversion that it supports.
This will let the autoscaler use microversion 1.10 which
allows scaling groups to 0 nodes, if it is available.
The autoscaler will still be able to use microversion 1.9
on older versions of Magnum.
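
One detail worth keeping in mind when reasoning about these versions:
microversions have to be compared numerically per component, because as
plain strings "1.10" sorts before "1.9". The sketch below is illustrative
only; `parseMicroversion` and `microversionAtLeast` are hypothetical helpers,
and how the resolved microversion is obtained from Magnum is outside the
scope of this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMicroversion splits a "major.minor" string into its two integers.
func parseMicroversion(v string) (int, int, error) {
	parts := strings.SplitN(v, ".", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("malformed microversion %q", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("malformed microversion %q: %v", v, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("malformed microversion %q: %v", v, err)
	}
	return major, minor, nil
}

// microversionAtLeast reports whether version v is at least minimum,
// comparing major first and then minor.
func microversionAtLeast(v, minimum string) (bool, error) {
	vMajor, vMinor, err := parseMicroversion(v)
	if err != nil {
		return false, err
	}
	mMajor, mMinor, err := parseMicroversion(minimum)
	if err != nil {
		return false, err
	}
	if vMajor != mMajor {
		return vMajor > mMajor, nil
	}
	return vMinor >= mMinor, nil
}

func main() {
	// 1.10 is the microversion the commit message associates with
	// scaling groups to 0 nodes.
	for _, v := range []string{"1.9", "1.10"} {
		ok, err := microversionAtLeast(v, "1.10")
		fmt.Println(v, ok, err)
	}
}
```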
csidrivers.storage.k8s.io and csistoragecapacities.storage.k8s.io are
available on EKS 1.21. Add permissions for them to the ClusterRole in
the example to avoid the error messages.
Since Cluster Autoscaler versioning should be in sync with Kubernetes,
update-vendor.sh can simply set the version after a successful
dependency update.