Currently, the label used to identify the controller/master node is hard-coded to
`node-role.kubernetes.io/master`.
There have been some conversations centered around replacing the label
with `node-role.kubernetes.io/control-plane`.
In [Lokomotive](github.com/kinvolk/lokomotive), the label used to identify
the controller/master node is `node.kubernetes.io/master`; the reasoning
for this is explained in this [issue](https://github.com/kinvolk/lokomotive/issues/227).
This commit makes the label configurable via the `CONTROLLER_NODE_IDENTIFIER_LABEL`
environment variable in the deployment. If set, its value is used to identify
controller/master nodes; if not set, the existing behaviour is kept and the
existing label is used.
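Roughly, the lookup could behave like the sketch below; the environment variable
name is the one introduced here, while the constant and function names are
illustrative only.
```go
package example

import "os"

// defaultControllerNodeLabel mirrors the previously hard-coded value.
const defaultControllerNodeLabel = "node-role.kubernetes.io/master"

// controllerNodeLabel returns the label used to identify controller/master
// nodes, preferring CONTROLLER_NODE_IDENTIFIER_LABEL when it is set.
func controllerNodeLabel() string {
	if label := os.Getenv("CONTROLLER_NODE_IDENTIFIER_LABEL"); label != "" {
		return label
	}
	return defaultControllerNodeLabel
}
```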
Signed-off-by: Imran Pochi <imran@kinvolk.io>
This commit adds `equinixmetal://` as another provider ID prefix to consider,
alongside the existing `packet://` prefix.
When the Kubernetes API is queried for the providerID in the Node spec, some
machines return `packet://<uuid>` while others return `equinixmetal://<uuid>`.
Without handling the second prefix, the string is not trimmed properly, which
results in a 404 when the untrimmed string is sent to the Equinix Metal API
for device information.
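The trimming boils down to handling both prefixes; a minimal sketch (the helper
name is illustrative, not the actual code path):
```go
package example

import "strings"

// deviceIDFromProviderID strips either known prefix from a Node's
// spec.providerID so the bare device UUID can be sent to the Equinix Metal API.
func deviceIDFromProviderID(providerID string) string {
	id := strings.TrimPrefix(providerID, "packet://")
	id = strings.TrimPrefix(id, "equinixmetal://")
	return id
}
```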
Signed-off-by: Imran Pochi <imran@kinvolk.io>
In the latest version of cluster-autoscaler (cloudprovider: packet), the
code panics and the pods go into CrashLoopBackOff due to an assignment to an
entry in a nil map.
This commit fixes that by initializing the ConfigFile instance.
I believe this situation arises when the config file doesn't contain any
information about the nodepool and no `default` section is present either,
but that alone does not cover the case where a `[Global]` section is defined
in the config file.
Below is the error reproduced when `[Global]` is used in the config
file.
```
panic: assignment to entry in nil map
goroutine 131 [running]:
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet.createPacketManagerRest(0x44cf260, 0xc00085e448, 0xc000456670, 0x1, 0x1, 0x0, 0x0, 0x0, 0x3fe0000000000000, 0x3fe0000000000000, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet/packet_manager_rest.go:307 +0xaca
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet.createPacketManager(0x44cf260, 0xc00085e448, 0xc000456670, 0x1, 0x1, 0x0, 0x0, 0x0, 0x3fe0000000000000, 0x3fe0000000000000, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet/packet_manager.go:64 +0x179
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet.BuildPacket(0x3fe0000000000000, 0x3fe0000000000000, 0x1bf08eb000, 0x1176592e000, 0xa, 0x0, 0x4e200, 0x0, 0x186a0000000000, 0x0, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/packet/packet_cloud_provider.go:164 +0xe5
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder.buildCloudProvider(0x3fe0000000000000, 0x3fe0000000000000, 0x1bf08eb000, 0x1176592e000, 0xa, 0x0, 0x4e200, 0x0, 0x186a0000000000, 0x0, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder/builder_all.go:91 +0x31f
k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder.NewCloudProvider(0x3fe0000000000000, 0x3fe0000000000000, 0x1bf08eb000, 0x1176592e000, 0xa, 0x0, 0x4e200, 0x0, 0x186a0000000000, 0x0, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/cloudprovider/builder/cloud_provider_builder.go:45 +0x1e6
k8s.io/autoscaler/cluster-autoscaler/core.initializeDefaultOptions(0xc0013876e0, 0x452ef01, 0xc000d80e20)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/core/autoscaler.go:101 +0x2fd
k8s.io/autoscaler/cluster-autoscaler/core.NewAutoscaler(0x3fe0000000000000, 0x3fe0000000000000, 0x1bf08eb000, 0x1176592e000, 0xa, 0x0, 0x4e200, 0x0, 0x186a0000000000, 0x0, ...)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/core/autoscaler.go:65 +0x43
main.buildAutoscaler(0xc000313600, 0xc000d00000, 0x4496df, 0x7f9c7b60b4f0)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/main.go:337 +0x368
main.run(0xc00063e230)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/main.go:343 +0x39
main.main.func2(0x453b440, 0xc00029d380)
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/main.go:447 +0x2a
created by k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run
/gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:207 +0x113
```
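The panic above is the standard nil-map pitfall in Go; the sketch below
reproduces it and shows the kind of initialization the fix adds (type and field
names are made up for illustration, not the provider's actual ones).
```go
package main

// ConfigNodepool and ConfigFile stand in for the cloudprovider's config types.
type ConfigNodepool struct {
	ClusterName string
}

type ConfigFile struct {
	NodeGroups map[string]*ConfigNodepool
}

func main() {
	var cfg ConfigFile
	// Writing to the map while it is still nil panics:
	//   cfg.NodeGroups["pool"] = &ConfigNodepool{} // panic: assignment to entry in nil map
	if cfg.NodeGroups == nil {
		cfg.NodeGroups = map[string]*ConfigNodepool{}
	}
	cfg.NodeGroups["pool"] = &ConfigNodepool{ClusterName: "example"}
}
```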
Signed-off-by: Imran Pochi <imran@kinvolk.io>
The pricing JSON for us-east-1 is currently 129MB. Fetching it entirely into
memory and then parsing it results in a large memory footprint on startup, and
can lead to the autoscaler being OOMKilled.
Change the ReadAll/Unmarshal logic to a streaming decoder to significantly
reduce the memory use.
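The general shape of the change is sketched below; `pricingDoc` and
`fetchPricing` are placeholders rather than the provider's actual names.
```go
package example

import (
	"encoding/json"
	"net/http"
)

// pricingDoc is a placeholder for the real pricing response type.
type pricingDoc struct {
	Products map[string]json.RawMessage `json:"products"`
}

// fetchPricing decodes the response body with a streaming decoder instead of
// reading the full ~129MB payload into a byte slice and unmarshalling it,
// which avoids holding the raw buffer and the decoded value in memory at once.
func fetchPricing(url string) (*pricingDoc, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var doc pricingDoc
	if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
		return nil, err
	}
	return &doc, nil
}
```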
Magnum allows using the microversion string "latest",
and it will replace it internally with the highest
microversion that it supports.
This will let the autoscaler use microversion 1.10, which
allows scaling groups to 0 nodes, when it is available.
The autoscaler will still be able to use microversion 1.9
on older versions of Magnum.
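For reference, a sketch of where the microversion is set, assuming the Magnum
client is built with gophercloud (the function name is illustrative):
```go
package example

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack"
)

// newMagnumClient creates a container-infra (Magnum) client and requests the
// "latest" microversion, which Magnum resolves server-side to the highest
// microversion it supports (e.g. 1.10 when available, 1.9 on older Magnum).
func newMagnumClient(provider *gophercloud.ProviderClient, region string) (*gophercloud.ServiceClient, error) {
	client, err := openstack.NewContainerInfraV1(provider, gophercloud.EndpointOpts{Region: region})
	if err != nil {
		return nil, err
	}
	client.Microversion = "latest"
	return client, nil
}
```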
csidrivers.storage.k8s.io and csistoragecapacities.storage.k8s.io are available on EKS
1.21. Add permissions for them to the ClusterRole in the example to avoid the
error messages.
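A hedged sketch of the extra rule, expressed with the `k8s.io/api/rbac/v1`
types; the verb list here is an assumption (read-only access), so check it
against the example manifest.
```go
package example

import rbacv1 "k8s.io/api/rbac/v1"

// extraCSIRule grants read access to the CSI resources that trigger the
// error messages on EKS 1.21 when missing from the ClusterRole.
var extraCSIRule = rbacv1.PolicyRule{
	APIGroups: []string{"storage.k8s.io"},
	Resources: []string{"csidrivers", "csistoragecapacities"},
	Verbs:     []string{"get", "list", "watch"},
}
```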
Since Cluster Autoscaler versioning should be in sync with Kubernetes,
update-vendor.sh can simply set the version after a successful
dependency update.
Recent changes configured providers to set the stable node label names
exclusively (i.e. LabelTopologyZone and not LabelZoneFailureDomain, etc.),
with the older label names backfilled at nodeInfo template generation time
(from GetNodeInfoFromTemplate), which isn't invoked from most test cases.
GCE NodePrice() might have been dereferencing potentially missing labels.
Also run hack/update-gofmt.sh where hack/verify-all.sh fails, to pass CI,
and at the same time only set stable labels in all buildGenericLabels
implementations.
This fixes issues that occur when a node group has no nodes yet and node labels
are built using buildGenericLabels and the node-template labels.
Issues include (anti-)affinity rules and nodeSelectors on the given labels
producing false-negative results for candidate nodes, which leads to ASGs
never scaling up.
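In outline, the approach looks like the sketch below: set only the stable names
where labels are built, and mirror them onto the legacy names when the template
nodeInfo is generated (function names here are illustrative).
```go
package example

import apiv1 "k8s.io/api/core/v1"

// buildTopologyLabels sets only the stable topology label names, as
// buildGenericLabels implementations now do.
func buildTopologyLabels(region, zone string) map[string]string {
	return map[string]string{
		apiv1.LabelTopologyRegion: region,
		apiv1.LabelTopologyZone:   zone,
	}
}

// backfillDeprecatedLabels mirrors the stable labels onto their legacy names so
// selectors that still use the old labels keep matching template nodes.
func backfillDeprecatedLabels(labels map[string]string) {
	if zone, ok := labels[apiv1.LabelTopologyZone]; ok {
		labels[apiv1.LabelZoneFailureDomain] = zone
	}
	if region, ok := labels[apiv1.LabelTopologyRegion]; ok {
		labels[apiv1.LabelZoneRegion] = region
	}
}
```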
The new `CreateInstances()` upscale method, which replaces `Resize()` API
calls, generates new instance names based on the MIG's name (from
`mig.GceRef()`).
Before that change, `Resize()`-initiated upscales prompted MIGs to
spawn instances named after the MIG's `BaseInstanceName` attribute.
Accordingly, `GetMigForInstance()` (still) uses the MIG's `BaseInstanceName`
to map instances to their parent MIG and to discover which MIGs need an
immediate refresh.
Down the line, the `clusterstate.updateReadinessStats()` periodic
goroutines won't be able to map new ready nodes to their parent MIGs
(until the cache is backfilled from the k8s node's providerID, i.e.
by an hourly goroutine), and those MIGs will be considered non-ready
(because the MIG's size is > 0 while the MIG has no known ready instances).
So after a first upscale, MIGs whose `BaseInstanceName` differs from the
MIG's name won't be upscalable again for a while. Example symptoms:
```
cluster-autoscaler W0719 12:35:43.166563 6 clusterstate.go:447] Failed to find readiness information for https://www.googleapis.com/compute/v1/projects/REDACTED-PROJECT/zones/europe-west3-b/instanceGroups/REDACTED-MIGNAME
cluster-autoscaler W0719 12:35:43.193469 6 clusterstate.go:626] Readiness for node group https://www.googleapis.com/compute/v1/projects/REDACTED-PROJECT/zones/europe-west3-b/instanceGroups/REDACTED-MIGNAME not found
```
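A rough illustration of the mismatch (not the actual GCE provider code): the
readiness mapping only recognizes the `BaseInstanceName` prefix, while
`CreateInstances()` names new instances after the MIG itself.
```go
package example

import "strings"

// instanceMatchesMig sketches the prefix-based mapping: an instance created
// from the MIG name (e.g. "mymig-abcd") won't match a differing base
// instance name (e.g. "mybase"), so the node can't be mapped to its MIG.
func instanceMatchesMig(instanceName, baseInstanceName string) bool {
	return strings.HasPrefix(instanceName, baseInstanceName+"-")
}
```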
Besides the mapping cache issue, this changed the instance name prefixes for
some users, while it might make sense to keep using base names when
explicitly provided (they might be used for e.g. identification, or to stay
within name length limits) and to avoid a breaking change before
`CreateInstances` hits a release.