Commit Graph

101 Commits

Author SHA1 Message Date
dom.bozzuto 066315cfa2 Add detection for VMs that fail provisioning to backoff that nodegroup sooner
When Azure fails to provision a node for a nodegroup due to an instance capacity issue ((Zonal)AllocationFailed) or another reason, the VMSS size increase is still reflected, but the new instance gets the status `ProvisioningStateFailed`. This now bubbles the error up to the `cloudprovider.Instance`, where `clusterstate` can use it to put the nodegroup into backoff sooner.
2023-04-24 13:56:30 -04:00
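Below is a minimal sketch, not the actual commit, of how such a failure might be surfaced through the upstream `cloudprovider.Instance` types; the helper name and error code string are illustrative assumptions.

```go
package example

import "k8s.io/autoscaler/cluster-autoscaler/cloudprovider"

// provisioningStateFailed is the Azure VMSS VM state described above.
const provisioningStateFailed = "Failed"

// instanceStatusFromProvisioningState is a hypothetical helper mapping a
// VM's provisioning state onto an InstanceStatus; a populated ErrorInfo
// lets clusterstate back the node group off sooner.
func instanceStatusFromProvisioningState(state string) *cloudprovider.InstanceStatus {
	status := &cloudprovider.InstanceStatus{State: cloudprovider.InstanceRunning}
	if state == provisioningStateFailed {
		status.ErrorInfo = &cloudprovider.InstanceErrorInfo{
			ErrorClass:   cloudprovider.OutOfResourcesErrorClass,
			ErrorCode:    "provisioning-state-failed", // illustrative code
			ErrorMessage: "Azure failed to provision a new VM for the scale set",
		}
	}
	return status
}
```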
Manish Satwani a99da16658 Incorporated comments 2023-04-20 15:36:21 -07:00
Manish Satwani 3de4ffb6d6 refactored to use buildCacheInstance for Uniform & Flex 2023-04-03 09:52:52 -07:00
Manish Satwani ea9ff4d0eb refactor as per feedback, added test case for Flex 2023-03-27 10:52:16 -07:00
Manish Satwani c418175c7d Adding 'enableVmssFlex' feature flag. 2023-03-15 12:18:16 -07:00
Manish Satwani 61f2700fc4 Modified test case and added test case for Flex 2023-03-13 10:18:47 -07:00
Manish Satwani 84f748f67b Add support for VMSS Flex 2023-02-15 13:12:09 -08:00
Manish Satwani fd7903a548 bump cloud-provider-azure version in CA to 1.26.2 for azure imports 2023-02-02 12:02:56 -08:00
Benjamin Pineau 20c451bbc0 Azure: effectively cache instance-types SKUs
The skewer library's cache is re-created at every call, which causes
pressure on the Azure API, and slows down the cluster-autoscaler startup
time by two minutes on my small (120 nodes, 300 VMSS) test cluster.

This was also hitting the API twice on every cache miss to look for non-promo
instance types (even when the instance name doesn't end with "_Promo").
2022-07-25 16:09:36 +02:00
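A minimal sketch of the caching idea, with illustrative names rather than the actual skewer API: build the SKU map once and reuse it, instead of re-creating it on every lookup.

```go
package example

import "sync"

type skuInfo struct{ Name string }

// fetchSKUs stands in for the expensive ARM Resource SKUs list call.
func fetchSKUs() (map[string]skuInfo, error) {
	return map[string]skuInfo{}, nil
}

// skuCache memoizes the result so the API is hit once, not per call.
type skuCache struct {
	once sync.Once
	skus map[string]skuInfo
	err  error
}

func (c *skuCache) get() (map[string]skuInfo, error) {
	c.once.Do(func() { c.skus, c.err = fetchSKUs() })
	return c.skus, c.err
}
```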
Prachi Gandhi 6b05ccab84 bump cloud-provider-azure version in CA 2022-05-11 12:03:33 -07:00
Prachi Gandhi 5d0e23d1bc Support for dynamic SKUs for scale from zero.
Currently, cluster autoscaler uses a hard-coded (static) list of instanceTypes to scale from zero, as there is no node to build a blueprint of the information from. This static list needs to be updated every time a new VMSS is added, which is not feasible.
2022-04-20 15:31:38 -07:00
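A minimal sketch of the dynamic approach, under assumed names (`lookupSKU` stands in for a SKU-list query): derive the template node's capacity from the discovered SKU instead of a hard-coded table.

```go
package example

import (
	apiv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

type sku struct {
	VCPUs    int64
	MemoryMB int64
}

// lookupSKU is a hypothetical dynamic SKU query (e.g. backed by the ARM
// Resource SKUs API) replacing the static instance-type table.
func lookupSKU(instanceType string) (sku, error) {
	return sku{VCPUs: 4, MemoryMB: 16384}, nil
}

// templateCapacity builds the capacity used for a scale-from-zero
// template node for the given instance type.
func templateCapacity(instanceType string) (apiv1.ResourceList, error) {
	s, err := lookupSKU(instanceType)
	if err != nil {
		return nil, err
	}
	return apiv1.ResourceList{
		apiv1.ResourceCPU:    *resource.NewQuantity(s.VCPUs, resource.DecimalSI),
		apiv1.ResourceMemory: *resource.NewQuantity(s.MemoryMB*1024*1024, resource.BinarySI),
	}, nil
}
```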
Marwan Ahmed 542e919b18 remove check for returning in-memory size when VMSS is in updating state 2022-04-05 14:46:58 -07:00
Marwan Ahmed d49a131f9e azure vmss cache fixes and improvements 2022-02-16 17:51:56 -08:00
Marwan Ahmed 6689f92cbc update delete async calls in scale sets 2022-01-28 13:58:15 -08:00
Marwan Ahmed e0952eb29d fix scale set log formatter 2021-12-21 17:57:35 +02:00
Marwan Ahmed 091c72cbb0 cleanup scale set size logs 2021-12-20 15:14:05 +02:00
Marwan Ahmed afb443f9f3 switch azure clients to non-legacy repo 2021-12-06 17:17:10 +02:00
Benjamin Pineau 28cd49c09e implement GetOptions for Azure
Support per-VMSS (scaledown) settings as permitted by the
cloudprovider interface's `GetOptions()` method.
2021-08-24 09:48:51 +02:00
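A minimal sketch of what a per-VMSS override can look like, assuming a hypothetical tag key; only the `config.NodeGroupAutoscalingOptions` type and its threshold field are taken from the upstream packages.

```go
package example

import (
	"strconv"

	"k8s.io/autoscaler/cluster-autoscaler/config"
)

// optionsFromTags overlays per-scale-set settings onto the global
// defaults; the tag key is an assumed example, not a documented one.
func optionsFromTags(tags map[string]*string, defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
	opts := defaults
	if v, ok := tags["scaledownutilizationthreshold"]; ok && v != nil {
		threshold, err := strconv.ParseFloat(*v, 64)
		if err != nil {
			return nil, err
		}
		opts.ScaleDownUtilizationThreshold = threshold
	}
	return &opts, nil
}
```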
Marwan Ahmed 756a3e155d don't proactively decrement azure cache for unregistered nodes 2021-06-10 14:14:34 -07:00
Maciek Pytel 08d18a7bd0 Define interfaces for per NodeGroup config.
This is the first step of implementing
https://github.com/kubernetes/autoscaler/issues/3583#issuecomment-743215343.
A new method was added to the cloudprovider interface. All existing
providers were updated with a no-op stub implementation that results in
no behavior change.
The config values specified per NodeGroup are not yet applied.
2021-01-25 11:00:16 +01:00
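The shape of that no-op stub, following the upstream signatures: returning `cloudprovider.ErrNotImplemented` keeps the global defaults in effect.

```go
package example

import (
	"k8s.io/autoscaler/cluster-autoscaler/cloudprovider"
	"k8s.io/autoscaler/cluster-autoscaler/config"
)

type stubNodeGroup struct{}

// GetOptions is the new NodeGroup method; providers without per-group
// config return ErrNotImplemented, which changes no behavior.
func (ng *stubNodeGroup) GetOptions(defaults config.NodeGroupAutoscalingOptions) (*config.NodeGroupAutoscalingOptions, error) {
	return nil, cloudprovider.ErrNotImplemented
}
```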
Cecile Robert-Michon 28badba175 cleanup: refactor Azure cache and remove redundant API calls 2020-12-07 11:55:34 -07:00
Bartłomiej Wróblewski 0fb897b839 Update imports after scheduler/framework/v1alpha1 removal 2020-11-30 10:48:52 +00:00
Benjamin Pineau ec2e477d1f Azure: keep refreshes spread over time
When a `vmssVmsCacheJitter` is provided, API calls (after start)
are randomly spread over the provided time range, then happen
at a regular interval (for a given VMSS). This prevents spikes
in API calls.

But we noticed that the various VMSS' refreshes will progressively
converge and agglomerate over time (in particular after a few large
throttling windows affected the autoscaler), which defeats the
purpose.

Re-randomizing the next refresh deadline every time (rather than
just at autoscaler start) keeps the calls properly spread.
Configuring `vmssVmsCacheJitter` and `vmssVmsCacheTTL` allows users
to control the average and worst-case refresh interval (and the
average API call rate). And we can count on VMSS size change
detection to kick off early refreshes when needed.

That's a small behaviour change, but this is still a good time
for it, as `vmssVmsCacheJitter` was introduced recently and
hasn't been part of any release yet.
2020-10-19 14:59:08 +02:00
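A minimal sketch of the re-randomizing described above, with illustrative names: the jitter is re-drawn for every refresh, so per-VMSS refresh times cannot converge.

```go
package example

import (
	"math/rand"
	"time"
)

// nextRefreshAfter returns the delay before the next instance-cache
// refresh for one VMSS; drawing a fresh random jitter on every call
// (not just at start) is the point of the change.
func nextRefreshAfter(ttl, jitter time.Duration) time.Duration {
	if jitter > 0 {
		return ttl - time.Duration(rand.Int63n(int64(jitter)))
	}
	return ttl
}
```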
Marwan Ahmed 4f37cec1cf set instance status to deleting to better handle long cache TTLs 2020-10-18 18:58:17 -07:00
Marwan Ahmed 875f35580a gofmt 2020-09-29 13:41:05 -07:00
Marwan Ahmed 6c235f842b move template-related code to its own file 2020-09-29 11:57:12 -07:00
Benjamin Pineau 5982062de2 Azure: support allocatable resources overrides via VMSS tags
This allows specifying effective node resource capacity using Scale
Set tags, preventing wrong CA decisions and infinite upscales when pod
requests fit within the instance type's capacity but exceed the k8s node's
allocatable (which might comprise system and kubelet reservations), and
when using node-infos built from instance templates (i.e. scaling from 0).

This is similar to what the AWS (with launch configuration tags) and
GCP (with instance template metadata) cloud providers offer;
the tag format follows AWS' for consistency.

See also:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/min_at_zero_gcp.md
2020-09-22 13:15:28 +02:00
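A minimal sketch of reading such overrides from Scale Set tags; the tag prefix here is an assumption modeled on the AWS node-template tag format (Azure tag names cannot contain '/').

```go
package example

import (
	"strings"

	apiv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// resourcesTagPrefix is an assumed key shape, not a documented constant.
const resourcesTagPrefix = "k8s.io_cluster-autoscaler_node-template_resources_"

// resourcesFromTags turns matching VMSS tags into resource quantities
// that override the template node's allocatable.
func resourcesFromTags(tags map[string]*string) apiv1.ResourceList {
	out := apiv1.ResourceList{}
	for key, value := range tags {
		if !strings.HasPrefix(key, resourcesTagPrefix) || value == nil {
			continue
		}
		name := strings.TrimPrefix(key, resourcesTagPrefix)
		if quantity, err := resource.ParseQuantity(*value); err == nil {
			out[apiv1.ResourceName(name)] = quantity
		}
	}
	return out
}
```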
Marwan Ahmed 6da62d0b7b don't update capacity if VMSS provisioning state is updating 2020-09-13 22:21:28 -07:00
Benjamin Pineau c168eed930 Azure: optional jitter on initial VMSS VM cache refresh
On (re)start, cluster-autoscaler refreshes all VMSS instance caches
at once, and sets their TTL to 5min. All VMSS VM List calls (for VMSS
discovered at boot) will then continuously hit the ARM API at the same time,
potentially causing regular throttling bursts.

Exposing an optional jitter, subtracted from the first scheduled
refresh delay, splays those calls (except for the very first one, at start),
while keeping the predictable (max. 5min, unless the VMSS changed) refresh
interval after the first refresh.
2020-08-19 20:48:28 +02:00
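A minimal sketch of that initial-only jitter (illustrative names); the 2020-10-19 commit above later extended this by re-drawing the jitter on every refresh.

```go
package example

import (
	"math/rand"
	"time"
)

// firstRefreshDeadline pulls only the first scheduled refresh forward
// by a random amount; later refreshes then use the plain TTL.
func firstRefreshDeadline(now time.Time, ttl, jitter time.Duration) time.Time {
	delay := ttl
	if jitter > 0 {
		delay -= time.Duration(rand.Int63n(int64(jitter)))
	}
	return now.Add(delay)
}
```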
Benjamin Pineau 4997972426 Avoid unwanted VMSS VMs caches invalidations
`fetchAutoAsgs()` is called at regular intervals, fetches a list of VMSS,
then calls `Register()` to cache each of those. That registration function
tells the caller whether that VMSS' cache is outdated (when the provided
VMSS, supposedly fresh, differs from the one held in cache) and replaces
the existing cache entry with the provided VMSS (which in effect forces a
refresh, since that ScaleSet struct is passed by fetchAutoAsgs with a nil
lastRefresh time and an empty instanceCache).

To detect changes, `Register()` uses a `reflect.DeepEqual()` between the
provided and the cached VMSS, which always finds them different: the cached
VMSS were enriched with instance lists (while the provided one is blank,
fresh from a simple vmss.list call). That DeepEqual is also fragile because
the compared structs contain mutexes (that may be held or not) and
refresh timestamps, attributes that shouldn't be relevant to the comparison.

As a consequence, every Register() call causes an indirect cache invalidation
and a costly refresh (VMSS VMs List). The number of Register() calls is
directly proportional to the number of VMSS attached to the cluster, and
can easily trigger ARM API throttling.

With a large number of VMSS, that throttling prevents `fetchAutoAsgs` from
ever succeeding (and cluster-autoscaler from starting), e.g.:

```
I0807 16:55:25.875907     153 azure_scale_set.go:344] GetScaleSetVms: starts
I0807 16:55:25.875915     153 azure_scale_set.go:350] GetScaleSetVms: scaleSet.Name: a-testvmss-10, vmList: []
E0807 16:55:25.875919     153 azure_scale_set.go:352] VirtualMachineScaleSetVMsClient.List failed for a-testvmss-10: &{true 0 2020-08-07 17:10:25.875447854 +0000 UTC m=+913.985215807 azure cloud provider throttled for operation VMSSVMList with reason "client throttled"}
E0807 16:55:25.875928     153 azure_manager.go:538] Failed to regenerate ASG cache: Retriable: true, RetryAfter: 899s, HTTPStatusCode: 0, RawError: azure cloud provider throttled for operation VMSSVMList with reason "client throttled"
F0807 16:55:25.875934     153 azure_cloud_provider.go:167] Failed to create Azure Manager: Retriable: true, RetryAfter: 899s, HTTPStatusCode: 0, RawError: azure cloud provider throttled for operation VMSSVMList with reason "client throttled"
goroutine 28 [running]:
```

Of the [`ScaleSet` struct attributes](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/azure/azure_scale_set.go#L74-L89)
(manager, sizes, mutexes, refresh timestamps), only the sizes are relevant
to that comparison. `curSize` is not strictly necessary, but comparing it
provides early instance cache refreshes.
2020-08-18 14:52:02 +02:00
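A minimal sketch of the comparison strategy the last paragraph describes: compare only the size fields instead of a `reflect.DeepEqual` over structs carrying mutexes and timestamps (field names are illustrative).

```go
package example

// cachedScaleSet holds only the fields relevant to change detection;
// the real struct also carries a manager, mutexes, a lastRefresh
// timestamp and an instance cache, none of which should be compared.
type cachedScaleSet struct {
	minSize int
	maxSize int
	curSize int64
}

// changed reports whether a freshly listed VMSS differs from the cached
// one in a way that warrants invalidating its instance cache.
func changed(existing, incoming cachedScaleSet) bool {
	return existing.minSize != incoming.minSize ||
		existing.maxSize != incoming.maxSize ||
		existing.curSize != incoming.curSize
}
```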
Marwan Ahmed 4cb388fa6e fix potential lock issue 2020-07-06 22:33:18 -07:00
Marwan Ahmed 710abdd713 fix err in log and cleanup other logs 2020-07-06 22:32:47 -07:00
Marwan Ahmed 7e5073192a add context timeouts for the sync part of long running operations 2020-07-06 22:21:38 -07:00
Marwan Ahmed 44ba2ca58b synchronize instance deletions to avoid 409 conflict errors 2020-07-04 18:37:29 -07:00
Kubernetes Prow Robot 35901cddb6
Merge pull request #3278 from marwanad/set-context-timeouts-on-gets
use contexts with timeouts in scale set GET calls
2020-07-04 01:28:48 -07:00
Marwan Ahmed 3891852eec use contexts with timeouts in scale set GET calls 2020-07-03 20:17:56 -07:00
Marwan Ahmed a06b1c9d69 decrement cache by the proper amount 2020-07-03 15:15:13 -07:00
Marwan Ahmed f48b26e538 move lock to the get method 2020-07-03 14:11:32 -07:00
Marwan Ahmed 3969346cb5 remove unneeded error message 2020-07-01 16:14:58 -07:00
Marwan Ahmed e250076338 reduce instance mutex lock scope since its used by the Nodes() call to refresh cache 2020-07-01 16:14:17 -07:00
Kubernetes Prow Robot 1434d14ec7
Merge pull request #3242 from nilo19/bug/disable-increase-when-initializing
Disable increaseSize when the node group is under initialization.
2020-06-25 05:24:38 -07:00
qini 81cb1a772a Disable increaseSize when the node group is under initialization. 2020-06-25 19:33:29 +08:00
Marwan Ahmed 88ad78f390 no need to invalidate caches on scale-down 2020-06-23 23:22:48 -07:00
Marwan Ahmed 694e089f9e switch scalesets to delete asynchronously without waiting on future 2020-06-17 11:54:54 -07:00
Maciek Pytel 655b4081f4 Migrate to klog v2 2020-06-05 17:22:26 +02:00
Marwan Ahmed 57101652cc return correct error for GetScaleSetVms 2020-06-03 13:53:50 -07:00
Marwan Ahmed d9aaf4d6f3 avoid sending unnecessary delete requests if delete is in progress 2020-05-19 17:51:20 -07:00
Jakub Tużnik 73a5cdf928 Address recent breaking changes in scheduler
The following things changed in scheduler and needed to be fixed:
* NodeInfo was moved to schedulerframework
* Some fields on NodeInfo are now exposed directly instead of via getters
* NodeInfo.Pods is now a list of *schedulerframework.PodInfo, not *apiv1.Pod
* SharedLister and NodeInfoLister were moved to schedulerframework
* PodLister was removed
2020-04-24 17:54:47 +02:00
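An illustrative snippet for one of the listed changes, assuming the scheduler framework import path of the time: `NodeInfo.Pods` holds `*schedulerframework.PodInfo` wrappers, so callers unwrap them to reach the `*apiv1.Pod`.

```go
package example

import (
	apiv1 "k8s.io/api/core/v1"
	schedulerframework "k8s.io/kubernetes/pkg/scheduler/framework/v1alpha1"
)

// podsOf unwraps NodeInfo.Pods, which is now []*PodInfo rather than
// []*apiv1.Pod as it was before the move to schedulerframework.
func podsOf(nodeInfo *schedulerframework.NodeInfo) []*apiv1.Pod {
	pods := make([]*apiv1.Pod, 0, len(nodeInfo.Pods))
	for _, podInfo := range nodeInfo.Pods {
		pods = append(pods, podInfo.Pod)
	}
	return pods
}
```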
Kubernetes Prow Robot babcd6121a
Merge pull request #3036 from marwanad/fix-deletion-going-below-min
Proactively decrement scale set count during deletion operations
2020-04-13 01:37:47 -07:00
Pengfei Ni 0ba8ca6bbd Remove checking for VMSS provisioningState before scaling 2020-04-10 06:33:04 +00:00