85 Commits

Author SHA1 Message Date
Anton Khizunov 35b75977ea fix generate ec2 instance types 2022-12-05 16:44:00 +02:00
Anton Khizunov 9f548f5802 aws cloudprovider - separate aws sdk provider from aws manager 2022-08-24 11:53:54 +03:00
Austin Siu 923fe205cd Consolidate DescribeLaunchTemplateVersions logic into one function 2022-08-08 14:45:11 -05:00
Austin Siu b79be36f7f Improve nil checking, some naming, replace json logic for requirement translation with field-setting 2022-08-08 14:45:11 -05:00
Austin Siu 8ac2acf084 Add nil check when building template with requirements, improve readability 2022-08-08 14:45:11 -05:00
Austin Siu 11f5c5d550 Check MIP instance req overrides before LT overrides 2022-08-08 14:45:11 -05:00
Austin Siu 7ff7b5c34a Add support for LT Instance Requirements with no overrides in MIP 2022-08-08 14:45:10 -05:00
Austin Siu fed54b5715 Support attribute-based instance selection for AWS 2022-08-08 14:45:10 -05:00
Johannes Würbach 285500ed2c feat(aws): reduce auto-discovery API calls 2022-07-21 17:11:18 +02:00
Todd Neal fde836c991 fix variadic log messages 2022-05-09 14:28:15 -05:00
MyannaHarris b4cadfb4e2 [AWS EKS - Scale-to-0] Add Managed Nodegroup Cache
This change adds a Managed Nodegroup cache that will hold labels and taints from the AWS EKS DescribeNodegroup API output. It will be used to get more information for EKS managed nodegroups that are scaled to 0 nodes. Currently this code will only run when the managed nodegroup has 0 nodes and CAS doesn't have a node info object cached already.

Not included in this PR, but information for the future:

To use this cache whenever the nodegroup is scaled to 0 nodes, we would have to change the general CAS code [around here](10451c2032/cluster-autoscaler/processors/nodeinfosprovider/mixed_nodeinfos_processor.go (L114))
This general code change would be related to discussion in this old PR about node cache info: https://github.com/kubernetes/autoscaler/pull/4258
2022-04-04 00:51:58 -07:00
David Morrison aebd984e43 fix linting errors 2022-03-02 11:34:17 -08:00
David Morrison 8ac87b3f34 early abort if we get a failed scaling activity event from AWS 2022-03-02 11:34:17 -08:00
David Morrison a0ae713fba convert registeredAsgs to a map 2022-03-02 11:34:16 -08:00
Tyler Montgomery afc835a5dd allow colon in aws asg discovery tag names, update documentation 2022-01-21 10:34:20 -06:00
Kubernetes Prow Robot ec2681b0ae
Merge pull request #4444 from MyannaHarris/aws_describe_nodegroup_api
[AWS EKS - Scale-to-0] Add EKS service and DescribeNodegroup API call
2021-11-30 09:30:56 -08:00
MyannaHarris 06b4297581 [AWS EKS - Scale-to-0] Add EKS service and DescribeNodegroup API call
This change adds the AWS EKS service to the vendor service directory and adds the DescribeNodegroup API call to the AWSWrapper. This will be used for the AWS scale-to-0 project to get more information about empty managed nodegroups

Related proposal: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/circumvent-tag-limit-aws.md
2021-11-22 11:30:21 -08:00
Benjamin Pineau 4d6aac7a06 implement GetOptions for AWS
Support per-ASG (scaledown) settings as permitted by the
cloudprovider's interface GetOptions() method.
2021-11-15 15:24:41 +01:00
Kubernetes Prow Robot c4b56ea561
Merge pull request #4073 from airbnb/drmorr--airbnb--cache-launch-templates
cache ASG InstanceTypes for AWS
2021-10-27 10:35:25 -07:00
MyannaHarris 76da63564d [AWS EKS - Scale-to-0] Update conditional to check cluster-name as well
This change updates the conditional to check for the cluster-name label as well since we need both for the DescribeNodegroup API call and a customer can accidentally delete either.
2021-10-14 16:45:24 -07:00
MyannaHarris 6a14e6eb60 [AWS EKS - Scale-to-0] Add check for the AWS EKS tags on the ASG
This change is the first change for the AWS EKS Managed Nodegroups support for scale-to-0 changes in the cluster autoscaler. It checks for the AWS EKS specific tags that we automatically add for Managed Nodegroups.

Variable name update in test Co-authored-by: Guy Templeton <guyjtempleton@googlemail.com>
2021-10-04 15:30:22 -07:00
David Morrison cf99f347c6 fix up some variable names and comments 2021-10-04 09:28:23 -07:00
David Morrison 27d96021a4 handle LT/LC cache misses 2021-09-22 21:38:26 -07:00
David Morrison 4999f05f3d cache ASG InstanceTypes for AWS 2021-09-22 15:17:26 -07:00
Kubernetes Prow Robot 5d754993c9
Merge pull request #3999 from DataDog/bump-asg-per-describe
aws: Set maxAsgNamesPerDescribe to the new maximum value
2021-08-16 01:01:47 -07:00
Kubernetes Prow Robot 4b4bc85aa1
Merge pull request #4046 from sylr/aws-log
Improve misleading log
2021-08-05 12:23:19 -07:00
Alexander Block 6d84abf0de Remove obsolete comment
arch is not hardcoded anymore
2021-07-29 16:45:09 +02:00
Alexander Block 8f11490c0c Introduce UpdateDeprecatedTemplateLabels to set beta/deprecated labels
And at the same time only set stable labels in all buildGenericLabels
implementations.

This fixes issues when a node group has 0 nodes yet and node labels are
built using buildGenericLabels and the node-template labels.

Issues include (anti-)affinity and nodeSelectors for the given labels,
giving false-negative results for candidate nodes, which leads to ASGs
never scaling up.
2021-07-29 16:45:08 +02:00
Sylvain Rabot 43a4d51d8d
Update cluster-autoscaler/cloudprovider/aws/aws_manager.go
Co-authored-by: Guy Templeton <guyjtempleton@googlemail.com>
2021-05-24 12:49:30 +02:00
Kubernetes Prow Robot a5802a2280
Merge pull request #3848 from DataDog/aws-arm-support
aws: support arm64 instances
2021-05-23 08:45:38 -07:00
Kubernetes Prow Robot 6c4101b64c
Merge pull request #3797 from DataDog/aws-not-refreshes-dogpiles
aws: Don't pile up successive full refreshes during AWS scaledowns
2021-05-03 14:54:07 -07:00
Sylvain Rabot 535a21263e
Improve misleading log
Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
2021-04-28 17:58:35 +02:00
Benjamin Pineau 0cc9e52c28 Set maxAsgNamesPerDescribe to the new maximum value
While this was previously effectively limited to 50, `DescribeAutoScalingGroups` now supports
fetching 100 ASGs per call in all regions, matching what's documented:
https://docs.aws.amazon.com/autoscaling/ec2/APIReference/API_DescribeAutoScalingGroups.html
```
     AutoScalingGroupNames.member.N
       The names of the Auto Scaling groups.
       By default, you can only specify up to 50 names.
       You can optionally increase this limit using the MaxRecords parameter.
     MaxRecords
       The maximum number of items to return with this call.
       The default value is 50 and the maximum value is 100.
```

Doubling this halves API calls on large clusters, which should help to prevent throttling.
2021-04-19 15:55:23 +02:00
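The batching arithmetic behind this commit can be sketched as follows. This is a minimal illustration, not the autoscaler's code: `chunkNames` is a hypothetical helper, and it assumes one `DescribeAutoScalingGroups` call is issued per batch of names.

```go
package main

import "fmt"

// chunkNames splits names into consecutive batches of at most size entries.
// Hypothetical helper for illustration; one batch stands for one API call.
func chunkNames(names []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(names); start += size {
		end := start + size
		if end > len(names) {
			end = len(names)
		}
		batches = append(batches, names[start:end])
	}
	return batches
}

func main() {
	// 400 made-up ASG names standing in for a large cluster.
	names := make([]string, 400)
	for i := range names {
		names[i] = fmt.Sprintf("asg-%d", i)
	}
	fmt.Println(len(chunkNames(names, 50)))  // batches at the old limit: 8
	fmt.Println(len(chunkNames(names, 100))) // batches at the new limit: 4
}
```

With 400 names the batch count drops from 8 to 4, which is the halving the commit message describes.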
Benjamin Pineau 3ffe4b3557 aws: support arm64 instances
Sets the `kubernetes.io/arch` (and legacy `beta.kubernetes.io/arch`)
to the proper instance architecture.

While at it, re-gen the instance types list (adding new instance types
that were missing)
2021-04-19 15:45:19 +02:00
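A rough sketch of the labeling described in this commit: both the stable and the legacy architecture label receive the same value. The function name `archLabels` is hypothetical, and the sketch assumes EC2 reports the architecture as either `x86_64` or `arm64`; the actual provider code may differ.

```go
package main

import "fmt"

const (
	archLabel     = "kubernetes.io/arch"      // stable label
	betaArchLabel = "beta.kubernetes.io/arch" // legacy label
)

// archLabels maps an EC2 architecture string to node labels.
// Assumption: anything other than "arm64" is treated as amd64.
func archLabels(ec2Arch string) map[string]string {
	arch := "amd64"
	if ec2Arch == "arm64" {
		arch = "arm64"
	}
	return map[string]string{
		archLabel:     arch,
		betaArchLabel: arch,
	}
}

func main() {
	fmt.Println(archLabels("arm64")[archLabel])  // arm64
	fmt.Println(archLabels("x86_64")[archLabel]) // amd64
}
```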
Benjamin Pineau 037dc7367a Don't pile up successive full refreshes during AWS scaledowns
Forcing a full refresh on every DeleteNodes call causes slowdowns
and throttling on large clusters with many ASGs (and a lot of activity).

That function might be called several times in a row during scale-down
(once for each ASG having a node to be removed). Each time the forced
refresh will re-discover all ASGs, all LaunchConfigurations, then re-list all
instances from discovered ASGs.

That immediate refresh isn't required anyway, as the cache's DeleteInstances
concrete implementation will decrement the nodegroup size, and we can
schedule a grouped refresh for the next loop iteration.
2021-04-19 15:11:43 +02:00
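The idea in this commit, adjusting the cached size locally and deferring one grouped refresh to the next loop iteration, can be sketched like this. The type `asgCache` and its methods are hypothetical simplifications, not the provider's real cache.

```go
package main

import "fmt"

// asgCache is a simplified, hypothetical stand-in for the provider's ASG cache.
type asgCache struct {
	sizes        map[string]int // cached target size per ASG name
	needsRefresh bool           // set when the cached view may be stale
	refreshCount int            // number of full refreshes performed
}

// deleteInstances decrements the cached size and defers the full refresh
// instead of refreshing immediately.
func (c *asgCache) deleteInstances(asg string, n int) {
	c.sizes[asg] -= n
	c.needsRefresh = true
}

// refreshIfNeeded performs at most one full refresh per loop iteration.
func (c *asgCache) refreshIfNeeded() {
	if !c.needsRefresh {
		return
	}
	// A real refresh would re-discover ASGs and re-list instances here.
	c.refreshCount++
	c.needsRefresh = false
}

func main() {
	c := &asgCache{sizes: map[string]int{"a": 3, "b": 5}}
	// Two scale-down calls in a row, one per ASG.
	c.deleteInstances("a", 1)
	c.deleteInstances("b", 2)
	// Next loop iteration: a single grouped refresh.
	c.refreshIfNeeded()
	fmt.Println(c.sizes["a"], c.sizes["b"], c.refreshCount) // 2 3 1
}
```

Two deletions in a row trigger a single refresh rather than two, while the cached sizes are already correct in between.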
Sylvain Rabot bb208bec91
Allow generic labels to be overwritten by tags coming from the ASG
Signed-off-by: Sylvain Rabot <sylvain@abstraction.fr>
2021-03-07 17:44:09 +01:00
Jakub Tużnik 8c14f25fca k8s.io/kubernetes/pkg/kubelet/apis -> k8s.io/kubelet/pkg/apis
The package changed place.
2021-03-02 13:27:21 +01:00
Kubernetes Prow Robot a01610dbb9
Merge pull request #3185 from DataDog/jb/aws-instance-types
cluster-autoscaler: use generated instance types
2020-06-12 09:23:58 -07:00
Maciek Pytel 655b4081f4 Migrate to klog v2 2020-06-05 17:22:26 +02:00
Julien Balestra ac504608e9 cluster-autoscaler: use generated instance types
Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>
2020-06-03 15:52:08 +02:00
Jiaxin Shan c53810cc00 Support arbitrary custom resource building template
Signed-off-by: Jiaxin Shan <seedjeffwan@gmail.com>
2020-03-13 10:53:39 -07:00
Julien Balestra 716836acde cluster-autoscaler/aws: batch launch config query and ttl cache
Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>
2020-02-19 16:36:42 +01:00
Ace Eldeib d63067e70b refactor: move aws discovery config 2019-10-17 05:19:11 -07:00
Jiaxin Shan 7eb864502c Load AWS EC2 Instance Types dynamically 2019-10-14 17:19:42 -07:00
Alfred Krohmer f7695c52ea AWS – use `session.NewSession` instead of `session.New` to
`session.New` is deprecated and requires the `AWS_SDK_LOAD_CONFIG`
environment variable to be set in order to automatically call
`AssumeRoleWithWebIdentity` when `AWS_WEB_IDENTITY_TOKEN_FILE` is set
(which is not documented and most likely unintended).
2019-10-07 15:09:51 +02:00
Kubernetes Prow Robot 8d9010e11e
Merge pull request #2248 from Jeffwan/mixed_instance_policy
Add MixedInstancesPolicy struct to better handle instance type
2019-08-26 02:40:20 -07:00
Łukasz Piątkowski 8d9b81caaa correctly handle lack of capacity of AWS spot ASGs 2019-08-19 12:43:53 +02:00
Jiaxin Shan 8d567eb102 Add MixedInstancesPolicy struct to better handle instance type
Ensures that when MixedInstancePolicy is used in an AWS AutoScalingGroup,
the buildInstanceType() AWS manager method returns an instance type after looking
at the MixedInstancePolicy.LaunchTemplateSpecification. The buildInstanceType()
method is called in numerous places including on cluster scale up actions.

Also adds documentation highlighting that the minimum version of cluster autoscaler
supporting MixedInstancePolicy is 1.14
2019-08-15 14:40:11 -07:00
Łukasz Osipiuk 738ce84c2e Update gofmt 2019-06-06 20:35:16 +02:00
Łukasz Osipiuk cba6582fe1 Use k8s.io/legacy-cloud-providers 2019-06-06 20:27:27 +02:00