Kubernetes Prow Robot
791f0d8355
Merge pull request #2281 from DataDog/JulienBalestra/mig-block
...
cluster-autoscaler: blocked if an instance is detached from MIG
2019-09-11 05:03:22 -07:00
Julien Balestra
3441f616e1
cluster-autoscaler/skip-node: unblock cluster autoscaler when having a single nodegroup for node error
...
Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>
2019-09-11 13:40:23 +02:00
Krzysztof Jastrzebski
839cdaaa09
Stop disabling Cluster Autoscaler when there are no ready nodes.
2019-09-06 14:45:34 +02:00
Julien Balestra
6d707a08ac
cluster-autoscaler/metrics: expose the scale down cooldown
...
Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>
2019-08-27 18:12:33 +02:00
Kubernetes Prow Robot
9aac43e237
Merge pull request #2235 from piontec/fix/aws_spots_squashed
...
correctly handle lack of capacity of AWS spot ASGs
2019-08-19 04:27:30 -07:00
Kubernetes Prow Robot
3f0a5fa3c2
Merge pull request #2233 from vivekbagade/surge
...
Adding ScaleDownNodeProcessor
2019-08-19 03:59:32 -07:00
Łukasz Piątkowski
8d9b81caaa
correctly handle lack of capacity of AWS spot ASGs
2019-08-19 12:43:53 +02:00
Vivek Bagade
dc64d0aab2
Adding ScaleDownNodeProcessor
2019-08-12 20:19:55 +02:00
Jakub Tużnik
44ae89dd09
Communicate the result of RemoveUnneededNodeGroups to ScaleDownStatusProcessor
2019-08-12 17:03:51 +02:00
t-qini
f7c563ab06
Modify the code as the simple solution proposed by MaciekPytel.
2019-07-18 23:58:05 +08:00
t-qini
622a838c2c
Modify nodal similarity rules.
2019-07-09 16:04:40 +08:00
Pengfei Ni
d45fee06da
Ensure upcoming nodes are different
2019-07-02 16:52:19 +08:00
silenceper
478660a6bb
fix error
2019-06-28 18:49:58 +08:00
Vivek Bagade
90aa28a077
Move pod packing in upcoming nodes to RunOnce from Estimator for performance improvements
2019-06-19 14:48:47 +02:00
Łukasz Osipiuk
0bcf5315a7
Do not fail loop iteration if unregistered nodes cannot be removed
...
Removing unregistered nodes is not the primary
responsibility of Cluster Autoscaler. We do not want to render CA
unusable (disable scale-up and scale-down) if unregistered nodes
cannot be removed for a prolonged period of time.
2019-06-10 13:45:54 +02:00
Jakub Tużnik
bb382f47f9
Retain information about scale-up failures in CSR
...
This will provide the AutoscalingStatusProcessor with information
about failed scale-ups.
2019-06-05 16:53:30 +02:00
Krzysztof Jastrzebski
4831d76288
Cache cloud provider node instances in cluster state.
2019-05-31 10:11:51 +02:00
Kubernetes Prow Robot
cb4e60f8d4
Merge pull request #2031 from krzysztof-jastrzebski/master
...
Add functionality which delays node deletion to let other components prepare for deletion.
2019-05-20 00:57:13 -07:00
Kubernetes Prow Robot
8d2ec08b2c
Merge pull request #2015 from losipiuk/lo/pass-via-context
...
Add methods for passing arbitrary object via autoscaling context
2019-05-17 08:12:07 -07:00
Łukasz Osipiuk
e76558c65f
Add methods for passing arbitrary object via autoscaling context
...
Change-Id: I066e58010a0aef4989bfc1f73b90bc69c773b26e
2019-05-17 16:38:12 +02:00
Krzysztof Jastrzebski
4247c8b032
Implement functionality which delays node deletion when node has
...
annotation with prefix
'delay-deletion.cluster-autoscaler.kubernetes.io/'.
2019-05-17 16:06:17 +02:00
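The annotation check described in the commit above can be sketched as follows. This is a minimal illustration, not the upstream code: only the prefix string comes from the commit message, while the helper name `hasDelayDeletionAnnotation` and the plain `map[string]string` annotations are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// delayDeletionPrefix is the annotation prefix quoted in the commit message.
const delayDeletionPrefix = "delay-deletion.cluster-autoscaler.kubernetes.io/"

// hasDelayDeletionAnnotation reports whether any annotation key starts with
// the delay-deletion prefix. Hypothetical helper, not the upstream function.
func hasDelayDeletionAnnotation(annotations map[string]string) bool {
	for key := range annotations {
		if strings.HasPrefix(key, delayDeletionPrefix) {
			return true
		}
	}
	return false
}

func main() {
	// "my-controller" is an illustrative suffix chosen for this example.
	annotated := map[string]string{delayDeletionPrefix + "my-controller": "true"}
	plain := map[string]string{"example.com/owner": "team-a"}

	fmt.Println(hasDelayDeletionAnnotation(annotated))
	fmt.Println(hasDelayDeletionAnnotation(plain))
}
```

A caller would presumably postpone deleting a node for as long as this check returns true, up to some timeout; that policy is not described in the commit message and is left out of the sketch.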
Chris Bradfield
92ea680f1a
Implement an --ignore-taint flag
...
This change adds support for a user to specify taints to ignore when
considering a node as a template for a node group.
2019-05-14 10:22:59 -07:00
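One way to picture the taint filtering described in the commit above is the sketch below. It assumes that ignored taints are matched by key, which the commit message does not state; the `Taint` struct is a simplified stand-in for the Kubernetes type and `dropIgnoredTaints` is a hypothetical name.

```go
package main

import "fmt"

// Taint is a simplified stand-in for the Kubernetes taint type.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

// dropIgnoredTaints returns the taints whose key is not in ignored.
// Matching by key is an assumption made for this sketch.
func dropIgnoredTaints(taints []Taint, ignored map[string]bool) []Taint {
	kept := make([]Taint, 0, len(taints))
	for _, t := range taints {
		if ignored[t.Key] {
			continue // skip taints the user asked to ignore
		}
		kept = append(kept, t)
	}
	return kept
}

func main() {
	// Both taint keys are illustrative values for this example.
	taints := []Taint{
		{Key: "example.com/not-ready", Effect: "NoSchedule"},
		{Key: "dedicated", Value: "gpu", Effect: "NoSchedule"},
	}
	ignored := map[string]bool{"example.com/not-ready": true}

	for _, t := range dropIgnoredTaints(taints, ignored) {
		fmt.Println(t.Key)
	}
}
```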
Łukasz Osipiuk
c9811e87b4
Include pods with NominatedNodeName in scheduledPods list used for scale-up considerations
...
Change-Id: Ie4c095b30bf0cd1f160f1ac4b8c1fcb8c0524096
2019-04-15 16:59:13 +02:00
Łukasz Osipiuk
db4c6f1133
Migrate filter out schedulable to PodListProcessor
2019-04-15 16:59:13 +02:00
Łukasz Osipiuk
5c09c50774
Pass ready nodes list to PodListProcessor
2019-04-15 16:59:13 +02:00
Łukasz Osipiuk
c6115b826e
Define ProcessorCallbacks interface
2019-04-15 16:59:13 +02:00
Jiaxin Shan
90666881d3
Move GPULabel and GPUTypes to cloud provider
2019-03-25 13:03:01 -07:00
Łukasz Osipiuk
2474dc2fd5
Call CloudProvider.Refresh before getNodeInfosForGroups
...
We need to call Refresh before getNodeInfosForGroups. If we have
stale state, getNodeInfosForGroups may fail and we will end up in an infinite crash loop.
2019-03-12 12:07:49 +01:00
Aleksandra Malinowska
62a28f3005
Soft taint when there are no candidates
2019-03-11 14:05:09 +01:00
Uday Ruddarraju
91b7bc08a1
Fixing minor error handling bug in static autoscaler
2019-03-07 15:16:27 -08:00
Aleksandra Malinowska
a824e87957
Only soft taint nodes if there's no scale down to do
2019-02-25 17:11:15 +01:00
Pengfei Ni
128729bae9
Move schedulercache to package nodeinfo
2019-02-21 12:41:08 +08:00
Jacek Kaniuk
d969baff22
Cache exemplar ready node for each node group
2019-02-11 17:40:58 +01:00
Vivek Bagade
c6b87841ce
Added a new method that uses pod packing to filter schedulable pods
...
filterOutSchedulableByPacking is an alternative to the older
filterOutSchedulable. filterOutSchedulableByPacking sorts pods in
unschedulableCandidates by priority and filters out pods that can be
scheduled on free capacity on existing nodes. It uses a basic packing
approach to do this. Pods with nominatedNodeName set are always
filtered out.
filterOutSchedulableByPacking is used by default, but this
can be toggled off by setting the filter-out-schedulable-pods-uses-packing
flag to false, which activates the older and more lenient
filterOutSchedulable (now called filterOutSchedulableSimple).
Added test cases for both methods.
2019-01-25 16:09:51 +05:30
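The packing approach described in the commit above can be sketched roughly as follows. This is not the upstream implementation: the `Pod` and `Node` types, the CPU-only capacity model, the first-fit placement, and the highest-priority-first ordering are all assumptions chosen to keep the example small. Only the overall idea (sort by priority, pack onto free capacity, always drop pods with a nominated node) comes from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod and Node are simplified, hypothetical stand-ins for the real types.
type Pod struct {
	Name              string
	Priority          int
	CPUMilli          int64
	NominatedNodeName string
}

type Node struct {
	Name         string
	FreeCPUMilli int64
}

// filterOutSchedulableByPackingSketch returns the pods that do not fit on
// the free capacity of the given nodes, using first-fit on CPU only.
func filterOutSchedulableByPackingSketch(pods []Pod, nodes []Node) []Pod {
	// Work on a copy so the caller's slice order is preserved.
	sorted := append([]Pod(nil), pods...)
	// Assumption: higher-priority pods get the first chance at free capacity.
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})

	free := make([]int64, len(nodes))
	for i, n := range nodes {
		free[i] = n.FreeCPUMilli
	}

	var unschedulable []Pod
	for _, p := range sorted {
		if p.NominatedNodeName != "" {
			continue // pods with a nominated node are always filtered out
		}
		placed := false
		for i := range free {
			if free[i] >= p.CPUMilli {
				free[i] -= p.CPUMilli // reserve capacity for this pod
				placed = true
				break
			}
		}
		if !placed {
			unschedulable = append(unschedulable, p)
		}
	}
	return unschedulable
}

func main() {
	pods := []Pod{
		{Name: "low", Priority: 5, CPUMilli: 600},
		{Name: "high", Priority: 10, CPUMilli: 600},
		{Name: "nominated", Priority: 1, CPUMilli: 5000, NominatedNodeName: "node-1"},
	}
	nodes := []Node{{Name: "node-1", FreeCPUMilli: 1000}}

	for _, p := range filterOutSchedulableByPackingSketch(pods, nodes) {
		fmt.Println(p.Name)
	}
}
```

Because capacity is reserved as each pod is placed, two pods cannot both claim the same free space, which is what makes this stricter than a check that tests each pod against the nodes independently.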
Vivek Bagade
79ef3a6940
unexporting methods in utils.go
2019-01-25 00:06:03 +05:30
Jacek Kaniuk
0c64e0932a
Tainting unneeded nodes as PreferNoSchedule
2019-01-21 13:06:50 +01:00
Łukasz Osipiuk
85a83b62bd
Pass nodeGroup->NodeInfo map to ClusterStateRegistry
...
Change-Id: Ie2a51694b5731b39c8a4135355a3b4c832c26801
2019-01-08 15:52:00 +01:00
Kubernetes Prow Robot
ab7f1e69be
Merge pull request #1464 from losipiuk/lo/stockouts2
...
Better quota-exceeded/stockout handling
2018-12-31 05:28:08 -08:00
Łukasz Osipiuk
2fbae197f4
Handle possible stockout/quota scale-up errors
2018-12-28 17:17:07 +01:00
Maciej Pytel
60babe7158
Use kubernetes lister for daemonset instead of custom one
...
Also migrate to using apps/v1.DaemonSet instead of old
extensions/v1beta1.
2018-12-28 13:55:41 +01:00
Thomas Hartland
d0dd00c602
Fix logged error in static autoscaler
2018-12-04 16:59:57 +01:00
Łukasz Osipiuk
016bf7fc2c
Use k8s.io/klog instead of github.com/golang/glog
2018-11-26 17:30:31 +01:00
Łukasz Osipiuk
5962354c81
Inject Backoff instance to ClusterStateRegistry on creation
2018-11-13 14:25:16 +01:00
Aleksandra Malinowska
bf6ff4be8e
Clean up estimators
2018-11-06 14:15:42 +01:00
Jakub Tużnik
8179e4e716
Refactor the scale-(up|down) status processors so that they have more info available
...
Replace the simple boolean ScaledUp property of ScaleUpStatus with a more
comprehensive ScaleUpResult. Add more possible values to ScaleDownResult.
Refactor the processors execution so that they are always executed every
iteration, even if RunOnce exits earlier.
2018-09-20 17:12:02 +02:00
Steve Scaffidi
88d857222d
Renamed one more variable for consistency
...
Change-Id: Idf42fd58089a1e75f3291ab7cc583735c68735f2
2018-09-17 14:08:10 -04:00
Steve Scaffidi
56b5456269
Fixing nits: renamed newPodScaleUpBuffer -> newPodScaleUpDelay, deleted redundant comment
...
Change-Id: I7969194d8e07e2fb34029d0d7990341c891d0623
2018-09-17 10:38:28 -04:00
Steve Scaffidi
33b93cbc5f
Add configurable delay for pod age before considering for scale-up
...
- This is intended to address the issue described in https://github.com/kubernetes/autoscaler/issues/923
- the delay is configurable via a CLI option
- in production (on AWS) we set this to a value of 2m
- the delay could possibly be set as low as 30s and still be effective depending on your workload and environment
- the default of 0 for the CLI option results in no change to the CA's behavior from defaults.
Change-Id: I7e3f36bb48641faaf8a392cca01a12b07fb0ee35
2018-09-14 13:55:09 -04:00
Jakub Tużnik
71111da20c
Add a scale down status processor, refactor so that there's more scale down info available to it
2018-09-12 14:52:20 +02:00
Jakub Tużnik
054f0b3b90
Add AutoscalingStatusProcessor
2018-08-07 14:47:06 +02:00