115 Commits

Author SHA1 Message Date
Jacek Kaniuk d969baff22 Cache exemplar ready node for each node group 2019-02-11 17:40:58 +01:00
Vivek Bagade c6b87841ce Added a new method that uses pod packing to filter schedulable pods
filterOutSchedulableByPacking is an alternative to the older
filterOutSchedulable. filterOutSchedulableByPacking sorts pods in
unschedulableCandidates by priority and filters out pods that can be
scheduled on free capacity on existing nodes. It uses a basic packing
approach to do this. Pods with nominatedNodeName set are always
filtered out.

filterOutSchedulableByPacking is used by default, but this can be
toggled off by setting the filter-out-schedulable-pods-uses-packing
flag to false, which activates the older and more lenient
filterOutSchedulable (now called filterOutSchedulableSimple).

Added test cases for both methods.
2019-01-25 16:09:51 +05:30
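The packing approach described in commit c6b87841ce can be illustrated with a minimal sketch. It is not the code from that commit: the `pod` and `node` types, the single CPU dimension, the highest-priority-first ordering, and the first-fit placement are all assumptions made for illustration. Only the overall idea (sort by priority, place pods on free capacity of existing nodes, always drop pods with a nominated node) comes from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// pod and node are simplified, hypothetical stand-ins for the real
// Kubernetes objects; only one resource (CPU) is modelled.
type pod struct {
	name          string
	priority      int
	cpuMilli      int
	nominatedNode string
}

type node struct {
	name         string
	freeCPUMilli int
}

// filterByPacking returns the pods that do not fit on the free capacity of
// the existing nodes. Pods are considered highest priority first (an
// assumption) and placed on the first node with enough room.
func filterByPacking(pods []pod, nodes []node) []pod {
	// Work on copies so the caller's slices are left untouched.
	sorted := append([]pod(nil), pods...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].priority > sorted[j].priority
	})
	free := make([]int, len(nodes))
	for i, n := range nodes {
		free[i] = n.freeCPUMilli
	}

	var remaining []pod
	for _, p := range sorted {
		// Pods with a nominated node are always filtered out.
		if p.nominatedNode != "" {
			continue
		}
		placed := false
		for i := range free {
			if free[i] >= p.cpuMilli {
				free[i] -= p.cpuMilli // reserve the capacity for this pod
				placed = true
				break
			}
		}
		if !placed {
			remaining = append(remaining, p)
		}
	}
	return remaining
}

func main() {
	nodes := []node{{"n1", 1000}, {"n2", 500}}
	pods := []pod{
		{name: "c", priority: 1, cpuMilli: 300},
		{name: "a", priority: 10, cpuMilli: 800},
		{name: "d", priority: 0, cpuMilli: 2000, nominatedNode: "n1"},
		{name: "b", priority: 5, cpuMilli: 500},
	}
	for _, p := range filterByPacking(pods, nodes) {
		fmt.Println("still unschedulable:", p.name)
	}
}
```

Because capacity is reserved as each pod is placed, a later pod cannot reuse the same free space, which is what makes this stricter than a check that tests every pod against the untouched free capacity.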
Vivek Bagade 79ef3a6940 unexporting methods in utils.go 2019-01-25 00:06:03 +05:30
Jacek Kaniuk 0c64e0932a Tainting unneeded nodes as PreferNoSchedule 2019-01-21 13:06:50 +01:00
Łukasz Osipiuk 85a83b62bd Pass nodeGroup->NodeInfo map to ClusterStateRegistry
Change-Id: Ie2a51694b5731b39c8a4135355a3b4c832c26801
2019-01-08 15:52:00 +01:00
Kubernetes Prow Robot ab7f1e69be
Merge pull request #1464 from losipiuk/lo/stockouts2
Better quota-exceeded/stockout handling
2018-12-31 05:28:08 -08:00
Łukasz Osipiuk 2fbae197f4 Handle possible stockout/quota scale-up errors 2018-12-28 17:17:07 +01:00
Maciej Pytel 60babe7158 Use kubernetes lister for daemonset instead of custom one
Also migrate to using apps/v1.DaemonSet instead of old
extensions/v1beta1.
2018-12-28 13:55:41 +01:00
Thomas Hartland d0dd00c602 Fix logged error in static autoscaler 2018-12-04 16:59:57 +01:00
Łukasz Osipiuk 016bf7fc2c Use k8s.io/klog instead of github.com/golang/glog 2018-11-26 17:30:31 +01:00
Łukasz Osipiuk 5962354c81 Inject Backoff instance to ClusterStateRegistry on creation 2018-11-13 14:25:16 +01:00
Aleksandra Malinowska bf6ff4be8e Clean up estimators 2018-11-06 14:15:42 +01:00
Jakub Tużnik 8179e4e716 Refactor the scale-(up|down) status processors so that they have more info available
Replace the simple boolean ScaledUp property of ScaleUpStatus with a more
comprehensive ScaleUpResult. Add more possible values to ScaleDownResult.
Refactor the processors execution so that they are always executed every
iteration, even if RunOnce exits earlier.
2018-09-20 17:12:02 +02:00
Steve Scaffidi 88d857222d Renamed one more variable for consistency
Change-Id: Idf42fd58089a1e75f3291ab7cc583735c68735f2
2018-09-17 14:08:10 -04:00
Steve Scaffidi 56b5456269 Fixing nits: renamed newPodScaleUpBuffer -> newPodScaleUpDelay, deleted redundant comment
Change-Id: I7969194d8e07e2fb34029d0d7990341c891d0623
2018-09-17 10:38:28 -04:00
Steve Scaffidi 33b93cbc5f Add configurable delay for pod age before considering for scale-up
- This is intended to address the issue described in https://github.com/kubernetes/autoscaler/issues/923
  - the delay is configurable via a CLI option
  - in production (on AWS) we set this to a value of 2m
  - the delay could possibly be set as low as 30s and still be effective depending on your workload and environment
  - the default of 0 for the CLI option results in no change to the CA's behavior from defaults.

Change-Id: I7e3f36bb48641faaf8a392cca01a12b07fb0ee35
2018-09-14 13:55:09 -04:00
Jakub Tużnik 71111da20c Add a scale down status processor, refactor so that there's more scale down info available to it 2018-09-12 14:52:20 +02:00
Jakub Tużnik 054f0b3b90 Add AutoscalingStatusProcessor 2018-08-07 14:47:06 +02:00
Aleksandra Malinowska 90e8a7a2d9 Move initializing defaults out of main 2018-08-02 14:04:03 +02:00
Aleksandra Malinowska 6f9b6f8290 Move ListerRegistry to context 2018-07-26 13:31:49 +02:00
Aleksandra Malinowska 07e52e6c79 Move creating cloud provider out of context 2018-07-25 13:43:47 +02:00
Aleksandra Malinowska 0976d2aa07 Move autoscaling options out of static 2018-07-25 10:52:37 +02:00
Aleksandra Malinowska 6b94d7172d Move AutoscalingOptions to config/static 2018-07-23 15:52:27 +02:00
Krzysztof Jastrzebski 2df2568841 Move removing unneeded autoprovisioned node groups to node group manager 2018-06-22 14:26:12 +02:00
Beata Skiba b8ae6df5d3 Add post scale up status processor. 2018-06-06 13:34:49 +02:00
Maciej Pytel 856855987b Move some GKE-specific logic outside core
No change in actual logic being executed. Added a new
NodeGroupListProcessor interface to encapsulate the existing logic.
Moved PodListProcessor and refactored how it's passed around
to make it consistent and easy to add similar interfaces.
2018-05-29 12:57:19 +02:00
Maciej Pytel 5faa41e683 Move PodListProcessor to new directory
It's not really a util, and with more processors
coming it makes more sense to keep them in a dedicated place.
2018-05-29 12:00:47 +02:00
Karol Gołąb 4c710950de Move ClusterStateRegistry to StaticAutoscaler
AutoscalingContext is basically configuration plus a few static helpers
and API handles.
ClusterStateRegistry is state and is thus moved to the other
state-keeping objects.
2018-05-24 13:03:01 +02:00
Karol Gołąb 5bfab7d9b2 Return value moved to the caller 2018-05-18 14:59:15 +02:00
Karol Gołąb fa6f25a70a Extract ClusterStateRegistry update with its soft dependency 2018-05-18 10:25:15 +02:00
Karol Gołąb dc34b43a40 Extract another tiny method 2018-05-18 10:10:51 +02:00
Karol Gołąb 34f6a45a04 Extract method to hide a tiny bit of complexity 2018-05-18 10:01:52 +02:00
Aleksandra Malinowska d7dc3616f7
Merge pull request #868 from kgolab/kg-clean-up-010
Move metrics update to proper place
2018-05-17 14:52:18 +02:00
Karol Gołąb e31bf0bb58 Move metrics.Autoscaling after all Node-level operations & checks 2018-05-17 14:37:43 +02:00
Aleksandra Malinowska 3b6cfc7c2b
Merge pull request #870 from kgolab/kg-clean-up-012
Set lastScaleDownFailTime properly
2018-05-17 12:09:15 +02:00
MaciekPytel 444201d1e7
Merge pull request #871 from kgolab/kg-clean-up-013
Extract duplicate code into a single method
2018-05-17 11:49:49 +02:00
Karol Gołąb 400147a075 Extract duplicate code into a single method 2018-05-17 10:01:04 +02:00
Karol Gołąb b8cbdf4178 Set lastScaleDownFailTime properly - the ScaleDownError check was unreachable 2018-05-17 09:50:22 +02:00
Karol Gołąb 38a5951e22 Check glog.V once 2018-05-17 09:47:52 +02:00
Karol Gołąb ccca078a2b Move metrics update to proper place 2018-05-17 09:46:25 +02:00
MaciekPytel bc39d4dcd5
Merge pull request #842 from kgolab/kg-clean-up-008
Merge two variables into one.
2018-05-14 10:54:43 +02:00
Aleksandra Malinowska b52ec59b05 Fix cleaning up taints 2018-05-11 12:00:48 +02:00
Karol Gołąb f1f92f065e Merge two variables into one. 2018-05-10 14:32:37 +02:00
Karol Gołąb ae203ed517 Removed unused CloudProvider() method. 2018-05-08 11:23:55 +02:00
Karol Gołąb 854fcc1ff8 Remove implementation details (CleanUp) from the interface.
The CleanUp method is instead called directly from the implementation,
when required.
Test updated in a quick way since the mock we're using does not support
AtLeast(1) - thus Times(2).
2018-05-07 15:24:14 +02:00
Krzysztof Jastrzebski 88b769b324 Refactor cluster autoscaler builder and add pod list processor. 2018-04-26 12:37:51 +02:00
Aleksandra Malinowska 7e1353a865 Ignore TPU resource in simulations 2018-04-11 12:26:22 +02:00
Maciej Pytel abbc45da2e Delay scale-up including GPU request
Nodes with GPUs are expensive and it's likely that a bunch of pods
using them will be created in a batch. In this case we can
wait a bit for all pods to be created to make a more efficient
scale-up decision.
2018-03-02 15:55:04 +01:00
anniedy bf59e3daa5 Typo fix unneded->[unneeded] (#623)
* Update clusterstate.md

* Update scale_down.go

* Update static_autoscaler.go
2018-02-07 17:36:58 +01:00
Beata Skiba 346a5c26a9 Remove old unregistered nodes before checking cluster healthiness 2018-02-01 16:34:50 +01:00