Replace the simple boolean ScaledUp property of ScaleUpStatus with a more
comprehensive ScaleUpResult. Add more possible values to ScaleDownResult.
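A minimal sketch of what the richer status might look like; the concrete result values below are assumptions based on this description, not the exact upstream names.

```go
package status

// ScaleUpResult replaces the old boolean ScaledUp field. The values are
// illustrative assumptions, not the exact names used upstream.
type ScaleUpResult int

const (
	// ScaleUpSuccessful means new nodes were requested from the cloud provider.
	ScaleUpSuccessful ScaleUpResult = iota
	// ScaleUpError means the scale-up attempt failed.
	ScaleUpError
	// ScaleUpNoOptionsAvailable means no node group could help the pending pods.
	ScaleUpNoOptionsAvailable
	// ScaleUpNotNeeded means no scale-up was required in this iteration.
	ScaleUpNotNeeded
)

// ScaleUpStatus carries the detailed result instead of a plain bool.
type ScaleUpStatus struct {
	Result ScaleUpResult
}

// WasSuccessful mirrors the old boolean check.
func (s *ScaleUpStatus) WasSuccessful() bool {
	return s.Result == ScaleUpSuccessful
}
```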
Refactor processor execution so that processors are always executed on every
iteration, even if RunOnce exits early.
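One way to get this guarantee is to defer the processor invocation at the top of RunOnce. A minimal sketch under that assumption; the types and helper below are simplified stand-ins, not the actual autoscaler code.

```go
package main

import "fmt"

// autoscaler is a simplified stand-in for the real StaticAutoscaler.
type autoscaler struct {
	scaleUpNeeded bool
}

// runProcessors is a hypothetical hook standing in for the status processors.
func (a *autoscaler) runProcessors(status string) {
	fmt.Println("processors ran, status:", status)
}

// RunOnce shows the pattern: because the processor call is deferred, it runs
// on every iteration, even when the method returns early.
func (a *autoscaler) RunOnce() error {
	status := "scale-up not needed"
	defer func() { a.runProcessors(status) }()

	if !a.scaleUpNeeded {
		return nil // early exit; processors still run via the deferred call
	}

	status = "scale-up attempted"
	return nil
}

func main() {
	_ = (&autoscaler{}).RunOnce()
}
```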
- This is intended to address the issue described in https://github.com/kubernetes/autoscaler/issues/923
- the delay is configurable via a CLI option (see the sketch below)
- in production (on AWS) we set this to 2m
- the delay could likely be set as low as 30s and still be effective, depending on your workload and environment
- the default of 0 for the CLI option leaves the CA's behavior unchanged from the defaults.
Change-Id: I7e3f36bb48641faaf8a392cca01a12b07fb0ee35
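A hedged sketch of how the delay could be wired up; the flag name, the pod stand-in and the filtering helper are assumptions for illustration, not the actual CA code.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// pod is a minimal stand-in for a Kubernetes pod with a creation timestamp.
type pod struct {
	name      string
	createdAt time.Time
}

// newPodScaleUpDelay mirrors the CLI option described above; the flag name is
// an assumption, and the default of 0 keeps the original behavior.
var newPodScaleUpDelay = flag.Duration("new-pod-scale-up-delay", 0,
	"pods younger than this are ignored when considering scale-up")

// filterOldEnough drops pods that are younger than the configured delay, so a
// batch of freshly created pods can settle before a scale-up decision is made.
func filterOldEnough(pods []pod, now time.Time, delay time.Duration) []pod {
	var out []pod
	for _, p := range pods {
		if now.Sub(p.createdAt) >= delay {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	flag.Parse()
	now := time.Now()
	pods := []pod{
		{name: "just-created", createdAt: now.Add(-10 * time.Second)},
		{name: "old-enough", createdAt: now.Add(-5 * time.Minute)},
	}
	fmt.Println(filterOldEnough(pods, now, *newPodScaleUpDelay))
}
```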
No change in actual logic being executed. Added a new
NodeGroupListProcessor interface to encapsulate the existing logic.
Moved PodListProcessor and refactored how it's passed around
to make it consistent and easy to add similar interfaces.
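A rough sketch of what such list-processor interfaces look like; the method signatures and types are simplified assumptions, not the exact upstream definitions.

```go
package processors

// Pod and NodeGroup are simplified stand-ins for the real Kubernetes and
// cloud provider types.
type Pod struct{ Name string }
type NodeGroup struct{ ID string }

// PodListProcessor can filter or reorder unschedulable pods before the
// scale-up logic sees them.
type PodListProcessor interface {
	Process(unschedulablePods []*Pod) ([]*Pod, error)
	CleanUp()
}

// NodeGroupListProcessor encapsulates the existing logic that selects which
// node groups are considered as expansion candidates for a set of pods.
type NodeGroupListProcessor interface {
	Process(nodeGroups []NodeGroup, unschedulablePods []*Pod) ([]NodeGroup, error)
	CleanUp()
}
```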
AutoscalingContext is essentially configuration plus a few static helpers
and API handles.
ClusterStateRegistry is state, and has therefore been moved to the other
state-keeping objects.
The CleanUp method is instead called directly from the implementation
when required.
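The split might look roughly like this; the field names are assumptions chosen to illustrate the separation of static configuration from mutable state.

```go
package core

import "time"

// AutoscalingOptions stands in for the static configuration knobs.
type AutoscalingOptions struct {
	MaxNodesTotal int
	ScanInterval  time.Duration
}

// AutoscalingContext holds configuration and long-lived handles only;
// it carries no mutable per-iteration state.
type AutoscalingContext struct {
	Options AutoscalingOptions
	// ... API clients, cloud provider handle, predicate checker, etc.
}

// ClusterStateRegistry is mutable state, so it lives next to the other
// state-keeping objects rather than inside the context.
type ClusterStateRegistry struct {
	lastScaleUpTime time.Time
}

// CleanUp is invoked directly by the autoscaler implementation when required,
// instead of being routed through the context.
func (r *ClusterStateRegistry) CleanUp() {
	r.lastScaleUpTime = time.Time{}
}

// StaticAutoscaler owns both the static context and the mutable registry.
type StaticAutoscaler struct {
	Context      *AutoscalingContext
	ClusterState *ClusterStateRegistry
}
```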
The test was updated in a quick way, since the mock we're using does not
support AtLeast(1); hence Times(2).
Nodes with GPUs are expensive and it's likely that a batch of pods
using them will be created at once. In this case we can wait a bit
for all the pods to be created in order to make a more efficient
scale-up decision.
When a machine with a GPU becomes ready, it can take up to 15 minutes
before it reports that the GPU is allocatable. This can cause Cluster
Autoscaler to trigger a second, unnecessary scale-up.
The workaround sets allocatable equal to capacity for the GPU resource,
so that a node still waiting for its GPUs to become ready to use is
considered a place where pods requesting GPUs can be scheduled.
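A minimal sketch of the workaround, assuming the standard nvidia.com/gpu extended resource and client-go node types; the helper name is made up for illustration.

```go
package gpu

import (
	apiv1 "k8s.io/api/core/v1"
)

// resourceNvidiaGPU is the extended resource name commonly used for NVIDIA GPUs.
const resourceNvidiaGPU = apiv1.ResourceName("nvidia.com/gpu")

// setGPUAllocatableToCapacity returns a copy of the node in which GPU
// allocatable is forced to match GPU capacity, so a node whose GPUs are not
// yet reported as allocatable still counts as a place where GPU pods fit.
func setGPUAllocatableToCapacity(node *apiv1.Node) *apiv1.Node {
	capacity, ok := node.Status.Capacity[resourceNvidiaGPU]
	if !ok {
		return node // no GPUs on this node, nothing to do
	}
	updated := node.DeepCopy()
	if updated.Status.Allocatable == nil {
		updated.Status.Allocatable = apiv1.ResourceList{}
	}
	updated.Status.Allocatable[resourceNvidiaGPU] = capacity.DeepCopy()
	return updated
}
```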
This can be used to dynamically update the cloud provider configuration
(in particular the list of managed NodeGroups and their min/max
constraints).
Add a GKE implementation.
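A hedged sketch of how a provider could reload its node group list dynamically, assuming this refers to a Refresh() hook on the cloud provider interface; the config-loading hook and types below are assumptions, not the GKE implementation.

```go
package provider

import "sync"

// nodeGroupSpec is an assumed shape for an externally managed node group.
type nodeGroupSpec struct {
	Name    string
	MinSize int
	MaxSize int
}

// dynamicProvider sketches a cloud provider whose managed node groups and
// min/max constraints are re-read on every Refresh call.
type dynamicProvider struct {
	mu         sync.Mutex
	nodeGroups []nodeGroupSpec
	// loadSpecs is a hypothetical hook that fetches the current config,
	// e.g. from the GKE API or an instance-group listing.
	loadSpecs func() ([]nodeGroupSpec, error)
}

// Refresh reloads the node group list; calling it once per autoscaler loop is
// what makes the configuration dynamic.
func (p *dynamicProvider) Refresh() error {
	specs, err := p.loadSpecs()
	if err != nil {
		return err
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	p.nodeGroups = specs
	return nil
}
```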