autoscaler

Commit Graph

Author	SHA1	Message	Date
Łukasz Osipiuk	a266420f6a	Recalculate clusterStateRegistry after adding multiple node groups	2018-10-02 17:15:20 +02:00
Łukasz Osipiuk	437efe4af6	If possible use nodeInfo based on created node group	2018-10-02 15:46:45 +02:00
Jakub Tużnik	8179e4e716	Refactor the scale-(up|down) status processors so that they have more info available Replace the simple boolean ScaledUp property of ScaleUpStatus with a more comprehensive ScaleUpResult. Add more possible values to ScaleDownResult. Refactor the processors execution so that they are always executed every iteration, even if RunOnce exits earlier.	2018-09-20 17:12:02 +02:00
Łukasz Osipiuk	bf8cfef10b	NodeGroupManager.CreateNodeGroup can return extra created node groups.	2018-09-19 13:55:51 +02:00
Łukasz Osipiuk	705a6d87e2	fixup! Call CheckPodsSchedulableOnNode in scale_up.go via caching layer	2018-09-17 13:01:19 +02:00
Łukasz Osipiuk	0ad4efe920	Call CheckPodsSchedulableOnNode in scale_up.go via caching layer	2018-09-13 17:01:15 +02:00
Aleksandra Malinowska	b88e6019f7	code review fixes 3	2018-08-28 18:11:04 +02:00
Aleksandra Malinowska	5620f76c62	Pass NoScaleUpInfo to ScaleUpStatus processor	2018-08-28 14:26:03 +02:00
Aleksandra Malinowska	cd9808185e	Report reason why pod didn't trigger scale-up	2018-08-28 14:11:36 +02:00
Aleksandra Malinowska	398a1ac153	Fix error on node info not found for group	2018-07-23 11:16:12 +02:00
Pengfei Ni	1dd0147d9e	Add more events for CA	2018-07-09 15:42:05 +08:00
Aleksandra Malinowska	800ee56b34	Refactor and extend GPU metrics error types	2018-07-05 13:13:11 +02:00
Karol Gołąb	aae4d1270a	Make GetGpuTypeForMetrics more robust	2018-06-26 21:35:16 +02:00
Karol Gołąb	5eb7021f82	Add GPU-related scaled_up & scaled_down metrics (#974) * Add GPU-related scaled_up & scaled_down metrics * Fix name to match SD naming convention * Fix import after master rebase * Change the logic to include GPU-being-installed nodes	2018-06-22 21:00:52 +02:00
Krzysztof Jastrzebski	99c8c51bb3	Create NodeGroupManager which is responsible for creating/deleting node groups.	2018-06-14 16:11:32 +02:00
Łukasz Osipiuk	b7323bc0d1	Respect GPU limits in scale_up	2018-06-14 15:46:58 +02:00
Łukasz Osipiuk	dfcbedb41f	Take into consideration nodes from not autoscaled groups when enforcing resource limits	2018-06-14 15:31:40 +02:00
Łukasz Osipiuk	9f75099d2c	Restructure checking resource limits in scale_up.go Preparatory work for before introducing GPU limits	2018-06-13 19:00:37 +02:00
Pengfei Ni	be3dd85503	Update scheduler cache package	2018-06-11 13:54:12 +08:00
Łukasz Osipiuk	9c61477d25	Do not return error when getting cpu/memory capacity of node	2018-06-08 15:04:57 +02:00
Beata Skiba	b8ae6df5d3	Add post scale up status processor.	2018-06-06 13:34:49 +02:00
Maciej Pytel	856855987b	Move some GKE-specific logic outside core No change in actual logic being executed. Added a new NodeGroupListProcessor interface to encapsulate the existing logic. Moved PodListProcessor and refactor how it's passed around to make it consistent and easy to add similar interfaces.	2018-05-29 12:57:19 +02:00
Krzysztof Jastrzebski	6761d7f354	Execute predicates only for similar pods.	2018-05-29 09:36:11 +02:00
Karol Gołąb	4c710950de	Move ClusterStateRegistry to StaticAutoscaler AutoscalingContext is basically a configuration and few static helpers and API handles. ClusterStateRegistry is state and thus moved to other state-keeping objects.	2018-05-24 13:03:01 +02:00
Joachim Bartosik	bfb70e40ee	Allow passing taints to Node Group creation.	2018-05-18 14:33:33 +02:00
Krzysztof Jastrzebski	88b769b324	Refactor cluster autoscaler builder and add pod list processor.	2018-04-26 12:37:51 +02:00
Aleksandra Malinowska	feb4ad9e14	Add utility for limiting logging	2018-03-22 12:57:22 +01:00
Marcin Wielgus	04bec08e84	Compilation fix	2018-03-20 20:11:36 +01:00
Maciej Pytel	b7f8622eb2	Create node groups with GPU in scale-up.go This is still not implemented in cloudprovider. Extended NewNodeGroup inteface to have a way of passing parameters for more complex resources.	2017-12-11 13:12:22 +01:00
Maciej Pytel	c376ef3c87	Add metrics for autoprovisioning	2017-10-31 17:42:58 +01:00
Maciej Pytel	9c2ebccbfe	Write events when autoprovisioned nodegroup is created / deleted	2017-10-25 17:39:30 +02:00
Krzysztof Jastrzebski	56ac572666	Adds resource limits to cloud provider.	2017-10-23 16:06:56 +02:00
Maciej Pytel	02ccba3338	Update clusterstate after scale-up	2017-10-17 16:11:25 +02:00
Maciej Pytel	3498507220	Handle nodegroup id changing upon creation	2017-10-17 14:02:46 +02:00
Maciej Pytel	e12ee88f5f	Add failed scale-up reason in metric	2017-09-26 13:40:34 +02:00
Aleksandra Malinowska	197b05b180	respect minimum cores/memory limit during scale down	2017-09-13 10:10:47 +02:00
Krzysztof Jastrzebski	b1396c3cd1	Fix filtering for autoprovisioned node groups and add unit test.	2017-09-12 16:20:23 +02:00
Aleksandra Malinowska	d43029c180	implement blocking scale up beyond max cores & memory	2017-09-08 12:50:00 +02:00
Marcin Wielgus	e85e94510d	Tests for add autoprovisioned node groups	2017-09-06 02:44:16 +02:00
Marcin Wielgus	1ad8d9e10c	Build template NodeInfo for node autoprovisioning	2017-09-05 17:28:49 +02:00
Sergey Lanzman	437a3f60e1	Small optimize code	2017-09-04 23:50:45 +03:00
Marcin Wielgus	ae00f0544b	Merge pull request #290 from mwielgus/max-nap-groups Limit autoprovisioned groups to 15	2017-09-01 23:49:33 +05:30
Marcin Wielgus	de524a6688	Limit autoprovisioned groups to 15	2017-09-01 18:25:28 +02:00
Maciej Pytel	a86268f114	Write event on scale-up failure	2017-09-01 13:34:20 +02:00
Marcin Wielgus	f217d4ac93	Do not return error from exist	2017-09-01 00:24:01 +02:00
Marcin Wielgus	22f856d4da	Small refactoring in ScaleUp	2017-08-31 13:21:20 +02:00
Marcin Wielgus	6b9e56f0f9	Node autoprovisioning in scale up	2017-08-31 01:33:52 +02:00
Maciej Pytel	281afa7147	precompute predicateMetadata in scale-down	2017-08-29 16:29:45 +02:00
Maciej Pytel	fb6ef75d12	Don't create verbose errors in predicates if we ignore them Turns out all this string formatting is pretty damn expensive.	2017-08-24 15:18:38 +02:00
Maciej Pytel	6aacbb5bf7	Backoff for node group after failed scale-up	2017-08-04 15:40:23 +02:00

63 Commits