Łukasz Osipiuk
a266420f6a
Recalculate clusterStateRegistry after adding multiple node groups
2018-10-02 17:15:20 +02:00
Łukasz Osipiuk
437efe4af6
If possible use nodeInfo based on created node group
2018-10-02 15:46:45 +02:00
Jakub Tużnik
8179e4e716
Refactor the scale-(up|down) status processors so that they have more info available
...
Replace the simple boolean ScaledUp property of ScaleUpStatus with a more
comprehensive ScaleUpResult. Add more possible values to ScaleDownResult.
Refactor the processors execution so that they are always executed every
iteration, even if RunOnce exits earlier.
2018-09-20 17:12:02 +02:00
Łukasz Osipiuk
bf8cfef10b
NodeGroupManager.CreateNodeGroup can return extra created node groups.
2018-09-19 13:55:51 +02:00
Łukasz Osipiuk
705a6d87e2
fixup! Call CheckPodsSchedulableOnNode in scale_up.go via caching layer
2018-09-17 13:01:19 +02:00
Łukasz Osipiuk
0ad4efe920
Call CheckPodsSchedulableOnNode in scale_up.go via caching layer
2018-09-13 17:01:15 +02:00
Aleksandra Malinowska
b88e6019f7
code review fixes 3
2018-08-28 18:11:04 +02:00
Aleksandra Malinowska
5620f76c62
Pass NoScaleUpInfo to ScaleUpStatus processor
2018-08-28 14:26:03 +02:00
Aleksandra Malinowska
cd9808185e
Report reason why pod didn't trigger scale-up
2018-08-28 14:11:36 +02:00
Aleksandra Malinowska
398a1ac153
Fix error on node info not found for group
2018-07-23 11:16:12 +02:00
Pengfei Ni
1dd0147d9e
Add more events for CA
2018-07-09 15:42:05 +08:00
Aleksandra Malinowska
800ee56b34
Refactor and extend GPU metrics error types
2018-07-05 13:13:11 +02:00
Karol Gołąb
aae4d1270a
Make GetGpuTypeForMetrics more robust
2018-06-26 21:35:16 +02:00
Karol Gołąb
5eb7021f82
Add GPU-related scaled_up & scaled_down metrics ( #974 )
...
* Add GPU-related scaled_up & scaled_down metrics
* Fix name to match SD naming convention
* Fix import after master rebase
* Change the logic to include GPU-being-installed nodes
2018-06-22 21:00:52 +02:00
Krzysztof Jastrzebski
99c8c51bb3
Create NodeGroupManager which is responsible for creating/deleting node groups.
2018-06-14 16:11:32 +02:00
Łukasz Osipiuk
b7323bc0d1
Respect GPU limits in scale_up
2018-06-14 15:46:58 +02:00
Łukasz Osipiuk
dfcbedb41f
Take into consideration nodes from not autoscaled groups when enforcing resource limits
2018-06-14 15:31:40 +02:00
Łukasz Osipiuk
9f75099d2c
Restructure checking resource limits in scale_up.go
...
Preparatory work before introducing GPU limits
2018-06-13 19:00:37 +02:00
Pengfei Ni
be3dd85503
Update scheduler cache package
2018-06-11 13:54:12 +08:00
Łukasz Osipiuk
9c61477d25
Do not return error when getting cpu/memory capacity of node
2018-06-08 15:04:57 +02:00
Beata Skiba
b8ae6df5d3
Add post scale up status processor.
2018-06-06 13:34:49 +02:00
Maciej Pytel
856855987b
Move some GKE-specific logic outside core
...
No change in actual logic being executed. Added a new
NodeGroupListProcessor interface to encapsulate the existing logic.
Moved PodListProcessor and refactored how it's passed around
to make it consistent and easy to add similar interfaces.
2018-05-29 12:57:19 +02:00
Krzysztof Jastrzebski
6761d7f354
Execute predicates only for similar pods.
2018-05-29 09:36:11 +02:00
Karol Gołąb
4c710950de
Move ClusterStateRegistry to StaticAutoscaler
...
AutoscalingContext is basically a configuration plus a few static helpers
and API handles.
ClusterStateRegistry is state and thus moved to other state-keeping
objects.
2018-05-24 13:03:01 +02:00
Joachim Bartosik
bfb70e40ee
Allow passing taints to Node Group creation.
2018-05-18 14:33:33 +02:00
Krzysztof Jastrzebski
88b769b324
Refactor cluster autoscaler builder and add pod list processor.
2018-04-26 12:37:51 +02:00
Aleksandra Malinowska
feb4ad9e14
Add utility for limiting logging
2018-03-22 12:57:22 +01:00
Marcin Wielgus
04bec08e84
Compilation fix
2018-03-20 20:11:36 +01:00
Maciej Pytel
b7f8622eb2
Create node groups with GPU in scale-up.go
...
This is still not implemented in cloudprovider.
Extended NewNodeGroup interface to have a way of passing
parameters for more complex resources.
2017-12-11 13:12:22 +01:00
Maciej Pytel
c376ef3c87
Add metrics for autoprovisioning
2017-10-31 17:42:58 +01:00
Maciej Pytel
9c2ebccbfe
Write events when autoprovisioned nodegroup is created / deleted
2017-10-25 17:39:30 +02:00
Krzysztof Jastrzebski
56ac572666
Adds resource limits to cloud provider.
2017-10-23 16:06:56 +02:00
Maciej Pytel
02ccba3338
Update clusterstate after scale-up
2017-10-17 16:11:25 +02:00
Maciej Pytel
3498507220
Handle nodegroup id changing upon creation
2017-10-17 14:02:46 +02:00
Maciej Pytel
e12ee88f5f
Add failed scale-up reason in metric
2017-09-26 13:40:34 +02:00
Aleksandra Malinowska
197b05b180
respect minimum cores/memory limit during scale down
2017-09-13 10:10:47 +02:00
Krzysztof Jastrzebski
b1396c3cd1
Fix filtering for autoprovisioned node groups and add unit test.
2017-09-12 16:20:23 +02:00
Aleksandra Malinowska
d43029c180
implement blocking scale up beyond max cores & memory
2017-09-08 12:50:00 +02:00
Marcin Wielgus
e85e94510d
Tests for add autoprovisioned node groups
2017-09-06 02:44:16 +02:00
Marcin Wielgus
1ad8d9e10c
Build template NodeInfo for node autoprovisioning
2017-09-05 17:28:49 +02:00
Sergey Lanzman
437a3f60e1
Small code optimization
2017-09-04 23:50:45 +03:00
Marcin Wielgus
ae00f0544b
Merge pull request #290 from mwielgus/max-nap-groups
...
Limit autoprovisioned groups to 15
2017-09-01 23:49:33 +05:30
Marcin Wielgus
de524a6688
Limit autoprovisioned groups to 15
2017-09-01 18:25:28 +02:00
Maciej Pytel
a86268f114
Write event on scale-up failure
2017-09-01 13:34:20 +02:00
Marcin Wielgus
f217d4ac93
Do not return error from exist
2017-09-01 00:24:01 +02:00
Marcin Wielgus
22f856d4da
Small refactoring in ScaleUp
2017-08-31 13:21:20 +02:00
Marcin Wielgus
6b9e56f0f9
Node autoprovisioning in scale up
2017-08-31 01:33:52 +02:00
Maciej Pytel
281afa7147
precompute predicateMetadata in scale-down
2017-08-29 16:29:45 +02:00
Maciej Pytel
fb6ef75d12
Don't create verbose errors in predicates if we ignore them
...
Turns out all this string formatting is pretty damn expensive.
2017-08-24 15:18:38 +02:00
Maciej Pytel
6aacbb5bf7
Backoff for node group after failed scale-up
2017-08-04 15:40:23 +02:00