Vivek Bagade
79ef3a6940
unexporting methods in utils.go
2019-01-25 00:06:03 +05:30
Łukasz Osipiuk
85a83b62bd
Pass nodeGroup->NodeInfo map to ClusterStateRegistry
...
Change-Id: Ie2a51694b5731b39c8a4135355a3b4c832c26801
2019-01-08 15:52:00 +01:00
Kubernetes Prow Robot
4002559a4c
Merge pull request #1516 from frobware/fix-max-nodes-total-upstream
...
fix calculation of max cluster size
2019-01-03 10:02:38 -08:00
Maciej Pytel
3f0da8947a
Use listers in scale-up
2019-01-02 15:56:01 +01:00
Kubernetes Prow Robot
ab7f1e69be
Merge pull request #1464 from losipiuk/lo/stockouts2
...
Better quota-exceeded/stockout handling
2018-12-31 05:28:08 -08:00
Łukasz Osipiuk
9689b30ee4
Do not use time.Now() in RegisterFailedScaleUp
2018-12-28 17:17:07 +01:00
Łukasz Osipiuk
da5bef307b
Allow updating Increase for ScaleUpRequest in ClusterStateRegistry
2018-12-28 17:17:07 +01:00
Maciej Pytel
60babe7158
Use kubernetes lister for daemonset instead of custom one
...
Also migrate to using apps/v1.DaemonSet instead of the old
extensions/v1beta1.
2018-12-28 13:55:41 +01:00
Andrew McDermott
5bc77f051c
UPSTREAM: <carry>: fix calculation of max cluster size
...
When scaling up, the calculation for the maximum size of the cluster
based on `--max-nodes-total` doesn't take into account any nodes that
are in the process of coming up. This allows the cluster to grow
beyond the size specified.
With this change I now see:
scale_up.go:266] 21 other pods are also unschedulable
scale_up.go:423] Best option to resize: openshift-cluster-api/amcdermo-ca-worker-us-east-2b
scale_up.go:427] Estimated 18 nodes needed in openshift-cluster-api/amcdermo-ca-worker-us-east-2b
scale_up.go:432] Capping size to max cluster total size (23)
static_autoscaler.go:275] Failed to scale up: max node total count already reached
2018-12-18 17:05:19 +00:00
Łukasz Osipiuk
016bf7fc2c
Use k8s.io/klog instead of github.com/golang/glog
2018-11-26 17:30:31 +01:00
k8s-ci-robot
7008fb50be
Merge pull request #1380 from losipiuk/lo/backoff
...
Make Backoff interface
2018-11-07 05:13:43 -08:00
Aleksandra Malinowska
6febc1ddb0
Fix formatted log messages
2018-11-06 14:51:43 +01:00
Aleksandra Malinowska
bf6ff4be8e
Clean up estimators
2018-11-06 14:15:42 +01:00
Łukasz Osipiuk
0e2c3739b7
Use NodeGroup as key in Backoff
2018-10-30 18:17:26 +01:00
Łukasz Osipiuk
55fc1e2f00
Store NodeGroup in ScaleUpRequest and ScaleDownRequest
2018-10-30 18:03:04 +01:00
Maciej Pytel
6f5e6aab6f
Move node group balancing to processor
...
The goal is to allow customization of this logic
for different use cases and cloud providers.
2018-10-25 14:04:05 +02:00
Łukasz Osipiuk
a266420f6a
Recalculate clusterStateRegistry after adding multiple node groups
2018-10-02 17:15:20 +02:00
Łukasz Osipiuk
437efe4af6
If possible use nodeInfo based on created node group
2018-10-02 15:46:45 +02:00
Jakub Tużnik
8179e4e716
Refactor the scale-(up|down) status processors so that they have more info available
...
Replace the simple boolean ScaledUp property of ScaleUpStatus with a more
comprehensive ScaleUpResult. Add more possible values to ScaleDownResult.
Refactor the processors execution so that they are always executed every
iteration, even if RunOnce exits earlier.
2018-09-20 17:12:02 +02:00
Łukasz Osipiuk
bf8cfef10b
NodeGroupManager.CreateNodeGroup can return extra created node groups.
2018-09-19 13:55:51 +02:00
Łukasz Osipiuk
705a6d87e2
fixup! Call CheckPodsSchedulableOnNode in scale_up.go via caching layer
2018-09-17 13:01:19 +02:00
Łukasz Osipiuk
0ad4efe920
Call CheckPodsSchedulableOnNode in scale_up.go via caching layer
2018-09-13 17:01:15 +02:00
Aleksandra Malinowska
b88e6019f7
code review fixes 3
2018-08-28 18:11:04 +02:00
Aleksandra Malinowska
5620f76c62
Pass NoScaleUpInfo to ScaleUpStatus processor
2018-08-28 14:26:03 +02:00
Aleksandra Malinowska
cd9808185e
Report reason why pod didn't trigger scale-up
2018-08-28 14:11:36 +02:00
Aleksandra Malinowska
398a1ac153
Fix error on node info not found for group
2018-07-23 11:16:12 +02:00
Pengfei Ni
1dd0147d9e
Add more events for CA
2018-07-09 15:42:05 +08:00
Aleksandra Malinowska
800ee56b34
Refactor and extend GPU metrics error types
2018-07-05 13:13:11 +02:00
Karol Gołąb
aae4d1270a
Make GetGpuTypeForMetrics more robust
2018-06-26 21:35:16 +02:00
Karol Gołąb
5eb7021f82
Add GPU-related scaled_up & scaled_down metrics ( #974 )
...
* Add GPU-related scaled_up & scaled_down metrics
* Fix name to match SD naming convention
* Fix import after master rebase
* Change the logic to include GPU-being-installed nodes
2018-06-22 21:00:52 +02:00
Krzysztof Jastrzebski
99c8c51bb3
Create NodeGroupManager which is responsible for creating/deleting node groups.
2018-06-14 16:11:32 +02:00
Łukasz Osipiuk
b7323bc0d1
Respect GPU limits in scale_up
2018-06-14 15:46:58 +02:00
Łukasz Osipiuk
dfcbedb41f
Take into consideration nodes from not autoscaled groups when enforcing resource limits
2018-06-14 15:31:40 +02:00
Łukasz Osipiuk
9f75099d2c
Restructure checking resource limits in scale_up.go
...
Preparatory work before introducing GPU limits
2018-06-13 19:00:37 +02:00
Pengfei Ni
be3dd85503
Update scheduler cache package
2018-06-11 13:54:12 +08:00
Łukasz Osipiuk
9c61477d25
Do not return error when getting cpu/memory capacity of node
2018-06-08 15:04:57 +02:00
Beata Skiba
b8ae6df5d3
Add post scale up status processor.
2018-06-06 13:34:49 +02:00
Maciej Pytel
856855987b
Move some GKE-specific logic outside core
...
No change in actual logic being executed. Added a new
NodeGroupListProcessor interface to encapsulate the existing logic.
Moved PodListProcessor and refactor how it's passed around
to make it consistent and easy to add similar interfaces.
2018-05-29 12:57:19 +02:00
Krzysztof Jastrzebski
6761d7f354
Execute predicates only for similar pods.
2018-05-29 09:36:11 +02:00
Karol Gołąb
4c710950de
Move ClusterStateRegistry to StaticAutoscaler
...
AutoscalingContext is basically a configuration and a few static helpers
and API handles.
ClusterStateRegistry is state and thus moved to other state-keeping
objects.
2018-05-24 13:03:01 +02:00
Joachim Bartosik
bfb70e40ee
Allow passing taints to Node Group creation.
2018-05-18 14:33:33 +02:00
Krzysztof Jastrzebski
88b769b324
Refactor cluster autoscaler builder and add pod list processor.
2018-04-26 12:37:51 +02:00
Aleksandra Malinowska
feb4ad9e14
Add utility for limiting logging
2018-03-22 12:57:22 +01:00
Marcin Wielgus
04bec08e84
Compilation fix
2018-03-20 20:11:36 +01:00
Maciej Pytel
b7f8622eb2
Create node groups with GPU in scale-up.go
...
This is still not implemented in cloudprovider.
Extended NewNodeGroup interface to have a way of passing
parameters for more complex resources.
2017-12-11 13:12:22 +01:00
Maciej Pytel
c376ef3c87
Add metrics for autoprovisioning
2017-10-31 17:42:58 +01:00
Maciej Pytel
9c2ebccbfe
Write events when autoprovisioned nodegroup is created / deleted
2017-10-25 17:39:30 +02:00
Krzysztof Jastrzebski
56ac572666
Adds resource limits to cloud provider.
2017-10-23 16:06:56 +02:00
Maciej Pytel
02ccba3338
Update clusterstate after scale-up
2017-10-17 16:11:25 +02:00
Maciej Pytel
3498507220
Handle nodegroup id changing upon creation
2017-10-17 14:02:46 +02:00