autoscaler

Commit Graph

Author	SHA1	Message	Date
Vivek Bagade	79ef3a6940	unexporting methods in utils.go	2019-01-25 00:06:03 +05:30
Jacek Kaniuk	0c64e0932a	Tainting unneeded nodes as PreferNoSchedule	2019-01-21 13:06:50 +01:00
Maciej Pytel	9060014992	Use listers in scale-down	2018-12-31 14:55:38 +01:00
lsytj0413	672dddd23a	refactor(*): fix golint warning	2018-12-19 10:04:08 +08:00
Andrew McDermott	fd3fd85f26	UPSTREAM: <carry>: handle nil nodeGroup in calculateScaleDownGpusTotal Explicitly handle nil as a return value for nodeGroup in `calculateScaleDownGpusTotal()` when `NodeGroupForNode()` is called for GPU nodes that don't exist. The current logic generates a runtime exception: "reflect: call of reflect.Value.IsNil on zero Value" Looking through the rest of the tree all the other places that use this pattern additionally and explicitly check whether `nodeGroup == nil` first. This change now completes the pattern in `calculateScaleDownGpusTotal()`. Looking at the other occurrences of this pattern we see: ``` File: clusterstate/clusterstate.go 488:26: if nodeGroup == nil || reflect.ValueOf(nodeGroup).IsNil() { File: core/utils.go 231:26: if nodeGroup == nil || reflect.ValueOf(nodeGroup).IsNil() { 322:26: if nodeGroup == nil || reflect.ValueOf(nodeGroup).IsNil() { 394:27: if nodeGroup == nil || reflect.ValueOf(nodeGroup).IsNil() { 461:26: if nodeGroup == nil || reflect.ValueOf(nodeGroup).IsNil() { File: core/scale_down.go 185:6: if reflect.ValueOf(nodeGroup).IsNil() { 608:27: if nodeGroup == nil || reflect.ValueOf(nodeGroup).IsNil() { 747:26: if nodeGroup == nil || reflect.ValueOf(nodeGroup).IsNil() { 1010:25: if nodeGroup == nil || reflect.ValueOf(nodeGroup).IsNil() { ``` with the notable exception at core/scale_down.go:185 which is `calculateScaleDownGpusTotal()`. With this change, and invoking the autoscaler with: ``` ... --max-nodes-total=24 \ --cores-total=8:128 \ --memory-total=4:256 \ --gpu-total=nvidia.com/gpu:0:16 \ --gpu-total=amd.com/gpu:0:4 \ ... ``` I no longer see a runtime exception.	2018-12-05 18:54:07 +00:00
Łukasz Osipiuk	016bf7fc2c	Use k8s.io/klog instead github.com/golang/glog	2018-11-26 17:30:31 +01:00
Alex Price	4ae7acbacc	add flags to ignore daemonsets and mirror pods when calculating resource utilization of a node Adds the flag --ignore-daemonsets-utilization and --ignore-mirror-pods-utilization (defaults to false) and when enabled, factors DaemonSet and mirror pods out when calculating the resource utilization of a node.	2018-11-23 15:24:25 +11:00
Łukasz Osipiuk	55fc1e2f00	Store NodeGroup in ScaleUpRequest and ScaleDownRequest	2018-10-30 18:03:04 +01:00
Jakub Tużnik	71111da20c	Add a scale down status processor, refactor so that there's more scale down info available to it	2018-09-12 14:52:20 +02:00
Pengfei Ni	1dd0147d9e	Add more events for CA	2018-07-09 15:42:05 +08:00
Aleksandra Malinowska	800ee56b34	Refactor and extend GPU metrics error types	2018-07-05 13:13:11 +02:00
Karol Gołąb	aae4d1270a	Make GetGpuTypeForMetrics more robust	2018-06-26 21:35:16 +02:00
Marcin Wielgus	f2e76e2592	Merge pull request #1008 from krzysztof-jastrzebski/master Move removing unneeded autoprovisioned node groups to node group manager	2018-06-22 21:01:36 +02:00
Karol Gołąb	5eb7021f82	Add GPU-related scaled_up & scaled_down metrics (#974) * Add GPU-related scaled_up & scaled_down metrics * Fix name to match SD naming convention * Fix import after master rebase * Change the logic to include GPU-being-installed nodes	2018-06-22 21:00:52 +02:00
Krzysztof Jastrzebski	2df2568841	Move removing unneeded autoprovisioned node groups to node group manager	2018-06-22 14:26:12 +02:00
Nic Doye	ebadbda2b2	issues/933 Consider making UnremovableNodeRecheckTimeout configurable	2018-06-18 11:54:14 +01:00
Łukasz Osipiuk	b7323bc0d1	Respect GPU limits in scale_up	2018-06-14 15:46:58 +02:00
Łukasz Osipiuk	9f75099d2c	Restructure checking resource limits in scale_up.go Preparatory work for before introducing GPU limits	2018-06-13 19:00:37 +02:00
Łukasz Osipiuk	087a5cc9a9	Respect GPU limits in scale_down	2018-06-13 14:19:59 +02:00
Łukasz Osipiuk	1fa44a4d3a	Fix bug resulting resource limits not being enforced in scale_down	2018-06-11 16:39:07 +02:00
Łukasz Osipiuk	519064e1ec	Extract isNodeBeingDeleted function	2018-06-11 14:21:07 +02:00
Łukasz Osipiuk	6c57a01fc9	Restructure checking resource limits in scale_down.go	2018-06-11 14:02:40 +02:00
Łukasz Osipiuk	9c61477d25	Do not return error when getting cpu/memory capacity of node	2018-06-08 15:04:57 +02:00
Krzysztof Jastrzebski	adad14c2c9	Delete autoprovisioned node pool after all nodes are deleted.	2018-05-28 14:22:18 +02:00
Karol Gołąb	4c710950de	Move ClusterStateRegistry to StaticAutoscaler AutoscalingContext is basically a configuration and few static helpers and API handles. ClusterStateRegistry is state and thus moved to other state-keeping objects.	2018-05-24 13:03:01 +02:00
Aleksandra Malinowska	ffeebde8d8	Add support for rescheduled pods with the same name in drain	2018-05-10 12:00:56 +02:00
Marcin Wielgus	9c5728fd74	Merge pull request #836 from kgolab/kg-clean-up-004 Use timestamp argument	2018-05-08 20:24:37 +02:00
Karol Gołąb	53b1c6a394	Use timestamp argument	2018-05-08 13:08:30 +02:00
Karol Gołąb	da16642bcf	Make the code slightly more idiomatic go	2018-05-08 11:35:01 +02:00
Beata Skiba	054f6d8650	Merge pull request #794 from krzysztof-jastrzebski/pods Refactor cluster autoscaler builder and add pod list processor.	2018-04-26 13:08:56 +02:00
Krzysztof Jastrzebski	88b769b324	Refactor cluster autoscaler builder and add pod list processor.	2018-04-26 12:37:51 +02:00
Aleksandra Malinowska	3d599bfabe	Rephrase unremovable node warning	2018-04-18 13:43:32 +02:00
Aleksandra Malinowska	4c594db7f8	Run spellchecker	2018-03-15 15:47:49 +01:00
anniedy	bf59e3daa5	Typo fix unneded->[unneeded] (#623) * Update clusterstate.md * Update scale_down.go * Update static_autoscaler.go	2018-02-07 17:36:58 +01:00
Marcin Wielgus	439fd3c9ec	Merge pull request #411 from krzysztof-jastrzebski/priority Adds priority preemption support to cluster autoscaler.	2017-11-08 09:09:26 +01:00
Edward Tsang	4104a91991	more spelling fixes	2017-11-02 14:21:36 -07:00
Maciej Pytel	c376ef3c87	Add metrics for autoprovisioning	2017-10-31 17:42:58 +01:00
Maciej Pytel	9c2ebccbfe	Write events when autoprovisioned nodegroup is created / deleted	2017-10-25 17:39:30 +02:00
Krzysztof Jastrzebski	56ac572666	Adds resource limits to cloud provider.	2017-10-23 16:06:56 +02:00
Krzysztof Jastrzebski	d9c00e5ce1	Adds priority preemption support to cluster autoscaler.	2017-10-23 09:54:56 +02:00
Aleksandra Malinowska	4c31a57374	fix leaking taints in case of cloud provider error on node deletion	2017-09-22 17:55:48 +02:00
Marcin Wielgus	f04113d746	Remove TargetSize() from loops iterating over nodes	2017-09-13 22:33:17 +02:00
Aleksandra Malinowska	197b05b180	respect minimum cores/memory limit during scale down	2017-09-13 10:10:47 +02:00
Aleksandra Malinowska	187c02693e	Taint empty nodes to be deleted	2017-09-12 17:40:05 +02:00
Marcin Wielgus	3039a0e813	Merge pull request #319 from krzysztof-jastrzebski/core-test Core/static_autoscaler.go unit tests.	2017-09-12 13:11:11 +02:00
Beata Skiba	eba0fa2f95	Remove nodes that are not in the cluster from unremovableNodes	2017-09-11 20:01:02 +02:00
Krzysztof Jastrzebski	0aec68a46d	Core/static_autoscaler.go unit tests. Current time usage refactoring.	2017-09-11 15:07:21 +02:00
Marcin Wielgus	db63ac3a18	Merge pull request #324 from aleksandra-malinowska/scale-down-pod-not-found Add checking for pod not found error on eviction	2017-09-11 15:10:08 +05:30
Beata Skiba	6e5784a519	Always add empty nodes to unneeded nodes	2017-09-08 15:55:18 +02:00
Aleksandra Malinowska	fbc8462b10	Add checking for not found error	2017-09-08 15:45:44 +02:00

89 Commits
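
The nil check described in commit fd3fd85f26 can be illustrated with a minimal Go sketch. It is not the autoscaler's code: the `NodeGroup` interface, the `fakeGroup` type and the `isNilGroup` helper are hypothetical stand-ins, assumed here only to show why the `nodeGroup == nil` test has to come before `reflect.ValueOf(nodeGroup).IsNil()`. An interface that is nil itself yields a zero `reflect.Value`, and calling `IsNil` on that value panics, whereas an interface wrapping a nil pointer compares unequal to nil and is only detected by the reflect call.

```go
package main

import (
	"fmt"
	"reflect"
)

// NodeGroup is a hypothetical stand-in for a cloud provider node group interface.
type NodeGroup interface {
	Id() string
}

// fakeGroup is a hypothetical pointer-receiver implementation used for illustration.
type fakeGroup struct{ id string }

func (g *fakeGroup) Id() string { return g.id }

// isNilGroup reports whether ng is an untyped nil interface or wraps a nil pointer.
// The first operand short-circuits, so reflect is never asked about a zero Value,
// which would panic with "call of reflect.Value.IsNil on zero Value".
func isNilGroup(ng NodeGroup) bool {
	return ng == nil || reflect.ValueOf(ng).IsNil()
}

func main() {
	var untyped NodeGroup          // nil interface: no type, no value
	var typedNil *fakeGroup        // nil pointer of a concrete type
	var wrapped NodeGroup = typedNil // non-nil interface holding a nil pointer

	fmt.Println(isNilGroup(untyped))                  // true
	fmt.Println(isNilGroup(wrapped))                  // true
	fmt.Println(isNilGroup(&fakeGroup{id: "pool-1"})) // false
}
```

The sketch assumes every implementation is a pointer type; `IsNil` also panics for kinds that cannot be nil, such as a plain struct value stored in the interface.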