autoscaler

Commit Graph

Author	SHA1	Message	Date
Kubernetes Prow Robot	791f0d8355	Merge pull request #2281 from DataDog/JulienBalestra/mig-block cluster-autoscaler: blocked if an instance is detached from MIG	2019-09-11 05:03:22 -07:00
Julien Balestra	3441f616e1	cluster-autoscaler/skip-node: unblock cluster autoscaler when having a single nodegroup for node error Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>	2019-09-11 13:40:23 +02:00
Krzysztof Jastrzebski	839cdaaa09	Stop disabling Cluster Autoscaler when there are no ready nodes.	2019-09-06 14:45:34 +02:00
Julien Balestra	6d707a08ac	cluster-autoscaler/metrics: expose the scale down cooldown Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>	2019-08-27 18:12:33 +02:00
Kubernetes Prow Robot	9aac43e237	Merge pull request #2235 from piontec/fix/aws_spots_squashed correctly handle lack of capacity of AWS spot ASGs	2019-08-19 04:27:30 -07:00
Kubernetes Prow Robot	3f0a5fa3c2	Merge pull request #2233 from vivekbagade/surge Adding ScaleDownNodeProcessor	2019-08-19 03:59:32 -07:00
Łukasz Piątkowski	8d9b81caaa	correctly handle lack of capacity of AWS spot ASGs	2019-08-19 12:43:53 +02:00
Vivek Bagade	dc64d0aab2	Adding ScaleDownNodeProcessor	2019-08-12 20:19:55 +02:00
Jakub Tużnik	44ae89dd09	Communicate the result of RemoveUnneededNodeGroups to ScaleDownStatusProcessor	2019-08-12 17:03:51 +02:00
t-qini	f7c563ab06	Modify the code as the simple solution proposed by MaciekPytel.	2019-07-18 23:58:05 +08:00
t-qini	622a838c2c	Modify nodal similarity rules.	2019-07-09 16:04:40 +08:00
Pengfei Ni	d45fee06da	Ensure upcoming nodes are different	2019-07-02 16:52:19 +08:00
silenceper	478660a6bb	fix error	2019-06-28 18:49:58 +08:00
Vivek Bagade	90aa28a077	Move pod packing in upcoming nodes to RunOnce from Estimator for performance improvements	2019-06-19 14:48:47 +02:00
Łukasz Osipiuk	0bcf5315a7	Do not fail loop iteration if unregistered nodes cannot be removed The mechanism of unregistered nodes removal is not the first responsibility of Cluster Autoscaler. We do not want to render CA unusable (disable scale-up and scale-down) if removing unregistered nodes cannot be done for a prolonged period of time.	2019-06-10 13:45:54 +02:00
Jakub Tużnik	bb382f47f9	Retain information about scale-up failures in CSR This will provide the AutoscalingStatusProcessor with information about failed scale-ups.	2019-06-05 16:53:30 +02:00
Krzysztof Jastrzebski	4831d76288	Cache cloud provider node instances in cluster state.	2019-05-31 10:11:51 +02:00
Kubernetes Prow Robot	cb4e60f8d4	Merge pull request #2031 from krzysztof-jastrzebski/master Add functionality which delays node deletion to let other components prepare for deletion.	2019-05-20 00:57:13 -07:00
Kubernetes Prow Robot	8d2ec08b2c	Merge pull request #2015 from losipiuk/lo/pass-via-context Add methods for passing arbitrary object via autoscaling context	2019-05-17 08:12:07 -07:00
Łukasz Osipiuk	e76558c65f	Add methods for passing arbitrary object via autoscaling context Change-Id: I066e58010a0aef4989bfc1f73b90bc69c773b26e	2019-05-17 16:38:12 +02:00
Krzysztof Jastrzebski	4247c8b032	Implement functionality which delays node deletion when node has annotation with prefix 'delay-deletion.cluster-autoscaler.kubernetes.io/'.	2019-05-17 16:06:17 +02:00
Chris Bradfield	92ea680f1a	Implement an --ignore-taint flag This change adds support for a user to specify taints to ignore when considering a node as a template for a node group.	2019-05-14 10:22:59 -07:00
Łukasz Osipiuk	c9811e87b4	Include pods with NominatedNodeName in scheduledPods list used for scale-up considerations Change-Id: Ie4c095b30bf0cd1f160f1ac4b8c1fcb8c0524096	2019-04-15 16:59:13 +02:00
Łukasz Osipiuk	db4c6f1133	Migrate filter out schedulable to PodListProcessor	2019-04-15 16:59:13 +02:00
Łukasz Osipiuk	5c09c50774	Pass ready nodes list to PodListProcessor	2019-04-15 16:59:13 +02:00
Łukasz Osipiuk	c6115b826e	Define ProcessorCallbacks interface	2019-04-15 16:59:13 +02:00
Jiaxin Shan	90666881d3	Move GPULabel and GPUTypes to cloud provider	2019-03-25 13:03:01 -07:00
Łukasz Osipiuk	2474dc2fd5	Call CloudProvider.Refresh before getNodeInfosForGroups We need to call refresh before getNodeInfosForGroups. If we have stale state getNodeInfosForGroups may fail and we will end up in infinite crash looping.	2019-03-12 12:07:49 +01:00
Aleksandra Malinowska	62a28f3005	Soft taint when there are no candidates	2019-03-11 14:05:09 +01:00
Uday Ruddarraju	91b7bc08a1	Fixing minor error handling bug in static autoscaler	2019-03-07 15:16:27 -08:00
Aleksandra Malinowska	a824e87957	Only soft taint nodes if there's no scale down to do	2019-02-25 17:11:15 +01:00
Pengfei Ni	128729bae9	Move schedulercache to package nodeinfo	2019-02-21 12:41:08 +08:00
Jacek Kaniuk	d969baff22	Cache exemplar ready node for each node group	2019-02-11 17:40:58 +01:00
Vivek Bagade	c6b87841ce	Added a new method that uses pod packing to filter schedulable pods filterOutSchedulableByPacking is an alternative to the older filterOutSchedulable. filterOutSchedulableByPacking sorts pods in unschedulableCandidates by priority and filters out pods that can be scheduled on free capacity on existing nodes. It uses a basic packing approach to do this. Pods with nominatedNodeName set are always filtered out. filterOutSchedulableByPacking is set to be used by default, but this can be toggled off by setting filter-out-schedulable-pods-uses-packing flag to false, which would then activate the older and more lenient filterOutSchedulable (now called filterOutSchedulableSimple). Added test cases for both methods.	2019-01-25 16:09:51 +05:30
Vivek Bagade	79ef3a6940	unexporting methods in utils.go	2019-01-25 00:06:03 +05:30
Jacek Kaniuk	0c64e0932a	Tainting unneeded nodes as PreferNoSchedule	2019-01-21 13:06:50 +01:00
Łukasz Osipiuk	85a83b62bd	Pass nodeGroup->NodeInfo map to ClusterStateRegistry Change-Id: Ie2a51694b5731b39c8a4135355a3b4c832c26801	2019-01-08 15:52:00 +01:00
Kubernetes Prow Robot	ab7f1e69be	Merge pull request #1464 from losipiuk/lo/stockouts2 Better quota-exceeded/stockout handling	2018-12-31 05:28:08 -08:00
Łukasz Osipiuk	2fbae197f4	Handle possible stockout/quota scale-up errors	2018-12-28 17:17:07 +01:00
Maciej Pytel	60babe7158	Use kubernetes lister for daemonset instead of custom one Also migrate to using apps/v1.DaemonSet instead of old extensions/v1beta1.	2018-12-28 13:55:41 +01:00
Thomas Hartland	d0dd00c602	Fix logged error in static autoscaler	2018-12-04 16:59:57 +01:00
Łukasz Osipiuk	016bf7fc2c	Use k8s.io/klog instead github.com/golang/glog	2018-11-26 17:30:31 +01:00
Łukasz Osipiuk	5962354c81	Inject Backoff instance to ClusterStateRegistry on creation	2018-11-13 14:25:16 +01:00
Aleksandra Malinowska	bf6ff4be8e	Clean up estimators	2018-11-06 14:15:42 +01:00
Jakub Tużnik	8179e4e716	Refactor the scale-(up|down) status processors so that they have more info available Replace the simple boolean ScaledUp property of ScaleUpStatus with a more comprehensive ScaleUpResult. Add more possible values to ScaleDownResult. Refactor the processors execution so that they are always executed every iteration, even if RunOnce exits earlier.	2018-09-20 17:12:02 +02:00
Steve Scaffidi	88d857222d	Renamed one more variable for consistency Change-Id: Idf42fd58089a1e75f3291ab7cc583735c68735f2	2018-09-17 14:08:10 -04:00
Steve Scaffidi	56b5456269	Fixing nits: renamed newPodScaleUpBuffer -> newPodScaleUpDelay, deleted redundant comment Change-Id: I7969194d8e07e2fb34029d0d7990341c891d0623	2018-09-17 10:38:28 -04:00
Steve Scaffidi	33b93cbc5f	Add configurable delay for pod age before considering for scale-up - This is intended to address the issue described in https://github.com/kubernetes/autoscaler/issues/923 - the delay is configurable via a CLI option - in production (on AWS) we set this to a value of 2m - the delay could possibly be set as low as 30s and still be effective depending on your workload and environment - the default of 0 for the CLI option results in no change to the CA's behavior from defaults. Change-Id: I7e3f36bb48641faaf8a392cca01a12b07fb0ee35	2018-09-14 13:55:09 -04:00
Jakub Tużnik	71111da20c	Add a scale down status processor, refactor so that there's more scale down info available to it	2018-09-12 14:52:20 +02:00
Jakub Tużnik	054f0b3b90	Add AutoscalingStatusProcessor	2018-08-07 14:47:06 +02:00

147 Commits