autoscaler

Commit Graph

Author	SHA1	Message	Date
Piotr Betkier	ac1c7b5463	use k8s.io/component-helpers/resource for pod request calculations	2025-04-22 17:36:17 +02:00
Jayant Jain	76b20e430f	Revert "Fix nil pointer exception for case when node is nil while processing …"	2023-08-04 13:29:58 +02:00
Jayant Jain	e777d7962c	Fix nil pointer exception for case when node is nil while processing gpuInfo	2023-08-01 12:00:10 +02:00
Hakan Bostan	2ea2fb66f6	Add "resource_name" to scaled_up_gpu_nodes_total and scaled_down_gpu_nodes_total metrics * Added the new resource_name field to scaled_up/down_gpu_nodes_total, representing the resource name for the gpu. * Changed metrics registrations to use GpuConfig	2023-02-22 10:09:45 +00:00
Hakan Bostan	1f646e4095	Add GetNodeGpuConfig to cloud provider * Added GetNodeGpuConfig to cloud provider which returns a GpuConfig struct containing the gpu label, type and resource name if the node has a GPU. * Added initial implementation of the GetNodeGpuConfig to all cloud providers.	2023-02-14 14:08:29 +00:00
Flavian	f1b6d4ded6	handle directx nodes the same as gpu nodes	2022-09-23 09:55:14 +02:00
Bartłomiej Wróblewski	1698e0e583	Separate and refactor custom resources logic	2021-04-07 10:31:11 +00:00
Maciek Pytel	655b4081f4	Migrate to klog v2	2020-06-05 17:22:26 +02:00
Julien Balestra	af270b05f6	cluster-autoscaler/taints: ignore taints on existing nodes Signed-off-by: Julien Balestra <julien.balestra@datadoghq.com>	2020-02-25 13:55:17 +01:00
Jiaxin Shan	90666881d3	Move GPULabel and GPUTypes to cloud provider	2019-03-25 13:03:01 -07:00
Łukasz Osipiuk	016bf7fc2c	Use k8s.io/klog instead github.com/golang/glog	2018-11-26 17:30:31 +01:00
Łukasz Osipiuk	52aaac362f	Remove GetGpuRequests function	2018-09-05 11:58:46 +02:00
Aleksandra Malinowska	800ee56b34	Refactor and extend GPU metrics error types	2018-07-05 13:13:11 +02:00
Karol Gołąb	553db2c9fc	Separated errors	2018-07-05 11:30:12 +02:00
Karol Gołąb	aae4d1270a	Make GetGpuTypeForMetrics more robust	2018-06-26 21:35:16 +02:00
Karol Gołąb	5eb7021f82	Add GPU-related scaled_up & scaled_down metrics (#974) * Add GPU-related scaled_up & scaled_down metrics * Fix name to match SD naming convention * Fix import after master rebase * Change the logic to include GPU-being-installed nodes	2018-06-22 21:00:52 +02:00
Łukasz Osipiuk	57ea19599e	Explicitly return AutoscalerError from GetNodeTargetGpus	2018-06-14 15:46:58 +02:00
Łukasz Osipiuk	087a5cc9a9	Respect GPU limits in scale_down	2018-06-13 14:19:59 +02:00
Karol Gołąb	bada827839	Simplify the code by removing superfluous variable	2018-05-18 09:38:47 +02:00
Karol Gołąb	f877f5a64e	Remove unused error handling	2018-05-10 12:15:42 +02:00
Maciej Pytel	abbc45da2e	Delay scale-up including GPU request Nodes with GPU are expensive and it's likely a bunch of pods using them will be created in a batch. In this case we can wait a bit for all pods to be created to make more efficient scale-up decision.	2018-03-02 15:55:04 +01:00
Maciej Pytel	d876d74912	Ignore unfitness in price expander if using GPU	2018-03-02 15:50:43 +01:00
Maciej Pytel	b7f8622eb2	Create node groups with GPU in scale-up.go This is still not implemented in cloudprovider. Extended NewNodeGroup interface to have a way of passing parameters for more complex resources.	2017-12-11 13:12:22 +01:00
Maciej Pytel	6554919700	Helper function to calculate GPU requests for NAP	2017-12-11 13:12:22 +01:00
Marcin Wielgus	f8c0e20ad9	Source fix after godep update	2017-11-28 14:01:43 +01:00
Maciej Pytel	d81dca5991	Mark nodes with uninitialized GPUs as unready	2017-11-10 17:56:10 +01:00
Beata Skiba	2b28ac1a04	Add a workaround for scaling of VMs with GPUs When a machine with GPU becomes ready it can take up to 15 minutes before it reports that GPU is allocatable. This can cause Cluster Autoscaler to trigger a second unnecessary scale up. The workaround sets allocatable to capacity for GPU so that a node that waits for GPUs to become ready to use will be considered as a place where pods requesting GPUs can be scheduled.	2017-11-06 16:04:22 +01:00

27 Commits