autoscaler

Author	SHA1	Message	Date
t-qini	89a09ccf00	Refactor the corresponding code.	2019-07-22 08:58:51 +08:00
t-qini	f7c563ab06	Modify the code as the simple solution proposed by MaciekPytel.	2019-07-18 23:58:05 +08:00
t-qini	622a838c2c	Modify nodal similarity rules.	2019-07-09 16:04:40 +08:00
Vivek Bagade	90aa28a077	Move pod packing in upcoming nodes to RunOnce from Estimator for performance improvements	2019-06-19 14:48:47 +02:00
Krzysztof Jastrzebski	4831d76288	Cache cloud provider node instances in cluster state.	2019-05-31 10:11:51 +02:00
Krzysztof Jastrzebski	4247c8b032	Implement functionality which delays node deletion when node has annotation with prefix 'delay-deletion.cluster-autoscaler.kubernetes.io/'.	2019-05-17 16:06:17 +02:00
Kubernetes Prow Robot	c756ed3953	Merge pull request #1963 from cjbradfield/ignore-taints add --ignore-taint flag and ignore taints added by TaintNodesByCondition	2019-05-15 02:18:21 -07:00
Chris Bradfield	92ea680f1a	Implement an --ignore-taint flag This change adds support for a user to specify taints to ignore when considering a node as a template for a node group.	2019-05-14 10:22:59 -07:00
Thomas Hartland	80aa40bda7	Move CA version to own package	2019-05-06 11:30:08 +02:00
Łukasz Osipiuk	db4c6f1133	Migrate filter out schedulabe to PodListProcessor	2019-04-15 16:59:13 +02:00
Jiaxin Shan	83ae66cebc	Consider GPU utilization in scaling down	2019-04-04 01:12:51 -07:00
Aleksandra Malinowska	600ba8ad10	Fix default scale down delay after delete	2019-04-02 12:47:10 +02:00
Łukasz Osipiuk	34a4262ad8	Remove GKE specific node group comparator Change-Id: I33131fec9b7972780cffde605a087cd2ad002752	2019-03-11 17:49:59 +01:00
Kubernetes Prow Robot	8944afd901	Merge pull request #1720 from aleksandra-malinowska/events-client Use separate client for events	2019-02-26 12:00:19 -08:00
Aleksandra Malinowska	f304722a1f	Use separate client for events	2019-02-25 13:58:54 +01:00
Pengfei Ni	2546d0d97c	Move leaderelection options to new packages	2019-02-21 13:45:46 +08:00
Pengfei Ni	4f7600911f	Update flag package to k8s.io/component-base/cli/flag	2019-02-21 11:45:33 +08:00
Jacek Kaniuk	f054c53c46	Account for kernel reserved memory in capacity calculations	2019-02-08 17:04:07 +01:00
Marcin Wielgus	99f1dcf9d2	Merge branch 'master' into crc-fix-error-format	2019-02-01 17:22:57 +01:00
Vivek Bagade	c6b87841ce	Added a new method that uses pod packing to filter schedulable pods filterOutSchedulableByPacking is an alternative to the older filterOutSchedulable. filterOutSchedulableByPacking sorts pods in unschedulableCandidates by priority and filters out pods that can be scheduled on free capacity on existing nodes. It uses a basic packing approach to do this. Pods with nominatedNodeName set are always filtered out. filterOutSchedulableByPacking is set to be used by default, but, this can be toggled off by setting filter-out-schedulable-pods-uses-packing flag to false, which would then activate the older and more lenient filterOutSchedulable(now called filterOutSchedulableSimple). Added test cases for both methods.	2019-01-25 16:09:51 +05:30
Jacek Kaniuk	0c64e0932a	Tainting unneeded nodes as PreferNoSchedule	2019-01-21 13:06:50 +01:00
CodeLingo Bot	c0603afdeb	Fix error format strings according to best practices from CodeReviewComments Fix error format strings according to best practices from CodeReviewComments Fix error format strings according to best practices from CodeReviewComments Reverted incorrect change to with error format string Signed-off-by: CodeLingo Bot <hello@codelingo.io> Signed-off-by: CodeLingoBot <hello@codelingo.io> Signed-off-by: CodeLingo Bot <hello@codelingo.io> Signed-off-by: CodeLingo Bot <bot@codelingo.io> Resolve conflict Signed-off-by: CodeLingo Bot <hello@codelingo.io> Signed-off-by: CodeLingoBot <hello@codelingo.io> Signed-off-by: CodeLingo Bot <hello@codelingo.io> Signed-off-by: CodeLingo Bot <bot@codelingo.io> Fix error strings in testscases to remedy failing tests Signed-off-by: CodeLingo Bot <bot@codelingo.io> Fix more error strings to remedy failing tests Signed-off-by: CodeLingo Bot <bot@codelingo.io>	2019-01-11 09:10:31 +13:00
Łukasz Osipiuk	d53928a11d	Initialize klog	2018-11-26 20:21:23 +01:00
Łukasz Osipiuk	016bf7fc2c	Use k8s.io/klog instead github.com/golang/glog	2018-11-26 17:30:31 +01:00
Alex Price	4ae7acbacc	add flags to ignore daemonsets and mirror pods when calculating resource utilization of a node Adds the flag --ignore-daemonsets-utilization and --ignore-mirror-pods-utilization (defaults to false) and when enabled, factors DaemonSet and mirror pods out when calculating the resource utilization of a node.	2018-11-23 15:24:25 +11:00
SataQiu	a110adf4fb	fix typo: posistive -> positive	2018-11-15 15:48:08 +08:00
Aleksandra Malinowska	bf6ff4be8e	Clean up estimators	2018-11-06 14:15:42 +01:00
Maciej Pytel	01a56a8d73	Add GKE-specific NodeGroupSet processor Also refactor Balancing processor a bit to make it easily extensible.	2018-10-25 18:50:17 +02:00
Steve Scaffidi	56b5456269	Fixing nits: renamed newPodScaleUpBuffer -> newPodScaleUpDelay, deleted redundant comment Change-Id: I7969194d8e07e2fb34029d0d7990341c891d0623	2018-09-17 10:38:28 -04:00
Steve Scaffidi	33b93cbc5f	Add configurable delay for pod age before considering for scale-up - This is intended to address the issue described in https://github.com/kubernetes/autoscaler/issues/923 - the delay is configurable via a CLI option - in production (on AWS) we set this to a value of 2m - the delay could possibly be set as low as 30s and still be effective depending on your workload and environment - the default of 0 for the CLI option results in no change to the CA's behavior from defaults. Change-Id: I7e3f36bb48641faaf8a392cca01a12b07fb0ee35	2018-09-14 13:55:09 -04:00
Łukasz Osipiuk	01a2e4d3cf	Update leader election configuration after godeps update	2018-09-05 16:54:15 +02:00
Aleksandra Malinowska	90e8a7a2d9	Move initializing defaults out of main	2018-08-02 14:04:03 +02:00
Aleksandra Malinowska	3b1b731c91	Move constructing cloud provider dynamic config structs into cloud provider builder	2018-07-25 13:43:47 +02:00
Aleksandra Malinowska	07e52e6c79	Move creating cloud provider out of context	2018-07-25 13:43:47 +02:00
Aleksandra Malinowska	0976d2aa07	Move autoscaling options out of static	2018-07-25 10:52:37 +02:00
Aleksandra Malinowska	6b94d7172d	Move AutoscalingOptions to config/static	2018-07-23 15:52:27 +02:00
Sheldon Kwok	20293c2365	Bump kubernetes.sync and fix main.go with new k8 Godeps	2018-07-17 02:54:35 -07:00
Aleksandra Malinowska	82fa2df52f	Lower default expendable pod priority cutoff to -10	2018-07-04 13:45:32 +02:00
Nic Doye	ebadbda2b2	issues/933 Consider making UnremovableNodeRecheckTimeout configurable	2018-06-18 11:54:14 +01:00
Łukasz Osipiuk	087a5cc9a9	Respect GPU limits in scale_down	2018-06-13 14:19:59 +02:00
MaciekPytel	705eeb0a7b	Merge pull request #934 from losipiuk/lukaszos/cleanup-how-mult-string-flags-are-handled-in-main-1fdd5 Cleanup how multi-string flags are handled in main()	2018-06-08 14:53:12 +02:00
Łukasz Osipiuk	53fc344eca	Cleanup how multi-string flags are handled in main()	2018-06-08 13:36:52 +02:00
Aleksandra Malinowska	3ccfa5be23	Move universal constants to separate module	2018-05-17 18:36:43 +02:00
Aleksandra Malinowska	fcc3d004f5	Use bytes instead of MB for memory limits	2018-05-17 17:35:39 +02:00
Aleksandra Malinowska	820f688d2a	Update max unready nodes to 45%	2018-05-17 12:51:45 +02:00
Beata Skiba	f3a242cc8a	Small refactor of main.go	2018-05-15 12:39:33 +02:00
Karol Gołąb	74b540fdab	Remove DynamicAutoscaler since it's unused (#851 ) * Remove DynamicAutoscaler since it's unused * Remove configmap flag with its unused-elsewhere dependecies * gofmt	2018-05-14 20:22:42 +02:00
Karol Gołąb	854fcc1ff8	Remove implementation details (CleanUp) from the interface. The CleanUp method is instead called directly from the implementation, when required. Test updated in a quick way since the mock we're using does not support AtLeast(1) - thus Times(2).	2018-05-07 15:24:14 +02:00
Krzysztof Jastrzebski	88b769b324	Refactor cluster autoscaler builder and add pod list processor.	2018-04-26 12:37:51 +02:00
Aleksandra Malinowska	f98e953eb4	Add regional flag	2018-03-12 14:15:56 +01:00
yank1	ee3f3881b9	fix typo in main file	2018-02-07 00:27:10 +08:00
Marcin Wielgus	88d97c2254	Merge pull request #462 from negz/gcedisco Support autodetection of GCE managed instance groups by name prefix	2017-12-18 21:08:22 +01:00
Aleksandra Malinowska	312f989c15	Don't register metrics unless on leading master	2017-12-14 16:08:20 +01:00
Nic Cope	e96ff07896	Replace the Polling Autoscaler Node group discovery is now handled by cloudprovider.Refresh() in all cases. Additionally, explicit node groups can now be used alongside autodiscovery.	2017-12-11 13:09:56 -08:00
Nic Cope	6a704a6cf4	Break down cloud provider builder by provider The Build method was getting pretty big, this hopefully makes it a little more readable. It also fixes a few minor error shadowing bugs.	2017-12-11 13:09:56 -08:00
Nic Cope	982f9e41a3	Support autodetection of GCE managed instance groups by name prefix This commit adds a new usage of the --node-group-auto-discovery flag intended for use with the GCE cloud provider. GCE instance groups can be automatically discovered based on a prefix of their group name. Example usage: --node-group-auto-discovery=mig:prefix=k8s-mig,minNodes=0,maxNodes=10 Note that unlike the existing AWS ASG autodetection functionality we must specify the min and max nodes in the flag. This is because MIGs store only a target size in the GCE API - they do not have a min and max size we can infer via the API. In order to alleviate this limitation a little we allow multiple uses of the autodiscovery flag. For example to discover two classes (big and small) of instance groups with different size limits: ./cluster-autoscaler \ --node-group-auto-discovery=mig:prefix=k8s-a-small,minNodes=1,maxNodes=10 \ --node-group-auto-discovery=mig:prefix=k8s-a-big,minNodes=1,maxNodes=100 Zonal clusters (i.e. multizone = false in the cloud config) will detect all managed instance groups within the cluster's zone. Regional clusters will detect all matching (zonal) managed instance groups within any of that region's zones.	2017-12-11 13:09:56 -08:00
Pengfei Ni	8f7d35b4e0	Enable azure options for autoscaler	2017-11-16 21:31:49 +08:00
Marcin Wielgus	2589c43a61	Merge pull request #469 from aleksandra-malinowska/single-unregistered-flag Remove --unregistered-node-removal-time flag	2017-11-16 13:07:52 +01:00
Aleksandra Malinowska	2ff962e53e	Remove --unregistered-node-removal-time flag	2017-11-15 11:11:30 +01:00
Aleksandra Malinowska	11a7d9f137	Fix typos in FAQ	2017-11-14 14:46:09 +01:00
Marcin Wielgus	439fd3c9ec	Merge pull request #411 from krzysztof-jastrzebski/priority Adds priority preemption support to cluster autoscaler.	2017-11-08 09:09:26 +01:00
Maciej Pytel	c376ef3c87	Add metrics for autoprovisioning	2017-10-31 17:42:58 +01:00
Krzysztof Jastrzebski	d9c00e5ce1	Adds priority preemption support to cluster autoscaler.	2017-10-23 09:54:56 +02:00
Maciej Pytel	9ded6f9c9e	Rename clusterName flag to cluster-name for consistency	2017-10-16 14:11:27 +02:00
Matt Terry	63310ef41a	Introduce new flags to control scale down behavior: scale-down-delay-after-delete and scale-down-delay-after-failure, replacing scale-down-trial-interval. scale-down-delay-after-add replaces scale-down-delay	2017-09-18 17:09:44 -07:00
Aleksandra Malinowska	197b05b180	respect minimum cores/memory limit during scale down	2017-09-13 10:10:47 +02:00
Aleksandra Malinowska	d43029c180	implement blocking scale up beyond max cores & memory	2017-09-08 12:50:00 +02:00
Marcin Wielgus	f9cabf3a1a	Merge pull request #297 from bskiba/additional-k Only consider up to 10% of the nodes as additional candidates for scale down	2017-09-07 04:34:23 +05:30
Sergey Lanzman	415f53cdea	Change from deprecated Core to CoreV1 for kube client	2017-09-04 22:16:21 +03:00
Beata Skiba	a6c18b87d2	Only consider up to 10% of the nodes as additional candidates for scale down.	2017-09-04 17:37:02 +02:00
Clayton Coleman	f411e38bb8	Support resource-lock type configmap for leader election The lock type parameter was being ignored. Use the new factory method to instantiate the lock type.	2017-09-03 18:14:46 -04:00
Marcin Wielgus	de524a6688	Limit autoprovisioned groups to 15	2017-09-01 18:25:28 +02:00
Marcin Wielgus	c0b48e4a15	Merge pull request #285 from mwielgus/loglevel Set verbosity for each of the glog.Info logs	2017-09-01 16:42:11 +05:30
Marcin Wielgus	2d8f59e23d	Set verbosity for each of the glog.Info logs	2017-09-01 12:34:29 +02:00
Beata Skiba	576e4105db	Make ScaleDownNonEmptyCandidatesCount a flag.	2017-08-31 15:05:06 +02:00
Marcin Wielgus	19507aa0de	Node autoprovisioning flag	2017-08-31 00:48:54 +02:00
Mark Janssen	f53fb8b6ed	Minor fixes	2017-08-29 23:11:35 +02:00
Marcin Wielgus	81e9226d17	Merge pull request #267 from mwielgus/gke-cp-1 Add GKE mode to GCE cloud provider	2017-08-29 18:26:07 +05:30
Marcin Wielgus	3d55a669ce	Merge pull request #268 from drinktee/master add kubeconfig flag to create kube-client	2017-08-29 16:14:36 +05:30
chenguoyan01	403cd8a11e	add kubeconfig flag to create kube-client	2017-08-29 15:41:32 +08:00
Marcin Wielgus	51a5ad58c0	GKE NodePool support for NAP - get NP/Migs via api - part 1	2017-08-28 20:50:02 +02:00
Marcin Wielgus	718e5db78e	Run node drain/delete in a separate goroutine	2017-08-28 12:12:31 +02:00
Zach Gardner	8c23346c72	Update main.go Fix a typo (`waints` -> `waits`	2017-08-24 05:19:24 -07:00
Beata Skiba	2ae609b93a	Merge pull request #237 from bskiba/split_scale_down Drill down scale down metrics	2017-08-22 16:41:55 +02:00
Beata Skiba	43c9b6b06b	Add cleaner function labels for metrics exporting.	2017-08-22 16:09:42 +02:00
Beata Skiba	596b165808	Cloud Provider Interface for Kubemark This allows to run Custer Autoscaler on Kubemark. See autoscaler/cluster-autoscaler/proposals/kubemark_integration.md for more details.	2017-08-22 15:19:10 +02:00
Beata Skiba	14df1b808b	Drill down scale down metrics Split scale down duration into three parts: 1. Find nodes to remove 2. Node deletion 3. Misc operations	2017-08-18 14:17:02 +02:00
Marcin Wielgus	6df186aeac	Remove Azure support	2017-08-17 22:36:31 +02:00
Maciej Pytel	95b5b4be94	Remove --verify-unschedulabe-pods flag This flag was true in default setups for every platform, we haven't heard about any user changing it to false and after removing check on PodScheduled condition setting it to false would basically break CA.	2017-08-16 17:31:59 +02:00
Marcin Wielgus	f8541bdb6d	Unexport leader election functions	2017-08-11 18:13:26 +02:00
Marcin Wielgus	9116e4c08c	Compilation fix for CA after godeps update	2017-08-11 17:56:47 +02:00
Ivan Towlson	902d2414b7	Fixed typoes of name 'Kubernetes'	2017-08-03 14:20:23 +12:00
Yusuke Kuoka	3e8cc02243	cluster-autoscaler: Fix node group auto discovery for AWS not to mix up ASGs from different k8s clusters	2017-06-22 15:59:53 +09:00
Marcin Wielgus	63e679a74f	Merge pull request #120 from MaciekPytel/fix_graceful_flag Fix typos related to max-graceful-termination-sec	2017-06-14 14:42:35 +02:00
Maciej Pytel	767367c866	Fix typos related to max-graceful-termination-sec	2017-06-14 14:14:21 +02:00
Maciej Pytel	fe514ed75d	Make status configmap respect namespace parameter	2017-06-14 14:07:13 +02:00
Marcin Wielgus	1bedee5707	Update GODEPS	2017-06-13 14:48:24 +02:00
Marcin Wielgus	e2e171b7b7	Enable pricing in expander factory	2017-06-09 11:09:43 -07:00
Maciej Pytel	cd186f3ebc	Balance sizes of similar nodegroups in scale-up	2017-06-06 00:52:38 +02:00
Aleksandra Malinowska	972772440a	Add failing health check if autoscaler loop consistently returns error	2017-05-29 11:31:57 +02:00
Aleksandra Malinowska	7c94367099	Add health check	2017-05-25 11:37:44 +02:00
Maciej Pytel	f716a7e496	Add typed errors; add errors_total metric To keep reasonable commit size only top-level files use new errors. Will add them in other files in next commits.	2017-05-18 14:09:15 +02:00
Maciej Pytel	4cdf06ea94	Added CA metrics related to autoscaler execution	2017-05-11 14:51:04 +02:00
Yusuke Kuoka	e9c7cd0733	cluster-autoscaler: Re: AWS Autoscaler autodiscover ASG names and sizes This is an alternative implementation of https://github.com/kubernetes/contrib/pull/1982 Notable differences from the original PR are: * A new flag named `--node-group-auto-discovery` is introduced for opting in to enable the auto-discovery feature. * For example, specifying `--cloud-provider aws --node-group-auto-discovery asg:tag=k8s.io/cluster-autoscaler/enabled` instructs CA to auto-discover ASGs tagged with `k8s.io/cluster-autoscaler/enabled` to be used as target node groups * The new code path introduced by this PR is executed only when `node-group-auto-discovery` is specified. There is relatively less chance to break existing features by introducing this change Resolves https://github.com/kubernetes/contrib/issues/1956 --- Other notes: * We rely mainly on the `DescribeTags` API rather than `DescribeAutoScalingGroups` so that AWS can filter out unnecessary ASGs which doesn't belong to the k8s cluster, for us. * If we relied on `DescribeAutoScalingGroups` here, as it doesn't support `Filter`ing, we'd need to iterate over ALL the ASGs available in an AWS account, which isn't desirable due to unnecessary excessive API calls and network usages * Update cloudprovider/aws/README for the new configuration * Warn abount invalid combination of flags according to the review comment https://github.com/kubernetes/autoscaler/pull/11#discussion_r113713138 * Emit a validation error when both --nodes and --node-group-auto-discovery are specified according to the review comment https://github.com/kubernetes/autoscaler/pull/11#discussion_r113958080 TODO/Possible future improvements before recommending this to everyone: * Cache the result of an auto-discovery for a configurable period, so that we won't invoke DescribeTags and DescribeAutoScalingGroup APIs too many times	2017-05-10 08:36:02 +09:00
Maciej Pytel	e8440ee15e	Fix PVC informer issue	2017-04-24 14:12:27 +02:00
Marcin Wielgus	34eb4973f8	Fix imports in cluster autoscaler after migrating it from contrib	2017-04-18 15:42:04 +02:00
Maciej Pytel	c87d10f042	Cluster-Autoscaler: fix ignoring node groups config	2017-03-03 17:21:24 +01:00
Marcin Wielgus	1cd861227a	Cluster-autoscaler: precheck that the api server link is ok	2017-03-03 14:39:23 +01:00
Maciej Pytel	84f19c1e1e	Cluster-Autoscaler: add map to disable status configmap	2017-03-02 15:35:00 +01:00
Marcin Wielgus	2ffaddb7c0	Cluster-autoscaler: lint	2017-03-02 15:15:07 +01:00
Marcin Wielgus	72a47dc2b2	Cluster-autoscaler: update code for 1.6 k8s sync	2017-03-02 14:34:49 +01:00
Maciej Pytel	d0196c9e1b	Cluster-Autoscaler: Delete status configmap on exit	2017-02-28 17:19:23 +01:00
Yusuke Kuoka	baee799524	cluster-autoscaler: Dynamic Reconfiguration via ConfigMaps Adds a new optional flag named `configmap` to specify the name of a configmap containing node group specs. The configmap is polled every `scan-interval` seconds to reconfigure cluster-autoscaler dynamically at runtime. Example usage: ``` ./cluster-autoscaler --v=4 --cloud-provider=aws --skip-nodes-with-local-storage=false --logtostderr --leader-elect=false --configmap=cluster-autoscaler --logtostderr ``` The configmap would look like: ```yaml kind: ConfigMap apiVersion: v1 metadata: name: cluster-autoscaler namespace: kube-system data: settings: \|- { "nodeGroups": [ { "minSize": 1, "maxSize": 2, "name": "kubeawstest-nodepool1-AutoScaleWorker-1VWD4GAVG35L5" } ] } ``` Other notes: * Make namespace defaults to "kube-system" according to https://github.com/kubernetes/contrib/pull/2226#discussion_r94144267 * Trigger a full-recreate on a configuration change according to https://github.com/kubernetes/contrib/pull/2226#issuecomment-269617410 * Introduced `autoscaler/` and moved all the dynamic/recreatable-at-runtime parts of autoscaler into there (Update: the package is now named `core` according to https://github.com/kubernetes/contrib/pull/2226#issuecomment-273071663) * Extracted the core of CA(=`func Run()` in `main.go`) into `Autoscaler` * `DynamicAutoscaler` is a wrapper around `Autoscaler` which achieves reconfiguration of CA by recreating an `Autoscaler` instance on a configmap change. * Moved `scale_down.go`, `scale_up.go` and `utils.go` into the `autoscaler` package accordingly because they seemed to be meant to be collocated in the same package as the core of CA (which is now implemented as `Autoscaler`) Moved the `createEventRecorder` func from the `main` package to the `utils/kubernetes` package to make it importable from both `main` and `autoscaler`	2017-02-24 20:36:47 +09:00

313 Commits