kubernetes/kops - kops

Commit Graph

Author	SHA1	Message	Date
justinsb	f2d4eeb104	reconcile: wait for apiserver to respond before trying rolling-update The rolling-update requires the apiserver (when called without --cloudonly), so reconcile should wait for apiserver to start responding. Implement this by reusing "validate cluster", but filtering to only the instance groups and pods that we expect to be online.	2025-01-13 17:47:48 -05:00
justinsb	ebcfebe50e	chore: add context to rolling update functions Move it out of the struct, and into the function parameters. This is more go idiomatic.	2024-12-27 14:22:51 -05:00
justinsb	3646a610b1	refactor: Move GetCloudProvider to cluster This lets us use labels (or annotations), meaning we can experiment with different clouds without changing the API. We also add initial (experimental/undocumented) support for exposing a "Metal" provider.	2024-08-26 08:20:37 -04:00
justinsb	65fe6dc3c4	refactor: ApplyClusterCmd clearly returns results By having an explicit return value, we set ourselves up for better reuse.	2024-07-04 14:54:00 -04:00
Ciprian Hacman	28939f865b	azure: Implement DeleteInstance for rolling update	2024-04-07 16:02:22 +03:00
justinsb	2a9343a168	Generate revisions of NLB objects, and introduce cleanup phase This lets us safely make changes to otherwise immutable fields, in particular for adding security groups to NLBs created without them. We detect the older versions, and create deletion tasks to remove them. These tasks can be deferred, and we expect them to be deferred to a "prune" phase that runs after cluster apply. Co-authored-by: Ciprian Hacman <ciprian@hakman.dev>	2024-02-17 11:41:15 -05:00
Leïla MARABESE	c02fb479dc	reconcile instancegroup	2023-08-29 17:42:19 +02:00
Jack Andersen	89dfafefe7	Make struct members private, alter formatting, add unwrap method Signed-off-by: Jack Andersen <jandersen@plaid.com>	2022-12-21 09:30:19 -08:00
Jack Andersen	f5f71f17f9	Satisfy the Is interface with ValidationTimeoutError and change callers of err check Signed-off-by: Jack Andersen <jandersen@plaid.com>	2022-12-21 09:30:17 -08:00
Jack Andersen	2bd5403f37	Create a specific error type for validation timeouts and classify as exitable Signed-off-by: Jack Andersen <jandersen@plaid.com>	2022-12-21 09:30:16 -08:00
John Gardiner Myers	de9055b588	Update control-plane terminology in CLI output strings	2022-11-23 21:32:10 -08:00
John Gardiner Myers	d39ba74bd7	Change the control-plane IG role to "ControlPlane" in v1alpha3 API	2022-11-22 17:05:29 -08:00
Ole Markus With	a5b1722110	Ensure kOps doesn't surge on karpenter IGs	2022-10-17 15:22:39 +02:00
justinsb	4b2f773748	rolling-update: don't deregister our only apiserver If we do, we can't drain the node afterwards. We also are going to have dropped connections in this case anyway.	2022-09-15 09:16:57 -04:00
Ole Markus With	1ea5243406	Warm pool-enabled ASGs scaled to zero will no longer panic	2022-09-09 11:08:00 +02:00
Ole Markus With	c260cf69b3	Log errors from detachInstance	2022-06-27 19:58:16 +02:00
Ciprian Hacman	b5f14b589b	Add initial support for Hetzner Cloud	2022-05-09 06:12:15 +03:00
Ole Markus With	2ba9c1670f	Only delete node object on GCE	2022-03-06 07:34:52 +01:00
John Gardiner Myers	70f7d9bdb2	Use function to get cloud provider from cluster spec	2022-03-02 21:59:47 -08:00
Bronson Mirafuentes	86b0ef0d0c	add drain-timeout flag to rolling-update cluster	2022-01-20 14:05:55 -08:00
Jesse Haka	b88d110f58	Drain OpenStack loadbalancers	2021-12-31 13:16:02 +02:00
Ole Markus With	5e944f1a15	Do not try to detach karpenter nodes from ASGs	2021-12-15 09:56:33 +01:00
Ciprian Hacman	ea7df00719	Run hack/update-gofmt.sh	2021-12-01 22:39:50 +02:00
John Gardiner Myers	4396270d74	Fix out of bounds error when instance detach fails	2021-11-08 23:00:28 -08:00
John Gardiner Myers	d46ee9c883	Exclude nodes from load balancers upon cordoning	2021-04-20 17:58:26 -07:00
Ole Markus With	09615935fd	Make kOps CLI handle ASG warm pools	2021-04-15 11:10:23 +02:00
Ole Markus With	ab1b85818d	Pass ctx to drain helper In some rare cases, we hit an NPR because the k8s code tries to use the ctx we are not passing.	2021-03-26 10:29:11 +01:00
Markos Chandras	0a49650c70	aws: Graceful handling of EC2 detach errors Sometimes, we observe the following error during a rolling update: error detaching instance "i-XXXX", node "ip-10-X-X-X.ec2.internal": error detaching instance "i-XXXX": ValidationError: The instance i-XXXX is not part of Auto Scaling group XXXXX The sequence of events that leads to this problem is the following: - A new ASG object is being built from the launch template - Existing instances are being added to it - An existing instance is being ignored because it's already terminating W0205 08:01:32.593377 191 aws_cloud.go:791] ignoring instance as it is terminating: i-XXXX in autoscaling group: XXXX - Due to maxSurge, the terminating instance is trying to be detached from the autoscaling group and fails. As such, in case of EC2 ASG detach failures we can simply try to detach the next node instead of aborting the whole update operation.	2021-03-05 15:01:30 +02:00
Ole Markus With	5a2f1274fb	Don't try to detach masters	2020-11-28 09:44:42 +01:00
Kubernetes Prow Robot	0b5646e94a	Merge pull request #10266 from rifelpet/k8s120 Update k8s dependencies to 1.20.0-beta.2	2020-11-18 10:48:07 -08:00
Peter Rifel	47354ce010	Update kubectl drain fields for 1.20	2020-11-18 11:55:03 -06:00
Bharath Vedartham	208199ba85	instancegroups: Clear out the TODO comment Now that we are able to associate pod validation failures with the instance groups, we can remove the TODO comment.	2020-11-15 11:07:45 +05:30
Kubernetes Prow Robot	7b26ec4b6d	Merge pull request #10065 from bharath-123/feature/instancegroup-specific-validation Avoid waiting on validation during rolling update for inapplicable instance groups	2020-11-05 22:38:50 -08:00
zouyu	2e6b50f9e4	Some typos Signed-off-by: zouyu <zouy.fnst@cn.fujitsu.com>	2020-11-03 16:28:30 +08:00
Bharath Vedartham	7067f5f47a	instancegroups: Ignore validation errors in unrelated instance groups When unrelated instance groups produce validation errors, the instance group being updated produces a failure and is forced to wait for rolling update to continue. This can be avoided as failures in different node instance groups usually don't affect the instance group being affected in any way.	2020-10-31 19:17:24 +05:30
Srikanth Rao	4d251fe900	[Digital Ocean] Implement Delete Instance logic for rolling update (#10000) * Add delete Instance implementation for DO * Add warning for DeleteInstance usage * Use reconcile option for rolling update * Update pkg/instancegroups/instancegroups.go Co-authored-by: Ciprian Hacman <ciprianhacman@gmail.com> Co-authored-by: Ciprian Hacman <ciprianhacman@gmail.com>	2020-10-13 10:06:27 -07:00
Ole Markus With	aa66c4f6d8	Add rolling upgrade to openstack	2020-10-01 20:07:44 +02:00
Ole Markus With	63f13322d5	Don't pass ctx and cluster everywhere	2020-09-23 08:30:24 +02:00
Ole Markus With	0ec71686b9	Refactor cloudinstancegroupmember in a more independent cloud instance representation Apply suggestions from code review Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>	2020-08-30 21:37:03 +02:00
Ole Markus With	ff6c04938d	Add kops delete instance command Add support for deleting instance by k8s node name Add yes flag	2020-08-28 08:43:30 +02:00
Peter Rifel	4d9f0128a3	Upgrade to klog2 This splits up the kubernetes 1.19 PR to make it easier to keep up to date until we get it sorted out.	2020-08-16 20:56:48 -05:00
John Gardiner Myers	cc2b647d06	Create separate field for disabling rolling updates	2020-06-19 22:19:26 -07:00
ZouYu	2fc52ec6be	fix some go-lint warning Signed-off-by: ZouYu <zouy.fnst@cn.fujitsu.com>	2020-06-09 08:52:50 +08:00
John Gardiner Myers	091893fd20	Simplify rolling update internal methods	2020-05-29 10:52:03 -07:00
John Gardiner Myers	dd884a6a64	fix missing space Co-authored-by: Peter Rifel <rifelpet@users.noreply.github.com>	2020-05-29 10:35:15 -07:00
John Gardiner Myers	7756be7fbc	Try validating multiple times before updating instancegroup	2020-05-22 20:26:02 -07:00
John Gardiner Myers	df7e0b18b6	Ignore already-deleted nodes during rolling update	2020-04-26 21:41:54 -07:00
Justin Santa Barbara	ffb6cd61aa	Rolling-update validation harmonization This is a follow-on to #8868; I believe the intent of that was to expose the option to do more (or fewer) retries. We previously had a single retry to prevent flapping; this basically unifies the previous behaviour with the idea of making it configurable. * validate-count=0 effectively turns off validation. * validate-count=1 will do a single validation, without flapping detection. * validate-count>=2 will require N successful validations in a row, waiting ValidateSuccessDuration in between. A nice side-effect of this is that the tests now explicitly specify ValidateCount=1 instead of setting ValidateSuccessDuration=0, which had the side effect of doing the equivalent to ValidateCount=1.	2020-04-17 01:40:02 -04:00
Justin Santa Barbara	31bb16d4d1	Add context.Context to most signatures The client-go signature for most methods adds a context.Context object, and also makes Options mandatory. Feed through a context.Context through many of our methods (but use context.TODO to stop it getting totally out of hand!)	2020-04-11 14:44:17 -04:00
Jesse Haka	11eaacd53e	validationtimes -> validationcount	2020-04-08 13:55:29 +03:00
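
The validate-count semantics described in commit ffb6cd61aa can be sketched roughly as follows. This is a minimal illustration, not the kOps implementation: `waitForValidation`, `validateFunc`, and the `maxAttempts` bound are hypothetical names introduced here, and resetting the streak on a failed validation is an assumption about how "N successful validations in a row" is enforced.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// validateFunc is a hypothetical stand-in for one cluster validation pass.
type validateFunc func() error

// waitForValidation returns nil once validate has succeeded validateCount
// times in a row. maxAttempts is an assumption used to bound the loop in
// this sketch; a real implementation would more likely use a timeout.
func waitForValidation(validate validateFunc, validateCount, maxAttempts int, successWait time.Duration) error {
	if validateCount <= 0 {
		// validate-count=0 effectively turns off validation.
		return nil
	}
	successes := 0
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err := validate(); err != nil {
			// Assumed behaviour: a failure resets the streak, which is
			// what makes counts of 2 or more detect flapping.
			successes = 0
			continue
		}
		successes++
		if successes >= validateCount {
			return nil
		}
		// Wait between successful validations (ValidateSuccessDuration
		// in the commit message).
		time.Sleep(successWait)
	}
	return errors.New("cluster did not validate within the allowed attempts")
}

func main() {
	// A flapping cluster: fail, ok, fail, ok, ok.
	results := []bool{false, true, false, true, true}
	calls := 0
	flapping := func() error {
		ok := results[calls%len(results)]
		calls++
		if !ok {
			return errors.New("cluster not ready")
		}
		return nil
	}
	err := waitForValidation(flapping, 2, 10, 0)
	fmt.Println("error:", err, "validation calls:", calls)
}
```

With a count of 1 the first success is accepted immediately, so a flapping cluster can pass; with a count of 2 or more the streak has to survive the wait between validations.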


116 Commits
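
Commits 2bd5403f37, f5f71f17f9, and 89dfafefe7 introduce a dedicated validation-timeout error type, make it satisfy the `Is` interface, and add an unwrap method. The general Go pattern behind that series might look like the sketch below. The type, field, and function names are hypothetical and chosen for illustration only; they are not taken from the kOps source.

```go
package main

import (
	"errors"
	"fmt"
)

// validationTimeoutError is a hypothetical stand-in for a dedicated
// timeout error type with private fields.
type validationTimeoutError struct {
	operation string
	err       error
}

// newValidationTimeoutError wraps the underlying cause of a timeout.
func newValidationTimeoutError(operation string, err error) error {
	return &validationTimeoutError{operation: operation, err: err}
}

func (e *validationTimeoutError) Error() string {
	return fmt.Sprintf("%s timed out: %v", e.operation, e.err)
}

// Unwrap exposes the cause so errors.Is and errors.As can walk the chain.
func (e *validationTimeoutError) Unwrap() error { return e.err }

// Is reports a match for any target of the same type, so callers can
// classify the error without comparing field values.
func (e *validationTimeoutError) Is(target error) bool {
	_, ok := target.(*validationTimeoutError)
	return ok
}

// isValidationTimeout is the kind of check a caller could use to decide
// whether an error should end the run.
func isValidationTimeout(err error) bool {
	return errors.Is(err, &validationTimeoutError{})
}

// errNotReady is an example cause used only in this sketch.
var errNotReady = errors.New("node not ready")

func main() {
	err := fmt.Errorf("rolling update failed: %w",
		newValidationTimeoutError("validation", errNotReady))

	fmt.Println(isValidationTimeout(err))         // matches through the wrapper
	fmt.Println(errors.Is(err, errNotReady))      // the cause is reachable via Unwrap
	fmt.Println(isValidationTimeout(errNotReady)) // a plain error does not match
}
```

Matching on type in `Is` rather than on a sentinel value lets each timeout carry its own cause while still being classified uniformly by callers.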