Leïla MARABESE
c02fb479dc
reconcile instancegroup
2023-08-29 17:42:19 +02:00
Jack Andersen
89dfafefe7
Make struct members private, alter formatting, add unwrap method
...
Signed-off-by: Jack Andersen <jandersen@plaid.com>
2022-12-21 09:30:19 -08:00
Jack Andersen
f5f71f17f9
Satisfy the Is interface with ValidationTimeoutError and change callers of err check
...
Signed-off-by: Jack Andersen <jandersen@plaid.com>
2022-12-21 09:30:17 -08:00
Jack Andersen
2bd5403f37
Create a specific error type for validation timeouts and classify as exitable
...
Signed-off-by: Jack Andersen <jandersen@plaid.com>
2022-12-21 09:30:16 -08:00
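The three validation-timeout commits above describe a dedicated error type that satisfies the `Is` interface, keeps its members private, and exposes an unwrap method. A minimal sketch of that pattern follows; the type name, field names, and the `isExitable` helper are hypothetical and are not taken from the kOps source.

```go
package main

import (
	"errors"
	"fmt"
)

// errNodeNotReady is a hypothetical underlying cause used for illustration.
var errNodeNotReady = errors.New("node not ready")

// validationTimeoutError is a hypothetical error type; its name and fields
// are assumptions, not the actual kOps definitions.
type validationTimeoutError struct {
	operation string
	err       error
}

func (e *validationTimeoutError) Error() string {
	return fmt.Sprintf("%s did not validate in time: %v", e.operation, e.err)
}

// Unwrap exposes the underlying cause to errors.Is and errors.As.
func (e *validationTimeoutError) Unwrap() error { return e.err }

// Is matches any *validationTimeoutError target, so callers can test for
// the category of error without comparing private fields.
func (e *validationTimeoutError) Is(target error) bool {
	_, ok := target.(*validationTimeoutError)
	return ok
}

// isExitable sketches how a caller might decide that an error should stop
// the rolling update instead of being retried.
func isExitable(err error) bool {
	return errors.Is(err, &validationTimeoutError{})
}

func main() {
	timeout := &validationTimeoutError{operation: "instance group", err: errNodeNotReady}
	wrapped := fmt.Errorf("rolling update: %w", timeout)

	fmt.Println(isExitable(wrapped))                 // matched through the wrap chain
	fmt.Println(errors.Is(wrapped, errNodeNotReady)) // cause reachable via Unwrap
	fmt.Println(isExitable(errNodeNotReady))         // a plain error is not exitable
}
```

Implementing `Is` on the type lets callers write `errors.Is(err, &validationTimeoutError{})` even when the error has been wrapped with `%w`, which is likely why the callers of the error check changed in the same commit.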
John Gardiner Myers
de9055b588
Update control-plane terminology in CLI output strings
2022-11-23 21:32:10 -08:00
John Gardiner Myers
d39ba74bd7
Change the control-plane IG role to "ControlPlane" in v1alpha3 API
2022-11-22 17:05:29 -08:00
Ole Markus With
a5b1722110
Ensure kOps doesn't surge on karpenter IGs
2022-10-17 15:22:39 +02:00
justinsb
4b2f773748
rolling-update: don't deregister our only apiserver
...
If we do, we can't drain the node afterwards. We also are going to
have dropped connections in this case anyway.
2022-09-15 09:16:57 -04:00
Ole Markus With
1ea5243406
Warm pool-enabled ASGs scaled to zero will no longer panic
2022-09-09 11:08:00 +02:00
Ole Markus With
c260cf69b3
Log errors from detachInstance
2022-06-27 19:58:16 +02:00
Ciprian Hacman
b5f14b589b
Add initial support for Hetzner Cloud
2022-05-09 06:12:15 +03:00
Ole Markus With
2ba9c1670f
Only delete node object on GCE
2022-03-06 07:34:52 +01:00
John Gardiner Myers
70f7d9bdb2
Use function to get cloud provider from cluster spec
2022-03-02 21:59:47 -08:00
Bronson Mirafuentes
86b0ef0d0c
add drain-timeout flag to rolling-update cluster
2022-01-20 14:05:55 -08:00
Jesse Haka
b88d110f58
Drain OpenStack loadbalancers
2021-12-31 13:16:02 +02:00
Ole Markus With
5e944f1a15
Do not try to detach karpenter nodes from ASGs
2021-12-15 09:56:33 +01:00
Ciprian Hacman
ea7df00719
Run hack/update-gofmt.sh
2021-12-01 22:39:50 +02:00
John Gardiner Myers
4396270d74
Fix out of bounds error when instance detach fails
2021-11-08 23:00:28 -08:00
John Gardiner Myers
d46ee9c883
Exclude nodes from load balancers upon cordoning
2021-04-20 17:58:26 -07:00
Ole Markus With
09615935fd
Make kOps CLI handle ASG warm pools
2021-04-15 11:10:23 +02:00
Ole Markus With
ab1b85818d
Pass ctx to drain helper
...
In some rare cases, we hit a nil pointer dereference because the k8s code
tries to use the ctx we are not passing.
2021-03-26 10:29:11 +01:00
Markos Chandras
0a49650c70
aws: Graceful handling of EC2 detach errors
...
Sometimes, we observe the following error during a rolling update:
error detaching instance "i-XXXX", node "ip-10-X-X-X.ec2.internal": error detaching instance "i-XXXX": ValidationError: The instance i-XXXX is not part of Auto Scaling group XXXXX
The sequence of events that leads to this problem is the following:
- A new ASG object is being built from the launch template
- Existing instances are being added to it
- An existing instance is being ignored because it's already terminating
W0205 08:01:32.593377 191 aws_cloud.go:791] ignoring instance as it is terminating: i-XXXX in autoscaling group: XXXX
- Due to maxSurge, kOps tries to detach the terminating instance
from the autoscaling group, and the detach fails.
As such, in case of EC2 ASG detach failures we can simply try to detach
the next node instead of aborting the whole update operation.
2021-03-05 15:01:30 +02:00
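The behaviour described in the commit above, trying the next node when a detach fails instead of aborting, can be sketched as follows. The `detachFunc` type, the `detachUpTo` helper, and the instance IDs are hypothetical; this is an illustration of the idea, not the kOps implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotInGroup is a hypothetical stand-in for the EC2 ValidationError
// returned when an instance is not part of the Auto Scaling group.
var errNotInGroup = errors.New("instance is not part of Auto Scaling group")

// detachFunc is a hypothetical stand-in for the cloud call that detaches
// a single instance from its group.
type detachFunc func(instanceID string) error

// detachUpTo tries to detach up to want instances from candidates.
// A failed detach is logged and skipped so that the next candidate is
// tried, and the overall operation is not aborted.
func detachUpTo(candidates []string, want int, detach detachFunc) []string {
	var detached []string
	for _, id := range candidates {
		if len(detached) >= want {
			break
		}
		if err := detach(id); err != nil {
			fmt.Printf("failed to detach %s, trying next instance: %v\n", id, err)
			continue
		}
		detached = append(detached, id)
	}
	return detached
}

func main() {
	// The first instance is already terminating, so detaching it fails.
	detach := func(id string) error {
		if id == "i-terminating" {
			return errNotInGroup
		}
		return nil
	}
	got := detachUpTo([]string{"i-terminating", "i-healthy-1", "i-healthy-2"}, 1, detach)
	fmt.Println(got)
}
```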
Ole Markus With
5a2f1274fb
Don't try to detach masters
2020-11-28 09:44:42 +01:00
Kubernetes Prow Robot
0b5646e94a
Merge pull request #10266 from rifelpet/k8s120
...
Update k8s dependencies to 1.20.0-beta.2
2020-11-18 10:48:07 -08:00
Peter Rifel
47354ce010
Update kubectl drain fields for 1.20
2020-11-18 11:55:03 -06:00
Bharath Vedartham
208199ba85
instancegroups: Clear out the TODO comment
...
Now that we are able to associate pod validation failures with
instance groups, we can remove the TODO comment.
2020-11-15 11:07:45 +05:30
Kubernetes Prow Robot
7b26ec4b6d
Merge pull request #10065 from bharath-123/feature/instancegroup-specific-validation
...
Avoid waiting on validation during rolling update for inapplicable instance groups
2020-11-05 22:38:50 -08:00
zouyu
2e6b50f9e4
Some typos
...
Signed-off-by: zouyu <zouy.fnst@cn.fujitsu.com>
2020-11-03 16:28:30 +08:00
Bharath Vedartham
7067f5f47a
instancegroups: Ignore validation errors in unrelated instance groups
...
When unrelated instance groups produce validation errors, validation of the instance group
being updated also fails, and the rolling update is forced to wait before it can continue.
This can be avoided, as failures in other node instance groups usually don't affect
the instance group being updated in any way.
2020-10-31 19:17:24 +05:30
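One way to picture the change described in the commit above is a filter over validation failures that keeps only those relevant to the group being updated. The `validationFailure` type and `blockingFailures` helper below are hypothetical, and treating failures with no group attribution as blocking is an assumption of this sketch.

```go
package main

import "fmt"

// validationFailure is a hypothetical record of one failed check.
// InstanceGroup is empty when the failure is not tied to a group.
type validationFailure struct {
	Message       string
	InstanceGroup string
}

// blockingFailures returns the failures that should hold up the rolling
// update of the group named updating. Failures attributed to a different
// group are ignored; unattributed failures still block (an assumption).
func blockingFailures(failures []validationFailure, updating string) []validationFailure {
	var blocking []validationFailure
	for _, f := range failures {
		if f.InstanceGroup != "" && f.InstanceGroup != updating {
			continue
		}
		blocking = append(blocking, f)
	}
	return blocking
}

func main() {
	failures := []validationFailure{
		{Message: "pod not ready", InstanceGroup: "nodes-b"},
		{Message: "node not ready", InstanceGroup: "nodes-a"},
	}
	for _, f := range blockingFailures(failures, "nodes-a") {
		fmt.Println(f.InstanceGroup, f.Message)
	}
}
```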
Srikanth Rao
4d251fe900
[Digital Ocean] Implement Delete Instance logic for rolling update (#10000)
...
* Add delete Instance implementation for DO
* Add warning for DeleteInstance usage
* Use reconcile option for rolling update
* Update pkg/instancegroups/instancegroups.go
Co-authored-by: Ciprian Hacman <ciprianhacman@gmail.com>
2020-10-13 10:06:27 -07:00
Ole Markus With
aa66c4f6d8
Add rolling upgrade to openstack
2020-10-01 20:07:44 +02:00
Ole Markus With
63f13322d5
Don't pass ctx and cluster everywhere
2020-09-23 08:30:24 +02:00
Ole Markus With
0ec71686b9
Refactor cloudinstancegroupmember in a more independent cloud instance representation
...
Apply suggestions from code review
Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>
2020-08-30 21:37:03 +02:00
Ole Markus With
ff6c04938d
Add kops delete instance command
...
Add support for deleting instance by k8s node name
Add yes flag
2020-08-28 08:43:30 +02:00
Peter Rifel
4d9f0128a3
Upgrade to klog2
...
This splits up the kubernetes 1.19 PR to make it easier to keep up to date until we get it sorted out.
2020-08-16 20:56:48 -05:00
John Gardiner Myers
cc2b647d06
Create separate field for disabling rolling updates
2020-06-19 22:19:26 -07:00
ZouYu
2fc52ec6be
fix some go-lint warnings
...
Signed-off-by: ZouYu <zouy.fnst@cn.fujitsu.com>
2020-06-09 08:52:50 +08:00
John Gardiner Myers
091893fd20
Simplify rolling update internal methods
2020-05-29 10:52:03 -07:00
John Gardiner Myers
dd884a6a64
fix missing space
...
Co-authored-by: Peter Rifel <rifelpet@users.noreply.github.com>
2020-05-29 10:35:15 -07:00
John Gardiner Myers
7756be7fbc
Try validating multiple times before updating instancegroup
2020-05-22 20:26:02 -07:00
John Gardiner Myers
df7e0b18b6
Ignore already-deleted nodes during rolling update
2020-04-26 21:41:54 -07:00
Justin Santa Barbara
ffb6cd61aa
Rolling-update validation harmonization
...
This is a follow-on to #8868; I believe the intent of that was to
expose the option to do more (or fewer) retries.
We previously had a single retry to prevent flapping; this basically
unifies the previous behaviour with the idea of making it
configurable.
* validate-count=0 effectively turns off validation.
* validate-count=1 will do a single validation, without flapping
detection.
* validate-count>=2 will require N successful validations in a row,
waiting ValidateSuccessDuration in between.
A nice side-effect of this is that the tests now explicitly specify
ValidateCount=1 instead of setting ValidateSuccessDuration=0, which
had the side effect of doing the equivalent to ValidateCount=1.
2020-04-17 01:40:02 -04:00
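The validate-count semantics listed in the commit above can be sketched as a small loop. The function and parameter names here are hypothetical, and `maxAttempts` is an assumption added only to keep the sketch finite; it is not a claim about how kOps bounds its retries.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotValidated is a hypothetical error returned when the required
// number of consecutive successes is never reached.
var errNotValidated = errors.New("cluster did not validate")

// validateFunc is a hypothetical stand-in for one cluster validation.
// It returns nil when the cluster looks healthy.
type validateFunc func() error

// waitForValidation sketches the described behaviour:
//   - validateCount == 0 skips validation entirely;
//   - validateCount == 1 accepts a single success, with no flapping check;
//   - validateCount >= 2 requires that many successes in a row, waiting
//     successDuration between them.
func waitForValidation(validate validateFunc, validateCount int, successDuration time.Duration, maxAttempts int) error {
	if validateCount == 0 {
		return nil
	}
	successes := 0
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err := validate(); err != nil {
			successes = 0 // a failure breaks the streak
			continue
		}
		successes++
		if successes >= validateCount {
			return nil
		}
		time.Sleep(successDuration)
	}
	return errNotValidated
}

func main() {
	calls := 0
	// flappy fails only on its second call, breaking the first streak.
	flappy := func() error {
		calls++
		if calls == 2 {
			return errors.New("transient failure")
		}
		return nil
	}
	err := waitForValidation(flappy, 2, time.Millisecond, 10)
	fmt.Println(err, calls)
}
```

With a count of 2, the failure on the second call resets the streak, so two further successes are needed before the function returns.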
Justin Santa Barbara
31bb16d4d1
Add context.Context to most signatures
...
The client-go signature for most methods adds a context.Context
object, and also makes Options mandatory. Feed a context.Context
through many of our methods (but use context.TODO to
stop it getting totally out of hand!)
2020-04-11 14:44:17 -04:00
Jesse Haka
11eaacd53e
validationtimes -> validationcount
2020-04-08 13:55:29 +03:00
Jesse Haka
e1e79790ef
validate cluster n times in rolling update
2020-04-08 13:55:24 +03:00
John Gardiner Myers
6844eef4ca
Switch to the k/k implementation of drain.Helper
2020-04-05 10:22:49 -07:00
John Gardiner Myers
03eb8246c7
Refactor/simplify rolling update
2020-03-09 11:05:58 -07:00
John Gardiner Myers
ed73726195
Address review comments
2020-02-28 21:05:43 -08:00
John Gardiner Myers
ebfcf5d909
Implement recovery from previous failed surge rolling updates
2020-01-27 20:45:16 -08:00
John Gardiner Myers
cee662d521
Implement MaxSurge happy path
2020-01-27 20:45:16 -08:00