justinsb
f2d4eeb104
reconcile: wait for apiserver to respond before trying rolling-update
...
The rolling-update requires the apiserver (when called without --cloudonly),
so reconcile should wait for apiserver to start responding.
Implement this by reusing "validate cluster", but filtering to only the instance groups
and pods that we expect to be online.
2025-01-13 17:47:48 -05:00
justinsb
ebcfebe50e
chore: add context to rolling update functions
...
Move it out of the struct, and into the function parameters.
This is more idiomatic Go.
2024-12-27 14:22:51 -05:00
Peter Rifel
f05284a2f9
Migrate Instance Group management to aws-sdk-go-v2
2024-04-13 16:01:41 -04:00
Peter Rifel
d4d39eb0fe
Migrate autoscaling to aws-sdk-go-v2
2024-03-31 23:04:06 -05:00
justinsb
49dfdabb79
cloudmock: Add context functions to mock
2023-11-09 08:17:10 -05:00
Ciprian Hacman
65c24a9f3d
Add missing mock functions
2023-11-09 08:17:10 -05:00
Jack Andersen
6efd68f428
Remove optionality and exit when specific error prefix is matched
...
Signed-off-by: Jack Andersen <jandersen@plaid.com>
2022-12-21 09:30:14 -08:00
Jack Andersen
f9ea9b3ef8
Add a flag to rolling update to fail immediately on IG error
...
Signed-off-by: Jack Andersen <jandersen@plaid.com>
2022-12-21 09:30:13 -08:00
John Gardiner Myers
d39ba74bd7
Change the control-plane IG role to "ControlPlane" in v1alpha3 API
2022-11-22 17:05:29 -08:00
Ciprian Hacman
8f79c9bd68
Replace fi.Bool/Float*/Int*/String() with fi.PtrTo()
2022-11-19 03:45:22 +02:00
Ciprian Hacman
cb99db0757
Run make goimports
2022-08-17 07:03:33 +03:00
Ole Markus With
982463683d
Remove checks that don't work when we do not delete the node object
2022-03-06 07:34:52 +01:00
Ciprian Hacman
ea7df00719
Run hack/update-gofmt.sh
2021-12-01 22:39:50 +02:00
John Gardiner Myers
4396270d74
Fix out of bounds error when instance detach fails
2021-11-08 23:00:28 -08:00
John Gardiner Myers
d46ee9c883
Exclude nodes from load balancers upon cordoning
2021-04-20 17:58:26 -07:00
Ole Markus With
09615935fd
Make kOps CLI handle ASG warm pools
2021-04-15 11:10:23 +02:00
Ole Markus With
2659a30280
Make get instances respect needs-update annotation
...
Make it possible for addons to set needs-update annotation
Use onDelete update strategy for cilium and set needs-update annotation
Rename node roles
2020-11-16 08:26:17 +01:00
Bharath Vedartham
1e18a5d344
rollingupdate_test: add tests for rolling update
...
The tests create a cluster with 2 node instance groups and 1 master and bastion instance groups.
Only one node instance group requires rolling update.
instanceGroupNodeSpecificErrorClusterValidator mocks a validation failure for a given node group.
rolling update should not fail if the cluster validator reports an error in an unrelated instance group.
2020-10-31 19:17:45 +05:30
Ole Markus With
63f13322d5
Don't pass ctx and cluster everywhere
2020-09-23 08:30:24 +02:00
Ole Markus With
0ec71686b9
Refactor cloudinstancegroupmember into a more independent cloud instance representation
...
Apply suggestions from code review
Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>
2020-08-30 21:37:03 +02:00
Ciprian Hacman
5a9cc3d216
Fix int to string conversions
2020-07-26 09:09:52 +03:00
John Gardiner Myers
cc2b647d06
Create separate field for disabling rolling updates
2020-06-19 22:19:26 -07:00
John Gardiner Myers
af90ecdddf
Reduce test flakiness
2020-05-22 19:33:01 -07:00
Justin Santa Barbara
ffb6cd61aa
Rolling-update validation harmonization
...
This is a follow-on to #8868 ; I believe the intent of that was to
expose the option to do more (or fewer) retries.
We previously had a single retry to prevent flapping; this basically
unifies the previous behaviour with the idea of making it
configurable.
* validate-count=0 effectively turns off validation.
* validate-count=1 will do a single validation, without flapping
detection.
* validate-count>=2 will require N successful validations in a row,
waiting ValidateSuccessDuration in between.
A nice side-effect of this is that the tests now explicitly specify
ValidateCount=1 instead of setting ValidateSuccessDuration=0, which
had the side effect of doing the equivalent to ValidateCount=1.
2020-04-17 01:40:02 -04:00
Justin Santa Barbara
31bb16d4d1
Add context.Context to most signatures
...
The client-go signature for most methods adds a context.Context
object, and also makes Options mandatory. Feed a context.Context
through many of our methods (but use context.TODO to
stop it getting totally out of hand!)
2020-04-11 14:44:17 -04:00
Jesse Haka
11eaacd53e
validationtimes -> validationcount
2020-04-08 13:55:29 +03:00
Jesse Haka
e1e79790ef
validate cluster n times in rolling update
2020-04-08 13:55:24 +03:00
John Gardiner Myers
33e23166e4
Support the kops.k8s.io/needs-update annotation on nodes
2020-03-09 22:43:09 -07:00
John Gardiner Myers
99100dc4a0
Fix flaky test
2020-03-03 20:54:22 -08:00
John Gardiner Myers
ebfcf5d909
Implement recovery from previous failed surge rolling updates
2020-01-27 20:45:16 -08:00
John Gardiner Myers
cee662d521
Implement MaxSurge happy path
2020-01-27 20:45:16 -08:00
John Gardiner Myers
640f5f5b74
Terminate AWS instances through EC2 instead of Autoscaling
2020-01-27 20:15:10 -08:00
John Gardiner Myers
5f72d12132
Reduce test flakiness
2020-01-12 21:27:55 -08:00
John Gardiner Myers
10d6416b8e
Allow MaxConcurrency for masters and bastions
2020-01-11 18:50:35 -08:00
John Gardiner Myers
0c3651c9c8
Implement MaxUnavailable
2020-01-05 12:09:55 -08:00
John Gardiner Myers
01dd793604
Specify number of NotReady instances in makeGroup() parameter
2020-01-04 10:47:08 -08:00
John Gardiner Myers
39f849271b
Fold setUpCloud() into getGroups()
2020-01-04 09:08:00 -08:00
John Gardiner Myers
612e4ae484
Extract creation of the CloudInstanceGroup
2020-01-04 09:08:00 -08:00
John Gardiner Myers
cba59afac4
Change taint key per review comment
2020-01-03 10:07:21 -08:00
John Gardiner Myers
97ad2c3b54
Taint nodes needing update
2019-12-30 16:06:00 -08:00
John Gardiner Myers
5189cc1ef6
Add a third instance to each nodes group in rolling update tests
2019-12-30 13:48:37 -08:00
John Gardiner Myers
92581ab4a1
Create nodes for instances in rolling update tests
2019-12-30 13:48:37 -08:00
John Gardiner Myers
77769855af
Return groups from getTestSetup()
2019-12-30 13:48:34 -08:00
Justin Santa Barbara
84835ce0ba
Update pkg/instancegroups/rollingupdate_test.go
...
Co-Authored-By: John Gardiner Myers <jgmyers@proofpoint.com>
2019-12-17 21:25:18 -05:00
Peter Rifel
a24d9b6455
remove more trailing whitespace
2019-12-17 13:03:16 -06:00
Peter Rifel
85a1d23c18
remove trailing whitespace that was breaking gofmt
2019-12-17 12:49:20 -06:00
Justin Santa Barbara
8373c9fc4d
tests: increase timeout in rolling update tests
...
We never know when e.g. a GC is going to delay us, so we need a lot
more padding on these timeouts.
2019-12-17 09:59:21 -05:00
John Gardiner Myers
19e165759b
Add unit test for flapping validation
2019-12-13 13:45:21 -08:00
Jesse Haka
44183aef7f
validate cluster twice
2019-12-12 08:48:15 +02:00
John Gardiner Myers
1239c05e71
Validate after updating bastion
2019-12-09 18:45:51 -08:00