152 Commits

Author SHA1 Message Date
John Gardiner Myers 640f5f5b74 Terminate AWS instances through EC2 instead of Autoscaling 2020-01-27 20:15:10 -08:00
John Gardiner Myers d56ad41334 Address review comments 2020-01-26 14:18:51 -08:00
John Gardiner Myers 5f72d12132 Reduce test flakiness 2020-01-12 21:27:55 -08:00
John Gardiner Myers 10d6416b8e Allow MaxConcurrency for masters and bastions 2020-01-11 18:50:35 -08:00
John Gardiner Myers 0c3651c9c8 Implement MaxUnavailable 2020-01-05 12:09:55 -08:00
John Gardiner Myers 0952374027 Extract maybeValidate 2020-01-05 12:09:55 -08:00
John Gardiner Myers 91f4920537 Extract drainTerminateAndWait() 2020-01-05 12:09:55 -08:00
John Gardiner Myers adaf903b90 Create resolveSettings 2020-01-05 12:09:54 -08:00
Kubernetes Prow Robot 121d9f461f Merge pull request #7909 from johngmyers/remove-drain-feature-flag
Remove DrainAndValidateRollingUpdate feature flag
2020-01-05 11:15:40 -08:00
Kubernetes Prow Robot a22af4fa80 Merge pull request #8239 from johngmyers/simplify-rolling
Simplify code for rolling updates of nodes
2020-01-04 13:13:40 -08:00
John Gardiner Myers 01dd793604 Specify number of NotReady instances in makeGroup() parameter 2020-01-04 10:47:08 -08:00
John Gardiner Myers 39f849271b Fold setUpCloud() into getGroups() 2020-01-04 09:08:00 -08:00
John Gardiner Myers 612e4ae484 Extract creation of the CloudInstanceGroup 2020-01-04 09:08:00 -08:00
John Gardiner Myers cba59afac4 Change taint key per review comment 2020-01-03 10:07:21 -08:00
John Gardiner Myers cd499f6f09 Remove unused code 2019-12-31 14:33:05 -08:00
John Gardiner Myers 0cbd76ecfb Simplify code for rolling updates of nodes 2019-12-31 10:25:55 -08:00
John Gardiner Myers 97ad2c3b54 Taint nodes needing update 2019-12-30 16:06:00 -08:00
John Gardiner Myers 5189cc1ef6 Add a third instance to each nodes group in rolling update tests 2019-12-30 13:48:37 -08:00
John Gardiner Myers 92581ab4a1 Create nodes for instances in rolling update tests 2019-12-30 13:48:37 -08:00
John Gardiner Myers 77769855af Return groups from getTestSetup() 2019-12-30 13:48:34 -08:00
John Gardiner Myers 1d3d5c1d2f pkg/instancegroups - fix static check 2019-12-22 20:56:27 -08:00
Justin Santa Barbara 84835ce0ba Update pkg/instancegroups/rollingupdate_test.go
Co-Authored-By: John Gardiner Myers <jgmyers@proofpoint.com>
2019-12-17 21:25:18 -05:00
Peter Rifel a24d9b6455 remove more trailing whitespace 2019-12-17 13:03:16 -06:00
Peter Rifel 85a1d23c18 remove trailing whitespace that was breaking gofmt 2019-12-17 12:49:20 -06:00
Justin Santa Barbara 8373c9fc4d tests: increase timeout in rolling update tests
We never know when e.g. a GC is going to delay us, so we need a lot
more padding on these timeouts.
2019-12-17 09:59:21 -05:00
John Gardiner Myers 2850826a52 Improve logging of cluster revalidation 2019-12-13 13:48:47 -08:00
John Gardiner Myers 19e165759b Add unit test for flapping validation 2019-12-13 13:45:21 -08:00
Jesse Haka 44183aef7f validate cluster twice 2019-12-12 08:48:15 +02:00
John Gardiner Myers 1239c05e71 Validate after updating bastion 2019-12-09 18:45:51 -08:00
John Gardiner Myers 2e36124f77 Expose ValidateTickDuration for use by unit tests 2019-12-09 18:43:20 -08:00
Kashif Saadat fcf6f0098c Canal Typha spec and apimachinery 2019-12-06 15:36:48 +00:00
John Gardiner Myers 4eccd3d53f Remove DrainAndValidateRollingUpdate feature flag 2019-12-05 22:50:04 -08:00
John Gardiner Myers 38b19e53b4 Add a second master to rolling update tests 2019-11-19 16:55:39 -08:00
John Gardiner Myers 8121a84089 Improve rolling update test coverage 2019-11-19 16:55:39 -08:00
John Gardiner Myers cfca6fae10 extract RollingUpdateCluster initialization in rollingupdate tests 2019-11-19 16:55:39 -08:00
John Gardiner Myers d82c834fe3 extract common test setup in rollingupdate tests 2019-11-19 16:55:39 -08:00
John Gardiner Myers 7e8c77a8bf extract CloudInstanceGroup setup in rollingupdate tests 2019-11-19 16:55:36 -08:00
John Gardiner Myers 3d6d6734e5 Make rollingupdate test assertions succinct 2019-11-19 16:45:55 -08:00
Peter Rifel 3dc06afa12 Fix goimports errors
It turns out we weren't running verify-goimports in our CI jobs.

While we work to get that enabled, we can at least unblock the releases by doing a one-time fix of the failing goimports
2019-11-19 05:05:02 -08:00
John Gardiner Myers b63cc36f88 Update bazel 2019-11-07 22:59:51 -08:00
Justin Santa Barbara 5a0b199119 Merge branch 'master' into fix-roll-validation 2019-11-07 20:54:10 -05:00
John Gardiner Myers 5b8bed77fa Don't update first node in instancegroup if cluster fails validation 2019-11-04 16:26:39 -08:00
John Gardiner Myers 63e0c5e726 Add tests for cluster validation during rolling update 2019-11-04 16:26:39 -08:00
feifei.zhang@huawei.com 4b49412105 fix golint failures 2019-10-31 20:22:37 +08:00
yuxiaobo 0bd700781e Correct word misspelling 2019-09-29 22:23:07 +08:00
Justin SB 97f552778f Add env vars, update tests 2019-09-25 12:48:13 -04:00
Justin SB 728e582360 Fill out kops controller functionality
k8s 1.16 requires that we move label setting away from the kubelet, to
a central controller.  kops-controller is that controller.
2019-09-25 12:04:34 -04:00
mikesplain 9e55b8230a Update copyright notices
Also cleans some white spaces
2019-09-09 14:47:51 -04:00
Kubernetes Prow Robot a7ed5ae0fc Merge pull request #7510 from justinsb/set_delete_local_data
DeleteLocalData on drain
2019-09-03 08:46:29 -07:00
Justin SB 8b02cf14ca DeleteLocalData on drain
This restores the kops behaviour before the refactor, where pods using
emptyDir would not block the rolling update.
2019-09-03 07:07:44 -07:00
Justin SB 96aa37d03d Remove unused ClientGetter from Drain code
This is no longer needed now we are using the k8s drain code.
2019-09-03 07:04:37 -07:00
Kubernetes Prow Robot a957428446 Merge pull request #7470 from justinsb/update_to_k115
Update to kubernetes 1.15
2019-08-27 10:24:43 -07:00
Justin SB b1f8f84306 Code changes for 1.15 2019-08-25 16:00:39 -04:00
Jesse Haka b581e7a305 print all failures 2019-08-23 16:06:37 +03:00
Justin SB 76d03b3f71 Generated files: glog -> klog 2019-05-06 12:56:03 -04:00
Justin SB 3e33ac7682 Change code from glog to klog
We don't call klog.InitFlags yet, because that will cause a flag
redefinition error until we get everyone to stop using glog.  That
will happen when we update to k8s 1.13.
2019-05-06 12:54:51 -04:00
Justin SB 78ebe93f9f Update kubernetes dependencies to 1.13.5
Notable changes:

* openapi-gen moved to k8s.io/kube-openapi/cmd/openapi-gen
* templates moved to k8s.io/kubernetes/pkg/kubectl/util/templates
2019-05-06 09:58:37 -04:00
Derek Lemon -T (delemon - AEROTEK INC at Cisco) 4f0169bb79 codegen 2019-01-16 09:30:40 -07:00
Rodrigo Menezes 2b9243ff8c Getting things ready for when we are ready for 1.12 2018-12-04 18:50:17 -08:00
mikeweiwei 027d324aaf If don't use formatted output,fix logging calls 2018-10-10 19:19:09 +08:00
Bheesham Persaud 65e9a86b39 Fix minor typo. 2018-09-29 02:18:40 -04:00
Justin Santa Barbara 62e8e17077 Code fixes for k8s 1.11 API changes 2018-09-28 20:14:45 -04:00
Justin Santa Barbara e982087e3e Delete nodes from k8s api during rolling-update
This prevents a race where if the new node comes back with the same
name, it will still be cordoned.  This seems to be more likely on GCE.
2018-09-22 16:06:07 -04:00
Mikulas Dite 525c0a9bc8 fix rolling-update prompt when nodeName is unset
Updated to use same logic as DeleteInstance does: print at least the host id
as that is always available and only include node name if set.
2018-08-16 16:12:17 +02:00
Mikulas Dite eab3a7824e fix cloudonly rolling-update ignores interactive 2018-08-16 16:12:17 +02:00
Justin Santa Barbara 3a1ce236d1 Simplify logic around master rolling-update
We were using a waitgroup, but we weren't actually running in parallel.
2018-07-21 23:04:22 -04:00
Deniz Zoeteman 2a69901d52 Add message to error for stopping rolling update after failure 2018-07-17 18:56:31 +02:00
Deniz Zoeteman b06e3efa4d Stop with rolling update if bastions or masters failed to update 2018-07-16 16:56:47 +02:00
Deniz Zoeteman 84796eac0b Fail cluster validation for rolling-update if a failure occurs 2018-07-14 21:41:10 +02:00
Eric Herot 2090479da5 Let people know that stopping an instance can sometimes take a while 2018-06-13 18:21:24 -04:00
Eric Herot 0e47086ff5 Communicate that we're going to wait for stabilization after draining
The wait for this is very long (90s) by default, which is long enough that many users may assume things are hanging if we don't say what they're waiting for.
2018-06-13 18:18:23 -04:00
Rajat Jindal 3961d85e44 set GracePeriodSeconds to -1 2018-05-10 18:33:54 -07:00
Haoyun 1b8c222026 fix a grammar mistake
2018-03-28 12:31:10 +08:00
Justin Santa Barbara 55e3a5f212 Validation: Take a cluster object, not just the name 2018-03-20 01:12:07 -04:00
AdamDang 12183af654 Update instancegroups.go
Line 340: “Delete and CloudInstanceGroups”
It should be “Delete a CloudInstanceGroups”, is it?
2018-03-15 19:01:11 +08:00
Justin Santa Barbara 85b972bc28 Fill out cloudmock to do a basic lifecycle test 2018-03-11 17:04:30 -04:00
Mike Splain 45a57915e2 Fix bazel deprecation notice 2018-02-26 09:36:13 -05:00
Mike Splain f40dc50a25 Update BUILD files to account for some recent changes 2018-02-12 17:16:33 -05:00
Kashif Saadat 670f8e6b19 Fix drain command for rolling-updates 2018-02-05 16:30:45 +00:00
k8s-ci-robot 923118eee0 Merge pull request #4166 from mrballcb/interactive_cli_opt
Interactive cli opt
2018-01-26 12:25:33 -08:00
Todd Lyons 5c1b646896 Maintainer recommended code/style updates 2018-01-18 22:22:34 -08:00
chrislovecnm 4dd3bb1dea Updating bazel BUILD files with new go_rules version 2017-12-29 15:03:14 -07:00
Todd Lyons 7f7306d4f9 Lint fixes by make gofmt 2017-12-28 15:42:53 -08:00
Todd Lyons 73b29b68e6 User input to continue/abort rolling update 2017-12-28 14:57:28 -08:00
Todd Lyons 2f0d888d18 Remove useless line 2017-12-28 11:00:55 -08:00
Todd Lyons 40eed60dd8 Interactive cli arg framework
Just builds, haven't tested yet.
2017-12-28 10:54:17 -08:00
Nico Piderman 69519f558b Spelling fix in instancegroups.go error msg 2017-12-07 10:08:15 -05:00
chrislovecnm 609e268a1d gazelle updates with new bazel version 2017-11-05 17:41:53 -07:00
Justin Santa Barbara eec1141a41 Rationalize timeouts for rolling-update
The intervals remain the minimum time between instances; drain &
validate time is additional.
2017-10-17 11:44:46 -04:00
Kubernetes Submit Queue 518e97d97b Merge pull request #3510 from justinsb/bazel
Automatic merge from submit-queue.

Initial bazel support

Builds on the 1.8 version bump

The "trick" is to strip the BUILD & BUILD.bazel files from the vendor-ed deps.

Will rebase after 1.8 version bump merges.
2017-10-03 01:19:27 -07:00
Justin Santa Barbara 737f2fcd80 rolling-update - initial GCE support 2017-10-02 23:07:35 -04:00
Justin Santa Barbara 0143be7c4f autogen: BUILD and BUILD.bazel 2017-10-02 14:27:21 -04:00
Justin Santa Barbara 3478031533 API types changed package 2017-10-01 14:03:56 -04:00
Justin Santa Barbara abd48ee653 Name CloudInstanceGroupMember consistently
Keep the naming of the type consistent.
2017-09-30 17:39:53 -04:00
chrislovecnm a431eb3e43 refactoring to use cloud based GetGroups 2017-09-29 12:29:07 -06:00
chrislovecnm 2f12a3e521 refactoring delete into its own file 2017-09-28 15:52:50 -06:00
chrislovecnm 93f3600f36 adding aws_cloud instancegroups delete and get methods 2017-09-28 15:52:50 -06:00
chrislovecnm 8dabeecd3b tweaking ux printing rolled cluster name 2017-09-23 19:41:36 -06:00
chrislovecnm ec2f0dfdf3 reusing the node and master duration for validation periods 2017-09-23 18:11:48 -06:00
Lars Lehtonen 1da7d66fd1 fixed swallowed errors under pkg subdirectory. 2017-07-15 13:49:17 -07:00