Commit Graph

73 Commits

Author SHA1 Message Date
justinsb f2d4eeb104 reconcile: wait for apiserver to respond before trying rolling-update
The rolling-update requires the apiserver (when called without --cloudonly),
so reconcile should wait for apiserver to start responding.

Implement this by reusing "validate cluster", but filtering to only the instance groups
and pods that we expect to be online.
2025-01-13 17:47:48 -05:00
justinsb 859a9fd9f1 chore: refactor factory to accept a cluster
This should allow us to build our own rest config in future,
rather than relying on the kubeconfig being configured correctly.

To do this, we need to stop sharing the factory between the channels
and kops commands.
2024-12-27 15:36:37 -05:00
Ciprian Hacman 26a424bcd7 validation: Use constants for validating node labels 2024-01-09 12:31:20 +02:00
justinsb ca67b1ca1e Refactor: rename IsGossip -> UsesLegacyGossip
We want to be able to use "dns=none" (without peer-to-peer gossip)
even for clusters that have the k8s.local extension.  These were
previously called "gossip clusters", but really that is an
implementation; what actually matters to users is that they don't rely
on writing records into a DNS zone (such as Route53).
2023-05-22 21:50:16 -04:00
John Gardiner Myers de9055b588 Update control-plane terminology in CLI output strings 2022-11-23 21:32:10 -08:00
John Gardiner Myers d39ba74bd7 Change the control-plane IG role to "ControlPlane" in v1alpha3 API 2022-11-22 17:05:29 -08:00
Ciprian Hacman 4e5ded6dc3 hetzner: Create cluster without DNS or Gossip 2022-10-27 11:29:37 +03:00
Ciprian Hacman dc98c74428 Move Gossip check to cluster struct 2022-10-21 09:48:07 +03:00
Ciprian Hacman 85026145a1 Always infer gossip DNS from cluster name 2022-10-02 12:54:37 +03:00
Ole Markus With 2ff655a688 Fix control plane validation 2022-04-18 15:32:27 +02:00
Ciprian Hacman ea7df00719 Run hack/update-gofmt.sh 2021-12-01 22:39:50 +02:00
Peter Rifel 8d1d16c342 Clarify the deployment responsible for API DNS in error message 2021-10-28 11:29:38 -05:00
John Gardiner Myers 2328ec2044 Report the placeholder address that was found 2021-10-27 22:15:08 -07:00
John Gardiner Myers d4cf1a80f0 Create placeholder DNS records of correct type for IPv6 clusters 2021-10-26 20:13:01 -07:00
Rajat Jindal 0ca28d986c do not validate detached nodes 2021-04-28 21:30:46 +05:30
Ole Markus With 09615935fd Make kOps CLI handle ASG warm pools 2021-04-15 11:10:23 +02:00
Ole Markus With 20bd724f5e Add support for scaling out the control plane with dedicated apiserver nodes
Ensure apiserver role can only be used on AWS (because of firewalling)

Apply api-server label to CP as well

Consolidate node not ready validation message

Guard apiserver nodes with a feature flag

Rename Apiserver role to APIServer

Add an integration test for apiserver nodes

Rename Apiserver role to APIServer

Enumerate all roles in rolling update docs

Apply suggestions from code review

Co-authored-by: Steven E. Harris <seh@panix.com>
2021-03-20 20:57:00 +01:00
Bharath Vedartham 5c9b688984 validate_cluster: Create node to instance group mapping to get pod instance group
In the ValidationError struct, there is a field to identify the instance group
with which the ValidationError is associated.

For pod-related ValidationErrors, it is not straightforward to identify the
instance group with which the pod is associated.

To achieve this, we create a node to instance group mapping in ValidateNodes.
This node to instance group mapping is used in collectPodFailures to identify the
pod instance group by using the pod's hostIp field.

We don't associate system-cluster-critical pods with instance groups, as those pod
failures are cluster-wide.
2020-11-15 11:06:03 +05:30
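The lookup described in commit 5c9b688984 can be illustrated with a minimal sketch. This is not the kOps implementation: the `Node` and `Pod` types, the `buildIPToGroup` and `podInstanceGroup` helpers, and the sample addresses are all hypothetical stand-ins, assuming only what the message states (a map built while validating nodes, keyed so that a pod's host IP resolves to an instance group, with system-cluster-critical pods left unattributed).

```go
package main

import "fmt"

// Node and Pod are simplified, hypothetical stand-ins for the real
// Kubernetes objects; only the fields needed for the lookup are kept.
type Node struct {
	Name          string
	InternalIP    string
	InstanceGroup string
}

type Pod struct {
	Name              string
	HostIP            string
	PriorityClassName string
}

// buildIPToGroup records, for every node with a known IP, the name of
// the instance group that node belongs to.
func buildIPToGroup(nodes []Node) map[string]string {
	m := make(map[string]string, len(nodes))
	for _, n := range nodes {
		if n.InternalIP != "" {
			m[n.InternalIP] = n.InstanceGroup
		}
	}
	return m
}

// podInstanceGroup resolves a pod to an instance group via its host IP.
// Pods with the system-cluster-critical priority class are deliberately
// left unattributed, since their failures are cluster-wide.
func podInstanceGroup(pod Pod, ipToGroup map[string]string) (string, bool) {
	if pod.PriorityClassName == "system-cluster-critical" {
		return "", false
	}
	group, ok := ipToGroup[pod.HostIP]
	return group, ok
}

func main() {
	ipToGroup := buildIPToGroup([]Node{
		{Name: "node-1", InternalIP: "10.0.0.1", InstanceGroup: "nodes-a"},
		{Name: "node-2", InternalIP: "10.0.0.2", InstanceGroup: "nodes-b"},
	})
	pod := Pod{Name: "example-pod", HostIP: "10.0.0.2"}
	if group, ok := podInstanceGroup(pod, ipToGroup); ok {
		fmt.Println(pod.Name, "runs in instance group", group)
	}
}
```

Keying the map by IP rather than node name is a choice of this sketch that mirrors the message's mention of the pod's hostIp field; a pod whose host IP is unknown simply gets no instance group.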
Kubernetes Prow Robot 01b17be97e Merge pull request #10221 from eddycharly/fix-validation
Fix cluster validation dependency on local kubeconfig
2020-11-14 14:17:03 -08:00
Charles-Edouard Brétéché 116af0c74b pass host only instead of the whole config 2020-11-12 08:37:51 +01:00
Charles-Edouard Brétéché 709e1b6cbd Fix cluster validation dependency on local kubeconfig 2020-11-11 21:11:54 +01:00
John Gardiner Myers c2434a2e08 Remove components from cluster validation 2020-11-10 23:36:46 -08:00
Bharath Vedartham 49f2a0e10a validate_cluster: Add InstanceGroup field to ValidationError struct
The InstanceGroup field in the ValidationError struct is an optional field meant
to indicate the InstanceGroup which has reported that failure. This field either
holds a pointer to the instance group which caused the validation error, or is
nil, which indicates that we were unable to determine the instance group to which
this failure should be attributed.

This field is mainly used to identify whether a failure is worth waiting for
when validating a particular instance group.
2020-10-31 19:16:42 +05:30
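A rough sketch of how an optional instance group pointer, as described in commit 49f2a0e10a, might be used to decide which failures matter for one instance group. The struct layout, the `worthWaitingFor` helper, and in particular the choice to keep failures whose instance group is nil are assumptions made for illustration; the commit message does not say how unattributed failures are treated.

```go
package main

import "fmt"

// InstanceGroup is a hypothetical, pared-down stand-in.
type InstanceGroup struct {
	Name string
}

// ValidationError carries an optional pointer to the instance group the
// failure is attributed to; nil means the group could not be determined.
type ValidationError struct {
	Kind          string
	Name          string
	Message       string
	InstanceGroup *InstanceGroup
}

// worthWaitingFor keeps the failures relevant to the target group.
// Assumption: unattributed failures (nil InstanceGroup) are kept as
// well, on the conservative view that they might affect any group.
func worthWaitingFor(target string, errs []ValidationError) []ValidationError {
	var out []ValidationError
	for _, e := range errs {
		if e.InstanceGroup == nil || e.InstanceGroup.Name == target {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	groupA := &InstanceGroup{Name: "nodes-a"}
	groupB := &InstanceGroup{Name: "nodes-b"}
	errs := []ValidationError{
		{Kind: "Node", Name: "n1", Message: "node not ready", InstanceGroup: groupA},
		{Kind: "Node", Name: "n2", Message: "node not ready", InstanceGroup: groupB},
		{Kind: "Pod", Name: "p1", Message: "pod pending"}, // unattributed
	}
	for _, e := range worthWaitingFor("nodes-a", errs) {
		fmt.Println(e.Kind, e.Name, e.Message)
	}
}
```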
John Gardiner Myers ca241a5193 Don't require PriorityClassName to pass missing-static-pod checks 2020-10-13 22:42:11 -07:00
Ole Markus With 0ec71686b9 Refactor cloudinstancegroupmember in a more independent cloud instance representation
Apply suggestions from code review

Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>
2020-08-30 21:37:03 +02:00
Peter Rifel 4d9f0128a3 Upgrade to klog2
This splits up the kubernetes 1.19 PR to make it easier to keep up to date until we get it sorted out.
2020-08-16 20:56:48 -05:00
John Gardiner Myers a760080f01 Prefer the GA label for node zone 2020-06-14 16:35:57 -07:00
Kubernetes Prow Robot a454f0ff83 Merge pull request #9118 from johngmyers/validate-missing-asg
Return cluster validation failure if ASG missing
2020-05-13 14:36:23 -07:00
John Gardiner Myers 154833e652 Fail cluster validation if too few nodes for ig's target size 2020-05-12 22:28:26 -07:00
John Gardiner Myers 23d48f01d6 Return cluster validation failure if ASG missing 2020-05-11 21:19:02 -07:00
John Gardiner Myers 06376302e4 Don't test static pods on non-ready nodes 2020-05-02 22:09:53 -07:00
John Gardiner Myers c524290f9e Test more static pods during cluster validation 2020-05-02 22:09:53 -07:00
Justin Santa Barbara 31bb16d4d1 Add context.Context to most signatures
The client-go signature for most methods adds a context.Context
object, and also makes Options mandatory.  Feed a context.Context
through many of our methods (but use context.TODO to
stop it getting totally out of hand!)
2020-04-11 14:44:17 -04:00
Kubernetes Prow Robot a210ec9649 Merge pull request #8446 from johngmyers/validate-priority
Use PriorityClassName instead of namespace in cluster validation
2020-03-11 10:09:37 -07:00
John Gardiner Myers 1b7c5139e0 Merge branch 'master' into surge 2020-03-03 17:53:18 -08:00
John Gardiner Myers a99ef7c8d2 Handle Unknown pod phase in cluster validation 2020-02-25 21:04:40 -08:00
John Gardiner Myers c557289c4b Use PriorityClassName instead of namespace in cluster validation 2020-02-25 21:04:37 -08:00
John Gardiner Myers 8148f2da69 Fail cluster validation if a master missing kube-controller-manager 2020-02-20 21:50:11 -08:00
John Gardiner Myers be12d88cc3 Detached instances don't count against instancegroup minimums 2020-01-27 20:15:11 -08:00
John Gardiner Myers 80dc001b23 Determine node role from instancegroup spec 2019-12-18 21:47:16 -08:00
John Gardiner Myers bd4e1277ae Pass the cloud object to validator from caller 2019-11-13 22:19:55 -08:00
John Gardiner Myers 55f4fcb419 Extract the list of instance groups earlier in validation 2019-11-13 22:08:52 -08:00
John Gardiner Myers 63e0c5e726 Add tests for cluster validation during rolling update 2019-11-04 16:26:39 -08:00
mikesplain 9e55b8230a Update copyright notices
Also cleans up some whitespace
2019-09-09 14:47:51 -04:00
Justin SB 3e33ac7682 Change code from glog to klog
We don't call klog.InitFlags yet, because that will cause a flag
redefinition error until we get everyone to stop using glog.  That
will happen when we update to k8s 1.13.
2019-05-06 12:54:51 -04:00
Derek Lemon -T (delemon - AEROTEK INC at Cisco) 4f0169bb79 codegen 2019-01-16 09:30:40 -07:00
Justin Santa Barbara 83f40e0334 Fix missed error check in hasPlaceHolderIP 2018-12-25 10:53:50 -05:00
Justin Santa Barbara f49aba4147 Consider pending pods to be a validation failure
Also log the names of the non-ready containers.
2018-12-20 10:08:40 -05:00
Justin SB a96a58ac78 Include name of unhealthy component in validation error
Rolling-update just prints the message, and indeed I think the message
should be self-contained.
2018-11-27 09:53:40 -05:00
Raffaele Di Fazio d477e96c38 Added initial implementation of ACM cert for Kubernetes API ELB 2018-07-06 09:29:54 +02:00