99 Commits

Author SHA1 Message Date
justinsb f2d4eeb104 reconcile: wait for apiserver to respond before trying rolling-update
The rolling-update requires the apiserver (when called without --cloudonly),
so reconcile should wait for apiserver to start responding.

Implement this by reusing "validate cluster", but filtering to only the instance groups
and pods that we expect to be online.
2025-01-13 17:47:48 -05:00
justinsb 859a9fd9f1 chore: refactor factory to accept a cluster
This should allow us to build our own rest config in future,
rather than relying on the kubeconfig being configured correctly.

To do this, we need to stop sharing the factory between the channels
and kops commands.
2024-12-27 15:36:37 -05:00
Ciprian Hacman 26a424bcd7 validation: Use constants for validating node labels 2024-01-09 12:31:20 +02:00
justinsb ca67b1ca1e Refactor: rename IsGossip -> UsesLegacyGossip
We want to be able to use "dns=none" (without peer-to-peer gossip)
even for clusters that have the k8s.local extension.  These were
previously called "gossip clusters", but really that is an
implementation; what actually matters to users is that they don't rely
on writing records into a DNS zone (such as Route53).
2023-05-22 21:50:16 -04:00
John Gardiner Myers de9055b588 Update control-plane terminology in CLI output strings 2022-11-23 21:32:10 -08:00
John Gardiner Myers d39ba74bd7 Change the control-plane IG role to "ControlPlane" in v1alpha3 API 2022-11-22 17:05:29 -08:00
Ciprian Hacman 4e5ded6dc3 hetzner: Create cluster without DNS or Gossip 2022-10-27 11:29:37 +03:00
Ciprian Hacman dc98c74428 Move Gossip check to cluster struct 2022-10-21 09:48:07 +03:00
Ciprian Hacman 85026145a1 Always infer gossip DNS from cluster name 2022-10-02 12:54:37 +03:00
Ole Markus With 2ff655a688 Fix control plane validation 2022-04-18 15:32:27 +02:00
Ole Markus With ce2e877aeb Remove bazel files from vendor 2022-04-12 13:29:03 +02:00
Ciprian Hacman ea7df00719 Run hack/update-gofmt.sh 2021-12-01 22:39:50 +02:00
Peter Rifel 8d1d16c342 Clarify the deployment responsible for API DNS in error message 2021-10-28 11:29:38 -05:00
John Gardiner Myers 2328ec2044 Report the placeholder address that was found 2021-10-27 22:15:08 -07:00
John Gardiner Myers d4cf1a80f0 Create placeholder DNS records of correct type for IPv6 clusters 2021-10-26 20:13:01 -07:00
Rajat Jindal 0ca28d986c do not validate detached nodes 2021-04-28 21:30:46 +05:30
Rajat Jindal 1fed9c7711 add testcase demonstrating detached nodes getting validated 2021-04-28 21:30:46 +05:30
Ole Markus With 09615935fd Make kOps CLI handle ASG warm pools 2021-04-15 11:10:23 +02:00
Ole Markus With 20bd724f5e Add support for scaling out the control plane with dedicated apiserver nodes
Ensure apiserver role can only be used on AWS (because of firewalling)

Apply api-server label to CP as well

Consolidate node not ready validation message

Guard apiserver nodes with a feature flag

Rename Apiserver role to APIServer

Add an integration test for apiserver nodes

Rename Apiserver role to APIServer

Enumerate all roles in rolling update docs

Apply suggestions from code review

Co-authored-by: Steven E. Harris <seh@panix.com>
2021-03-20 20:57:00 +01:00
Bharath Vedartham 424ab3734e validate_cluster_test: enhance validatePodFailure tests
We are now able to identify the instance group associated with a pod.

Add an extra layer to the validatePodFailure tests, where we
create a mock InstanceGroup and associate the pod failures with the
instance group to which the pod belongs.
2020-11-15 11:07:21 +05:30
Bharath Vedartham 5c9b688984 validate_cluster: Create node to instance group mapping to get pod instance group
In the ValidationError struct, there is a field to identify the instance group
with which the ValidationError is associated.

For pod-related ValidationErrors, it is not straightforward to identify the
instance group with which the pod is associated.

To achieve this, we create a node to instance group mapping in ValidateNodes.
This node to instance group mapping is used in collectPodFailures to identify the
pod instance group by using the pod's hostIp field.

We don't associate system-cluster-critical pods with instance groups, as those pod
failures are cluster-wide.
2020-11-15 11:06:03 +05:30
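The lookup described in 5c9b688984 can be sketched roughly as follows. This is an illustration under assumptions, not the kops code: the `InstanceGroup`, `Node` and `Pod` types are simplified stand-ins, and `buildNodeIPMap` and `podInstanceGroup` are hypothetical helper names.

```go
package main

import "fmt"

// InstanceGroup, Node and Pod are simplified, hypothetical stand-ins for the
// real kops and Kubernetes types.
type InstanceGroup struct{ Name string }

type Node struct {
	Name       string
	InternalIP string
}

type Pod struct {
	Name              string
	HostIP            string
	PriorityClassName string
}

// buildNodeIPMap records, for every node IP, the instance group that owns the node.
func buildNodeIPMap(groups map[*InstanceGroup][]Node) map[string]*InstanceGroup {
	byIP := make(map[string]*InstanceGroup)
	for ig, nodes := range groups {
		for _, n := range nodes {
			byIP[n.InternalIP] = ig
		}
	}
	return byIP
}

// podInstanceGroup returns the instance group hosting the pod. It returns nil
// for system-cluster-critical pods (their failures are treated as cluster-wide)
// and for pods whose host IP is not in the map.
func podInstanceGroup(p Pod, byIP map[string]*InstanceGroup) *InstanceGroup {
	if p.PriorityClassName == "system-cluster-critical" {
		return nil
	}
	return byIP[p.HostIP]
}

func main() {
	nodes := &InstanceGroup{Name: "nodes"}
	byIP := buildNodeIPMap(map[*InstanceGroup][]Node{
		nodes: {{Name: "node-a", InternalIP: "10.0.0.5"}},
	})

	app := Pod{Name: "app", HostIP: "10.0.0.5"}
	if ig := podInstanceGroup(app, byIP); ig != nil {
		fmt.Println(app.Name, "runs in instance group", ig.Name)
	}

	dns := Pod{Name: "dns", HostIP: "10.0.0.5", PriorityClassName: "system-cluster-critical"}
	fmt.Println(dns.Name, "attributed to a group:", podInstanceGroup(dns, byIP) != nil)
}
```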
Kubernetes Prow Robot 01b17be97e Merge pull request #10221 from eddycharly/fix-validation
Fix cluster validation dependency on local kubeconfig
2020-11-14 14:17:03 -08:00
Charles-Edouard Brétéché 116af0c74b pass host only instead of the whole config 2020-11-12 08:37:51 +01:00
Charles-Edouard Brétéché ee2b25e561 update bazel 2020-11-11 21:19:40 +01:00
Charles-Edouard Brétéché 709e1b6cbd Fix cluster validation dependency on local kubeconfig 2020-11-11 21:11:54 +01:00
John Gardiner Myers c2434a2e08 Remove components from cluster validation 2020-11-10 23:36:46 -08:00
Bharath Vedartham f99c04fafa validate_cluster_test: Update validate_cluster_tests
This commit fixes the unit tests for validate_cluster to reflect the addition of the new
InstanceGroup field in struct ValidationError
2020-10-31 19:16:54 +05:30
Bharath Vedartham 49f2a0e10a validate_cluster: Add InstanceGroup field to ValidationError struct
The InstanceGroup field in the ValidationError struct is an optional field meant
to indicate the InstanceGroup that reported the failure. This field either
holds a pointer to the instance group that caused the validation error or is
nil, which indicates that we were unable to determine the instance group to which
this failure should be attributed.

This field is mainly used to identify whether a failure is worth waiting for
when validating a particular instance group.
2020-10-31 19:16:42 +05:30
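A rough sketch of how an optional InstanceGroup pointer can drive the "worth waiting for" decision described in 49f2a0e10a. The types below are simplified, and `failuresFor` is a hypothetical helper. Keeping unattributed failures (nil InstanceGroup) is an assumption made here to stay conservative; the commit message does not say how nil is handled.

```go
package main

import "fmt"

// InstanceGroup and ValidationError are simplified, hypothetical versions of
// the kops types.
type InstanceGroup struct{ Name string }

type ValidationError struct {
	Name    string
	Message string
	// InstanceGroup is optional: nil means the failure could not be
	// attributed to any instance group.
	InstanceGroup *InstanceGroup
}

// failuresFor returns the failures that matter when waiting on target.
// Assumption: failures with a nil InstanceGroup are kept, because they cannot
// be ruled out as irrelevant.
func failuresFor(target *InstanceGroup, all []ValidationError) []ValidationError {
	var out []ValidationError
	for _, f := range all {
		if f.InstanceGroup == nil || f.InstanceGroup == target {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	nodes := &InstanceGroup{Name: "nodes"}
	other := &InstanceGroup{Name: "other"}

	all := []ValidationError{
		{Name: "node-a", Message: "node not ready", InstanceGroup: nodes},
		{Name: "node-b", Message: "node not ready", InstanceGroup: other},
		{Name: "pod-x", Message: "pod pending"}, // unattributed
	}

	for _, f := range failuresFor(nodes, all) {
		fmt.Println(f.Name, f.Message)
	}
}
```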
John Gardiner Myers ca241a5193 Don't require PriorityClassName to pass missing-static-pod checks 2020-10-13 22:42:11 -07:00
Ole Markus With 0ec71686b9 Refactor cloudinstancegroupmember in a more independent cloud instance representation
Apply suggestions from code review

Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>
2020-08-30 21:37:03 +02:00
Peter Rifel 4d9f0128a3 Upgrade to klog2
This splits up the kubernetes 1.19 PR to make it easier to keep up to date until we get it sorted out.
2020-08-16 20:56:48 -05:00
John Gardiner Myers a760080f01 Prefer the GA label for node zone 2020-06-14 16:35:57 -07:00
Kubernetes Prow Robot a454f0ff83 Merge pull request #9118 from johngmyers/validate-missing-asg
Return cluster validation failure if ASG missing
2020-05-13 14:36:23 -07:00
John Gardiner Myers 154833e652 Fail cluster validation if too few nodes for ig's target size 2020-05-12 22:28:26 -07:00
John Gardiner Myers 23d48f01d6 Return cluster validation failure if ASG missing 2020-05-11 21:19:02 -07:00
John Gardiner Myers 06376302e4 Don't test static pods on non-ready nodes 2020-05-02 22:09:53 -07:00
John Gardiner Myers c524290f9e Test more static pods during cluster validation 2020-05-02 22:09:53 -07:00
Justin Santa Barbara 31bb16d4d1 Add context.Context to most signatures
The client-go signature for most methods adds a context.Context
object, and also makes Options mandatory.  Feed a context.Context
through many of our methods (but use context.TODO to
stop it getting totally out of hand!)
2020-04-11 14:44:17 -04:00
Kubernetes Prow Robot a210ec9649 Merge pull request #8446 from johngmyers/validate-priority
Use PriorityClassName instead of namespace in cluster validation
2020-03-11 10:09:37 -07:00
John Gardiner Myers 1b7c5139e0 Merge branch 'master' into surge 2020-03-03 17:53:18 -08:00
John Gardiner Myers a99ef7c8d2 Handle Unknown pod phase in cluster validation 2020-02-25 21:04:40 -08:00
John Gardiner Myers c557289c4b Use PriorityClassName instead of namespace in cluster validation 2020-02-25 21:04:37 -08:00
John Gardiner Myers 8148f2da69 Fail cluster validation if a master missing kube-controller-manager 2020-02-20 21:50:11 -08:00
John Gardiner Myers be12d88cc3 Detached instances don't count against instancegroup minimums 2020-01-27 20:15:11 -08:00
John Gardiner Myers 80dc001b23 Determine node role from instancegroup spec 2019-12-18 21:47:16 -08:00
John Gardiner Myers 92e8545902 Increase validation test coverage 2019-12-03 15:56:56 -08:00
John Gardiner Myers 21694bd545 Make validation assertions more concise 2019-12-03 15:56:53 -08:00
John Gardiner Myers fa2a651666 Test validation through the public interface 2019-12-03 15:56:08 -08:00
John Gardiner Myers bd4e1277ae Pass the cloud object to validator from caller 2019-11-13 22:19:55 -08:00
John Gardiner Myers 55f4fcb419 Extract the list of instance groups earlier in validation 2019-11-13 22:08:52 -08:00