Commit Graph

1931 Commits

Author SHA1 Message Date
justinsb 826a778f58 tests: add tests for kubectl get assets 2025-05-02 14:58:19 -04:00
justinsb e1afbab608 tests: verify that we can marshal tasks to json 2025-03-27 08:03:16 -04:00
Kubernetes Prow Robot 7c52ef7b74 Merge pull request #17274 from zetaab/feat/fixrollingupdatetime
make --admin configurable to rolling-update
2025-02-24 11:00:30 -08:00
Peter Rifel 5ac11aa55a Cleanup logging for reconcile cluster 2025-02-23 21:04:31 -06:00
Jesse Haka d5cea90a82 make --admin configurable to rolling-update 2025-02-22 10:01:57 +02:00
justinsb 3ea73f47f8 Better dumping via private IP when bastion is not set
Previously this would always fail in a confusing way,
regardless of whether we had connectivity,
because we tried to connect to an empty-string host.

Now we are more explicit about the error,
and will at least try to connect directly.
2025-02-19 08:25:33 -05:00
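The behavior described in the commit above can be illustrated with a minimal Go sketch. It is not the kOps implementation: `dumpRoute` and `chooseRoute` are hypothetical names, and the sketch only assumes that a dumper knows an optional bastion host and a node's private IP.

```go
package main

import (
	"errors"
	"fmt"
)

// dumpRoute describes how a hypothetical dumper would reach a node.
type dumpRoute struct {
	Host       string // host to dial
	ViaBastion bool   // true when traffic goes through a bastion
}

// chooseRoute prefers the bastion when one is configured and otherwise
// falls back to dialing the node's private IP directly. It never returns
// an empty host; that case is reported as an explicit error instead.
func chooseRoute(bastionHost, privateIP string) (dumpRoute, error) {
	if bastionHost != "" {
		return dumpRoute{Host: bastionHost, ViaBastion: true}, nil
	}
	if privateIP == "" {
		return dumpRoute{}, errors.New("no bastion configured and node has no private IP")
	}
	return dumpRoute{Host: privateIP}, nil
}

func main() {
	route, err := chooseRoute("", "10.0.0.12")
	fmt.Println(route.Host, route.ViaBastion, err)
}
```

Returning an error for the empty-host case keeps the failure at the point where the decision is made, rather than surfacing later as a confusing dial error.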
justinsb 5c2c304b7f Remove reconcile flag from `kops update`
We have `kops reconcile`, and it's confusing having both.

We didn't ship the --reconcile flag in any released version.
2025-01-20 10:35:24 -05:00
justinsb 284b15be19 Support strong-typing for --target values
A small cleanup that makes our code a little more robust.
2025-01-19 09:21:05 -05:00
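As an illustration of what strong typing for a flag value can look like in Go, here is a minimal sketch. The type name, the constant names, the `ParseTarget` helper, and the set of accepted values are assumptions made for the example, not taken from the commit.

```go
package main

import (
	"flag"
	"fmt"
)

// Target is a typed string, so arbitrary strings cannot be passed
// where a validated target is expected.
type Target string

// The accepted values below are assumed for illustration.
const (
	TargetDirect    Target = "direct"
	TargetTerraform Target = "terraform"
	TargetDryRun    Target = "dryrun"
)

// ParseTarget converts user input into a Target, rejecting unknown values.
func ParseTarget(s string) (Target, error) {
	switch t := Target(s); t {
	case TargetDirect, TargetTerraform, TargetDryRun:
		return t, nil
	default:
		return "", fmt.Errorf("unknown target %q", s)
	}
}

// Set and String make *Target satisfy the standard library's flag.Value.
func (t *Target) Set(s string) error {
	parsed, err := ParseTarget(s)
	if err != nil {
		return err
	}
	*t = parsed
	return nil
}

func (t *Target) String() string {
	if t == nil {
		return ""
	}
	return string(*t)
}

func main() {
	target := TargetDirect
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.Var(&target, "target", "where to apply changes")
	err := fs.Parse([]string{"--target=dryrun"})
	fmt.Println(target, err)
}
```

Validation happens once, at parse time, so the rest of the code can switch on the typed constants without re-checking strings.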
Kubernetes Prow Robot 961a786b65 Merge pull request #17214 from justinsb/reconcile_dryrun_should_be_an_update
reconcile: if --yes is not provided, print the same output as `update cluster` does
2025-01-15 04:20:33 -08:00
justinsb 48f12bed83 reconcile: if --yes is not provided, print the same output as `update cluster` does 2025-01-15 06:18:26 -05:00
justinsb f2d4eeb104 reconcile: wait for apiserver to respond before trying rolling-update
The rolling-update requires the apiserver (when called without --cloudonly),
so reconcile should wait for apiserver to start responding.

Implement this by reusing "validate cluster", but filtering to only the instance groups
and pods that we expect to be online.
2025-01-13 17:47:48 -05:00
Ciprian Hacman eac132daae Add IPv6 support for kindnet 2025-01-09 17:50:06 +02:00
Antonio Ojea f2c239dd81 add kindnet network plugin
add kindnet as an experimental network addon

containerd added a requirement to use the loopback CNI plugin.
kindnet provides that capability, and containerd has not required it
since containerd/containerd/pull/10238.

Change-Id: I1397a90186885b02e98b5ffa444fe629c1046757
2025-01-08 01:09:37 +00:00
Kubernetes Prow Robot 7af9770c59 Merge pull request #17155 from justinsb/build_our_own_rest_config_2
chore: generate kubeconfig on the fly
2025-01-06 10:30:16 +01:00
Ciprian Hacman eaf796c3c0 Remove support for K8s 1.26 in kOps 1.32 2025-01-04 15:01:41 +02:00
Kubernetes Prow Robot 51db52f025 Merge pull request #17154 from justinsb/build_our_own_rest_config
chore: refactor factory to accept a cluster
2024-12-28 09:08:12 +01:00
justinsb 324117cc52 chore: generate kubeconfig on the fly
Some kOps actions require connecting to the cluster, but
we don't always have a kubeconfig available.

This commit adds a function to generate a client config on the fly
(including a certificate) when needed.
2024-12-27 16:37:59 -05:00
justinsb 859a9fd9f1 chore: refactor factory to accept a cluster
This should allow us to build our own rest config in future,
rather than relying on the kubeconfig being configured correctly.

To do this, we need to stop sharing the factory between the channels
and kops commands.
2024-12-27 15:36:37 -05:00
justinsb ebcfebe50e chore: add context to rolling update functions
Move it out of the struct, and into the function parameters.

This is more idiomatic Go.
2024-12-27 14:22:51 -05:00
justinsb ab613ff114 Add `kops reconcile cluster` command
This all-in-one command is a replacement for having to run multiple commands,
while still respecting the version skew policy.

It does the same thing as `kops update cluster --reconcile`:

* Updates the control plane nodes
* Does a rolling update of the control plane nodes
* Updates "normal" nodes and bastion nodes
* Does a rolling update of these nodes
* Prunes old resources that are no longer used
2024-12-05 12:27:08 -05:00
justinsb 7c95effdb4 Introduce --reconcile flag to kOps
Kubernetes 1.31 now stops nodes joining a cluster if the minor version
of the node is greater than the minor version of the control plane.

The addition of the instance-group-roles flag to update means that we
can now update / rolling-update the control plane first.  However, we
must now issue four commands:

* Update control plane
* Rolling update control plane
* Update nodes
* Rolling update nodes

This adds a flag to automate this process.  It is implemented by
executing those 4 steps in sequence.

Update is also smart enough to not update the nodes if this would
violate the skew policy, but we do this explicitly in the reconcile
command to be clearer and safer.
2024-12-05 11:36:13 -05:00
justinsb 4a63a118b2 Remove unused kubernetesVersion from AssetBuilder
This field is no longer used, and can be removed.
2024-12-04 08:57:17 -05:00
Rafael da Fonseca cc15357999 Automatically preserve kubelet supported version skew on worker nodes, while control plane is being updated
Co-authored-by: Peter Rifel <rifelpet@users.noreply.github.com>
2024-12-03 07:36:16 -05:00
justinsb b124625c62 toolbox dump: support dumping only k8s resources
Because metal does not support cloud-resource discovery, we need to
skip this in our metal tests.
2024-11-12 13:11:34 -05:00
Ciprian Hacman 1683894999 Allow updating the cluster one instance group at a time
Co-Authored-By: Ciprian Hacman <ciprianhacman@gmail.com>
2024-11-09 11:34:28 -05:00
justinsb 4be079c3e1 fix: upgrade cluster kubernetes selection logic
Currently it relies on us updating the channel version in two places,
but this makes `kops upgrade cluster` inconsistent with `kops update cluster`.

`kops update cluster` also tells us to run `kops upgrade cluster`,
which then might not recommend an upgrade.
2024-10-12 08:16:44 -04:00
justinsb 0963d73cc5 metal: initial support for adding hosts
The bulk of this work is implementing a clientset for use in kops-controller.
2024-09-18 09:03:43 -04:00
justinsb eda7c25fa9 metal: stub node identification for bare metal 2024-09-14 13:50:31 -04:00
Peter Rifel 3f3d0f11c5 Discover a bastion load balancer and use it for dumping artifacts 2024-09-06 19:34:31 -05:00
Peter Rifel 7581394f66 Give each controller unique names 2024-09-05 21:57:01 -05:00
justinsb 6a2a723bd2 refactor: give clear error message if challenge endpoint cannot be found
We were not handling this particularly clearly before, although it should only happen in development.
2024-08-29 05:32:51 -04:00
justinsb 3646a610b1 refactor: Move GetCloudProvider to cluster
This lets us use labels (or annotations), meaning we can experiment
with different clouds without changing the API.

We also add initial (experimental/undocumented) support for exposing a "Metal" provider.
2024-08-26 08:20:37 -04:00
Ciprian Hacman ec4e88a7f9 aws: Fix conversion for instance-selector flags 2024-08-25 20:00:50 +03:00
Kubernetes Prow Robot 2b39cbe78a Merge pull request #16746 from hakman/dependencies/update-1723183478
Fix verify-golangci-lint
2024-08-09 13:18:09 -07:00
Kubernetes Prow Robot 2a1f1f287d Merge pull request #16705 from hakman/gce-startup-script
gce: Add option to use startup script instead of user-data
2024-08-09 13:18:03 -07:00
Ciprian Hacman 689462af01 Fix verify-golangci-lint 2024-08-09 20:43:43 +03:00
pengbanban 771186943f chore: fix function name
Signed-off-by: pengbanban <pengbanban@aliyun.com>
2024-08-04 19:19:29 +08:00
justinsb 839914a0d0 refactor: support multiple podCIDRs in the node patch
Breaking down the metal support into bite-sized chunks.
2024-07-30 15:52:08 -04:00
Ciprian Hacman 599f97c88c Add option for enabling GCE startup script tests 2024-07-30 06:43:07 +03:00
Ciprian Hacman 9c597bb13a Update opentelemetry.io schema to v1.26.0 2024-07-13 17:21:47 +03:00
justinsb 65fe6dc3c4 refactor: ApplyClusterCmd clearly returns results
By having an explicit return value, we set ourselves up for better reuse.
2024-07-04 14:54:00 -04:00
github-actions 33b26b7130 Update dependencies 2024-06-08 15:30:32 +03:00
Ciprian Hacman 86f5d455e5 Release 1.30.0-alpha.1 (#16563)
* Release 1.30.0-alpha.1

* Update tests for K8s v1.30

* Remove mentions of K8s v1.24
2024-05-11 23:40:27 -07:00
justinsb fc049f40f3 Use embedded hashes for our well-known assets
Rather than downloading the hash every time, we can record the hashes
for our well-known assets and bake them into the kOps binary.  If the
hash is not baked in, we will continue to fall back to downloading it;
this is important for new k8s versions, or where the user specifies a
version of one of our well-known assets (such as containerd).
2024-05-09 10:27:33 -04:00
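The lookup order described in the commit above (embedded hash first, download as a fallback) can be sketched in Go as follows. The `knownHashes` table, its URL and hash value, and the `resolveHash` and `hashFetcher` names are all hypothetical; in kOps the embedded data and the download path will look different.

```go
package main

import (
	"errors"
	"fmt"
)

// knownHashes stands in for hashes baked into the binary at build time.
// The entry below is a made-up placeholder.
var knownHashes = map[string]string{
	"https://example.com/assets/containerd.tar.gz": "sha256:embedded-placeholder",
}

// hashFetcher downloads the hash for an asset URL (the slow path).
type hashFetcher func(url string) (string, error)

// resolveHash returns the embedded hash when one is recorded and otherwise
// falls back to fetch. The bool reports whether the embedded table was used.
func resolveHash(url string, fetch hashFetcher) (string, bool, error) {
	if h, ok := knownHashes[url]; ok {
		return h, true, nil
	}
	if fetch == nil {
		return "", false, errors.New("no embedded hash and no fetcher configured")
	}
	h, err := fetch(url)
	if err != nil {
		return "", false, fmt.Errorf("fetching hash for %s: %w", url, err)
	}
	return h, false, nil
}

func main() {
	// Fake fetcher so the example runs without network access.
	fetch := func(url string) (string, error) { return "sha256:downloaded-placeholder", nil }
	urls := []string{
		"https://example.com/assets/containerd.tar.gz",
		"https://example.com/assets/new-release.tar.gz",
	}
	for _, u := range urls {
		h, embedded, err := resolveHash(u, fetch)
		fmt.Println(h, embedded, err)
	}
}
```

Keeping the fallback means an asset that is missing from the table, such as a newly released version, still resolves, only more slowly.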
Ciprian Hacman caef84abbf Update integration tests to include node-problem-detector 2024-05-07 17:50:07 +03:00
Peter Rifel 62df0dba04 Migrate AWS Verifier to aws-sdk-go-v2 2024-05-05 08:39:20 -04:00
cuiyourong 0aebba8798 Fix function name in comment
Signed-off-by: cuiyourong <cuiyourong@gmail.com>
2024-04-23 18:07:40 +08:00
racequite 524f27c54c chore: fix function names in comment
Signed-off-by: racequite <quiterace@gmail.com>
2024-04-19 12:42:19 +08:00
Peter Rifel 907e58b7d4 Update instance selector to aws-sdk-go-v2 2024-04-13 16:03:30 -04:00
Peter Rifel f0c0c29121 Migrate EC2 Networking resource types to aws-sdk-go-v2 2024-04-13 16:01:39 -04:00