After upgrading Cilium to 1.8 via kops one of our clusters had a total
outage due to cilium reporting errors as below:
```
level=error msg="endpoint regeneration failed" containerID= datapathPolicyRevision=0 desiredPolicyRevision=1 endpointID=592 error="Failed to load tc filter: exit status 1" identity=40147 ipv4= ipv6= k8sPodName=/ subsys=endpoint
```
upon searching Cilium slack we found the below thread:
https://cilium.slack.com/archives/C1MATJ5U5/p1616400216167600
which recommended setting `enable-host-reachable-services` to true will
address the problems. We set the field and it fixed our issues too,
however we observed that kops does not have a means to configure this
hence this PR.
We will like to have this backported after it has been merged.
This change adds a new command and functionality for updating
instance group configuration via command line arguments. This
behavior mimics the `set cluster` command.
* refactor TargetLoadBalancer to use DNSTarget interface instead of LoadBalancer
* add LoadBalancerClass fields into api
* make api machinery
* WIP: Implemented API loadbalancer class, allowing NLB and ELB support on AWS for new clusters.
* perform vendoring related tasks and apply fixes identified from hack/
dissallow spotinst + nlb
remove reflection in status_discovery.go
Add precreated additional security groups to the Master nodes in case of NLB
Remove support for attaching individual instances to NLB; only rely on ASG attachments
Don't specify Classic loadbalancer in GCE integration test
* add utility function to the kops model context to make LoadBalancer comparisons simpler
* use DNSTarget interface when locating DNSName of API ELB
* wip: create target group task
* Consolidate TargetGroup tasks
* Use context helper for determining api load balancer type to avoid nil pointers
* Update NLB creation to use target group ARN from separate task rather than creating a TG in-line
* Address staticcheck and bazel failures
* Removing NLB Attachment tasks because they're not used since we switched to defining them as a part of the ASGs
* Address PR review feedback
* Only set LB Class field for AWS clusters, fix nil pointer
* Move target group attributes from NLB task to TG task, removing unused attributes
* Add terraform and cloudformation support for NLBs, listeners, and target groups
* Update integration test for NLB support
* Fix NLB name format to pass terraform validation
* Preserve security group rule names when switching ELB to NLB to reduce destructive terraform changes
* Use elbv2 enums and address some TODOs
* Set healthcheck values in target group
* Find TG tags, fix NLB name detection
* Fix more spurious changes reported by lifecycle integration test
* Fix spotinst validation, more code cleanup
* Address more PR feedback
* ReconcileTargetGroups unit test + more code simplification
* Addressing PR feedback Renaming task 1. awstasks.LoadBalancer -> awstasks.ClassicLoadBalancer
* Addressing PR feedback Renaming task: ELBName() -> CLBName() / LinkToELB() -> LinkToCLB()
* Addressing PR feedback: Various text changes
* fix export of kubecfg
* address TargetGroup should have the same name as the NLB
* should address error when fetching tags due to missing ARN
* Update expected and crds
* Add feature table to NLB docs
* Address more feedback and remove some TODOs that arent applicable anymore
* Update spotinst validation error message
Co-authored-by: Peter Rifel <pgrifel@gmail.com>
We tend to build cloud, call some method, and then build cloud over
again. It would be easier to just pass the first one along.
Passing along cloud would also make it easier to mock cloud.
We create a simple exec plugin command which can create and renew
short-lived admin credentials on the fly, essentially leveraging the
security of the underlying cloud credentials.
Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>
This means we no longer have to individually hard-code the `kops set`
fields, however we use the "language" we're now demonstrated.
We add tests to ensure we have parity with our existing (hard-coded)
setter logic.
This is causing problems with the Kubernetes 1.19 code-generator.
A nil entry in these slices wouldn't be valid anyways, so this should have no impact.
This requires passing a cloud object in additional places throughout the validation package and originating mostly from cmd/kops
This means that some kops commands now require valid cloud provider credentials, but I don't think this is an issue because the vast majority of use-cases already require the same cloud provider credentials in order to interact with the state store.
The client-go signature for most methods adds a context.Context
object, and also makes Options mandatory. Feed through a
context.Context through many of our methods (but use context.TODO to
stop it getting totally out of hand!)
pull-kops-e2e-kubernetes-aws — Job failed. [Details](https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/kops/7160/pull-kops-e2e-kubernetes-aws/1141685394924900352/)
```
I0620 12:37:27.976] /root/.cache/bazel/_bazel_prow/ae5d1f01453377487c630b230ced7d61/sandbox/linux-sandbox/836/execroot/__main__/pkg/commands/set_cluster.go:97:16: cluster.Spec.masterPublicName undefined (type "k8s.io/kops/pkg/apis/kops".ClusterSpec has no field or method masterPublicName, but does have MasterPublicName)
```
fixes typo
We upload to a location that includes the version, but we need to
specify the version in KOPS_BASE_URL. We expose an option to make
`kops version` more amenable to this scripting.
These shortcut commands make it easy to set enableEtcdTLS and
enableTLSAuth.
`kops set cluster cluster.spec.etcdClusters[*].enableEtcdTLS=true`
`kops set cluster cluster.spec.etcdClusters[*].enableTLSAuth=true`
Introduce an experimental kops set cluster command, for setting
individual fields in the same style as the kops create cluster
--override flags.
For now, feature flag gated by the same SpecOverrideFlag feature flag.
Also split out pkg/commands package to facilitate testing.