Since AWS does not resolve instance hostnames to IPv6 addresses, IPv6-only pods that talk to the kubelet API have to use the node IP, not the hostname. Thus we need to add the IPs to the kubelet server cert.
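A minimal sketch of what adding IPs to a server cert looks like with the Go standard library. The helper name `kubeletServerCertTemplate` and the sample addresses are assumptions for illustration, not the code in this change:

```go
package main

import (
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"net"
)

// kubeletServerCertTemplate is a hypothetical helper. It builds a certificate
// template whose SANs include the node hostname and every node IP, so that
// clients connecting by IP can still verify the serving certificate.
func kubeletServerCertTemplate(nodeName string, addrs []string) (*x509.Certificate, error) {
	tmpl := &x509.Certificate{
		Subject:     pkix.Name{CommonName: nodeName},
		DNSNames:    []string{nodeName},
		KeyUsage:    x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
		ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	for _, a := range addrs {
		ip := net.ParseIP(a)
		if ip == nil {
			return nil, fmt.Errorf("invalid IP address %q", a)
		}
		// IP SANs go in IPAddresses, not DNSNames.
		tmpl.IPAddresses = append(tmpl.IPAddresses, ip)
	}
	return tmpl, nil
}

func main() {
	// Example hostname and documentation-range addresses, chosen for illustration.
	tmpl, err := kubeletServerCertTemplate("ip-10-0-0-1.ec2.internal", []string{"10.0.0.1", "2001:db8::1"})
	if err != nil {
		panic(err)
	}
	fmt.Println(tmpl.DNSNames, tmpl.IPAddresses)
}
```

The template would still need to be signed by the cluster CA; that step is omitted here.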
The role names are checked in node bootstrap.
If profile names are provided, bootstrap will fail,
because profile names and role names do not always match in AWS IAM.
Sometimes we see the following error during a rolling update:
I1125 18:12:46.467059 165 instancegroups.go:340] Draining the node: "ip-X-X-X-X.X.compute.internal".
I1125 18:12:46.473365 165 instancegroups.go:359] deleting node "ip-X-X-X-X.X.compute.internal" from kubernetes
I1125 18:12:46.476756 165 instancegroups.go:486] Stopping instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal", in group "X" (this may take a while).
E1125 18:12:46.523269 165 instancegroups.go:367] error deleting instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal": error deleting instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal": error deleting instance "i-XXXXXXXX": InvalidInstanceID.NotFound: The instance ID 'i-XXXXXXXXX' does not exist
status code: 400, request id: 91238c21-1caf-41eb-91d7-534d4ca67ed0
It's possible for the EC2 instance to have disappeared by the time it
was detached (it may have been a spot instance, for example).
In any case, we can't do much when we do not find an instance id, and
throwing this error during the update is not very user friendly.
As such, we can simply report and tolerate this problem instead of
exiting with a non-zero code. This is similar to how we handle missing
EC2 instances when updating an IG[1]
[1] https://github.com/kubernetes/kops/pull/594
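The tolerate-and-report behaviour described above can be sketched roughly as follows. The `codedError` interface and `fakeAWSError` type are local stand-ins (assumptions) for an SDK error that exposes an error code; `deleteInstance` and its `terminate` callback are hypothetical names, not the functions in instancegroups.go:

```go
package main

import (
	"errors"
	"fmt"
)

// codedError is a local stand-in for an SDK error that carries an error code.
type codedError interface {
	error
	Code() string
}

// fakeAWSError is a hypothetical error type used only for this example.
type fakeAWSError struct{ code, msg string }

func (e *fakeAWSError) Error() string { return e.code + ": " + e.msg }
func (e *fakeAWSError) Code() string  { return e.code }

// isInstanceNotFound reports whether err (or anything it wraps) carries the
// InvalidInstanceID.NotFound code seen in the log above.
func isInstanceNotFound(err error) bool {
	var ce codedError
	return errors.As(err, &ce) && ce.Code() == "InvalidInstanceID.NotFound"
}

// deleteInstance reports a missing instance and carries on, but still
// returns any other failure to the caller.
func deleteInstance(id string, terminate func(string) error) error {
	err := terminate(id)
	if err == nil {
		return nil
	}
	if isInstanceNotFound(err) {
		fmt.Printf("instance %q not found, assuming it is already gone\n", id)
		return nil
	}
	return fmt.Errorf("error deleting instance %q: %w", id, err)
}

func main() {
	gone := func(string) error {
		return &fakeAWSError{code: "InvalidInstanceID.NotFound", msg: "instance does not exist"}
	}
	fmt.Println(deleteInstance("i-XXXXXXXX", gone) == nil)
}
```

Matching on the error code rather than the message text keeps the check stable if the message wording changes.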
* refactor TargetLoadBalancer to use DNSTarget interface instead of LoadBalancer
* add LoadBalancerClass fields into api
* make api machinery
* WIP: Implemented API loadbalancer class, allowing NLB and ELB support on AWS for new clusters.
* perform vendoring related tasks and apply fixes identified from hack/
disallow spotinst + nlb
remove reflection in status_discovery.go
Add precreated additional security groups to the Master nodes in case of NLB
Remove support for attaching individual instances to NLB; only rely on ASG attachments
Don't specify Classic loadbalancer in GCE integration test
* add utility function to the kops model context to make LoadBalancer comparisons simpler
* use DNSTarget interface when locating DNSName of API ELB
* wip: create target group task
* Consolidate TargetGroup tasks
* Use context helper for determining api load balancer type to avoid nil pointers
* Update NLB creation to use target group ARN from separate task rather than creating a TG in-line
* Address staticcheck and bazel failures
* Removing NLB Attachment tasks because they're not used since we switched to defining them as a part of the ASGs
* Address PR review feedback
* Only set LB Class field for AWS clusters, fix nil pointer
* Move target group attributes from NLB task to TG task, removing unused attributes
* Add terraform and cloudformation support for NLBs, listeners, and target groups
* Update integration test for NLB support
* Fix NLB name format to pass terraform validation
* Preserve security group rule names when switching ELB to NLB to reduce destructive terraform changes
* Use elbv2 enums and address some TODOs
* Set healthcheck values in target group
* Find TG tags, fix NLB name detection
* Fix more spurious changes reported by lifecycle integration test
* Fix spotinst validation, more code cleanup
* Address more PR feedback
* ReconcileTargetGroups unit test + more code simplification
* Addressing PR feedback: Renaming task: awstasks.LoadBalancer -> awstasks.ClassicLoadBalancer
* Addressing PR feedback: Renaming task: ELBName() -> CLBName() / LinkToELB() -> LinkToCLB()
* Addressing PR feedback: Various text changes
* fix export of kubecfg
* address TargetGroup should have the same name as the NLB
* should address error when fetching tags due to missing ARN
* Update expected and crds
* Add feature table to NLB docs
* Address more feedback and remove some TODOs that aren't applicable anymore
* Update spotinst validation error message
Co-authored-by: Peter Rifel <pgrifel@gmail.com>
This should be much easier to start and to get under testing; it only
works with a load balancer, it sets the apiserver into anonymous-auth
allowed, it grants the anonymous auth user permission to read our jwks
tokens. But it shouldn't need a second bucket or anything of that
nature.
Co-authored-by: John Gardiner Myers <jgmyers@proofpoint.com>
We don't call klog.InitFlags yet, because that will cause a flag
redefinition error until we get everyone to stop using glog. That
will happen when we update to k8s 1.13.
Since the merge of https://github.com/kubernetes/kops/pull/6277 the launch configuration or template is evaluated; whereas before the LC was just taken at face value, now the LC/LT is checked for existence. This causes an issue when rolling nodes where the LC has disappeared due to retention, and when terminating instances.
When running `make govet` on `go1.11beta1`, it complains about many things
related to invalid string formatting:
```
pkg/kubemanifest/visitor.go:35: Verbose.Infof format %s has arg v of wrong type bool
pkg/kubemanifest/visitor.go:40: Verbose.Infof format %s has arg v of wrong type float64
upup/pkg/fi/cloudup/alitasks/disk.go:76: Verbose.Info call has possible formatting directive %q
upup/pkg/fi/cloudup/alitasks/disk.go:91: Verbose.Info call has possible formatting directive %q
upup/pkg/fi/cloudup/alitasks/launchconfiguration.go:89: Verbose.Info call has possible formatting directive %q
upup/pkg/fi/cloudup/alitasks/loadbalancer.go:71: Verbose.Info call has possible formatting directive %q
upup/pkg/fi/cloudup/alitasks/loadbalancer.go:125: Verbose.Info call has possible formatting directive %q
upup/pkg/fi/cloudup/alitasks/scalinggroup.go:71: Verbose.Info call has possible formatting directive %q
dns-controller/pkg/dns/dnscontroller.go:603: Verbose.Infof format %s has arg records of wrong type []dns.Record
dns-controller/cmd/dns-controller/main.go:184: Verbose.Info call has possible formatting directive %q
pkg/acls/s3/storage.go:62: Verbose.Infof format %q arg u.String is a func value, not called
pkg/apis/kops/validation/validation_test.go:199: T.Fatalf format %q has arg config of wrong type *k8s.io/kops/pkg/apis/kops.DockerConfig
pkg/resources/aws/aws.go:1306: Warning call has possible formatting directive %q
pkg/resources/aws/aws.go:1313: Warning call has possible formatting directive %v
upup/pkg/fi/cloudup/aliup/ali_cloud.go:218: Verbose.Info call has possible formatting directive %q
upup/pkg/fi/cloudup/aliup/ali_cloud.go:290: Verbose.Info call has possible formatting directive %q
upup/pkg/fi/fitasks/keypair.go:266: Errorf format %q has arg e.Name of wrong type *string
upup/pkg/fi/files_owner.go:56: Infof format %s has arg group of wrong type *fi.Group
upup/pkg/fi/users.go:57: Warning call has possible formatting directive %q
upup/pkg/fi/users.go:63: Warning call has possible formatting directive %q
upup/pkg/fi/users.go:68: Warning call has possible formatting directive %q
upup/pkg/fi/users.go:129: Warning call has possible formatting directive %q
upup/pkg/fi/users.go:135: Warning call has possible formatting directive %q
upup/pkg/fi/nodeup/nodetasks/file.go:313: Errorf format %q has arg e.Mode of wrong type *string
upup/pkg/fi/cloudup/awsup/aws_cloud.go:1021: Warningf format %q reads arg #2, but call has 1 arg
upup/pkg/fi/cloudup/awsup/aws_cloud.go:1025: Warningf format %q reads arg #2, but call has 1 arg
```
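For illustration, a small sketch of the kind of fix these vet warnings call for. The function `describe` and its arguments are made up for this example and do not come from the files listed above:

```go
package main

import "fmt"

// describe formats a bool and an optional string without tripping go vet.
//
// A version that vet would flag looks like:
//
//	fmt.Sprintf("enabled=%s name=%q", enabled, name)
//
// because %s does not accept a bool and %q does not accept a *string.
func describe(enabled bool, name *string) string {
	n := "<nil>"
	if name != nil {
		n = *name // dereference so that %q receives a string
	}
	// %v is valid for any type, including bool.
	return fmt.Sprintf("enabled=%v name=%q", enabled, n)
}

func main() {
	name := "master"
	fmt.Println(describe(true, &name))
	fmt.Println(describe(false, nil))
}
```

The "possible formatting directive" warnings have a similar remedy: switch the call to its `f` variant (for example `Infof` instead of `Info`) when the string really is a format.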
Where we can identify the SSH user to use, we can include it in kops
toolbox dump. This is a precursor to trying to better understand
what's in an image (warnings about NVME or network drivers, or showing
the correct SSH username)
Automatic merge from submit-queue.
Add --subnets and --utility-subnets to kops create cluster
This change adds two new options to `kops create cluster`
When specifying `--vpc`, `--subnets` can be specified as an unordered array of subnet ids. Kops will then look up the zones of the subnets to find which zone to add the subnet id to.
If `--topology private` is also specified, `--utility-subnets` can similarly be specified.
~If a zone was specified but a subnet wasn't given that matches the zone, then the subnet will be allocated a CIDR with the current behaviour.~ This case fails validation here 7bd0a6a703/pkg/apis/kops/validation/validation.go (L151)
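The zone lookup described above could be sketched like this. It assumes the subnet-to-zone mapping has already been fetched from the cloud and is passed in as `zoneOf`; the function name `assignSubnetsToZones` is hypothetical and this is not the code in the PR:

```go
package main

import "fmt"

// assignSubnetsToZones groups an unordered list of subnet IDs by the zone
// each one lives in. zoneOf stands in for the result of a cloud lookup
// (subnet ID -> zone). A requested zone with no matching subnet is an
// error, mirroring the validation failure mentioned above.
func assignSubnetsToZones(zones, subnetIDs []string, zoneOf map[string]string) (map[string][]string, error) {
	byZone := make(map[string][]string)
	for _, id := range subnetIDs {
		zone, ok := zoneOf[id]
		if !ok {
			return nil, fmt.Errorf("subnet %q not found", id)
		}
		byZone[zone] = append(byZone[zone], id)
	}
	for _, z := range zones {
		if len(byZone[z]) == 0 {
			return nil, fmt.Errorf("no subnet provided for zone %q", z)
		}
	}
	return byZone, nil
}

func main() {
	// Placeholder subnet IDs and zones, chosen for illustration.
	zoneOf := map[string]string{
		"subnet-111111": "us-east-1a",
		"subnet-222222": "us-east-1b",
	}
	byZone, err := assignSubnetsToZones(
		[]string{"us-east-1a", "us-east-1b"},
		[]string{"subnet-222222", "subnet-111111"},
		zoneOf,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(byZone)
}
```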
I can add unit tests and docs changes if required, but I am keen to get feedback before I proceed much further.
I have only added support for AWS.
I have tested this by running a command similar to this:
```bash
kops create cluster \
--zones=us-east-1a,us-east-1b,us-east-1c \
--topology private \
--master-zones=us-east-1a,us-east-1b,us-east-1c \
--vpc $vpc_id \
--subnets subnet-111111,subnet-222222,subnet-333333 \
--utility-subnets subnet-444444,subnet-555555,subnet-666666 \
$cluster_hosted_zone_name
```
And the cluster spec was as expected.
Works around nil SleepDelay problem: latest aws-sdk-go (in k8s 1.9 and
kops 1.8) has updated SleepDelay logic; fix is in
https://github.com/kubernetes/kubernetes/pull/55307 but that is only in
1.9.
Set the SleepDelay to work around the problem.
Automatic merge from submit-queue.
Add Zones field to InstanceGroup
The Zones field can specify zones where they are not specified on a
Subnet, for example on GCE where we have regional subnets.