Automatic merge from submit-queue.
DNS Controller Optional
The current implementation enforces a dns-controller is running; given the user can switch the make the kube-apiserver server Internal and then reuse the dns for the masterInternalName; this effectlively removes the need to run the service (assuming your not using it for pods, node and service dns)
- adding a disableDnsController to the ExternalDNS spec provides a toggle on the addon (name is definitely up for debate)
- the default behaviour remains, the dns-controller is always pushed as an addon
Works around nil SleepDelay problem: latest aws-sdk-go (in k8s 1.9 and
kops 1.8) has updated SleepDelay logic; fix is in
https://github.com/kubernetes/kubernetes/pull/55307 but that is only in
1.9.
Set the SleepDelay to work around the problem.
Renamed the k8s-1.8 manifest to a k8s-1.7. This is required because of config
change that occurs between k8s 1.6 and k8s 1.7. This refactor will also
be re-used when Calico Kubernetes data source support is added to kops.
Updated bootstrapchannelbuilder with the new Calico version numbers.
Automatic merge from submit-queue.
Respect the shared tag when deleting route tables
Fixes#3828.
Modifies the `buildTrackerForRouteTable` function (used by `ListRouteTables`) to set the `Shared` field of each returned route table resource, based on the presence of the `kubernetes.io/cluster/<clustername>: shared` tag. This prevents route tables with this tag from being deleted.
WIP while I add some more tests.
Automatic merge from submit-queue.
Implement volume task for Openstack platform
Implement volume task to create volume for ETCD cluster.
Which issue this PR fixes: #3886
The current implementation requires enforces a dns-controller is running; given the user can switch the make the kube-apiserver server Internal and then reuse the dns for the masterInternalName; this effectlively removes the need to run the service (assuming your not using it for pods, node and service dns)
- adding a disableDnsController to the ExternalDNS spec provides a toggle on the addon (name is definitely up for debate)
- the default behaviour remains, the dns-controller is always pushed as an addon
Automatic merge from submit-queue.
add openstack cloud provider
Add an Openstack cloud provider. It does not implement all the interfaces of fi.Cloud, hence, can not create a cluster, but it can pass the work flow of creating cluster for the command like "kops create cluster --cloud openstack --zones nova -v 15 --target direct --yes myoscluster4.k8s.local"
Which issue this PR fixes: #3819
* Limit each CNI provider to 100m
* Remove CPU limits - they cause serious problems
(https://github.com/kubernetes/kubernetes/issues/51135), but this also
makes the CPU allocation less problematic.
* Bump versions and start introducing the `-kops.1` suffix preemptively.
* Upgrade flannel to 0.9.0 as it fixes a lot.
Automatic merge from submit-queue.
kube-router: remove beta annotation versions (deprectated in 1.8) of init container
kube-router: remove beta annotation versions (deprectated in 1.8) of init container and move init container to spec section
- adding a fix to the building of the argument, as the double quote cause an yaml parsing error
error building tasks: error remapping manifest addons/dns-controller.addons.k8s.io/pre-k8s-1.6.yaml: error parsing yaml: error converting YAML to JSON: yaml: line 37: did not find expected key
This lets us configure cross-project permissions while ourselves needing
minimal permissions, but also gives us a nice hook for future lockdown
of object-level permissions.
This ensures that the cluster can read the kops state store files, even
if the GCS bucket is in a different project.
We automatically set up an IAM access policy that grants access.
Automatic merge from submit-queue.
Implement DigitalOcean Droplet FI Task
Implements cloudup fi tasks for DigitalOcean droplets. It makes a few assumptions to reduce the size of this PR, those will be addressed in future PRs.
Also does some cleanup in the DigitalOcean `dns` package.
Automatic merge from submit-queue.
UsePolicyConfigMap for kube-scheduler
Continued from #3546
In this version, a single option `usePolicyConfigMap` is added that will install scheduler.addons.k8s.io, which contains a default configmap.
Automatic merge from submit-queue.
adding kubernetes core rate limiter handlers
This PR is re-using the handlers from the k8s core project, to create a global rate limiting.
This work starts work on https://github.com/kubernetes/kops/issues/3471
Automatic merge from submit-queue.
Initial aggregation work
Create the keypairs, which are supposed to be signed by a different CA.
Set the `--requestheader-...` flags on apiserver.
Fix#3152Fix#2691
* Stop setting the Name tag on a shared subnet/vpc
* Stop setting the legacy KubernetesCluster tag on a shared subnet/vpc
that is new enough (>=1.6); we rely on the shared tags instead
* Set tags on shared subnets; i.e. we _do_ set the shared tag on a
shared subnet; that is important for ELBs
* Set tags on shared VPCs; i.e. we _do_ set the shared tag on a shared
VPC; that is not used but consistent with subnets.
* Add tests for shared subnet
Automatic merge from submit-queue.
Add Cloud Controller Manager addon
This adds the CCM addon for the Kubernetes cluster.
This is a follow-up PR to https://github.com/kubernetes/kops/pull/3408.
cc @chrislovecnm @andrewsykim
Automatic merge from submit-queue.
Add Calico v2.5 support for Kubernetes v1.8+
Added support for Canal (Calico) v2.5.1, which is required to work with Kubernetes v1.8.0+.
Older versions of Calico relied on ThirdPartyResources API to store it's config data, however this is now fully deprecated in Kubernetes v1.8 and has moved over to CustomResourceDefinitions (CRD). Calico v2.5+ has been updated to use CRD, however there is a manual upgrade process involved to migrate the configuration data across: https://github.com/projectcalico/calico/blob/master/upgrade/v2.5/README.md
including a Weave Net template for Kubernetes 1.7 and above which adds
a volume-mount for the iptables lock file, which avoids collisions
between Weave components and kube-proxy that would result in a
half-configured Weave network.
This is only for version 1.7 and above because it requires the change
in https://github.com/kubernetes/kubernetes/issues/47212
Automatic merge from submit-queue.
Use system:kube-router User for clusterrole binding
Kube-router as it provides service proxy as well, it has a chicken-egg problem (can not
access api server till it can setup service proxy), so service account are not usable. certificate generated for kube-router has CN `system:kube-router`, so user `system:kube-router` need to be given necessary RBAC permissions
Fixes#3463
Automatic merge from submit-queue.
Initial bazel support
Builds on the 1.8 version bump
The "trick" is to strip the BUILD & BUILD.bazel files from the vendor-ed deps.
Will rebase after 1.8 version bump merges.
Automatic merge from submit-queue.
Update kube-dns to 1.14.5 for CVE-2017-14491
As described: https://security.googleblog.com/2017/10/behind-masq-yet-more-dns-and-dhcp.html
Not sure if it'd be possible to cut a new 1.7 release with this or something to give people a quick fix.
Current work around would be to manually update the addons in s3. For those who may reference this, simply upgrading to 1.7.7 will not fix this in kops.
### Edit
~ @chrislovecnm
Please see https://github.com/kubernetes/kops/issues/3512 for more information on how to address these concerns with current kops releases. We are still in the process of testing this release of kube-dns, which is a very critical component of kubernetes.
Automatic merge from submit-queue.
Add Zones field to InstanceGroup
The Zones field can specify zones where they are not specified on a
Subnet, for example on GCE where we have regional subnets.
Automatic merge from submit-queue. .
Add external-dns as addon.
This superseeds route53mapper as it has multicloud support documentation and YAML taken from https://github.com/kubernetes-incubator/external-dns
Automatic merge from submit-queue. .
Create GCE networks in auto mode, not legacy mode
auto mode allows for conversion to custom mode at the API level, and
legacy mode is deprecated.
Automatic merge from submit-queue. .
Support for using hostPort when using calico
For enabling hostPort we need to turn on portmap cni plugin.
In this PR I updated calico and calico-cni images to latest version which already includes the portmap binary, and then I only needed to modify the cni config file to enable it and change its extension from .conf to .conflist.
This is related to:
https://github.com/kubernetes/kops/issues/3132
I think we should do the same for kube-router, flannel and weave (are there any other cni plugin supported by kops?)
Automatic merge from submit-queue. .
DNS Controller Limitation
The current implementation does not place any limitation on the dns annontation which the dns-controller can consume. In a multi-tenented environment we have to ensure certain safe guards are met, so users can't by accident or intentionally alter our internal dns. Note; the current behaviour has not been changed;
- added the --watch-namespace option to the dns controller and WatchNamespace to the spec
- cleaned up area of the code where possible or related
- fixed an vetting issues that i came across on the journey
- renamed the dns-controller watcher files
The current implementation does not place any limitation on the dns annontation which the dns-controller can consume. In a multi-tenented environment was have to ensure certain safe guards are met, so users can't byt accident or intentionally alter our internal dns. Note; the current behaviour has not been changed;
- added the --watch-namespace option to the dns controller and WatchNamespace to the spec
- cleaned up area of the code where possible or related
- fixed an vetting issues that i came across on the journey
- renamed the dns-controller watcher files
Automatic merge from submit-queue. .
Support additional config options for Canal Networking
Add support for additional global and iptables configuration options within the Canal Networking Spec: https://docs.projectcalico.org/v2.4/reference/felix/configuration
- **ChainInsertMode:** Controls whether Felix inserts rules to the top of iptables chains, or appends to the bottom. Leaving the default option is safest to prevent accidentally breaking connectivity. Default: 'insert' (other options: 'append')
- **PrometheusMetricsEnabled:** Set to enable the experimental Prometheus metrics server (default: false)
- **PrometheusMetricsPort:** TCP port that the experimental Prometheus metrics server should bind to (default: 9091)
- **PrometheusGoMetricsEnabled:** Enable Prometheus Go runtime metrics collection
- **PrometheusProcessMetricsEnabled:** Enable Prometheus process metrics collection
Automatic merge from submit-queue
Add romana to built-in CNI options
This PR adds `romana` as a networking option for kops.
It installs the latest "preview" release of Romana v2.0, which provides the expected features in terms of IP allocations and route configuration. Network policy features are being ported to 2.0 and will be in the final release. (We intend to submit a followup PR for kops as part of that rolling out that release.)
Note: in this setup, we're using the etcd cluster that kops deploys for k8s. This isn't ideal, but some possibilities (eg: StatefulSets) aren't practical for the CNI itself, and creating a parallel etcd cluster via manifests seemed to be a more-intrusive approach than using the existing one.
If this is a concern or problem, then I'm very open to discussing and implementing it based on your suggestions.
Also, some functionality is exclusive to AWS environments. Other cloud platforms are on Romana's roadmap but not developed yet. Let me know that restriction needs to be enforced in code or directly documented.
Automatic merge from submit-queue
Flannel: change default backend type
We support udp, which has to the default for backwards-compatibility,
but also new clusters will now use vxlan.
Automatic merge from submit-queue
Toolbox template
Extending the current implementation of toolbox template to include multiple files and snippets. Note, I've removed the requirements for defaults as I think people should be forced to specifically pass them
- allowing the users to use a snippets directory for reusable templates
- allows the users to specify multiple templates files via multiple --template <path>, use a directory or both
- allows the users to specify multiple configuration files via multiple --values <path>, use a directory or both
- adding a safety check to ensure templates don't reference an unknown values
- fixing the vetting issues to the method YamlToJson -> YAMLToJSON
- as usual anything a saw on the journey which doesn't comply with go-vet got changed
Examples of a snippet
```YAML
hooks:
- name: some_service.service
manifest: |
{{ include "some_service.service" . | indent 6 }}
```
We currently use something similar to template our cluster and instances group documents, handling the differences between prod, ci and ephemeral
Extending the current implementation of toolbox template to include multiple files and snippets. Note, i've removed the requirements for defaults as I think people should be forced to specifically pass them.
- fixing the vetting iseues to the method YamlToJson -> YAMLToJSON
- adding a safety check to ensure templates don't reference an unknown value
- extending the unit test to ensure the above works on main and snippets
- include the ability to specify multiple configuration files, useful for common.yaml and prod.yaml etc
Requested Changes - Toolbox Templating
Added the requested changes
- moved the templater into it's own package rather than using base util
- moved to using the sprig library for additional template function
- @note: i couldn't find a native way in sprig to do snippets, also the i've overloaded the indent as it appears to do the indent on all lines rather than on the newline, meaning i'd have to shift my first line back by the indent to get it to work, which seems ugly
Automatic merge from submit-queue
Check actual EbsOptimized status during cluster update
Fixes#3313.
It seems like the actual EbsOptimized state of the LaunchConfiguration is not read during `kops update cluster` and always trigges a modification of instance-groups that have `rootVolumeOptimization: true`.
If any meaningful test can be created for this, please let me know.
This will allow us to set CIDRs for nodeport access, which in turn will
allow e2e tests that require nodeport access to pass.
Then add a feature-flagged flag to `kops create cluster` to allow
arbitrary setting of spec values; currently the only value supported is
cluster.spec.nodePortAccess
Automatic merge from submit-queue
Implementing GCE as an interface - modelling aws cloud provider
GCE and other cloud providers are structs instead of an interface. AWS cloud provider implements an interface. This PR refactors `GCECloud` as an interface, and creates `gceCloudImplementation`.
- [x] Need to e2e test
Automatic merge from submit-queue
Allow user defined endpoint to host action for Canal
Adds ability to define `Networking.Canal.DefaultEndpointToHostAction` in the Cluster Spec. This allows you to customise the behaviour of traffic routing from a pod to the host (after calico iptables chains have been processed). `ACCEPT` is the default value and is left as-is.
`If you want to allow some or all traffic from endpoint to host, set this parameter to “RETURN” or “ACCEPT”. Use “RETURN” if you have your own rules in the iptables “INPUT” chain; Calico will insert its rules at the top of that chain, then “RETURN” packets to the “INPUT” chain once it has completed processing workload endpoint egress policy.`
Kube-router was using --cluster-cidr flag to get the subnet allocated
for pod CIDR's. But now kube-router has the ability internally to infer
the CIDR allocated for the pod's by getting the information from
kubernetes API server node spec's
Automatic merge from submit-queue
Cluster / InstanceGroup File Assets
@chrislovecnm @justinsb ...
The current implementation does not make it ease to fully customize nodes before kube install. This PR adds the ability to include file assets in the cluster and instaneGroup spec which can be consumed by nodeup. Allowing those whom need (i.e. me :-)) greater flexibilty around their nodes. @Note, nothing is enforced, so unless you've specified anything everything is as the same
- updated the cluster_spec.md to reflect the changes
- permit users to place inline files into the cluster and instance group specs
- added the ability to template the files, the Cluster and InstanceGroup specs are passed into context
- cleaned up and missed comment, unordered imports etc along the journey
notes: In addition to this; need to look at the detecting the changes in the cluster and instance group spec. Think out loud perhaps using a last_known_configuration annotation, similar to kubernetes
Automatic merge from submit-queue
Delete old tags when cloudLabels / labels / taints are removed
If you remove custom cloudLabels/labels/taints from the cluster configuration, kops does not correctly update the AWS resources to delete the tags. This seems to be because it only calls the AWS API method `CreateOrUpdateTags`, which won't remove tags that aren't in the supplied list.
The current behaviour is that every `kops update cluster` will show a tag difference but never successfully apply the changes (remove the extra tags).
This PR will perform a diff of the current and expected tags, and call the `DeleteTags` API if there are any tags to delete.
Automatic merge from submit-queue
Set lifecycle on ElasticIP and NAT Gateway tasks to avoid spurious changes
Identified in issue #476
Extension of fixes within PR #3226
Automatic merge from submit-queue
starting work on file assets builder
I refactored to the dockerassets pkg to assetstasks, in order to not add yet another package. Added file copy task, that I have tested with s3 locally, but not certain how to add memfs tests.
Fixes: https://github.com/kubernetes/kops/issues/3086
The current implementation does not make it ease to fully customize nodes before kube install. This PR adds the ability to include file assets in the cluster and instaneGroup spec which can be consumed by nodeup. Allowing those whom need (i.e. me :-)) greater flexibilty around their nodes. @Note, nothing is enforced, so unless you've specified anything everything is as the same
- updated the cluster_spec.md to reflect the changes
- permit users to place inline files into the cluster and instance group specs
- added the ability to template the files, the Cluster and InstanceGroup specs are passed into context
- cleaned up and missed comment, unordered imports etc along the journey
Automatic merge from submit-queue
Update Canal to the latest
Update Calico and Flannel versions
- Calico to v2.4.1
- Flannel to v0.8.0
The #3161 issue should be reviewed for the Default Deny NetworkPolicy behavior change this PR brings along.
Automatic merge from submit-queue
Cluster Hooks Enhancement
Cluster Hook Enhancement
The current implementation is presently limited to docker exec, without ordering or any bells and whistles. This PR extends the functionality of the hook spec by;
- adds ordering to the hooks, with users able to set the requires and before of the unit
- cleaned up the manifest code, added tests and permit setting a section raw
- added the ability to filter hooks via master and node roles
- updated the documentation to reflect the changes
- extending the hooks to permit adding hooks per instancegroup as well cluster
- @note, instanceGroup are permitted to override the cluster wide one for ease of testing
- on the journey tried to fix an go idioms such as import ordering, comments for global export etc
- @question: v1alpha1 doesn't appear to have Subnet fields, are these different version being used anywhere?
Automatic merge from submit-queue
Etcd v3 Support
Etcd V3 Support
The current implementation is running v2.2.1 which is two years old and end of life. This PR adds the ability to use etcd v3 and set the versions if required. Note at the moment the image is still using the gcr.io registry image and much like Etcd TLS PR there presently is no 'automated' migration path from v2 to v3.
- the feature is gated behind the version of the etcd cluster, both clusters events and main must use the same storage type
- the version for v2 is unchanged and pinned at v2.2.1 with v3 using v3.0.17
- @question: we should consider allowing the user to override the images though I think this should be addressed generically, than one offs here and then. I know @chrislovecnm is working on a asset registry??