Fixes #6827
We upgrade the Server and ServerAuthorization CRD versions from v1alpha1 to v1beta1. This update does not change the schema at all, and the v1alpha1 versions will continue to be served for now. We also update the CLI and control plane to use the v1beta1 versions.
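For context, a minimal sketch of what the version bump looks like on the Rust side, assuming kube's `CustomResource` derive and a deliberately simplified spec (the field shapes below are illustrative, not the project's actual definition):

```rust
use std::collections::BTreeMap;

use kube::CustomResource;
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

// Illustrative only: a simplified Server spec registered under the new
// v1beta1 API version. The schema itself is unchanged by the version bump.
#[derive(CustomResource, Clone, Debug, Deserialize, Serialize, JsonSchema)]
#[kube(
    group = "policy.linkerd.io",
    version = "v1beta1", // bumped from v1alpha1; v1alpha1 is still served
    kind = "Server",
    namespaced
)]
#[serde(rename_all = "camelCase")]
struct ServerSpec {
    /// Selects the pods this Server applies to (simplified to match labels).
    pod_selector: BTreeMap<String, String>,
    /// Port name or number that the Server matches (simplified).
    port: String,
}
```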
Signed-off-by: Alex Leong <alex@buoyant.io>
We initially implemented a mechanism to automatically authorize
unauthenticated traffic from each pod's Kubelet IP. Our initial method
of determining a pod's Kubelet IP--using the first IP from its node's
pod CIDRs--is not a generally usable solution. In particular, CNIs
complicate matters (and EKS doesn't even set the podCIDRs field).
This change removes the policy controller's node watch and removes the
`default:kubelet` authorization. When using a restrictive default
policy, users will have to define `serverauthorization` resources that
permit kubelet traffic. It's probably possible to programmatically
generate these authorizations (e.g., by inspecting pod probe
configurations), but this is out of scope for the core control plane
functionality.
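To make that last point concrete, here is a rough sketch of what such an out-of-band tool might do, assuming it reads pod specs via `k8s-openapi`; the function below is hypothetical and not part of the control plane:

```rust
use k8s_openapi::api::core::v1::Pod;
use k8s_openapi::apimachinery::pkg::util::intstr::IntOrString;

/// Collect the ports targeted by each container's HTTP liveness and
/// readiness probes; an external tool could then generate authorizations
/// that permit kubelet traffic to exactly these ports.
fn http_probe_ports(pod: &Pod) -> Vec<IntOrString> {
    pod.spec
        .iter()
        .flat_map(|spec| spec.containers.iter())
        .flat_map(|c| c.liveness_probe.iter().chain(c.readiness_probe.iter()))
        .filter_map(|probe| probe.http_get.as_ref())
        .map(|http_get| http_get.port.clone())
        .collect()
}
```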
We add a validating admission controller to the policy controller which validates `Server` resources. When a `Server` admission request is received, we look at all existing `Server` resources in the cluster and ensure that no other `Server` has an identical selector and port.
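As a rough illustration of that check (the types and names below are simplified assumptions, not the controller's actual code):

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a Server's spec: a pod label selector plus a
/// port (name or number). The real resource is richer; this only serves
/// to illustrate the uniqueness check.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ServerSpec {
    pod_selector: BTreeMap<String, String>,
    port: String,
}

/// Returns true when another Server already has the same selector and
/// port, in which case the admission request should be denied.
fn conflicts_with_existing(candidate: &ServerSpec, existing: &[ServerSpec]) -> bool {
    existing
        .iter()
        .any(|other| other.pod_selector == candidate.pod_selector && other.port == candidate.port)
}
```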
Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
The policy controller's readiness and liveness admin endpoint is tied to
watch state: the controller only advertises liveness when all watches
have received updates, and after a watch disconnects, liveness fails
until a new update is received.
However, in some environments--especially when the API server ends the
stream before the client gracefully reconnects--the watch terminates so
that liveness is not advertised even though the client resumes watching
resources. Because the watch is resumed with a `resourceVersion`, no
updates are provided despite the watch being reestablished, and liveness
checks fail until the pod is terminated (or an update is received).
To fix this, we modify readiness advertisements to fail only until the
initial state is acquired from all watches. After this, the controller
serves cached state indefinitely.
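A minimal sketch of this latching behavior, with illustrative types rather than the controller's actual ones:

```rust
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};

/// Each watch marks itself synced once it has received its initial state;
/// the admin endpoint reports ready only after all watches have done so,
/// and never reverts, even if a watch later disconnects and resumes from
/// a `resourceVersion` without receiving new updates.
#[derive(Clone)]
struct Readiness(Arc<[AtomicBool]>);

impl Readiness {
    fn new(watches: usize) -> Self {
        Self((0..watches).map(|_| AtomicBool::new(false)).collect())
    }

    /// Called by watch `i` when its initial state has been acquired.
    fn mark_synced(&self, i: usize) {
        self.0[i].store(true, Ordering::Release);
    }

    /// Backs the readiness check; stays true for the life of the process.
    fn is_ready(&self) -> bool {
        self.0.iter().all(|b| b.load(Ordering::Acquire))
    }
}
```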
While diagnosing this, we made some logging changes, especially for the
`Watch` type. Watches now properly maintain logging contexts, and state
transitions are logged in more cases. The signature and logging context
of `Index::run` have been updated as well. Additionally, node lookup
debug logs have been elaborated to help confirm that 'pending' messages
are benign.
We can't use the typical multiarch docker build with the proxy:
qemu-hosted arm64/arm builds take 45+ minutes before failing due to
missing tooling--specifically `protoc`. (While there is a `protoc`
binary available for arm64, there are no binaries available for 32-bit
arm hosts).
To fix this, this change updates the release process to cross-build the
policy-controller on an amd64 host for each target architecture. We
separate the policy-controller's dockerfiles as `amd64.dockerfile`,
`arm64.dockerfile`, and `arm.dockerfile`. Then, in CI we build and push
each of these images individually (in parallel, via a build matrix).
Once all of these are complete, we use the `docker manifest` CLI tools
to unify these images into a single multi-arch manifest.
This cross-building approach requires that we move from using
`native-tls` to `rustls`, as we cannot build against the platform-
appropriate native TLS libraries. The policy-controller is now feature-
flagged to use `rustls` by default, though it may be necessary to use
`native-tls` in local development, as `rustls` cannot validate TLS
connections that target IP addresses.
The policy-controller has also been updated to pull in `tracing-log` for
compatibility with crates that do not use `tracing` natively. This was
helpful while debugging connectivity issues with the Kubernetes cluster.
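For reference, the `log`-to-`tracing` bridge looks roughly like the following (a sketch of the usual `tracing-log`/`tracing-subscriber` setup, not necessarily the controller's exact initialization):

```rust
fn init_logging() -> Result<(), Box<dyn std::error::Error>> {
    // Forward `log` records emitted by dependencies that don't use
    // `tracing` natively (e.g. lower-level HTTP/TLS crates).
    tracing_log::LogTracer::init()?;
    // Install a subscriber that renders both native and forwarded events.
    tracing::subscriber::set_global_default(tracing_subscriber::fmt().finish())?;
    Ok(())
}
```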
The `bin/docker-build-policy-controller` helper script now *only* builds
the amd64 variant of the policy controller. It fails when asked to build
multiarch images.
kube v0.59 depends on k8s-openapi v0.13, which includes breaking
changes.
This change updates these dependencies and modifies our code to account
for these changes.
Furthermore, we now use the k8s-openapi feature `v1_16` so that we use
an API version that is compatible with Linkerd's minimum supported
Kubernetes version.
Closes #6657 #6658 #6659
We've implemented a new controller--in Rust!--that serves discovery
APIs for inbound server policies. This change imports this code from
linkerd/polixy@25af9b5e.
This policy controller watches nodes, pods, and the recently-introduced
`policy.linkerd.io` CRD resources. It indexes these resources and serves
a gRPC API that will be used by proxies to configure the inbound proxy
for policy enforcement.
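In very rough terms, the shape of that index can be pictured like this (type and field names are assumptions for illustration, not the controller's actual code):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Illustrative inbound policy for a single pod port: which Server (if
/// any) selects it and which ServerAuthorizations permit traffic to it.
#[derive(Clone, Debug)]
struct InboundPolicy {
    server: Option<String>,
    authorizations: Vec<String>,
}

/// A shared index keyed by (namespace, pod, port), written by the
/// resource watches and read by the gRPC discovery server.
#[derive(Clone, Default)]
struct Index(Arc<RwLock<HashMap<(String, String, u16), InboundPolicy>>>);

impl Index {
    /// Called from the watches as pods and policy resources change.
    fn upsert(&self, ns: String, pod: String, port: u16, policy: InboundPolicy) {
        self.0.write().unwrap().insert((ns, pod, port), policy);
    }

    /// Called when a proxy asks for the policy on one of its ports.
    fn lookup(&self, ns: &str, pod: &str, port: u16) -> Option<InboundPolicy> {
        self.0
            .read()
            .unwrap()
            .get(&(ns.to_string(), pod.to_string(), port))
            .cloned()
    }
}
```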
This change introduces a new policy-controller container image and adds a
container to the `linkerd-destination` pod along with a `linkerd-policy` service
to be used by proxies.
This change adds a `policyController` object to the Helm `values.yaml` that
supports configuring the policy controller at runtime.
Proxies are not currently configured to use the policy controller at runtime. This
will change in an upcoming proxy release.